Data Manipulation

Reading and Writing Data

Inspecting Data

  • head() - Return the first n rows of a DataFrame

  • tail() - Return the last n rows of a DataFrame

  • info() - Print a concise summary of a DataFrame

  • describe() - Generate descriptive statistics of a DataFrame
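
The four inspection methods above can be tried on a small frame. This is a minimal sketch; the column names and values are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Cairo", "Pune", "Quito"],
    "temp_c": [4.0, 19.5, 28.0, 31.0, 14.5],
})

first_two = df.head(2)   # first 2 rows (n defaults to 5)
last_two = df.tail(2)    # last 2 rows
df.info()                # prints dtypes and non-null counts; returns None
stats = df.describe()    # count, mean, std, min, quartiles, max of numeric columns

print(first_two, last_two, stats, sep="\n\n")
```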

Handling Missing Data

  • dropna() - Drop missing values from a DataFrame

  • fillna() - Fill missing values in a DataFrame
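
As a rough illustration of the two methods above, the sketch below drops and fills missing values in a made-up frame. Both calls return new objects and leave the original untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

dropped = df.dropna()     # keep only rows that have no missing values
filled = df.fillna(0.0)   # replace every missing value with 0.0

print(dropped)
print(filled)
```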

Grouping and Aggregating

  • groupby() - Group a DataFrame by one or more columns

  • pivot_table() - Create a pivot table based on the DataFrame
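
A short sketch of both grouping tools, assuming a hypothetical sales table: groupby() produces one total per region, while pivot_table() spreads the same totals across a region-by-product grid.

```python
import pandas as pd

# Hypothetical sales data for illustration.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "product": ["a", "b", "a", "b"],
    "units": [10, 20, 30, 40],
})

# Sum of units per region.
totals = sales.groupby("region")["units"].sum()

# Regions as rows, products as columns, summed units in the cells.
table = sales.pivot_table(
    index="region", columns="product", values="units", aggfunc="sum"
)

print(totals)
print(table)
```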

Merging and Joining

  • merge() - Merge two DataFrames based on one or more keys

  • join() - Join columns of another DataFrame

Applying Functions

  • apply() - Apply a function along an axis of the DataFrame

  • map() - Map each value of a Series through a dict, Series, or function

  • replace() - Replace values in a DataFrame
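
A minimal sketch of the three methods above on a made-up frame. Note that map() is called on a single column (a Series) here.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# axis=0 (the default): the function receives each column in turn.
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row in turn.
row_sum = df.apply(lambda row: row["x"] + row["y"], axis=1)

# Element-wise lookup on one column through a dict.
labels = df["x"].map({1: "one", 2: "two", 3: "three"})

# Replace the value 10 with 100 wherever it occurs.
replaced = df.replace({10: 100})

print(col_range, row_sum, labels, replaced, sep="\n\n")
```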

Data Selection and Indexing

Selecting by Label

  • loc[] - Select rows and columns by label

Selecting by Integer Position

  • iloc[] - Select rows and columns by integer position
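
Label-based and position-based selection can be contrasted in a small sketch. The index labels below are invented. One detail worth noting: a loc[] slice includes its end label, whereas an iloc[] slice excludes its stop position.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [4, 5, 6]},
    index=["r1", "r2", "r3"],  # hypothetical row labels
)

by_label = df.loc["r1":"r2", "a"]   # rows r1 through r2 inclusive, column "a"
by_position = df.iloc[0:2, 0]       # rows 0 and 1, first column
filtered = df.loc[df["b"] > 4]      # loc[] also accepts a boolean mask

print(by_label, by_position, filtered, sep="\n\n")
```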

Accessing Single Values

  • at - Access a single value for a label

  • iat - Access a single value for a position
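
For single cells, the two accessors can be sketched like this (labels are hypothetical). Both can also be used on the left side of an assignment.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["r1", "r2"])

by_label = df.at["r2", "b"]   # row label "r2", column label "b"
by_position = df.iat[0, 1]    # row position 0, column position 1

df.at["r1", "a"] = 10         # write a single cell by label

print(by_label, by_position)
print(df)
```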

Selecting Single Rows/Columns

  • xs() - Return a cross-section (a single row or column) by label; especially useful with a MultiIndex
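
A sketch of xs() on a frame with a two-level row index. The level names and values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical two-level index: (year, quarter).
idx = pd.MultiIndex.from_product(
    [["2023", "2024"], ["q1", "q2"]], names=["year", "quarter"]
)
df = pd.DataFrame({"revenue": [1, 2, 3, 4]}, index=idx)

y2024 = df.xs("2024", level="year")    # rows for 2024; the year level is dropped
q1 = df.xs("q1", level="quarter")      # first quarter of every year
revenue = df.xs("revenue", axis=1)     # a column, selected by name

print(y2024, q1, revenue, sep="\n\n")
```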

Retrieving Items

  • get() - Retrieve item by key (or a default)
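
get() behaves much like dict.get: a missing key yields the default instead of raising a KeyError. A minimal sketch with invented column names:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

col = df.get("a")                         # the "a" column as a Series
missing = df.get("zzz")                   # None, since the column does not exist
fallback = df.get("zzz", default="n/a")   # explicit default value

print(col)
print(missing, fallback)
```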

Checking Membership

  • isin() - Check whether each element is contained in the values provided
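
isin() returns a boolean mask, which is commonly used to filter rows. The sketch below uses made-up data.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "blue", "green"], "n": [1, 2, 3]})

mask = df["color"].isin(["red", "green"])   # True where the value is in the list
subset = df[mask]                           # keep only the matching rows

print(mask)
print(subset)
```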

Data Analysis

Descriptive Statistics

  • sum() - Return the sum of the values

  • mean() - Return the mean of the values

  • median() - Return the median of the values

  • std() - Return the standard deviation of the values

  • var() - Return the variance of the values

  • min() - Return the minimum of the values

  • max() - Return the maximum of the values
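
The statistics above can be checked on a small Series (values chosen for easy arithmetic). One point that is easy to miss: std() and var() use the sample formula (ddof=1) by default, so ddof=0 must be passed to get the population values.

```python
import pandas as pd

s = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

total = s.sum()
mean = s.mean()
median = s.median()
sample_var = s.var()             # divides by n - 1
population_var = s.var(ddof=0)   # divides by n
population_std = s.std(ddof=0)
lowest, highest = s.min(), s.max()

print(total, mean, median, sample_var, population_var, population_std, lowest, highest)
```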

Quantitative Analysis

  • count() - Return the count of non-NA/null values

  • quantile() - Return values at the given quantile
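
A brief sketch of both methods on a Series with one missing value. count() skips the missing entry, and quantile() interpolates linearly between the remaining values by default.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0])

n_valid = s.count()                   # number of non-missing values
q50 = s.quantile(0.5)                 # a single quantile returns a scalar
quartiles = s.quantile([0.25, 0.75])  # a list of quantiles returns a Series

print(n_valid, q50)
print(quartiles)
```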

Correlation and Covariance

  • corr() - Compute pairwise correlation of columns

  • cov() - Compute pairwise covariance of columns
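
Both methods return a square matrix indexed by column name. In the sketch below the columns are constructed so that y rises exactly with x and z falls exactly with x; the data is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],   # y = 2 * x
    "z": [4.0, 3.0, 2.0, 1.0],   # z = 5 - x
})

corr = df.corr()   # Pearson correlation by default
cov = df.cov()     # sample covariance (divides by n - 1)

print(corr)
print(cov)
```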

Plotting Functions

  • plot() - Plot a DataFrame using matplotlib (the default plotting backend)

  • hist() - Draw a histogram of each numeric column of the DataFrame

  • plot.scatter() - Draw a scatter plot of one DataFrame column against another

  • boxplot() - Draw a box plot from the DataFrame's columns
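
A minimal sketch of the plotting calls, assuming matplotlib is installed. The "Agg" backend is selected only so the example runs without a display, and the data is invented. Scatter plots are reached through the plot accessor, as df.plot.scatter().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, 4, 9, 16]})

ax_line = df.plot(x="x", y="y")              # line plot of y against x
axes_hist = df.hist()                        # one histogram per numeric column
ax_scatter = df.plot.scatter(x="x", y="y")   # scatter plot of y against x

fig, ax = plt.subplots()                     # separate figure for the box plot
ax_box = df.boxplot(column=["x", "y"], ax=ax)

fig.savefig("boxplot.png")                   # hypothetical output file name
```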