Pandas is a powerful data manipulation and analysis library for Python. One of its key features is the DataFrame, which lets you work with structured data in a tabular format. The append() method adds the rows of one or more DataFrames to the end of another. Note that append() was deprecated in pandas 1.4 and removed in pandas 2.0; on current versions, pd.concat() does the same job.
In this guide, we’ll dive into the details of the append() method, explore its parameters, and provide multiple code examples along with explanations.
Definition: Pandas Dataframe Append Method
The append() method returns a new DataFrame made up of the caller's rows followed by the rows of the other DataFrame(s); the original DataFrames are left unchanged. It works along the rows only: append() has no axis parameter, so combining DataFrames side by side requires pd.concat() with axis=1. Columns that are missing from one of the inputs are filled with NaN in the result.
By understanding its syntax and options, you can combine structured data efficiently for data preprocessing, analysis, or machine learning tasks.
Syntax
The append() method in Pandas is used to concatenate two DataFrames. Its syntax is as follows:
DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)
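Here,
other: The DataFrame (or list of DataFrames) to be appended.
ignore_index: If set to True, the resulting DataFrame gets a new index running from 0 to n-1 instead of keeping the original labels. Default is False.
verify_integrity: If set to True, it raises a ValueError when the result would contain duplicate index labels. Default is False.
sort: If set to True, it sorts the columns of the result when the columns of the two DataFrames are not aligned. Default is False.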
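On pandas 2.0 and later, where append() no longer exists, the same options map directly onto pd.concat(). The helper below is a minimal sketch of that mapping, not part of pandas: the name append_rows is hypothetical, and it only handles DataFrames (or a list of them) as the other argument.

```python
import pandas as pd


def append_rows(df, other, ignore_index=False, verify_integrity=False, sort=False):
    """Hypothetical stand-in for DataFrame.append(), built on pd.concat()."""
    # Accept either a single DataFrame or a list/tuple of DataFrames.
    others = list(other) if isinstance(other, (list, tuple)) else [other]
    return pd.concat(
        [df, *others],
        ignore_index=ignore_index,
        verify_integrity=verify_integrity,
        sort=sort,
    )


df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

result = append_rows(df1, df2, ignore_index=True)
print(result)
```

Like append(), the helper returns a new DataFrame and leaves df1 untouched.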
Appending Pandas DataFrames Vertically
Appending DataFrames vertically means stacking them one over the other along the rows.
Example 1: Basic Vertical Append
Let’s start with a basic example:
import pandas as pd

# Create two sample DataFrames
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Append df2 to df1 (requires pandas < 2.0;
# on newer versions use pd.concat([df1, df2]))
result = df1.append(df2)
print(result)
Output:
   A  B
0  1  3
1  2  4
0  5  7
1  6  8
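Notice that the index labels 0 and 1 each appear twice in this output. The sketch below shows what that means for label-based lookups, and how verify_integrity=True turns the duplication into an error. It uses pd.concat(), which accepts the same verify_integrity flag, so that it also runs on pandas 2.0 and later.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Same rows as df1.append(df2): index labels 0 and 1 each appear twice.
result = pd.concat([df1, df2])
print(result.loc[0])  # two rows, one from each input

# verify_integrity=True refuses to build a result with duplicate labels.
try:
    pd.concat([df1, df2], verify_integrity=True)
    raised = False
except ValueError as err:
    raised = True
    print("Duplicate index labels:", err)
```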
Example 2: Ignoring Index
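import pandas as pd

# Create two sample DataFrames
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Append df2 to df1 and ignore index (requires pandas < 2.0;
# on newer versions use pd.concat([df1, df2], ignore_index=True))
result = df1.append(df2, ignore_index=True)
print(result)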
Output:
   A  B
0  1  3
1  2  4
2  5  7
3  6  8
Example 3: Using ignore_index=True with a Custom Index
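import pandas as pd

# Create two sample DataFrames; df1 has custom index labels 10 and 20
df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=[10, 20])
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Append df2 to df1 and ignore index (requires pandas < 2.0;
# on newer versions use pd.concat([df1, df2], ignore_index=True))
result = df1.append(df2, ignore_index=True)
print(result)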
Output:
   A  B
0  1  3
1  2  4
2  5  7
3  6  8
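Another way to get the same fresh 0 to n-1 index is to combine the DataFrames first and call reset_index(drop=True) afterwards. A short sketch of both routes, with pd.concat() standing in for append() so that it runs on current pandas:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=[10, 20])
df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})

# Option 1: let the concatenation discard the original labels.
renumbered = pd.concat([df1, df2], ignore_index=True)

# Option 2: keep the original labels, then drop them in a second step.
kept = pd.concat([df1, df2])
reset = kept.reset_index(drop=True)

print(kept.index.tolist())
print(renumbered.equals(reset))
```

The second route is handy when you still need the original labels for an intermediate step.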
Appending DataFrames Horizontally
Appending DataFrames horizontally means concatenating them side by side along the columns. The append() method cannot do this, so the examples in this section use pd.concat() with axis=1.
Example 4: Basic Horizontal Append
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

# Append df2 to df1 horizontally
result = pd.concat([df1, df2], axis=1)
print(result)
Output:
   A  B  C  D
0  1  3  5  7
1  2  4  6  8
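When the two DataFrames share the same row index and have no column names in common, as they do here, DataFrame.join() gives the same side-by-side result. A minimal sketch for comparison:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

via_concat = pd.concat([df1, df2], axis=1)
# join() matches rows on the index labels (a left join by default).
via_join = df1.join(df2)

print(via_join)
print(via_concat.equals(via_join))
```

If the column names overlapped, join() would raise an error unless you passed lsuffix or rsuffix, whereas pd.concat() would keep both columns under the same name.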
Example 5: Ignoring Index with axis=1
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=[10, 20])
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

# Append df2 to df1 horizontally; with axis=1, ignore_index=True
# renumbers the column labels, not the row index
result = pd.concat([df1, df2], axis=1, ignore_index=True)
print(result)
Output:
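      0    1    2    3
10  1.0  3.0  NaN  NaN
20  2.0  4.0  NaN  NaN
0   NaN  NaN  5.0  7.0
1   NaN  NaN  6.0  8.0
With axis=1, ignore_index=True replaces the column labels with 0 to 3; the rows are still matched by index label. Because df1 uses the labels 10 and 20 while df2 uses 0 and 1, no rows line up, and the gaps are filled with NaN.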
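To pair the rows by position instead of by label, one common approach is to reset each index before concatenating, so that both inputs carry the labels 0 and 1. The sketch below assumes both DataFrames have the same number of rows.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=[10, 20])
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

# Give both frames the same 0..n-1 labels so the rows line up by position.
aligned = pd.concat(
    [df1.reset_index(drop=True), df2.reset_index(drop=True)],
    axis=1,
)
print(aligned)
```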
Appending with Different Column Names
Example 6: Appending with Different Column Names
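import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

# Append df2 to df1 with different column names (requires pandas < 2.0;
# on newer versions use pd.concat([df1, df2], ignore_index=True))
result = df1.append(df2, ignore_index=True)
print(result)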
Output:
     A    B    C    D
0  1.0  3.0  NaN  NaN
1  2.0  4.0  NaN  NaN
2  NaN  NaN  5.0  7.0
3  NaN  NaN  6.0  8.0
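The NaN placeholders also change the column types: the integer columns become float64, because NaN is a floating-point value. If a default such as 0 makes sense for your data (an assumption that will not hold for every dataset), fillna() followed by astype() restores integers. The sketch uses pd.concat() so that it runs on pandas 2.0 and later.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})

result = pd.concat([df1, df2], ignore_index=True)
print(result.dtypes)  # every column is float64 because of the NaN gaps

# Replace the gaps with 0 and convert back to integers.
filled = result.fillna(0).astype(int)
print(filled)
```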
Multiple DataFrames with Different Column Names
Let’s now walk through a more complex scenario where we have multiple DataFrames with different column names and want to append them all at once. Wherever a DataFrame lacks a column, the result contains NaN.
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
df2 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]})
df3 = pd.DataFrame({'E': [9, 10], 'F': [11, 12]})

# Append df2 and df3 to df1; missing values become NaN (requires pandas < 2.0;
# on newer versions use pd.concat([df1, df2, df3], ignore_index=True, sort=False))
result = df1.append([df2, df3], ignore_index=True, sort=False)
print(result)
Output:
     A    B    C    D     E     F
0  1.0  3.0  NaN  NaN   NaN   NaN
1  2.0  4.0  NaN  NaN   NaN   NaN
2  NaN  NaN  5.0  7.0   NaN   NaN
3  NaN  NaN  6.0  8.0   NaN   NaN
4  NaN  NaN  NaN  NaN   9.0  11.0
5  NaN  NaN  NaN  NaN  10.0  12.0
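When the DataFrames arrive one at a time, for example inside a loop, a common pattern is to collect them in a list and concatenate once at the end, since every append() or pd.concat() call copies all the data accumulated so far. A minimal sketch; the chunks list is made-up sample data standing in for whatever your loop produces.

```python
import pandas as pd

# Made-up sample data: three small tables with different columns.
chunks = [
    {'A': [1, 2], 'B': [3, 4]},
    {'C': [5, 6], 'D': [7, 8]},
    {'E': [9, 10], 'F': [11, 12]},
]

frames = []
for chunk in chunks:
    frames.append(pd.DataFrame(chunk))  # list.append, not DataFrame.append

# One concatenation at the end instead of one per iteration.
result = pd.concat(frames, ignore_index=True, sort=False)
print(result.shape)
```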
Conclusion
The Pandas DataFrame append method stacks DataFrames vertically, adding the rows of one to the end of another. For horizontal combinations, and for any code running on pandas 2.0 or later, pd.concat() covers the same ground. Consider factors like index handling and column names when combining DataFrames in your projects, since mismatched labels or columns lead to duplicate index entries or NaN values.
Do visit my other articles on Python DataFrames to learn about their other methods. Thank you so much for reading this one.