How to Convert a Pandas DataFrame to a List - 8 Code Examples

In this article, we will explore how to convert a Pandas DataFrame to a list. We will cover several scenarios, each with a code example, its output, and a short explanation.

Examples: Converting a Pandas DataFrame to a List

Example 1: Converting a Single Column to a List

import pandas as pd

# Create a sample DataFrame
data = {'Column1': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)

# Convert Column1 to a list
column_list = df['Column1'].tolist()

print(column_list)

Output

[1, 2, 3, 4, 5]

In this example, we create a DataFrame with a single column named ‘Column1’ containing five elements. We then use the tolist() method to convert this column to a list.
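One detail worth checking is the type of the list elements. The short sketch below (the variable names as_python and as_numpy are made up for illustration) suggests that tolist() yields built-in Python ints, whereas calling list() on the underlying NumPy array keeps NumPy scalar types.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5]})

# Series.tolist() returns built-in Python ints
as_python = df['Column1'].tolist()

# Wrapping the underlying NumPy array in list() keeps NumPy scalars instead
as_numpy = list(df['Column1'].to_numpy())

print(type(as_python[0]))                   # built-in int
print(isinstance(as_numpy[0], np.integer))  # True
```

This difference can matter when the list is passed to code that expects plain Python types, such as JSON serialization.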

Example 2: Converting Multiple Columns to a List

import pandas as pd

# Create a sample DataFrame
data = {'Column1': [1, 2, 3, 4, 5],
        'Column2': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)

# Convert both columns to a list
column_list = df[['Column1', 'Column2']].values.tolist()

print(column_list)

Output

[[1, 'a'], [2, 'b'], [3, 'c'], [4, 'd'], [5, 'e']]

In this example, we create a DataFrame with two columns (‘Column1’ and ‘Column2’). We use .values.tolist() to convert both columns to a list of lists, with one inner list per row.
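If tuples are preferred over inner lists, one possible alternative is itertuples() with index=False and name=None, which yields plain tuples. This is a sketch of a different technique from the one above, and rows_as_tuples is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                   'Column2': ['a', 'b', 'c', 'd', 'e']})

# index=False drops the index; name=None yields plain tuples, not namedtuples
rows_as_tuples = list(df[['Column1', 'Column2']].itertuples(index=False, name=None))

print(rows_as_tuples)
# [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```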

Example 3: Converting Rows to a List

import pandas as pd

# Create a sample DataFrame
data = {'Column1': [1, 2, 3, 4, 5],
        'Column2': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)

# Convert rows to a list of dictionaries
row_list = df.to_dict(orient='records')

print(row_list)

Output

[{'Column1': 1, 'Column2': 'a'}, {'Column1': 2, 'Column2': 'b'}, {'Column1': 3, 'Column2': 'c'}, {'Column1': 4, 'Column2': 'd'}, {'Column1': 5, 'Column2': 'e'}]

In this example, we convert the rows of the DataFrame into a list of dictionaries using the to_dict() method with the orient='records' parameter.
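The orient parameter accepts other values as well. As a small sketch, orient='list' groups the data by column instead of by row, giving one list per column (columns_as_lists is an illustrative name).

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                   'Column2': ['a', 'b', 'c', 'd', 'e']})

# orient='list' maps each column name to a list of that column's values
columns_as_lists = df.to_dict(orient='list')

print(columns_as_lists)
# {'Column1': [1, 2, 3, 4, 5], 'Column2': ['a', 'b', 'c', 'd', 'e']}
```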

Example 4: Handling Missing Values

import pandas as pd

# Create a sample DataFrame with missing values
data = {'Column1': [1, None, 3, 4, 5],
        'Column2': ['a', 'b', None, 'd', 'e']}
df = pd.DataFrame(data)

# Convert the DataFrame to a list; missing values are carried through unchanged
column_list = df.values.tolist()

print(column_list)

Output

[[1.0, 'a'], [nan, 'b'], [3.0, None], [4.0, 'd'], [5.0, 'e']]

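In this example, we create a DataFrame with missing values. The conversion does not remove or fill them. In the numeric column, pandas stores the missing value as nan, which also turns the remaining integers into floats (1.0, 3.0, and so on). In the output shown, the missing entry in the string column stays as None.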
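If the missing entries should be cleaned up rather than carried through, two simple options are sketched below: dropping incomplete rows with dropna() before converting, or replacing every missing marker with None after converting. The names complete_rows and normalized are illustrative, and the second option is plain Python post-processing rather than a dedicated pandas feature.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, None, 3, 4, 5],
                   'Column2': ['a', 'b', None, 'd', 'e']})

# Option 1: drop any row that contains a missing value, then convert
complete_rows = df.dropna().values.tolist()

# Option 2: convert first, then map every missing marker (nan or None) to None
normalized = [[None if pd.isna(value) else value for value in row]
              for row in df.values.tolist()]

print(complete_rows)  # [[1.0, 'a'], [4.0, 'd'], [5.0, 'e']]
print(normalized)     # [[1.0, 'a'], [None, 'b'], [3.0, None], [4.0, 'd'], [5.0, 'e']]
```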

Example 5: Converting a Specific Range of Rows

import pandas as pd

# Create a sample DataFrame
data = {'Column1': [1, 2, 3, 4, 5],
        'Column2': ['a', 'b', 'c', 'd', 'e']}
df = pd.DataFrame(data)

# Convert the 2nd through 4th rows (positions 1 to 3) to a list of dictionaries
row_list = df.iloc[1:4].to_dict(orient='records')

print(row_list)

Output

[{'Column1': 2, 'Column2': 'b'}, {'Column1': 3, 'Column2': 'c'}, {'Column1': 4, 'Column2': 'd'}]

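In this example, we use the iloc indexer to select the second through fourth rows, and then convert them to a list of dictionaries. iloc is position-based and zero-indexed, and the end of the slice is exclusive, so iloc[1:4] selects positions 1, 2, and 3.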
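To make the slicing rule concrete, the sketch below compares iloc with the label-based loc indexer. On a DataFrame with the default integer index, loc[1:3] selects the same rows as iloc[1:4], because loc slices include the end label. The names by_position and by_label are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                   'Column2': ['a', 'b', 'c', 'd', 'e']})

# iloc: positions 1, 2, 3 (end position 4 is excluded)
by_position = df.iloc[1:4].to_dict(orient='records')

# loc: labels 1, 2, 3 (end label 3 is included)
by_label = df.loc[1:3].to_dict(orient='records')

print(by_position == by_label)  # True
```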

Example 6: Converting a DataFrame with DateTime Index

import pandas as pd

# Create a sample DataFrame with DateTime index
date_rng = pd.date_range(start='2023-10-01', end='2023-10-05')
data = {'Value': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data, index=date_rng)

# Convert DataFrame to a list of tuples
list_of_tuples = list(df.itertuples(index=True, name='PandasDataFrame'))

print(list_of_tuples)

Output

[PandasDataFrame(Index=Timestamp('2023-10-01 00:00:00', freq='D'), Value=10), 
PandasDataFrame(Index=Timestamp('2023-10-02 00:00:00', freq='D'), Value=20), 
PandasDataFrame(Index=Timestamp('2023-10-03 00:00:00', freq='D'), Value=30), 
PandasDataFrame(Index=Timestamp('2023-10-04 00:00:00', freq='D'), Value=40), 
PandasDataFrame(Index=Timestamp('2023-10-05 00:00:00', freq='D'), Value=50)]

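In this example, we create a DataFrame with a DateTime index. We then use itertuples() to convert the DataFrame to a list of named tuples, where index=True includes the index as the first field and name sets the tuple's type name. The output shown comes from a pandas version older than 2.0; in pandas 2.0 and later, the Timestamp representation no longer includes freq='D'.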
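When Timestamp objects are inconvenient in the resulting list, one option is to format them as strings while building it. The sketch below assumes a date-only format ('%Y-%m-%d') is wanted, and plain_rows is an illustrative name.

```python
import pandas as pd

date_rng = pd.date_range(start='2023-10-01', end='2023-10-05')
df = pd.DataFrame({'Value': [10, 20, 30, 40, 50]}, index=date_rng)

# name=None yields plain (index, value) tuples, so each row can be unpacked directly
plain_rows = [(ts.strftime('%Y-%m-%d'), value)
              for ts, value in df.itertuples(index=True, name=None)]

print(plain_rows)
# [('2023-10-01', 10), ('2023-10-02', 20), ('2023-10-03', 30), ('2023-10-04', 40), ('2023-10-05', 50)]
```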

Example 7: Converting a DataFrame to a Nested Python List

import pandas as pd

# Create a sample DataFrame
data = {'Column1': [1, 2, 3],
        'Column2': ['a', 'b', 'c']}
df = pd.DataFrame(data)

# Convert DataFrame to a nested list
nested_list = df.values.tolist()

print(nested_list)

Output

[[1, 'a'], [2, 'b'], [3, 'c']]

In this example, we convert the whole DataFrame to a nested list by calling values.tolist() directly on it, without selecting columns first. Each inner list holds the values of one row.
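The nested list contains only the values, not the column names. If a header row is needed, for example before writing the rows to a file, a minimal sketch is to prepend df.columns.tolist() (the names header and table are illustrative).

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3],
                   'Column2': ['a', 'b', 'c']})

# Column names as a list, followed by one list per row
header = df.columns.tolist()
table = [header] + df.values.tolist()

print(table)
# [['Column1', 'Column2'], [1, 'a'], [2, 'b'], [3, 'c']]
```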

Example 8: Converting a Subset of Columns to a List

import pandas as pd

# Create a sample DataFrame with three columns
data = {'Column1': [1, 2, 3, 4, 5],
        'Column2': ['a', 'b', 'c', 'd', 'e'],
        'Column3': [10.5, 20.2, 15.8, 8.9, 12.1]}
df = pd.DataFrame(data)

# Convert a subset of columns to a list
subset_columns = ['Column1', 'Column3']
subset_list = df[subset_columns].values.tolist()

print(subset_list)

Output

[[1.0, 10.5], [2.0, 20.2], [3.0, 15.8], [4.0, 8.9], [5.0, 12.1]]

In this example, we create a DataFrame with three columns (‘Column1’, ‘Column2’, and ‘Column3’). We are interested in converting only ‘Column1’ and ‘Column3’ to a list.

subset_columns = ['Column1', 'Column3'] defines the subset of columns we want to convert.
df[subset_columns] selects only the specified columns.
.values.tolist() converts the selected columns to a list of lists.

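The output is a list of lists where each inner list contains the values of ‘Column1’ and ‘Column3’ for each row. Note that the ‘Column1’ values appear as floats (1.0, 2.0, and so on): .values builds a single NumPy array, and the common type for an integer column and a float column is float64.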

This example demonstrates how you can selectively choose columns for conversion based on your specific requirements. It’s a handy technique when you’re working with large datasets and only need a portion of the information.
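If the integers in ‘Column1’ should stay integers, one possible workaround is to cast the selected columns to object before calling .values, so that each value keeps its own Python type. This is a sketch rather than the only approach, and mixed_list is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [1, 2, 3, 4, 5],
                   'Column2': ['a', 'b', 'c', 'd', 'e'],
                   'Column3': [10.5, 20.2, 15.8, 8.9, 12.1]})

subset_columns = ['Column1', 'Column3']

# astype(object) prevents the integer column from being upcast to float64
mixed_list = df[subset_columns].astype(object).values.tolist()

print(mixed_list)
# [[1, 10.5], [2, 20.2], [3, 15.8], [4, 8.9], [5, 12.1]]
```

Object arrays are slower and use more memory than numeric ones, so this trade-off is mainly worthwhile when the original types matter in the output.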

Conclusion

Converting a Pandas DataFrame to a list is a common task in data analysis and manipulation. In this article, we covered several scenarios with code examples and explanations: single columns, multiple columns, rows as dictionaries, missing values, row ranges, DateTime indexes, and column subsets. Keep experimenting and incorporating these techniques into your data analysis projects. My other articles on Pandas DataFrames have more information.

Thank you for reading.
