Converting Pandas DataFrame to JSON [9 Code Examples]

Pandas is a powerhouse Python library for data manipulation and analysis. It provides easy-to-use data structures and functions for working with structured data. One common task is converting a Pandas DataFrame to JSON. This article walks through the process with nine code examples, each using different data and options, to illustrate the most common scenarios.

Pandas DataFrame to JSON: Basics

The to_json() method in Pandas allows us to convert a DataFrame to a JSON string. It provides various options to customize the output format.

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['John', 'Jane', 'Jim'],
'Age': [30, 25, 35]}
df = pd.DataFrame(data)

# Convert DataFrame to JSON
json_string = df.to_json()
print(json_string)

Output

{"Name":{"0":"John","1":"Jane","2":"Jim"},"Age":{"0":30,"1":25,"2":35}}

In this example, the DataFrame df is converted to a JSON string using to_json(). Called with no arguments, the method serializes the whole DataFrame using the default orient='columns' layout and returns the result as a string.
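The layout of the output is controlled mainly by the orient parameter. As a rough sketch, the snippet below serializes the same DataFrame under five of the supported orientations so they can be compared side by side. The names orients and results are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                   'Age': [30, 25, 35]})

# Serialize the same DataFrame under several orientations
orients = ['columns', 'index', 'records', 'split', 'values']
results = {orient: df.to_json(orient=orient) for orient in orients}

for orient, text in results.items():
    print(f'{orient:>8}: {text}')
```

The 'columns' entry matches the default output shown above, while 'values' drops both column names and index labels and keeps only the rows.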

Code Examples: Converting a Pandas DataFrame to JSON

Example 1: Simple Conversion

# Sample DataFrame
data = {'Name': ['John', 'Jane', 'Jim'],
'Age': [30, 25, 35]}
df = pd.DataFrame(data)

# Convert DataFrame to JSON
json_string = df.to_json()
print(json_string)

Output

{"Name":{"0":"John","1":"Jane","2":"Jim"},"Age":{"0":30,"1":25,"2":35}}

In this example, the entire DataFrame is converted to a JSON object. Column names become the top-level keys, and each column's values are stored as a nested object keyed by the row index label (not as an array).
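To see that structure concretely, the string can be parsed back with Python's standard json module. This is a small sketch for inspection only; the variable name parsed is an assumption for illustration.

```python
import json
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                   'Age': [30, 25, 35]})

parsed = json.loads(df.to_json())

# Top-level keys are the column names
print(list(parsed))

# Each column maps index labels (as strings) to cell values
print(parsed['Name']['0'])
print(parsed['Age'])
```

Note that the index labels 0, 1, 2 come back as the strings "0", "1", "2", because JSON object keys are always strings.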

Example 2: Handling Date Formats

# Sample DataFrame with dates
data = {'Name': ['John', 'Jane', 'Jim'],
'Birthdate': pd.to_datetime(['1990-05-20', '1995-10-15', '1985-03-10'])}
df = pd.DataFrame(data)

# Convert DataFrame to JSON with date format
json_string = df.to_json(date_format='iso')
print(json_string)

Output

{"Name":{"0":"John","1":"Jane","2":"Jim"},"Birthdate":{"0":"1990-05-20T00:00:00.000","1":"1995-10-15T00:00:00.000","2":"1985-03-10T00:00:00.000"}}

Here, we have a DataFrame with a ‘Birthdate’ column. With the date_format='iso' option, the dates are written as ISO 8601 strings instead of numeric timestamps.
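If only the calendar date is wanted, without the time component, one possible approach is to format the column as strings before serializing. The sketch below does this on a copy (the name out is illustrative) so the original datetime column stays intact.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Jim'],
    'Birthdate': pd.to_datetime(['1990-05-20', '1995-10-15', '1985-03-10']),
})

# Work on a copy so the original datetime column is preserved
out = df.copy()
out['Birthdate'] = out['Birthdate'].dt.strftime('%Y-%m-%d')

json_string = out.to_json()
print(json_string)
```

Because the column now holds plain strings, to_json() writes them as-is and date_format is no longer involved.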

Example 3: Dealing with NaN Values

# Sample DataFrame with NaN values
data = {'Name': ['John', 'Jane', 'Jim'],
'Salary': [50000, None, 60000]}
df = pd.DataFrame(data)

# Replace NaN values with the string 'null', then convert to JSON
df = df.fillna('null')
json_string = df.to_json()
print(json_string)

Output

{"Name":{"0":"John","1":"Jane","2":"Jim"},"Salary":{"0":50000.0,"1":"null","2":60000.0}}

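In this example, the DataFrame contains a NaN value (the None in ‘Salary’ becomes NaN). We call fillna('null') before converting, which replaces it with the string "null"; that is why the value appears in quotes in the output. Note that to_json() has no na option: it already writes NaN as a real JSON null by default, so skip the fillna() step if you want an unquoted null.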
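For comparison, here is a minimal sketch of the default behavior without fillna(). It parses the result with the standard json module to confirm that the missing value arrives as a genuine null (None in Python).

```python
import json
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                   'Salary': [50000, None, 60000]})

# No fillna(): NaN is serialized as JSON null by default
json_string = df.to_json()
print(json_string)

parsed = json.loads(json_string)
print(parsed['Salary']['1'] is None)
```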

Example 4: Nested JSON Structures

# Sample DataFrame with nested data
data = {'Name': ['John', 'Jane', 'Jim'],
'Details': [{'City': 'New York', 'State': 'NY'},
{'City': 'San Francisco', 'State': 'CA'},
{'City': 'Seattle', 'State': 'WA'}]}
df = pd.DataFrame(data)

# Convert DataFrame to JSON with nested structure
json_string = df.to_json(orient='records')
print(json_string)

Output

[{"Name":"John","Details":{"City":"New York","State":"NY"}},{"Name":"Jane","Details":{"City":"San Francisco","State":"CA"}},{"Name":"Jim","Details":{"City":"Seattle","State":"WA"}}]

In this example, the DataFrame contains a nested dictionary in the ‘Details’ column. The orient='records' option creates a list of records, each with a nested structure.
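Nested output is easier to read when it is indented. The following sketch assumes a pandas version that supports the indent parameter of to_json() (1.0 or later), and then parses the result to reach into one of the nested objects. The names pretty and records are illustrative.

```python
import json
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jane', 'Jim'],
    'Details': [{'City': 'New York', 'State': 'NY'},
                {'City': 'San Francisco', 'State': 'CA'},
                {'City': 'Seattle', 'State': 'WA'}],
})

# indent=2 spreads the JSON over multiple lines for readability
pretty = df.to_json(orient='records', indent=2)
print(pretty)

# The nested dictionaries survive as nested JSON objects
records = json.loads(pretty)
print(records[1]['Details']['City'])
```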

Example 5: Custom Formatting

# Sample DataFrame with custom formatting
data = {'Name': ['John', 'Jane', 'Jim'],
'Salary': [50000, 60000, 55000]}
df = pd.DataFrame(data)

# Convert DataFrame to JSON with custom formatting
json_string = df.to_json(orient='split', date_format='iso')
print(json_string)

Output

{"columns":["Name","Salary"],"index":[0,1,2],"data":[["John",50000],["Jane",60000],["Jim",55000]]}

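Here, we use the orient='split' option to format the JSON with separate keys for columns, index, and data. The date_format='iso' argument has no visible effect in this example because the DataFrame contains no date columns.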
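One practical property of the split layout is that it keeps columns, index, and data separate, which makes it straightforward to load back. The sketch below round-trips the DataFrame with pd.read_json(); the string is wrapped in StringIO because recent pandas versions deprecate passing literal JSON text directly.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                   'Salary': [50000, 60000, 55000]})

json_string = df.to_json(orient='split')

# Read the JSON back using the same orientation it was written with
restored = pd.read_json(StringIO(json_string), orient='split')
print(restored)
```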

Example 6: Selecting Specific Columns

# Sample DataFrame with specific columns
data = {'Name': ['John', 'Jane', 'Jim'],
'Age': [30, 25, 35],
'Salary': [50000, 60000, 55000]}
df = pd.DataFrame(data)

# Convert specific columns to JSON
json_string = df[['Name', 'Salary']].to_json(orient='records')
print(json_string)

Output

[{"Name":"John","Salary":50000},{"Name":"Jane","Salary":60000},{"Name":"Jim","Salary":55000}]

In this example, we select specific columns (‘Name’ and ‘Salary’) and convert them to JSON.
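Rows can be narrowed down in the same step. As an illustrative sketch, the snippet below uses .loc with a boolean condition (the Age threshold is an arbitrary choice for the example) to export only some rows and columns.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                   'Age': [30, 25, 35],
                   'Salary': [50000, 60000, 55000]})

# Keep rows where Age is at least 30, and only two of the columns
subset = df.loc[df['Age'] >= 30, ['Name', 'Salary']]
json_string = subset.to_json(orient='records')
print(json_string)
```

With orient='records' the index is not written, so the gap left by the filtered-out row does not appear in the output.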

Example 7: Handling Large DataFrames

# Create a large sample DataFrame
data = {'ID': range(1, 10001),
        'Value': range(10001, 20001)}
df = pd.DataFrame(data)
# Convert DataFrame to JSON with split
json_string = df.to_json(orient='split')
print(json_string[:200])  # Print the first 200 characters for demonstration

Output

{"columns":["ID","Value"],"index":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,5

In this example, we create a large DataFrame (10,000 rows) and convert it to JSON using the orient='split' option. Note that we only print the first 200 characters of the JSON for demonstration purposes.
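For large DataFrames, another option worth considering is the JSON Lines layout, where each row is written as its own JSON object on a separate line so that consumers can process the data one record at a time. A minimal sketch, using lines=True (which pandas only accepts together with orient='records'):

```python
import json

import pandas as pd

df = pd.DataFrame({'ID': range(1, 10001),
                   'Value': range(10001, 20001)})

# One JSON object per line instead of one large array
json_lines = df.to_json(orient='records', lines=True)

lines = json_lines.strip().split('\n')
print(len(lines))
print(lines[0])

# Each line is a complete JSON document on its own
first = json.loads(lines[0])
```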

Example 8: Dynamically Converting Multiple DataFrames to JSON

data_1 = {'Name': ['John', 'Jane', 'Jim'],
'Age': [30, 25, 35]}
data_2 = {'Name': ['Jack', 'Jill', 'Jake'],
'Age': [28, 23, 33]}
df_1 = pd.DataFrame(data_1)
df_2 = pd.DataFrame(data_2)

# Store DataFrames in a list
dfs = [df_1, df_2]

# Convert each DataFrame to JSON using a for loop
json_data = {}
for i, df in enumerate(dfs, 1):
    json_data[f'DataFrame_{i}'] = df.to_json()

print(json_data)

Output

{'DataFrame_1': '{"Name":{"0":"John","1":"Jane","2":"Jim"},"Age":{"0":30,"1":25,"2":35}}', 'DataFrame_2': '{"Name":{"0":"Jack","1":"Jill","2":"Jake"},"Age":{"0":28,"1":23,"2":33}}'}

In this example, we create two sample DataFrames (df_1 and df_2). We then store them in a list dfs. Using a for loop, we iterate through the list and convert each DataFrame to JSON. The resulting JSON strings are stored in a dictionary with keys 'DataFrame_1' and 'DataFrame_2'.
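Because each dictionary value here is already a JSON string, passing the dictionary to json.dumps() would encode those strings a second time. If a single JSON document holding all DataFrames is the goal, one possible approach is to parse each to_json() result back into plain Python objects first. The names combined and combined_json below are illustrative.

```python
import json

import pandas as pd

df_1 = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                     'Age': [30, 25, 35]})
df_2 = pd.DataFrame({'Name': ['Jack', 'Jill', 'Jake'],
                     'Age': [28, 23, 33]})
dfs = [df_1, df_2]

# Parse each JSON string into Python lists/dicts before nesting it
combined = {
    f'DataFrame_{i}': json.loads(df.to_json(orient='records'))
    for i, df in enumerate(dfs, 1)
}

# Encode the whole structure once, producing a single JSON document
combined_json = json.dumps(combined)
print(combined_json)
```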

Example 9: Exporting Multiple DataFrames to Separate JSON Files

# Creating sample DataFrames
data_1 = {'Name': ['John', 'Jane', 'Jim'],
          'Age': [30, 25, 35]}
data_2 = {'Name': ['Jack', 'Jill', 'Jake'],
          'Age': [28, 23, 33]}

df_1 = pd.DataFrame(data_1)
df_2 = pd.DataFrame(data_2)

# Store DataFrames in a list
dfs = [df_1, df_2]

# Export each DataFrame to a separate JSON file using a for loop
for i, df in enumerate(dfs, 1):
    json_filename = f'dataframe_{i}.json'
    df.to_json(json_filename, orient='records')

print("JSON files exported successfully.")

Output

JSON files exported successfully.

Data of df_1

[{"Name":"John","Age":30},{"Name":"Jane","Age":25},{"Name":"Jim","Age":35}]

Data of df_2

[{"Name":"Jack","Age":28},{"Name":"Jill","Age":23},{"Name":"Jake","Age":33}]

In this example, we have two sample DataFrames (df_1 and df_2). As in the previous example, we store them in a list dfs. Using a Python for loop, we iterate through the list and export each DataFrame to a separate JSON file by passing a file path as the first argument to to_json(). The JSON files are named dataframe_1.json and dataframe_2.json.
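A quick way to check an export like this is to read each file back with pd.read_json() and compare it with the source. The sketch below writes into a temporary directory (an assumption made here so that it leaves no files behind) and collects the comparison results in an illustrative list named checks.

```python
import tempfile
from pathlib import Path

import pandas as pd

df_1 = pd.DataFrame({'Name': ['John', 'Jane', 'Jim'],
                     'Age': [30, 25, 35]})
df_2 = pd.DataFrame({'Name': ['Jack', 'Jill', 'Jake'],
                     'Age': [28, 23, 33]})
dfs = [df_1, df_2]

checks = []
with tempfile.TemporaryDirectory() as tmp_dir:
    for i, df in enumerate(dfs, 1):
        path = Path(tmp_dir) / f'dataframe_{i}.json'
        df.to_json(path, orient='records')

        # Read the file back and compare values with the original
        restored = pd.read_json(path, orient='records')
        same = (restored['Name'].tolist() == df['Name'].tolist()
                and restored['Age'].tolist() == df['Age'].tolist())
        checks.append(same)

print(checks)
```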

Conclusion

Converting Pandas DataFrames to JSON is a core skill in data manipulation and export. With the to_json() method, you can customize the output to suit your specific needs. These examples cover a range of scenarios, from basic conversions to handling nested data structures and large DataFrames. Mastering these techniques will make it easier to move data between Pandas and other tools.

More articles on Pandas DataFrames are available on this site. Thank you for reading this one.
