In this article, we’ll look in detail at how to create a Python pandas DataFrame from different kinds of data, using practical code examples with explanations.
What is DataFrame in Python Pandas?
- It is a 2D (two-dimensional) labeled data structure.
- It has rows and columns. Rows have index labels (integers or other labels such as strings), which we can use to access or modify the data.
- Its size is mutable, so we can add or remove rows and columns. It can hold heterogeneous tabular data, meaning different columns can have different data types.
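The properties above can be seen in a small sketch. The column names and values below are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data: a text column and an integer column (heterogeneous types).
df = pd.DataFrame({"name": ["Zeeshan", "Furqan"], "age": [26, 16]})

# Access a single value by its row label and column label.
first_age = df.loc[0, "age"]
print(first_age)

# The size is mutable: add a new column, then a new row.
df["city"] = ["Havelian", "Kabul"]
df.loc[2] = ["Saqib", 30, "Lahore"]

print(df.shape)  # (number of rows, number of columns)
print(df)
```

Here `.loc` is used both to read a value by label and to add a row under a new label.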
Syntax of Pandas DataFrame
pd.DataFrame(a)
We can pass a Python dictionary, a list, or another data structure to the pandas DataFrame constructor to create a DataFrame from it. If we pass nothing, an empty DataFrame is created.
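Besides the data itself, the constructor also accepts optional `columns` and `index` arguments for custom labels. The following is a minimal sketch with made-up data and label names, not part of the examples below.

```python
import pandas as pd

# Hypothetical data: two rows, two columns.
data = [[1, "a"], [2, "b"]]

# `columns` names the columns and `index` names the rows.
df = pd.DataFrame(data, columns=["num", "letter"], index=["row1", "row2"])

print(df)
print(df.loc["row2", "letter"])  # look up a cell by its row and column labels
```

If these arguments are left out, pandas falls back to integer labels starting at 0, which is what the examples in this article show.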
7 Examples Implementing Python Pandas Dataframe
Ex.1 Creating an Empty DataFrame
# pip install pandas (run this in your terminal if pandas is not installed)
import pandas as pd

df = pd.DataFrame()
print(df)
Output
Empty DataFrame
Columns: []
Index: []
No arguments were passed to the DataFrame constructor, so it created an empty DataFrame with no columns and no index.
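If you want to check in code whether a DataFrame is empty, a small sketch using the `empty` and `shape` attributes could look like this:

```python
import pandas as pd

df = pd.DataFrame()

# `empty` is True when the DataFrame holds no elements.
print(df.empty)

# `shape` is a (rows, columns) tuple.
print(df.shape)
```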
Ex.2 Creating a Pandas DataFrame with a single column using List
dfL = pd.DataFrame([2, 5, 3])
print(dfL)
Output
   0
0  2
1  5
2  3
Note: In Jupyter Notebook, you can also display a DataFrame by writing just the variable name on the last line of a cell, which renders it as a formatted table.
Ex.3 Creating a Pandas DataFrame with multiple columns using List
dfNL = pd.DataFrame([[2, 78, 5], [3, 34, 2], [4, 21]])
print(dfNL)
Output
   0   1    2
0  2  78  5.0
1  3  34  2.0
2  4  21  NaN
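We created a nested Python list, and pandas built a DataFrame with multiple columns from it, with each inner list becoming one row. The last inner list has one item fewer, so the missing cell is filled with NaN (not a number). Because NaN is a floating-point value, the other values in that column are shown as floats (5.0 and 2.0).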
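To find or handle such missing cells, pandas offers `isna()` and `fillna()`. The sketch below reuses the same nested list. Filling with 0 is only one possible choice and is an assumption about what you might want.

```python
import pandas as pd

dfNL = pd.DataFrame([[2, 78, 5], [3, 34, 2], [4, 21]])

# The last column is stored as float because it contains NaN.
print(dfNL.dtypes)

# isna() marks missing cells; sum() counts them per column.
print(dfNL.isna().sum())

# One option (an assumption, not the only fix): replace NaN with 0.
filled = dfNL.fillna(0)
print(filled)
```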
Ex.4 Using Dictionary to Create Pandas Dataframe
dc1 = {'name': ['Zeeshan', 'Furqan', 'Saqib']}
dfDc1 = pd.DataFrame(dc1)
print(dfDc1)
Output
      name
0  Zeeshan
1   Furqan
2    Saqib
The key of the dictionary is used as a column label and the values as data values. Let’s now give it multiple keys and see the output:
dc2 = {'name': ['Zee', 'Furqan'], 'age': ['26', '16']}
dfDc2 = pd.DataFrame(dc2)
print(dfDc2)
Output
     name age
0     Zee  26
1  Furqan  16
Just keep in mind that the lists of values must all have the same length. See below:
dc3 = {'name': ['Usman', 'Zain'], 'age': ['18']}
dfR = pd.DataFrame(dc3)
print(dfR)
Output
ValueError: All arrays must be of the same length
This raises a ValueError because the lists have different lengths. You can fix it by giving 'age' a scalar value, that is, just '18' without square brackets ('age':'18'). The scalar is repeated for every row, and the output will be like this:
    name age
0  Usman  18
1   Zain  18
Alternatively, you can add the missing item to the list passed to the key 'age'. This is the more appropriate fix when each row should have its own value.
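The error and both fixes can be reproduced in one short sketch. The second age value, '20', is made up for illustration.

```python
import pandas as pd

# Lists of unequal length raise a ValueError.
try:
    pd.DataFrame({"name": ["Usman", "Zain"], "age": ["18"]})
except ValueError as err:
    print("ValueError:", err)

# Fix 1: a scalar value is repeated for every row.
df_scalar = pd.DataFrame({"name": ["Usman", "Zain"], "age": "18"})
print(df_scalar)

# Fix 2: supply one value per row ('20' is a hypothetical value).
df_full = pd.DataFrame({"name": ["Usman", "Zain"], "age": ["18", "20"]})
print(df_full)
```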
Ex.5 List of Dictionaries to Create Python Pandas Dataframe
dLst = [{'country': 'Pakistan', 'city': 'Havelian'},
        {'country': 'Afghanistan', 'city': 'Kabul'}]
dfRes = pd.DataFrame(dLst)
print(dfRes)
Output
       country      city
0     Pakistan  Havelian
1  Afghanistan     Kabul
We created a list of Python dictionaries and passed it as an argument to the DataFrame constructor. Each dictionary becomes one row, and values that share a key go into the same column. If a dictionary contains a key that the others lack, a new column is created for it.
Let’s try adding an extra key/value pair to one of the dictionaries. See below:
'code':'22500' # key/value pair added to the first dictionary; you can add it to any other dictionary as well
Output
       country      city   code
0     Pakistan  Havelian  22500
1  Afghanistan     Kabul    NaN
You can see that a new column has been created for the new key. The second dictionary has no 'code' key, so its row shows NaN in that column.
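For reference, a complete version of the modified example might look like the sketch below. It reuses the dLst and dfRes names from this example.

```python
import pandas as pd

dLst = [
    {"country": "Pakistan", "city": "Havelian", "code": "22500"},
    {"country": "Afghanistan", "city": "Kabul"},  # no 'code' key in this dictionary
]
dfRes = pd.DataFrame(dLst)
print(dfRes)

# Check which rows are missing a value in the new column.
print(dfRes["code"].isna().tolist())
```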
Ex.6 Dataframe from dictionary having Series Values
dctSeries = {'Roll no.': pd.Series([433, 678, 231, 765]),
             'Age': pd.Series([21, 19, 26])}
dfResult = pd.DataFrame(dctSeries)
print(dfResult)
Output
   Roll no.   Age
0       433  21.0
1       678  19.0
2       231  26.0
3       765   NaN
In this code, we created a dictionary whose values are pandas Series. Unlike plain lists, Series of different lengths are allowed here: they are aligned on their index labels, so the shorter 'Age' Series gets NaN in the last row and its values are shown as floats.
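The alignment on index labels becomes easier to see when the Series carry custom labels. The sketch below uses made-up labels 'a', 'b' and 'c' to illustrate it.

```python
import pandas as pd

# Hypothetical labels: the second Series has no entry for 'a'.
s1 = pd.Series([433, 678, 231], index=["a", "b", "c"])
s2 = pd.Series([21, 19], index=["b", "c"])

df = pd.DataFrame({"Roll no.": s1, "Age": s2})
print(df)

# Values are matched by label, not by position.
print(df.loc["b", "Age"])
```

Row 'a' has no matching label in the second Series, so its 'Age' cell is NaN.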
Ex.7 Using a List of Python Tuples to Create DataFrame
lstTuples = [(6, 5, 4, 3, 4), (2, 1, 3, 9, 8), ('a', 'b', 'c', 'd', 'r')]
dfT = pd.DataFrame(lstTuples)
print(dfT)
Output
   0  1  2  3  4
0  6  5  4  3  4
1  2  1  3  9  8
2  a  b  c  d  r
This shows that we can also pass a list of Python tuples to the DataFrame constructor. Each tuple becomes one row.
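One thing to note about this example is that every column mixes integers and strings, so pandas stores each column with the generic object dtype. The sketch below checks this, then shows a hypothetical alternative layout in which each tuple is one record with consistent types per column. The records and column names are made up.

```python
import pandas as pd

lstTuples = [(6, 5, 4, 3, 4), (2, 1, 3, 9, 8), ("a", "b", "c", "d", "r")]
dfT = pd.DataFrame(lstTuples)

# Each column holds both ints and strings, so its dtype is object.
print(dfT.dtypes)

# Hypothetical records: one tuple per row, one consistent type per column.
records = [("Zeeshan", 26), ("Furqan", 16)]
dfRec = pd.DataFrame(records, columns=["name", "age"])
print(dfRec)
print(dfRec.dtypes)
```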
Your Task
Try it with other data and post the output in the comment section. Ask us if you run into any confusion or error while creating a pandas DataFrame. We’d be happy to resolve it.
Conclusion
In this article, we created pandas DataFrames from a list, a nested list, dictionaries, a list of dictionaries, a dictionary of Series, and a list of tuples. We hope the code examples help you understand how a DataFrame is built from each kind of data.
Do visit our other articles for other pandas methods. In later articles, we’ll explain reading and visualizing CSV files and more, so stay tuned. Thank you for reading.