Pandas is a powerful Python data analysis toolkit.
=> It is open source and provides a fast and efficient DataFrame object for data manipulation.
=> It reads and writes data in different formats: CSV, TSV, XML, JSON, zip, etc.
=> It is used for data pre-processing, such as handling missing values.
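As a rough illustration of the pre-processing point above, here is a minimal sketch of finding and handling missing values. The column names and numbers are made up for the example, and filling with the column mean is just one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values (NaN)
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [10.0, np.nan, 30.0],
    "age": [20.0, 25.0, np.nan],
})

# Count the NaN values in each column
missing_per_column = df.isna().sum()

# Replace NaN in "score" with the mean of that column
filled = df.fillna({"score": df["score"].mean()})

# Drop every row that contains at least one NaN
dropped = df.dropna()

print(missing_per_column)
print(filled)
print(dropped)
```

Whether to fill or drop depends on the data; both calls return a new DataFrame and leave df unchanged.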
Pandas Data Structures
.Series
It is a one-dimensional labeled homogeneous array.
| apples |
| 3 |
| 2 |
| 0 |
| 1 |
| oranges |
| 0 |
| 3 |
| 7 |
| 2 |
.DataFrame
It is a 2-dimensional labeled heterogeneous tabular structure.
Series + Series = DataFrame
| apples | oranges |
| 3 | 0 |
| 2 | 3 |
| 0 | 7 |
| 1 | 2 |
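The "Series + Series = DataFrame" idea can be sketched with the same apples and oranges values as the table above. The variable names are only for illustration.

```python
import pandas as pd

apples = pd.Series([3, 2, 0, 1], name="apples")
oranges = pd.Series([0, 3, 7, 2], name="oranges")

# Each Series becomes one column; rows are aligned on the shared index 0..3
fruit = pd.DataFrame({"apples": apples, "oranges": oranges})
print(fruit)
```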
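.Panel
It is a 3D labeled array. Panel was deprecated and has been removed from recent pandas versions.
So pandas has offered 3 types of data structure, of which Series and DataFrame are the ones in current use.
Note: NumPy arrays are used to implement pandas data objects.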
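One way to see the NumPy connection, at least for plain numeric data, is to ask a pandas object for its values as an array. This is only a small sketch; some dtypes may be stored differently.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])
df = pd.DataFrame({"x": [1.0, 2.0], "y": [3.0, 4.0]})

# to_numpy() returns the values as a NumPy array
print(type(s.to_numpy()))
print(df.to_numpy().shape)
```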
Install and import pandas
pip install pandas
import pandas as pd
You can check the installed version of pandas with pd.__version__.
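A minimal sketch of the version check, assuming pandas is already installed:

```python
import pandas as pd

# __version__ holds the installed pandas version as a string
print(pd.__version__)
```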

Let's learn each pandas data structure:
1.Series
import pandas as pd
data=[1,'one',-10,3.2,"Nepal"]
s1=pd.Series(data)
print(s1)
print(type(s1))
# an empty Series; the dtype is given explicitly so the result does not depend on the pandas version
empty_series=pd.Series([],dtype=object)
print(empty_series)
country_series=pd.Series(["Nepal","Australia","india","England"])
print(country_series)
output:
0 1
1 one
2 -10
3 3.2
4 Nepal
dtype: object
<class 'pandas.core.series.Series'>
Series([], dtype: object)
0 Nepal
1 Australia
2 india
3 England
dtype: object
import pandas as pd
s3=pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
# note: the number of index labels must equal the number of elements in the Series
print(s3)
s4=pd.Series([1,2,3,4],index=['a','b','c','d'],dtype=float)
print(s4,"\n")
s5=pd.Series(0.5)
print(s5 ,"\n")
s6=pd.Series(0.5,index=[1,2,3])
print(s6,"\n")
s7=pd.Series({"a":1,"b":2,"c":3})
# you can also create a Series from a dictionary
print(s7,"\n")
output:
a 1
b 2
c 3
d 4
e 5
dtype: int64
a 1.0
b 2.0
c 3.0
d 4.0
dtype: float64
0 0.5
dtype: float64
1 0.5
2 0.5
3 0.5
dtype: float64
a 1
b 2
c 3
dtype: int64
2.DataFrame
A pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).
import pandas as pd
empty_df=pd.DataFrame()
print(empty_df,'\n')
lst=['a','b','c']
df1=pd.DataFrame(lst)
print(df1,'\n')
lst2=[[1,2,3],[4,5,6],[7,8,9]]
df2=pd.DataFrame(lst2)
print(df2,'\n')
output:
Empty DataFrame
Columns: []
Index: []
0
0 a
1 b
2 c
0 1 2
0 1 2 3
1 4 5 6
2 7 8 9
Now we are going to create DataFrames from dictionaries
import pandas as pd
dict1={"ID":[1,2,3,4,5]}
df3=pd.DataFrame(dict1)
print(df3,'\n')
dict2={"ID":[1,2,3,4,5],"SN":[6,7,8,9,10]}
# note: all the lists must have the same length
df4=pd.DataFrame(dict2)
print(df4,'\n')
# now create one from a list of dictionaries
li_dict=[{"a":1,"b":2},{"a":5,"b":5,"c":10}]
df5=pd.DataFrame(li_dict)
print(df5,'\n')
# note: if the dictionaries do not have the same keys, pandas fills the missing values with NaN
# now make a DataFrame from a dictionary of Series
dic_series={
"ID":pd.Series([1,2,3,4,5]),
"SN":pd.Series([6,7,8,9,10])
}
df6=pd.DataFrame(dic_series)
print(df6,'\n')
#output:
ID
0 1
1 2
2 3
3 4
4 5
ID SN
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
a b c
0 1 2 NaN
1 5 5 10.0
ID SN
0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
We can also create a DataFrame from zip(), a list of tuples, etc.
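A short sketch of both options follows. The sample values and column names are made up for illustration.

```python
import pandas as pd

ids = [1, 2, 3]
names = ["Ram", "Sita", "Hari"]  # hypothetical sample values

# zip() pairs the lists element by element; list() turns the pairs into rows
df_zip = pd.DataFrame(list(zip(ids, names)), columns=["ID", "name"])
print(df_zip, "\n")

# A list of tuples: each tuple is one row
rows = [(1, "Nepal"), (2, "India")]
df_tuples = pd.DataFrame(rows, columns=["ID", "country"])
print(df_tuples)
```

In both cases each tuple becomes a row, so the columns argument supplies the column labels.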
CSV File:
CSV is a file extension; its full form is "comma-separated values". The CSV format stores tabular data as plain text.
Advantages of CSV file:
1.universal
2.easy to understand
3.quick to create
How to read a CSV file in pandas
import pandas as pd
import os
# help(pd.read_csv)
# uncomment the statement above to learn more about read_csv
cwd=os.getcwd()
# df=pd.read_csv('test.csv')
df=pd.read_csv(f'{cwd}/test.csv')
print(type(df))
print(df.columns)
#output:
<class 'pandas.core.frame.DataFrame'>
Index(['id', 'name', 'address'], dtype='object')
df=pd.read_csv('locations',nrows=1)
# read only the first data row
df1=pd.read_csv('locations',usecols=[0])
# read only the column at index 0
df1=pd.read_csv('locations',usecols=[0,1])
# read the columns at index 0 and 1
df1=pd.read_csv('locations',usecols=[1,3,5])
# read the columns at index 1, 3 and 5
# to skip rows at the start of the file:
df1=pd.read_csv('locations',skiprows=1)
# skiprows=1 skips the first line of the file, so the second line is read as the header
'''
to skip specific rows, pass a list with the
line indices of those rows'''
df1=pd.read_csv('locations',skiprows=[0,5])
'''
by default the row index is 0,1,2,...; to use one of the columns
as the index instead, pass index_col
'''
df1=pd.read_csv('locations',index_col='ID')
# or give the column position directly, e.g. index_col=2
# header, prefix and names
df1=pd.read_csv('test.csv',header=1)
# header is the row number to use for the column names; use header=None when the file has no header row
df2=pd.read_csv('test.csv')
# add the prefix 'Columns' to every column name
df2.columns = ['Columns' + str(col) for col in df2.columns]
'''
to give each column a specific name, use names
'''
df3=pd.read_csv('test.csv',header=0,names=['sn','name','address'])
print(df3)
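To try these options without a file on disk, here is a small self-contained sketch. The CSV text is hypothetical, and io.StringIO stands in for a real file such as test.csv.

```python
import io
import pandas as pd

# Hypothetical file contents; StringIO stands in for a file on disk
csv_text = "id,name,address\n1,Ram,Kathmandu\n2,Sita,Pokhara\n3,Hari,Lalitpur\n"

# Only the first data row
first_row = pd.read_csv(io.StringIO(csv_text), nrows=1)

# Only the columns at positions 0 and 1
two_cols = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1])

# Use the "id" column as the row index
by_id = pd.read_csv(io.StringIO(csv_text), index_col="id")

# Replace the header row with our own column names
renamed = pd.read_csv(io.StringIO(csv_text), header=0, names=["sn", "name", "address"])

# Skip the header line and let pandas label the columns 0, 1, 2
no_header = pd.read_csv(io.StringIO(csv_text), skiprows=1, header=None)

for frame in (first_row, two_cols, by_id, renamed, no_header):
    print(frame, "\n")
```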
How to write a CSV file in pandas
Writing a CSV file with pandas is mainly used in data processing, to save cleaned raw data and any useful results.
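A minimal sketch of writing CSV with DataFrame.to_csv follows. The data is made up, and the example writes to an in-memory buffer so that it runs without touching the disk.

```python
import io
import pandas as pd

# Hypothetical cleaned data
df = pd.DataFrame({"id": [1, 2], "name": ["Ram", "Sita"]})

# index=False leaves out the 0,1,2,... row index column
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a path instead of a buffer, for example df.to_csv("output.csv", index=False) with a file name of your choice, writes the same text to a file.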