Hi everyone, welcome back. Pandas is a library for the Python programming language that is commonly used in data science. The name is derived from “panel data” and is also read as “Python data analysis”. Pandas is built specifically for working with data sets and provides functions for loading, cleaning, exploring, and manipulating data. To use Pandas on your machine, you will need both Python and Pandas installed. A getting-started guide for Pandas can be found here.
Pandas makes many data-related tasks easier for developers and data scientists, largely through dataframes and the operations it provides for manipulating them. The structure of a Pandas dataframe is similar to that of a 2D array or a table, with rows and columns.
Creating DataFrames
Let’s see an example of how to create a dataframe:
import pandas as pd
data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"]
}
myDataFrame = pd.DataFrame(data)
print(myDataFrame)
Output:
   drinks      food
0    coke      soup
1   juice     salad
2   water  sandwich
As the output shows, printing the dataframe displays our data along with the column headers at the top and the index number of each row on the left.
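To make the parts of that output a bit more concrete, here is a small sketch that inspects the same dataframe. The attributes columns, index, and shape are standard Pandas; the variable names headers, row_labels, n_rows, and n_cols are just picked for this illustration.

```python
import pandas as pd

data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"],
}
myDataFrame = pd.DataFrame(data)

# Column headers come from the dictionary keys
headers = list(myDataFrame.columns)

# With no index given, pandas numbers the rows starting at 0
row_labels = list(myDataFrame.index)

# shape is (number of rows, number of columns)
n_rows, n_cols = myDataFrame.shape

print(headers)          # ['drinks', 'food']
print(row_labels)       # [0, 1, 2]
print(n_rows, n_cols)   # 3 2
```

Each dictionary key turned into a column, and each list supplied that column's values, which is why the lists need to have the same length.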
Loc Property
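Now let’s say that we only want to get one row of items. We can do this by using the “loc” property and specifying the row’s index label. Since we did not set an index of our own, the labels are the default numbers 0, 1, and 2.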
import pandas as pd
data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"]
}
myDataFrame = pd.DataFrame(data)
print(myDataFrame.loc[0])
Output:
drinks    coke
food      soup
Name: 0, dtype: object
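Our output looks a bit different now. A single row comes back as a Pandas series, so our headers are no longer at the top but on the left side, each with its associated item next to it. The last line shows the label of the row we asked for (Name: 0) and the data type of the values.
What do we do if we want to get more than one row of items within a dataframe? We can pass a list of index labels to the “loc” property, which gives us back a dataframe containing only those rows: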
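The change in layout happens because a single row is returned as a different kind of object. The sketch below checks this on the same data; first_row is just an illustrative variable name.

```python
import pandas as pd

data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"],
}
myDataFrame = pd.DataFrame(data)

first_row = myDataFrame.loc[0]

# A single label returns a Series, not a DataFrame
print(type(first_row).__name__)   # Series

# The column headers become the index of the Series
print(list(first_row.index))      # ['drinks', 'food']

# Individual items can be read by header
print(first_row['food'])          # soup
```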
import pandas as pd
data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"]
}
myDataFrame = pd.DataFrame(data)
print(myDataFrame.loc[[0, 1]])
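Unlike the single-row case, passing a list returns a dataframe, and the values in the list are matched against index labels rather than row positions. Here is a minimal sketch of that behavior, assuming hypothetical row labels "mon", "tue", and "wed" that are not part of the example above:

```python
import pandas as pd

data = {
    'drinks': ["coke", "juice", "water"],
    'food': ["soup", "salad", "sandwich"],
}

# Hypothetical row labels, chosen only for this illustration
labeled = pd.DataFrame(data, index=["mon", "tue", "wed"])

# A list of labels returns a DataFrame with just those rows
subset = labeled.loc[["mon", "wed"]]

print(type(subset).__name__)    # DataFrame
print(list(subset.index))       # ['mon', 'wed']
print(list(subset['drinks']))   # ['coke', 'water']
```

With the default index the labels happen to be 0, 1, 2, which is why loc[[0, 1]] in the example above selects the first two rows.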