Today we will cover two techniques we need before moving on to more advanced Pandas topics: reindexing and element deletion.
It will be a bit shorter than the first two articles in the series, but that doesn't mean it's not important. Both techniques are very useful, and you will probably use them in your day-to-day work if you become a Pandas practitioner.
Good, let's get started!
Reindexing is a fancy word for creating a new dataframe/series with an altered index.
import pandas as pd

ser = pd.Series([2, 1, 3, 4, 7, 6, 5], index=['b', 'a', 'c', 'd', 'g', 'f', 'e'])
print(ser)

b    2
a    1
c    3
d    4
g    7
f    6
e    5
dtype: int64
The reindex method takes a list of index labels and returns a new series (or dataframe) whose elements/rows follow the order specified in that list.
For example, we can create a new series whose values appear in ascending order by passing the index labels in alphabetical order:
ordered_ser = ser.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
print(ordered_ser)

a    1
b    2
c    3
d    4
e    5
f    6
g    7
dtype: int64
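Since reindex creates a new object, the original series should be left untouched. The following is a minimal sketch that checks this, reusing the same data as the example above:

```python
import pandas as pd

ser = pd.Series([2, 1, 3, 4, 7, 6, 5], index=['b', 'a', 'c', 'd', 'g', 'f', 'e'])
ordered_ser = ser.reindex(['a', 'b', 'c', 'd', 'e', 'f', 'g'])

# The original series keeps the order it was created with
print(list(ser.index))          # ['b', 'a', 'c', 'd', 'g', 'f', 'e']

# The reindexed series follows the order we asked for
print(list(ordered_ser.index))  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```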
You don't need to pass every element in the original index. You can provide a list with only the elements you need:
# This will create a new series with the last four elements, in descending order
ordered_ser = ser.reindex(['g', 'f', 'e', 'd'])
print(ordered_ser)

g    7
f    6
e    5
d    4
dtype: int64
Sometimes you want to reindex a series/dataframe to expand the range of its index. Any label that doesn't exist in the original index will be set to NaN:
ser = pd.Series(['azul', 'rojo', 'verde'], index=[0, 4, 8])
ser.reindex(range(12))

0      azul
1       NaN
2       NaN
3       NaN
4      rojo
5       NaN
6       NaN
7       NaN
8     verde
9       NaN
10      NaN
11      NaN
dtype: object

# In this case, you can specify a fill method to dictate what will happen to the empty entries
# ffill, for example, performs a forward fill
ser.reindex(range(12), method='ffill')

0      azul
1      azul
2      azul
3      azul
4      rojo
5      rojo
6      rojo
7      rojo
8     verde
9     verde
10    verde
11    verde
dtype: object
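Forward fill is not the only option. As a small sketch of two alternatives that reindex also accepts, the example below uses fill_value to fill the gaps with a constant and method='bfill' to fill backward. The placeholder string 'sin color' is made up for this illustration:

```python
import pandas as pd

ser = pd.Series(['azul', 'rojo', 'verde'], index=[0, 4, 8])

# fill_value replaces every missing entry with the same constant
# ('sin color' is just a placeholder chosen for this example)
constant_fill = ser.reindex(range(12), fill_value='sin color')
print(constant_fill)

# bfill (backward fill) copies the next valid value instead of the previous one;
# labels 9 to 11 have no later value to copy, so they stay as NaN
backward_fill = ser.reindex(range(12), method='bfill')
print(backward_fill)
```

Note that the fill methods (ffill, bfill) expect the original index to be sorted, as it is here.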
Dataframes behave pretty much the same way, but they also let you reindex by column. Let's take a look at a final reindexing example using a dataframe:
import numpy as np

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['First', 'Second', 'Third', 'Fourth'],
                     columns=['Alpha', 'Beta', 'Gamma', 'Delta'])
frame

# We can reindex using the row index
frame.reindex(['Fourth', 'Second'])

# Or, reindex using the columns
frame.reindex(columns=['Alpha', 'Gamma'])
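Rows and columns can also be reindexed in a single call by passing both the index and columns keywords. A minimal sketch, rebuilding the same frame so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['First', 'Second', 'Third', 'Fourth'],
                     columns=['Alpha', 'Beta', 'Gamma', 'Delta'])

# Keep only two rows (in the given order) and two columns
subset = frame.reindex(index=['Fourth', 'Second'], columns=['Alpha', 'Gamma'])
print(subset)
```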
Now we will learn how to remove elements from both series and dataframes. This is usually achieved using the drop method.
Note that calls to drop don't alter the original series/dataframe. Instead, they return a new one without the specified elements. If for some reason you need to alter the original series/dataframe, you can pass inplace=True as an argument.
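To make the difference concrete, here is a small sketch comparing the default behaviour with inplace=True. The variable names are only illustrative:

```python
import pandas as pd

ser = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# By default, drop returns a new series and leaves ser alone
without_b = ser.drop('b')
print(len(ser), len(without_b))

# With inplace=True, ser itself is modified and the call returns None
result = ser.drop('b', inplace=True)
print(result)
print(list(ser.index))
```

Because the in-place call returns None, avoid writing ser = ser.drop('b', inplace=True), which would replace the series with None.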
ser = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(ser)

a    1
b    2
c    3
d    4
dtype: int64

# You can pass to drop the index value of the element you want to delete
ser.drop('b')

a    1
c    3
d    4
dtype: int64

# You can also pass a list of index values
ser.drop(['a', 'c'])

b    2
d    4
dtype: int64
Dataframes let you drop elements using both the row index and the column index.
# Let's drop the second and fourth rows
frame.drop(['Second', 'Fourth'])

# If you add an additional argument set to axis='columns' (or axis=1) you will drop using the column index
# Let's get rid of the Alpha and Beta columns
frame.drop(['Alpha', 'Beta'], axis='columns')
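As an alternative to the axis argument, drop also accepts index and columns keywords, which some people find easier to read. A minimal sketch, rebuilding the same frame so the snippet is self-contained:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['First', 'Second', 'Third', 'Fourth'],
                     columns=['Alpha', 'Beta', 'Gamma', 'Delta'])

# Same result as frame.drop(['Alpha', 'Beta'], axis='columns')
no_alpha_beta = frame.drop(columns=['Alpha', 'Beta'])
print(no_alpha_beta)

# Rows and columns can be dropped in a single call
trimmed = frame.drop(index=['Second', 'Fourth'], columns=['Alpha', 'Beta'])
print(trimmed)
```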
When exploring data, you will need to alter indexes and delete rows with elements you don't need. As with all previous articles, I'd like to encourage you to practice these techniques on your own until you feel comfortable with them.
In the next article, we will learn how to perform arithmetic operations with dataframes and series.
Thank you for reading!
What to do next
- Share this article with friends and colleagues. Thank you for helping me reach people who might find this information useful.
- You can find the source code for this series in this repo.
- This article is based on Python for Data Analysis. These and other very helpful books can be found in the recommended reading list.
- Send me an email with questions, comments or suggestions (it's in the About Me page)