Cornellius Yudha Wijaya
2025-01-21 08:00:00
www.kdnuggets.com

Let’s learn how to perform grouping and aggregation in Pandas.
Preparation
We need the pandas package installed. If it is not already available, we can install it with pip:
pip install pandas
With the package installed, let’s jump into the article.
Data Grouping and Aggregation with Pandas
Raw data is often too large and complex to take in at a glance. That is why we group and aggregate it to get concise information: a single number, or a small set of values per group, is often easier to interpret than the whole data set.
Let’s try to perform data grouping. First, we create a sample dataset.
import pandas as pd
df = pd.DataFrame({
    'Fruit': ['Banana', 'Orange', 'Banana', 'Orange', 'Banana'],
    'Size': ['Small', 'Small', 'Large', 'Large', 'Small'],
    'Price': [100, 150, 200, 50, 300]})
We can use the groupby function to group the data.
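As a minimal sketch of what a single-column call might look like (grouping on `'Fruit'` is an assumption here, and `group_sizes` is a name introduced only for illustration), note that `groupby` returns a lazy `DataFrameGroupBy` object rather than a new table. Nothing is computed until we iterate over it or aggregate it.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruit': ['Banana', 'Orange', 'Banana', 'Orange', 'Banana'],
    'Size': ['Small', 'Small', 'Large', 'Large', 'Small'],
    'Price': [100, 150, 200, 50, 300]})

# Grouping by one column returns a DataFrameGroupBy object, not a table.
grouped = df.groupby('Fruit')

# Iterating yields (key, sub-DataFrame) pairs, one per distinct fruit.
group_sizes = {}
for fruit, rows in grouped:
    group_sizes[fruit] = len(rows)

print(group_sizes)
```

Inspecting the groups this way is a quick check that the keys are what we expect before applying any aggregation.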
It’s also possible to group the data with multiple columns.
df.groupby(['Fruit', 'Size'])
That’s all for data grouping. Next, we apply an aggregation function to the grouped data. For example, we can group by multiple columns and sum the values within each group.
df.groupby(['Fruit', 'Size']).sum()
Output:
              Price
Fruit  Size
Banana Large    200
       Small    400
Orange Large     50
       Small    150
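The grouped sum is indexed by the group keys, which form a two-level index. If a flat table with ordinary columns is easier to work with, one possible approach is to call `reset_index()` on the result. The sketch below rebuilds the sample data so it runs on its own; the variable name `flat` is introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruit': ['Banana', 'Orange', 'Banana', 'Orange', 'Banana'],
    'Size': ['Small', 'Small', 'Large', 'Large', 'Small'],
    'Price': [100, 150, 200, 50, 300]})

# Sum Price per (Fruit, Size) pair, then move the group keys
# from the index back into regular columns.
flat = df.groupby(['Fruit', 'Size'])['Price'].sum().reset_index()

print(flat)
```

The flat form is usually more convenient when the result is going to be merged with another table or exported.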
We can also perform multiple aggregations of our grouped data.
df.groupby(['Fruit', 'Size']).agg(['sum', 'mean', 'count'])
Output:
              Price
                sum   mean count
Fruit  Size
Banana Large    200  200.0     1
       Small    400  200.0     2
Orange Large     50   50.0     1
       Small    150  150.0     1
If required, we can apply different aggregation methods to different columns by passing a dictionary that maps each column name to a list of aggregations.
aggs = {
    'Price': ['sum', 'mean'],
    'Size': ['count']
}
df.groupby('Fruit').agg(aggs)
Output:
       Price         Size
         sum   mean count
Fruit
Banana   600  200.0     3
Orange   200  100.0     2
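The dictionary form produces two-level column headers. As an alternative sketch, pandas also supports named aggregation, where each keyword argument to `agg` is an output column name paired with a `(column, function)` tuple. The output names below (`total_price`, `mean_price`, `n_rows`) are made up for illustration and are not from the original article.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruit': ['Banana', 'Orange', 'Banana', 'Orange', 'Banana'],
    'Size': ['Small', 'Small', 'Large', 'Large', 'Small'],
    'Price': [100, 150, 200, 50, 300]})

# Named aggregation: keyword = (source column, aggregation).
# The result has flat, self-describing column names.
summary = df.groupby('Fruit').agg(
    total_price=('Price', 'sum'),
    mean_price=('Price', 'mean'),
    n_rows=('Size', 'count'),
)

print(summary)
```

This computes the same numbers as the dictionary version, only with single-level column labels that are easier to reference later.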
We can also write our own aggregation function and apply it to the grouped data.
def maxminrange(series):
    # Spread between the largest and smallest value in the group
    return series.max() - series.min()
df.groupby('Fruit')['Price'].agg(maxminrange)
Output:
Fruit
Banana    200
Orange    100
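A custom function can also be mixed with built-in aggregation names in a single `agg` call. The sketch below, which redefines the sample data and `maxminrange` so it runs on its own, places the range next to the minimum and maximum it is derived from. The variable name `stats` is introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Fruit': ['Banana', 'Orange', 'Banana', 'Orange', 'Banana'],
    'Size': ['Small', 'Small', 'Large', 'Large', 'Small'],
    'Price': [100, 150, 200, 50, 300]})

def maxminrange(series):
    # Spread between the largest and smallest value in the group
    return series.max() - series.min()

# Built-in names and a custom function side by side; the custom
# column is labelled with the function's name.
stats = df.groupby('Fruit')['Price'].agg(['min', 'max', maxminrange])

print(stats)
```

Seeing the three columns together makes it easy to confirm that the custom function behaves as intended for every group.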
That’s how you perform advanced grouping and aggregation. Mastering these techniques will help you immensely during data analysis.
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.