2025-05-18 11:56:00
github.com
Buckaroo is a modern data table for Jupyter that expedites the most common exploratory data analysis tasks. The most basic data analysis task, looking at the raw data, is cumbersome with the existing pandas tooling. Buckaroo starts with a modern, performant data table that is sortable, has value formatting, and scrolls infinitely. On top of the core table experience, extra features like summary stats, histograms, smart sampling, auto-cleaning, and a low code UI are added. All of the functionality has sensible defaults that can be overridden to customize the experience for your workflow.
Play with Buckaroo without any installation.
Full Tour
Run pip install buckaroo, then restart your Jupyter server.
The following code shows Buckaroo on a simple dataframe
import pandas as pd
import buckaroo
pd.DataFrame({'a': [1, 2, 10, 30, 50, 60, 50], 'b': ['foo', 'foo', 'bar', pd.NA, pd.NA, pd.NA, pd.NA]})
When you run import buckaroo in a Jupyter notebook, Buckaroo becomes the default display method for Pandas and Polars DataFrames.
Buckaroo works in the following notebook environments:
- jupyter lab (version >=3.6.0)
- jupyter notebook (version >=7.0)
- Marimo
- VS Code notebooks (with extra install)
- Jupyter Lite
- Google colab
Buckaroo works with the following DataFrame libraries:
- pandas (version >=1.3.5)
- polars (optional)
- geopandas (optional; deprecated, if you are interested in geopandas, please get in touch)
Buckaroo has extensive docs and tests. The best way to learn about the system is from the feature example videos on YouTube.
The interactive styling gallery lets you see different styling configurations. You can live edit code and play with different configs.
The following examples are loaded into a jupyter lite environment with Buckaroo installed.
The core data grid of Buckaroo is based on AG-Grid. The grid loads thousands of cells in less than a second, with highly customizable display, formatting, and scrolling. Data is loaded lazily into the browser as you scroll and is serialized with parquet. You no longer have to use df.head() to poke at portions of your data.
By default numeric columns are formatted to use a fixed width font and commas are added. This allows quick visual confirmation of magnitudes in a column.
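To make the effect of this formatting concrete, here is a minimal sketch in plain Python. It is not Buckaroo's formatter; the helper name `format_numeric_column` is hypothetical and only illustrates how comma separators plus right alignment (as in a fixed-width font) make magnitudes easy to compare.

```python
def format_numeric_column(values):
    """Hypothetical helper: comma-separate integers and right-align them."""
    formatted = [f"{v:,}" for v in values]
    # Pad every entry to the width of the longest one so digits line up
    width = max(len(text) for text in formatted)
    return [text.rjust(width) for text in formatted]


for line in format_numeric_column([1200, 35, 1000000]):
    print(line)
```

With the values aligned on the right, a seven-digit value stands out from a two-digit one at a glance.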
Histograms for every column give you a very quick overview of the distribution of values, including uniques and N/A.
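The information such a per-column histogram conveys can be approximated with plain pandas and NumPy. The sketch below is an illustration under that assumption, not Buckaroo's implementation; `column_overview` and its return keys are hypothetical names.

```python
import numpy as np
import pandas as pd


def column_overview(ser: pd.Series, bins: int = 3) -> dict:
    """Hypothetical helper: histogram counts plus N/A and unique counts."""
    non_null = ser.dropna()
    if pd.api.types.is_numeric_dtype(non_null):
        # Equal-width bins over the observed range of the column
        counts = np.histogram(non_null, bins=bins)[0].tolist()
    else:
        # For non-numeric columns, count occurrences of each value
        counts = non_null.value_counts().tolist()
    return {
        "histogram": counts,
        "na_count": int(ser.isna().sum()),
        "unique_count": int(non_null.nunique()),
    }


df = pd.DataFrame({'a': [1, 2, 10, 30, 50, 60, 50],
                   'b': ['foo', 'foo', 'bar', pd.NA, pd.NA, pd.NA, pd.NA]})
print(column_overview(df['a']))
print(column_overview(df['b']))
```

On the example dataframe from earlier in this README, column b reports four missing values and two distinct values, which is the kind of fact the histogram view surfaces without extra code.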
The summary stats view can be toggled by clicking on the 0 below the Σ icon. Summary stats are similar to df.describe and extensible.
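For comparison, this is what the plain pandas equivalent looks like, with one extra row added by hand. It is only a sketch of the idea of extending a describe-style table; the row name distinct_count is an illustrative choice, not a Buckaroo stat name.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 10, 30, 50, 60, 50]})

# count, mean, std, min, quartiles, and max for each numeric column
stats = df.describe()

# Extend the table with an additional statistic (illustrative row name)
stats.loc["distinct_count"] = df.nunique()

print(stats)
```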
All of the data visible in the table (the rows shown) is sortable by clicking on a column name; further clicks change the sort direction, then disable sorting for that column. Because extreme values are included with the sample rows, you can see outlier values too.
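One way to picture sampling that keeps extreme values is sketched below with plain pandas. This is an assumption about the general technique, not a description of how Buckaroo samples; `sample_with_extremes` and its parameters are hypothetical, and the sketch assumes a unique, sortable index.

```python
import pandas as pd


def sample_with_extremes(df, n, n_extreme=1, seed=0):
    """Hypothetical helper: random rows plus the rows that hold the
    smallest and largest values of every numeric column."""
    keep = set(df.sample(n=min(n, len(df)), random_state=seed).index)
    for col in df.select_dtypes("number"):
        keep.update(df[col].nsmallest(n_extreme).index)
        keep.update(df[col].nlargest(n_extreme).index)
    # Return the kept rows in their original order
    return df.loc[sorted(keep)]


big = pd.DataFrame({"v": range(1000)})
sampled = sample_with_extremes(big, n=10)
print(sampled["v"].min(), sampled["v"].max())
```

Sorting the sampled rows then still shows the true minimum and maximum of each numeric column, even though most rows were left out.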
Search is built into Buckaroo so you can quickly find the rows you are looking for.
Buckaroo has a simple low code UI with Python code gen. This view can be toggled by clicking the checkbox below the λ (lambda) icon.
Select a cleaning method from the status bar. Buckaroo has heuristic autocleaning: the autocleaning system inspects each column and runs statistics to decide whether a cleaning method should be applied to it (parsing as dates, stripping non-integer characters and treating the result as an integer, parsing implied booleans such as “yes” and “no” to booleans), then adds those cleaning operations to the low code UI. Different cleaning methods can be tried because dirty data isn’t deterministic and multiple approaches could reasonably apply in any situation.
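The general shape of such a heuristic can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the function name `suggest_cleaning`, the 0.8 threshold, the operation names, and the ISO-only date check are hypothetical and are not taken from Buckaroo's code.

```python
import re

BOOL_WORDS = {"yes", "no", "true", "false"}
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def suggest_cleaning(values, threshold=0.8):
    """Hypothetical heuristic: name a cleaning operation for one column,
    or 'none'. None marks a missing value in this sketch."""
    present = [str(v).strip() for v in values if v is not None]
    if not present:
        return "none"

    def share(predicate):
        # Fraction of non-missing values that satisfy the predicate
        return sum(1 for v in present if predicate(v)) / len(present)

    if share(lambda v: v.lower() in BOOL_WORDS) >= threshold:
        return "parse_bool"
    if share(lambda v: ISO_DATE.fullmatch(v) is not None) >= threshold:
        return "parse_date"
    # Values that still contain digits once other characters are stripped
    if share(lambda v: re.sub(r"\D", "", v) != "") >= threshold:
        return "strip_to_int"
    return "none"


print(suggest_cleaning(["yes", "no", "Yes", None, "no"]))
print(suggest_cleaning(["1", "2", "x", "4", "5", "6"]))
print(suggest_cleaning(["2024-01-05", "2023-12-31"]))
```

Changing the threshold or the order of the checks gives a different cleaning method, which is why offering several methods to choose from is useful.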
Buckaroo summary stats are built on the Pluggable Analysis Framework that allows individual summary stats to be overridden, and new summary stats to be built in terms of existing summary stats. Care is taken to prevent errors in summary stats from preventing display of a dataframe.
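The two ideas in this paragraph, stats built from other stats and errors that do not block display, can be illustrated with a small plain Python sketch. It does not use or reproduce the Pluggable Analysis Framework API; `run_stats` and the stat names are hypothetical.

```python
def run_stats(values, stat_funcs):
    """Run stat functions in insertion order. Each one receives the values
    and the results computed so far. A failing stat is recorded as None
    instead of aborting the whole run."""
    results = {}
    for name, func in stat_funcs.items():
        try:
            results[name] = func(values, results)
        except Exception:
            results[name] = None
    return results


base_stats = {
    "length": lambda vals, prev: len(vals),
    "total": lambda vals, prev: sum(vals),
    # Built in terms of the two stats above
    "mean": lambda vals, prev: prev["total"] / prev["length"],
    # Deliberately failing stat, to show that the others still come back
    "broken": lambda vals, prev: 1 / 0,
}

# Overriding a single stat leaves the rest of the set untouched
custom_stats = {**base_stats, "mean": lambda vals, prev: round(prev["total"] / prev["length"])}

summary = run_stats([1, 2, 10, 30, 50, 60, 50], base_stats)
custom_summary = run_stats([1, 2, 10, 30, 50, 60, 50], custom_stats)
print(summary)
print(custom_summary)
```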
Buckaroo can automatically clean dataframes to remove common data errors (a single string in a column of ints, recognizing date times…). This feature is in beta. You can access it by invoking buckaroo as BuckarooWidget(df, auto_clean=True)
For a development installation:
git clone https://github.com/paddymul/buckaroo.git
cd buckaroo
# We need to build against JupyterLab 3.6.5; JupyterLab 4.0 has different JS typing that conflicts.
# The installable still works in JupyterLab 4.
pip install build twine pytest sphinx polars mypy jupyterlab==3.6.5 pandas-stubs geopolars pyarrow
pip install -ve .
Enabling development install for Jupyter notebook:
Enabling development install for JupyterLab:
jupyter labextension develop . --overwrite
Note for developers: the --symlink argument on Linux or OS X allows one to modify the JavaScript code in-place. This feature is not available with Windows.
There are a series of examples of the components in examples/ex.
Instructions
cd buckaroo
uv venv
source ~/buckaroo/.venv/bin/activate
uv sync -q
cd ~/buckaroo
uv add $PACKAGE_NAME
cd ~/buckaroo
uv add --group $GROUP_NAME --quiet $PACKAGE_NAME
update CHANGELOG.md
git commit -m "updated changelog for release $VERSION_NUMBER"
git tag $VERSION_NUMBER # no leading v in the version number
git push origin tag $VERSION_NUMBER
navigate to create new buckaroo release
Follow instructions
We ❤️ contributions.
Have you had a good experience with this project? Why not share some love and contribute code, or just let us know about any issues you had with it?
We welcome issue reports; be sure to choose the proper issue template for your issue, so that we can be sure you’re providing the necessary information.