Monday, June 9, 2025
Techcratic
Data Science ETL Pipelines with DuckDB


Cornellius Yudha Wijaya
2025-05-30 08:00:00
www.kdnuggets.com

Image by Author | Ideogram

 

ETL (Extract, Transform, Load) is a process that moves and prepares data for subsequent use, such as data analysis or machine learning modelling. ETL is a crucial activity for data scientists because it is how we acquire the data our work depends on.

Various tools are available to support the ETL process, and one of them is DuckDB. DuckDB is an open-source OLAP SQL database management system that runs in-process and is designed to handle analytical workloads efficiently, including fully in-memory processing. It suits data scientists working with anything from small files to large datasets.

Because building ETL pipelines is such a common part of data science work, it is worth understanding the process thoroughly. In this article, we will learn how to create an ETL pipeline using DuckDB.

 

Preparation

 
First, we will set up all the necessary components to simulate the ETL pipeline in a real-world data science project. All the code demonstrated in this article is also available in the GitHub repository.

The first thing we need is a dataset. In this example, we will use the data scientist salary data from Kaggle. For the data warehouse, we will use Motherduck, a cloud data warehouse powered by DuckDB. Register for a free account, then choose to create a table from a file, upload the data science salary data, and place the table in the my_db database (the code later in this article expects the table to be named ds_salaries).

Once this step is complete, you can query the dataset, and the result will be displayed as shown in the image below.

 
[Screenshot: query results for the salary dataset in Motherduck]
 

Once the database is ready, generate an access token in Motherduck; we will use it to connect to the cloud database.

Next, open your IDE, such as Visual Studio Code, to set up the pipeline environment. The first step is to create the virtual environment, which can be done using the following command.

python -m venv duckdb_venv

 

You can change the virtual environment name to any name you like. Activate the virtual environment (for example, source duckdb_venv/bin/activate on macOS or Linux, or duckdb_venv\Scripts\activate on Windows) before installing the required libraries. Then create a text file called requirements.txt and fill it with the following library names.

duckdb
pandas
pyarrow
python-dotenv

 

With the file ready, install the libraries required for the project using the command below.

pip install -r requirements.txt

 

Once every library is installed, set up the environment variable in a .env file. Create the file and add a line of the form MOTHERDUCK_TOKEN=<your token>, using the token you just acquired from Motherduck.

Now that the preparation is complete, let’s proceed with setting up the ETL pipeline using DuckDB.

 

ETL pipelines with DuckDB

 
Working with DuckDB is much like working with any SQL database, but with much simpler connectivity. We will use DuckDB's in-memory processing to run queries from the Python environment, and then load the data back into the Motherduck cloud database.

First, create a Python file that will contain the ETL pipeline. I made a file called etl_duckdb.py, but you can use different names if you prefer.

Inside the file, we will set up the data science ETL pipeline with DuckDB. First, we need to connect DuckDB to the cloud database to retrieve the necessary data.

import os
import duckdb
from dotenv import load_dotenv

load_dotenv()
MD_TOKEN = os.getenv("MOTHERDUCK_TOKEN")
con = duckdb.connect(f'md:?motherduck_token={MD_TOKEN}')
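One thing to watch in the connection code above: os.getenv returns None when the variable is missing, and the f-string would then send the literal text None as the token. A minimal sketch of a fail-fast check follows; get_motherduck_token is a hypothetical helper name, not part of DuckDB or python-dotenv, and the mapping passed in at the end is a stand-in for the real environment.

```python
import os


def get_motherduck_token(env=None):
    """Return the Motherduck token, or raise if it is missing or blank.

    `env` is any mapping; it defaults to the process environment.
    (Hypothetical helper, introduced here for illustration only.)
    """
    env = os.environ if env is None else env
    token = (env.get("MOTHERDUCK_TOKEN") or "").strip()
    if not token:
        raise RuntimeError(
            "MOTHERDUCK_TOKEN is not set; add it to .env or export it in the shell."
        )
    return token


# Usage with a stand-in mapping instead of the real environment:
print(get_motherduck_token({"MOTHERDUCK_TOKEN": "example-token"}))
```

In the pipeline file, the helper would be called after load_dotenv() and its return value used in place of MD_TOKEN.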

 

After that, we will create a schema named analytics to store the transformed tables.

con.sql("CREATE SCHEMA IF NOT EXISTS analytics;")

 

As you can see, operations in DuckDB are written as ordinary SQL queries. If you are already familiar with SQL, creating the pipeline will be much easier.

Next, we will extract the raw data into another table, simply to show that DuckDB can extract data from one table and load it into another.

con.sql("""
CREATE OR REPLACE TABLE raw_salaries AS
SELECT
    work_year,
    experience_level,
    employment_type,
    job_title,
    salary,
    salary_currency,
    salary_in_usd,
    employee_residence,
    remote_ratio,
    company_location,
    company_size
FROM my_db.ds_salaries;
""")

 

With the data prepared, we can perform any transformations and load the modified data for subsequent analysis.

For example, let’s compute the average salary by work year and experience level and load the result into the table analytics.avg_salary_year_exp.

con.sql("""
CREATE OR REPLACE TABLE analytics.avg_salary_year_exp AS
SELECT
    work_year,
    experience_level,
    ROUND(AVG(salary_in_usd), 2) AS avg_usd_salary
FROM raw_salaries
GROUP BY work_year, experience_level
ORDER BY work_year, experience_level;
""")

 

Let’s check the transformed data that we have loaded into the table using the following code.

con.sql("SELECT * FROM analytics.avg_salary_year_exp LIMIT 5").show()

 

The result is shown in the output below.

┌───────────┬──────────────────┬────────────────┐
│ work_year │ experience_level │ avg_usd_salary │
│   int64   │     varchar      │     double     │
├───────────┼──────────────────┼────────────────┤
│      2020 │ EN               │       57511.61 │
│      2020 │ EX               │      139944.33 │
│      2020 │ MI               │       87564.72 │
│      2020 │ SE               │       137240.5 │
│      2021 │ EN               │       54905.25 │
└───────────┴──────────────────┴────────────────┘

 

With DuckDB, the whole extract, transform, and load cycle fits in a handful of SQL statements run from Python.

DuckDB also works well alongside Pandas, so we can use DataFrame operations as part of the ETL process.

For example, we can convert the previous average salary data into a DataFrame object and transform it further.

df_avg = con.sql("SELECT * FROM analytics.avg_salary_year_exp").df()
df_avg["avg_salary_k"] = df_avg["avg_usd_salary"] / 1_000

 

We can inspect the transformed DataFrame using the code below.

print(df_avg.head())

 

The output is similar to the one below.

  work_year experience_level  avg_usd_salary  avg_salary_k
0       2020               EN        57511.61      57.51161
1       2020               EX       139944.33     139.94433
2       2020               MI        87564.72      87.56472
3       2020               SE       137240.50     137.24050
4       2021               EN        54905.25      54.90525

 

We can register the DataFrame above in DuckDB, which will then treat it as a table, using the code below.

con.register("pandas_avg_salary", df_avg)

 

The Pandas DataFrame is now ready for further processing; for example, we can transform the data and reload it into the cloud database.

con.sql("""
CREATE OR REPLACE TABLE analytics.avg_salary_year_exp_pandas AS
SELECT
  work_year,
  experience_level,
  avg_salary_k
FROM pandas_avg_salary
WHERE avg_salary_k > 100
ORDER BY avg_salary_k DESC
""")

 

You can see the result using the code below.

con.sql("SELECT * FROM analytics.avg_salary_year_exp_pandas LIMIT 5").show()

 

The output is shown below.

┌───────────┬──────────────────┬──────────────┐
│ work_year │ experience_level │ avg_salary_k │
│   int64   │     varchar      │    double    │
├───────────┼──────────────────┼──────────────┤
│      2023 │ EX               │    203.70568 │
│      2022 │ EX               │    188.26029 │
│      2021 │ EX               │      186.128 │
│      2023 │ SE               │    159.56893 │
│      2022 │ SE               │    147.65969 │
└───────────┴──────────────────┴──────────────┘

 

That’s all you need to develop a simple ETL pipeline for a data science project. Depending on the project requirements, you can extend the pipeline by automating it with a scheduler such as a cron job.
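For a scheduler to run the pipeline unattended, it helps if the script has a single entry point that logs each step and reports failure through its exit code. The following is a minimal sketch under that assumption; run_pipeline and the placeholder steps are hypothetical names, and in etl_duckdb.py each step would wrap the corresponding con.sql(...) calls.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("etl_duckdb")


def run_pipeline(steps):
    """Run (name, callable) steps in order.

    Returns 0 if every step succeeds, or 1 as soon as one step raises.
    (Hypothetical helper, introduced here for illustration only.)
    """
    for name, step in steps:
        log.info("starting step: %s", name)
        try:
            step()
        except Exception:
            # Log the traceback so the scheduler's log shows what went wrong.
            log.exception("step failed: %s", name)
            return 1
    log.info("pipeline finished")
    return 0


# Placeholder steps; real ones would run the extract, transform, and load queries.
steps = [
    ("extract", lambda: None),
    ("transform", lambda: None),
    ("load", lambda: None),
]
exit_code = run_pipeline(steps)
print("exit code:", exit_code)
```

In a real script, the value would be passed to sys.exit so that the scheduler can tell a failed run from a successful one.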

 

Conclusion

 
ETL, or Extract, Transform, Load, is a process that moves and prepares data for further usage. For a data scientist, ETL is useful for any work that requires data, such as data analysis or machine learning modelling.

In this article, we have learned how to create an ETL pipeline for data science work using DuckDB. We demonstrated how to extract data from a cloud database, transform it using SQL queries and Pandas DataFrames, and load it back into the cloud database.

I hope this has helped!
 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
