Techcratic

Data Science ETL Pipelines with DuckDB

AI by AI
May 30, 2025

Cornellius Yudha Wijaya
2025-05-30 08:00:00
www.kdnuggets.com

[Figure: Data Science ETL Pipelines with DuckDB. Image by Author | Ideogram]

 

ETL (Extract, Transform, Load) is a process that moves and prepares data for subsequent use, such as data analysis or machine learning modelling. It is a crucial activity for data scientists, because it is how we acquire the data our work depends on.

Various tools are available to support the ETL process, and one of them is DuckDB. DuckDB is an open-source, in-process OLAP SQL database management system designed for analytical workloads, and it can process data entirely in memory. It is a practical tool for data scientists across a wide range of data sizes.

Because so much data science work depends on ETL, it is worth understanding the process thoroughly. In this article, we will learn how to create an ETL pipeline using DuckDB.

 

Preparation

 
First, we will set up all the necessary components to simulate the ETL pipeline in a real-world data science project. All the code demonstrated in this article is also available in the GitHub repository.

The first thing we need is a dataset. In this example, we will use the data scientist salary data from Kaggle. For the data warehouse, we will use MotherDuck, the DuckDB-powered cloud data warehouse. Register for a free account, choose the option to create a table from a file, upload the data science salary data, and place the table in the my_db database. The queries later in this article assume the table is named ds_salaries.

If you complete this step, you can then query the dataset, and it will be displayed as shown in the image below.

 
[Image: query result for the salary dataset in MotherDuck]
 

Once the database is ready, acquire an access token, which we will use to access the cloud database.

Next, open your IDE, such as Visual Studio Code, to set up the pipeline environment. The first step is to create the virtual environment, which can be done using the following command.

python -m venv duckdb_venv

 

You can change the virtual environment name to any name you like. Activate the virtual environment (source duckdb_venv/bin/activate on macOS and Linux, or duckdb_venv\Scripts\activate on Windows) before installing the required libraries. Then create a text file called requirements.txt and fill it with the following library names.

duckdb
pandas
pyarrow
python-dotenv

 

With the file ready, we will install the libraries required for the project using the command below.

pip install -r requirements.txt

 

Once the libraries are installed, set up the environment variable in a .env file. Create the file in the project directory and add a MOTHERDUCK_TOKEN entry containing the token you acquired from MotherDuck.
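A .env file is a plain-text list of KEY=value lines, so the entry here would look like MOTHERDUCK_TOKEN=<your token>. Because os.getenv returns None when a variable is missing, it can help to fail early instead of building a connection string that contains the text "None". A minimal sketch, assuming a hypothetical helper named require_env (it is not part of DuckDB or python-dotenv):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper used for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "check that the .env file exists and was loaded."
        )
    return value


if __name__ == "__main__":
    # Placeholder value so the demo runs without a real token.
    os.environ.setdefault("MOTHERDUCK_TOKEN", "dummy-token-for-demo")
    token = require_env("MOTHERDUCK_TOKEN")
    print(f"Token loaded ({len(token)} characters)")
```

In the pipeline script, a call such as require_env("MOTHERDUCK_TOKEN") could be placed right after load_dotenv(), so a missing token is reported before any connection is attempted.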

Now that the preparation is complete, let’s proceed with setting up the ETL pipeline using DuckDB.

 

ETL Pipelines with DuckDB

 
Working with DuckDB is much like working with any SQL database, but connecting to it is far simpler. We will use DuckDB from the Python environment to run the queries that process our data, and then load the results back into the MotherDuck cloud database.

First, create a Python file that will contain the ETL pipeline. I made a file called etl_duckdb.py, but you can use different names if you prefer.

Inside the file, we will explore how to set up the data science ETL pipeline with DuckDB. Initially, we will need to connect DuckDB to the cloud database to retrieve the necessary data.

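import os
import duckdb
from dotenv import load_dotenv

# Load the variables defined in .env into the process environment.
load_dotenv()
MD_TOKEN = os.getenv("MOTHERDUCK_TOKEN")

# The "md:" prefix connects to MotherDuck instead of a local database file.
con = duckdb.connect(f'md:?motherduck_token={MD_TOKEN}')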

 

After that, we will create a schema named analytics to store the data we extract.

con.sql("CREATE SCHEMA IF NOT EXISTS analytics;")

 

You can see that operations in DuckDB are written exactly as ordinary SQL queries. If you are already familiar with SQL, creating the pipeline will be much easier.

Next, we will copy the raw data into another table, simply to show that DuckDB can extract data from one table and load it into another.

con.sql("""
CREATE OR REPLACE TABLE raw_salaries AS
SELECT
    work_year,
    experience_level,
    employment_type,
    job_title,
    salary,
    salary_currency,
    salary_in_usd,
    employee_residence,
    remote_ratio,
    company_location,
    company_size
FROM my_db.ds_salaries;
""")

 

With the data prepared, we can perform any transformations and load the modified data for subsequent analysis.

For example, let’s transform the data into average salary data based on work year and experience level, which we will load into the table avg_salary_year_exp.

con.sql("""
CREATE OR REPLACE TABLE analytics.avg_salary_year_exp AS
SELECT
    work_year,
    experience_level,
    ROUND(AVG(salary_in_usd), 2) AS avg_usd_salary
FROM raw_salaries
GROUP BY work_year, experience_level
ORDER BY work_year, experience_level;
""")

 

Let’s check the transformed data that we have loaded into the table using the following code.

con.sql("SELECT * FROM analytics.avg_salary_year_exp LIMIT 5").show()

 

The result is a table shown in the output below.

┌───────────┬──────────────────┬────────────────┐
│ work_year │ experience_level │ avg_usd_salary │
│   int64   │     varchar      │     double     │
├───────────┼──────────────────┼────────────────┤
│      2020 │ EN               │       57511.61 │
│      2020 │ EX               │      139944.33 │
│      2020 │ MI               │       87564.72 │
│      2020 │ SE               │       137240.5 │
│      2021 │ EN               │       54905.25 │
└───────────┴──────────────────┴────────────────┘

 

With only a few SQL statements, DuckDB has extracted the data, transformed it, and loaded the result into a new table.

DuckDB also integrates well with Pandas, so we can carry out part of the ETL work on DataFrames.

For example, we can take the previous average salary data and transform it into a DataFrame object, where we can transform it even further.

df_avg = con.sql("SELECT * FROM analytics.avg_salary_year_exp").df()
df_avg["avg_salary_k"] = df_avg["avg_usd_salary"] / 1_000

 

We can inspect the transformed DataFrame, for example with print(df_avg.head()), which gives output similar to the one below.

   work_year experience_level  avg_usd_salary  avg_salary_k
0       2020               EN        57511.61      57.51161
1       2020               EX       139944.33     139.94433
2       2020               MI        87564.72      87.56472
3       2020               SE       137240.50     137.24050
4       2021               EN        54905.25      54.90525

 

We can register the DataFrame above in DuckDB with the code below, after which DuckDB will treat it as a table.

con.register("pandas_avg_salary", df_avg)

 

The Pandas DataFrame is now ready for further processing; for example, we can transform the data and reload it into the cloud database.

con.sql("""
CREATE OR REPLACE TABLE analytics.avg_salary_year_exp_pandas AS
SELECT
  work_year,
  experience_level,
  avg_salary_k
FROM pandas_avg_salary
WHERE avg_salary_k > 100
ORDER BY avg_salary_k DESC
""")

 

You can see the result using the code below.

con.sql("SELECT * FROM analytics.avg_salary_year_exp_pandas LIMIT 5").show()

 

The output is shown below.

┌───────────┬──────────────────┬──────────────┐
│ work_year │ experience_level │ avg_salary_k │
│   int64   │     varchar      │    double    │
├───────────┼──────────────────┼──────────────┤
│      2023 │ EX               │    203.70568 │
│      2022 │ EX               │    188.26029 │
│      2021 │ EX               │      186.128 │
│      2023 │ SE               │    159.56893 │
│      2022 │ SE               │    147.65969 │
└───────────┴──────────────────┴──────────────┘

 

That’s all you need to develop a simple ETL pipeline for a data science project. Depending on the project requirements, you can extend the pipeline by automating it with a scheduler such as a cron job.
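For scheduled runs, it helps if the script logs each step and reports success or failure through its exit code, since cron only sees that code and the output. A minimal sketch, assuming a hypothetical run_pipeline helper and placeholder steps (in etl_duckdb.py the steps would wrap the con.sql(...) calls shown earlier):

```python
import logging
from typing import Callable, Sequence, Tuple

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger("etl")

# A step is a (name, zero-argument function) pair.
Step = Tuple[str, Callable[[], None]]


def run_pipeline(steps: Sequence[Step]) -> int:
    """Run steps in order; return 0 on success, 1 at the first failure.

    `run_pipeline` is a hypothetical helper used for illustration only.
    """
    for name, step in steps:
        log.info("starting step: %s", name)
        try:
            step()
        except Exception:
            # Log the traceback and stop so later steps do not run on bad data.
            log.exception("step failed: %s", name)
            return 1
    log.info("pipeline finished")
    return 0


if __name__ == "__main__":
    # Placeholder steps; replace the lambdas with real extract/transform/load code.
    demo_steps = [
        ("extract", lambda: None),
        ("transform", lambda: None),
        ("load", lambda: None),
    ]
    exit_code = run_pipeline(demo_steps)
    # A real script would pass exit_code to sys.exit() so cron can detect failures.
    print("exit code:", exit_code)
```

A crontab entry such as `0 6 * * * /path/to/duckdb_venv/bin/python /path/to/etl_duckdb.py` would then run the script every day at 06:00; both paths are placeholders for wherever the virtual environment and script actually live.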

 

Conclusion

 
ETL, or Extract, Transform, Load, is a process that moves and prepares data for further usage. For a data scientist, ETL is useful for any work that requires data, such as data analysis or machine learning modelling.

In this article, we have learned how to create an ETL pipeline for data science work using DuckDB. We demonstrated how to extract data from a cloud database, transform it using SQL queries and Pandas DataFrames, and load it back into the cloud database.

I hope this has helped!
 
 

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.
