Tuesday, July 1, 2025
Techcratic
Building an Automatic Speech Recognition System with PyTorch & Hugging Face

AI by AI
March 26, 2025
Josep Ferrer
2025-03-26 10:00:00
www.kdnuggets.com


 

Automatic speech recognition (ASR) is a crucial technology in many applications, from voice assistants to transcription services. In this tutorial, we aim to build an ASR pipeline capable of transcribing speech into text using pre-trained models from Hugging Face. We will use a lightweight dataset for efficiency and employ Wav2Vec2, a powerful self-supervised model for speech recognition.

Our system will:

  1. Load and preprocess a speech dataset
  2. Fine-tune a pre-trained Wav2Vec2 model
  3. Evaluate the model’s performance using word error rate (WER)
  4. Run the model for speech-to-text inference on new audio

To keep the experiment fast and inexpensive, we will use a small speech dataset rather than a large one like Common Voice.

 

Step 1: Installing Dependencies

 
Before we start, we need to install the necessary libraries. These libraries will allow us to load datasets, process audio files, and fine-tune our model.

pip install torch torchaudio transformers datasets soundfile jiwer accelerate

 

The main purpose of each library:

  1. transformers: Provides pre-trained Wav2Vec2 models for speech recognition
  2. datasets: Loads and processes speech datasets
  3. torchaudio: Handles audio processing and manipulation
  4. soundfile: Reads and writes .wav files
  5. jiwer: Computes the WER for evaluating ASR performance
  6. accelerate: Required by the Hugging Face Trainer when training with PyTorch
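
A quick way to confirm the installation before moving on is to check that each package can be found on the import path. The sketch below uses only the standard library; the helper name `missing_modules` is our own and not part of any of these libraries.

```python
from importlib.util import find_spec

# Import names of the packages installed with pip above
REQUIRED = ["torch", "torchaudio", "transformers", "datasets", "soundfile", "jiwer"]

def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]

missing = missing_modules(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If the list is not empty, rerunning the pip command for the reported names is usually enough.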

 

Step 2: Loading a Lightweight Speech Dataset

 
Instead of using large datasets like Common Voice, we use SUPERB KS, the keyword-spotting task of the SUPERB benchmark and a small dataset well suited to quick experimentation. It consists of short spoken commands like “yes,” “no,” and “stop.”

from datasets import load_dataset

dataset = load_dataset("superb", "ks", split="train[:1%]")  # Load only 1% of the data for quick testing
print(dataset)

 

This loads a tiny subset of the dataset to reduce computational cost while still allowing us to fine-tune the model. Warning: the dataset still requires storage space, so be mindful of disk usage when working with larger splits.

 

Step 3: Preprocessing the Audio Data

 
To train our ASR model, we need to ensure that the audio data is in the correct format. The Wav2Vec2 model expects:

  1. A 16 kHz sample rate
  2. Raw, variable-length waveforms, with no fixed padding or truncation at this stage (padding is handled dynamically at batch time)
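
To see whether a file already meets the 16 kHz requirement, its header can be inspected before any heavier processing. This is a minimal sketch using the standard-library `wave` module, which only reads uncompressed PCM WAV files; the helper names `wav_sample_rate` and `needs_resampling` are hypothetical.

```python
import wave

def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()

def needs_resampling(path, target_rate=16_000):
    """True if the file's sample rate differs from the rate the model expects."""
    return wav_sample_rate(path) != target_rate
```

Files for which `needs_resampling` returns True would have to be resampled to 16 kHz before being passed to the model.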

We define a function to process the audio and extract relevant features.

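import torchaudio

TARGET_SAMPLE_RATE = 16_000  # Wav2Vec2 expects 16 kHz audio
# Assumes the label column is a ClassLabel that maps integer IDs to names such as "yes"
label_names = dataset.features["label"]

def preprocess_audio(batch):
    speech_array, sampling_rate = torchaudio.load(batch["audio"]["path"])
    if sampling_rate != TARGET_SAMPLE_RATE:
        # Resample so every clip matches the rate the model was trained on
        speech_array = torchaudio.functional.resample(speech_array, sampling_rate, TARGET_SAMPLE_RATE)
    batch["speech"] = speech_array.squeeze().numpy()
    batch["sampling_rate"] = TARGET_SAMPLE_RATE
    # Convert the integer class ID to upper-case text, the casing used by this checkpoint's vocabulary
    batch["target_text"] = label_names.int2str(batch["label"]).upper()
    return batch

dataset = dataset.map(preprocess_audio)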

 

This ensures all audio files are loaded correctly and formatted properly for further processing.

 

Step 4: Loading a Pre-trained Wav2Vec2 Model

 
We use a pre-trained Wav2Vec2 model from Hugging Face’s model hub. This model has already been trained on a large dataset and can be fine-tuned for our specific task.

from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

 

Here we define both the processor, which converts raw audio into model-friendly features, and the model, a Wav2Vec2 checkpoint pre-trained and fine-tuned on 960 hours of LibriSpeech audio.

 

Step 5: Preparing Data for the Model

 
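We must convert the audio into normalized input values and the target text into token IDs so that the model can train on them.

def preprocess_for_model(batch):
    # Normalize the raw waveform into the input values the model expects
    inputs = processor(batch["speech"], sampling_rate=16000, return_tensors="pt", padding=True)
    batch["input_values"] = inputs.input_values[0]
    # Encode the target text as token IDs so the CTC loss can be computed during training
    batch["labels"] = processor.tokenizer(batch["target_text"]).input_ids
    return batch

dataset = dataset.map(preprocess_for_model, remove_columns=["speech", "sampling_rate", "audio"])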

 

This step ensures that our dataset is compatible with the Wav2Vec2 model.
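
As far as we understand, the main transformation the processor applies to raw audio for this checkpoint is a per-clip normalization to zero mean and unit variance. The plain-Python sketch below illustrates that idea only; the function name, the `eps` value, and the sample values are our own assumptions, and the library's implementation may differ in detail.

```python
import math

def zero_mean_unit_variance(samples, eps=1e-7):
    """Scale a waveform so it has mean 0 and (approximately) variance 1."""
    mean = sum(samples) / len(samples)
    variance = sum((s - mean) ** 2 for s in samples) / len(samples)
    # eps guards against division by zero for silent (constant) clips
    return [(s - mean) / math.sqrt(variance + eps) for s in samples]

waveform = [0.0, 0.5, -0.5, 1.0, -1.0]  # made-up samples for illustration
normalized = zero_mean_unit_variance(waveform)
print([round(v, 3) for v in normalized])
```

Normalizing each clip this way makes the model less sensitive to differences in recording volume.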

 

Step 6: Defining Training Arguments

 
Before training, we need to set up our training configuration. This includes batch size, learning rate, and optimization steps.

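import torch
from transformers import TrainingArguments

# No evaluation strategy is set here because the Trainer in the next step
# is not given an evaluation dataset
training_args = TrainingArguments(
    output_dir="./wav2vec2",
    per_device_train_batch_size=4,
    save_strategy="epoch",
    logging_dir="./logs",
    learning_rate=1e-4,
    warmup_steps=500,
    max_steps=4000,
    save_total_limit=2,
    gradient_accumulation_steps=2,
    fp16=torch.cuda.is_available(),  # Mixed precision is only enabled when a CUDA GPU is present
    push_to_hub=False,
)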
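
To make these numbers more concrete, the sketch below works out the effective batch size and the learning rate at a few training steps. It assumes a single device and the linear warmup-then-decay schedule that, to our knowledge, is the Trainer default; both helper functions are illustrative and not part of transformers.

```python
def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device_batch * accumulation_steps * num_devices

def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=500, max_steps=4000):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, max_steps - step)
    return base_lr * remaining / (max_steps - warmup_steps)

print(effective_batch_size(4, 2))  # 8 examples per optimizer step on one device
for step in (0, 250, 500, 2250, 4000):
    print(step, linear_schedule_lr(step))
```

With these settings the learning rate climbs to its peak of 1e-4 at step 500 and then falls back to zero by step 4000.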

 

Step 7: Training the Model

 
Using Hugging Face’s Trainer, we fine-tune our Wav2Vec2 model.

from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=processor,
)

trainer.train()
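
One caveat for the Trainer call above: audio clips differ in length, so examples generally have to be padded to a common length before they can be stacked into a batch. The sketch below shows the padding logic in plain Python, assuming each example is a dict with `input_values` (floats) and `labels` (token IDs). The function name `pad_batch` is hypothetical, and -100 is the value conventionally used to exclude padded label positions from the loss.

```python
def pad_batch(examples, input_pad_value=0.0, label_pad_value=-100):
    """Pad variable-length examples so they can be stacked into one batch."""
    max_input_len = max(len(ex["input_values"]) for ex in examples)
    max_label_len = max(len(ex["labels"]) for ex in examples)
    batch = {"input_values": [], "labels": []}
    for ex in examples:
        # Right-pad the waveform with silence up to the longest clip in the batch
        input_pad = [input_pad_value] * (max_input_len - len(ex["input_values"]))
        batch["input_values"].append(list(ex["input_values"]) + input_pad)
        # Right-pad the labels with a value the loss is expected to ignore
        label_pad = [label_pad_value] * (max_label_len - len(ex["labels"]))
        batch["labels"].append(list(ex["labels"]) + label_pad)
    return batch
```

In a real run the padded lists would still need to be converted to tensors, and the resulting collator passed to the Trainer through its `data_collator` argument.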

 

Step 8: Evaluating the Model

 
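To measure how well our model transcribes speech, we compute the WER. Note that the code below scores the same subset that was used for training, so the result is optimistic; a held-out split gives a more honest estimate.

import torch
from jiwer import wer

model.eval()  # Disable dropout so predictions are deterministic

def transcribe(batch):
    # input_values were already normalized in Step 5, so just wrap them in a batch of one
    input_values = torch.tensor(batch["input_values"]).unsqueeze(0).to(model.device)
    with torch.no_grad():
        logits = model(input_values).logits
    # Pick the most likely token for each audio frame, then decode to text
    predicted_ids = torch.argmax(logits, dim=-1)
    batch["predicted_text"] = processor.batch_decode(predicted_ids)[0]
    return batch

results = dataset.map(transcribe)
wer_score = wer(results["target_text"], results["predicted_text"])
print(f"Word Error Rate: {wer_score:.2f}")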

 

A lower WER score indicates better performance.
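
For intuition, WER is the word-level edit distance between the reference and the prediction, divided by the number of reference words. The plain-Python sketch below illustrates that definition; it is not the jiwer implementation, and the function name and example sentences are our own.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words (reference must be non-empty)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    # previous[j] = edit distance between the reference words seen so far and hyp_words[:j]
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from the hypothesis
            insertion = current[j - 1] + 1  # extra word in the hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref_words)

# One deletion ("THE") and one substitution ("OFF" -> "OF") over 4 reference words
print(word_error_rate("TURN THE LIGHT OFF", "TURN LIGHT OF"))  # 0.5
```

Because insertions count as errors, WER can exceed 1.0 when the prediction contains many more words than the reference.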

 

Step 9: Running Inference on New Audio

 
Finally, we can use our trained model to transcribe real-world speech.

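import torch
import torchaudio

speech_array, sampling_rate = torchaudio.load("example.wav")
speech_array = speech_array.mean(dim=0)  # Downmix to a single (mono) channel
if sampling_rate != 16000:
    # Resample to the 16 kHz rate the model expects
    speech_array = torchaudio.functional.resample(speech_array, sampling_rate, 16000)
inputs = processor(speech_array.numpy(), sampling_rate=16000, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(inputs.input_values.to(model.device)).logits

predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)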
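
The argmax-then-decode step is greedy CTC decoding: take the most likely token per audio frame, collapse consecutive repeats, and drop the blank token. The sketch below illustrates this with a toy vocabulary. The IDs, the blank token, and the word delimiter are assumptions made for the example; the real mapping comes from the model's tokenizer.

```python
from itertools import groupby

# Hypothetical toy vocabulary; the real one comes from the model's tokenizer
VOCAB = {0: "<pad>", 1: "|", 2: "Y", 3: "E", 4: "S", 5: "N", 6: "O"}
BLANK_ID = 0          # assumed CTC blank token
WORD_DELIMITER = "|"  # assumed word separator token

def ctc_greedy_decode(frame_ids):
    """Collapse repeated IDs, drop blanks, and join the remaining characters."""
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    chars = [VOCAB[i] for i in collapsed if i != BLANK_ID]
    return "".join(chars).replace(WORD_DELIMITER, " ").strip()

# One predicted ID per audio frame, as produced by an argmax over the logits
frames = [0, 2, 2, 0, 3, 3, 3, 4, 0, 0, 1, 5, 5, 0, 6, 6]
print(ctc_greedy_decode(frames))  # YES NO
```

The blank token is what lets the model emit a genuinely doubled letter: two identical IDs separated by a blank survive the collapse step, while adjacent repeats merge into one.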

 

Conclusion

 
And that’s it. You’ve successfully built an ASR system using PyTorch & Hugging Face with a lightweight dataset.
 
 

Josep Ferrer is an analytics engineer from Barcelona. He graduated in physics engineering and is currently working in the data science field applied to human mobility. He is a part-time content creator focused on data science and technology. Josep writes on all things AI, covering the application of the ongoing explosion in the field.
