Techcratic

MinishLab/semhash: Fast Semantic Text Deduplication

Hacker News by Hacker News
January 12, 2025

2025-01-12 11:20:00
github.com

SemHash is a lightweight and flexible tool for deduplicating datasets using semantic similarity. It combines fast embedding generation from Model2Vec with efficient ANN-based similarity search through Vicinity.
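To make the idea concrete, here is a minimal, self-contained sketch of threshold-based deduplication over embeddings. It is not SemHash's implementation: toy_embed is a hypothetical bag-of-words stand-in for a Model2Vec embedding, and the brute-force scan stands in for the ANN search that Vicinity provides.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_self_deduplicate(texts: list[str], threshold: float = 0.9) -> list[str]:
    """Keep a text only if it is not too similar to any text already kept."""
    kept: list[str] = []
    kept_vectors: list[Counter] = []
    for text in texts:
        vector = toy_embed(text)
        # Brute-force scan; an ANN index would replace this loop at scale.
        if any(cosine(vector, other) >= threshold for other in kept_vectors):
            continue
        kept.append(text)
        kept_vectors.append(vector)
    return kept


texts = [
    "the cat sat on the mat",
    "The cat sat on the mat",
    "the cat sat on a mat",
    "stock markets fell sharply today",
]
print(toy_self_deduplicate(texts, threshold=0.8))
```

With a threshold of 0.8, both the case variant and the near-copy are dropped. Raising the threshold to 0.9 keeps the near-copy, which is the kind of trade-off the threshold parameter controls.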

SemHash supports both single-dataset deduplication (e.g., cleaning up a train set) and multi-dataset deduplication (e.g., ensuring no overlap between a test set and a train set). It works with simple datasets, such as text lists, and more complex ones, like multi-column QA datasets. Additionally, it includes functions to inspect deduplication results, making it easier to understand and refine your data cleaning process.

Install the package with:

pip install semhash

Deduplicate a single dataset with the following code (note: the examples assume you have datasets installed, which you can install with pip install datasets):

from datasets import load_dataset
from semhash import SemHash

# Load a dataset to deduplicate
texts = load_dataset("ag_news", split="train")["text"]

# Initialize a SemHash instance
semhash = SemHash.from_records(records=texts)

# Deduplicate the texts
deduplicated_texts = semhash.self_deduplicate().deduplicated

Or, deduplicate across two datasets with the following code (e.g., eliminating train/test leakage):

from datasets import load_dataset
from semhash import SemHash

# Load two datasets to deduplicate
train_texts = load_dataset("ag_news", split="train")["text"]
test_texts = load_dataset("ag_news", split="test")["text"]

# Initialize a SemHash instance with the training data
semhash = SemHash.from_records(records=train_texts)

# Deduplicate the test data against the training data, optionally with a specific threshold
deduplicated_test_texts = semhash.deduplicate(records=test_texts, threshold=0.9).deduplicated
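The logic behind cross-dataset deduplication can be sketched without any dependencies. The sketch below is an illustration only: jaccard is a hypothetical token-overlap stand-in for embedding similarity, and filter_against_reference is an invented helper name, not part of SemHash. The point it shows is that only the candidate set is filtered, while the reference set stays untouched.

```python
def jaccard(a: str, b: str) -> float:
    """Hypothetical stand-in for embedding similarity: token-set overlap."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_against_reference(reference, candidates, threshold=0.9):
    """Split candidates into kept records and (record, match, score) leaks."""
    if not reference:
        return list(candidates), []
    kept, leaked = [], []
    for candidate in candidates:
        # Find the closest reference record for this candidate.
        best_match = max(reference, key=lambda ref: jaccard(candidate, ref))
        score = jaccard(candidate, best_match)
        if score >= threshold:
            leaked.append((candidate, best_match, score))
        else:
            kept.append(candidate)
    return kept, leaked


train = ["oil prices rise as supply tightens", "team wins the championship final"]
test = ["Oil prices rise as supply tightens", "new phone launches next month"]

kept, leaked = filter_against_reference(train, test, threshold=0.9)
print(kept)
print(leaked)
```

Returning the matched reference record alongside each dropped candidate makes it possible to check why a test record was treated as leakage.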

Or, deduplicate multi-column datasets with the following code (e.g., deduplicating a QA dataset):

from datasets import load_dataset
from semhash import SemHash

# Load the dataset
dataset = load_dataset("squad_v2", split="train")

# Convert the dataset to a list of dictionaries
records = [dict(row) for row in dataset]

# Initialize SemHash with the columns to deduplicate
semhash = SemHash.from_records(records=records, columns=["question", "context"])

# Deduplicate the records
deduplicated_records = semhash.self_deduplicate().deduplicated
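As a rough illustration of what selecting columns means, the sketch below compares records only on the chosen columns and keeps each surviving record whole. How SemHash combines columns internally is not described here, so averaging per-column scores is an assumption, as are the helper names and the use of difflib as a stand-in for embedding similarity.

```python
from difflib import SequenceMatcher


def column_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for embedding similarity: character-level ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(a: dict, b: dict, columns: list[str]) -> float:
    """Average the per-column similarity over the selected columns only."""
    return sum(column_similarity(a[c], b[c]) for c in columns) / len(columns)


def dedupe_records(records: list[dict], columns: list[str], threshold: float = 0.9) -> list[dict]:
    """Keep whole records whose selected columns are not near-copies of a kept record."""
    kept: list[dict] = []
    for record in records:
        if any(record_similarity(record, other, columns) >= threshold for other in kept):
            continue
        kept.append(record)
    return kept


records = [
    {"question": "Who wrote Hamlet?", "context": "Hamlet is a tragedy by Shakespeare.", "answer": "Shakespeare"},
    {"question": "Who wrote Hamlet?", "context": "Hamlet is a tragedy by Shakespeare.", "answer": "William Shakespeare"},
    {"question": "What is the capital of France?", "context": "Paris is the capital of France.", "answer": "Paris"},
]
deduplicated = dedupe_records(records, columns=["question", "context"])
print(len(deduplicated))
```

The second record differs from the first only in its answer, which is not among the selected columns, so it is treated as a duplicate.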

The deduplicate and self_deduplicate functions return a DeduplicationResult. This object stores the deduplicated corpus, the set of duplicate objects (along with the objects that caused duplication), and several useful functions for inspecting the deduplication result. Examples of how to use these functions can be found in the usage section.
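The shape of such a result can be pictured with a small stand-in. ToyResult and ToyDuplicate below are hypothetical classes written for illustration, not the library's own types, and the records and scores are hand-written examples.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDuplicate:
    """A removed record plus the kept records (and scores) that caused removal."""
    record: str
    matches: list[tuple[str, float]] = field(default_factory=list)


@dataclass
class ToyResult:
    """Hypothetical stand-in for the shape of a deduplication result."""
    deduplicated: list[str]
    duplicates: list[ToyDuplicate]

    @property
    def duplicate_ratio(self) -> float:
        """Fraction of all input records that were flagged as duplicates."""
        total = len(self.deduplicated) + len(self.duplicates)
        return len(self.duplicates) / total if total else 0.0


# Hand-written example values, for illustration only.
result = ToyResult(
    deduplicated=["the cat sat on the mat", "stock markets fell sharply today"],
    duplicates=[
        ToyDuplicate("The cat sat on the mat", [("the cat sat on the mat", 1.0)]),
        ToyDuplicate("the cat sat on a mat", [("the cat sat on the mat", 0.87)]),
    ],
)
print(result.duplicate_ratio)
for duplicate in result.duplicates:
    print(duplicate.record, "<-", duplicate.matches)
```

Keeping the matching records and their scores next to each removed record is what makes the result inspectable after the fact.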

  • Fast: SemHash uses model2vec to embed texts and vicinity to perform similarity search, making it extremely fast.
  • Scalable: SemHash can deduplicate large datasets with millions of records thanks to the ANN backends in Vicinity.
  • Flexible: SemHash can be used to deduplicate a single dataset or across two datasets, and can also be used to deduplicate multi-column datasets (such as QA datasets).
  • Lightweight: SemHash is a lightweight package with minimal dependencies, making it easy to install and use.
  • Explainable: Easily inspect the duplicates and what caused them with the DeduplicationResult object. You can also view the lowest similarity duplicates to find the right threshold for deduplication for your dataset.
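Threshold tuning, as described in the last bullet, can be sketched on stored similarity scores. The helper names and the (record, matched record, score) triples below are hypothetical and hand-written, and the sketch assumes the new threshold is higher than the one used to find the duplicates, since pairs below the original threshold were never stored.

```python
def least_similar(pairs, n=1):
    """Return the n flagged pairs with the lowest similarity score."""
    return sorted(pairs, key=lambda pair: pair[2])[:n]


def rethreshold(pairs, threshold):
    """Re-apply a stricter threshold to stored scores without re-embedding."""
    still_duplicates = [pair for pair in pairs if pair[2] >= threshold]
    restored = [pair[0] for pair in pairs if pair[2] < threshold]
    return still_duplicates, restored


# Hand-written (record, matched record, score) triples for illustration only.
pairs = [
    ("Oil prices rise again", "Oil prices rise", 0.97),
    ("Oil costs climb", "Oil prices rise", 0.91),
    ("Gold prices rise", "Oil prices rise", 0.90),
]

print(least_similar(pairs))
still_duplicates, restored = rethreshold(pairs, threshold=0.95)
print(restored)
```

Looking at the lowest-scoring pair first shows the weakest match the current threshold lets through. If that pair is not a real duplicate, a stricter threshold is a reasonable next step.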

The following examples show the various ways you can use SemHash to deduplicate datasets. These examples assume you have the datasets library installed, which you can install with pip install datasets.

Deduplicate a single dataset

The following code snippet shows how to deduplicate a single dataset using SemHash (in this example, the train split of the AG News dataset):

from datasets import load_dataset
from semhash import SemHash

# Load a dataset to deduplicate
texts = load_dataset("ag_news", split="train")["text"]

# Initialize a SemHash instance
semhash = SemHash.from_records(records=texts)

# Deduplicate the texts
deduplicated_texts = semhash.self_deduplicate().deduplicated

Deduplicate across two datasets

The following code snippet shows how to deduplicate across two datasets using SemHash (in this example, the train/test split of the AG News dataset):

from datasets import load_dataset
from semhash import SemHash

# Load two datasets to deduplicate
train_texts = load_dataset("ag_news", split="train")["text"]
test_texts = load_dataset("ag_news", split="test")["text"]

# Initialize a SemHash instance with the training data
semhash = SemHash.from_records(records=train_texts)

# Deduplicate the test data against the training data
deduplicated_test_texts = semhash.deduplicate(records=test_texts).deduplicated

Deduplicate multi-column datasets

The following code snippet shows how to deduplicate multi-column datasets using SemHash (in this example, the train split of the QA dataset SQuAD 2.0, which consists of questions, contexts, and answers):

from datasets import load_dataset
from semhash import SemHash

# Load the dataset
dataset = load_dataset("squad_v2", split="train")

# Convert the dataset to a list of dictionaries
records = [dict(row) for row in dataset]

# Initialize SemHash with the columns to deduplicate
semhash = SemHash.from_records(records=records, columns=["question", "context"])

# Deduplicate the records
deduplicated_records = semhash.self_deduplicate().deduplicated

DeduplicationResult functionality

The DeduplicationResult object returned by the deduplicate and self_deduplicate functions exposes several attributes and methods for inspecting the deduplication result. The following code snippet shows how to use them:

from datasets import load_dataset
from semhash import SemHash

# Load a dataset to deduplicate
texts = load_dataset("ag_news", split="train")["text"]

# Initialize a SemHash instance
semhash = SemHash.from_records(records=texts)

# Deduplicate the texts
deduplication_result = semhash.self_deduplicate()

# Check the deduplicated texts
deduplication_result.deduplicated
# Check the duplicates
deduplication_result.duplicates
# See what fraction of the texts were duplicates
deduplication_result.duplicate_ratio
# See what fraction of the texts were exact duplicates
deduplication_result.exact_duplicate_ratio

# Get the least similar text from the duplicates. This is useful for finding the right threshold for deduplication.
least_similar = deduplication_result.get_least_similar_from_duplicates()

# Rethreshold the duplicates. This allows you to instantly rethreshold the duplicates with a new threshold without having to re-deduplicate the texts.
deduplication_result.rethreshold(0.95)
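
Why can a result be rethresholded without re-deduplicating? One plausible explanation, sketched below in plain Python, is that each removed record keeps the similarity score of its best match, so raising the threshold only requires filtering those stored scores. This is a conceptual illustration under that assumption, not SemHash's internal data model: ScoredDuplicate, rethreshold, and least_similar are hypothetical names, and the scores are made up.

```python
from dataclasses import dataclass


@dataclass
class ScoredDuplicate:
    """A removed record together with the record it matched and their similarity."""
    text: str
    matched_text: str
    score: float


def rethreshold(kept, duplicates, new_threshold):
    """Apply a stricter threshold using only the stored scores.

    Records whose score falls below the new threshold are restored to the kept set.
    Lowering the threshold would need pairs that were never stored, so this sketch
    only makes sense for thresholds at or above the original one.
    """
    still_duplicates = [d for d in duplicates if d.score >= new_threshold]
    restored = [d.text for d in duplicates if d.score < new_threshold]
    return kept + restored, still_duplicates


def least_similar(duplicates, n=1):
    """Return the n duplicates with the lowest similarity scores."""
    return sorted(duplicates, key=lambda d: d.score)[:n]


kept = ["Apple unveils new phone"]
duplicates = [
    ScoredDuplicate("Apple reveals new phone", "Apple unveils new phone", 0.97),
    ScoredDuplicate("Apple opens new store", "Apple unveils new phone", 0.91),
]

borderline = least_similar(duplicates)
new_kept, new_duplicates = rethreshold(kept, duplicates, new_threshold=0.95)
print(borderline[0].text)
print(new_kept)
```

Inspecting the lowest-scoring duplicates first shows the borderline cases, which is where a threshold that is too low would start removing records that are not really duplicates.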

Using custom encoders

The following code snippet shows how to use a custom encoder with SemHash:

from datasets import load_dataset
from model2vec import StaticModel
from semhash import SemHash

# Load a dataset to deduplicate
texts = load_dataset("ag_news", split="train")["text"]

# Load an embedding model (in this example, a multilingual model)
model = StaticModel.from_pretrained("minishlab/M2V_multilingual_output")

# Initialize a SemHash instance with the custom encoder
semhash = SemHash.from_records(records=texts, model=model)

# Deduplicate the texts
deduplicated_texts = semhash.self_deduplicate().deduplicated

Any encoder can be used that adheres to our encoder protocol. For example, any sentence-transformers model can be used as an encoder:

from datasets import load_dataset
from semhash import SemHash
from sentence_transformers import SentenceTransformer

# Load a dataset to deduplicate
texts = load_dataset("ag_news", split="train")["text"]

# Load a sentence-transformers model
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Initialize a SemHash instance with the custom encoder
semhash = SemHash.from_records(records=texts, model=model)

# Deduplicate the texts
deduplicated_texts = semhash.self_deduplicate().deduplicated

NOTE: By default, we use the ANN (approximate nearest neighbors) backend for deduplication. We recommend keeping this default: recall on smaller datasets is ~100%, and larger datasets (>1M samples) take too long to deduplicate without ANN. If you want to use the flat/exact-matching backend instead, pass use_ann=False to SemHash.from_records:

semhash = SemHash.from_records(records=texts, use_ann=False)
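
To see why exact matching becomes slow on large datasets, consider the following minimal sketch of a flat (brute-force) self-deduplication in plain Python. It is an illustration of the general technique, not SemHash's backend: the function names, toy vectors, and 0.9 threshold are assumptions. Each vector is compared against the vectors kept so far, so the number of comparisons can grow quadratically with the dataset size.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flat_self_deduplicate(vectors, threshold=0.9):
    """Greedy exact deduplication: keep a vector only if no kept vector is too similar.

    Returns the kept indices and the number of pairwise comparisons performed.
    In the worst case (no duplicates) this is n * (n - 1) / 2 comparisons.
    """
    kept = []
    comparisons = 0
    for i, vector in enumerate(vectors):
        is_duplicate = False
        for j in kept:
            comparisons += 1
            if cosine_similarity(vector, vectors[j]) >= threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(i)
    return kept, comparisons


# Toy embeddings: items 0 and 1 are near-identical, as are items 2 and 3
vectors = [[1.0, 0.0], [0.98, 0.1], [0.0, 1.0], [0.05, 0.99]]
kept, comparisons = flat_self_deduplicate(vectors)
print(kept, comparisons)
```

With a million records and few duplicates, the worst case is on the order of 5 × 10^11 comparisons, which is the cost an approximate index is meant to avoid.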

We’ve benchmarked SemHash on a variety of datasets to measure the deduplication performance and speed. The benchmarks were run with the following setup:

  • All benchmarks were run on CPU.
  • All benchmarks were run with use_ann=True.
  • The encoder is the default encoder (potion-base-8M).
  • The timings include the encoding time, index building time, and deduplication time.

Train Deduplication Benchmark

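| Dataset | Original Train Size | Deduplicated Train Size | % Removed | Deduplication Time (s) |
|---|---|---|---|---|
| bbc | 1225 | 1144 | 6.61 | 0.57 |
| senteval_cr | 3012 | 2990 | 0.73 | 0.14 |
| tweet_sentiment_extraction | 27481 | 26695 | 2.86 | 1.77 |
| emotion | 16000 | 15695 | 1.91 | 0.77 |
| amazon_counterfactual | 5000 | 4992 | 0.16 | 0.33 |
| ag_news | 120000 | 106921 | 10.90 | 5.20 |
| enron_spam | 31716 | 20540 | 35.24 | 2.03 |
| subj | 8000 | 7990 | 0.12 | 0.63 |
| sst5 | 8544 | 8526 | 0.21 | 0.58 |
| 20_newgroups | 11314 | 10684 | 5.57 | 0.73 |
| hatespeech_offensive | 22783 | 22090 | 3.04 | 0.92 |
| ade | 17637 | 15718 | 10.88 | 0.73 |
| imdb | 25000 | 24830 | 0.68 | 1.76 |
| massive_scenario | 11514 | 9366 | 18.66 | 0.47 |
| student | 117519 | 63856 | 45.66 | 8.80 |
| squad_v2 | 130319 | 109698 | 15.82 | 8.81 |
| wikitext | 1801350 | 884645 | 50.89 | 83.53 |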

Train/Test Deduplication Benchmark

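| Dataset | Train Size | Test Size | Deduplicated Test Size | % Removed | Deduplication Time (s) |
|---|---|---|---|---|---|
| bbc | 1225 | 1000 | 870 | 13.00 | 0.71 |
| senteval_cr | 3012 | 753 | 750 | 0.40 | 0.13 |
| tweet_sentiment_extraction | 27481 | 3534 | 3412 | 3.45 | 1.53 |
| emotion | 16000 | 2000 | 1926 | 3.70 | 0.65 |
| amazon_counterfactual | 5000 | 5000 | 4990 | 0.20 | 0.51 |
| ag_news | 120000 | 7600 | 6198 | 18.45 | 3.74 |
| enron_spam | 31716 | 2000 | 1060 | 47.00 | 1.94 |
| subj | 8000 | 2000 | 1999 | 0.05 | 0.62 |
| sst5 | 8544 | 2210 | 2205 | 0.23 | 0.59 |
| 20_newgroups | 11314 | 7532 | 7098 | 5.76 | 2.25 |
| hatespeech_offensive | 22783 | 2000 | 1925 | 3.75 | 0.77 |
| ade | 17637 | 5879 | 4952 | 15.77 | 0.81 |
| imdb | 25000 | 25000 | 24795 | 0.82 | 2.81 |
| massive_scenario | 11514 | 2974 | 2190 | 26.36 | 0.46 |
| student | 117519 | 5000 | 2393 | 52.14 | 3.78 |
| squad_v2 | 130319 | 11873 | 11863 | 0.08 | 7.13 |
| wikitext | 1801350 | 4358 | 2139 | 50.92 | 40.32 |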

As the tables show, SemHash is fast and scales to datasets with more than a million records: the largest benchmark, the wikitext train split (1,801,350 records), is deduplicated in 83.53 seconds on CPU. There are also some notable examples of train/test leakage, such as enron_spam and student, where roughly half of the test set (47.00% and 52.14%, respectively) overlaps semantically with the training set.
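
The "% Removed" column follows directly from the size columns. As a quick check, the sketch below recomputes it for two rows of the tables above; the helper name percent_removed is ours, not part of the benchmark code.

```python
def percent_removed(original_size, deduplicated_size):
    """Percentage of records removed by deduplication."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - deduplicated_size) / original_size


# bbc train split: 1225 records before, 1144 after
bbc_train = percent_removed(1225, 1144)

# enron_spam test split deduplicated against train: 2000 records before, 1060 after
enron_test = percent_removed(2000, 1060)

print(f"{bbc_train:.2f}")   # 6.61
print(f"{enron_test:.2f}")  # 47.00
```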

Reproducing the Benchmarks

To run the benchmarks yourself, you can use the following command (assuming you have the datasets library installed):

python -m benchmarks.run_benchmarks

Optionally, you can change which datasets are benchmarked by editing the datasets.py file.
