• About TC
  • Affiliate Disclaimer
  • Privacy Policy
  • TOS
  • Contact
Tuesday, June 10, 2025
Techcratic
  • TC
  • AI
    Artificial Intelligence

    10 Awesome OCR Models for 2025

    Artificial Intelligence

    5 Error Handling Patterns in Python (Beyond Try-Except)

    Artificial Intelligence

    Top 5 Alternative Data Career Paths and How to Learn Them for Free

    Artificial Intelligence

    Implementing Machine Learning Pipelines with Apache Spark

    Artificial Intelligence

    Learn Power BI for Free This Week

    Artificial Intelligence

    Build GraphRAG applications using Amazon Bedrock Knowledge Bases

    Artificial Intelligence

    How to Use Deep Research Like a Pro

    Artificial Intelligence

    World-Consistent Video Diffusion With Explicit 3D Modeling

    Artificial Intelligence

    Deploy Amazon SageMaker Projects with Terraform Cloud

  • Crypto
    Bitcoin’s $200K Price Forecast ‘Conservative,’ Says Bernstein

    Bitcoin’s $200K Price Forecast ‘Conservative,’ Says Bernstein

    Ripple Backs XRP Ledger Startups in Japan With up to $200K per Project

    Ripple Backs XRP Ledger Startups in Japan With up to $200K per Project

    Publicly Traded Firm KULR Acquires 118.6 Bitcoin, Treasury Reaches 920 BTC

    Publicly Traded Firm KULR Acquires 118.6 Bitcoin, Treasury Reaches 920 BTC

    ETF Weekly Flows: $129 Million Outflow for Bitcoin and $281 Million Inflow for Ether

    ETF Weekly Flows: $129 Million Outflow for Bitcoin and $281 Million Inflow for Ether

    DOGE Gets Distilled: Heritage Unleashes Dogecoin-Themed Bourbon

    DOGE Gets Distilled: Heritage Unleashes Dogecoin-Themed Bourbon

    Crypto ETFs centralize what was meant to be decentralized.

    Crypto ETFs centralize what was meant to be decentralized.

    Crypto Lost $1.64 Billion to Hackers in Q1 2025

    Why Is Crypto Down Today? – June 9, 2025

    The Blockchain Group Unveils $343 Million Capital Program to Boost Bitcoin Treasury Strategy

    The Blockchain Group Unveils $343 Million Capital Program to Boost Bitcoin Treasury Strategy

    Bitcoin Bull Cycle is Over: CryptoQuant CEO

    CEX Volumes Hit 2020 Lows as Market Shifts to HODL Mode

  • Cybersecurity
    Cybersecurity

    Malicious Browser Extensions Infect 722 Users Across Latin America Since Early 2025

    Cybersecurity

    Empower Users and Protect Against GenAI Data Loss

    Cybersecurity

    Popular Chrome Extensions Leak API Keys, User Data via HTTP and Hardcoded Credentials

    Cybersecurity

    Critical Cisco ISE Auth Bypass Flaw Impacts Cloud Deployments on AWS, Azure, and OCI

    Cybersecurity

    Why Traditional DLP Solutions Fail in the Browser Era

    Cybersecurity

    HPE Issues Security Patch for StoreOnce Bug Allowing Remote Authentication Bypass

    Cybersecurity

    Critical 10-Year-Old Roundcube Webmail Bug Allows Authenticated Users Run Malicious Code

    Cybersecurity

    Android Trojan Crocodilus Now Active in 8 Countries, Targeting Banks and Crypto Wallets

    Cybersecurity

    Microsoft and CrowdStrike Launch Shared Threat Actor Glossary to Cut Attribution Confusion

  • Deals
    GD90 Mini PC, 12th Gen Intel i9-12900HK(14C/20T), 32GB DDR4 RAM 1TB SSD Desktop Mini…

    GD90 Mini PC, 12th Gen Intel i9-12900HK(14C/20T), 32GB DDR4 RAM 1TB SSD Desktop Mini…

    Hitachi MAF0058 Mass Air Flow Sensor

    Hitachi MAF0058 Mass Air Flow Sensor

    Canon PG-245 Genuine Black Ink Cartridge, Compatible with iP2820,…

    Canon PG-245 Genuine Black Ink Cartridge, Compatible with iP2820,…

    GTRACING Gaming Chair with Footrest Speakers Video Game Chair Bluetooth Music Heavy Duty…

    GTRACING Gaming Chair with Footrest Speakers Video Game Chair Bluetooth Music Heavy Duty…

    RoboCop Rogue City (PS5)

    RoboCop Rogue City (PS5)

    My Universe: Puppies and Kittens – PlayStation 4

    My Universe: Puppies and Kittens – PlayStation 4

    Disney’s Little Mermaid: Ariel’s Undersea Adventure – Nintendo DS (Renewed)

    Disney’s Little Mermaid: Ariel’s Undersea Adventure – Nintendo DS (Renewed)

    Family Game Pack 2001- PlayStation (Renewed)

    Family Game Pack 2001- PlayStation (Renewed)

    StarTech.com Cisco GLC-T Compatible SFP Module – 1000BASE-T – SFP to RJ45 Cat6/Cat5e -…

    StarTech.com Cisco GLC-T Compatible SFP Module – 1000BASE-T – SFP to RJ45 Cat6/Cat5e -…

  • Gaming
    Medieval Dynasty fans should check out this third-person medieval city builder that just got a demo on Steam

    Medieval Dynasty fans should check out this third-person medieval city builder that just got a demo on Steam

    Hogwarts Legacy: Percival Rackham's Trial Walkthrough

    Hogwarts Legacy: Percival Rackham's Trial Walkthrough

    Breeding is KEY to an EARLY GAME OP Base- Palworld

    Breeding is KEY to an EARLY GAME OP Base- Palworld

    Apple’s new UI for Macs and iPhones ‘combines the optical qualities of glass with a fluidity only Apple can achieve,’ but it sure looks like an awful lot like Windows Vista circa 2007

    Apple’s new UI for Macs and iPhones ‘combines the optical qualities of glass with a fluidity only Apple can achieve,’ but it sure looks like an awful lot like Windows Vista circa 2007

    HYPERCHARGE UNBOXED – CUSTOMIZATIONS

    HYPERCHARGE UNBOXED – CUSTOMIZATIONS

    Scars Above: First 10 Minutes of Gameplay | New Sci-Fi Action Game

    Scars Above: First 10 Minutes of Gameplay | New Sci-Fi Action Game

    2 Years with Steam Deck: My Honest Review and Experiences

    2 Years with Steam Deck: My Honest Review and Experiences

    Dune: Awakening buried treasure: How to find it and get a Sandbike Scanner

    Dune: Awakening buried treasure: How to find it and get a Sandbike Scanner

    LittleBigPlanet 3 – Five Nights at Freddy's The Movie Full Trailer  – LBP3 FNAF Animation

    LittleBigPlanet 3 – Five Nights at Freddy's The Movie Full Trailer – LBP3 FNAF Animation

  • Tesla
    j Junsun Portable Electric Car Charger Level 2 EV Charger 32A 240V for Tesla 21ft Cable…

    j Junsun Portable Electric Car Charger Level 2 EV Charger 32A 240V for Tesla 21ft Cable…

    Model Y Mud Flaps for Tesla Model Y Accessories 2024 Mud Flaps Tire Splash Guards fit…

    Model Y Mud Flaps for Tesla Model Y Accessories 2024 Mud Flaps Tire Splash Guards fit…

    Tesla CCS Adapter, Fast and Efficient Charging Adapter for Tesla Model 3 Y S X, Portable…

    Tesla CCS Adapter, Fast and Efficient Charging Adapter for Tesla Model 3 Y S X, Portable…

    4 PCS LED Reverse Lights, 4014 45SMD 6500K 800LM High Bright Brake Light Turn Signal…

    4 PCS LED Reverse Lights, 4014 45SMD 6500K 800LM High Bright Brake Light Turn Signal…

    4 Pack Trailer Ball Cover, 2.36In x 2.24In x 1.97In Waterproof Dustproof Towing Hitch…

    4 Pack Trailer Ball Cover, 2.36In x 2.24In x 1.97In Waterproof Dustproof Towing Hitch…

    ClimaTex Heavy Duty Car, Truck, Van, and SUV Automotive Floor Mat for Floor Protection,…

    ClimaTex Heavy Duty Car, Truck, Van, and SUV Automotive Floor Mat for Floor Protection,…

    2 Pcs Tow Hook Covers Compatible with Tesla Cybertruck Accessories 2024 2025 (Red)

    2 Pcs Tow Hook Covers Compatible with Tesla Cybertruck Accessories 2024 2025 (Red)

    MAXDOM Under Seat Storage Fit for 2024+ Tesla Cybertruck Rear Underseat Organizer Box…

    MAXDOM Under Seat Storage Fit for 2024+ Tesla Cybertruck Rear Underseat Organizer Box…

    Car USB Hub Charger for Tesla Model Y 2021-2024 and Model 3 2021-2023,Fast…

    Car USB Hub Charger for Tesla Model Y 2021-2024 and Model 3 2021-2023,Fast…

  • UFO
    FPV Drone with 4K Dual HD Cameras Upgraded Version RC Quadcopter

    FPV Drone with 4K Dual HD Cameras Upgraded Version RC Quadcopter

    Incredible UFO Sightings by Pilots Challenge Our Understanding of Technology !

    Incredible UFO Sightings by Pilots Challenge Our Understanding of Technology !

    CINOTON 160W UFO LED High Bay Light, Aluminum LED Shop Lights with 24000LM, 5000K Commercial Bay Lighting for Warehouse Garage Workshop Factory, 6′ Cable & Safety Rope, ETL Listed 1 Pack

    CINOTON 160W UFO LED High Bay Light, Aluminum LED Shop Lights with 24000LM, 5000K Commercial Bay Lighting for Warehouse Garage Workshop Factory, 6′ Cable & Safety Rope, ETL Listed 1 Pack

    Rewi beklaut Dner & Neue Projekte mit dem kompletten UFO

    Rewi beklaut Dner & Neue Projekte mit dem kompletten UFO

    Spacecraft Systems Engineering

    Spacecraft Systems Engineering

    NASA UAP Researchers Share Shocking UFO Evidence!

    NASA UAP Researchers Share Shocking UFO Evidence!

    UFOs Over Phoenix: Confessions of a 911 Operator [DVD]

    UFOs Over Phoenix: Confessions of a 911 Operator [DVD]

    Have Aliens Visited Earth? | COLOSSAL MYSTERIES

    Have Aliens Visited Earth? | COLOSSAL MYSTERIES

    MINDBLOWING Encounters Unraveling the Secrets of Higher Dimensions

    MINDBLOWING Encounters Unraveling the Secrets of Higher Dimensions

No Result
View All Result
  • TC
  • AI
    Artificial Intelligence

    10 Awesome OCR Models for 2025

    Artificial Intelligence

    5 Error Handling Patterns in Python (Beyond Try-Except)

    Artificial Intelligence

    Top 5 Alternative Data Career Paths and How to Learn Them for Free

    Artificial Intelligence

    Implementing Machine Learning Pipelines with Apache Spark

    Artificial Intelligence

    Learn Power BI for Free This Week

    Artificial Intelligence

    Build GraphRAG applications using Amazon Bedrock Knowledge Bases

    Artificial Intelligence

    How to Use Deep Research Like a Pro

    Artificial Intelligence

    World-Consistent Video Diffusion With Explicit 3D Modeling

    Artificial Intelligence

    Deploy Amazon SageMaker Projects with Terraform Cloud

  • Crypto
    Bitcoin’s $200K Price Forecast ‘Conservative,’ Says Bernstein

    Bitcoin’s $200K Price Forecast ‘Conservative,’ Says Bernstein

    Ripple Backs XRP Ledger Startups in Japan With up to $200K per Project

    Ripple Backs XRP Ledger Startups in Japan With up to $200K per Project

    Publicly Traded Firm KULR Acquires 118.6 Bitcoin, Treasury Reaches 920 BTC

    Publicly Traded Firm KULR Acquires 118.6 Bitcoin, Treasury Reaches 920 BTC

    ETF Weekly Flows: $129 Million Outflow for Bitcoin and $281 Million Inflow for Ether

    ETF Weekly Flows: $129 Million Outflow for Bitcoin and $281 Million Inflow for Ether

    DOGE Gets Distilled: Heritage Unleashes Dogecoin-Themed Bourbon

    DOGE Gets Distilled: Heritage Unleashes Dogecoin-Themed Bourbon

    Crypto ETFs centralize what was meant to be decentralized.

    Crypto ETFs centralize what was meant to be decentralized.

    Crypto Lost $1.64 Billion to Hackers in Q1 2025

    Why Is Crypto Down Today? – June 9, 2025

    The Blockchain Group Unveils $343 Million Capital Program to Boost Bitcoin Treasury Strategy

    The Blockchain Group Unveils $343 Million Capital Program to Boost Bitcoin Treasury Strategy

    Bitcoin Bull Cycle is Over: CryptoQuant CEO

    CEX Volumes Hit 2020 Lows as Market Shifts to HODL Mode

  • Cybersecurity
    Cybersecurity

    Malicious Browser Extensions Infect 722 Users Across Latin America Since Early 2025

    Cybersecurity

    Empower Users and Protect Against GenAI Data Loss

    Cybersecurity

    Popular Chrome Extensions Leak API Keys, User Data via HTTP and Hardcoded Credentials

    Cybersecurity

    Critical Cisco ISE Auth Bypass Flaw Impacts Cloud Deployments on AWS, Azure, and OCI

    Cybersecurity

    Why Traditional DLP Solutions Fail in the Browser Era

    Cybersecurity

    HPE Issues Security Patch for StoreOnce Bug Allowing Remote Authentication Bypass

    Cybersecurity

    Critical 10-Year-Old Roundcube Webmail Bug Allows Authenticated Users Run Malicious Code

    Cybersecurity

    Android Trojan Crocodilus Now Active in 8 Countries, Targeting Banks and Crypto Wallets

    Cybersecurity

    Microsoft and CrowdStrike Launch Shared Threat Actor Glossary to Cut Attribution Confusion

  • Deals
    GD90 Mini PC, 12th Gen Intel i9-12900HK(14C/20T), 32GB DDR4 RAM 1TB SSD Desktop Mini…

    GD90 Mini PC, 12th Gen Intel i9-12900HK(14C/20T), 32GB DDR4 RAM 1TB SSD Desktop Mini…

    Hitachi MAF0058 Mass Air Flow Sensor

    Hitachi MAF0058 Mass Air Flow Sensor

    Canon PG-245 Genuine Black Ink Cartridge, Compatible with iP2820,…

    Canon PG-245 Genuine Black Ink Cartridge, Compatible with iP2820,…

    GTRACING Gaming Chair with Footrest Speakers Video Game Chair Bluetooth Music Heavy Duty…

    GTRACING Gaming Chair with Footrest Speakers Video Game Chair Bluetooth Music Heavy Duty…

    RoboCop Rogue City (PS5)

    RoboCop Rogue City (PS5)

    My Universe: Puppies and Kittens – PlayStation 4

    My Universe: Puppies and Kittens – PlayStation 4

    Disney’s Little Mermaid: Ariel’s Undersea Adventure – Nintendo DS (Renewed)

    Disney’s Little Mermaid: Ariel’s Undersea Adventure – Nintendo DS (Renewed)

    Family Game Pack 2001- PlayStation (Renewed)

    Family Game Pack 2001- PlayStation (Renewed)

    StarTech.com Cisco GLC-T Compatible SFP Module – 1000BASE-T – SFP to RJ45 Cat6/Cat5e -…

    StarTech.com Cisco GLC-T Compatible SFP Module – 1000BASE-T – SFP to RJ45 Cat6/Cat5e -…

  • Gaming
    Medieval Dynasty fans should check out this third-person medieval city builder that just got a demo on Steam

    Medieval Dynasty fans should check out this third-person medieval city builder that just got a demo on Steam

    Hogwarts Legacy: Percival Rackham's Trial Walkthrough

    Hogwarts Legacy: Percival Rackham's Trial Walkthrough

    Breeding is KEY to an EARLY GAME OP Base- Palworld

    Breeding is KEY to an EARLY GAME OP Base- Palworld

    Apple’s new UI for Macs and iPhones ‘combines the optical qualities of glass with a fluidity only Apple can achieve,’ but it sure looks like an awful lot like Windows Vista circa 2007

    Apple’s new UI for Macs and iPhones ‘combines the optical qualities of glass with a fluidity only Apple can achieve,’ but it sure looks like an awful lot like Windows Vista circa 2007

    HYPERCHARGE UNBOXED – CUSTOMIZATIONS

    HYPERCHARGE UNBOXED – CUSTOMIZATIONS

    Scars Above: First 10 Minutes of Gameplay | New Sci-Fi Action Game

    Scars Above: First 10 Minutes of Gameplay | New Sci-Fi Action Game

    2 Years with Steam Deck: My Honest Review and Experiences

    2 Years with Steam Deck: My Honest Review and Experiences

    Dune: Awakening buried treasure: How to find it and get a Sandbike Scanner

    Dune: Awakening buried treasure: How to find it and get a Sandbike Scanner

    LittleBigPlanet 3 – Five Nights at Freddy's The Movie Full Trailer  – LBP3 FNAF Animation

    LittleBigPlanet 3 – Five Nights at Freddy's The Movie Full Trailer – LBP3 FNAF Animation

  • Tesla
    j Junsun Portable Electric Car Charger Level 2 EV Charger 32A 240V for Tesla 21ft Cable…

    j Junsun Portable Electric Car Charger Level 2 EV Charger 32A 240V for Tesla 21ft Cable…

    Model Y Mud Flaps for Tesla Model Y Accessories 2024 Mud Flaps Tire Splash Guards fit…

    Model Y Mud Flaps for Tesla Model Y Accessories 2024 Mud Flaps Tire Splash Guards fit…

    Tesla CCS Adapter, Fast and Efficient Charging Adapter for Tesla Model 3 Y S X, Portable…

    Tesla CCS Adapter, Fast and Efficient Charging Adapter for Tesla Model 3 Y S X, Portable…

    4 PCS LED Reverse Lights, 4014 45SMD 6500K 800LM High Bright Brake Light Turn Signal…

    4 PCS LED Reverse Lights, 4014 45SMD 6500K 800LM High Bright Brake Light Turn Signal…

    4 Pack Trailer Ball Cover, 2.36In x 2.24In x 1.97In Waterproof Dustproof Towing Hitch…

    4 Pack Trailer Ball Cover, 2.36In x 2.24In x 1.97In Waterproof Dustproof Towing Hitch…

    ClimaTex Heavy Duty Car, Truck, Van, and SUV Automotive Floor Mat for Floor Protection,…

    ClimaTex Heavy Duty Car, Truck, Van, and SUV Automotive Floor Mat for Floor Protection,…

    2 Pcs Tow Hook Covers Compatible with Tesla Cybertruck Accessories 2024 2025 (Red)

    2 Pcs Tow Hook Covers Compatible with Tesla Cybertruck Accessories 2024 2025 (Red)

    MAXDOM Under Seat Storage Fit for 2024+ Tesla Cybertruck Rear Underseat Organizer Box…

    MAXDOM Under Seat Storage Fit for 2024+ Tesla Cybertruck Rear Underseat Organizer Box…

    Car USB Hub Charger for Tesla Model Y 2021-2024 and Model 3 2021-2023,Fast…

    Car USB Hub Charger for Tesla Model Y 2021-2024 and Model 3 2021-2023,Fast…

  • UFO
    FPV Drone with 4K Dual HD Cameras Upgraded Version RC Quadcopter

    FPV Drone with 4K Dual HD Cameras Upgraded Version RC Quadcopter

    Incredible UFO Sightings by Pilots Challenge Our Understanding of Technology !

    Incredible UFO Sightings by Pilots Challenge Our Understanding of Technology !

    CINOTON 160W UFO LED High Bay Light, Aluminum LED Shop Lights with 24000LM, 5000K Commercial Bay Lighting for Warehouse Garage Workshop Factory, 6′ Cable & Safety Rope, ETL Listed 1 Pack

    CINOTON 160W UFO LED High Bay Light, Aluminum LED Shop Lights with 24000LM, 5000K Commercial Bay Lighting for Warehouse Garage Workshop Factory, 6′ Cable & Safety Rope, ETL Listed 1 Pack

    Rewi beklaut Dner & Neue Projekte mit dem kompletten UFO

    Rewi beklaut Dner & Neue Projekte mit dem kompletten UFO

    Spacecraft Systems Engineering

    Spacecraft Systems Engineering

    NASA UAP Researchers Share Shocking UFO Evidence!

    NASA UAP Researchers Share Shocking UFO Evidence!

    UFOs Over Phoenix: Confessions of a 911 Operator [DVD]

    UFOs Over Phoenix: Confessions of a 911 Operator [DVD]

    Have Aliens Visited Earth? | COLOSSAL MYSTERIES

    Have Aliens Visited Earth? | COLOSSAL MYSTERIES

    MINDBLOWING Encounters Unraveling the Secrets of Higher Dimensions

    MINDBLOWING Encounters Unraveling the Secrets of Higher Dimensions

No Result
View All Result
Techcratic
No Result
View All Result
Home MIT Tech

Method prevents an AI model from being overconfident about wrong answers | MIT News

MIT Tech by MIT Tech
October 14, 2024
in MIT Tech
Reading Time: 5 mins read
124 7
A A
0
Share on FacebookShare on XShare on LinkedIn


Adam Zewe | MIT News
2024-07-31 00:00:00
news.mit.edu

People use large language models for a huge array of tasks, from translating an article to identifying financial fraud. However, despite the incredible capabilities and versatility of these models, they sometimes generate inaccurate responses.

On top of that problem, the models can be overconfident about wrong answers or underconfident about correct ones, making it tough for a user to know when a model can be trusted.

Researchers typically calibrate a machine-learning model to ensure its level of confidence lines up with its accuracy. A well-calibrated model should have less confidence about an incorrect prediction, and vice-versa. But because large language models (LLMs) can be applied to a seemingly endless collection of diverse tasks, traditional calibration methods are ineffective.

Now, researchers from MIT and the MIT-IBM Watson AI Lab have introduced a calibration method tailored to large language models. Their method, called Thermometer, involves building a smaller, auxiliary model that runs on top of a large language model to calibrate it.

Thermometer is more efficient than other approaches — requiring less power-hungry computation — while preserving the accuracy of the model and enabling it to produce better-calibrated responses on tasks it has not seen before.

By enabling efficient calibration of an LLM for a variety of tasks, Thermometer could help users pinpoint situations where a model is overconfident about false predictions, ultimately preventing them from deploying that model in a situation where it may fail.

“With Thermometer, we want to provide the user with a clear signal to tell them whether a model’s response is accurate or inaccurate, in a way that reflects the model’s uncertainty, so they know if that model is reliable,” says Maohao Shen, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on Thermometer.

Shen is joined on the paper by Gregory Wornell, the Sumitomo Professor of Engineering who leads the Signals, Information, and Algorithms Laboratory in the Research Laboratory for Electronics, and is a member of the MIT-IBM Watson AI Lab; senior author Soumya Ghosh, a research staff member in the MIT-IBM Watson AI Lab; as well as others at MIT and the MIT-IBM Watson AI Lab. The research was recently presented at the International Conference on Machine Learning.

Universal calibration

Since traditional machine-learning models are typically designed to perform a single task, calibrating them usually involves one task-specific method. On the other hand, since LLMs have the flexibility to perform many tasks, using a traditional method to calibrate that model for one task might hurt its performance on another task.

Calibrating an LLM often involves sampling from the model multiple times to obtain different predictions and then aggregating these predictions to obtain better-calibrated confidence. However, because these models have billions of parameters, the computational costs of such approaches rapidly add up.

“In a sense, large language models are universal because they can handle various tasks. So, we need a universal calibration method that can also handle many different tasks,” says Shen.

With Thermometer, the researchers developed a versatile technique that leverages a classical calibration method called temperature scaling to efficiently calibrate an LLM for a new task.

In this context, a “temperature” is a scaling parameter used to adjust a model’s confidence to be aligned with its prediction accuracy. Traditionally, one determines the right temperature using a labeled validation dataset of task-specific examples.

Since LLMs are often applied to new tasks, labeled datasets can be nearly impossible to acquire. For instance, a user who wants to deploy an LLM to answer customer questions about a new product likely does not have a dataset containing such questions and answers.

Instead of using a labeled dataset, the researchers train an auxiliary model that runs on top of an LLM to automatically predict the temperature needed to calibrate it for this new task.

They use labeled datasets of a few representative tasks to train the Thermometer model, but then once it has been trained, it can generalize to new tasks in a similar category without the need for additional labeled data.

A Thermometer model trained on a collection of multiple-choice question datasets, perhaps including one with algebra questions and one with medical questions, could be used to calibrate an LLM that will answer questions about geometry or biology, for instance.

“The aspirational goal is for it to work on any task, but we are not quite there yet,” Ghosh says.   

The Thermometer model only needs to access a small part of the LLM’s inner workings to predict the right temperature that will calibrate its prediction for data points of a specific task. 

An efficient approach

Importantly, the technique does not require multiple training runs and only slightly slows the LLM. Plus, since temperature scaling does not alter a model’s predictions, Thermometer preserves its accuracy.

When they compared Thermometer to several baselines on multiple tasks, it consistently produced better-calibrated uncertainty measures while requiring much less computation.

“As long as we train a Thermometer model on a sufficiently large number of tasks, it should be able to generalize well across any new task, just like a large language model, it is also a universal model,” Shen adds.

The researchers also found that if they train a Thermometer model for a smaller LLM, it can be directly applied to calibrate a larger LLM within the same family.

In the future, they want to adapt Thermometer for more complex text-generation tasks and apply the technique to even larger LLMs. The researchers also hope to quantify the diversity and number of labeled datasets one would need to train a Thermometer model so it can generalize to a new task.

This research was funded, in part, by the MIT-IBM Watson AI Lab.

Source Link

Support Techcratic

If you find value in Techcratic’s insights and articles, consider supporting us with Bitcoin. Your support helps me, as a solo operator, continue delivering high-quality content while managing all the technical aspects, from server maintenance to blog writing, future updates, and improvements. Support Innovation! Thank you.

Bitcoin Address:

bc1qlszw7elx2qahjwvaryh0tkgg8y68enw30gpvge

Please verify this address before sending funds.

Bitcoin QR Code

Simply scan the QR code below to support Techcratic.

Bitcoin QR code for donations

Please read the Privacy and Security Disclaimer on how Techcratic handles your support.

Disclaimer: As an Amazon Associate, Techcratic may earn from qualifying purchases.

Tags: MIT Tech
Share162Tweet101Share28
Previous Post

Microscope system sharpens scientists’ view of neural circuit connections | MIT News

Next Post

How to Use Ventoy: Create Multi-Boot USB Drives Easily

MIT Tech

MIT Tech

Discover cutting-edge research and technological breakthroughs with MIT Tech. Explore innovative projects and academic insights shaping the future of technology. Stay informed with the latest articles here at Techcratic.

Related Posts

Helping machines understand visual content with AI | MIT News
MIT Tech

Helping machines understand visual content with AI | MIT News

June 9, 2025
1.3k
AI-enabled control system helps autonomous drones stay on target in uncertain environments | MIT News
MIT Tech

AI-enabled control system helps autonomous drones stay on target in uncertain environments | MIT News

June 9, 2025
1.3k
New facility to accelerate materials solutions for fusion energy | MIT News
MIT Tech

New facility to accelerate materials solutions for fusion energy | MIT News

June 9, 2025
1.3k
The Download: An inspiring toy robot arm, and why AM radio matters
MIT Tech

The Download: An inspiring toy robot arm, and why AM radio matters

June 9, 2025
1.3k
Animation technique simulates the motion of squishy objects | MIT News
MIT Tech

Animation technique simulates the motion of squishy objects | MIT News

June 6, 2025
1.3k
Former MIT researchers advance a new model for innovation | MIT News
MIT Tech

Former MIT researchers advance a new model for innovation | MIT News

June 6, 2025
1.3k
Load More
Next Post
How to Use Ventoy: Create Multi-Boot USB Drives Easily

How to Use Ventoy: Create Multi-Boot USB Drives Easily

Taiwan’s TSMC is planning more chip fabs in Europe

Taiwan's TSMC is planning more chip fabs in Europe

How to increase the rate of plastics recycling | MIT News

How to increase the rate of plastics recycling | MIT News

Your Tech Resources

  • 30 Second Tech ™
  • AI
  • App Zone ™
  • Apple
  • Ars Technica
  • CNET
  • ComputerWorld
  • Crypto News
  • Cybersecurity
  • Endgadget
  • Fossbytes
  • Gaming
  • GeekWire
  • Gizmodo
  • Google News
  • Hacker News
  • Harvard Tech
  • I Like Cats ™
  • I Like Dogs ™
  • LifeHacker
  • MacRumors
  • Macworld
  • Mashable
  • Microsoft
  • MIT Tech
  • PC World
  • Photofocus
  • Physics
  • Random Tech
  • Retro Rewind ™
  • Robot Report
  • SiliconANGLE
  • SlashGear
  • Smartphone
  • StackSocial
  • Tech Art
  • Tech Careers
  • Tech Deals
  • Techcratic ™
  • TechCrunch
  • Techdirt
  • TechRepublic
  • Techs Got To Eat ™
  • TechSpot
  • Tesla
  • The Verge
  • TNW
  • Trusted Reviews
  • UFO
  • VentureBeat
  • Visual Capitalist
  • Weird Stuff
  • Wired
  • ZDNet

Tech News

  • 30 Second Tech ™
  • AI
  • AnandTech
  • Apple Insider
  • Ars Technica
  • CNET
  • ComputerWorld
  • Crypto News
  • Cybersecurity
  • Endgadget
  • ExtremeTech
  • Fossbytes
  • Gaming
  • GeekWire
  • Gizmodo

Tech News

  • Harvard Tech
  • MacRumors
  • Macworld
  • Mashable
  • Microsoft
  • MIT Tech
  • Physics
  • PC World
  • Random Tech
  • Retro Rewind ™
  • SiliconANGLE
  • SlashGear
  • Smartphone
  • StackSocial
  • Tech Careers

Tech News​

  • Tech Art
  • TechCrunch
  • Techdirt
  • TechRepublic
  • Techs Got To Eat ™
  • TechSpot
  • Tesla
  • The Verge
  • TNW
  • Trusted Reviews
  • UFO
  • VentureBeat
  • Visual Capitalist
  • Weird Stuff
  • Wired
  • ZDNet

Site Links

  • About Techcratic
  • Affiliate Disclaimer
  • Affiliate Link Policy
  • Contact Techcratic
  • Dealors Discount Store
  • Privacy and Security Disclaimer
  • Privacy Policy
  • RSS Feed
  • Site Map
  • Support Techcratic
  • Techcratic
  • Tech Deals
  • TOS
  • 𝕏
Click For A Secret Deal

Techcratic – Your All In One Tech Hub © 2020 – 2025
All Rights Reserved
∞

No Result
View All Result
  • Home
  • Apple
  • Gaming
  • Microsoft
  • AnandTech