Thursday, July 3, 2025
Techcratic

Method prevents an AI model from being overconfident about wrong answers | MIT News

October 14, 2024

Adam Zewe | MIT News
2024-07-31 00:00:00
news.mit.edu

People use large language models for a huge array of tasks, from translating an article to identifying financial fraud. However, despite the incredible capabilities and versatility of these models, they sometimes generate inaccurate responses.

On top of that problem, the models can be overconfident about wrong answers or underconfident about correct ones, making it tough for a user to know when a model can be trusted.

Researchers typically calibrate a machine-learning model to ensure its level of confidence lines up with its accuracy. A well-calibrated model should be less confident about an incorrect prediction and more confident about a correct one. But because large language models (LLMs) can be applied to a seemingly endless collection of diverse tasks, traditional calibration methods are ineffective.
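To make "confidence lines up with accuracy" concrete, here is a minimal sketch of expected calibration error, a commonly used calibration metric. The article does not say which metric the researchers used, so the metric choice, the function name, and the toy numbers below are all illustrative assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy, per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Toy data (made up): the model claims 90% confidence but is right half the time.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, False, True, False]
)
# Toy data (made up): 75% confidence, right three times out of four.
calibrated = expected_calibration_error(
    [0.75, 0.75, 0.75, 0.75], [True, True, True, False]
)
```

In this toy setup the overconfident model scores a large gap, while the model whose stated confidence matches its hit rate scores zero.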

Now, researchers from MIT and the MIT-IBM Watson AI Lab have introduced a calibration method tailored to large language models. Their method, called Thermometer, involves building a smaller, auxiliary model that runs on top of a large language model to calibrate it.

Thermometer is more efficient than other approaches, requiring less power-hungry computation, while preserving the accuracy of the model and enabling it to produce better-calibrated responses on tasks it has not seen before.

By enabling efficient calibration of an LLM for a variety of tasks, Thermometer could help users pinpoint situations where a model is overconfident about false predictions, ultimately preventing them from deploying that model in a situation where it may fail.

“With Thermometer, we want to provide the user with a clear signal to tell them whether a model’s response is accurate or inaccurate, in a way that reflects the model’s uncertainty, so they know if that model is reliable,” says Maohao Shen, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on Thermometer.

Shen is joined on the paper by Gregory Wornell, the Sumitomo Professor of Engineering who leads the Signals, Information, and Algorithms Laboratory in the Research Laboratory of Electronics, and is a member of the MIT-IBM Watson AI Lab; senior author Soumya Ghosh, a research staff member in the MIT-IBM Watson AI Lab; as well as others at MIT and the MIT-IBM Watson AI Lab. The research was recently presented at the International Conference on Machine Learning.

Universal calibration

Since traditional machine-learning models are typically designed to perform a single task, calibrating them usually involves one task-specific method. On the other hand, since LLMs have the flexibility to perform many tasks, using a traditional method to calibrate that model for one task might hurt its performance on another task.

Calibrating an LLM often involves sampling from the model multiple times to obtain different predictions and then aggregating these predictions to obtain better-calibrated confidence. However, because these models have billions of parameters, the computational costs of such approaches rapidly add up.
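The sampling approach described above can be sketched roughly as follows. This is an illustration only: `sample_fn` is a hypothetical stand-in for one full LLM generation, and majority voting is just one simple way to aggregate, not necessarily what the baselines in the paper do.

```python
from collections import Counter


def aggregate_samples(sample_fn, n_samples):
    """Query the model n_samples times; report the majority answer and its share."""
    answers = [sample_fn() for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n_samples


# Stand-in for an LLM: replays a fixed list of made-up answers, one per call.
_fake_outputs = iter(["B", "B", "A", "B", "C", "B", "B", "A", "B", "B"])


def fake_llm_sample():
    return next(_fake_outputs)


answer, confidence = aggregate_samples(fake_llm_sample, n_samples=10)
```

Each call to `sample_fn` stands for a full pass through a model with billions of parameters, so the cost grows linearly with `n_samples`. That is the expense the article refers to.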

“In a sense, large language models are universal because they can handle various tasks. So, we need a universal calibration method that can also handle many different tasks,” says Shen.

With Thermometer, the researchers developed a versatile technique that leverages a classical calibration method called temperature scaling to efficiently calibrate an LLM for a new task.

In this context, a “temperature” is a scaling parameter used to adjust a model’s confidence to be aligned with its prediction accuracy. Traditionally, one determines the right temperature using a labeled validation dataset of task-specific examples.
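A minimal sketch of classical temperature scaling, assuming a small labeled validation set is available: the model's raw scores (logits) are divided by a temperature before the softmax, and the temperature is chosen to minimize the average negative log-likelihood on the validation data. The grid search, the function names, and the toy logits are assumptions made for illustration; in practice the temperature is often fitted with a gradient-based optimizer.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels."""
    losses = [
        -math.log(softmax(row, temperature)[y])
        for row, y in zip(logit_rows, labels)
    ]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, candidates):
    """Pick the candidate temperature with the lowest validation NLL."""
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


# Toy validation set (made up): very confident logits, but one of four is wrong.
val_logits = [[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
val_labels = [0, 0, 1, 1]
candidates = [0.5 + 0.1 * i for i in range(46)]  # 0.5, 0.6, ..., 5.0

best_t = fit_temperature(val_logits, val_labels, candidates)
scaled_confidence = max(softmax(val_logits[0], best_t))
```

Because the toy model is right only three times out of four, the fitted temperature comes out above 1, which softens its confidence toward roughly 75 percent.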

Since LLMs are often applied to new tasks, labeled datasets can be nearly impossible to acquire. For instance, a user who wants to deploy an LLM to answer customer questions about a new product likely does not have a dataset containing such questions and answers.

Instead of using a labeled dataset, the researchers train an auxiliary model that runs on top of an LLM to automatically predict the temperature needed to calibrate it for this new task.

They use labeled datasets of a few representative tasks to train the Thermometer model, but then once it has been trained, it can generalize to new tasks in a similar category without the need for additional labeled data.

A Thermometer model trained on a collection of multiple-choice question datasets, perhaps including one with algebra questions and one with medical questions, could be used to calibrate an LLM that will answer questions about geometry or biology, for instance.

“The aspirational goal is for it to work on any task, but we are not quite there yet,” Ghosh says.   

The Thermometer model only needs to access a small part of the LLM’s inner workings to predict the right temperature that will calibrate its prediction for data points of a specific task. 
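One way to picture such an auxiliary model is as a small function that maps features taken from the LLM to a positive temperature, averaged over unlabeled examples from the new task. The sketch below is an assumption-heavy illustration, not the authors' architecture: the single linear layer, the softplus output, the averaging step, the feature vectors, and the weights are all hypothetical placeholders.

```python
import math


def softplus(x):
    """Smooth map from any real number to a positive one (a temperature must be > 0)."""
    return math.log1p(math.exp(x))


def predict_temperature(feature_rows, weights, bias):
    """Average a per-example temperature over unlabeled examples of one task."""
    per_example = [
        softplus(sum(w * f for w, f in zip(weights, row)) + bias)
        for row in feature_rows
    ]
    return sum(per_example) / len(per_example)


# Hypothetical LLM features for three unlabeled prompts from a new task.
features = [[0.2, -0.1, 0.4], [0.0, 0.3, 0.1], [0.5, 0.2, -0.2]]
weights = [0.8, -0.5, 1.2]  # placeholder values, not trained parameters
bias = 0.3

task_temperature = predict_temperature(features, weights, bias)
```

The point of the sketch is that no labels for the new task appear anywhere: the temperature is predicted from the inputs alone, using parameters that would have been learned earlier on other labeled tasks.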

An efficient approach

Importantly, the technique does not require multiple training runs and only slightly slows the LLM. Plus, since temperature scaling does not alter a model’s predictions, Thermometer preserves its accuracy.
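The reason temperature scaling leaves predictions unchanged is that dividing every score by the same positive number keeps their order, so the top-ranked answer stays the same and only the confidence attached to it moves. A small check with made-up logits illustrates this:

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 0.5, 3.1, -1.0]  # illustrative scores for four answer options
temperatures = [0.5, 1.0, 2.0, 5.0]

predictions = []
confidences = []
for t in temperatures:
    probs = softmax(logits, t)
    predictions.append(probs.index(max(probs)))  # which option wins
    confidences.append(max(probs))               # how sure the model claims to be
```

Across all four temperatures the winning option is the same, while the reported confidence shrinks as the temperature rises.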

When they compared Thermometer to several baselines on multiple tasks, it consistently produced better-calibrated uncertainty measures while requiring much less computation.

“As long as we train a Thermometer model on a sufficiently large number of tasks, it should be able to generalize well across any new task, just like a large language model, it is also a universal model,” Shen adds.

The researchers also found that if they train a Thermometer model for a smaller LLM, it can be directly applied to calibrate a larger LLM within the same family.

In the future, they want to adapt Thermometer for more complex text-generation tasks and apply the technique to even larger LLMs. The researchers also hope to quantify the diversity and number of labeled datasets one would need to train a Thermometer model so it can generalize to a new task.

This research was funded, in part, by the MIT-IBM Watson AI Lab.

