• About TC
  • Affiliate Disclaimer
  • Privacy Policy
  • TOS
  • Contact
Tuesday, June 17, 2025
Techcratic
  • TC
  • AI
    Artificial Intelligence

    How Apollo Tyres is unlocking machine insights using agentic AI-powered Manufacturing Reasoner

    Artificial Intelligence

    Automatically Build AI Workflows with Magical AI

    Artificial Intelligence

    Amazon Nova Lite enables Bito to offer a free tier option for its AI-powered code reviews

    Artificial Intelligence

    Bridging the Gap: New Datasets Push Recommender Research Toward Real-World Scale

    Artificial Intelligence

    7 Python Errors That Are Actually Features

    Artificial Intelligence

    10 Awesome OCR Models for 2025

    Artificial Intelligence

    5 Error Handling Patterns in Python (Beyond Try-Except)

    Artificial Intelligence

    Top 5 Alternative Data Career Paths and How to Learn Them for Free

    Artificial Intelligence

    Implementing Machine Learning Pipelines with Apache Spark

  • Crypto
    Uniswap Surges 24% on $88B Volume, Targeting $12

    Pump.fun Accused of Stealing $741 M in Fees, Critics Warn

    Canada Approves First XRP Spot ETF on Toronto Stock Exchange

    Canada Approves First XRP Spot ETF on Toronto Stock Exchange

    Fold Announces $250M Equity Deal to Bolster Bitcoin Treasury

    Fold Announces $250M Equity Deal to Bolster Bitcoin Treasury

    Key BTC price levels to watch as fed rate cut hopes fade

    Key BTC price levels to watch as fed rate cut hopes fade

    Theminermag Bitcoin Mining Update: May/June 2025

    Theminermag Bitcoin Mining Update: May/June 2025

    Warning: Blackrock Could Orchestrate Institutional Bitcoin Takeover

    Warning: Blackrock Could Orchestrate Institutional Bitcoin Takeover

    The Curious Case of the Pentagon Pizza Index: It Accurately Predicts Wars

    The Curious Case of the Pentagon Pizza Index: It Accurately Predicts Wars

    Best Presales to Buy Today – Which Coins Are Poised for a Breakout?

    Crypto Price Prediction Today: XRP, Cardano, Dogecoin

    Bybit Debuts Hybrid Exchange Byreal on Solana, Targets Q3 Mainnet Launch

    Bybit Debuts Hybrid Exchange Byreal on Solana, Targets Q3 Mainnet Launch

  • Cybersecurity
    Cybersecurity

    Hard-Coded ‘b’ Password in Sitecore XP Sparks Major RCE Risk in Enterprise Deployments

    Cybersecurity

    AI Agents Run on Secret Accounts — Learn How to Secure Them in This Webinar

    Cybersecurity

    How to Address the Expanding Security Risk

    Cybersecurity

    ConnectWise to Rotate ScreenConnect Code Signing Certificates Due to Security Risks

    Cybersecurity

    5 Lessons from River Island

    Cybersecurity

    INTERPOL Dismantles 20,000+ Malicious IPs Linked to 69 Malware Variants in Operation Secure

    Cybersecurity

    SinoTrack GPS Devices Vulnerable to Remote Vehicle Control via Default Passwords

    Cybersecurity

    Researchers Uncover 20+ Configuration Risks, Including Five CVEs, in Salesforce Industry Cloud

    Cybersecurity

    Adobe Releases Patch Fixing 254 Vulnerabilities, Closing High-Severity Security Gaps

  • Deals
    Razer Enki X Essential Gaming Chair: All-Day Comfort – Built-in Lumbar Arch – Optimized…

    Razer Enki X Essential Gaming Chair: All-Day Comfort – Built-in Lumbar Arch – Optimized…

    MSI Thin 15.6 inch FHD 144Hz Gaming Laptop Intel Core i5-13420H NVIDIA GeForce RTX…

    MSI Thin 15.6 inch FHD 144Hz Gaming Laptop Intel Core i5-13420H NVIDIA GeForce RTX…

    Sonic’s Ultimate Genesis Collection (Platinum Hits) – Xbox 360 (Renewed)

    Sonic’s Ultimate Genesis Collection (Platinum Hits) – Xbox 360 (Renewed)

    Donkey Kong Country Returns (Renewed)

    Donkey Kong Country Returns (Renewed)

    Buffalo Games CHRONOLOGY – The Game Where You Make History – 20th Anniversary Edition

    Buffalo Games CHRONOLOGY – The Game Where You Make History – 20th Anniversary Edition

    Sprunki Plush Toys, Horror Games Plushies Toy for Fans, Soft Stuffed Animal Pillow…

    Sprunki Plush Toys, Horror Games Plushies Toy for Fans, Soft Stuffed Animal Pillow…

    Western Digital 8TB WD Red Plus NAS Internal Hard Drive HDD – 5640 RPM, SATA 6 Gb/s,…

    Western Digital 8TB WD Red Plus NAS Internal Hard Drive HDD – 5640 RPM, SATA 6 Gb/s,…

    Seagate BarraCuda Mobile Hard Drive 4TB SATA 6Gb/s 128MB Cache 2.5-Inch 15mm…

    Seagate BarraCuda Mobile Hard Drive 4TB SATA 6Gb/s 128MB Cache 2.5-Inch 15mm…

    Lexar 128GB (2-PK) Professional SILVER PRO SD Card, UHS-II, C10, U3, V60, Full HD, 4K,…

    Lexar 128GB (2-PK) Professional SILVER PRO SD Card, UHS-II, C10, U3, V60, Full HD, 4K,…

  • Gaming

    Forspoken Reviews (Absolutely Fantastic)

    ‘I can’t even hit anything’: Path of Exile player watches helplessly as his mercenary wipes out screens of enemies before he can even reach them

    ‘I can’t even hit anything’: Path of Exile player watches helplessly as his mercenary wipes out screens of enemies before he can even reach them

    I'm Still On The Fence About Sonic Frontiers – Here's Why

    I'm Still On The Fence About Sonic Frontiers – Here's Why

    Ultimate Review: God of War Ragnarok DLC

    Ultimate Review: God of War Ragnarok DLC

    REDRAGON S101 GAMING KEYBOARD

    Chase the Skies and Vibrant Visuals Playable Today

    Layers Of Fear – Gameplay Walkthroughs Day 14 ( Horror Games )

    Layers Of Fear – Gameplay Walkthroughs Day 14 ( Horror Games )

    Hogwarts Legacy San Bakar's Trial Main Quest Full Walkthrough

    Hogwarts Legacy San Bakar's Trial Main Quest Full Walkthrough

    I Found TREASURE with SoloViner in Palworld! (PART 2)

    I Found TREASURE with SoloViner in Palworld! (PART 2)

    Intel’s next-gen Nova Lake CPU rumoured to get up to 52 cores, over double the count of Arrow Lake across all segments

    Intel’s next-gen Nova Lake CPU rumoured to get up to 52 cores, over double the count of Arrow Lake across all segments

  • Tesla
    2025 Upgrade Sunshade Roof for Tesla Model Y Accessories, [Graphene Cooling Tech & High…

    2025 Upgrade Sunshade Roof for Tesla Model Y Accessories, [Graphene Cooling Tech & High…

    Tesla (TSLA) is sitting on so much inventory it has to take over parking lots all over the US

    Tesla (TSLA) is sitting on so much inventory it has to take over parking lots all over the US

    Tesla (TSLA) plans to pause production at Gigafactory Texas for second time in 2 months

    DEWALT CCS1 to NACS Fast Charging Adapter for All 2021 and Newer Tesla Models Excluding…

    DEWALT CCS1 to NACS Fast Charging Adapter for All 2021 and Newer Tesla Models Excluding…

    6PCS Trunk Mats & Frunk Mat & Backrest Mats for New 2025 2026 Tesla Model Y Juniper…

    6PCS Trunk Mats & Frunk Mat & Backrest Mats for New 2025 2026 Tesla Model Y Juniper…

    Tesla gives update on Tesla Semi factory, says on track for volume production in 2026

    Tesla gears up to start selling Tesla Semi electric truck in Europe

    Center Console Organizer Tesla Cybertruck Center Console Cover Cup Holder, Console…

    Center Console Organizer Tesla Cybertruck Center Console Cover Cup Holder, Console…

    Bloomberg just released the most embarrassing report about Tesla, Waymo, and self-driving

    BYD overtakes Tesla as China’s EV giants dominate global sales

    BYD overtakes Tesla as China’s EV giants dominate global sales

  • UFO
    The Alien Experiment | He saw Aliens #vigyanrecharge

    The Alien Experiment | He saw Aliens #vigyanrecharge

    UFO Completes 5 Orbits Around the Moon?! | Ancient Aliens | #Shorts

    UFO Completes 5 Orbits Around the Moon?! | Ancient Aliens | #Shorts

    A Pleiadian Contactee Describes His Experience

    A Pleiadian Contactee Describes His Experience

    Aidatain Outer Space Spaceship Tapestry Interior International Space Station Wall Hanging, Art Large Tapestry Spacecraft Backdrop 80″X 60″ Flannel for Bedroom Home Decor TFNAT0123

    Aidatain Outer Space Spaceship Tapestry Interior International Space Station Wall Hanging, Art Large Tapestry Spacecraft Backdrop 80″X 60″ Flannel for Bedroom Home Decor TFNAT0123

    A protagonista feia de intergalatic    #geek #games #sony #playsation #fyp #intergalactic #jogos

    A protagonista feia de intergalatic #geek #games #sony #playsation #fyp #intergalactic #jogos

    Inspiration for Space Exploration | The West Wing

    Inspiration for Space Exploration | The West Wing

    Pop Culture Conspiracy Theories! Taylor Swift, SHEIN, and Deadpool & Wolverine!!

    Pop Culture Conspiracy Theories! Taylor Swift, SHEIN, and Deadpool & Wolverine!!

    What is Unidentified Flying Object?

    What is Unidentified Flying Object?

    The Visitor

    The Visitor

No Result
View All Result
  • TC
  • AI
    Artificial Intelligence

    How Apollo Tyres is unlocking machine insights using agentic AI-powered Manufacturing Reasoner

    Artificial Intelligence

    Automatically Build AI Workflows with Magical AI

    Artificial Intelligence

    Amazon Nova Lite enables Bito to offer a free tier option for its AI-powered code reviews

    Artificial Intelligence

    Bridging the Gap: New Datasets Push Recommender Research Toward Real-World Scale

    Artificial Intelligence

    7 Python Errors That Are Actually Features

    Artificial Intelligence

    10 Awesome OCR Models for 2025

    Artificial Intelligence

    5 Error Handling Patterns in Python (Beyond Try-Except)

    Artificial Intelligence

    Top 5 Alternative Data Career Paths and How to Learn Them for Free

    Artificial Intelligence

    Implementing Machine Learning Pipelines with Apache Spark

  • Crypto
    Uniswap Surges 24% on $88B Volume, Targeting $12

    Pump.fun Accused of Stealing $741 M in Fees, Critics Warn

    Canada Approves First XRP Spot ETF on Toronto Stock Exchange

    Canada Approves First XRP Spot ETF on Toronto Stock Exchange

    Fold Announces $250M Equity Deal to Bolster Bitcoin Treasury

    Fold Announces $250M Equity Deal to Bolster Bitcoin Treasury

    Key BTC price levels to watch as fed rate cut hopes fade

    Key BTC price levels to watch as fed rate cut hopes fade

    Theminermag Bitcoin Mining Update: May/June 2025

    Theminermag Bitcoin Mining Update: May/June 2025

    Warning: Blackrock Could Orchestrate Institutional Bitcoin Takeover

    Warning: Blackrock Could Orchestrate Institutional Bitcoin Takeover

    The Curious Case of the Pentagon Pizza Index: It Accurately Predicts Wars

    The Curious Case of the Pentagon Pizza Index: It Accurately Predicts Wars

    Best Presales to Buy Today – Which Coins Are Poised for a Breakout?

    Crypto Price Prediction Today: XRP, Cardano, Dogecoin

    Bybit Debuts Hybrid Exchange Byreal on Solana, Targets Q3 Mainnet Launch

    Bybit Debuts Hybrid Exchange Byreal on Solana, Targets Q3 Mainnet Launch

  • Cybersecurity
    Cybersecurity

    Hard-Coded ‘b’ Password in Sitecore XP Sparks Major RCE Risk in Enterprise Deployments

    Cybersecurity

    AI Agents Run on Secret Accounts — Learn How to Secure Them in This Webinar

    Cybersecurity

    How to Address the Expanding Security Risk

    Cybersecurity

    ConnectWise to Rotate ScreenConnect Code Signing Certificates Due to Security Risks

    Cybersecurity

    5 Lessons from River Island

    Cybersecurity

    INTERPOL Dismantles 20,000+ Malicious IPs Linked to 69 Malware Variants in Operation Secure

    Cybersecurity

    SinoTrack GPS Devices Vulnerable to Remote Vehicle Control via Default Passwords

    Cybersecurity

    Researchers Uncover 20+ Configuration Risks, Including Five CVEs, in Salesforce Industry Cloud

    Cybersecurity

    Adobe Releases Patch Fixing 254 Vulnerabilities, Closing High-Severity Security Gaps

  • Deals
    Razer Enki X Essential Gaming Chair: All-Day Comfort – Built-in Lumbar Arch – Optimized…

    Razer Enki X Essential Gaming Chair: All-Day Comfort – Built-in Lumbar Arch – Optimized…

    MSI Thin 15.6 inch FHD 144Hz Gaming Laptop Intel Core i5-13420H NVIDIA GeForce RTX…

    MSI Thin 15.6 inch FHD 144Hz Gaming Laptop Intel Core i5-13420H NVIDIA GeForce RTX…

    Sonic’s Ultimate Genesis Collection (Platinum Hits) – Xbox 360 (Renewed)

    Sonic’s Ultimate Genesis Collection (Platinum Hits) – Xbox 360 (Renewed)

    Donkey Kong Country Returns (Renewed)

    Donkey Kong Country Returns (Renewed)

    Buffalo Games CHRONOLOGY – The Game Where You Make History – 20th Anniversary Edition

    Buffalo Games CHRONOLOGY – The Game Where You Make History – 20th Anniversary Edition

    Sprunki Plush Toys, Horror Games Plushies Toy for Fans, Soft Stuffed Animal Pillow…

    Sprunki Plush Toys, Horror Games Plushies Toy for Fans, Soft Stuffed Animal Pillow…

    Western Digital 8TB WD Red Plus NAS Internal Hard Drive HDD – 5640 RPM, SATA 6 Gb/s,…

    Western Digital 8TB WD Red Plus NAS Internal Hard Drive HDD – 5640 RPM, SATA 6 Gb/s,…

    Seagate BarraCuda Mobile Hard Drive 4TB SATA 6Gb/s 128MB Cache 2.5-Inch 15mm…

    Seagate BarraCuda Mobile Hard Drive 4TB SATA 6Gb/s 128MB Cache 2.5-Inch 15mm…

    Lexar 128GB (2-PK) Professional SILVER PRO SD Card, UHS-II, C10, U3, V60, Full HD, 4K,…

    Lexar 128GB (2-PK) Professional SILVER PRO SD Card, UHS-II, C10, U3, V60, Full HD, 4K,…

  • Gaming

    Forspoken Reviews (Absolutely Fantastic)

    ‘I can’t even hit anything’: Path of Exile player watches helplessly as his mercenary wipes out screens of enemies before he can even reach them

    ‘I can’t even hit anything’: Path of Exile player watches helplessly as his mercenary wipes out screens of enemies before he can even reach them

    I'm Still On The Fence About Sonic Frontiers – Here's Why

    I'm Still On The Fence About Sonic Frontiers – Here's Why

    Ultimate Review: God of War Ragnarok DLC

    Ultimate Review: God of War Ragnarok DLC

    REDRAGON S101 GAMING KEYBOARD

    Chase the Skies and Vibrant Visuals Playable Today

    Layers Of Fear – Gameplay Walkthroughs Day 14 ( Horror Games )

    Layers Of Fear – Gameplay Walkthroughs Day 14 ( Horror Games )

    Hogwarts Legacy San Bakar's Trial Main Quest Full Walkthrough

    Hogwarts Legacy San Bakar's Trial Main Quest Full Walkthrough

    I Found TREASURE with SoloViner in Palworld! (PART 2)

    I Found TREASURE with SoloViner in Palworld! (PART 2)

    Intel’s next-gen Nova Lake CPU rumoured to get up to 52 cores, over double the count of Arrow Lake across all segments

    Intel’s next-gen Nova Lake CPU rumoured to get up to 52 cores, over double the count of Arrow Lake across all segments

  • Tesla
    2025 Upgrade Sunshade Roof for Tesla Model Y Accessories, [Graphene Cooling Tech & High…

    2025 Upgrade Sunshade Roof for Tesla Model Y Accessories, [Graphene Cooling Tech & High…

    Tesla (TSLA) is sitting on so much inventory it has to take over parking lots all over the US

    Tesla (TSLA) is sitting on so much inventory it has to take over parking lots all over the US

    Tesla (TSLA) plans to pause production at Gigafactory Texas for second time in 2 months

    DEWALT CCS1 to NACS Fast Charging Adapter for All 2021 and Newer Tesla Models Excluding…

    DEWALT CCS1 to NACS Fast Charging Adapter for All 2021 and Newer Tesla Models Excluding…

    6PCS Trunk Mats & Frunk Mat & Backrest Mats for New 2025 2026 Tesla Model Y Juniper…

    6PCS Trunk Mats & Frunk Mat & Backrest Mats for New 2025 2026 Tesla Model Y Juniper…

    Tesla gives update on Tesla Semi factory, says on track for volume production in 2026

    Tesla gears up to start selling Tesla Semi electric truck in Europe

    Center Console Organizer Tesla Cybertruck Center Console Cover Cup Holder, Console…

    Center Console Organizer Tesla Cybertruck Center Console Cover Cup Holder, Console…

    Bloomberg just released the most embarrassing report about Tesla, Waymo, and self-driving

    BYD overtakes Tesla as China’s EV giants dominate global sales

    BYD overtakes Tesla as China’s EV giants dominate global sales

  • UFO
    The Alien Experiment | He saw Aliens #vigyanrecharge

    The Alien Experiment | He saw Aliens #vigyanrecharge

    UFO Completes 5 Orbits Around the Moon?! | Ancient Aliens | #Shorts

    UFO Completes 5 Orbits Around the Moon?! | Ancient Aliens | #Shorts

    A Pleiadian Contactee Describes His Experience

    A Pleiadian Contactee Describes His Experience

    Aidatain Outer Space Spaceship Tapestry Interior International Space Station Wall Hanging, Art Large Tapestry Spacecraft Backdrop 80″X 60″ Flannel for Bedroom Home Decor TFNAT0123

    Aidatain Outer Space Spaceship Tapestry Interior International Space Station Wall Hanging, Art Large Tapestry Spacecraft Backdrop 80″X 60″ Flannel for Bedroom Home Decor TFNAT0123

    A protagonista feia de intergalatic    #geek #games #sony #playsation #fyp #intergalactic #jogos

    A protagonista feia de intergalatic #geek #games #sony #playsation #fyp #intergalactic #jogos

    Inspiration for Space Exploration | The West Wing

    Inspiration for Space Exploration | The West Wing

    Pop Culture Conspiracy Theories! Taylor Swift, SHEIN, and Deadpool & Wolverine!!

    Pop Culture Conspiracy Theories! Taylor Swift, SHEIN, and Deadpool & Wolverine!!

    What is Unidentified Flying Object?

    What is Unidentified Flying Object?

    The Visitor

    The Visitor

No Result
View All Result
Techcratic
No Result
View All Result
Home Hacker News

GitHub – deepseek-ai/DeepSeek-R1

Hacker News by Hacker News
January 20, 2025
in Hacker News
Reading Time: 10 mins read
126 4
A A
0

2025-01-20 07:37:00
github.com

DeepSeek-V3


Code License


Model License

Paper Link👁️

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1.
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.
However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. To address these issues and further enhance reasoning performance,
we introduce DeepSeek-R1, which incorporates cold-start data before RL.
DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.


Post-Training: Large-Scale Reinforcement Learning on the Base Model

  • We directly apply reinforcement learning (RL) to the base model without relying on supervised fine-tuning (SFT) as a preliminary step. This approach allows the model to explore chain-of-thought (CoT) for solving complex problems, resulting in the development of DeepSeek-R1-Zero. DeepSeek-R1-Zero demonstrates capabilities such as self-verification, reflection, and generating long CoTs, marking a significant milestone for the research community. Notably, it is the first open research to validate that reasoning capabilities of LLMs can be incentivized purely through RL, without the need for SFT. This breakthrough paves the way for future advancements in this area.

  • We introduce our pipeline to develop DeepSeek-R1. The pipeline incorporates two RL stages aimed at discovering improved reasoning patterns and aligning with human preferences, as well as two SFT stages that serve as the seed for the model’s reasoning and non-reasoning capabilities.
    We believe the pipeline will benefit the industry by creating better models.


Distillation: Smaller Models Can Be Powerful Too

  • We demonstrate that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance compared to the reasoning patterns discovered through RL on small models. The open source DeepSeek-R1, as well as its API, will benefit the research community to distill better smaller models in the future.
  • Using the reasoning data generated by DeepSeek-R1, we fine-tuned several dense models that are widely used in the research community. The evaluation results demonstrate that the distilled smaller dense models perform exceptionally well on benchmarks. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community.

DeepSeek-R1-Zero & DeepSeek-R1 are trained based on DeepSeek-V3-Base.
For more details regrading the model architecture, please refer to DeepSeek-V3 repository.

DeepSeek-R1-Distill Models

DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1.
We slightly change their configs and tokenizers. Please use our setting to run these models.

For all our models, the maximum generation length is set to 32,768 tokens. For benchmarks requiring sampling, we use a temperature of $0.6$, a top-p value of $0.95$, and generate 64 responses per query to estimate pass@1.

Category Benchmark (Metric) Claude-3.5-Sonnet-1022 GPT-4o 0513 DeepSeek V3 OpenAI o1-mini OpenAI o1-1217 DeepSeek R1
Architecture – – MoE – – MoE
# Activated Params – – 37B – – 37B
# Total Params – – 671B – – 671B
English MMLU (Pass@1) 88.3 87.2 88.5 85.2 91.8 90.8
MMLU-Redux (EM) 88.9 88.0 89.1 86.7 – 92.9
MMLU-Pro (EM) 78.0 72.6 75.9 80.3 – 84.0
DROP (3-shot F1) 88.3 83.7 91.6 83.9 90.2 92.2
IF-Eval (Prompt Strict) 86.5 84.3 86.1 84.8 – 83.3
GPQA-Diamond (Pass@1) 65.0 49.9 59.1 60.0 75.7 71.5
SimpleQA (Correct) 28.4 38.2 24.9 7.0 47.0 30.1
FRAMES (Acc.) 72.5 80.5 73.3 76.9 – 82.5
AlpacaEval2.0 (LC-winrate) 52.0 51.1 70.0 57.8 – 87.6
ArenaHard (GPT-4-1106) 85.2 80.4 85.5 92.0 – 92.3
Code LiveCodeBench (Pass@1-COT) 33.8 34.2 – 53.8 63.4 65.9
Codeforces (Percentile) 20.3 23.6 58.7 93.4 96.6 96.3
Codeforces (Rating) 717 759 1134 1820 2061 2029
SWE Verified (Resolved) 50.8 38.8 42.0 41.6 48.9 49.2
Aider-Polyglot (Acc.) 45.3 16.0 49.6 32.9 61.7 53.3
Math AIME 2024 (Pass@1) 16.0 9.3 39.2 63.6 79.2 79.8
MATH-500 (Pass@1) 78.3 74.6 90.2 90.0 96.4 97.3
CNMO 2024 (Pass@1) 13.1 10.8 43.2 67.6 – 78.8
Chinese CLUEWSC (EM) 85.4 87.9 90.9 89.9 – 92.8
C-Eval (EM) 76.7 76.0 86.5 68.9 – 91.8
C-SimpleQA (Correct) 55.4 58.7 68.0 40.3 – 63.7

Distilled Model Evaluation

Model AIME 2024 pass@1 AIME 2024 cons@64 MATH-500 pass@1 GPQA Diamond pass@1 LiveCodeBench pass@1 CodeForces rating
GPT-4o-0513 9.3 13.4 74.6 49.9 32.9 759
Claude-3.5-Sonnet-1022 16.0 26.7 78.3 65.0 38.9 717
o1-mini 63.6 80.0 90.0 60.0 53.8 1820
QwQ-32B-Preview 44.0 60.0 90.6 54.5 41.9 1316
DeepSeek-R1-Distill-Qwen-1.5B 28.9 52.7 83.9 33.8 16.9 954
DeepSeek-R1-Distill-Qwen-7B 55.5 83.3 92.8 49.1 37.6 1189
DeepSeek-R1-Distill-Qwen-14B 69.7 80.0 93.9 59.1 53.1 1481
DeepSeek-R1-Distill-Qwen-32B 72.6 83.3 94.3 62.1 57.2 1691
DeepSeek-R1-Distill-Llama-8B 50.4 80.0 89.1 49.0 39.6 1205
DeepSeek-R1-Distill-Llama-70B 70.0 86.7 94.5 65.2 57.5 1633

5. Chat Website & API Platform

You can chat with DeepSeek-R1 on DeepSeek’s official website: chat.deepseek.com, and switch on the button “DeepThink”

We also provide OpenAI-Compatible API at DeepSeek Platform: platform.deepseek.com

Please visit DeepSeek-V3 repo for more information about running DeepSeek-R1 locally.

DeepSeek-R1-Distill Models

DeepSeek-R1-Distill models can be utilized in the same manner as Qwen or Llama models.

For instance, you can easily start a service using vLLM:

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager

You can also easily start a service using SGLang

python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --trust-remote-code --tp 2

NOTE: We recommend setting an appropriate temperature (between 0.5 and 0.7) when running these models, otherwise you may encounter issues with endless repetition or incoherent output.

This code repository and the model weights are licensed under the MIT License.
DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:

  • DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from Qwen-2.5 series, which are originally licensed under Apache 2.0 License, and now finetuned with 800k samples curated with DeepSeek-R1.
  • DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base and is originally licensed under llama3.1 license.
  • DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct and is originally licensed under llama3.3 license.

If you have any questions, please raise an issue or contact us at service@deepseek.com.

Source Link


Keep your files stored safely and securely with the SanDisk 2TB Extreme Portable SSD. With over 69,505 ratings and an impressive 4.6 out of 5 stars, this product has been purchased over 8K+ times in the past month. At only $129.99, this Amazon’s Choice product is a must-have for secure file storage.

Help keep private content private with the included password protection featuring 256-bit AES hardware encryption. Order now for just $129.99 on Amazon!


Start your free Amazon Prime trial
today and unlock unlimited streaming and more!

Support Techcratic

If you find value in Techcratic’s insights and articles, consider supporting us with Bitcoin. Your support helps me, as a solo operator, continue delivering high-quality content while managing all the technical aspects, from server maintenance to blog writing, future updates, and improvements. Support Innovation! Thank you.

Bitcoin Address:

bc1qlszw7elx2qahjwvaryh0tkgg8y68enw30gpvge

Please verify this address before sending funds.

Bitcoin QR Code

Simply scan the QR code below to support Techcratic.

Bitcoin QR code for donations

Please read the Privacy and Security Disclaimer on how Techcratic handles your support.

Disclaimer: As an Amazon Associate, Techcratic may earn from qualifying purchases.

Tags: Hacker News
Share162ShareTweet101
Previous Post

Jaw-Dropping Alien Documentary I The UFO Conclusions I Absolute Mysteries

Next Post

Phoenix Suns vs. Cleveland Cavaliers 2025 livestream: Watch NBA online

Hacker News

Hacker News

Stay updated with Hacker News, where technology meets entrepreneurial spirit. Get the latest on tech trends, startup news, and discussions from the tech community. Read the latest updates here at Techcratic.

Related Posts

Time Series Forecasting with Graph Transformers
Hacker News

Time Series Forecasting with Graph Transformers

June 17, 2025
1.3k
ku9nov/faynoSync: Simple Auto Updater service written in Golang.
Hacker News

ku9nov/faynoSync: Simple Auto Updater service written in Golang.

June 17, 2025
1.3k
The Drawbridges Go Up | Drew Breunig
Hacker News

The Drawbridges Go Up | Drew Breunig

June 17, 2025
1.3k
OpenTelemetry for Go: measuring the overhead
Hacker News

OpenTelemetry for Go: measuring the overhead

June 16, 2025
1.3k
Getting free internet on a cruise, saving $170
Hacker News

Getting free internet on a cruise, saving $170

June 16, 2025
1.3k
ccbikai/ssh-ai-chat: Chat with AI over SSH.
Hacker News

ccbikai/ssh-ai-chat: Chat with AI over SSH.

June 16, 2025
1.3k
rorosen/zeekstd: Rust implementation of the Zstandard Seekable Format
Hacker News

rorosen/zeekstd: Rust implementation of the Zstandard Seekable Format

June 16, 2025
1.3k
Solving LinkedIn Queens with APL
Hacker News

Solving LinkedIn Queens with APL

June 16, 2025
1.3k
Load More
Next Post

Phoenix Suns vs. Cleveland Cavaliers 2025 livestream: Watch NBA online

Your Tech Resources

  • 30 Second Tech ™
  • AI
  • App Zone ™
  • Apple
  • Ars Technica
  • CNET
  • ComputerWorld
  • Crypto News
  • Cybersecurity
  • Endgadget
  • Forbes
  • Fossbytes
  • Gaming
  • GeekWire
  • Gizmodo
  • Google News
  • Hacker News
  • Harvard Tech
  • I Like Cats ™
  • I Like Dogs ™
  • LifeHacker
  • MacRumors
  • Macworld
  • Mashable
  • Microsoft
  • MIT Tech
  • PC World
  • Photofocus
  • Physics
  • Random Tech
  • Retro Rewind ™
  • Robot Report
  • SiliconANGLE
  • SlashGear
  • Smartphone
  • StackSocial
  • Tech Art
  • Tech Careers
  • Tech Deals
  • Techcratic ™
  • TechCrunch
  • Techdirt
  • TechRepublic
  • Techs Got To Eat ™
  • TechSpot
  • Tesla
  • The Verge
  • TNW
  • Trusted Reviews
  • UFO
  • VentureBeat
  • Visual Capitalist
  • Wired
  • ZDNet

Tech News

  • 30 Second Tech ™
  • AI
  • Apple Insider
  • Ars Technica
  • CNET
  • ComputerWorld
  • Crypto News
  • Cybersecurity
  • Endgadget
  • ExtremeTech
  • Fossbytes
  • Gaming
  • GeekWire
  • Gizmodo

Tech News

  • Harvard Tech
  • MacRumors
  • Macworld
  • Mashable
  • Microsoft
  • MIT Tech
  • Physics
  • PC World
  • Random Tech
  • Retro Rewind ™
  • SiliconANGLE
  • SlashGear
  • Smartphone
  • StackSocial
  • Tech Careers

Tech News​

  • Tech Art
  • TechCrunch
  • Techdirt
  • TechRepublic
  • Techs Got To Eat ™
  • TechSpot
  • Tesla
  • The Verge
  • TNW
  • Trusted Reviews
  • UFO
  • VentureBeat
  • Visual Capitalist
  • Wired
  • ZDNet

Site Links

  • About Techcratic
  • Affiliate Disclaimer
  • Affiliate Link Policy
  • Contact Techcratic
  • Dealors Discount Store
  • Privacy and Security Disclaimer
  • Privacy Policy
  • RSS Feed
  • Site Map
  • Support Techcratic
  • Techcratic
  • Tech Deals
  • TOS
  • 𝕏
Click For A Secret Deal

Techcratic – Your All In One Tech Hub © 2020 – 2025
All Rights Reserved
∞

No Result
View All Result
  • 30 Second Tech ™
  • AI
  • App Zone ™
  • Apple
  • Ars Technica
  • CNET
  • Crypto News
  • Cybersecurity
  • Endgadget
  • Gaming
  • I Like Cats ™
  • I Like Dogs ™
  • MacRumors
  • Macworld
  • Tech Deals
  • Techcratic ™
  • Techs Got To Eat ™
  • Tesla
  • UFO
  • Wired