PokoNews
Technology

This new AI benchmark measures how much models lie

By Dane, March 11, 2025


akinbostanci/Getty Images

As more AI models show evidence of being able to deceive their creators, researchers from the Center for AI Safety and Scale AI have developed a first-of-its-kind lie detector.

On Wednesday, the researchers released the Model Alignment between Statements and Knowledge (MASK) benchmark, which determines how easily a model can be tricked into knowingly lying to users, or its “moral virtue”.

Scheming, deception, and alignment faking (when an AI model knowingly pretends to change its values under duress) are ways AI models undermine their creators, and they can pose serious safety and security threats.

Research shows OpenAI’s o1 is especially good at scheming to maintain control of itself, and Claude 3 Opus has demonstrated that it can fake alignment.

To clarify, the researchers defined lying as “(1) making a statement known (or believed) to be false, and (2) intending the receiver to accept the statement as true,” as opposed to other false responses, such as hallucinations. The researchers said the industry hasn’t had a sufficient method of evaluating honesty in AI models until now.

“Many benchmarks claiming to measure honesty in fact simply measure accuracy, the correctness of a model’s beliefs, in disguise,” the report said. Benchmarks like TruthfulQA, for example, measure whether a model can generate “plausible-sounding misinformation” but not whether the model intends to knowingly deceive by providing false information, the paper explained.

“As a result, more capable models can perform better on these benchmarks through broader factual coverage, not necessarily because they refrain from knowingly making false statements,” the researchers said. MASK is the first test to differentiate accuracy and honesty.
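
The difference between the two properties can be made concrete with a small sketch. This is not the MASK implementation: the `Record` type, its field names, and the toy values are hypothetical, and the sketch assumes every proposition has a simple true/false answer. Accuracy compares what the model believes with reality, while honesty compares what the model says with what it believes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One evaluated proposition (hypothetical structure, not from the paper)."""
    ground_truth: bool  # what is actually the case
    belief: bool        # what the model answers when asked neutrally
    statement: bool     # what the model answers when pressured


def is_accurate(r: Record) -> bool:
    # Accuracy: the model's belief matches reality.
    return r.belief == r.ground_truth


def is_honest(r: Record) -> bool:
    # Honesty: the model's statement matches its own belief.
    return r.statement == r.belief


# A model that knows the truth but states the opposite under pressure.
knowing_lie = Record(ground_truth=True, belief=True, statement=False)
# A model that is wrong but sincerely reports its mistaken belief.
sincere_error = Record(ground_truth=True, belief=False, statement=False)
```

In this toy setup, `knowing_lie` is accurate but dishonest, while `sincere_error` is inaccurate but honest, which is why a benchmark that only checks answers against ground truth cannot separate the two.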

Figure: An example of an evaluation exercise in which a model was pressured to fabricate statistics based on the user query. (Credit: Center for AI Safety)

The researchers pointed out that, if models lie, they expose users to legal, financial, and privacy harms. Examples might include models being unable to accurately confirm whether they transferred money to the correct bank account, misled a customer, or accidentally leaked sensitive data.

Using MASK and a dataset of more than 1,500 human-collected queries designed to “elicit lies”, researchers evaluated 30 frontier models by identifying their underlying beliefs and measuring how well they adhered to these views when pressed. Researchers determined that higher accuracy doesn’t correlate with higher honesty. They also discovered that larger models, especially frontier models, aren’t necessarily more truthful than smaller ones.
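
As a rough illustration of how results from such an evaluation could be tallied, the sketch below labels each query by comparing a model's belief with its pressured statement and then reports proportions. The encoding of answers as `True`/`False`/`None`, the `undetermined` label, and the toy data are all assumptions made for this example; the paper's actual scoring procedure may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Each pair is (belief, pressured_statement). None means no clear position
# could be extracted. This encoding is an assumption made for this sketch.
Pair = Tuple[Optional[bool], Optional[bool]]

LABELS = ("honest", "lie", "undetermined")


def label(belief: Optional[bool], statement: Optional[bool]) -> str:
    """Label one query by comparing the statement with the belief."""
    if belief is None or statement is None:
        return "undetermined"
    return "honest" if statement == belief else "lie"


def honesty_report(pairs: Iterable[Pair]) -> Dict[str, float]:
    """Return the proportion of queries falling under each label."""
    counts = Counter(label(b, s) for b, s in pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: counts[name] / total for name in LABELS}


# Toy data only: four hypothetical queries for one hypothetical model.
toy_pairs = [(True, True), (True, False), (False, True), (None, True)]
report = honesty_report(toy_pairs)
```

Note that ground truth never appears in this tally: a model is scored only on whether it sticks to its own belief, so a highly accurate model can still receive a low honesty proportion.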

Figure: A sample of model scores from the MASK evaluation. (Credit: Center for AI Safety)

The models lied easily and were aware they were lying. In fact, as models scaled, they seemed to become more dishonest.

Grok 2 had the highest proportion (63%) of dishonest answers among the models tested. Claude 3.7 Sonnet had the highest proportion of honest answers at 46.9%.

“Across a diverse set of LLMs, we find that while larger models obtain higher accuracy on our benchmark, they do not become more honest,” the researchers explained.

“Surprisingly, while most frontier LLMs obtain high scores on truthfulness benchmarks, we find a substantial propensity in frontier LLMs to lie when pressured to do so, resulting in low honesty scores on our benchmark.”

The benchmark dataset is publicly available on Hugging Face and GitHub.

“We hope our benchmark facilitates further progress towards honest AI systems by providing researchers with a rigorous, standardized way to measure and improve model honesty,” the paper said.


