    Populist Bulletin
    Business 7 Mins Read

    AI’s most important benchmark in 2026? Trust


    In 2026 (and beyond) the best benchmark for large language models won’t be MMLU or AgentBench or GAIA. It will be trust—something AI will have to rebuild before it can be broadly useful and valuable to both consumers and businesses.

    Researchers identify several different kinds of AI trust. In people who use chatbots as companions or confidants, they measure a feeling that the AI is benevolent or has integrity. In people who use AI for productivity or business, they measure something called “competence trust,” or the belief that the AI is accurate and doesn’t hallucinate facts. I’ll focus on that second kind.

    Competence trust can grow or shrink. An AI tool user, quite rationally, begins by giving the AI simple tasks—perhaps looking up facts or summarizing long documents. If the AI does a good job of these things, the user naturally thinks “what else can I do with this?” They may give the AI a slightly harder task. If the AI continues to get things right, trust grows. If the AI fails or provides a low-quality answer, the user will think twice about trying to automate the task next time.

    Steps forward, steps back

    Today’s AI chatbots, which are powered by large generative AI models, are far better than the ones we had in 2023 and 2024. But AI tools are only beginning to build trust with most users, and with the C-suite executives who hope the tools will streamline business functions. My own trust in chatbots grew in 2025, but it has also taken some hits.

    Example: I entered a long conversation with one of the popular chatbots about the contents of a long document. The AI made some interesting observations about the work, and suggested some sensible ways of filling in gaps. Then it made an observation that seemed to contradict something I knew was in the document. 

    When I pointed out the data it had missed, it immediately admitted its mistake. Yet when I asked it (again) whether it had digested the full document, it again insisted it had.

    Another AI chatbot returned a research report that it said was based on 20 sources. But there were no citations in the text connecting specific statements to specific sources. After it added the in-text citations, I noticed that in two places the AI had relied on a single, not-very-trustworthy source for a key fact.

    I learned that AI models still struggle with long chats involving large amounts of information, and that they’re not good at telling the user when they’re in over their heads. The experience adjusted my trust in the tools.

    Grappling with ambiguity

    As we enter 2026, generative AI’s story is still in its early chapters. The story started with AI labs developing models that could converse, write, and summarize. Now the big AI labs seem confident that AI agents can autonomously work through complex tasks, calling on tools and checking their work against expert data. They seem convinced that the agents will soon manage ambiguity with humanlike judgment. 

    If large companies begin to trust that these agents can reliably do such jobs, it would mean enormous revenues for the AI companies that developed them. Judging by their current investments of hundreds of billions of dollars in AI infrastructure, the AI companies and their backers seem to believe this outcome is close at hand.

    Even if AI could bring human-level intellect to business scenarios tomorrow, it may still take time to build trust among decision-makers and workers. Today, trust in AI isn’t high. The consulting firm KPMG surveyed 48,000 people in 47 countries (two-thirds of whom use AI regularly) and found that while 83% believe AI will be beneficial, only 46% actually trust the output of AI tools. Some may have false trust in the technology: two-thirds of the respondents say they sometimes rely on AI output without evaluating its accuracy.

    But I doubt that AI agents are ready to complete complex tasks and manage ambiguity the way human experts might. As AI is used by more people and businesses, the agents will encounter a universe of unique problems, in contexts they’ve never seen before. I doubt that current AI agents understand the ways of humans and the world well enough to improvise their way through such situations. Not yet, anyway.

    The limitations of the models

    The fact is that AI companies are using the same kind of (transformer-based) AI models to underpin reasoning agents that they used for early chatbots that were essentially word generators. The core function of such models, and the objective of all their training, is predicting the next word (or pixel or audio bit) in a sequence, Microsoft AI CEO (and Google DeepMind cofounder) Mustafa Suleyman explained in a recent podcast. “It is using that very simple likelihood-of-word prediction function to simulate what it’s like to have a great conversation or to answer complex questions,” he said. 

    Suleyman and others doubt that this simple function is enough to produce humanlike judgment. Suleyman believes that current models don’t account for some of the key drivers of the things humans say and do. “Naturally, we would expect that something that has the hallmarks of intelligence also has the underlying synthetic physiology that we do, but it doesn’t,” Suleyman said. “There is no pain network. There is no emotional system. There is no inner will or drive or desire.”

    AI pioneer (and Turing Award winner) Yann LeCun says the LLMs of today are useful enough to be applied in some valuable ways, but thinks they’ll never achieve the general or human-level intelligence needed to do the really high-value work the AI companies hope they will. To learn to intuit paths through real-world complexity, AI would need a much higher-bandwidth training regimen than just words, images, and computer code, LeCun says. Models may need to learn the world via something more like the multisensory experience babies have, and possess the uncanny ability to process and store all that information quickly, as babies can, he says.

    Suleyman and LeCun may be wrong. Companies like OpenAI and Anthropic may achieve human-level intelligence using models whose origin is in language. 

    AI governance matters

    Meanwhile, competence is just one factor in AI trust among business users. Enterprises use governance platforms to monitor whether and how AI systems might be creating regulatory compliance issues or exposing the company to risk of cyberattack, for example. “When it comes to AI, large enterprise companies . . . want to be trusted by customers, investors, and regulators,” says Navrina Singh, founder and CEO of the governance platform Credo AI. “AI governance isn’t slowing us down, it’s the only thing that allows measurable trust and lets intelligence scale without breaking the world.”

    For now, the pace at which humans delegate tasks to AI will be moderated by trust. AI tools should be used for tasks they’re good at, so that confidence in the results grows. That’ll take time, and it’s a moving target because the AI is continually improving. Discovering new tasks for AI, delegating them, monitoring the results, and adjusting expectations will very likely become a routine part of work in the 21st century.

    No, AI won’t reinvent business all at once next year. 2026 won’t be the “year of the agent.” It’ll take a decade for AI tools to prove out and become battle-hardened. Trust is the hardening agent.


