Fact File: X’s AI chatbot claims Carney isn’t PM, Russian video set in Toronto
The artificial intelligence chatbot Grok made several errors over the past few weeks when asked to verify information on the X platform. In one instance, it miscaptioned a video of a 2021 altercation involving hospital workers in Russia as showing a Toronto incident from a year earlier. In the other, it claimed multiple times that Mark Carney “has never been Prime Minister.”
A computer scientist says AI chatbots “hallucinate” false information because they are built not to verify facts but to predict the next word in a sentence, and warns that users who rely on chatbots to verify information are overconfident in an unreliable tool.
THE CLAIMS
“Mark Carney has never been Prime Minister,” the artificial intelligence chatbot Grok wrote in several responses to users on the X platform, formerly Twitter, last week.
The chatbot built by Elon Musk’s company xAI, which owns X, doubled down when users pushed back against the false claim.
“My previous response is accurate,” Grok wrote.
Days earlier, Grok responded to a user’s inquiry about a video that appears to show hospital workers restraining and hitting a patient in an elevator.
When someone asked Grok to verify where the video took place, it claimed the video showed an incident at Toronto General Hospital from May 2020 that resulted in the death of 43-year-old Danielle Stephanie Warriner.
“If it’s in Canada why do the uniforms have Russian writing?” asked one user.
Grok claimed the uniforms were “standard green attire for Toronto General Hospital security” and said the video depicted a “fully Canadian event.”
THE FACTS
A reverse image search of a still from the video brings up multiple news articles published by Russian media in August 2021.
Translated into English, the reports show the video first spread on the Telegram channel Mash and that the incident took place in the Russian city of Yaroslavl.
Yaroslavl Regional Psychiatric Hospital said it fired two employees who were caught on the leaked CCTV video hitting the woman after leading her into an elevator at a residential building, according to the reports.
The 2020 incident at Toronto General Hospital that Grok referred to was partly captured on video, which shows part of an interaction between Warriner and security staff. The staff faced manslaughter and criminal negligence charges after Warriner died following the interaction; the charges were later dropped.
Carney is indeed the Prime Minister and has been since he won the Liberal leadership election in March, followed by the Liberal party’s general election win on April 28.
In both cases, Grok eventually corrected its mistakes after several prompts from users. But why did Grok repeat falsehoods, and why did it double down when corrected?
‘THEY DON’T HAVE ANY NOTION OF THE TRUTH’
Grok and other chatbots like ChatGPT and Google’s Gemini are large language models, or LLMs. They learn to recognize and generate text by training on vast amounts of text from the internet.
Large language models are “primarily just trained to predict the next word in a sentence, very much like auto-complete in our phone,” said Vered Shwartz, assistant professor of computer science at the University of British Columbia and CIFAR AI chair at the Vector Institute.
“Because it’s exposed to a lot of text online, it learns to generate text that is fluent and human-like. It’s also learning a lot of world knowledge, anything that people discuss online … it can usually give you factually correct answers,” she said.
When they provide factually incorrect information, it’s known as a “hallucination,” an aspect of language models that researchers say is inevitable because of how they are trained.
“They don’t have any notion of the truth … it just generates the statistically most likely next word,” Shwartz said.
“The result is that you get this really fluent text that looks human-like and often written in a very authoritative manner. But they don’t necessarily always reflect the information that they learned from the web. Sometimes they inappropriately kind of generalize or mix and match facts that are not true,” she said.
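To make the next-word idea concrete, here is a minimal, invented sketch: a toy bigram model in Python that is nothing like Grok’s actual architecture or training data. It simply picks whichever word most often followed the previous one in a made-up three-sentence “training text,” so it can produce a fluent claim that happens to be false.

```python
from collections import Counter, defaultdict

# Toy bigram "language model" -- an invented illustration, not Grok or any
# real LLM. It only learns which word most often follows another, so it
# generates fluent text with no notion of whether the claim is true.
training_text = (
    "the video was filmed in yaroslavl . "
    "the incident took place in toronto . "
    "the security staff worked in toronto ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(training_text, training_text[1:]):
    follows[prev][nxt] += 1  # count continuations seen in the training text

def most_likely_next(word):
    """Return the statistically most frequent continuation of `word`."""
    return follows[word].most_common(1)[0][0]

# Greedy generation starting from "video": each step takes the single most
# frequent next word. Because "toronto" follows "in" twice and "yaroslavl"
# only once, frequency -- not truth -- chooses the ending.
word, generated = "video", ["video"]
for _ in range(5):
    word = most_likely_next(word)
    generated.append(word)
print(" ".join(generated))
```

Running the sketch prints “video was filmed in toronto .” even though the only sentence about filming mentions Yaroslavl: word frequency, not truth, chose the ending, which is the same failure mode Shwartz describes at a vastly larger scale.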
A large language model’s quality partly depends on the quality of the data it is trained on. While most models are proprietary, it’s generally understood that they train on large portions of the web.
But while the models might vary slightly, hallucinations are inherent to all of them, not just Grok, Shwartz said.
Grok has multimodal capabilities, meaning it can respond to text inquiries and analyze video. It can associate what it sees in a video with textual description but is “by no means trained to do any kind of fact-checking … it’s just trying to understand what happens in a video and answer questions based on that,” Shwartz said.
She added that the models might double down on incorrect answers because they’re trained on the argumentative style that is common online. Some companies might customize their chatbots to sound more authoritative, or to be more deferential to users.
While it’s become increasingly common for people to lean on these chatbots to verify the information they see online, Shwartz said that’s “concerning.”
Internet users have a tendency to anthropomorphize chatbots, which are designed to mimic human language, and Shwartz said that causes overconfidence in the ability of large language models to verify information.
“They’re so used to humanizing (chatbots) and so they say, ‘Oh, it doubled down so it must be confident,'” she said.
“The premise of people using (large language models) to do fact-checking is flawed … it has no capability of doing that.”
This report by The Canadian Press was first published Nov. 25, 2025.