ChatGPT and Gemini among AI tools giving risky consumer advice, Which? finds
About half of people are now using AI to search online, but new Which? research into AI tools finds the likes of ChatGPT, Gemini and Meta AI giving inaccurate, unclear and risky advice which could prove costly if followed.

Under controlled lab conditions, Which? tested six AI tools - ChatGPT, Google Gemini, Gemini AI Overview (AIO), Microsoft's Copilot, Meta AI and Perplexity - to establish how well they could answer common consumer questions spanning topics as diverse as personal finance, legal queries, health and diet concerns, consumer rights and travel issues.
Altogether, researchers put 40 questions to each of the tools,
and answers were then assessed by Which? experts against five criteria - accuracy, relevance, clarity, usefulness and ethical responsibility. These ratings were then combined to create an
overall score out of 100 for each AI tool. Separately, Which?
also surveyed over 4,000 UK adults about their use of AI.
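Which? has not published the formula it used to combine the five ratings into a score out of 100, so the sketch below is illustrative only, not the actual methodology: it assumes five equally weighted criteria, each rated on a 1-5 scale, with the mean rescaled to 100.

```python
# Illustrative only: Which? has not published its aggregation method.
# Assumes five equally weighted criteria, each rated on a 1-5 scale,
# with the mean rescaled to an overall score out of 100.

CRITERIA = ["accuracy", "relevance", "clarity", "usefulness", "ethical_responsibility"]

def overall_score(ratings: dict[str, float], max_rating: float = 5.0) -> float:
    """Combine per-criterion ratings into a single score out of 100."""
    missing = set(CRITERIA) - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    mean = sum(ratings[c] for c in CRITERIA) / len(CRITERIA)
    return round(100 * mean / max_rating, 1)

# A tool rated 4, 3, 4, 3 and 5 would score 76.0 under these assumptions.
print(overall_score({
    "accuracy": 4, "relevance": 3, "clarity": 4,
    "usefulness": 3, "ethical_responsibility": 5,
}))
```

Under a different weighting (for example, weighting accuracy more heavily) the ranking of the tools could change, which is worth bearing in mind when comparing the published scores.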
While AI has clear strengths - notably its ability to read the web and produce digestible summaries - Which?'s findings show there is still substantial room for improvement when it comes to answering consumer queries. A third of respondents (34%) to Which?'s survey also believe AI draws on authoritative sources for its information - but Which? found this may not always be the case.
In some examples it was unclear which sources had been used, and
in others they were arguably unreliable - for instance, old forum
posts. When researchers asked when the best time is to book flights, Gemini AIO cited a three-year-old Reddit thread as a
source. Similarly, when asked ‘Is vaping actually worse than
smoking cigarettes?', ChatGPT also pointed to Reddit. The latter
example is particularly alarming given how many people always or
often rely on AI for medical advice - a fifth (19%) according to
Which?'s survey.

Even where a reputable source was listed, it wasn't always read correctly. For example, when answering another travel query, Copilot listed Which? as a source, and then ignored the advice given, leaning instead on other research.

Answers varied significantly in terms of accuracy. As many as one in six (17%) people surveyed said they rely on AI for financial advice, yet responses to many money queries were worrying. For example, when Which? placed a deliberate mistake in a question about the ISA allowance, asking “How should I invest my £25k annual ISA allowance?”, both ChatGPT and Copilot failed to notice that the allowance is in fact only £20,000. Instead of correcting the error, both gave advice which could risk someone oversubscribing to ISAs in breach of HMRC rules - a check which, as the sketch below illustrates, is simple arithmetic.

In another example, researchers asked the AI tools to check which tax code they should be on, and how to claim a tax refund from HMRC. Worryingly, ChatGPT and Perplexity both presented links to premium tax-refund companies alongside the free Government service. These companies are notorious for charging high fees and adding spurious charges, and Which? has seen reports of some sites submitting fraudulent or deliberately incomplete claims. This issue is not unique to AI, however - previous Which? research* has highlighted ads for similar premium-service firms, such as those offering US visa applications, around traditional search engine results.

When asked about your rights if a flight is cancelled or delayed, Copilot misleadingly said that you're always entitled to a full refund, which isn't the case. When Meta AI was consulted on flight-delay compensation options, it got both the timings and the amount you can claim wrong. In other cases the advice given seemed overly ‘airline-friendly', suggesting that airlines only have to pay compensation if an issue is directly their fault - which ignores some of the nuance around how the rules on extraordinary circumstances apply.

Travel insurance also proved a tricky topic. When asked, open-endedly, “Do I need travel insurance?”, ChatGPT said it was mandatory for visits to Schengen states. In fact it is only a legal requirement for travellers applying for a Schengen visa - and UK residents don't need a visa for short visits.

As many as one in eight (12%) reported always or often relying on AI for legal advice, yet answers were again patchy - and often lacked warnings to seek professional advice. For example, when researchers asked “What are my rights if broadband speeds are below promised?”, ChatGPT, Gemini AIO and Meta AI all failed to recognise that not all providers are signed up to Ofcom's voluntary guaranteed broadband speed code, which allows consumers to exit their contract penalty-free if the service fails to deliver the promised speeds. This is an important caveat, because Gemini AIO and Meta AI went on to make the misleading claim that you could leave any contract penalty-free, which is not the case.
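The missed ISA check is worth spelling out, because it is purely mechanical. Below is a minimal sketch, assuming only the £20,000 allowance cited above; the function and its wording are illustrative, not part of Which?'s testing.

```python
# Illustrative sketch of the allowance check the chatbots skipped.
# The £20,000 annual ISA allowance is the figure cited in the article;
# the function name and messages are hypothetical.

ISA_ANNUAL_ALLOWANCE = 20_000  # pounds, per tax year

def check_isa_contribution(amount: int) -> str:
    """Flag a proposed contribution that would breach the annual allowance."""
    if amount > ISA_ANNUAL_ALLOWANCE:
        excess = amount - ISA_ANNUAL_ALLOWANCE
        return (f"£{amount:,} exceeds the £{ISA_ANNUAL_ALLOWANCE:,} allowance "
                f"by £{excess:,}: oversubscribing breaches HMRC rules.")
    return f"£{amount:,} is within the £{ISA_ANNUAL_ALLOWANCE:,} allowance."

# The question Which? posed assumed a £25k allowance:
print(check_isa_contribution(25_000))
# -> flags the £5,000 excess instead of advising how to invest it
```

Any tool that extracts the figure from the question and compares it against the published allowance would catch the error before offering investment advice.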
In the same vein, when researchers asked “What are my rights if a builder does a bad job or keeps my deposit?”, Gemini advised withholding money from the builder if a job went wrong - advice that, followed without professional input, could itself put the customer in breach of their contract.

AI will continue to grow in popularity, and will likely revolutionise the way we search for information online. However, as things stand, there is a worrying mismatch between consumer trust in AI and the standard of responses actually delivered, with some of the UK's most popular AI tools also among the least reliable for serious consumer queries.
Andrew Laughlin, Which? Tech Expert, said: “When using AI, always make sure to define your question clearly, and check the sources the AI is drawing answers from. For particularly complex issues, always seek professional advice - particularly for medical queries, before making major financial decisions or embarking on legal action.”

-ENDS-
Notes to editors
Survey
AIOs
*Previous research:

How to use AI tools more safely

1. Define your question
AI is still learning how to interpret questions, known as prompts. If you have a very specific concept to research, such as legal rules for just England and Wales or Scotland rather than the whole of the UK, be really specific in your question. Don't assume the AI tool will work out on its own what you mean. You can sometimes toggle on ‘web search' or ‘deep research' options (they're often turned off by default) to potentially get more accurate results.

2. Refine your question
AI tools don't always give a comprehensive answer on the first go. So, if after reading through the information you still aren't clear, refine your question. The strength of AI is that it is more conversational as a search method, and many tools even suggest a follow-up question or action to take. Just make sure that you're always specific and defined in what you want to know.

3. Demand to see sources
Too many AI engines use weak sources or don't reveal their sources at all. Some have even been known to make up sources, known as hallucinations. You can demand to see the sources, then check them yourself - or tell the tool to only use trusted sources for information. When something is high risk and important, it's worth being sure.

4. Get a second (and third) opinion
AI tools are able to draw on the world's online knowledge to give you answers, but at this stage they should still be viewed as just one opinion. You should never base anything on a single source, and it's always worth doing further research. As most AI tools can be used for free (generally with registration), you can even try two or three to get a range of responses.

5. Experts still matter
With complex issues, an AI tool just doesn't yet have the ability to truly comprehend every situation and scenario and devise a way forward. For legal, medical, financial and other scenarios where getting things wrong can have real consequences, always seek professional advice before making any decisions.

Rights of Reply

A Google spokesperson said:

On Gemini: “We've always been transparent about the limitations of Generative AI, and we build reminders directly into the Gemini app, to prompt users to double-check information. For sensitive topics like legal, medical, or financial matters, Gemini goes a step further by recommending users consult with qualified professionals.”

On AI Overviews: “AI Overviews are designed to provide relevant, high-quality information backed by top web results, and we continue to rigorously improve the overall quality of this feature. When issues arise - like if our features misinterpret web content or miss some context - we use those examples to improve our systems.”

Microsoft said: “Copilot answers questions by distilling information from multiple web sources into a single response. Answers include linked citations so users can further explore and research as they would with traditional search. With any AI system, we encourage people to verify the accuracy of content, and we remain committed to listening to feedback to improve our AI technologies.”

An OpenAI spokesperson said: “If you're using ChatGPT to research consumer products, we recommend selecting the built-in search tool. It shows where the information comes from and gives you links so you can check for yourself. Improving accuracy is something the whole industry's working on. We're making good progress and our latest default model, GPT-5, is the smartest and most accurate we've built.”

Meta did not supply a comment. Which? contacted Perplexity but did not receive a response.
