Why AI Tries to Put Glue on Pizza

How instant AI answers trip over simple math and contradict themselves, and what that teaches us about AI’s limits.

In partnership with North Light AI

Yes…No…Maybe So?

I’m sure you’ve seen this before: You type a simple question into Google and it gives you that little instant AI answer box with a brief and confident-sounding “answer.”

Perhaps the answer says “No,” and then, two sentences later, changes to “Yes.”

In one famously bizarre case, Google’s AI Overview recommended adding non-toxic glue to pizza sauce to keep the cheese from sliding off. It was, of course, rightly mocked, and it remains a troubling example of an AI hallucination.

Google responded by curbing the feature’s prevalence and tightening its safeguards.

Let’s take a look at a more recent example of this reasoning breakdown in the following viral screenshot (from X/Twitter):

Here, the user asked a simple question: Was 1995 30 years ago?

The AI answer begins: No, 1995 was not 30 years ago.

But then, a few words later, the very same answer continues: …If today is July 25, 2025, then 30 years ago would be 1995. So, yes, 1995 was 30 years ago.

In a single paragraph, the AI confidently denies and then affirms the same fact.

This actually serves as a pretty interesting example of how various AI models work (or don’t work):

It's Not Google's Best AI Doing This

These instant responses are not powered by Google’s biggest, most capable models. Those are expensive and slow.

Instead, the little AI boxes that pop up with a one-shot answer are optimized for latency and price, and are often powered by smaller models or constrained setups. Why? Because large, deliberative models are too slow and too expensive to run at internet scale. Optimizing for fast, cheap output usually means sacrificing multi-step reasoning…especially for math and logic.

Smaller models, on average, make more basic mistakes.

How This Mess Actually Happens

AI doesn't think the way we do. When you answer a question, you think it through first, then speak. AI generates text one word at a time while it's figuring things out.

Here's what went wrong with the 1995 question:

The first “No”
The model begins writing by following a common text pattern: answers to this kind of question often start with “No, actually it was X years ago.” That’s the statistical groove it fell into.

The correction mid-answer
As it keeps generating text one word at a time, the math of “2025 – 30 = 1995” kicks in, and the model produces the right conclusion.

The contradiction
Because the system doesn’t pause, plan, or go back to fix its initial mistake, the wrong statement and the right one end up side by side. A human would usually catch the slip and revise. The model just plows ahead.
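To make this concrete, here’s a minimal sketch of one-token-at-a-time generation, using the Hugging Face transformers library and the small, open GPT-2 model purely as a stand-in (it is not the model behind Google’s answer boxes). The thing to notice: the loop only ever appends the next most likely token, and nothing ever goes back to edit what has already been written.

```python
# Minimal greedy, append-only text generation (a sketch, not Google's system).
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "Q: Was 1995 30 years ago?\nA:"
ids = tokenizer(prompt, return_tensors="pt").input_ids

for _ in range(30):                      # generate 30 tokens, one at a time
    with torch.no_grad():
        logits = model(ids).logits       # scores for every possible next token
    next_id = logits[0, -1].argmax()     # pick the single most likely one
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # append it; never revise

print(tokenizer.decode(ids[0]))
```

Production systems layer sampling, safety filters, and sometimes extra reasoning passes on top, but this append-only loop is the core of how the text gets produced.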

Why This Matters

For a casual user, this looks absurd: how can the same answer say “no” and “yes”? But it illustrates two key truths about instant AI systems:

  • They don’t reason globally; they just predict the next likely word.

  • Small, fast models are especially prone to these slips because they don’t have the horsepower—or the design scaffolding—to check their work.

It’s not that the AI “doesn’t know” what 2025 minus 30 equals. It’s that the way it writes is fundamentally different from the way humans reason and then speak.

AI generates text one token (the smallest unit of text a language model reads or generates) at a time, always predicting the most statistically likely next word given what’s already been produced. It doesn’t have a “complete reasoning plan” before it speaks.
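For the curious, here’s what those tokens actually look like, using GPT-2’s tokenizer as a convenient open-source stand-in (Google’s models use their own tokenizers, but the idea is the same):

```python
# Split the question into the subword tokens a model actually "sees".
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(tokenizer.tokenize("Was 1995 30 years ago?"))
# Prints a short list of subword pieces; a leading "Ġ" marks a space before a piece.
```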

That’s why the response can begin with a confident “No…” but shift mid-sentence once the arithmetic finally “clicks,” resulting in a sudden “yes.”

It’s a design tradeoff: speed versus reasoning.

The Ghost in the Machine (Learning)

Google has publicly acknowledged issues with AI Overviews and described mitigation steps after a wave of high-profile errors (e.g., “put glue on pizza”). The company attributes many failures to “data voids,” poor source quality, and odd queries, while also asserting that aggregate satisfaction remains high. Whatever the root causes, these episodes illustrate the fragility of fast, summary-style answers.

In other words, when you ask something like, “Was 1995 30 years ago?”, the small models can stumble on what seems obvious to a human.

How to Use AI Without Getting Burned

I understand the mockery, but I also know these tools are genuinely useful when you know how to use them properly. To start, relying on the instant AI Overview box for anything is probably not the way to go…instead, use Gemini or another full-sized large language model directly.

Here's how to get better answers without falling into the contradiction trap:

Make it show its work. Don't just ask "What's the answer?" Say "Show me the steps" or "Explain your reasoning first, then give me the answer." Forcing AI to work through a problem step by step makes it far less likely to contradict itself.

Force a second opinion. Ask the AI to solve the same problem a different way and check if the answers match. It's like asking someone to double-check their math. You'd be surprised how often this catches mistakes (there's a short scripted sketch of this tip and the previous one after these tips).

Switch to slow mode when it matters. Many AI systems have a "think harder" option. If you're working on something important, use it. The quick answers are fine for casual stuff, but not when accuracy counts.

Always check the sources. This is huge. Don't trust AI summaries of articles or data. Click through to the original sources. AI loves to confidently summarize things that don't exist or misrepresent what actually happened.
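If you like to tinker, here’s a rough sketch of what the first two tips look like when scripted against a model API. It uses the OpenAI Python SDK and a placeholder model name purely as an example; the same prompts work in any chat window.

```python
# A sketch of "show your work" and "second opinion" prompting.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in your environment;
# the model name is an arbitrary example, not a recommendation.
from openai import OpenAI

client = OpenAI()
question = "Was 1995 30 years ago? Today is July 25, 2025."

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# Tip 1: force step-by-step reasoning before the final answer.
first = ask(question + "\nExplain your reasoning step by step, then answer yes or no.")

# Tip 2: ask for an independent second solution and compare.
second = ask(question + "\nSolve it a different way (count forward from 1995), then answer yes or no.")

print(first)
print(second)  # if the two answers disagree, trust neither until you've checked
```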

Because the moment you stop checking its work is the moment you end up putting glue on your pizza.

ABOUT CORE CONCEPTS’ SPONSOR

North Light AI is an applied-AI studio that builds real-world tools and training programs for mission-driven teams—from nonprofits and educators to SMBs and government agencies. Their offerings include smart grant-writing assistance (GrantSpace), supplier qualification platforms (Prime Ready), AI-generated risk assessments (Baseline AI), as well as custom consulting and workforce training. Their focus? Deliver AI solutions that actually make work easier and outcomes more meaningful. northlightai.com

Is Your Amazon Strategy Actually Working? Here’s What Top Brands Do Differently.

At Cartograph, we’ve worked with some of the most innovative brands in CPG—OLIPOP, Starface, and Rao’s—and understand the nuances of selling consumables on Amazon.

Are you a fast-growing brand in Food & Beverage, Supplements, Beauty & Personal Care, Household, Pet, or Baby?
Growing 50%+ YoY?
Do you know your Amazon profitability (and are you happy with it)?

We’ve spent the past 7 years helping CPG brands scale profitably on Amazon. What makes Cartograph different:
• Deep CPG focus
• No more than 4 brands per team
• Monthly P&L forecasts within 5% accuracy
• Daily reporting via Slack

Click below to get a custom, human review of your Amazon account—not just another automated report.