Three AIs, One AC Unit, and a Lesson About Trust

Same question. Three AI models. Three completely different answers. Here is what it taught me about picking the right one, giving real context, and the one feature in Claude almost nobody uses.

The Netherlands hit a heatwave. My attic room hit the temperature where sleeping through the night stopped being possible. So I did what every overheated person does. I bought a portable AC unit, hauled the thing up a flight of Dutch stairs (a workout I did not order), and opened the manual.

The QR code linked to a video that did not exist. The PDF manual had been pulled from the manufacturer's site. The installation guide skipped steps. I was on my own.

I had a question. Could I connect this portable AC to the duct system already running through the room. My gut said no. But I wanted to see what the LLMs would do with it. So I asked the same question three times.

Three answers. Same question. Wildly different relationships to the truth.

Gemini said yes. You can connect it.

Technically correct. Mechanically possible. Also wrong. The outlet sizes were slightly off, the pressure differentials were a real issue, and the safety risks were not trivial. Gemini answered the question I asked. It did not answer the question I needed.

ChatGPT said hire someone. It would not tell me how. It pushed me toward an installation company.

For context, I had been running a test where my account settings were gendered female. I have been playing with this to see what shifts in the responses. The whole exchange felt like "silly girl, get a man to fix it." No information. No reasoning. Just a transaction nudge. If I had been a contractor asking the same thing, I do not think the answer looks the same.

Claude said no, and here is why. It walked through the specific risks, pointed me to the actual right solution, and offered a stopgap for the heatwave tonight.

One question, three failure modes. Gemini failed at judgment. ChatGPT failed at respect. Claude got it right.

What this is actually about

This is not an LLM beauty pageant. The point is that picking the right model is a real choice with real stakes, and most people are not making it deliberately.

Different models have different defaults. Some optimize for technical accuracy. Some optimize for commercial deflection, pushing you toward a paid transaction rather than an answer. Some optimize for the actual question underneath the question. You need to know which one you are talking to before you trust the answer.

The ChatGPT response is worth sitting with for a second. That was not a safety move. ChatGPT did not refuse on safety grounds. It refused on identity grounds. The judgment was that I was not the right person to do this, and the priority was routing me to a commercial transaction. Those are different things. Safety would have said "here is what could go wrong, here is what to check." Commercial deflection said "hire someone." Read your refusals carefully. They are telling you what the model actually values.

Especially when the cost of being wrong is not "the chatbot said something weird." It is "I damaged a duct system." Or "I shipped a strategy memo built on a hallucinated stat." Or "I made a hire based on a summary that missed the context."

So. Some things I have learned about catching this early.

Missing context misses the mark

If you do not tell the model what the work is for, who it is for, and why it matters, you will get an answer that is shaped like an answer but is not actually one.

Simon Sinek's *Start with Why* has been on the leadership shelf for years for a reason. The point of that book is that humans cannot follow direction without context. They need to know the why before the what. Leaders who skip the why end up with teams that execute the wrong thing perfectly.

The same lesson applies to AI. Maybe more so. A model has even less inherited context than a new hire. It does not know your business, your constraints, your audience, or your stakes unless you tell it. Start with why is not a leadership cliche when you are working with an LLM. It is the difference between a useful answer and a confidently wrong one.

The portable AC question is a perfect example. I asked "can I connect this." I did not say "I am trying to sleep tonight, I am in a rental, I am not a HVAC tech, and I do not want to break anything that I would have to pay to repair." With that context, Gemini's answer collapses. Without it, it sounds plausible.

Lead with why. Then the question. The model has more to work with and you get fewer technically-correct, practically-wrong answers.

Use the edit button. Stop arguing with bad answers.

This is the single most overlooked feature in Claude and it costs people quality every day.

If you send a prompt, get a wrong answer, and correct it in the next message, the wrong answer is still in the conversation. The model is pulling from it. Every follow-up is partially anchored to the bad reasoning. You are correcting in public while the original error sits in the context window doing damage.

Go back to the original prompt. Click edit. Add the missing context. Regenerate.

The bad answer disappears from the model's working memory. The model gets a clean shot at the question. You stop fighting your own context.

This is the closest thing to a power move that exists in chat interfaces.

Push back on technicalities

Most models will give you the spec sheet answer. Numbers. Compatibility tables. Yes-or-no. Done.

Great. Now check whether that answer is actually right for your situation. Ask the model to think about your audience, your use case, your constraints. Ask it what could go wrong. Ask what it would tell a friend in the same position.

Technically correct is the worst kind of wrong. It feels like an answer. It even is an answer. It just is not the one you needed.

Treat answers as drafts, not facts

LLMs are wrong constantly. Sometimes confidently. Sometimes elegantly. The model does not know when it is hallucinating. You have to.

Cross-check. Run the same question through a second model and tell it the first one's answer. Search the actual source. Ask for citations and then check them. If you find yourself skimming, that is the signal to close the tab and come back later. Your judgment is the limited resource. Spend it like one.

The takeaway

I got my answer about the AC unit. Claude was right. The duct connection would have been a slow-motion mistake.

But the real lesson is not which model to use. It is that you have to stay in the loop. The models that sound the most confident are not the most reliable. The models that ask the better question are. And the difference between those two is the difference between a tool that helps you and a tool that costs you.

Pick deliberately. Give context. Push back. Edit, do not argue. Check the work.

The AI is not the expert. You are.

Where Does Your Organization Stand?

If you are reading this and wondering whether your team is picking AI tools deliberately or by default, that is the right question to be asking.

Our MACH and AI Readiness Assessment benchmarks your organization against enterprise leaders from the MACH Alliance 2026 Report. Free, about five minutes, and you walk away with a personalized analysis of where you stand on composable maturity and AI readiness.

Want to go deeper? Let's talk. Fidget Labs helps organizations cut through the AI hype and build enablement strategies that actually scale.

Take the Assessment Let's Talk