AI mistakes you're probably making

Theo - t3.gg · 6 min read

Based on Theo - t3.gg's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Validate the problem and reproduce the failure before involving AI; AI is weakest when it’s asked to diagnose an issue without the exact error and traces.

Briefing

AI coding tools deliver disappointing results most often when developers treat them like a last-resort safety net and feed them the wrong kind of context. The core fix is practical: pick problems you can already root-cause, give the model the smallest useful slice of information, and make the environment and agent instructions reliable—so the model can “autocomplete” the right change instead of wandering through noise.

A major mistake is selecting the wrong problem. When a team already knows how to fix an issue, the right move is to fix it directly—then use AI as a comparison tool. The better workflow starts with validating the problem, reproducing it, and applying the obvious fix. If that fails, the next step should be a genuinely new solution path (different technology, different approach, or deeper debugging), not a sudden leap to AI on a problem that’s poorly understood. AI tends to shine when the task is bounded and the expected outcome is known, because then it can be evaluated against a known-good solution. A hydration-error example illustrates the trap: asking an AI to “find the problem” without first having the exact error and the right context leads to guesswork. Once the exact error and relevant traces are provided, the model can produce a correct diagnosis and diff.
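
The transcript doesn't include the offending code, but a minimal sketch of that kind of hydration mismatch (component names hypothetical, error message paraphrased) could look like this:

```tsx
// Hypothetical sketch of the hydration trap described above (not code from the video).
import { useEffect, useState } from "react";

// Broken: the server renders one timestamp and the client renders another, so React
// logs a mismatch like "Text content does not match server-rendered HTML."
export function LastUpdatedBroken() {
  return <p>Rendered at {new Date().toLocaleTimeString()}</p>;
}

// One common fix: compute client-only values after mount, so the first client
// render matches the server-rendered HTML exactly.
export function LastUpdatedFixed() {
  const [time, setTime] = useState<string | null>(null);
  useEffect(() => setTime(new Date().toLocaleTimeString()), []);
  return <p>Rendered at {time ?? "loading"}</p>;
}
```

With the exact mismatch message and the offending component in context, "find the problem" collapses into a bounded diff, which is precisely the scenario the transcript says models handle well.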

That leads to the second mistake: context rot from overstuffing. Models are next-token predictors; they don’t “read the whole codebase” the way humans do. Feeding an entire repository dump into chat (or trying to use massive context windows as a cure-all) often makes outputs worse because the signal gets buried. The transcript argues against “repo flattening” approaches that guarantee huge context and therefore tend to generate garbage. Even when models support large context windows, performance can degrade past tens of thousands of tokens due to distraction—analogized to a Jira ticket where the real issue is buried in irrelevant paragraphs.

Instead, the best context is targeted. Tools that provide search and retrieval (rather than full-code paste) perform better: agent instructions should point the model to where to look, and the model should pull only the relevant files. The speaker recommends small, continuously improved AGENTS.md / CLAUDE.md files: less like a giant manual, more like a pile of gotchas that heads off recurring failure modes. A concrete example: updating the instructions so the agent stops running pnpm dev commands, and adding a step to run pnpm generate to refresh Convex types after schema changes. This kind of "teach the agent your environment" work is framed as quick but high leverage.
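
The actual file isn't shown in the video, so the wording below is illustrative, but a gotchas file built from those two examples might read:

```markdown
<!-- Illustrative sketch of an AGENTS.md / CLAUDE.md gotchas file; the video's
     actual file contents aren't shown, so the wording here is hypothetical. -->
## Environment gotchas
- Never run `pnpm dev`; a dev server is already running. Read its output instead.
- After changing the Convex schema, run `pnpm generate` to refresh the generated
  types before type-checking or editing code that imports them.
```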

Finally, the transcript warns against configuration bloat—“MCP hell.” Loading dozens of MCP servers or copying long, template-heavy skills often increases context noise and breaks workflows. The suggested approach is minimal configuration, clear prompts, and plan-first iteration. When output is wrong, the fix should be to revert and adjust the plan or the agent instructions—not to keep appending more instructions to a polluted history. If the environment itself is broken (monorepo type-checking, ESLint/tsconfig paths, root-level commands), the agent will repeatedly chase “ghosts” and waste cycles.
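
The video doesn't show any configuration, but as one hedged sketch of the monorepo type-checking fix: TypeScript project references can make root-level checks work, so the agent's type-check run reports real errors instead of unresolved-path ghosts (package paths below are illustrative):

```jsonc
// Hypothetical root tsconfig.json for a pnpm monorepo (package paths illustrative).
// Each referenced package must set "composite": true in its own tsconfig; then
// `tsc --build` from the repo root checks every package in dependency order.
{
  "files": [],
  "references": [
    { "path": "./packages/web" },
    { "path": "./packages/convex" }
  ]
}
```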

The overall message is not anti-AI; it’s anti-misuse. AI gets dramatically better when developers treat it like a junior engineer that needs the right task, the right breadcrumbs, and a clean working setup—then they build durable memory by documenting recurring mistakes so the next run starts smarter.

Cornell Notes

AI coding tools work best when developers stop treating them as a last-resort guesser and instead give them the right problem, the right context, and a working environment. The transcript's biggest claim is that "more context" often means "more noise," because models autocomplete from what's in context and can degrade when distracted by huge dumps. The recommended workflow is to validate and reproduce issues, fix what's obvious, then use AI on problems you already understand so you can compare outcomes. For reliable results, keep agent instructions (AGENTS.md / CLAUDE.md) small and gotcha-focused, avoid MCP/config bloat, and fix broken dev/type-check setups so the agent isn't chasing pre-existing errors. When outputs go wrong, revert and adjust the plan or instructions rather than appending more to a bad history.

Why does “using AI only after everything else fails” often produce weak results?

Because the model needs enough task definition to autocomplete a correct change. If the underlying issue is poorly understood, AI is asked to do root-cause work without the exact error, traces, or reproduction steps—so it guesses. The transcript contrasts a better approach: validate the problem, reproduce it, apply the obvious fix, and only then try new solution paths. AI is most effective when the developer can already root-cause or at least knows what “correct” looks like, so the model can be evaluated and guided rather than used as a blind fallback.
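
The transcript frames this as workflow rather than code, but one concrete way to "reproduce it" before involving AI is to pin the failure in a small test; everything below (the function, the expected value, the runner) is illustrative:

```ts
// Hypothetical reproduction test (vitest). formatPrice and the expected output are
// illustrative, not from the video; the point is that a failing test hands the
// agent the exact error plus a pass/fail oracle instead of "it's broken somewhere".
import { describe, expect, it } from "vitest";
import { formatPrice } from "./formatPrice";

describe("formatPrice", () => {
  it("keeps two decimal places for whole-dollar amounts", () => {
    expect(formatPrice(10)).toBe("$10.00"); // currently returns "$10": the bug
  });
});
```

Once the obvious fix lands, the same test doubles as the known-good comparison point the transcript recommends.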

What is “context rot,” and why is pasting an entire codebase into chat usually harmful?

Context rot is performance degradation when too much irrelevant or noisy information crowds out the signal the model needs. Since models are next-token predictors, they increase the likelihood of the next tokens based on what’s present; if the majority of the context is irrelevant repository text, the model generates more irrelevant output. The transcript argues against flattening repositories into a single AI-friendly file and against relying on very large context windows as a universal fix—performance can drop sharply once context gets too large.

What counts as “good context” for coding agents?

Good context is targeted and minimal: describe the failing behavior and the exact error, then let the agent use search/retrieval tools to pull only the relevant files. The transcript recommends AGENTS.md / CLAUDE.md files that briefly map where things live and flag the gotchas, rather than dumping the whole repo. It also notes a behavioral difference: some models (e.g., Codex in the transcript) search more before editing, while others (e.g., Opus) may edit sooner, so the instructions should tell the model which areas to leave alone and where the fix belongs.

How should developers use agent instructions (AGENTS.md / CLAUDE.md) to improve reliability?

Treat these files as a continuously updated gotchas pile, not a full manual. Keep them small, markdown-heavy, and focused on recurring failure modes. The transcript gives examples: preventing the agent from running pnpm dev when a dev server is already running, and adding pnpm generate to refresh Convex types after schema changes. Because the agent's "memory" resets each run, documenting environment-specific rules is what keeps the same errors from recurring.

What is “MCP hell,” and why does configuration bloat hurt?

MCP hell refers to loading many MCP servers and skills that add context noise and complexity without improving outcomes. The transcript claims that more features and more orchestration rarely convert a struggling setup into a good one; they often make the experience worse. The recommended stance is minimal configuration: avoid dozens of MCPs, avoid long template skills unless they solve a specific problem, and make subtle adjustments to CLAUDE.md / AGENTS.md and prompts instead.

When an agent produces bad output, what should be changed first?

Revert and fix the plan or the instructions, rather than appending more text to a polluted history. Because the model's behavior is driven by autocomplete over prior messages, bad instructions early on can outweigh good ones added later. Plan mode is recommended: the agent asks clarifying questions and proposes a plan before executing, so developers can correct course before the agent makes widespread changes. If the plan is wrong, update the plan; if the understanding of the codebase is wrong, update AGENTS.md / CLAUDE.md.

Review Questions

  1. What are the signs that you’re feeding an AI agent too much context, and how would you reduce it without losing the failing details?
  2. How would you decide whether to fix the environment (tsconfig/ESLint/monorepo root commands) versus updating agent instructions?
  3. Describe a workflow for using AI to solve a bug you already understand, and explain how you’d validate the AI’s output against a known-good fix.

Key Points

  1. Validate the problem and reproduce the failure before involving AI; AI is weakest when it’s asked to diagnose an issue without the exact error and traces.
  2. Use AI on problems you already know how to solve to compare outputs and build intuition about what the model does correctly.
  3. Avoid dumping entire repositories into chat; models can degrade when context becomes mostly noise (“context rot”).
  4. Prefer targeted context plus search/retrieval tools; give the smallest useful description and let the agent pull only relevant files.
  5. Keep AGENTS.md / CLAUDE.md small and gotcha-focused; document recurring environment-specific mistakes so they don’t repeat each run.
  6. Fix broken dev/type-check environments (monorepo root configs, ESLint/tsconfig paths) so the agent isn’t chasing ghost errors.
  7. Don’t chase bad outputs by appending more instructions; revert, adjust the plan or agent instructions, and rerun.

Highlights

AI coding failures often come from using the model as a blind fallback when the exact error and root context are missing—once the right trace is provided, the model can produce a correct diff.
More context isn’t better: pasting whole codebases or relying on huge context windows can worsen results due to context rot and next-token autocomplete behavior.
Agent reliability improves when environment quirks are encoded in small AGENTS.md / CLAUDE.md gotchas files, e.g., stopping pnpm dev and running pnpm generate after Convex schema changes.
Configuration bloat (“MCP hell”)—dozens of MCP servers and long skills—adds noise and complexity without turning a weak setup into a strong one.
When outputs are wrong, revert and adjust the plan or instructions; appending more to a bad history can lock in bad autocomplete signals.

Topics

  • Problem Selection
  • Context Management
  • Agent Instructions
  • MCP Configuration
  • Environment Reliability

Mentioned

  • Theo
  • Ben Davis
  • Adam
  • MCP
  • LM
  • LSP
  • PR
  • Jira
  • TSX
  • VS Code