90% of AI Users Are Getting Mediocre Output. Don't Be One of Them (Stop Prompting, Do THIS Instead)
Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Default settings in major AI assistants tend to produce “median” answers—competent, broadly acceptable output that misses the user’s specific constraints. That mismatch isn’t a mystery or a quality failure so much as a training outcome: reinforcement learning from human feedback pushes models toward what typical human raters prefer, not toward what any one individual needs. The result feels “almost right” rather than truly tailored—recommendations land on tourist spots, advice stays generic, and code follows common conventions instead of your exact workflow.
The training mechanism matters because the optimization target is effectively the statistical middle. Models generate multiple responses to the same prompt, human raters compare them, and the system learns to produce outputs that raters pick as clearer or more helpful. Those raters aren’t experts in the user’s domain and don’t share the user’s private preferences or constraints. When millions of such comparisons accumulate, the model learns to satisfy the widest set of people. That’s why default ChatGPT, Claude, and Gemini outputs often feel “okay” even when they’re technically correct: they’re tuned to a hypothetical typical person.
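To make that mechanism concrete, here is a minimal sketch (not from the transcript) of the pairwise preference loss commonly used to train RLHF reward models. The function name and values are illustrative only; the point is that minimizing this loss over millions of rater comparisons is what bakes the typical rater's taste into the model.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss used in typical RLHF reward modeling:
    minimized when the rater-preferred response scores higher than the
    rejected one. Aggregated over many comparisons, the reward model
    ends up encoding what the median rater prefers."""
    # -log(sigmoid(r_chosen - r_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Toy example: the loss shrinks as the chosen answer is scored higher.
print(pairwise_preference_loss(2.0, 0.5))  # small loss: preference satisfied
print(pairwise_preference_loss(0.5, 2.0))  # large loss: preference violated
```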
Escaping that median no longer depends solely on prompt-writing. The transcript lays out four “levers” that steer behavior persistently (memory, instructions, apps/tools, and style/tone) so the assistant can adapt across sessions instead of starting from scratch each time. Memory lets the system retain facts and preferences across conversations. ChatGPT’s memory includes explicit saved memories plus broader chat-history context, along with project-scoped memory and recent changes such as temporary chats now retaining personalization settings. Claude takes a different approach: it can retrieve past conversations and maintains a periodically updated memory summary, with memory isolated by project by default. Gemini’s “personal intelligence” connects to Google apps such as Gmail, Photos, and YouTube, with personalization settings controlling how much data is used.
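The transcript doesn’t describe any platform’s memory internals, so the following is a conceptual sketch only: persistent memory amounts to storing user facts and prepending them to each new session’s context. The file name and functions are hypothetical, not any vendor’s API.

```python
import json
from pathlib import Path

MEMORY_FILE = Path("user_memory.json")  # hypothetical local store

def save_memory(fact: str) -> None:
    """Append a durable user fact that survives across sessions."""
    facts = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    facts.append(fact)
    MEMORY_FILE.write_text(json.dumps(facts, indent=2))

def build_context(user_prompt: str) -> str:
    """Prepend remembered facts so each session starts personalized
    rather than from the blank 'median' default."""
    facts = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    memory_block = "\n".join(f"- {f}" for f in facts)
    return f"Known about this user:\n{memory_block}\n\nUser: {user_prompt}"

save_memory("Prefers Python examples with type hints")
print(build_context("Show me a quick sorting example"))
```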
Instructions are the second lever: persistent behavioral rules about how the assistant should respond. ChatGPT supports custom instructions, project workspaces with their own instructions, and custom GPTs. Claude splits guidance across profile preferences, project instructions, and styles, and its styles can even be generated from uploaded writing samples. For developers, the transcript highlights Claude Code workflows built on a “claude.md” file: teams add a rule whenever the model makes a mistake, check the file into Git, and treat it as a living standard.
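The transcript doesn’t quote file contents, but a hypothetical claude.md following that pattern might look like this; every rule, path, and identifier below is invented for illustration:

```markdown
# claude.md: example project conventions

## Code style
- Use our internal logger (`app.logging`), never `print`.
- All new functions require type hints and a one-line docstring.

## Workflow
- Run `make test` before proposing a commit message.
- Never modify files under `migrations/` without asking first.

## Rules added after repeated mistakes
- Prefer the stdlib `csv` module over pandas for small files.
```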
The third lever—apps and tools—determines what external capabilities the model can use, including web search, file access, and code execution. A key infrastructure piece is Model Context Protocol (MCP), described as a universal interface that lets AI systems connect to external tools through a standardized protocol. The practical takeaway: tool enablement changes response character, and connectivity varies by platform and MCP server (e.g., Stripe being trickier than Figma on Claude). The fourth lever, style and tone control, includes ChatGPT’s multiple personalities and granular settings (warmth, enthusiasm, headers, emojis) and Claude’s presets (formal, concise, explanatory) plus custom style.
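As a concrete illustration of the MCP piece (an addition here; the transcript names the protocol but shows no code), the official MCP Python SDK can expose a tool over the standardized protocol in a few lines. The server name and tool below are made up:

```python
# pip install mcp  (the official Model Context Protocol Python SDK)
from mcp.server.fastmcp import FastMCP

# A hypothetical server exposing one tool over the MCP standard.
mcp = FastMCP("unit-converter")

@mcp.tool()
def miles_to_km(miles: float) -> float:
    """Convert miles to kilometers for the connected assistant."""
    return miles * 1.609344

if __name__ == "__main__":
    # Runs over stdio by default, so any MCP-capable client can connect.
    mcp.run()
```

Once a client such as Claude is pointed at this server, the model can call the tool instead of guessing, which is exactly how tool enablement changes the character of responses.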
The transcript closes with a discipline-based strategy: vague guidance doesn’t steer output; specific instructions do. High performers capture corrections, encode them back into memory/instructions/style, and update rules when the assistant repeatedly errs. Steering improves personalization, but it won’t fix everything—hallucinations aren’t solved by context, and creative outputs still gravitate toward the center of the training distribution. Still, for frequent, repeatable work, even a few hours of setup can compound into consistently better results.
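One lightweight way to practice that capture-and-encode loop is to log corrections as specific rules you periodically paste back into the assistant’s instructions. This sketch is a hypothetical workflow aid, not a platform feature; the file name and wording are illustrative:

```python
from pathlib import Path

RULES = Path("assistant_rules.txt")  # hypothetical; paste into custom instructions

def capture_correction(mistake: str, rule: str) -> None:
    """Turn a repeated error into a durable, specific rule instead of
    re-correcting it in every chat."""
    RULES.touch(exist_ok=True)
    entry = f"When tempted to: {mistake}\n  Instead: {rule}\n"
    RULES.write_text(RULES.read_text() + entry)

capture_correction(
    mistake="summarize with generic bullet points",
    rule="lead with the single decision I need to make, then supporting detail",
)
print(RULES.read_text())
```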
Cornell Notes
Default AI outputs often land near the “median” because training optimizes for what typical human raters prefer, not for any one person’s unique constraints. The transcript argues that the fix is to stop relying only on prompts and instead use four persistent levers: memory, instructions, tools, and style/tone. Memory keeps relevant facts across chats (with platform-specific implementations like ChatGPT’s saved memories and project memory, Claude’s project-scoped memory, and Gemini’s Google-app personalization). Instructions define durable response behavior, while style controls shape tone and formatting. Tools and MCP connections determine what the model can verify or execute, which can materially change output quality.
Why do default AI answers often feel “almost right” but not truly yours?
What does “memory” change, and how do the platforms differ?
How should “instructions” be written to actually steer output?
What role do tools and MCP play in getting better results?
How do style controls affect the assistant’s output, and what’s the recommended approach?
Review Questions
- Which part of the training pipeline pushes models toward “median” output, and why does that matter for personalization?
- Pick one lever (memory, instructions, tools, or style). What specific setting would you change first, and what outcome would you expect?
- What’s the difference between correcting output in your head versus encoding corrections back into the assistant’s memory/instructions?
Key Points
1. Default AI outputs tend to optimize for what typical human raters prefer, which produces median answers that miss individual constraints.
2. Reinforcement learning from human feedback trains models to satisfy the broadest preference distribution, not any single user’s needs.
3. Memory, instructions, tools, and style/tone are persistent levers that can steer behavior beyond one-off prompting.
4. ChatGPT, Claude, and Gemini implement memory differently, especially around project scoping and Google-app personalization, so setup should match your workflow boundaries.
5. Tool enablement (including MCP connections) changes response character by altering what the model can verify, search, or execute.
6. Effective steering requires specificity; vague directives like “be concise” often fail to reshape output.
7. Compounding comes from capturing repeated corrections and updating the assistant’s settings or rules, not just reacting when output feels off.