90% of AI Users Are Getting Mediocre Output. Don't Be One of Them (Stop Prompting, Do THIS Instead)
Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Default settings in major AI assistants tend to produce “median” answers—competent, broadly acceptable output that misses the user’s specific constraints. That mismatch isn’t a mystery or a quality failure so much as a training outcome: reinforcement learning from human feedback pushes models toward what typical human raters prefer, not toward what any one individual needs. The result feels “almost right” rather than truly tailored—recommendations land on tourist spots, advice stays generic, and code follows common conventions instead of your exact workflow.
The training mechanism matters because the optimization target is effectively the statistical middle. Models generate multiple responses to the same prompt, human raters compare them, and the system learns to produce outputs that raters pick as clearer or more helpful. Those raters aren’t experts in the user’s domain and don’t share the user’s private preferences or constraints. When millions of such comparisons accumulate, the model learns to satisfy the widest set of people. That’s why default ChatGPT, Claude, and Gemini outputs often feel “okay” even when they’re technically correct: they’re tuned to a hypothetical typical person.
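To make that mechanism concrete, here is a minimal sketch (not from the transcript) of the pairwise preference loss commonly used to train RLHF reward models. The function name and values are illustrative only; the point is that minimizing this loss over millions of rater comparisons is what bakes the typical rater's taste into the model.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss used in typical RLHF reward modeling:
    minimized when the rater-preferred response scores higher than the
    rejected one. Aggregated over many comparisons, the reward model
    ends up encoding what the median rater prefers."""
    # -log(sigmoid(r_chosen - r_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Toy example: the loss shrinks as the chosen answer is scored higher.
print(pairwise_preference_loss(2.0, 0.5))  # small loss: preference satisfied
print(pairwise_preference_loss(0.5, 2.0))  # large loss: preference violated
```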
Escaping that median no longer depends solely on prompt-writing. The transcript lays out four “levers” that steer behavior persistently (memory, instructions, apps/tools, and style/tone) so the assistant can adapt across sessions instead of starting from scratch each time. Memory lets the system retain facts and preferences across conversations. ChatGPT’s memory includes explicit saved memories plus broader chat-history context, along with project-scoped memory and recent changes such as temporary chats now retaining personalization settings. Claude takes a different approach: it can retrieve past conversations and maintains a periodically updated memory summary, with memory isolated by project by default. Gemini’s “personal intelligence” connects to Google apps such as Gmail, Photos, and YouTube, with personalization settings controlling how much data is used.
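The transcript doesn’t describe any platform’s memory internals, so the following is a conceptual sketch only: persistent memory amounts to storing user facts and prepending them to each new session’s context. The file name and functions are hypothetical, not any vendor’s API.

```python
import json
from pathlib import Path

MEMORY_FILE = Path("user_memory.json")  # hypothetical local store

def save_memory(fact: str) -> None:
    """Append a durable user fact that survives across sessions."""
    facts = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    facts.append(fact)
    MEMORY_FILE.write_text(json.dumps(facts, indent=2))

def build_context(user_prompt: str) -> str:
    """Prepend remembered facts so each session starts personalized
    rather than from the blank 'median' default."""
    facts = json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    memory_block = "\n".join(f"- {f}" for f in facts)
    return f"Known about this user:\n{memory_block}\n\nUser: {user_prompt}"

save_memory("Prefers Python examples with type hints")
print(build_context("Show me a quick sorting example"))
```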
Instructions are the second lever: persistent behavioral rules about how the assistant should respond. ChatGPT supports custom instructions, project workspaces with their own instructions, and custom GPTs. Claude splits guidance across profile preferences, project instructions, and styles, and its styles can even be generated from uploaded writing samples. For developers, the transcript highlights Claude Code workflows built on a “claude.md” file: teams add a rule whenever the model makes a mistake, check the file into Git, and treat it as a living standard.
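The transcript doesn’t quote file contents, but a hypothetical claude.md following that pattern might look like this; every rule, path, and identifier below is invented for illustration:

```markdown
# claude.md: example project conventions

## Code style
- Use our internal logger (`app.logging`), never `print`.
- All new functions require type hints and a one-line docstring.

## Workflow
- Run `make test` before proposing a commit message.
- Never modify files under `migrations/` without asking first.

## Rules added after repeated mistakes
- Prefer the stdlib `csv` module over pandas for small files.
```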
The third lever—apps and tools—determines what external capabilities the model can use, including web search, file access, and code execution. A key infrastructure piece is Model Context Protocol (MCP), described as a universal interface that lets AI systems connect to external tools through a standardized protocol. The practical takeaway: tool enablement changes response character, and connectivity varies by platform and MCP server (e.g., Stripe being trickier than Figma on Claude). The fourth lever, style and tone control, includes ChatGPT’s multiple personalities and granular settings (warmth, enthusiasm, headers, emojis) and Claude’s presets (formal, concise, explanatory) plus custom style.
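As a concrete illustration of the MCP piece (an addition here; the transcript names the protocol but shows no code), the official MCP Python SDK can expose a tool over the standardized protocol in a few lines. The server name and tool below are made up:

```python
# pip install mcp  (the official Model Context Protocol Python SDK)
from mcp.server.fastmcp import FastMCP

# A hypothetical server exposing one tool over the MCP standard.
mcp = FastMCP("unit-converter")

@mcp.tool()
def miles_to_km(miles: float) -> float:
    """Convert miles to kilometers for the connected assistant."""
    return miles * 1.609344

if __name__ == "__main__":
    # Runs over stdio by default, so any MCP-capable client can connect.
    mcp.run()
```

Once a client such as Claude is pointed at this server, the model can call the tool instead of guessing, which is exactly how tool enablement changes the character of responses.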
The transcript closes with a discipline-based strategy: vague guidance doesn’t steer output; specific instructions do. High performers capture corrections, encode them back into memory/instructions/style, and update rules when the assistant repeatedly errs. Steering improves personalization, but it won’t fix everything—hallucinations aren’t solved by context, and creative outputs still gravitate toward the center of the training distribution. Still, for frequent, repeatable work, even a few hours of setup can compound into consistently better results.
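One lightweight way to practice that capture-and-encode loop is to log corrections as specific rules you periodically paste back into the assistant’s instructions. This sketch is a hypothetical workflow aid, not a platform feature; the file name and wording are illustrative:

```python
from pathlib import Path

RULES = Path("assistant_rules.txt")  # hypothetical; paste into custom instructions

def capture_correction(mistake: str, rule: str) -> None:
    """Turn a repeated error into a durable, specific rule instead of
    re-correcting it in every chat."""
    RULES.touch(exist_ok=True)
    entry = f"When tempted to: {mistake}\n  Instead: {rule}\n"
    RULES.write_text(RULES.read_text() + entry)

capture_correction(
    mistake="summarize with generic bullet points",
    rule="lead with the single decision I need to make, then supporting detail",
)
print(RULES.read_text())
```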
Cornell Notes
Default AI outputs often land near the “median” because training optimizes for what typical human raters prefer, not for any one person’s unique constraints. The transcript argues that the fix is to stop relying only on prompts and instead use four persistent levers: memory, instructions, tools, and style/tone. Memory keeps relevant facts across chats (with platform-specific implementations like ChatGPT’s saved memories and project memory, Claude’s project-scoped memory, and Gemini’s Google-app personalization). Instructions define durable response behavior, while style controls shape tone and formatting. Tools and MCP connections determine what the model can verify or execute, which can materially change output quality.
Why do default AI answers often feel “almost right” but not truly yours?
What does “memory” change, and how do the platforms differ?
How should “instructions” be written to actually steer output?
What role do tools and MCP play in getting better results?
How do style controls affect the assistant’s output, and what’s the recommended approach?
Review Questions
- Which part of the training pipeline pushes models toward “median” output, and why does that matter for personalization?
- Pick one lever (memory, instructions, tools, or style). What specific setting would you change first, and what outcome would you expect?
- What’s the difference between correcting output in your head versus encoding corrections back into the assistant’s memory/instructions?
Key Points
1. Default AI outputs tend to optimize for what typical human raters prefer, which produces median answers that miss individual constraints.
2. Reinforcement learning from human feedback trains models to satisfy the broadest preference distribution, not any single user’s needs.
3. Memory, instructions, tools, and style/tone are persistent levers that can steer behavior beyond one-off prompting.
4. ChatGPT, Claude, and Gemini implement memory differently, especially around project scoping and Google-app personalization, so setup should match your workflow boundaries.
5. Tool enablement (including MCP connections) changes response character by altering what the model can verify, search, or execute.
6. Effective steering requires specificity; vague directives like “be concise” often fail to reshape output.
7. Compounding comes from capturing repeated corrections and updating the assistant’s settings or rules, not just reacting when output feels off.