
How **WE** Use AI In Software Development

The PrimeTime · 6 min read

Based on The PrimeTime's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their channel.

TL;DR

AI coding works best when humans keep control of architecture and review AI output, especially in early codebase stages where patterns aren’t established.

Briefing

AI-assisted coding is most useful when it’s treated like a limited collaborator—good for accelerating well-bounded tasks and prototypes, but risky as a default driver for long-term, pattern-heavy codebases. Across the discussion, the biggest practical theme wasn’t whether LLMs can generate code; it was how teams decide when to delegate, how to manage quality, and what kinds of work actually benefit.

Dax describes using AI agents as a way to push productivity while still keeping humans in control. His workflow often splits a coding task into a “dumber” portion and a more complex portion: the AI handles the routine work, then he checks results before continuing. He avoids AI in early codebase stages, arguing that LLMs struggle when there’s not yet consistency in abstractions and patterns. For Open Code—his model-independent command-line agent—he says the project was largely rewritten manually to establish those patterns first. Even when AI-generated code gets discarded, he values the learning: it can surface ideas and reveal pitfalls faster than slow, manual discovery.

Casey’s stance is far more skeptical, rooted less in model capability and more in what programming is supposed to be. He argues that AI-generated web code often becomes “stacking another piece of failure” on top of already-fragile systems—especially when the underlying web platform and abstractions are poorly designed. His critique extends to maintainability, security, and performance, warning that large-scale projects can accumulate messy, hard-to-audit code. He also points to the ephemerality of web development: APIs change, deployment environments shift, and knowledge can become obsolete quickly—making AI-generated scaffolding feel like a short-lived convenience rather than durable engineering.

TJ lands in the middle, treating AI as a tool for specific productivity wins rather than a replacement for engineering judgment. He cites practical areas where LLM help is immediately valuable: clearing repetitive backlog items, adding tests, improving CI-driven workflows, and reducing manual chores like logging boilerplate or error-handling scaffolding. He also supports “vibe coding” for low-stakes, fast-moving goals, like quick prototypes or short-lived promotional sites, where throwing code away is acceptable. His estimate of how much of the coding AI should do ranges widely, from roughly 25% for work tied to long-term maintenance up to 100% for exploratory, throwaway work.
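To make that concrete, here is a hypothetical TypeScript sketch of the kind of logging and error-handling scaffolding TJ means. None of these names come from the discussion; it only illustrates the category of routine, well-bounded work being delegated.

```typescript
// Hypothetical example of routine logging/error-handling scaffolding:
// the repetitive, easy-to-review work TJ describes handing to an LLM.
function withLogging<Args extends unknown[], Result>(
  name: string,
  fn: (...args: Args) => Promise<Result>,
): (...args: Args) => Promise<Result> {
  return async (...args: Args) => {
    console.log(`[${name}] starting`);
    try {
      const result = await fn(...args);
      console.log(`[${name}] succeeded`);
      return result;
    } catch (err) {
      // Uniform error logging across many call sites is exactly the
      // sort of mechanical change that is cheap to generate and verify.
      console.error(`[${name}] failed:`, err);
      throw err;
    }
  };
}

// Usage: wrap an existing async function without changing its signature.
const fetchUser = withLogging("fetchUser", async (id: string) => {
  const res = await fetch(`https://example.com/api/users/${id}`);
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
});
```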

PrimeTime’s perspective emphasizes a similar split: lean into AI when the goal is to get something working quickly, but recognize an inflection point where deeper complexity demands human understanding. He describes improving his vibe-coding process by forcing the AI to propose a plan first, then iterating through corrections before code generation. He also argues that AI is best at the “weather in 10 minutes” stage—useful for near-term outcomes—while longer-horizon predictions become unreliable.

The group ultimately converges on a shared reality: marketing often overfocuses on “zero to one” demos, but most engineering effort is iterative work on existing systems. AI can still change the day-to-day by making prototypes cheaper and by helping teams burn down repetitive tasks—yet it doesn’t eliminate the need for strong fundamentals, careful review, and thoughtful architecture. The conversation ends with a promise to dig into what it actually takes to build a real agent, beyond simple API calls that route questions to a model.

Cornell Notes

The discussion treats AI coding as a delegation problem, not a magic replacement for engineering. Dax uses AI for bounded “dumber” tasks while keeping humans responsible for architecture and early codebase pattern-setting; he avoids AI-driven foundations and values learning even when generated code is thrown away. Casey rejects AI coding for many web contexts, arguing it often produces large, messy outputs on top of already-bad abstractions, raising concerns about maintainability, security, and performance. TJ and PrimeTime describe a practical middle ground: AI helps most with repetitive work, tests, and logging, and vibe coding can be effective for low-stakes prototypes, but long-term projects still require deep human understanding. The key takeaway is that AI’s best ROI comes from iterative, well-scoped use with review—not blind full automation.

When does AI coding become genuinely useful rather than just “code generation”?

The panel repeatedly ties usefulness to scope and review. Dax delegates only the routine portion of a task, then checks the output before proceeding, and he avoids AI in early codebase stages where abstractions and patterns aren’t established. TJ points to concrete wins: adding tests, clearing repetitive backlog items, and reducing boilerplate like logging and error-handling. PrimeTime adds that AI is most reliable for near-term outcomes—“weather in 10 minutes”—and less dependable as complexity grows. In contrast, Casey argues AI is often counterproductive in web stacks where the underlying abstractions are already fragile, so the generated code becomes hard to maintain.

Why do some participants avoid AI in the early stages of a project?

Dax says early codebases lack consistency in abstractions and patterns, which makes LLM output less aligned with the eventual structure. His approach is to rewrite or build foundations manually first, then use AI once the codebase has stable conventions. PrimeTime echoes this by describing an inflection point: AI can accelerate early momentum, but deeper complexity requires human understanding of edge conditions and the existing design. The underlying logic is that AI can’t reliably invent the “house style” that later maintenance will depend on.

What’s the strongest critique of AI coding in web development, and what does it imply?

Casey’s critique is that AI often produces huge amounts of code to accomplish something simple, because it works on top of a stack that’s already poorly designed. He argues this “stacks another piece of failure” onto existing problems, and he worries about security, performance, and maintainability. He also highlights the web’s ephemerality: APIs change, deployments break, and knowledge can become obsolete quickly. The implication is that AI-generated scaffolding may feel like short-term convenience unless the underlying platform and abstractions improve.

How do the panelists justify “vibe coding” or throwing code away?

TJ and Dax treat vibe coding as a low-cost way to explore ideas and validate direction. TJ says it’s especially suitable for short-lived or low-stakes work—goofy websites, one-day promotional efforts—where correctness and long-term maintainability matter less. Dax adds that prototypes used to be too expensive to discard, but AI reduces that cost, enabling faster experimentation and quicker decisions about whether an idea is good or bad. PrimeTime also uses AI to get something up quickly (e.g., game features) to test whether the concept is fun before investing in deeper engineering.

What practical workflow improvements help make AI output more reliable?

PrimeTime describes a workflow in Cursor where he asks the AI to produce a plan first—what files to change and why—then he corrects it multiple times before requesting completion. Dax similarly uses a human-in-the-loop model: he lets AI handle a portion, then checks results. TJ also emphasizes that AI is most valuable when integrated with existing engineering practices like CI and test suites, where automated checks can validate the generated changes. Together, these approaches reduce the risk of blindly accepting incorrect or inconsistent code.
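As a rough illustration of that plan-first pattern (a sketch, not PrimeTime’s actual Cursor setup), the loop below separates a plan phase from a code phase. It assumes an OpenAI-compatible chat-completions endpoint; the `planThenCode` helper, model name, and prompts are all invented for illustration.

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Minimal chat helper. Assumes an OpenAI-compatible chat-completions
// endpoint and an API key in the environment; substitute whatever
// client your editor or agent actually uses.
async function chat(messages: Message[]): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({ model: "gpt-4o-mini", messages }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}

// Phase 1: ask for a plan (which files change and why) and iterate on it.
// Phase 2: only after the plan is accepted, ask for the implementation.
async function planThenCode(task: string, corrections: string[]): Promise<string> {
  const history: Message[] = [
    {
      role: "system",
      content: "You are a coding assistant. Propose a plan first; do not write code until told.",
    },
    { role: "user", content: `Task: ${task}\nList the files you would change and why.` },
  ];

  let plan = await chat(history);
  for (const correction of corrections) {
    history.push({ role: "assistant", content: plan });
    history.push({ role: "user", content: `Revise the plan: ${correction}` });
    plan = await chat(history);
  }

  history.push({ role: "assistant", content: plan });
  history.push({ role: "user", content: "The plan is approved. Now generate the code." });
  return chat(history);
}
```

The human checkpoint lives between the two phases: the `corrections` loop stands in for a developer rejecting or amending the plan before any code exists.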

Why does the conversation keep returning to “iteration on existing codebases” instead of only “zero to one”?

PrimeTime and TJ argue that most real engineering work is incremental maintenance and backlog burn-down, not brand-new product creation. PrimeTime notes that marketing and funding often highlight zero-to-one demos because the pitch is bigger—“everyone can build software”—but day-to-day value is more likely to come from making iteration cheaper and faster. TJ adds that AI can help with repetitive tasks and small engineering chores that accumulate over time. Dax also frames Open Code as aimed at established companies and iterative needs, not just early-stage invention.

Review Questions

  1. Which tasks did Dax say he would delegate to AI, and what did he avoid delegating—especially early in a codebase?
  2. How do Casey’s concerns about web abstractions and API churn shape his view of AI-generated code?
  3. What workflow does PrimeTime use to improve AI reliability (planning vs direct generation), and how does that relate to the “inflection point” idea?

Key Points

  1. AI coding works best when humans keep control of architecture and review AI output, especially in early codebase stages where patterns aren’t established.
  2. A common workflow is splitting tasks into routine and complex parts: AI handles the routine work, then developers verify and integrate the results.
  3. AI can reduce the cost of prototyping, making it easier to explore ideas quickly and discard prototypes when they fail validation.
  4. Skepticism in web development centers on maintainability and the “stacking failure” problem: large AI outputs built on top of fragile abstractions.
  5. AI’s strongest near-term value is in scoped, repetitive engineering tasks such as tests, logging boilerplate, and backlog cleanup with CI support.
  6. Marketing tends to overemphasize zero-to-one demos; practical impact is more likely in iterative maintenance and incremental improvements to existing systems.

Highlights

Dax’s rule of thumb: delegate only the “dumber” parts, avoid AI-driven foundations until the codebase has consistent abstractions, and treat generated code as learning even when it’s discarded.

Casey’s core objection isn’t that AI can’t write code—it’s that AI often generates massive, messy outputs on top of already-bad web stacks, worsening security, performance, and maintainability risks.

PrimeTime’s reliability hack: ask for a plan first (which files change and why), iterate on that plan, then generate code—turning AI from a guesser into a guided collaborator.