AMA: Scaling AI Applications into the Enterprise
Based on OpenAI's video on YouTube. If you find this useful, support the original creators by watching, liking, and subscribing.
Briefing
Enterprise AI adoption hinges on two things: proving measurable ROI fast enough to win internal buy-in, and building systems with guardrails that can evolve as models change. Decagon and Clay describe approaches that treat AI deployment less like a one-time “pilot” and more like an ongoing product cycle—complete with evaluations, controlled rollouts, and governance that non-technical teams can manage.
Clay’s co-founder, Varun, frames the company’s origin around scaling human capability: giving teams an order-of-magnitude boost in what they can do with data and automation. Clay started with data enrichment for cold-email marketing agencies, then expanded into a go-to-market platform built on intent signals and automated actions. AI changed the economics and workflow of go-to-market: usage-based pricing reduced the need to charge by seat or rep, and AI made sales less “one-to-one” and more “one-to-many,” enabling systems that scale growth efforts. In enterprise deployments, Clay emphasizes that pilots must translate into quantifiable outcomes (ticket resolution, cost reduction, revenue impact, customer satisfaction); otherwise the business case stalls among competing priorities.
Decagon’s co-founder, Jesse, ties the company’s founding to customer discovery rather than a preconceived product. Customer support kept surfacing as a problem where GenAI agents can deliver both value and measurable impact. Decagon’s agent design aims to match what human support agents do while adding proactive capabilities. When new models arrive, Decagon relies on evaluation suites for each use-case “surface area,” including checks for off-rails behavior. Crucially, customers can define their own evals and test sets—effectively unit tests—so performance can be validated against each enterprise’s real workflows. If results look good, rollouts happen via A/B testing (e.g., starting with a small audience share and expanding).
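The panel describes this workflow only at a high level. A minimal Python sketch of the pattern, per-surface eval suites gating a small, sticky A/B slice, is below; the `EvalCase` shape, the 95% pass bar, and the 5% starting share are illustrative assumptions, not Decagon’s actual system.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    """One customer-defined test: an input plus a pass/fail check."""
    prompt: str
    check: Callable[[str], bool]

def run_eval_suite(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Score a candidate model against one use-case 'surface area'."""
    return sum(c.check(agent(c.prompt)) for c in cases) / len(cases)

def route(user_id: str, new_agent, old_agent, share: float):
    """Sticky A/B split: the same user always lands in the same bucket."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return new_agent if bucket < share * 100 else old_agent

# Stand-in agents; a real deployment would call a model API here.
current_agent = lambda q: "Our refund policy allows returns within 30 days."
candidate_agent = lambda q: "Refunds are available within 30 days of purchase."

# Customer-defined eval cases act like unit tests for the workflow.
cases = [EvalCase("How do refunds work?", lambda r: "30 days" in r)]

# Gate on the eval suite, then expose only a small traffic slice.
if run_eval_suite(candidate_agent, cases) >= 0.95:
    agent = route("user-123", candidate_agent, current_agent, share=0.05)
```

Keying the bucket to a hash of the user ID keeps the split sticky, so a given customer sees consistent behavior while the rollout share expands.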
Both companies stress guardrails as a prerequisite for enterprise trust, especially in customer-facing contexts where incorrect outputs can carry reputational and contractual risk. Decagon highlights “agent operating procedures” (AOPs)—a structured way for non-technical stakeholders to set and customize constraints—shifting responsibility from a central AI team to the business users who understand policy and risk. The risk conversation is also evolving: instead of obsessing over rare catastrophic mistakes, enterprises increasingly look at error rates and how they compare to human performance.
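The talk does not specify how an AOP is represented. One plausible shape, sketched here as an assumption rather than Decagon’s schema, is declarative policy data that a business owner edits and a runtime checks before any agent action executes:

```python
# A hypothetical AOP expressed as plain data so a support lead, not an
# engineer, can edit it. Field names and rules are illustrative only.
refund_aop = {
    "allowed_actions": ["answer_faq", "issue_refund"],
    "max_refund_usd": 50,                    # hard cap without human approval
    "escalate_if": ["legal", "chargeback"],  # keywords that force a handoff
}

def enforce(aop: dict, action: str, amount: float, message: str) -> str:
    """Check a proposed agent action against the AOP before executing it."""
    if action not in aop["allowed_actions"]:
        return "escalate"
    if action == "issue_refund" and amount > aop["max_refund_usd"]:
        return "escalate"
    if any(kw in message.lower() for kw in aop["escalate_if"]):
        return "escalate"
    return "allow"

print(enforce(refund_aop, "issue_refund", 120.0, "Please refund my order"))
# -> "escalate": the amount exceeds the cap the business owner set
```

Keeping the policy as data rather than code is what would let a support lead own the constraints instead of a central AI team.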
On deployment success, the panel pushes back on the idea that enterprise AI fails because of the models alone. The biggest failure mode is organizational: traditional go-to-market teams operate in silos, while top-performing teams run growth like product development. That means building reusable systems, starting with small betas, using appropriate data sources, and keeping a human-in-the-loop for escalation. Decagon and Clay also differentiate by choosing wedges that create numerical proof—Clay starts with data quality and enrichment tests, while Decagon focuses on agent performance and measurable customer support outcomes.
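The human-in-the-loop pattern is named but not detailed in the talk. A toy sketch follows, in which the confidence score, the 0.8 threshold, and `fake_model` are all assumptions for illustration:

```python
# Answers below a confidence threshold are queued for a person instead
# of being sent automatically.
REVIEW_QUEUE: list[tuple[str, str]] = []

def fake_model(question: str) -> tuple[str, float]:
    """Stand-in for a model call that also reports self-assessed confidence."""
    return "Your order ships in 2-3 business days.", 0.65

def answer_with_escalation(question: str) -> str:
    draft, confidence = fake_model(question)
    if confidence < 0.8:
        REVIEW_QUEUE.append((question, draft))  # a human reviews the draft
        return "A teammate will follow up shortly."
    return draft

print(answer_with_escalation("Where is my order?"))
print(len(REVIEW_QUEUE))  # 1: the low-confidence draft awaits review
```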
Finally, they address a market-strategy tradeoff: go deep in one vertical or stay horizontal. Both lean horizontal, Clay because go-to-market needs are broadly similar across industries, and Decagon because its enterprise focus lets it serve multiple verticals without losing product fit. Resource-allocation advice is pragmatic: early-stage companies should be go-to-market driven, but scaling go-to-market too aggressively can mask product bottlenecks. Their shared takeaway for founders is to avoid overfitting to others’ playbooks and instead build from personal strengths and curiosity, then iterate step by step.
Cornell Notes
Enterprise AI succeeds when teams can (1) prove ROI quickly and (2) ship with guardrails that can be updated as models evolve. Decagon builds GenAI customer support agents with per-use-case evaluations, lets customers define their own test sets, and rolls out new models via A/B testing to small audiences first. Clay’s go-to-market platform uses data enrichment and intent signals to automate growth actions, and it emphasizes usage-based pricing and “one-to-many” sales workflows. Both companies argue that risk management should shift from fear of one-off errors to tracking error rates, while governance tools like Decagon’s agent operating procedures let non-technical stakeholders enforce constraints.
- How do Decagon and Clay keep up with fast-changing model capabilities without breaking production systems?
- What guardrail strategy helps enterprises feel safe deploying AI agents that interact with customers?
- Why do pilots fail in enterprise AI, according to these companies?
- How do Clay and Decagon differentiate in a crowded AI market?
- Should an enterprise AI company go deep in one vertical or stay horizontal?
- How should early-stage teams allocate resources between product and go-to-market?
Review Questions
- What mechanisms allow Decagon to validate new model releases safely, and how do customer-defined evaluations change the process?
- Which metrics do Clay and Decagon treat as essential for enterprise pilots to convert into real deals?
- How do AOPs (agent operating procedures) shift responsibility for guardrails between AI teams and business stakeholders?
Key Points
1. Enterprise AI adoption depends on quantifiable ROI and customer experience improvements, not just impressive demos.
2. Model updates should be managed through per-use-case evaluations and controlled rollouts (e.g., A/B testing to small audience slices).
3. Guardrails must be enforceable and customizable, especially for customer-facing agents where incorrect outputs carry reputational risk.
4. Governance can be operationalized through structured procedures (like agent operating procedures) so non-technical stakeholders can set constraints.
5. Successful deployments treat AI like product launches: start small with betas, use human-in-the-loop escalation, and expand after learning.
6. Differentiation often comes from a measurable wedge (data tests for Clay; agent performance and usability for Decagon) rather than generic “AI automation” claims.
7. Resource allocation should evolve: go-to-market focus early, then tighter coupling with product and engineering as scaling bottlenecks appear.