
What nobody tells you about captchas

Theo - t3.gg · 6 min read

Based on Theo - t3.gg's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing to their content.

TL;DR

Mass abuse can arrive instantly (e.g., 300,000 requests in 30 minutes), so CAPTCHA choice must handle both security and user experience.

Briefing

CAPTCHAs for AI chat apps aren't just an annoyance: they're a high-stakes engineering and cost problem, and the "invisible" modes that promise low friction can quietly wreck user experience. After getting hit with massive abuse (300,000 requests in 30 minutes) and trying multiple CAPTCHA providers, the creator of T3 Chat landed on hCaptcha as the most reliable option for keeping bots out without triggering constant user-facing failures.

The journey started with Cloudflare Turnstile, chosen largely because it’s free for small deployments. Setup was workable, but integration friction showed up immediately in React, and development became painful when Turnstile used browser debugging checks that hijacked the devtools experience—especially for developer-heavy user bases who keep consoles open. The bigger issue was reliability: invisible Turnstile challenges were failing far too often. The creator couldn’t find a practical “hybrid” approach where invisibility could default but fall back to a visible challenge only when needed, without building a complicated verification pipeline and maintaining two separate flows. That design gap forced either constant invisible failures (bad UX) or visible challenges (more friction), and the invisible mode’s failure rates were severe enough to cause frequent user complaints.
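The hybrid flow the creator wanted can be pictured as a small server-side decision: verify the invisible token first, and only tell the client to render a visible widget when the failure looks recoverable. A minimal sketch in TypeScript; the type names, error-code list, and escalation policy here are illustrative assumptions, not Turnstile's API.

```typescript
// Sketch of the "promote from invisible" flow Turnstile lacks out of the
// box: after verifying the invisible token server-side, decide whether to
// proceed, escalate to a visible challenge, or reject outright.
// `VerifyOutcome`, `NextStep`, and the retryable-code list are assumptions.

type VerifyOutcome = { success: boolean; errorCodes?: string[] };

type NextStep =
  | { action: "proceed" }
  | { action: "show_visible_challenge"; reason: string }
  | { action: "reject" };

function nextStep(outcome: VerifyOutcome): NextStep {
  if (outcome.success) return { action: "proceed" };
  // Timeouts and transient errors may deserve a second, visible attempt;
  // a malformed or missing token is more likely a bot.
  const retryable = ["timeout-or-duplicate", "internal-error"];
  if (outcome.errorCodes?.some((c) => retryable.includes(c))) {
    return {
      action: "show_visible_challenge",
      reason: outcome.errorCodes.join(","),
    };
  }
  return { action: "reject" };
}
```

The cost the creator flagged is visible even in this sketch: the app now owns a verification endpoint, an escalation policy, and a second client-side rendering path.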

Switching to Google's reCAPTCHA v3 improved some things but introduced a different set of problems: documentation quality was described as exceptionally poor, and the invisible mode relied on fingerprinting and returned a risk score rather than a clean pass/fail. Even after tuning thresholds (lowering the cutoff to 0.5 to reduce failures), the system still produced alarming outcomes, such as many users consistently receiving "risky" scores, including reports of CAPTCHA failures on Safari on iPhones. reCAPTCHA also lacked a built-in way to conditionally "promote" invisible checks into visible challenges, meaning the app team would have had to build that extra infrastructure themselves.
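The score-versus-threshold tradeoff can be made concrete: reCAPTCHA v3 hands back a number in [0, 1], and the app alone decides where to cut. A quick sketch with made-up scores (the distribution below is illustrative, not the creator's data) shows how sharply the failure rate moves with the cutoff.

```typescript
// reCAPTCHA v3 returns a risk score rather than pass/fail, so the app
// must pick its own cutoff. Scores at or above the threshold pass.
function classify(score: number, threshold = 0.5): "allow" | "deny" {
  return score >= threshold ? "allow" : "deny";
}

// Illustrative score distribution for one batch of traffic (not real data).
const scores = [0.1, 0.3, 0.3, 0.5, 0.7, 0.7, 0.9, 0.9, 0.9, 0.9];

// Fraction of this traffic denied at a given threshold.
const failRate = (t: number) =>
  scores.filter((s) => classify(s, t) === "deny").length / scores.length;
```

With this sample, `failRate(0.5)` denies 3 of 10 requests while `failRate(0.8)` denies 6 of 10: the same users, a doubled failure rate, and no way for the client to know which side of the line it landed on.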

hCaptcha became the turning point. Pricing was similar in principle (about a dollar per thousand verifications), but the key feature was a "99.9% passive mode" that stays invisible for most users and automatically escalates to a challenge only when confidence drops. That escalation logic happens client-side inside hCaptcha's JavaScript, so the app team rarely even sees server-side verification failures. In reported metrics, reCAPTCHA's failure rate sat around 8–9% during normal traffic hours, while hCaptcha's was dramatically lower, down to roughly 0.1% in the creator's recent window. Over the last 24 hours, challenges were opened only 16 times and no server-side challenge failures were recorded, suggesting most "failures" were users abandoning the challenge rather than systemic bot-blocking errors.

The practical takeaway is blunt: a CAPTCHA provider's documentation, testability, and invisible-mode behavior matter as much as raw cost. Turnstile was held back by invisible-mode reliability and the missing "promote from invisible" capability. reCAPTCHA v3 was close but undermined by invisible failure patterns and documentation gaps that made it hard to implement. hCaptcha, despite sparse documentation about its passive mode, delivered the experience the app needed (fewer blocks, fewer complaints, and fewer visible challenges), so the creator now recommends it as the default choice when user experience matters more than per-request verification cost. The creator also flags Radar (WorkOS) as a potential future option, especially for account-level fraud prevention, though it doesn't solve signed-out challenges.

Cornell Notes

The CAPTCHA problem for T3 Chat wasn't just stopping bots; it was preventing invisible challenges from breaking real users while keeping costs under control. Cloudflare Turnstile was free and easy to start with, but invisible-mode reliability was poor and development became annoying due to browser-debugging checks. Google reCAPTCHA v3 improved reliability in some ways but suffered from extremely weak documentation and a scoring-based invisible mode that still produced high "risky" rates, especially on privacy-focused browsers and Safari. hCaptcha stood out because its "99.9% passive mode" stays invisible for most users and automatically escalates to a visible challenge when confidence is low, eliminating most server-side failures. The result was a sharp drop in challenge failures and user complaints, with challenges rarely shown at all.

Why did invisible CAPTCHA modes create unexpected user-experience failures in this setup?

Invisible modes rely on device characteristics and browser behavior rather than showing a visible challenge. In Turnstile's case, the creator reported non-stop invisible failures and couldn't implement a true hybrid flow (invisible by default, visible only when needed) without building extra verification logic and maintaining two code paths. With reCAPTCHA v3, the invisible flow returned a risk score (0–1) rather than a simple pass/fail, so even after lowering the threshold to 0.5 to reduce false blocks, many users still received "risky" scores. Privacy-focused browsers (e.g., Brave) and Safari on iPhones were specifically associated with CAPTCHA failures, worsening sentiment.

What was the core engineering gap that forced extra complexity when using Turnstile or Recapture?

Both systems lacked an easy, built-in "promote from invisible" capability. The creator wanted invisibility by default, with a visible challenge only when confidence was low. Turnstile required a checkbox in challenge mode to raise pass rates, and the client couldn't know whether verification passed, so the app would need a separate verification pipeline to decide when to show a challenge. reCAPTCHA similarly didn't provide a conditional escalation mechanism, so the team would have had to build an additional modal flow and token-verification endpoint to handle low-confidence cases.

How did the creator reduce CAPTCHA cost without sacrificing bot resistance?

Instead of verifying every message, the team shifted toward verifying a user for a period of time using tokens that expire. They also leveraged Google Cloud credits (from being a Y Combinator company) to test reCAPTCHA v3 despite its per-assessment pricing. The strategy aimed to reduce the number of CAPTCHA checks from per-request to per-user, lowering total spend while still filtering abusive traffic.
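The per-user approach can be sketched as a signed, expiring token issued after one successful CAPTCHA pass; later requests present the token instead of re-running a CAPTCHA. The token format, names, and secret handling below are illustrative assumptions, not T3 Chat's implementation.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical: load from an environment variable in a real deployment.
const SECRET = "server-side-secret";

// Issue a token of the form "<userId>.<expiryMs>.<hmac>" after a
// successful CAPTCHA pass. Assumes userId contains no "." characters.
function issueToken(userId: string, ttlMs: number, now = Date.now()): string {
  const payload = `${userId}.${now + ttlMs}`;
  const sig = createHmac("sha256", SECRET).update(payload).digest("hex");
  return `${payload}.${sig}`;
}

// Later requests present the token; no CAPTCHA check runs unless the
// token is missing, tampered with, or expired.
function isTokenValid(token: string, now = Date.now()): boolean {
  const [userId, expiry, sig] = token.split(".");
  if (!userId || !expiry || !sig) return false;
  const expected = createHmac("sha256", SECRET)
    .update(`${userId}.${expiry}`)
    .digest("hex");
  const a = Buffer.from(sig, "hex");
  const b = Buffer.from(expected, "hex");
  // Constant-time comparison to avoid leaking signature bytes.
  if (a.length !== b.length || !timingSafeEqual(a, b)) return false;
  return Number(expiry) > now; // reject expired tokens
}
```

With a TTL of, say, an hour, a chatty user triggers one CAPTCHA assessment instead of hundreds, which is what turns per-thousand pricing from alarming into manageable.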

What made hCaptcha operationally different from Turnstile and reCAPTCHA in practice?

hCaptcha's "99.9% passive mode" stays invisible for most users and only triggers a visible challenge when it has reason to believe the user isn't real. The escalation happens inside hCaptcha's JavaScript before the app submits tokens to the server, so the app team saw very few server-side verification failures. The creator reported that challenges were opened only 16 times in 24 hours, with no server-side challenge failures in that window.
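Server-side, the hCaptcha check itself is a single POST of the client token to hCaptcha's siteverify endpoint; by the time a token reaches this call, the passive-versus-visible escalation has already happened in the browser. A minimal sketch; only the `success` flag of the response is relied on here, and anything beyond that shape should be checked against hCaptcha's docs.

```typescript
// hCaptcha, like Turnstile and reCAPTCHA, is verified server-side by
// POSTing the client-issued token along with the site secret.
const SITEVERIFY_URL = "https://api.hcaptcha.com/siteverify";

// Form-encode the secret and token as siteverify expects.
function buildVerifyBody(secret: string, token: string): string {
  return new URLSearchParams({ secret, response: token }).toString();
}

async function verifyToken(secret: string, token: string): Promise<boolean> {
  const res = await fetch(SITEVERIFY_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildVerifyBody(secret, token),
  });
  const data = (await res.json()) as { success: boolean };
  return data.success;
}
```

Because escalation is handled client-side, this endpoint almost never sees a token that fails, which matches the creator's observation of zero server-side challenge failures in the 24-hour window.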

Why did the creator still criticize reCAPTCHA v3 even after getting it working?

reCAPTCHA v3 was described as extremely hard to implement due to poor documentation, even though the final code was simpler once the architecture was understood. Reliability issues persisted: invisible-mode fingerprinting performed worse in privacy-focused browsers, and the scoring model meant many users could be consistently classified as "risky." Threshold tuning (0.5 vs. 0.7 vs. 0.8) produced large swings in failure rates, including a scenario where a 0.8 threshold could cause roughly a third of requests to fail.

What does the creator’s scoring rubric reveal about what matters beyond raw CAPTCHA pricing?

The creator rated providers across ease of integration, invisible-mode reliability, cost, documentation, and testability. Turnstile scored poorly on invisible reliability and dev/test ergonomics; reCAPTCHA scored very low on documentation; hCaptcha scored highest overall because it delivered the desired passive/invisible behavior with automatic escalation. Cost mattered too: Turnstile's free tier was attractive, but scaling beyond a couple of site keys required enterprise pricing, while reCAPTCHA and hCaptcha both had per-thousand economics that could become expensive at high volume.

Review Questions

  1. Which specific missing capability (invisible-to-visible escalation) forced the app team to build extra verification flows with Turnstile and reCAPTCHA?
  2. How did the scoring model in reCAPTCHA v3 (risk thresholds like 0.5/0.7/0.8) change the balance between false positives and user friction?
  3. What evidence did the creator use to claim hCaptcha reduced both server-side failures and user-facing challenge frequency?

Key Points

  1. Mass abuse can arrive instantly (e.g., 300,000 requests in 30 minutes), so CAPTCHA choice must handle both security and user experience.

  2. Invisible CAPTCHA modes can fail silently; without a "promote from invisible" mechanism, apps may need complex two-path verification logic.

  3. Turnstile's invisible mode caused frequent failures and also created dev friction due to browser-debugging checks that interfered with devtools.

  4. reCAPTCHA v3's invisible mode uses fingerprinting and returns a risk score, making threshold tuning critical and potentially punishing to real users.

  5. hCaptcha's "99.9% passive mode" automatically escalates to visible challenges when confidence is low, reducing server-side failures and visible interruptions.

  6. CAPTCHA provider documentation and testability can be as important as pricing, since poor docs create distrust and implementation risk.

  7. Account-level fraud tools like Radar (WorkOS) may help for signed-in users, but they don't solve signed-out CAPTCHA needs by themselves.

Highlights

Invisible CAPTCHA reliability can be worse than expected: Turnstile's invisible mode produced frequent failures and user complaints, while reCAPTCHA's scoring thresholds still left many users flagged as "risky."
reCAPTCHA v3's invisible flow returns a score (0–1), so small threshold changes can swing failure rates dramatically, up to roughly a third of requests at a higher threshold.
hCaptcha's "99.9% passive mode" is the differentiator: it stays invisible for most users and only triggers visible challenges when confidence drops, leading to very few challenge openings (16 in 24 hours).
The "promote from invisible" capability is the missing piece that forced, or threatened to force, the creator to build extra infrastructure with other providers.

Topics

  • CAPTCHA Reliability
  • Invisible Challenges
  • Bot Mitigation
  • Rate Limiting
  • Fraud Prevention