Am I going to jail for web scraping?
Based on Fireship's video on YouTube.
A Delaware court found booking.com liable under the CFAA for scraping Ryanair’s website and using the data to book tickets for profit without authorization.
Briefing
A Delaware federal court ruling found that booking.com violated the Computer Fraud and Abuse Act (CFAA) by scraping Ryanair’s website, an outcome that puts “publicly accessible” data into a more legally risky category when scraping is tied to unauthorized access and resale. The decision matters because it signals that scraping can cross from gray-area automation into liability under a federal anti-hacking statute that carries both civil and criminal penalties, depending on how the access is obtained and what the scraper does with the data.
The dispute centered on booking.com extracting Ryanair ticket information and using it to book tickets for profit without authorization. Booking.com attempted to counter-sue Ryanair for defamation after Ryanair labeled the company an “online travel agency pirate,” but the court rejected that claim. The core legal takeaway wasn’t the branding fight; it was the court’s willingness to treat the scraping conduct as actionable under the CFAA.
The transcript places this case in a broader pattern of scraping litigation. In a dispute that ended in 2015, a company called 3Taps scraped data from Craigslist for a site called PadMapper, and kept doing so after Craigslist blocked its IP addresses and sent a cease-and-desist letter. The court held that Craigslist could rely on the CFAA even though the data was public, and 3Taps ultimately agreed to stop scraping and pay $1 million. The message: once a site tells a scraper to stop, continuing can trigger serious legal consequences.
Yet outcomes have not been uniform. hiQ Labs sued LinkedIn after scraping LinkedIn data to predict when employees might leave their jobs. LinkedIn had also sent a cease-and-desist letter, but in 2019 a federal appeals court ruled for hiQ, allowing access to LinkedIn’s public data. The Supreme Court later vacated that decision and sent it back for reconsideration, and the appeals court reaffirmed its ruling in 2022. The transcript frames this as a counterweight to the Craigslist-style approach, suggesting that scraping public information may be treated differently when the conduct doesn’t involve the same kind of unauthorized access.
More recently, a lawsuit tied to AI training data also ended in a scraper-friendly direction. A judge dismissed “with prejudice” a case claiming GitHub Copilot violated software developers’ rights by ignoring open-source licenses when scraping code to train the tool. That dismissal, as described, means the claim cannot be refiled.
So will someone go to jail for web scraping? The transcript’s practical bottom line is cautious: if scraping involves publicly available data and no fraud, the odds of jail are “extremely low.” The bigger risk, it warns, is civil litigation—especially from large corporations that can impose crushing legal costs. In short, the legal boundary appears less about whether data is visible in a browser and more about authorization, intent, and whether a site has demanded the scraping stop.
Cornell Notes
A Delaware court ruled that booking.com violated the CFAA by scraping Ryanair’s website, highlighting that “publicly accessible” web data can still create federal legal risk when access is unauthorized and tied to profit. The transcript contrasts this with earlier cases where scraping public data was treated more leniently, including hiQ Labs v. LinkedIn (public data access allowed and upheld on appeal). It also notes a separate AI-related win: a judge dismissed a GitHub Copilot licensing lawsuit with prejudice. Overall, the practical guidance is that jail risk is low for scraping public data without fraud, but the financial and legal exposure from lawsuits can be severe, especially after a site blocks you or demands that you stop.
Why did booking.com’s scraping of Ryanair become a CFAA problem?
What happened in the 3Taps vs. Craigslist dispute, and what precedent did it reinforce?
How did hiQ Labs v. LinkedIn differ from the Craigslist-style outcome?
What does the GitHub Copilot licensing dismissal suggest about scraping for AI training?
If data is visible in a browser, what still determines legal risk?
Review Questions
- Which factors in the booking.com vs. Ryanair case made the scraping legally risky under the CFAA?
- How do the outcomes in the Craigslist and LinkedIn disputes differ, and what does that imply about scraping public data?
- What does a “dismissed with prejudice” outcome mean for future similar claims, based on the GitHub Copilot example?
Key Points
1. A Delaware court found booking.com liable under the CFAA for scraping Ryanair’s website and using the data to book tickets for profit without authorization.
2. Defamation countersuits tied to scraping disputes can fail: the court rejected booking.com’s claim against Ryanair and treated the scraping conduct as the main legal issue.
3. Ignoring a cease-and-desist and continuing to scrape after IP blocking can trigger CFAA exposure, as shown by 3Taps’s $1 million settlement.
4. Scraping public data isn’t automatically illegal; hiQ Labs v. LinkedIn allowed access to public information, a ruling that was upheld on appeal.
5. AI training cases may turn on licensing and claim viability; a GitHub Copilot-related lawsuit was dismissed with prejudice, preventing refiling.
6. Even when jail risk is low for non-fraud public-data scraping, civil lawsuits from large companies can create severe financial consequences.