Riverside Take 04 Feb 3 2025 from Nate
Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
A new “deep research” mode tied to OpenAI’s full o3 model is pushing AI research performance sharply higher—while a Japan press conference signals a fast-track push toward artificial general intelligence (AGI) and Japan-specific AI infrastructure. The most concrete product update is that deep research can spend up to 30 minutes browsing the web, then return a full-length report with citations for complex tasks like legal research, scientific questions, mathematics, history, or even language evolution. The pitch is that it performs web-browsing research and produces graduate-level outputs, with noticeably higher quality than earlier “deep research”-named offerings from other companies.
At the same time, SoftBank chairman Masayoshi Son brought Sam Altman to Japan to discuss a major OpenAI funding deal separate from “Stargate,” and the public commitments went beyond money. Son said an AGI announcement would happen in Japan in less than two years, and Altman agreed. The event also introduced a Japan-focused “fork” of OpenAI: a special company building a model called “Crystal” that would sit inside Japanese companies’ firewalls and independently review, optimize, and maintain their source code. The described workflow goes further than static code review—Crystal is said to listen to calls and continuously update code, with benefits initially exclusive to Japanese companies.
The transcript frames these moves as more than standard corporate partnership. SoftBank is portrayed as seeking a legacy moment—showing it helped bring AGI to Japan rather than merely funding a U.S. company that achieves AGI elsewhere. That framing matters because it hints at how AI deployment could be regionalized: not just models and APIs, but governance, data access, and operational control inside company environments.
The product and performance claims arrive with a separate benchmark story. Deep research rolled out over the weekend and quickly improved results on “Humanity’s Last Exam,” a test used to gauge how well AI can handle difficult, exam-like tasks. The transcript cites a progression: o1 around 9%, DeepSeek’s R1 around 10–11%, o3 mini-high around 13% on Friday, and then a jump to 25% after deep research kicked in by Sunday/Monday Japan time—using full o3 rather than o3 mini. The implication is that research-mode tooling (time, browsing, and report generation) can materially change benchmark outcomes in days.
Finally, the press conference included an unusual remark from Son about AI not “eating people,” justified with the claim that AI doesn’t need protein for energy. The transcript treats the comment as eccentric but also as part of a broader message: the leadership involved appears to believe AGI is close enough to plan for now, even as the practical details—like what Crystal will do inside firms and how it affects OpenAI’s relationship with Microsoft—remain open questions.
Cornell Notes
OpenAI’s “deep research” mode is presented as a step up from earlier web-scraping research tools: it can browse the web for up to 30 minutes, then produce a full report with accurate citations for complex questions. The transcript ties this capability to the full o3 model (not o3 mini), positioning it as smarter and more capable for tasks that normally require hours of expert work. In parallel, a Japan press conference with Sam Altman and Masayoshi Son signals an AGI timeline—an AGI announcement in Japan in less than two years—and a Japan-focused initiative called “Crystal.” Crystal is described as a firewall-contained model that reviews and optimizes company source code and is initially exclusive to Japanese companies. Benchmark results on “Humanity’s Last Exam” reportedly jumped to 25% after deep research was enabled, suggesting research-mode tooling can quickly move performance.
What does “deep research” actually do, and what kinds of questions is it meant for?
How is OpenAI’s deep research positioned relative to other similarly named products?
What commitments about AGI were made at the Japan press conference?
What is “Crystal,” and how is it supposed to work inside companies?
How did deep research affect performance on “Humanity’s Last Exam,” according to the transcript?
Why does the transcript treat the Japan deal and Crystal initiative as more than a typical investment?
Review Questions
- What operational steps does deep research perform (including time limits and output format), and how do those steps relate to the quality of citations?
- How do the transcript’s benchmark numbers on “Humanity’s Last Exam” change before and after deep research is enabled?
- What are the stated goals and constraints of the Crystal initiative, and why might firewall placement matter for adoption?
Key Points
1. OpenAI’s deep research mode is described as spending up to 30 minutes browsing the web and returning a full, cited report for complex questions.
2. The deep research capability is tied to the full o3 model (not o3 mini), with claims of higher quality than other “deep research” offerings.
3. Masayoshi Son and Sam Altman publicly aligned on an AGI announcement in Japan in less than two years.
4. A Japan-specific initiative called “Crystal” is described as a firewall-contained model that reviews, optimizes, and maintains company source code and can listen to calls.
5. The transcript links a major SoftBank funding deal for OpenAI to the Japan press conference, separate from “Stargate.”
6. Benchmark performance on “Humanity’s Last Exam” reportedly jumped to 25% after deep research was enabled, indicating research-mode tooling can quickly move results.
7. The Crystal plan is framed as initially exclusive to Japanese companies, implying a regional approach to AI deployment and control.