GPT-4 Turbo with Google Web Browsing (Assistants API)
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking, and subscribing.
Briefing
The video demonstrates a practical Assistants API pattern: rewrite a user question into a Google-friendly search query, fetch fresh web results, scrape the pages for relevant text, then feed that "grounded" context back into GPT-4 Turbo so answers reflect current information instead of relying on training-time knowledge. The workflow matters because it turns a general-purpose model into a news-and-facts retriever that can cite up-to-date details—illustrated with very recent claims about Sam Altman and with a sports result example.
The system starts when a user types a question. A first GPT-4 Turbo call reformulates the query into a shorter, search-optimized version (for example, converting “Sam Altman fired from OpenAI” into a Google-ready query). That rewritten query is sent to an Assistants API flow that uses function calling: one tool queries Google via SerpAPI to obtain organic result URLs, and another tool scrapes the returned pages using BeautifulSoup 4 to extract the relevant text. The pipeline then combines the original user question with the scraped context and makes a second GPT-4 Turbo call to generate the final answer.
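The flow above can be sketched as a single pipeline function. This is a minimal illustration, not the video's actual code: the function names and the idea of injecting the search, scrape, and LLM steps as callables are assumptions made here so the structure stands on its own.

```python
# Sketch of the two-call pipeline: rewrite -> search -> scrape -> grounded answer.
# All function names are illustrative assumptions, not the video's exact code.
from typing import Callable, List


def answer_with_web_grounding(
    question: str,
    rewrite_query: Callable[[str], str],        # GPT-4 Turbo call #1: Google-optimized query
    search_google: Callable[[str], List[str]],  # SerpAPI: organic result URLs
    scrape_page: Callable[[str], str],          # BeautifulSoup 4: extract page text
    generate_answer: Callable[[str, str], str], # GPT-4 Turbo call #2: final answer
) -> str:
    """Ground the answer in freshly scraped web content."""
    query = rewrite_query(question)
    urls = search_google(query)
    # Concatenate scraped text from every result page into one context blob.
    context = "\n\n".join(scrape_page(url) for url in urls)
    # Combine the original user question with the scraped context for the second call.
    return generate_answer(question, context)
```

Injecting the steps as callables keeps the orchestration testable: each stage can be swapped for a stub without touching the pipeline logic.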
A key design choice is how the final response is constrained. A system message instructs the model to return only the essential parts that answer the user’s original question, while adding three bullet points that justify the response using the grounded text. In testing, the approach produces answers that track the retrieved material—for instance, for the query about whether Sam Altman was fired, the model returns a summary consistent with the scraped sources, including details like timing (“fired… Friday”) and downstream effects (uncertainty and resignations). The presenter emphasizes that this kind of grounding is not dependent on the model’s training data, since the events referenced were described as occurring only days earlier.
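The constraining system message might look like the sketch below. The exact wording is an assumption; the video's prompt may differ, but the two constraints it describes (essential answer only, plus three justification bullets drawn from the grounded text) are what matters.

```python
# Illustrative system message enforcing the two constraints described above.
# The exact phrasing is an assumption, not the video's verbatim prompt.
GROUNDED_SYSTEM_MESSAGE = (
    "Answer using ONLY the provided scraped context. "
    "Return only the essential parts that answer the user's original question, "
    "then add exactly three bullet points that justify the answer with facts "
    "taken from the context."
)


def build_messages(question: str, context: str) -> list:
    """Assemble the chat messages for the second GPT-4 Turbo call."""
    return [
        {"role": "system", "content": GROUNDED_SYSTEM_MESSAGE},
        {"role": "user", "content": f"Question: {question}\n\nContext:\n{context}"},
    ]
```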
The same mechanism is demonstrated with a non-governance question: “who won the Las Vegas F1 Grand Prix.” After query rewriting and scraping, GPT-4 Turbo generates a winner and supporting details (including mention of a penalty and a collision) drawn from the retrieved page text. The system also supports output-format experimentation. By changing the instruction prompt, the assistant can return structured JSON (e.g., a concise answer field) or even a short poem, while still using the scraped context as the factual basis.
Implementation details reinforce the pattern. The code defines separate functions for (1) generating the Google search query with GPT-4 Turbo, (2) retrieving organic results via SerpAPI, and (3) scraping each URL with BeautifulSoup. Function calling is used serially rather than in parallel because scraping requires the URLs first. The workflow is then run in a terminal environment; the OpenAI Playground is noted as unable to scrape websites in this setup. Overall, the transcript presents a working blueprint for “web-grounded” Q&A using GPT-4 Turbo plus Assistants API tool calls, with flexible output formatting and a clear grounding strategy.
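The two tool functions and the serial dispatch could be wired up roughly as follows. The tool definitions use the function-calling JSON-schema format; the tool names and parameters are assumptions for illustration, and the handlers would wrap the real SerpAPI and BeautifulSoup calls.

```python
# Sketch of the two web-grounding tools and a serial dispatcher.
# Tool names and parameter schemas are illustrative assumptions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_search_results",
            "description": "Fetch organic Google result URLs via SerpAPI.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scrape_website",
            "description": "Scrape a URL with BeautifulSoup 4 and return its text.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]


def dispatch_tool_call(name: str, args: dict, handlers: dict):
    """Run one tool call at a time. Scraping needs the URLs that the search
    step produces, so the calls must execute serially, not in parallel."""
    if name not in handlers:
        raise KeyError(f"Unknown tool: {name}")
    return handlers[name](**args)
```

In a real run, the handlers would call SerpAPI and BeautifulSoup; the point of the serial loop is the data dependency — `scrape_website` cannot be invoked until `get_search_results` has returned URLs.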
Cornell Notes
The core idea is to ground GPT-4 Turbo answers in fresh web content by combining Assistants API function calling with a two-step retrieval pipeline. First, GPT-4 Turbo rewrites a user’s question into a Google-optimized search query. Next, SerpAPI fetches organic result URLs and BeautifulSoup 4 scrapes the pages to extract relevant text. That scraped context is then fed back into GPT-4 Turbo to produce an answer constrained by instructions to use only the essential information plus three bullet-point justifications. The approach is demonstrated with recent news about Sam Altman and with an F1 results question, and it supports different output formats like JSON or a short poem.
How does the system turn a free-form user question into something suitable for web retrieval?
What are the two tool functions used for web grounding, and why are they run serially?
How does the final GPT-4 Turbo answer stay tied to retrieved facts?
What evidence is shown that grounding works for breaking news?
How can the output format be changed without changing the retrieval pipeline?
Why does the OpenAI Playground fail for scraping in this setup?
Review Questions
- In what order do the search and scraping tool functions execute, and what dependency forces that order?
- What prompt constraints are used to ensure the final answer uses grounded context, and how many justification bullets are required?
- How does query rewriting improve the quality of Google results compared with sending the raw user question directly?
Key Points
1. Rewrite user questions into Google-optimized queries using GPT-4 Turbo before searching.
2. Use SerpAPI to fetch organic result URLs, then scrape those pages with BeautifulSoup 4 for grounded context.
3. Feed the original question plus scraped text back into GPT-4 Turbo to generate answers tied to current sources.
4. Constrain the final response with a system instruction to include only essential answer content plus three bullet-point justifications.
5. Support flexible output formats (plain text, valid JSON, or even a poem) by changing only the final formatting prompt.
6. Run tool calls serially when scraping requires URLs produced by the search step.
7. Expect environment differences: scraping may work in a terminal run but not in the OpenAI Playground for this configuration.