
GPT-4 Turbo with Google Web Browsing (Assistants API)

All About AI · 5 min read

Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Rewrite user questions into Google-optimized queries using GPT-4 Turbo before searching.

Briefing

A practical Assistants API pattern is on display: rewrite a user question into a Google-friendly search query, fetch fresh web results, scrape the pages for relevant text, then feed that “grounded” context back into GPT-4 Turbo so answers reflect current information instead of relying on training-time knowledge. The workflow matters because it turns a general-purpose model into a news-and-facts retriever that can cite up-to-date details—illustrated with very recent claims about Sam Altman and with a sports result example.

The system starts when a user types a question. A first GPT-4 Turbo call reformulates the query into a shorter, search-optimized version (for example, converting “Sam Altman fired from OpenAI” into a Google-ready query). That rewritten query is sent to an Assistants API flow that uses function calling: one tool queries Google via SerpAPI to obtain organic result URLs, and another tool scrapes the returned pages using BeautifulSoup 4 to extract the relevant text. The pipeline then combines the original user question with the scraped context and makes a second GPT-4 Turbo call to generate the final answer.
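The four steps above can be sketched as a small orchestration function. The function and parameter names here are assumptions for illustration, not the video's exact code; in the real pipeline the injected callables would wrap GPT-4 Turbo, SerpAPI, and BeautifulSoup respectively.

```python
from typing import Callable, List

def grounded_answer(
    question: str,
    rewrite_query: Callable[[str], str],          # GPT-4 Turbo call #1
    search_urls: Callable[[str], List[str]],      # SerpAPI organic results
    scrape_page: Callable[[str], str],            # BeautifulSoup text extraction
    generate_answer: Callable[[str, str], str],   # GPT-4 Turbo call #2
) -> str:
    query = rewrite_query(question)                       # 1. Google-optimized query
    urls = search_urls(query)                             # 2. fresh result URLs
    context = "\n\n".join(scrape_page(u) for u in urls)   # 3. grounding context
    return generate_answer(question, context)             # 4. grounded final answer
```

Injecting the steps as callables makes the serial dependency explicit: scraping cannot start until the search step has returned URLs.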

A key design choice is how the final response is constrained. A system message instructs the model to return only the essential parts that answer the user’s original question, while adding three bullet points that justify the response using the grounded text. In testing, the approach produces answers that track the retrieved material—for instance, for the query about whether Sam Altman was fired, the model returns a summary consistent with the scraped sources, including details like timing (“fired… Friday”) and downstream effects (uncertainty and resignations). The presenter emphasizes that this kind of grounding is not dependent on the model’s training data, since the events referenced were described as occurring only days earlier.

The same mechanism is demonstrated with a non-governance question: “who won the Las Vegas F1 Grand Prix.” After query rewriting and scraping, GPT-4 Turbo generates a winner and supporting details (including mention of a penalty and a collision) drawn from the retrieved page text. The system also supports output-format experimentation. By changing the instruction prompt, the assistant can return structured JSON (e.g., a concise answer field) or even a short poem, while still using the scraped context as the factual basis.

Implementation details reinforce the pattern. The code defines separate functions for (1) generating the Google search query with GPT-4 Turbo, (2) retrieving organic results via SerpAPI, and (3) scraping each URL with BeautifulSoup. Function calling is used serially rather than in parallel because scraping requires the URLs first. The workflow is then run in a terminal environment; the OpenAI Playground is noted as unable to scrape websites in this setup. Overall, the transcript presents a working blueprint for “web-grounded” Q&A using GPT-4 Turbo plus Assistants API tool calls, with flexible output formatting and a clear grounding strategy.
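The video's scraper uses BeautifulSoup 4 (roughly `BeautifulSoup(html, "html.parser").get_text()`); this stdlib-only sketch shows the same underlying idea of reducing a fetched page to plain text suitable for grounding, skipping script and style content.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # >0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def page_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```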

Cornell Notes

The core idea is to ground GPT-4 Turbo answers in fresh web content by combining Assistants API function calling with a two-step retrieval pipeline. First, GPT-4 Turbo rewrites a user’s question into a Google-optimized search query. Next, SerpAPI fetches organic result URLs and BeautifulSoup 4 scrapes the pages to extract relevant text. That scraped context is then fed back into GPT-4 Turbo to produce an answer constrained by instructions to use only the essential information plus three bullet-point justifications. The approach is demonstrated with recent news about Sam Altman and with an F1 results question, and it supports different output formats like JSON or a short poem.

How does the system turn a free-form user question into something suitable for web retrieval?

A first GPT-4 Turbo call rewrites the user input into an “optimized Google search query.” The prompt instructs the model to convert unstructured text into a search-friendly query, often by shortening it and removing extra keywords. Example: “Sam Altman fired from OpenAI” becomes a cleaner search query along the lines of “Sam Altman OpenAI fired status,” which is then used to fetch current results.
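A minimal sketch of how the messages for that first call might be built; the instruction wording and helper name are assumptions, and the actual model/client invocation is shown only as a comment.

```python
def build_rewrite_messages(user_question: str) -> list:
    """Chat messages asking GPT-4 Turbo to emit only a search query."""
    system = (
        "Convert the user's unstructured text into a short, "
        "optimized Google search query. Return only the query."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_question},
    ]

# With the official openai client this would be passed as, e.g.:
# client.chat.completions.create(model="gpt-4-turbo-preview",
#                                messages=build_rewrite_messages(question))
```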

What are the two tool functions used for web grounding, and why are they run serially?

The Assistants API uses function calling with two tools: (1) a function that calls SerpAPI to get organic Google results and returns URLs, and (2) a function that scrapes a given URL using BeautifulSoup 4 to extract page content. They run serially because scraping depends on having the URL from the search step; without the URLs, the scraper can’t fetch the right pages.
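Illustrative JSON schemas for the two tools, in the shape the Assistants API function-calling interface expects. The tool names and parameter fields here are assumptions; the actual implementations behind them call SerpAPI and BeautifulSoup 4.

```python
# Two function-calling tool definitions: search first, then scrape.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_search_results",
            "description": "Query Google via SerpAPI and return organic result URLs.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "scrape_website",
            "description": "Fetch a URL and extract its text with BeautifulSoup 4.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]
# The calls must run serially: scrape_website consumes the URLs
# that get_search_results produces.
```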

How does the final GPT-4 Turbo answer stay tied to retrieved facts?

The final call includes both the original user question and the scraped “grounding context.” A system instruction constrains output to “only the essential parts” that answer the question and adds three bullet points backing the reasoning using the grounded text. This makes the response reflect what was retrieved rather than relying on the model’s pretraining knowledge.
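A sketch of how the final call might combine the question with the grounding context. The exact system wording is an assumption based on the described constraint (essential answer plus three justification bullets).

```python
def build_final_messages(question: str, context: str) -> list:
    """Messages for the grounded second GPT-4 Turbo call."""
    system = (
        "Answer using ONLY the grounded context below. Return only the "
        "essential parts that answer the user's question, followed by "
        "three bullet points justifying the answer from the context.\n\n"
        f"Grounded context:\n{context}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```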

What evidence is shown that grounding works for breaking news?

For the query about Sam Altman, the model’s answer is described as consistent with scraped sources, including timing (“suddenly fired… Friday”) and consequences (uncertainty and resignations). The transcript emphasizes that the referenced events were recent enough that they wouldn’t be reliably present in training data, so grounding is presented as the reason the answer matches current reporting.

How can the output format be changed without changing the retrieval pipeline?

The retrieval steps remain the same; only the final system message/prompt changes. The transcript demonstrates switching from a default “answer + three bullet points” format to “short concise response in valid JSON format,” and then to “short poem format,” while still using the scraped context to supply the factual content.
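Since only the final instruction changes, the format swap can be modeled as a small lookup. The wording of each variant is an assumption paraphrasing the three formats shown in the video.

```python
# Output-format variants; retrieval and grounding are untouched.
FORMATS = {
    "bullets": "Answer with the essential parts plus three justification bullet points.",
    "json": 'Return a short concise response in valid JSON format, e.g. {"answer": "..."}.',
    "poem": "Answer in short poem format.",
}

def format_instruction(style: str, context: str) -> str:
    """Compose the final system instruction for the chosen output style."""
    return f"{FORMATS[style]}\nUse only this grounded context:\n{context}"
```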

Why does the OpenAI Playground fail for scraping in this setup?

The transcript notes that website scraping doesn’t work in the OpenAI Playground environment for this assistant configuration. The scraping function works when run in a terminal environment, implying the Playground sandbox doesn’t allow the same outbound scraping behavior or tool execution needed for BeautifulSoup fetching.

Review Questions

  1. In what order do the search and scraping tool functions execute, and what dependency forces that order?
  2. What prompt constraints are used to ensure the final answer uses grounded context, and how many justification bullets are required?
  3. How does query rewriting improve the quality of Google results compared with sending the raw user question directly?

Key Points

  1. Rewrite user questions into Google-optimized queries using GPT-4 Turbo before searching.

  2. Use SerpAPI to fetch organic result URLs, then scrape those pages with BeautifulSoup 4 for grounded context.

  3. Feed the original question plus scraped text back into GPT-4 Turbo to generate answers tied to current sources.

  4. Constrain the final response with a system instruction to include only essential answer content plus three bullet-point justifications.

  5. Support flexible output formats (plain text, valid JSON, or even a poem) by changing only the final formatting prompt.

  6. Run tool calls serially when scraping requires URLs produced by the search step.

  7. Expect environment differences: scraping may work in a terminal run but not in the OpenAI Playground for this configuration.

Highlights

A two-call GPT-4 Turbo workflow—query rewriting, then answer generation—turns web search results into grounded answers.
Grounding is enforced by combining scraped page context with a strict output instruction: essential answer only plus three justification bullets.
Output formatting can be swapped (JSON or poem) without changing the retrieval-and-grounding mechanism.
The system relies on serial tool calling because scraping can’t happen until SerpAPI returns URLs.
Recent-event examples (Sam Altman) are used to illustrate why grounding matters beyond training-time knowledge.
