GPT-4 First Impression - A New Era Begins?
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
OpenAI’s GPT-4 arrives with a major jump in both capacity and capability—especially longer context and multimodal (text + image) understanding—while also emphasizing months of safety work to keep outputs aligned with how people want to use it. Early access through ChatGPT Plus lets users try GPT-4 directly, with a message cap and model options that trade speed for deeper reasoning.
A headline feature is GPT-4’s ability to handle far more text at once. The announced context length is 8,192 tokens, roughly twice that of GPT-3.5-era offerings like text-davinci-003. OpenAI also mentions limited access to a 32k-token context size, framed as about 50 pages of text. That scale matters because it changes what kinds of tasks fit in a single run: longer story drafts, bigger codebases, and more extensive document analysis without constantly trimming or summarizing.
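To make the scale concrete, here is a minimal sketch of checking whether a document fits in either announced window. The ~4 characters-per-token ratio is a rough heuristic for English prose, not a real tokenizer, and the model names and reply budget are illustrative assumptions.

```python
# Rough check of whether a document fits GPT-4's context windows.
# The ~4 characters/token ratio is a heuristic for English text, not
# a real tokenizer; exact counts require the model's own tokenizer.

CONTEXT_WINDOWS = {"gpt-4": 8_192, "gpt-4-32k": 32_768}

def estimate_tokens(text: str) -> int:
    """Crude token estimate: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def fits_in_context(text: str, model: str, reply_budget: int = 1_000) -> bool:
    """True if the prompt plus a reserved reply budget fits the window."""
    return estimate_tokens(text) + reply_budget <= CONTEXT_WINDOWS[model]

doc = "word " * 6_000  # ~30k characters, ~7,500 estimated tokens
print(fits_in_context(doc, "gpt-4"))      # too big once a reply is reserved
print(fits_in_context(doc, "gpt-4-32k"))  # comfortably fits the 32k window
```

The reserved reply budget reflects that the context window covers both prompt and completion, so a document that "fits" must still leave room for the answer.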
In hands-on tests, GPT-4 performs well on critique-and-rewrite workflows. Using an “AI critic” style prompt, it reviews a short story about “the last dragon rider” in a world where dragons are hunted to extinction. The critique highlights concrete weaknesses—predictability, lack of character development, pacing problems, overused clichés, and inconsistencies in the protagonist’s thought process. When asked to rewrite the story based on that feedback, the output shifts into more elevated, descriptive prose, with imagery and mood-setting that feels closer to a polished literary style than the original.
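The critique-and-rewrite workflow above can be sketched as a two-step loop. The prompts and the `ask` callable are illustrative assumptions; in practice `ask` would forward each prompt to GPT-4, while here a stub keeps the sketch runnable without an API key.

```python
# A minimal critique-and-rewrite loop. `ask` stands in for any chat-model
# call (the real workflow would send these prompts to GPT-4); the stub
# below makes the structure runnable offline.
from typing import Callable

CRITIC_PROMPT = (
    "You are an AI critic. List concrete weaknesses of this story: "
    "predictability, character development, pacing, cliches, consistency.\n\n"
)
REWRITE_PROMPT = "Rewrite the story below, addressing every issue in the critique.\n\n"

def critique_and_rewrite(story: str, ask: Callable[[str], str]) -> tuple[str, str]:
    """Step 1: critique the story. Step 2: rewrite it using that critique."""
    critique = ask(CRITIC_PROMPT + story)
    rewrite = ask(REWRITE_PROMPT + "Critique:\n" + critique + "\n\nStory:\n" + story)
    return critique, rewrite

# Stub model: returns canned text so the loop can be exercised end to end.
def stub_model(prompt: str) -> str:
    if prompt.startswith(CRITIC_PROMPT):
        return "critique: pacing issues"
    return "revised story"

critique, rewrite = critique_and_rewrite("The last dragon rider...", stub_model)
```

Feeding the critique back into the rewrite prompt is what turns two independent generations into the loop described above.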
GPT-4 also handles absurd, constraint-heavy instructions with structured, step-by-step planning. A prompt about moving snow from Norway to the Sahara is met with a logistics plan: obtain permits, plan routes and infrastructure, use refrigerated transport (trucks, trains, planes), hire logistics support, prepare insulated containers, monitor transit, deposit and distribute the snow at a chosen desert site, and then document and publicize the effort. The plan explicitly flags environmental, social, and economic impacts—turning a joke premise into a checklist-like response.
Beyond text, GPT-4’s multimodal ability is presented as a practical expansion of what users can ask. It can accept images alongside prompts, enabling tasks like describing a multi-panel image “panel by panel,” interpreting diagrams or graphs to answer questions, and spotting unusual details in everyday scenes. Examples include identifying components in an image of a lightning-to-VGA adapter package, reasoning over a chart about average daily meat consumption across regions, and describing an odd situation in a photo of a man ironing clothes on an ironing board attached to a moving taxi.
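A request like "describe this image panel by panel" has to carry both the text and the image in one message. The "content parts" payload shape below follows later OpenAI-style chat conventions and is an assumption here; GPT-4's image input was not yet publicly available at the time of these first impressions.

```python
# Sketch of packaging an image + text prompt as one multimodal chat message.
# The content-parts shape is an assumption modeled on later OpenAI-style
# chat payloads, not a confirmed API from the video.
import base64

def image_message(question: str, image_bytes: bytes,
                  mime: str = "image/png") -> dict:
    """Build one user message with a text part and an inline image part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

msg = image_message("Describe this image panel by panel.", b"\x89PNG...")
```

Inlining the image as a base64 data URL keeps the message self-contained; a hosted image URL would work the same way in the `image_url` field.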
Access details include a ChatGPT Plus cap of 100 messages every four hours for GPT-4, and the interface offers model choices that include a slower “reasoning” option. The overall takeaway from these first impressions is clear: GPT-4’s longer context, stronger critique-and-generation loop, and image understanding broaden what can be attempted in a single conversation—while safety and alignment work aims to keep those capabilities usable rather than chaotic.
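The "100 messages every four hours" cap behaves like a sliding window, which a client could track locally as sketched below. The cap value comes from the video; the tracker itself is an illustrative sketch, not anything ChatGPT exposes.

```python
# Client-side tracker for a "100 messages per 4 hours" cap, modeled as a
# sliding window of send timestamps. Illustrative sketch only; the cap
# value is taken from the video.
from collections import deque

class MessageCap:
    def __init__(self, limit: int = 100, window_s: float = 4 * 3600):
        self.limit, self.window_s = limit, window_s
        self.sent: deque = deque()  # timestamps of recent sends

    def try_send(self, now: float) -> bool:
        """Record a send at `now` if under the cap; return whether it went out."""
        while self.sent and now - self.sent[0] >= self.window_s:
            self.sent.popleft()  # drop sends older than the window
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True

cap = MessageCap()
sent = sum(cap.try_send(now=float(i)) for i in range(150))
print(sent)  # only 100 of 150 rapid messages are allowed
```

A sliding window (rather than a fixed four-hour block) means capacity frees up continuously as old sends age out.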
Cornell Notes
GPT-4 brings a step-change in what can fit into one prompt—8,192 tokens by default and an announced 32k context mode (framed as ~50 pages). That expanded context supports longer writing, deeper analysis, and fewer interruptions for summarization. Early tests through ChatGPT Plus show GPT-4 can critique a story by pointing to specific issues (predictability, weak character development, pacing, clichés, and inconsistencies) and then rewrite it with more developed language. It also follows complex, multi-step instructions, even for unrealistic scenarios like transporting snow from Norway to the Sahara, while addressing practical and impact considerations. A key differentiator is multimodal input: GPT-4 can interpret images and answer questions about diagrams, products, and unusual scenes.
- What does GPT-4’s context window change for real tasks?
- How does GPT-4 perform in a critique-and-rewrite workflow?
- Why is the “snow from Norway to the Sahara” prompt a useful demonstration?
- What does multimodal capability add compared with text-only prompting?
- What access and usage constraints matter when trying GPT-4 in ChatGPT Plus?
Review Questions
- How do the 8,192-token and 32k-token context claims affect what kinds of prompts can be completed in one pass?
- Describe the specific categories of weaknesses GPT-4 identified in the dragon-rider story, and explain how those categories influenced the rewrite.
- Give two examples from the transcript of image-based tasks GPT-4 can handle, and explain what the model had to infer from the visuals.
Key Points
1. GPT-4’s announced context length is 8,192 tokens, with an additional 32k context mode framed as about 50 pages of text.
2. ChatGPT Plus provides GPT-4 access with a cap of 100 messages every four hours, and model choices that trade speed for reasoning.
3. A critique-and-rewrite loop can work effectively: GPT-4 can identify story problems (predictability, character development, pacing, clichés, inconsistencies) and then revise the prose accordingly.
4. Complex, multi-step instructions—even unrealistic ones—can be converted into structured plans with practical steps and impact considerations.
5. GPT-4’s multimodal capability allows it to interpret images and answer questions about diagrams, product images, and unusual real-world scenes.
6. OpenAI highlights months of safety and alignment work to make GPT-4 outputs more usable and aligned with user intent.