ADVANCED ChatGPT Prompt Engineering: 7+ Chain Prompts in the Tree of Thoughts Principle
Based on All About AI's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Chain prompting built on the “tree of thought” idea is presented as a practical way to beat brittle single-shot answers from GPT-style models. The core method is a seven-step loop: start by stating the problem, generate multiple candidate solutions, have the model evaluate and rank them, discard the weakest options, then repeatedly brainstorm new competitors that build on the current best idea. After several rounds of “fierce competition,” the process outputs a refined winner plus a deeper analysis of how that winner could work.
The workflow begins with Prompt 1 defining the problem in plain terms. Prompt 2 asks for three distinct solutions, explicitly factoring in the most important outcome drivers. Prompt 3 evaluates each solution and assigns a numeric probability of success (scored on a 1–100 scale in the example). Prompt 4 then removes the two lowest-ranked ideas and compresses the remaining best option into a single “winning ID” summary that includes its probability. The loop starts again: Prompt 5 keeps the winning idea and asks for two new creative alternatives that compete with it, producing three candidates total. Prompt 6 evaluates and ranks the new set, and Prompt 7 repeats the “keep the best, drop the rest” step, cycling this process about five times to search for a stronger overall answer.
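As a rough sketch of how these seven prompts could be chained, the templates and loop below use illustrative wording and hypothetical names (PROMPT_2_BRAINSTORM, run_chain, ask, and so on); they are an assumption drawn from the description above, not the video's exact prompts or code.

```python
# Illustrative templates for the seven-prompt chain; the wording is a sketch,
# not the video's exact prompts. Braces mark text carried over between steps.

PROMPT_2_BRAINSTORM = (
    "Problem: {problem}\n"
    "Propose three distinct solutions, taking into account the factors "
    "that matter most for the outcome."
)

PROMPT_3_EVALUATE = (
    "Candidate solutions:\n{solutions}\n"
    "Evaluate each one, assign a probability of success from 1 to 100, "
    "and rank them."
)

PROMPT_4_PRUNE = (
    "Evaluations:\n{evaluations}\n"
    "Discard the two lowest-ranked solutions and summarize the winning "
    "solution in one short paragraph, including its probability score."
)

PROMPT_5_COMPETE = (
    "Winning solution so far:\n{winner}\n"
    "Keep it, and brainstorm two new creative alternatives that compete "
    "with it, for three candidates in total."
)

# Prompts 6 and 7 reuse the evaluate and prune templates on each new set,
# so the loop below simply cycles 5 -> 6 -> 7 for the requested rounds.


def run_chain(problem: str, ask, rounds: int = 5) -> str:
    """Run the generate -> rank -> prune loop and return the final winner.

    `ask` is any callable that sends a prompt string to the model and
    returns the reply text (a thin wrapper around an LLM API, for example).
    """
    # Prompt 1 in the video simply states the problem; here it is folded
    # into the brainstorm template via {problem}.
    solutions = ask(PROMPT_2_BRAINSTORM.format(problem=problem))      # Prompt 2
    evaluations = ask(PROMPT_3_EVALUATE.format(solutions=solutions))  # Prompt 3
    winner = ask(PROMPT_4_PRUNE.format(evaluations=evaluations))      # Prompt 4

    for _ in range(rounds):
        challengers = ask(PROMPT_5_COMPETE.format(winner=winner))          # Prompt 5
        evaluations = ask(PROMPT_3_EVALUATE.format(solutions=challengers)) # Prompt 6
        winner = ask(PROMPT_4_PRUNE.format(evaluations=evaluations))       # Prompt 7
    return winner
```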
A relationship scenario demonstrates the mechanics. The problem: a 27-year-old considering breaking up after six years, citing stagnation and growing apart. The first round yields three approaches—direct communication, gradual distance, and mutual decision. After ranking, the direct communication option scores highest (85), so it becomes the anchor for the next loop. In subsequent rounds, the model keeps that winning approach while generating two fresh alternatives to challenge it, aiming to avoid getting stuck in the same narrow set of ideas. After five loops, the direct communication approach remains the top choice at 85, with a close runner-up at 80 (a “therapeutic intervention approach”). The presenter notes a practical limitation: as loops continue, the model can run out of context, which can cause repetition.
The final step adds depth. A prompt to “deepen the thought process” produces scenario planning: implementation strategies, potential partnerships and resources, obstacles and mitigations, and possible unexpected outcomes and responses.
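A hypothetical wording for that final prompt, again an assumption rather than a quote from the video, might look like:

```python
# Hypothetical phrasing for the final refinement prompt; use it with the
# winner returned by run_chain in the sketch above.
PROMPT_DEEPEN = (
    "Winning solution:\n{winner}\n"
    "Deepen the thought process: outline implementation strategies, "
    "potential partnerships and resources needed, obstacles and how to "
    "mitigate them, and unexpected outcomes with possible responses."
)
```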
To make the technique usable, the transcript shifts from manual prompting to automation. A Python script is described as chaining the prompts and running the loop automatically, wrapped in a simple web UI with a progress bar. The script is then tested on a classic common-sense failure: drying clothes. The puzzle states that five wet items take five hours to dry and asks how long 30 items would take; the single-shot answer attributed to GPT-4 in the story is 30 hours. Running the same problem through the tree-of-thought chain yields a different result: drying all 30 simultaneously would still take five hours, assuming similar conditions and adequate airflow. The takeaway is that structured exploration (generate, rank, prune, and iterate) can improve reasoning and surface hidden assumptions that a single pass might miss.
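The transcript does not show the script itself, so the following is only a minimal sketch of what such automation could look like, assuming the `openai` Python package (v1 or later) and reusing `run_chain` and `PROMPT_DEEPEN` from the sketches above; the web UI and progress bar are omitted.

```python
# Minimal command-line runner; assumes OPENAI_API_KEY is set in the environment
# and that run_chain / PROMPT_DEEPEN from the earlier sketches are defined in
# the same file or importable.
from openai import OpenAI

client = OpenAI()


def ask(prompt: str, model: str = "gpt-4") -> str:
    """Send a single prompt to the chat completions API and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    problem = (
        "Drying 5 items of clothing outside takes 5 hours. "
        "How long would it take to dry 30 items under similar conditions?"
    )
    winner = run_chain(problem, ask, rounds=5)
    print("Winning solution:\n", winner)
    print("\nDeeper analysis:\n", ask(PROMPT_DEEPEN.format(winner=winner)))
```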
Cornell Notes
The transcript presents a chain-prompting method based on “tree of thought” search. It starts by generating multiple candidate solutions, then has the model evaluate and rank them with numeric probability scores, discarding the weakest options. The best remaining idea is fed back into a loop where two new alternatives are brainstormed to compete against it, repeating about five times to find a stronger winner. A final prompt then expands the winning idea into implementation scenarios, resources, obstacles, and unexpected responses. The approach matters because it turns one-shot answers into iterative reasoning, and the examples suggest it can fix common-sense-style errors (like the clothes-drying problem) that a single response may get wrong.
How does the prompting loop decide which ideas survive to the next round?
What does “tree of thought” mean operationally in this workflow?
Why does the relationship example end up favoring “direct communication” at 85?
What limitation appears when the loop runs many iterations?
How does the clothes-drying example illustrate the benefit of chain prompting?
What role does automation play in making this method practical?
Review Questions
- In the described loop, what exact steps correspond to branching, evaluation, and pruning?
- Why might repeated loops lead to less variety in candidate solutions?
- In the clothes-drying scenario, what assumption is necessary for the “five hours for 30 clothes” answer to hold?
Key Points
1. Generate multiple candidate solutions first, then force numeric evaluation and ranking before choosing anything.
2. Prune aggressively: discard the two lowest-ranked ideas and keep only the highest-scoring “winning ID.”
3. Use the winning idea as an anchor for the next round, then brainstorm two new alternatives to keep exploration alive.
4. Repeat the generate–rank–prune cycle several times (about five in the example) to search for a stronger final answer.
5. Add a final refinement step that turns the winning idea into actionable scenarios, resources, obstacles, and contingencies.
6. Automate the prompt chain with a Python script and a simple UI to reduce manual time and make testing easier.
7. Expect context-window limits to affect later iterations, potentially increasing repetition.