The NEAT Algorithm is Neat
Based on sentdex's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
NEAT evolves network topology (nodes and connections) rather than relying on a fixed architecture, enabling structure discovery alongside weight tuning.
Briefing
NEAT’s core promise is structural learning: instead of keeping a fixed neural-network shape and only tuning weights, it evolves the network topology itself—adding and removing nodes and connections—while also adjusting weights and biases. That shift matters because it lets a relatively lightweight, CPU-friendly evolutionary search discover working controllers without the heavy training pipelines associated with modern GPU reinforcement learning. In practice, the walkthrough uses NEAT to solve classic OpenAI Gym control tasks by repeatedly evaluating candidate networks, scoring them by episode performance, and letting the population evolve until a fitness threshold is met.
The setup starts with the NEAT Python library and the NEAT algorithm’s defining idea: begin with a simple network and let evolution grow it toward better performance. The guide emphasizes that NEAT was published in 2002, predating today’s GPU-centric deep learning era, and argues that many problems NEAT can handle don’t require long training runs. For experimentation, OpenAI Gym environments are used because they provide straightforward observation vectors and discrete or bounded action spaces.
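As a rough sketch of how that setup usually looks with neat-python (the config filename `config-cartpole.txt` and the 50-generation cap are placeholder choices, not taken from the walkthrough):

```python
import neat

def eval_genomes(genomes, config):
    """Score every genome; the body is filled in per environment in the sections below."""
    for genome_id, genome in genomes:
        genome.fitness = 0.0

def run(config_path):
    # Standard neat-python configuration: the genome, reproduction, species-set,
    # and stagnation sections are all read from the config file.
    config = neat.Config(neat.DefaultGenome, neat.DefaultReproduction,
                         neat.DefaultSpeciesSet, neat.DefaultStagnation,
                         config_path)

    population = neat.Population(config)           # starts from minimal networks
    population.add_reporter(neat.StdOutReporter(True))
    population.add_reporter(neat.StatisticsReporter())

    # Evolve until the config's fitness_threshold is reached (or 50 generations pass).
    return population.run(eval_genomes, 50)

if __name__ == "__main__":
    winner = run("config-cartpole.txt")            # hypothetical config filename
```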
For CartPole, the process is concrete. The environment produces a four-value observation vector (cart position, cart velocity, pole angle, pole angular velocity), and the action space is discrete: push the cart left or right. The NEAT evaluation loop resets the environment, feeds the current observation through the network via `net.activate`, converts the network outputs into an action (`np.argmax` in the discrete case), steps the environment, and accumulates fitness from the per-frame reward (one point for every frame the pole stays up). A NEAT configuration file defines the network's input and output sizes, the activation function (the walkthrough favors `clamped`, which restricts outputs to the range -1 to 1), the population size, and the stopping criterion. After some debugging around import/config naming and fitness thresholds, the system reaches the target performance quickly, balancing the pole for the environment's maximum episode length.
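A minimal version of that CartPole evaluation loop might look like the following. It assumes the classic `gym` API, where `reset` returns just the observation and `step` returns a 4-tuple; newer Gymnasium releases return `(obs, info)` from `reset` and a 5-tuple from `step`, and the walkthrough's own code may be structured differently:

```python
import gym
import numpy as np
import neat

def eval_genomes(genomes, config):
    env = gym.make("CartPole-v1")
    for genome_id, genome in genomes:
        # Build a feed-forward network from this genome's nodes and connections.
        net = neat.nn.FeedForwardNetwork.create(genome, config)
        observation = env.reset()        # cart pos, cart vel, pole angle, pole angular vel
        genome.fitness = 0.0
        done = False
        while not done:
            output = net.activate(observation)   # two output values
            action = int(np.argmax(output))      # 0 = push left, 1 = push right
            observation, reward, done, info = env.step(action)
            genome.fitness += reward             # +1 for every frame the pole stays up
    env.close()
```

The matching config file would set `num_inputs = 4` and `num_outputs = 2` so the observation and action shapes line up, along with `activation_default = clamped`, the population size, and `fitness_threshold`.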
The same training scaffold is then adapted to BipedalWalker. Here the observation is larger (a 24-value state vector), and the action space is continuous with four dimensions, so the network needs four outputs. The evaluation loop changes accordingly: instead of taking an argmax, the network's outputs are used directly as the action, and fitness is tied to forward progress (with a large penalty if the agent falls). The walkthrough notes that this task takes longer and may require tuning NEAT hyperparameters such as population size to keep iteration times reasonable.
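For BipedalWalker only the rollout really changes; a sketch of the continuous-action version (again written against the classic `gym` 4-tuple `step` API) might look like this:

```python
import gym
import numpy as np
import neat

def eval_genome(genome, config):
    """Roll out one genome on BipedalWalker-v3 and return its fitness."""
    env = gym.make("BipedalWalker-v3")
    net = neat.nn.FeedForwardNetwork.create(genome, config)
    observation = env.reset()                      # 24-value state vector
    fitness, done = 0.0, False
    while not done:
        # Four continuous outputs; with a clamped activation they already sit in
        # [-1, 1], so they are used directly as the action -- no argmax here.
        action = np.array(net.activate(observation), dtype=np.float32)
        observation, reward, done, info = env.step(action)
        fitness += reward                          # forward progress; falling costs -100
    env.close()
    return fitness
```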
Finally, the guide extends NEAT beyond control tasks into Conway’s Game of Life. The “environment” is a grid of cells governed by birth/death rules, with no direct movement—only evolving patterns. Fitness is based on how long a structure survives, with additional logic to penalize trivial oscillations (including repeated frame sequences). The evolved networks learn to place initial live cells that generate stable or long-lived configurations, including recognizable motifs like gliders and a specific long-surviving structure nicknamed the “176er.” The result is a demonstration that NEAT can discover nontrivial, rule-driven behaviors—often by evolving surprisingly compact structures—so long as the task’s search space isn’t dominated by extreme complexity.
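The Game of Life fitness idea can be sketched independently of NEAT. Assuming the evolved network has already been queried to produce a 0/1 seed grid (how the walkthrough encodes that placement is not reproduced here), survival-based fitness with a simple oscillation guard might look like the following; the `max_steps` and `history` values are illustrative, not the video's:

```python
import numpy as np

def life_step(grid):
    """One Conway's Game of Life update on a 2D 0/1 NumPy array (toroidal wrap)."""
    neighbors = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
    birth = (neighbors == 3) & (grid == 0)
    survive = ((neighbors == 2) | (neighbors == 3)) & (grid == 1)
    return (birth | survive).astype(np.uint8)

def survival_fitness(seed_grid, max_steps=200, history=8):
    """Fitness = steps until the pattern dies out or starts repeating recent frames."""
    grid = seed_grid.copy()
    recent = []                        # hashes of the last few frames, to catch oscillators
    for step in range(max_steps):
        grid = life_step(grid)
        if grid.sum() == 0:            # everything died
            return step
        key = grid.tobytes()
        if key in recent:              # static pattern or trivial oscillation: stop counting
            return step
        recent = (recent + [key])[-history:]
    return max_steps
```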
Cornell Notes
NEAT (NeuroEvolution of Augmenting Topologies) evolves neural networks by changing their structure—adding/removing nodes and connections—rather than only tuning weights in a fixed architecture. The walkthrough shows how to use the NEAT Python library with OpenAI Gym environments by writing an evaluation function that runs each genome in an environment, converts observations to actions via `net.activate`, and accumulates fitness from rewards or progress. CartPole is solved by mapping network outputs to discrete actions (using `argmax`) and rewarding survival time. BipedalWalker requires a different action mapping because actions are continuous (network outputs are used directly), and fitness depends on forward movement and penalties for falling. The same evolutionary approach is then applied to Conway’s Game of Life by evolving initial cell placements to maximize survival time while discouraging repetitive oscillations.
What is the practical difference between NEAT and standard neural-network training?
How does the CartPole evaluation loop turn observations into actions and fitness?
Why does action mapping change between CartPole and BipedalWalker?
What role does the NEAT configuration file play in making training work?
How is Conway’s Game of Life framed as a NEAT fitness problem?
Review Questions
- In NEAT, what changes over generations besides weights, and why does that matter for choosing network architectures?
- For a discrete-action environment like CartPole, what transformation is typically applied to `net.activate` outputs to select an action?
- When adapting NEAT from CartPole to BipedalWalker, which configuration parameters and evaluation-loop details must be updated, and why?
Key Points
1. NEAT evolves network topology (nodes and connections) rather than relying on a fixed architecture, enabling structure discovery alongside weight tuning.
2. A NEAT evaluation function is the core integration point: run each genome in an environment, convert observations to actions via `net.activate`, and accumulate fitness from rewards/progress.
3. CartPole is solved by mapping network outputs to discrete actions (commonly using `np.argmax`) and rewarding survival time per frame.
4. BipedalWalker requires continuous action handling: network outputs are used directly as multi-dimensional actions, and fitness depends on forward progress with strong penalties for falling.
5. Activation function choice can matter for bounded action spaces; `clamped` is used because it naturally fits within expected ranges like -1 to 1.
6. NEAT can be applied beyond control tasks: Conway’s Game of Life can be treated as an optimization over initial conditions, with fitness tied to survival time and safeguards against trivial oscillations.