Case-Based Reasoning | Introduction and Applications in Sports Science

TL;DR

Case-based reasoning stores past problem-solution pairs in a case base and uses them to solve new problems by analogy.

Briefing

Case-based reasoning makes predictions and recommendations by reusing solutions from similar past examples. The approach fits naturally with endurance sports, where athletes’ training histories and outcomes accumulate over time. The core idea is straightforward: store past “cases” that pair a problem description with a known solution; when a new case arrives, find the most similar prior cases, adapt their solutions, and add the new experience back into the case base for future use.

The method draws from cognitive science, which studies how people solve problems using experience. Everyday examples illustrate the mechanism: commuters estimate travel time based on previously seen traffic patterns on specific routes; bakers use earlier recipes as templates for new ones; doctors interpret a patient’s symptoms by comparing them to past presentations of the same illness. In each scenario, the “problem” is described (traffic conditions, recipe goals, symptoms), the “solution” is known (arrival time, a new recipe, a diagnosis), and the system maintains a library of past problem-solution pairs.

Operationally, case-based reasoning runs through four steps. First, retrieve: pull the most similar cases from the case base. Second, reuse: start with the solutions associated with those retrieved cases. Third, revise: adjust the reused solution to fit the new case, since no two problems are identical. Fourth, retain: store the newly solved case back into the case base so the system improves over time.
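The four steps can be sketched as a small Python class. This is a minimal illustration, not an implementation from the source: the names CaseBase and solve, the use of Euclidean distance, the mean as the reuse rule, and the additive correction in revise are all assumptions, and the commute figures are invented.

```python
import math
from dataclasses import dataclass, field


@dataclass
class CaseBase:
    """A list of (problem, solution) pairs; problems are numeric feature tuples."""
    cases: list = field(default_factory=list)

    def retrieve(self, problem, k=3):
        # Retrieve: the k stored cases whose problems are closest (Euclidean distance).
        return sorted(self.cases, key=lambda c: math.dist(c[0], problem))[:k]

    def reuse(self, neighbors):
        # Reuse: start from the neighbors' solutions (here, their mean).
        return sum(solution for _, solution in neighbors) / len(neighbors)

    def revise(self, proposed, correction=0.0):
        # Revise: adjust the proposal for the new case (placeholder additive correction).
        return proposed + correction

    def retain(self, problem, solution):
        # Retain: store the solved case so later queries can use it.
        self.cases.append((problem, solution))


def solve(case_base, problem, k=3, correction=0.0):
    """Run retrieve, reuse, revise, and retain for one new problem."""
    neighbors = case_base.retrieve(problem, k)
    proposed = case_base.reuse(neighbors)
    solution = case_base.revise(proposed, correction)
    case_base.retain(problem, solution)
    return solution


# Invented commute cases: (distance_km, traffic_level) -> travel time in minutes.
cb = CaseBase([((10.0, 1.0), 20.0), ((10.0, 3.0), 35.0), ((25.0, 2.0), 50.0)])
answer = solve(cb, (10.0, 2.0), k=2)
```

Here the two closest commutes are averaged, and the new case is appended to the case base. In a real system the retained solution would more plausibly be the observed outcome once it is known, rather than the prediction itself.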

Two assumptions underpin the approach. Similar problems should lead to similar solutions, enabling adaptation from neighbors. And similar cases should recur often enough that the case base remains useful rather than becoming a collection of unrelated examples.

For implementation, the transcript points to k-nearest neighbors as a simple computational route to case-based reasoning. Using a similarity metric, k-nearest neighbors retrieves the closest data points to a new query. If the target is categorical, it supports classification; if numeric, it supports regression. The retrieved neighbors’ labels (or values) guide the prediction, and the new outcome can then be added to the case base.
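The difference between the categorical and numeric cases can be shown in a few lines of plain Python. This sketch assumes Euclidean distance as the (inverse) similarity metric, a majority vote for classification, and an unweighted mean for regression; the source does not specify any of these, and the data points are made up.

```python
import math
from collections import Counter


def k_nearest(cases, query, k):
    """Return the k (features, target) pairs closest to the query."""
    return sorted(cases, key=lambda c: math.dist(c[0], query))[:k]


def knn_classify(cases, query, k=3):
    # Categorical target: majority vote among the neighbors' labels.
    labels = [target for _, target in k_nearest(cases, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(cases, query, k=3):
    # Numeric target: mean of the neighbors' values.
    values = [target for _, target in k_nearest(cases, query, k)]
    return sum(values) / len(values)


# Made-up labelled points for the classification case.
labelled = [
    ((1.0, 1.0), "a"), ((1.2, 0.9), "a"), ((0.8, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.1, 4.8), "b"),
]
predicted_label = knn_classify(labelled, (1.0, 1.0), k=3)

# Made-up numeric points for the regression case.
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
predicted_value = knn_regress(numeric, (2.0,), k=3)
```

The retrieval function is shared; only the rule for combining the neighbors' targets changes with the target type.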

In sports science research, the technique is tailored to marathon runners using training data collected 12 to 16 weeks before a race alongside actual marathon finish times. One use case treats the runner’s training as the problem and the finish time as the solution: for a new runner, the system retrieves similar runners from the case base and predicts a likely finish time based on their outcomes. Another use case flips the direction: if a runner has a goal finish time, that goal becomes the problem, and the system recommends training plans drawn from similar runners who achieved comparable targets. A third application aims at injury risk, using a runner’s training (and injury history) as the problem and whether they were injured at a given time as the solution, then estimating risk by comparing to prior injury patterns.
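The second use case, where a goal time is the problem and training is the solution, might look like the sketch below. The RunnerCase fields (weekly_km, long_run_km), the recommend_training function, and all records are hypothetical; the source does not describe which training features the research uses or how plans are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerCase:
    weekly_km: float    # hypothetical training feature
    long_run_km: float  # hypothetical training feature
    finish_min: float   # marathon finish time in minutes


def recommend_training(case_base, goal_min, k=2):
    """Goal time is the problem; training of the closest finishers is the solution."""
    nearest = sorted(case_base, key=lambda c: abs(c.finish_min - goal_min))[:k]
    n = len(nearest)
    # Average the retrieved runners' training as a starting recommendation.
    return {
        "weekly_km": sum(c.weekly_km for c in nearest) / n,
        "long_run_km": sum(c.long_run_km for c in nearest) / n,
    }


# Invented records for illustration only; not data from the research.
runners = [
    RunnerCase(40.0, 24.0, 270.0),
    RunnerCase(60.0, 30.0, 225.0),
    RunnerCase(70.0, 32.0, 215.0),
    RunnerCase(90.0, 35.0, 185.0),
]

plan = recommend_training(runners, goal_min=220.0)
```

The same case base serves both directions: sorting by training similarity yields a finish-time prediction, while sorting by finish time, as here, yields a training recommendation.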

Overall, case-based reasoning is framed as a reusable, experience-driven engine for prediction and recommendation in endurance athletics—one that can support finish-time forecasting, training planning, and injury risk estimation by learning from past runners and continuously updating its knowledge base.

Cornell Notes

Case-based reasoning predicts outcomes by retrieving similar past examples, reusing their solutions, revising them for the new situation, and then storing the new solved case for future use. The approach relies on two assumptions: similar problems tend to have similar solutions, and similar cases occur often enough to make the case base valuable. In practice, k-nearest neighbors can implement the “retrieve” step by finding the most similar data points using a similarity metric, supporting either classification or regression. In marathon research, training data from 12–16 weeks before a race can serve as the problem description, with marathon finish time as the solution, enabling finish-time prediction for new runners. The same framework can recommend training plans for goal times or estimate injury risk using training and injury history.

What exactly counts as a “case” in case-based reasoning, and how is it structured?

A case pairs a problem description with a known solution. In the transcript’s examples, a doctor’s case is a patient’s symptoms (problem) paired with a diagnosis (solution). In the marathon setting, the problem can be a runner’s training data (12–16 weeks before the marathon) and the solution can be the runner’s actual marathon finish time. The case base is the collection of all prior problem-solution pairs.
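As a data structure, a case can be as simple as a record with two fields. The sketch below assumes a Python dataclass named Case and hypothetical training features; the values are invented.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Case:
    problem: Any   # description of the situation
    solution: Any  # the known outcome or answer


# The case base is just the collection of prior problem-solution pairs.
# Feature names and values are invented for illustration.
marathon_cases = [
    Case(problem={"weekly_km": 55.0, "longest_run_km": 30.0}, solution=231.0),
    Case(problem={"weekly_km": 80.0, "longest_run_km": 34.0}, solution=198.0),
]

finish_times = [case.solution for case in marathon_cases]
```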

Why does case-based reasoning use a four-step cycle (retrieve, reuse, revise, retain)?

Retrieve pulls the most similar cases from the case base. Reuse starts from the solutions attached to those retrieved cases. Revise adapts the reused solution because the new problem won’t match the old one exactly. Retain then adds the newly solved case back into the case base, so future predictions can benefit from the new experience.

What are the two key assumptions that make case-based reasoning work?

First, the world is “similarity-preserving”: similar problems should have similar solutions, which justifies adapting solutions from retrieved neighbors. Second, similar cases must show up regularly; otherwise the case base would contain mostly unrelated examples and retrieval would fail to find useful matches.

How does k-nearest neighbors connect to case-based reasoning?

k-nearest neighbors provides a straightforward way to retrieve similar cases. Given a new data point, it finds the k closest data points using a similarity metric. If the solution is categorical, the neighbors’ labels support classification; if the solution is numeric, the neighbors’ values support regression. After producing an answer, the new case can be retained in the case base.

How can marathon training data be used as inputs and outputs in this framework?

One setup treats training as the problem and finish time as the solution: for a new runner, the system retrieves similar runners and predicts a finish time based on their known outcomes. Another setup treats a goal finish time as the problem and recommends a training plan as the solution by retrieving runners with similar characteristics who achieved that goal. A third setup uses training (plus injury history) as the problem and injury occurrence at a specific time as the solution to estimate injury risk.

What does “revise” mean in sports-science recommendations?

Revise reflects that no two runners are identical. Even after retrieving similar runners, the system must adjust the borrowed solution—such as modifying a predicted finish time estimate or tailoring a recommended training plan—so it fits the new runner’s specific training profile and circumstances.
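One possible form of revision is to shift the neighbors' average finish time according to how the new runner's training differs from theirs. The sketch below is only an illustration of that idea: the function revise_finish_time and the minutes_per_km rate are hypothetical, and the source does not say how the research adapts retrieved solutions.

```python
def revise_finish_time(neighbors, new_weekly_km, minutes_per_km=-0.5):
    """Adjust the neighbors' mean finish time for a difference in weekly volume.

    neighbors: list of (weekly_km, finish_min) pairs already retrieved.
    minutes_per_km: hypothetical adjustment rate, not a value from the source.
    """
    mean_km = sum(km for km, _ in neighbors) / len(neighbors)
    mean_finish = sum(t for _, t in neighbors) / len(neighbors)
    # Reused solution plus a correction proportional to the volume gap.
    return mean_finish + minutes_per_km * (new_weekly_km - mean_km)


retrieved = [(60.0, 225.0), (70.0, 215.0)]  # invented values
revised = revise_finish_time(retrieved, new_weekly_km=75.0)
```

With these invented numbers, the reused estimate is the neighbors' mean, and the revision moves it in proportion to the extra weekly volume.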

Review Questions

How do retrieve, reuse, revise, and retain differ, and which step most directly improves the system over time?
In the marathon examples, what changes when the goal is to predict finish time versus recommend training versus estimate injury risk?
Why do the two assumptions about similarity and recurring cases matter for the reliability of predictions?

Key Points

1. Case-based reasoning stores past problem-solution pairs in a case base and uses them to solve new problems by analogy.
2. The method follows a four-step loop: retrieve similar cases, reuse their solutions, revise for the new context, then retain the new solved case.
3. Case-based reasoning depends on similarity-preserving behavior (similar problems yield similar solutions) and on the recurrence of similar cases over time.
4. k-nearest neighbors can implement the retrieval step by selecting the k most similar data points using a similarity metric, enabling classification or regression.
5. In endurance sports research, runner training data from 12–16 weeks before a marathon can be used to predict marathon finish time by retrieving similar runners.
6. The same framework can recommend training plans from runners who achieved a given goal time by treating the goal as the problem and training as the solution.
7. Injury risk can be estimated by treating training (and injury history) as the problem and injury occurrence at a specific time as the solution.

Highlights

Case-based reasoning turns experience into predictions by retrieving similar past cases, adapting their solutions, and adding new outcomes back into the case base.

The four-step workflow—retrieve, reuse, revise, retain—captures both prediction and continuous learning.

k-nearest neighbors provides a practical retrieval mechanism for case-based reasoning, supporting both classification and regression.

Marathon applications can use 12–16 week training histories to forecast finish times, recommend training for goal times, or estimate injury risk. 

Topics

Case-Based Reasoning
Recommender Systems
Sports Science
Endurance Training
k-Nearest Neighbors
