Case-Based Reasoning | Introduction and Applications in Sports Science | Recommender Systems
Based on Ciara Feely's video on YouTube.
Case-based reasoning stores past problem-solution pairs in a case base and uses them to solve new problems by analogy.
Briefing
Case-based reasoning makes predictions and recommendations by reusing solutions from similar past examples. It fits endurance sports naturally, because athletes’ training histories and outcomes accumulate over time. The core idea is straightforward: store past “cases” that each pair a problem description with a known solution; when a new case arrives, find the most similar prior cases, adapt their solutions, and add the new experience back into the case base for future use.
The method draws from cognitive science, which studies how people solve problems using experience. Everyday examples illustrate the mechanism: commuters estimate travel time based on previously seen traffic patterns on specific routes; bakers use earlier recipes as templates for new ones; doctors interpret a patient’s symptoms by comparing them to past presentations of the same illness. In each scenario, the “problem” is described (traffic conditions, recipe goals, symptoms), the “solution” is known (arrival time, a new recipe, a diagnosis), and the system maintains a library of past problem-solution pairs.
Operationally, case-based reasoning runs through four steps. First, retrieve: pull the most similar cases from the case base. Second, reuse: start with the solutions associated with those retrieved cases. Third, revise: adjust the reused solution to fit the new case, since no two problems are identical. Fourth, retain: store the newly solved case back into the case base so the system improves over time.
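The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the video: the `Case` class, the Euclidean distance, the averaging in `reuse`, the additive `correction` in `revise`, and the commute-style toy numbers are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist


@dataclass
class Case:
    problem: tuple[float, ...]  # numeric description of the problem
    solution: float             # known outcome for that problem


def retrieve(case_base: list[Case], query: tuple[float, ...], k: int = 3) -> list[Case]:
    """Retrieve: return the k stored cases closest to the query (Euclidean distance)."""
    return sorted(case_base, key=lambda c: dist(c.problem, query))[:k]


def reuse(neighbors: list[Case]) -> float:
    """Reuse: start from the average of the retrieved solutions."""
    return sum(c.solution for c in neighbors) / len(neighbors)


def revise(proposed: float, correction: float = 0.0) -> float:
    """Revise: adjust the proposed solution (here, a plain additive correction)."""
    return proposed + correction


def retain(case_base: list[Case], query: tuple[float, ...], solution: float) -> None:
    """Retain: store the solved case so later queries can draw on it."""
    case_base.append(Case(tuple(query), solution))


# Hypothetical commute cases: (distance in km, traffic level 1-3) -> minutes.
case_base = [
    Case((10.0, 1.0), 20.0),
    Case((10.0, 3.0), 30.0),
    Case((25.0, 2.0), 45.0),
]

query = (10.0, 2.0)
neighbors = retrieve(case_base, query, k=2)
proposed = reuse(neighbors)                 # average of the two nearest solutions
final = revise(proposed, correction=2.0)    # e.g. allow two extra minutes
retain(case_base, query, final)
print(proposed, final, len(case_base))
```

In a real system the revise step is usually the hardest part, since it encodes domain knowledge about how the new problem differs from its neighbors; the constant correction here only marks where that logic would go.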
Two assumptions underpin the approach. Similar problems should lead to similar solutions, enabling adaptation from neighbors. And similar cases should recur often enough that the case base remains useful rather than becoming a collection of unrelated examples.
For implementation, the video points to k-nearest neighbors as a simple computational route to case-based reasoning. Given a similarity metric, k-nearest neighbors retrieves the k stored data points closest to a new query. A categorical target makes the task classification; a numeric target makes it regression. The retrieved neighbors’ labels (or values) guide the prediction, and the new outcome can then be added to the case base.
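As a rough sketch of that idea, the following pure-Python k-nearest neighbors handles both target types. It assumes Euclidean distance as the (dis)similarity measure, majority vote for classification, and a plain mean for regression; the function names and toy data are invented for illustration.

```python
from collections import Counter
from math import dist


def k_nearest(points, query, k):
    """Return the k (features, target) pairs closest to the query."""
    return sorted(points, key=lambda p: dist(p[0], query))[:k]


def knn_classify(points, query, k=3):
    """Categorical target: majority vote among the k nearest neighbors."""
    votes = Counter(label for _, label in k_nearest(points, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(points, query, k=3):
    """Numeric target: mean of the k nearest neighbors' values."""
    nearest = k_nearest(points, query, k)
    return sum(value for _, value in nearest) / len(nearest)


# Toy data, made up for the example.
labeled = [
    ((1.0, 1.0), "easy"),
    ((1.5, 2.0), "easy"),
    ((5.0, 5.0), "hard"),
    ((6.0, 5.5), "hard"),
    ((5.5, 6.5), "hard"),
]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

predicted_label = knn_classify(labeled, (1.2, 1.4), k=3)
predicted_value = knn_regress(numeric, (2.0,), k=3)
print(predicted_label, predicted_value)
```

Because distance is computed on raw feature values, features measured on very different scales would normally be normalized first; the sketch skips that step.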
In sports science research, the technique is tailored to marathon runners using training data collected 12 to 16 weeks before a race alongside actual marathon finish times. One use case treats the runner’s training as the problem and the finish time as the solution: for a new runner, the system retrieves similar runners from the case base and predicts a likely finish time based on their outcomes. Another use case flips the direction: if a runner has a goal finish time, that goal becomes the problem, and the system recommends training plans drawn from similar runners who achieved comparable targets. A third application aims at injury risk, using a runner’s training (and injury history) as the problem and whether they were injured at a given time as the solution, then estimating risk by comparing to prior injury patterns.
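The three use cases differ only in which fields play the role of problem and solution. The sketch below makes that concrete under loud assumptions: the `Runner` fields, the two training features, the nearest-neighbor averaging, and every number in `RUNNERS` are hypothetical and are not taken from the research described above.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Runner:
    weekly_km: float    # mean weekly distance during the training window
    long_run_km: float  # longest single training run
    finish_min: float   # marathon finish time in minutes
    injured: bool       # whether an injury was recorded (toy flag)


# Invented toy case base, for illustration only.
RUNNERS = [
    Runner(40.0, 24.0, 255.0, False),
    Runner(45.0, 26.0, 245.0, False),
    Runner(60.0, 30.0, 215.0, False),
    Runner(65.0, 32.0, 205.0, True),
    Runner(85.0, 35.0, 180.0, True),
]


def similar_by_training(weekly_km, long_run_km, k):
    """Retrieve the k runners whose training is closest to the query."""
    query = (weekly_km, long_run_km)
    return sorted(RUNNERS, key=lambda r: dist((r.weekly_km, r.long_run_km), query))[:k]


def predict_finish(weekly_km, long_run_km, k=2):
    """Problem: training. Solution: finish time (mean of the neighbors')."""
    nearest = similar_by_training(weekly_km, long_run_km, k)
    return sum(r.finish_min for r in nearest) / len(nearest)


def recommend_training(goal_min, k=2):
    """Problem: goal time. Solution: training of runners with similar times."""
    nearest = sorted(RUNNERS, key=lambda r: abs(r.finish_min - goal_min))[:k]
    return [(r.weekly_km, r.long_run_km) for r in nearest]


def injury_risk(weekly_km, long_run_km, k=3):
    """Problem: training. Solution: share of similar runners who were injured."""
    nearest = similar_by_training(weekly_km, long_run_km, k)
    return sum(r.injured for r in nearest) / len(nearest)


predicted = predict_finish(62.0, 31.0)
plans = recommend_training(210.0)
risk = injury_risk(80.0, 34.0)
print(predicted, plans, round(risk, 2))
```

A fuller version would describe training week by week and include injury history in the problem description, as the text notes; two summary features are used here only to keep the example short.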
Overall, case-based reasoning serves as a reusable, experience-driven engine for prediction and recommendation in endurance athletics. It can support finish-time forecasting, training planning, and injury risk estimation by learning from past runners and continuously updating its case base.
Cornell Notes
Case-based reasoning predicts outcomes by retrieving similar past examples, reusing their solutions, revising them for the new situation, and then storing the new solved case for future use. The approach relies on two assumptions: similar problems tend to have similar solutions, and similar cases occur often enough to make the case base valuable. In practice, k-nearest neighbors can implement the “retrieve” step by finding the most similar data points using a similarity metric, supporting either classification or regression. In marathon research, training data from 12–16 weeks before a race can serve as the problem description, with marathon finish time as the solution, enabling finish-time prediction for new runners. The same framework can recommend training plans for goal times or estimate injury risk using training and injury history.
What exactly counts as a “case” in case-based reasoning, and how is it structured?
Why does case-based reasoning use a four-step cycle (retrieve, reuse, revise, retain)?
What are the two key assumptions that make case-based reasoning work?
How does k-nearest neighbors connect to case-based reasoning?
How can marathon training data be used as inputs and outputs in this framework?
What does “revise” mean in sports-science recommendations?
Review Questions
- How do retrieve, reuse, revise, and retain differ, and which step most directly improves the system over time?
- In the marathon examples, what changes when the goal is to predict finish time versus recommend training versus estimate injury risk?
- Why do the two assumptions about similarity and recurring cases matter for the reliability of predictions?
Key Points
1. Case-based reasoning stores past problem-solution pairs in a case base and uses them to solve new problems by analogy.
2. The method follows a four-step loop: retrieve similar cases, reuse their solutions, revise for the new context, then retain the new solved case.
3. Case-based reasoning depends on similarity-preserving behavior (similar problems yield similar solutions) and on the recurrence of similar cases over time.
4. k-nearest neighbors can implement the retrieval step by selecting the k most similar data points using a similarity metric, enabling classification or regression.
5. In endurance sports research, runner training data from 12–16 weeks before a marathon can be used to predict marathon finish time by retrieving similar runners.
6. The same framework can recommend training plans from runners who achieved a given goal time by treating the goal as the problem and training as the solution.
7. Injury risk can be estimated by treating training (and injury history) as the problem and injury occurrence at a specific time as the solution.