Matplotlib Tutorial (Part 8): Plotting Time Series Data
Based on Corey Schafer's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
Time-series plotting in Matplotlib hinges on treating dates as real datetime objects—not plain strings—and then using Matplotlib’s date formatting tools to make the x-axis readable. The tutorial starts with a simple seven-day dataset created directly in Python, plots it with `plt.plot_date`, and then adjusts the line style so points connect into a continuous trend. From there, it focuses on two practical upgrades that matter in real charts: automatic rotation/alignment of date tick labels and custom date label formats (like “May 24” instead of the default year-month-day).
The first example builds a list of consecutive `datetime` values and a matching list of y-values. Plotting is straightforward: pass dates as the x-axis and the numeric series as the y-axis. The initial output may show markers rather than a line, but setting `linestyle='solid'` produces a connected time series; markers can be disabled with `marker=None` if desired. To improve readability, the chart uses Matplotlib’s figure-level auto-formatting for dates—calling `autofmt_xdate()` on the current figure—so tick labels rotate and spread out instead of bunching together.
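The first example can be sketched roughly as follows. The start date and y-values are made-up sample data, not taken from the tutorial. Recent Matplotlib releases deprecate `plot_date` in favor of `plot`, which accepts datetime objects directly, so this sketch uses `plt.plot`; with `plot` the connecting line is the default and markers are opt-in.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Seven consecutive days; the start date and values are illustrative assumptions.
start = datetime(2019, 5, 24)
dates = [start + timedelta(days=i) for i in range(7)]
y = [0, 1, 3, 4, 6, 5, 7]

# Datetimes go on the x-axis, numbers on the y-axis.
# linestyle='solid' connects the points; marker='o' keeps each day visible.
(line,) = plt.plot(dates, y, linestyle="solid", marker="o")

# Rotate and right-align the date tick labels so they do not bunch together.
plt.gcf().autofmt_xdate()

plt.tight_layout()
```

In an interactive session, `plt.show()` would display the figure; dropping `marker="o"` leaves only the line.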
Next comes the key formatting step: changing how dates appear on the x-axis. The tutorial imports `matplotlib.dates` as `mpl_dates` and creates an `mpl_dates.DateFormatter` with a format string built from Python’s `strftime` codes. After retrieving the current axes, it applies the formatter to the x-axis via `ax.xaxis.set_major_formatter(...)`. The example format combines abbreviated month, day, and four-digit year (e.g., `%b %d %Y`), so labels read like “May 24” followed by the year rather than a full numeric date string.
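A minimal sketch of the formatting step, assuming the same made-up seven-day data as a stand-in for the tutorial's dataset. It formats one date by hand so the label pattern can be checked without opening the figure.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib import dates as mpl_dates

# Illustrative sample data (assumed, not from the tutorial).
dates = [datetime(2019, 5, 24) + timedelta(days=i) for i in range(7)]
y = [0, 1, 3, 4, 6, 5, 7]

plt.plot(dates, y, linestyle="solid")
plt.gcf().autofmt_xdate()

# strftime-style pattern: abbreviated month, day of month, four-digit year.
date_format = mpl_dates.DateFormatter("%b %d %Y")
plt.gca().xaxis.set_major_formatter(date_format)

# Preview a tick label by formatting the first date directly.
sample_label = date_format(mpl_dates.date2num(dates[0]))
print(sample_label)  # May 24 2019
```

Other `strftime` codes slot into the same pattern; for example, `"%a %b %d"` would put the abbreviated day of the week in front of the month and day.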
The real-world section loads Bitcoin OHLCV data from a CSV using pandas. The CSV includes a `date` column plus columns such as open, high, low, close, adjusted close, and volume. The code initially plots `price_date` (from the CSV’s `date` column) against `price_close` using `plot_date`, but the x-axis renders incorrectly because the dates are still strings and are drawn in file order rather than chronological order. A deliberate out-of-order date row demonstrates the problem: the chart places “May 17” at the end rather than at its correct chronological position.
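The string problem can be checked before plotting anything. The sketch below uses a hypothetical three-row excerpt with made-up prices and assumed column names (`date`, `close`), read from an in-memory string instead of the tutorial's CSV file.

```python
import io

import pandas as pd

# Hypothetical excerpt with made-up prices; the last row is deliberately out of order.
csv_text = """date,close
2019-05-18,100.0
2019-05-19,110.0
2019-05-17,105.0
"""

data = pd.read_csv(io.StringIO(csv_text))
price_date = data["date"]

# read_csv leaves the column as text unless it is told to parse dates.
is_datetime = pd.api.types.is_datetime64_any_dtype(price_date)
print(is_datetime)  # False

# As plain strings the values keep file order, so 2019-05-17 stays last.
print(price_date.tolist())
```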
The fix is twofold. First, convert the `date` column to pandas datetime objects with `pd.to_datetime(...)`, replacing the string values. Second, sort the data by date so the time series is ordered correctly; the tutorial uses `sort_values` (with `inplace=True`) to reorder the DataFrame without needing to reassign it. After these changes, Matplotlib receives properly typed datetimes, and the x-axis behaves as expected.
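The two-step fix might look like the sketch below. It reuses a hypothetical three-row excerpt with made-up prices and assumed column names (`date`, `close`), and it plots with `plt.plot`, which accepts the converted datetimes directly.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical excerpt with made-up prices; the last row is deliberately out of order.
csv_text = """date,close
2019-05-18,100.0
2019-05-19,110.0
2019-05-17,105.0
"""
data = pd.read_csv(io.StringIO(csv_text))

# Step 1: replace the string column with real datetime values.
data["date"] = pd.to_datetime(data["date"])

# Step 2: sort chronologically, modifying the DataFrame in place.
data.sort_values("date", inplace=True)

price_date = data["date"]
price_close = data["close"]

plt.plot(price_date, price_close, linestyle="solid")
plt.gcf().autofmt_xdate()
```

Passing `parse_dates=["date"]` to `pd.read_csv` is another way to get datetimes at load time; the sort step is still needed when rows are out of order.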
The session closes by previewing a follow-up on real-time plotting—useful for monitoring data that updates continuously from APIs or sensors—while reinforcing that the core workflow for time series is: parse dates correctly, sort chronologically, plot with `plot_date`, and then format the x-axis for human readability.
Cornell Notes
Matplotlib time-series charts work best when the x-axis uses true datetime objects rather than string dates. The tutorial first plots a small list of consecutive Python `datetime` values, then improves readability by connecting points with a solid line and using `autofmt_xdate()` to rotate and align tick labels. It then customizes x-axis labels with `matplotlib.dates.DateFormatter` and a `strftime`-style format string (e.g., abbreviated month, day, year). For CSV data (Bitcoin prices), it shows that plotting fails visually when the `date` column is still strings; converting with `pd.to_datetime` and sorting by date fixes both ordering and formatting.
Why do time-series plots break when dates come from a CSV as strings?
What is the simplest way to plot a basic time series with Matplotlib?
How does `autofmt_xdate()` improve date-axis readability?
How can x-axis date labels be reformatted (e.g., “May 24” instead of full numeric dates)?
What pandas steps ensure the CSV time series is plotted in correct chronological order?
Review Questions
- When would you choose `marker=None` versus leaving markers on a time-series plot?
- What two changes are required to make a CSV-based date column plot correctly on the x-axis?
- How would you modify the `DateFormatter` format string to show day-of-week plus month and day?
Key Points
1. Plot time series by passing datetime objects as the x-axis and numeric values as the y-axis using `plt.plot_date`.
2. Use `linestyle='solid'` to connect points into a line when Matplotlib defaults to markers.
3. Call `autofmt_xdate()` on the current figure to rotate and align date tick labels for readability.
4. Customize x-axis date labels with `matplotlib.dates.DateFormatter` and a `strftime`-style format string, then apply it with `ax.xaxis.set_major_formatter(...)`.
5. When CSV dates appear out of order, convert the `date` column with `pd.to_datetime` so Matplotlib treats them as datetimes.
6. Sort the DataFrame by the converted datetime column (e.g., `sort_values(..., inplace=True)`) to ensure the plotted timeline is chronological.