Matplotlib Tutorial (Part 1): Creating and Customizing Our First Plots

TL;DR

Install Matplotlib with pip install matplotlib, then import pyplot as plt for plotting.

Briefing

Matplotlib is presented as a practical way to turn Python data into clear, customizable charts, starting with a basic line plot and quickly layering in the elements that make a graph usable: titles, axis labels, legends, styling, and layout fixes. The core workflow is straightforward: install the library with pip, import pyplot as plt, prepare x and y data lists, plot with plt.plot, and render with plt.show. From there, the tutorial focuses on turning an "it works" chart into one that communicates meaning.
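The core workflow above can be sketched in a few lines. The salary values below are round placeholder numbers chosen for illustration, not the survey figures, and the variable names dev_x and dev_y are only a convenient choice.

```python
from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]                  # ages
dev_y = [40000, 42000, 45000, 48000, 52000]   # median salaries in USD

# plt.plot returns a list with one Line2D object per line drawn.
lines = plt.plot(dev_x, dev_y)

# Open the plot window; in a plain script this call waits until it is closed.
plt.show()
```

The result is a single line with default colors and no title, labels, or legend, which is the bare starting point the rest of the tutorial improves on.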

The first plot uses ages 25–35 on the x-axis and median salaries on the y-axis, with the salary data sourced from the annual Stack Overflow Developer Survey (2019 data). After plotting, the chart is intentionally bare, with no title or axis labels, so the next step is adding context. A chart title is added via plt.title, and axis descriptions are added with plt.xlabel and plt.ylabel (including "USD" in the y-axis label). This transforms the figure from a generic line into a readable visualization that someone else can interpret without guessing what the axes represent.
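A minimal sketch of that step follows. The title and label wording is an assumption made for this example, and the data is again placeholder values rather than the survey numbers.

```python
from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]

plt.plot(dev_x, dev_y)

# Describe what the chart and each axis represent.
plt.title("Median Salary (USD) by Age")
plt.xlabel("Ages")
plt.ylabel("Median Salary (USD)")

# Keep a handle to the current axes so the text can be inspected later.
ax = plt.gca()

plt.show()
```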

Next comes comparison: a second line is added for Python developers' median salaries by age. The tutorial then addresses a common usability problem, multiple lines with no explanation, by adding a legend. Two legend approaches are shown: passing the legend entries manually as a list in a fixed order (which works but is error-prone), and a better method that attaches a label to each plt.plot call (label=...), so that plt.legend picks up the labels from the plotted lines automatically. That "self-documenting" pattern also pays off later when the plotting order changes.

The styling section moves from the default look to deliberate design. It demonstrates three ways to control line appearance: (1) compact format strings that combine color, marker, and line style, (2) explicit keyword arguments such as color= and linestyle=, and (3) hex color codes for more flexible palettes. Line width is adjusted with linewidth= to emphasize the language-specific series (Python and JavaScript) while keeping the general "all developers" line thinner. Markers are briefly introduced as an option but then removed because they do not suit the chart.

To improve readability, the tutorial adds plt.tight_layout() to fix padding issues across screen sizes and plt.grid(True) to make it easier to see where lines intersect. It then shifts to Matplotlib's built-in styling system: plt.style.use with named styles such as 'fivethirtyeight', 'seaborn', and 'ggplot', showing how those presets change colors, backgrounds, and grid behavior. A fun alternative is introduced with plt.xkcd(), which mimics the hand-drawn look of xkcd comics.

Finally, the tutorial shows practical output handling: saving figures with plt.savefig (e.g., plot.png) and expanding the dataset from the limited 25–35 slice to the full age range (18–55) that meets a minimum data threshold. The larger plot reveals that Python's salary advantage is most pronounced in the mid-20s to mid-30s, then narrows as other languages catch up. The session closes by previewing future chart types (bar, pie, scatter, histogram, stack plots, time series) and data loading from CSV files, while also promoting Brilliant as a learning supplement for statistics and Python-based analysis.
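Saving a figure might look like the sketch below. The data is placeholder values, and the file is written to the current working directory under the plot.png name mentioned above. Calling plt.savefig before plt.show is the safer order, since with some backends the figure is no longer available once its window has been closed.

```python
from pathlib import Path

from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]

plt.plot(dev_x, dev_y, label="All Devs")
plt.title("Median Salary (USD) by Age")
plt.legend()
plt.tight_layout()

# The output format is inferred from the file extension (.png here).
out_path = Path("plot.png")
plt.savefig(out_path)
```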

Cornell Notes

Matplotlib is introduced through a step-by-step path from a bare line chart to a polished, readable visualization. The workflow starts with installing matplotlib, importing pyplot as plt, preparing x and y lists, plotting with plt.plot, and rendering with plt.show. Usability upgrades include adding a title, labeling axes, and creating a legend, preferably by using label= in each plt.plot call so plt.legend stays correct even if the plotting order changes. The tutorial then customizes appearance using explicit styling (color, linestyle, linewidth, optional markers), hex colors, grid lines, and plt.tight_layout to prevent clipping. Built-in styles via plt.style.use (including 'fivethirtyeight', 'seaborn', and 'ggplot') and the playful plt.xkcd option demonstrate how quickly the overall look can be changed.

What is the minimal Matplotlib recipe for a line plot, and what does each step accomplish?

A basic line plot follows a simple sequence: (1) install matplotlib with pip install matplotlib, (2) import pyplot with from matplotlib import pyplot as plt, (3) create x and y data lists (e.g., ages in dev_x and median salaries in dev_y), (4) draw the line with plt.plot(dev_x, dev_y), and (5) display it with plt.show(). When the code runs as a script, the plot window does not appear until plt.show() is called. This structure stays the same even as more lines, labels, and styling are added later.

Why is passing a list of entries to plt.legend more fragile than using label= in plt.plot?

Passing legend text directly into plt.legend requires the list order to match the exact order in which the lines were added to the plot. If the plotting order changes (for example, to control which line appears on top), the legend becomes wrong unless the list is manually updated. Using label= in each plt.plot call (e.g., label='All Devs' and label='Python') makes the legend self-documenting; plt.legend() then pulls the correct labels automatically, even after the lines are reordered.
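The label= pattern can be sketched as follows, using placeholder numbers rather than the survey data. The Python line is deliberately plotted first to show that the legend follows the plotting order without any manual list.

```python
from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]
py_dev_y = [45000, 48000, 52000, 56000, 61000]

# Each line carries its own label, so reordering these two calls
# cannot mismatch the legend text and the lines.
plt.plot(dev_x, py_dev_y, label="Python")
plt.plot(dev_x, dev_y, label="All Devs")

# With no arguments, plt.legend() collects the labels from the lines.
legend = plt.legend()
legend_texts = [text.get_text() for text in legend.get_texts()]

plt.show()
```

By contrast, plt.legend(['All Devs', 'Python']) would silently label these two lines the wrong way round, because the list order no longer matches the plotting order.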

How does the tutorial customize line appearance without relying on hard-to-read format strings?

Instead of using a single compact format string that encodes color, marker, and line style together, it uses explicit keyword arguments. Examples include color='k' for black, linestyle='--' for dashed lines, and linewidth=3 to thicken selected series. This approach is easier to read and reduces the chance of forgetting what each character in a format string means.
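The sketch below shows the keyword-argument form next to the format string it replaces. The data is placeholder values and the hex color is an arbitrary choice for this example.

```python
from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]
py_dev_y = [45000, 48000, 52000, 56000, 61000]

# Format-string version of the next call: plt.plot(dev_x, dev_y, "k--")
# "k" means black and "--" means dashed; the keywords spell that out.
all_line, = plt.plot(dev_x, dev_y, color="k", linestyle="--",
                     label="All Devs")

# A hex color (arbitrary choice) and a thicker line to emphasize the series.
py_line, = plt.plot(dev_x, py_dev_y, color="#5a7d9a", linewidth=3,
                    label="Python")

plt.legend()
plt.show()
```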

What practical steps improve readability and layout across different screens?

Two key improvements are plt.tight_layout(), which adjusts subplot parameters so that labels are not cut off (notably on smaller laptop screens), and plt.grid(True), which adds grid lines so viewers can more easily estimate values and see where lines intersect. Together, they make the chart both cleaner and easier to interpret.
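Both calls fit into the usual script as shown in this sketch, which again uses placeholder data. They are placed after the plotting and labeling calls so the layout adjustment can account for the title and axis labels.

```python
from matplotlib import pyplot as plt

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]
py_dev_y = [45000, 48000, 52000, 56000, 61000]

plt.plot(dev_x, dev_y, label="All Devs")
plt.plot(dev_x, py_dev_y, label="Python")
plt.xlabel("Ages")
plt.ylabel("Median Salary (USD)")
plt.legend()

# Reference lines make values and crossings easier to read off.
plt.grid(True)

# Recompute padding so the title and labels fit inside the figure.
plt.tight_layout()

ax = plt.gca()
plt.show()
```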

How do Matplotlib styles and the xkcd option change the look of plots?

Matplotlib styles can be applied globally with plt.style.use('fivethirtyeight') or other named presets such as 'seaborn' and 'ggplot'. These presets alter default colors, backgrounds, and grid behavior. Separately, plt.xkcd() is a function that switches on a sketch-style mode mimicking the hand-drawn xkcd comics, producing squiggly, comic-like lines that suit lighter or playful visualizations.
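A short sketch of applying a preset, with placeholder data. The set of style names depends on the installed Matplotlib version (recent releases, for example, list the seaborn styles under 'seaborn-v0_8' names), so printing plt.style.available first is a reasonable precaution.

```python
from matplotlib import pyplot as plt

# List the style names this installation actually provides.
available = plt.style.available
print(available)

# Apply a preset before plotting so new lines pick up its defaults.
plt.style.use("fivethirtyeight")
# plt.xkcd()  # alternative: hand-drawn look instead of a named style

# Placeholder data for illustration only (not the survey figures).
dev_x = [25, 26, 27, 28, 29]
dev_y = [40000, 42000, 45000, 48000, 52000]

plt.plot(dev_x, dev_y, label="All Devs")
plt.legend()
plt.tight_layout()
plt.show()
```

Because the style changes global defaults, explicit arguments such as color= or linewidth= in a plt.plot call still take precedence over the preset.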

What does expanding from the 25–35 age subset to the full 18–55 dataset reveal?

The initial subset shows a large gap between Python developers and developers overall across the 25–35 range. When the full 18–55 data (with a minimum-answer threshold for reliability) is plotted, the gap is most pronounced in the mid-20s to mid-30s, then narrows as other languages catch up. Python still holds an advantage at many ages, but the difference is less extreme outside that central band.

Review Questions

When would you prefer label= in plt.plot over manually passing legend entries to plt.legend, and what failure mode does it prevent?
Which combination of commands helps prevent clipped labels and improves value estimation on the chart?
How do plt.style.use and plt.xkcd differ in what they change about a plot's appearance?

Key Points

1. Install Matplotlib with pip install matplotlib, then import pyplot as plt for plotting.
2. Create two aligned lists for x and y values, then plot with plt.plot(x, y) and display with plt.show().
3. Make charts interpretable by adding plt.title plus plt.xlabel and plt.ylabel.
4. Add legends reliably by using label= in each plt.plot call and then calling plt.legend() without manually specifying an order.
5. Use explicit styling arguments (color, linestyle, linewidth) and hex colors to control appearance clearly.
6. Improve readability with plt.tight_layout() to fix padding and clipping, and plt.grid(True) to add reference lines.
7. Apply global aesthetics with plt.style.use (e.g., 'fivethirtyeight', 'seaborn', 'ggplot') or switch to a comic-like look with plt.xkcd(), and save outputs using plt.savefig.

Highlights

A legend built from plt.plot(..., label=...) stays correct even when the line plotting order changes, avoiding a common mismatch bug.

plt.tight_layout() is a practical fix for padding problems that appear on smaller screens, preventing titles and labels from getting cut off.

Matplotlib styles (plt.style.use) can quickly swap the entire visual theme, while plt.xkcd() produces a distinctive hand-drawn comic aesthetic.

Expanding the dataset from ages 25–35 to 18–55 shows Python’s salary advantage is concentrated in the mid-20s to mid-30s rather than evenly spread across all ages.

Topics

Matplotlib Basics
Line Plots
Plot Customization
Legends and Labels
Styling and Themes