Hello, Pyplot!
Altair is not the only Python library that we can use to visualize data.
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations. Within Matplotlib is Pyplot, or matplotlib.pyplot, a collection of functions that make Matplotlib work like MATLAB. Each pyplot function makes some change to a figure, such as creating a figure and its plotting area, plotting some lines, or decorating the plot with labels.
In this lecture note, we will demonstrate very briefly what pyplot can do, to show how it differs from Altair.
- Import Pyplot
- Load Dataframe with pandas
- Plotting basic charts
Interactivity, animation, mathematical operations, and the other things you can do in Matplotlib will not be discussed in INF100. However, you may find this Python library incredibly useful for scientific publications, data science work, and more, so feel free to explore a more detailed Pyplot tutorial in your own time!

Import Pyplot
To begin, we need to import the libraries whose functions we will use: pandas for dataframes and matplotlib.pyplot for visualization.
Note: You will first need to pip install pandas and matplotlib in the terminal.
import pandas as pd
import matplotlib.pyplot as plt
Load Dataframe
As in the Data Analysis lecture notes, we load our dataset as a dataframe using a pandas function: read_csv(). Let’s read in the Hawks dataset from that lecture!
hawks = pd.read_csv('http://raw.githubusercontent.com/vincentarelbundock/Rdatasets/refs/heads/master/csv/Stat2Data/Hawks.csv', index_col=0)
hawks.head()
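To see what index_col=0 does without downloading anything, here is a minimal sketch that reads a tiny CSV from a string instead of a URL. The three rows are made-up values for illustration and are not real records from the Hawks dataset.

```python
import io
import pandas as pd

# Hypothetical CSV text: the first, unnamed column holds row labels.
csv_text = """,Species,Wing,Weight
1,RT,385,920
2,RT,376,930
3,SS,222,170
"""

# index_col=0 tells pandas to use the first column as the row index
# instead of treating it as an ordinary data column.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(sample.head())         # the first rows (here: all three)
print(list(sample.columns))  # ['Species', 'Wing', 'Weight']
```

Without index_col=0, pandas would keep the row labels as an extra data column and generate a new index of its own.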

Plotting
Generating visualizations with pyplot is very quick. We call a pyplot function named plot(), which plots y-values against x-values as lines and/or markers. This function takes the data you want to plot, e.g., the Wing and Weight columns of the hawks dataframe.
Then, we call another pyplot function, show(), to display the chart.
plt.plot(hawks.Wing, hawks.Weight) # plot(x,y)
plt.show()

So, by default, plot() generates a blue line chart. In our case, a line chart is a rather chaotic way to visualize the relationship between Wing and Weight, which are two quantitative/continuous variables: plot() connects the points in the order they appear in the dataframe, so the line jumps back and forth across the chart. Note: a line chart is more suitable for showing trends between one quantitative variable and one ordinal (ordered categorical) variable, such as income over years.
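For contrast, here is a small sketch of a case where a line chart does fit: one value per year, in order. The years and income figures are invented purely for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical yearly figures, invented purely for illustration.
years = [2019, 2020, 2021, 2022, 2023]
income = [410, 425, 418, 440, 462]

# The x-values are already in order, so the connecting line
# reads naturally from left to right.
lines = plt.plot(years, income)
plt.xlabel("Year")
plt.ylabel("Income")
plt.show()
```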
A scatterplot is a better way to show the relationship between two quantitative variables. We can turn the line chart into a scatterplot by passing one more argument to plot(), a format string that customizes the visualization’s marks and channels. For example, 'o' tells pyplot to use circle marks, and adding a “g” in front, as in 'go', makes these circles green.
plt.plot(hawks.Wing, hawks.Weight, 'go')
plt.show()
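A few more format strings, as a rough sketch of how the color, marker, and line style codes combine. The x and y values here are made up for illustration.

```python
import matplotlib.pyplot as plt

# Made-up values for illustration.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# A format string combines codes for color, marker, and line style.
green_circles = plt.plot(x, y, 'go')                      # green circles, no line
red_dashes = plt.plot(x, [v + 5 for v in y], 'r--')       # red dashed line, no markers
blue_triangles = plt.plot(x, [v + 10 for v in y], 'b^-')  # blue triangles joined by a solid line
plt.show()
```

Each call to plot() returns a list of the line objects it drew, which is why the results can be stored in variables.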

We can customize this chart even more, for example by adding x-axis and y-axis labels. And of course, we can’t forget a title!
plt.plot(hawks.Wing, hawks.Weight, 'go')
plt.ylabel("Weight")
plt.xlabel("Wing length")
plt.title("Relationship between a hawk's weight and wing length")
plt.show()

Instead of customizing a chart manually, we can also plot a variety of chart types, such as scatterplots, bar charts, cross-correlations, and histograms, using functions more specific than plot(): for example, scatter(), bar(), xcorr(), and hist().
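As a minimal sketch of two of these functions, the example below draws a scatterplot and then a histogram on a second figure. The measurements are made-up numbers for illustration, not values taken from the Hawks dataset.

```python
import matplotlib.pyplot as plt

# Made-up measurements, for illustration only.
wing = [385, 376, 381, 265, 205, 222, 199]
weight = [920, 930, 990, 470, 170, 160, 150]

# scatter() draws one marker per (x, y) pair.
points = plt.scatter(wing, weight)
plt.xlabel("Wing")
plt.ylabel("Weight")
plt.show()

# Start a fresh figure for the second chart.
plt.figure()
# hist() counts how many values fall into each of 4 equal-width bins.
counts, bin_edges, bars = plt.hist(wing, bins=4)
plt.xlabel("Wing")
plt.ylabel("Count")
plt.show()
```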
We can do more than plot a single chart, too. If we want to combine multiple charts into one figure, we can first create the figure using figure(). Then, we use subplot() to position each individual chart within it. For example:
plt.figure(figsize=(9,3)) # Create a 9x3 inch figure
plt.subplot(1, 2, 1) # The figure will have 1 row and 2 columns. Place scatterplot 1st from the left.
plt.scatter(hawks.Wing, hawks.Weight)
plt.subplot(1, 2, 2) # Place histogram 2nd from the left.
plt.hist(hawks.Wing)
plt.show()
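To make the numbering used by subplot() easier to see, here is a small sketch that fills a 2x2 grid and labels each chart with its position. The plotted values are made up for illustration.

```python
import matplotlib.pyplot as plt

values = [1, 3, 2, 5, 4]  # made-up numbers for illustration

fig = plt.figure(figsize=(6, 6))  # width and height in inches
for position in range(1, 5):
    # subplot(rows, columns, index): the index starts at 1 in the
    # top-left cell and increases left to right, then top to bottom.
    plt.subplot(2, 2, position)
    plt.plot(values)
    plt.title(f"Position {position}")
plt.tight_layout()  # adjust spacing so titles and axes do not overlap
plt.show()
```

Positions 1 and 2 end up in the top row, and positions 3 and 4 in the bottom row.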
