Plotnine (v0.15.7)
From ad-hoc sketches to publication-ready figures
Plotnine is a Python data visualization package built on the grammar of graphics. It lets you build a figure step by step, from raw data to a finished plot.
The Grammar of Graphics: a coherent system for describing and building graphical representations of data.
See the [API reference] for details and the [Cheatsheet] for a quick overview. Plotnine's syntax is intentionally similar to that of ggplot2, the widely used R package built on the same grammar.
📊 Exploring the Workflow: Anscombe’s Quartet
To demonstrate Plotnine's capabilities, we will visualize Anscombe’s Quartet, a famous set of four datasets with nearly identical descriptive statistics (such as the mean and variance) that nevertheless look very different when plotted.
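In mathematical terms, for each dataset $i$, the statistics are roughly: $\bar{x}_i = 9$, $s^2_{x,i} = 11$, $\bar{y}_i \approx 7.50$, $s^2_{y,i} \approx 4.125$, and a correlation between $x$ and $y$ of $r_i \approx 0.816$.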
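As a rough check of that claim, the sketch below computes the summary statistics for each of the four datasets using only the standard library. The data values are typed in from the commonly published version of the quartet (an assumption; the copy in `plotnine.data` is not inspected here), and the names `QUARTET` and `pearson_r` are illustrative.

```python
from statistics import mean, variance

# Anscombe's Quartet, typed in from the commonly published values so the
# sketch needs no third-party packages.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (
        [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
        [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
    ),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


summary = {
    name: {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }
    for name, (xs, ys) in QUARTET.items()
}

# Print one row of rounded statistics per dataset
for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four rows agree to about two decimal places, which is exactly why summary statistics alone cannot tell these datasets apart.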
🚀 Quick Start
Once the imports are in place, a basic visualization takes a single expression.
```python
from plotnine import *
from plotnine.data import anscombe_quartet

# Initial basic scatter plot
ggplot(anscombe_quartet, aes(x="x", y="y")) + geom_point()
```
At first glance, this scatter plot is confusing because the points from all four datasets are drawn on the same axes with nothing to tell them apart. We need a way to differentiate them.
🎨 Adding Aesthetics
By mapping the color aesthetic to the dataset column, Plotnine automatically generates a legend to distinguish the groups.
```python
(ggplot(anscombe_quartet, aes("x", "y", color="dataset"))
+ geom_point())
```
🔲 Declarative Subsetting (Faceting)
Instead of writing complex for loops to create multiple plots, you can use facet_wrap to repeat the visualization across separate panels.
```python
(ggplot(anscombe_quartet, aes("x", "y", color="dataset"))
+ geom_point()
+ facet_wrap("dataset"))
```
Note: Since the panels now separate the data, the use of color becomes redundant.
🍰 Layering Visualizations
Plotnine treats plots as a series of layers. While global mappings are inherited, you can modify or add specific layers (like trend lines) to enhance the insight.
```python
(ggplot(anscombe_quartet, aes("x", "y", color="dataset"))
+ geom_point()
+ geom_smooth(method="lm", se=False, fullrange=True)
+ facet_wrap("dataset"))
```
This visualization illustrates Anscombe's point: the fitted lines are nearly the same in every panel, yet the underlying point patterns are entirely different.
🛠️ Refining for Publication
While the previous plot is great for Exploratory Data Analysis (EDA), publication-quality figures require more precision.
| Feature | Adjustment | Purpose |
|---|---|---|
| Colors | sienna & steelblue | Professional aesthetic |
| Sizing | size=3 (points) | Better visibility |
| Axes | scale_y_continuous | Precise break points |
| Coordinates | coord_fixed | Fixed aspect ratio |
```python
(ggplot(anscombe_quartet, aes("x", "y"))
+ geom_point(color="sienna", fill="darkorange", size=3)
+ geom_smooth(method="lm", se=False, fullrange=True, color="steelblue", size=1)
+ facet_wrap("dataset")
+ scale_y_continuous(breaks=(4, 8, 12))
+ coord_fixed(xlim=(3, 22), ylim=(2, 14))
+ labs(title="Anscombe’s Quartet"))
```
🌟 "Going to Eleven" (Advanced Theming)
Finally, you can fully customize the theme to align with a specific brand or personal style.
```python
(ggplot(anscombe_quartet, aes("x", "y"))
+ geom_point(color="sienna", fill="orange", size=3)
+ geom_smooth(method="lm", se=False, fullrange=True, color="steelblue", size=1)
+ facet_wrap("dataset")
+ labs(title="Anscombe’s Quartet")
+ scale_y_continuous(breaks=(4, 8, 12))
+ coord_fixed(xlim=(3, 22), ylim=(2, 14))
+ theme_tufte(base_family="Futura", base_size=16)
+ theme(
    axis_line=element_line(color="#4d4d4d"),
    axis_ticks_major=element_line(color="#00000000"),
    axis_title=element_blank(),
    panel_spacing=0.09,
))
```
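📈 Summary of Evolution
- Quick start: one scatter plot with all points on shared axes
- Aesthetics: color mapped to the dataset column, with an automatic legend
- Faceting: one panel per dataset via facet_wrap
- Layering: a linear trend line from geom_smooth on top of the points
- Refinement: fixed colors, larger points, explicit axis breaks, and a fixed aspect ratio
- Theming: theme_tufte combined with custom theme overrides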
✅ Next Steps for You
- Install Plotnine via PyPI
- Import your own dataset
- Apply the Grammar of Graphics to your project
Project Details:
- Developer: Hassan Kibirige
- Funding: Posit, the open source data science company
- Resources: [PyPI] | [Source Code] | [License]