We Analyzed 2,491 Master Paintings to Find the Real Recipe
What every great painting shares — measured across 14,044 painting steps, 728 unique sources, and the actual recreation guides for famous works from Cennini to Sargent.
We just finished building grounded recreation guides for 2,491 famous paintings. Every step in every guide cites a source in our corpus of classical art-instruction texts. It is, as far as we know, the first dataset of this kind.
Before publishing it, we wanted to know what the dataset itself says about how master painters actually worked. What techniques recur? What materials are universal? What phase does almost every painting go through?
The numbers below are mined from the 2,491 records. Methodology is on the methods page; the full corpus is on the sources page.
The structure of a painting
Every recreation guide labels each step with one of seven phases. After enrichment, the distribution of all 14,044 steps across all 2,491 works looks like this:
| Phase | Steps | % of total |
|---|---|---|
| Refining | 2,945 | 21.0% |
| Finishing | 2,620 | 18.7% |
| First pass | 2,421 | 17.2% |
| Underdrawing | 2,297 | 16.4% |
| Underpainting | 2,129 | 15.2% |
| Varnishing | 1,286 | 9.2% |
| Drying | 100 | 0.7% |
Two findings that surprised us.
Underdrawing is close to universal. 92.2% of the 2,491 paintings have at least one underdrawing step. The exceptions are mostly engravings and watercolors. For oil on canvas, underdrawing is not a stylistic choice; it is the precondition for everything that follows.
Underpainting is nearly as common. 85.5% of the works have an explicit underpainting phase. This is the grisaille or imprimatura tradition the classical books document: laying in values in monochrome before introducing color. Modern direct-painting tutorials skip this entirely. The masters didn't.
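The two kinds of percentages in this section (share of all steps, and share of works containing a phase) can be sketched as below. This is a minimal illustration, not our pipeline: it assumes each guide is a dict with a `steps` list whose entries carry a `phase` string, and both that record shape and the function name `phase_stats` are hypothetical.

```python
from collections import Counter


def phase_stats(guides):
    """Return (share of all steps per phase, share of works containing each phase).

    Assumes a hypothetical record shape: {"steps": [{"phase": "..."}, ...]}.
    """
    step_counts = Counter()
    work_counts = Counter()
    for guide in guides:
        phases = [step["phase"] for step in guide["steps"]]
        step_counts.update(phases)       # every step counts
        work_counts.update(set(phases))  # each work counts once per phase
    total_steps = sum(step_counts.values())
    if total_steps == 0 or not guides:
        return {}, {}
    step_share = {p: n / total_steps for p, n in step_counts.items()}
    coverage = {p: n / len(guides) for p, n in work_counts.items()}
    return step_share, coverage


# Toy records, invented for illustration only.
guides = [
    {"steps": [{"phase": "Underdrawing"}, {"phase": "Underpainting"},
               {"phase": "Refining"}]},
    {"steps": [{"phase": "Underdrawing"}, {"phase": "First pass"},
               {"phase": "Refining"}, {"phase": "Refining"}]},
]
step_share, coverage = phase_stats(guides)
print(f"Refining share of steps: {step_share['Refining']:.1%}")
print(f"Works with underpainting: {coverage['Underpainting']:.1%}")
```

The distinction matters when reading the numbers: the table counts steps, while the 92.2% and 85.5% figures count works.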
The techniques that show up everywhere
We extracted the named technique labels from every guide's criticalTechniques block. The top 15, aggregated across all 2,491 works:
| Technique | Mentions |
|---|---|
| Simultaneous Contrast | 989 |
| Glazing and Scumbling | 931 |
| Glazing (alone) | 562 |
| Scumbling (alone) | 510 |
| Fat over Lean | 379 |
| Chiaroscuro | 212 |
| Grisaille Underpainting | 165 |
| Layering | 134 |
| Monochrome Underpainting | 82 |
| Contour Drawing | 76 |
| Grisaille | 57 |
| Complementary Colour Juxtaposition | 56 |
| Color Contrast | 51 |
| Color Mixing with Complements | 43 |
| Color Harmony | 40 |
If we collapse spelling variants, glazing and scumbling appear together or separately in more than 78% of all paintings in the corpus (1,955 of 2,491). They are not obscure techniques. They are the technique. A painter who does not know what glazing is cannot recreate the canvases that hang in most major museums.
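Collapsing variants and counting works that use either technique could look roughly like this. It is a sketch under assumptions: `criticalTechniques` is treated as a plain list of label strings, and the helper names (`normalize`, `uses_glazing_or_scumbling`, `glazing_share`) and the stem-matching rule are illustrative, not the method documented on the methods page.

```python
import re


def normalize(label):
    """Lowercase a label and reduce punctuation and spacing to single spaces."""
    return re.sub(r"[^a-z]+", " ", label.lower()).strip()


def uses_glazing_or_scumbling(guide):
    """True if any technique label mentions a glazing or scumbling variant."""
    labels = (normalize(t) for t in guide["criticalTechniques"])
    # Stem matching catches "glaze", "glazes", "glazing", "scumble", "scumbling".
    return any("glaz" in label or "scumbl" in label for label in labels)


def glazing_share(guides):
    """Fraction of works with at least one glazing or scumbling label."""
    if not guides:
        return 0.0
    hits = sum(1 for guide in guides if uses_glazing_or_scumbling(guide))
    return hits / len(guides)


# Toy records, invented for illustration only.
guides = [
    {"criticalTechniques": ["Glazing and Scumbling", "Chiaroscuro"]},
    {"criticalTechniques": ["scumbling  (alone)"]},
    {"criticalTechniques": ["Contour Drawing", "Color Harmony"]},
]
print(f"{glazing_share(guides):.1%}")
```

Counting per work rather than per mention is what keeps a guide that lists both "Glazing" and "Scumbling" from being counted twice.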
Simultaneous contrast leads the list. The principle that a color looks different depending on what sits next to it — Chevreul's central observation from 1839 — is the most-cited critical technique across the whole corpus. The Impressionists made it visible. The Old Masters were already using it implicitly. Both groups had to be taught it.
The pigments that built every museum
Across the 2,491 recreation guides, we tracked which pigments appear in the colorPalette field. Restricted to the top 20 named pigments:
| Pigment | Color mentions |
|---|---|
| Yellow ochre | 1,517 |
| Lead white | 1,315 |
| Vermilion | 1,172 |
| Ivory black | 1,048 |
| Titanium white (modern subst.) | 992 |
| Lamp black | 613 |
| Cadmium yellow | 550 |
| Raw umber | 434 |
| Burnt umber | 408 |
| Cadmium red | 366 |
| Viridian | 328 |
| Zinc white | 269 |
| Burnt sienna | 252 |
| Alizarin | 191 |
| Raw sienna | 179 |
| Crimson | 174 |
| Cerulean | 168 |
| Cobalt blue | 143 |
| Sap green | 107 |
| Verdigris | 96 |
(Ultramarine appears 2,073 times in raw color mentions, but the corpus treats it as a parent category — French ultramarine vs. genuine, etc. — so we excluded it from the named-pigment list to avoid double-counting.)
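A rough sketch of the tally, with the parent-category exclusion made explicit. It assumes `colorPalette` is a list of free-text strings and uses simple substring matching; the constants, the function name `count_pigments`, and the matching rule are hypothetical stand-ins.

```python
from collections import Counter

# Illustrative subset only; a real run would list all named pigments.
NAMED_PIGMENTS = ("yellow ochre", "lead white", "vermilion", "ivory black")
# Parent categories are tallied separately so their variants are not
# double-counted in the named-pigment table.
PARENT_CATEGORIES = ("ultramarine",)


def count_pigments(guides):
    """Return (named pigment mentions, parent category mentions)."""
    named = Counter()
    parents = Counter()
    for guide in guides:
        for entry in guide["colorPalette"]:
            text = entry.lower()
            for pigment in NAMED_PIGMENTS:
                if pigment in text:
                    named[pigment] += 1
            for parent in PARENT_CATEGORIES:
                if parent in text:
                    parents[parent] += 1
    return named, parents


# Toy records, invented for illustration only.
guides = [
    {"colorPalette": ["Lead white (titanium white as substitute)",
                      "French ultramarine"]},
    {"colorPalette": ["Yellow ochre", "Genuine ultramarine", "Lead white"]},
]
named, parents = count_pigments(guides)
print(named.most_common())
print("ultramarine (parent):", parents["ultramarine"])
```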
The working palette is small. Roughly twelve pigments handle the vast majority of the canvases in the corpus: a white, a black, the earths (ochre, sienna, umber in raw and burnt), a strong red (vermilion or cadmium), a strong yellow (cadmium or chrome), a strong blue (ultramarine or cobalt), a strong green (viridian). With those, almost anything in our corpus is technically reachable. The classical books made this point. The data confirms it.
Lead white is not yet over. It appears in 1,315 of our recreation guides, more than half, usually with a modern-equivalent substitution of titanium or zinc white noted. The substitution is honest about its compromises; titanium does not behave optically the way lead behaves. The masters chose lead for a reason. Modern painters who study them should know what that reason was.
How long is it supposed to take
The estimatedTime field on every recreation guide is parsed into a low-and-high hour estimate. Across all 2,491 works:
| Statistic | Low estimate | High estimate |
|---|---|---|
| Mean | 26.6 hours | 39.9 hours |
| Median | 20 hours | 30 hours |
| Min | 4 | 6 |
| Max | 100 | 150 |
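So the typical famous painting on Apprentice, going by the median, takes somewhere between 20 and 30 hours of focused easel time to recreate to a working-study level. The mean is higher, at 26.6 to 39.9 hours, because the longest projects pull it up. Three to six sessions for most. Twelve to twenty for the harder ones.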
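The parsing and the summary statistics above can be sketched as follows. The exact format of `estimatedTime` is not shown here, so this assumes strings such as "20-30 hours" or "25 hours"; the function names `parse_hours` and `summarize` are hypothetical, and strings that carry other numbers (session counts, for example) would need stricter parsing.

```python
import re
from statistics import mean, median


def parse_hours(text):
    """Parse '20-30 hours' or '25 hours' into a (low, high) pair of floats.

    Assumes every number in the string is an hour figure.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        raise ValueError(f"no hour estimate found in {text!r}")
    return min(numbers), max(numbers)


def summarize(time_strings):
    """Return mean, median, min and max as (low, high) pairs."""
    ranges = [parse_hours(s) for s in time_strings]
    lows = [low for low, _ in ranges]
    highs = [high for _, high in ranges]
    return {
        "mean": (mean(lows), mean(highs)),
        "median": (median(lows), median(highs)),
        "min": (min(lows), min(highs)),
        "max": (max(lows), max(highs)),
    }


# Toy values, invented for illustration only.
samples = ["4-6 hours", "20-30 hours", "100 to 150 hours"]
summary = summarize(samples)
print(summary["median"], summary["mean"])
```

Even in this toy sample the one long project lifts the mean well above the median, which is why the median is the better guide to a typical work.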
Per-artist, the spread is wider than we expected:
| Artist | Works | Avg recreation time |
|---|---|---|
| Vermeer | 32 | 50.0 hours |
| Goya | 33 | 44.7 hours |
| Velázquez | 34 | 41.9 hours |
| Hopper | 33 | 37.1 hours |
| Constable | 33 | 30.7 hours |
| Homer | 35 | 28.3 hours |
| Gauguin | 32 | 25.2 hours |
| Munch | 33 | 25.0 hours |
Vermeer is the slowest in the corpus, which surprises nobody who has tried. Goya next, for different reasons — the alla-prima energy of the late Black Paintings hides hours of correction. Modern painters tend to assume Impressionism is faster than the Old Masters; the data is more equal than that.
What the sources actually are
Every claim in every guide cites a source from our corpus. The most-cited individual sources across the 2,491 records:
| Source | Cited in (records) |
|---|---|
| The Practice of Oil Painting (Solomon, 1910) | 2,491 |
| Laws of Contrast of Colour (Chevreul, 1839) | 1,645 |
| Wikipedia: Oil painting | 1,222 |
| The Practice and Science of Drawing (Speed, 1913) | 787 |
| Wikipedia: Landscape painting | 587 |
| The Science of Painting (Vibert, 1892) | 562 |
| Wikipedia: Composition (visual arts) | 518 |
| Wikipedia: Color theory | 384 |
Solomon shows up in every guide. That is partly because his book is the broadest of the ten, partly because his structure (grisaille → glazing → scumbling → varnish) matches the structure the generator is asked to produce. Chevreul appears in two-thirds of the corpus because the principle of simultaneous contrast applies to almost any painting with adjacent colors — which is to say, almost any painting.
The full source library, with download links, is at /sources.
What this dataset is for
We built the recreation guides to be useful to a person at an easel, not to be a research artifact. But the aggregate is its own thing — a sketch of what master-painting practice looks like in distribution. Universal underdrawing. Near-universal underpainting. A small palette of pigments. Glazing as the technique under nearly every famous canvas. Simultaneous contrast as the single most-cited principle of all.
One-line summary, if we had to: drew first, painted monochrome second, glazed color third, scumbled cold passages last — with a palette of about twelve pigments and an eye calibrated for what each color does to its neighbor. That's most of Solomon. It's implicit in the 2,491 recreations.
Individual guides are at /collections. The source library is at /sources. The pipeline is documented at /methods.