from Guide to Machine Learning on Feb 10, 2024

Are there really only 17 million colors?

We know that the typical image in the digital world has 3 channels with 256 unique values per channel, whether it be RGB, HSV, or YUV. This gives us exactly $256^3 = 16,777,216$, or around 17 million, distinct colors. But are there really only 17 million colors that exist? That can't be right.
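As a quick sanity check on that count, here is a minimal sketch of the arithmetic. The constant names are just for illustration, and it assumes the usual 8 bits per channel.

```python
# Assumption: 8 bits per channel, 3 channels (e.g. R, G, B).
BITS_PER_CHANNEL = 8
CHANNELS = 3

levels_per_channel = 2 ** BITS_PER_CHANNEL      # 256 distinct values
total_colors = levels_per_channel ** CHANNELS   # 256^3 combinations

print(f"{levels_per_channel} levels per channel, {total_colors:,} colors")
```

The same formula shows how quickly the count grows with bit depth: a hypothetical 10-bit-per-channel format would give `(2 ** 10) ** 3`, a bit over a billion combinations.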

Color is continuous

Color lies on a continuous spectrum of possibilities, not on an oddly discretized one. The standard RGB format used today, the "sRGB" color space, was defined by HP and Microsoft in 1996. It was originally defined for monitors, printers, and the internet.

Notice that cameras are omitted, so the format only covers displaying, not capturing color. Knowing this, there's a more nuanced question we can ask: Can we only perceive 17 million colors? To answer this question, we can break down the mystery into two more specific questions:

1. Are the colors beyond this range outside the visible spectrum?
2. Are finer-grained colors imperceptibly different?

Capturing the visible spectrum

To capture the entirety of the visible spectrum, we can start from first principles: Define color in exactly the way that we perceive it. There are three types of color "cone" receptors in our eyes, which primarily sense colors at long, medium, and short wavelengths.

We can define a system that quite literally lists off the normalized response for each cone. For example, long = 0.1, medium = 0.3, short = 0.3 would define a color, abbreviated as (0.1, 0.3, 0.3). This long, medium, short (LMS) color space is actually used in practice, but there's a key problem: LMS can represent colors that are physically impossible.

To understand why, let's look at this graph. The x axis is the wavelength, and the y axis is the normalized response.

Let's say medium is 0.5. There are only two wavelengths where medium is 0.5, so there are only two valid combinations of long and short. We show both possible LMS combinations below: (0.3, 0.5, 0.2) and (1, 0.5, 0).

Note that there are no other possible combinations of long and short, outside of these two, if medium is 0.5. For example, (1, 0.5, 1) does not exist. There is no x value wavelength that can possibly achieve this combination of y value responses. Summarized more succinctly, there is no way to "traverse" colors in LMS. Changing just one or two components of LMS results in a physically-impossible color.
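To make this concrete, here is a rough sketch that stands in for the graph. It assumes each cone's response is a simple Gaussian bump; the peak positions and widths in `CONES` are illustrative guesses, not measured cone data. It scans single wavelengths and finds where the medium response crosses 0.5.

```python
import math

# Hypothetical Gaussian stand-ins for the three cone sensitivity curves:
# name -> (peak wavelength in nm, width in nm). Illustrative values only.
CONES = {
    "long": (565.0, 50.0),
    "medium": (535.0, 45.0),
    "short": (420.0, 30.0),
}

def response(cone, wavelength_nm):
    """Normalized response (0..1) of one cone type to a single wavelength."""
    peak, width = CONES[cone]
    return math.exp(-((wavelength_nm - peak) ** 2) / (2 * width ** 2))

def lms(wavelength_nm):
    """(long, medium, short) responses to a single wavelength."""
    return tuple(response(c, wavelength_nm) for c in ("long", "medium", "short"))

def wavelengths_where(cone, target, lo=380.0, hi=700.0, step=0.1):
    """Scan the visible range; return wavelengths where `cone` crosses `target`."""
    hits = []
    steps = int(round((hi - lo) / step))
    prev = response(cone, lo) - target
    for i in range(1, steps + 1):
        w = lo + i * step
        cur = response(cone, w) - target
        if prev * cur < 0:  # sign change: the curve crossed the target here
            hits.append(round(w, 1))
        prev = cur
    return hits

crossings = wavelengths_where("medium", 0.5)
for w in crossings:
    print(w, "nm ->", tuple(round(v, 2) for v in lms(w)))
```

With these made-up curves, medium hits 0.5 at exactly two wavelengths, and each one pins down the long and short responses. No single wavelength gives medium = 0.5 together with short = 1, which is the sense in which a triple like (1, 0.5, 1) is unreachable here.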

To address this, the Commission internationale de l'éclairage (CIE) defined a new basis X, Y, and Z, where we can smoothly interpolate between any pair of colors by changing the x, y, or z components [1]. The color space is illustrated below.

However, there's another problem: The CIE XYZ 1931 color space above can also represent physically-impossible, "imaginary" colors. This makes storing colors in CIE XYZ less storage-efficient: We're wasting bit representations on invalid colors that will never exist.

To amend this, CIE RGB and sRGB both define a region within CIE XYZ's visible spectrum, where every non-negative linear combination of the primaries defines a physically possible color. sRGB is the RGB system we use today.
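One way to see the "non-negative linear combination" idea in action is to convert an XYZ color to linear sRGB and look at the signs of the weights. The sketch below uses the commonly published XYZ-to-linear-sRGB matrix (D65 white point, rounded to four decimals). The helper names are made up for this example, and the 520 nm sample uses approximate CIE 1931 color-matching values.

```python
# Commonly published XYZ -> linear sRGB matrix (D65), rounded to 4 decimals.
XYZ_TO_LINEAR_SRGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)

def xyz_to_linear_srgb(xyz):
    """Multiply an (X, Y, Z) triple by the conversion matrix."""
    return tuple(sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_LINEAR_SRGB)

def has_nonnegative_srgb_weights(xyz, eps=1e-3):
    """True if the color is a non-negative mix of the sRGB primaries."""
    return all(c >= -eps for c in xyz_to_linear_srgb(xyz))

# D65 white: should come out as roughly (1, 1, 1).
d65_white = (0.9505, 1.0000, 1.0890)

# Approximate XYZ of a pure 520 nm green (a point on the spectral locus).
green_520nm = (0.0633, 0.7100, 0.0782)

for name, xyz in (("D65 white", d65_white), ("520 nm green", green_520nm)):
    rgb = tuple(round(c, 3) for c in xyz_to_linear_srgb(xyz))
    print(name, rgb, has_nonnegative_srgb_weights(xyz))
```

The pure green needs a negative amount of red, so it sits outside the region that sRGB can express with non-negative weights, even though it is a perfectly visible color.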

This is roughly how we arrived at most common color spaces today. In short, we started with the most expansive, direct translation of color perception into a color system. Then, we whittled away at the space's "size" until we ended up with a color space that contains only physically possible colors.

So, are there other perceptible colors, outside of the typical RGB? Certainly yes. The visible spectrum covered by XYZ and LMS is far larger than the space of colors covered by sRGB.

Discerning between colors

A 1962 publication, "The Eye" [2], plots a mean wavelength discrimination curve. In other words, for every wavelength, what is the minimum wavelength change needed for a human to discern a color difference? The plot shows that the minimum discernible change is 1 nanometer, so let's use this number.

Given that the visible spectrum ranges from 380 to 700 nm, this means we can discern at most 320 unique colors. Clearly, this falls far short of the 17 million unique colors we can represent, so it suffices to say that sRGB's granularity doesn't appear to be rooted in our ability to discern colors.
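The back-of-the-envelope comparison can be written out as follows. This is only a sketch of the arithmetic in this section, using the 1 nm step read off the plot; the constant names are ours.

```python
# Numbers taken from the text above.
VISIBLE_MIN_NM = 380
VISIBLE_MAX_NM = 700
MIN_DISCERNIBLE_STEP_NM = 1

# How many 1 nm steps fit in the visible range.
discernible_steps = (VISIBLE_MAX_NM - VISIBLE_MIN_NM) // MIN_DISCERNIBLE_STEP_NM

# How many colors 8-bit, 3-channel sRGB can encode.
srgb_colors = 256 ** 3

# Encodable colors per discernible wavelength step.
ratio = srgb_colors / discernible_steps

print(discernible_steps, srgb_colors, round(ratio))
```

Under these assumptions, sRGB has tens of thousands of encodable values for every discernible wavelength step.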

Within the sRGB color space, this means that all colors we can reasonably discern are likely covered. However, are there colors we can discern outside of the sRGB color space? To answer this, let's look at the wavelengths sRGB defines each of the primary colors to be [3]:

• Blue is ~467 nm
• Green is ~556 nm
• Red is ~607 nm

Since all colors in the sRGB color space are linear interpolations of these colors in chromaticity space, we can't represent a color beyond these wavelengths. This means that of the entire visible spectrum, we can only represent colors corresponding to wavelengths of approximately 460 to 600 nm. That's just 44% of the visible spectrum from 380 to 700 nm! Given we can discern 1 nm differences in wavelength, there are plenty of colors we can see but not represent.
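Here is the coverage estimate spelled out, as a small sketch using the eyeballed primary wavelengths listed above. These are rough estimates rather than values from the sRGB standard, and the variable names are ours.

```python
# Visible range and the article's rough estimates for the sRGB primaries (nm).
VISIBLE_NM = (380, 700)
SRGB_PRIMARIES_NM = {"blue": 467, "green": 556, "red": 607}

# Span between the shortest and longest primary wavelength.
covered_nm = max(SRGB_PRIMARIES_NM.values()) - min(SRGB_PRIMARIES_NM.values())

# Width of the whole visible range.
visible_nm = VISIBLE_NM[1] - VISIBLE_NM[0]

# Share of the visible range that falls between the primaries.
fraction = covered_nm / visible_nm

print(f"{covered_nm} nm of {visible_nm} nm is covered: {fraction:.1%}")
```

Rounding the endpoints to 460 and 600 nm gives the same 140 nm span, so the estimate comes out a bit under 44% either way.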

Takeaways

Tragically, there are many, many more than 17 million colors, including many colors beyond what we can represent with sRGB. Fortunately though, within the range of colors we can represent, it seems that sRGB is granular enough to cover the differences in color that we can perceive.

Knowing this, the priority is to expand the color space, not to increase its granularity. In light of this, a large number of color profiles have evolved to cover more and more of the visible spectrum. From sRGB 1996 to Adobe RGB 1998 to ProPhoto RGB in 2013, each set of color profiles grows larger than the last, with the most modern color spaces now covering a majority of the visible spectrum.

Given that we can represent so many more colors than we can discern though, this raises the question: Why waste space representing all those extra, imperceptibly different colors? This leads us to the next idea: We should be able to "losslessly" compress images with the same visual quality. Let's see How image compression works.


1. The visible spectrum in CIE XYZ is convex, meaning that the line between any pair of colors lies fully in the visible spectrum.

2. There's a lengthy Stack Overflow post on the topic of discernible wavelengths. I was hoping to find a primary source, but the cited publication (Davson, H., The Eye, vol. 2. London: Academic Press, 1962) is only available in print. In fact, the figure was taken from another blog post, "Color Perception", which took it from the printed publication.

3. Take these numbers with a fat grain of salt. I just eyeballed the sRGB Wikipedia illustration. This 2017 Ukrainian arXiv article finds the primary wavelengths to be 611.4 nm for red, 549.1 nm for green, and 464.2 nm for blue, but it wasn't clear to me how they computed these numbers. With that said, their numbers are not too far off from my ballpark eyeballing, and we only need an approximate number anyhow.