Taking photos in RAW file format has become very common advice for any serious digital photographer. A few years ago, only DSLR cameras gave you the chance to get photos in RAW format, but this is such a great feature that now even point-and-shoot cameras support it. Moreover, when the RAW format is not natively supported, there is even hacking software that lets you get RAW photos from cameras not officially supporting it. There is also a relatively new camera type called MILC (Mirrorless Interchangeable-Lens Camera) that supports RAW photos, and the list keeps growing.
Along the same lines, there is now much more software support for the development of photos in raw format, with great free open-source options like RawTherapee offering even better features than proprietary commercial software.
There are many good posts out there explaining why you should shoot in raw and its pros and cons. Knowing that National Geographic photographers shoot exclusively in raw format tells you something about the value of taking photos in that format.
Here we are going to show you how to develop a raw photo "by hand" instead of by using a software product. We are not suggesting this as an alternative for developing your photos on a regular basis, but as an exercise to:
- Take a glimpse of what the raw image processing software has to deal with.
- Have a hands-on experience with raw files and the many topics around them.
- Have a way to see what your camera really produces, without the effects introduced by your developing software.
- The Tools We Will Use
- What a RAW File Contains
- A Simple Plan
- Strategy to Get the Three RGB Colors per Pixel
- The Photo we Will Develop
- Extracting the Raw Color Channels
- Building (wrongly) our First RGB Image
- Gamma Correcting our (first) Image
- Why We Say it Was a Wrong Attempt?
- The Strange Case of Pinkish Highlights
- Building Correctly our RGB Image
In this exercise we will use:
Iris: A free image processing tool with a powerful command console, well known in the astrophotography field, capable of decoding raw files.
ImageJ: An image processing and analysis tool written in Java. This is a free and very useful tool with a lot of powerful plugins. Here you will find our introduction to ImageJ explaining the features we use here.
RawDigger: A tool to visualize and examine raw data, with histograms and statistical information (minimum, maximum, standard deviation, average) about the whole photo or rectangular samples.
If you want to replicate this exercise and you are not familiar with Iris or ImageJ we strongly encourage you to use the links we have provided above.
The term RAW is not an acronym; it is just an indication that the image file contains —almost— unprocessed sensor data. Nor is it a specific file format: there are many raw file formats. For example, Nikon® and Canon® each use their own raw file format, each with a different internal structure.
The sensor of a digital camera corresponds to what a frame of film is for an analog camera. A digital sensor is composed of photosites (also called photodiodes, photosensors, photodetectors or sensels), where each photosite corresponds to a pixel in the developed photo.
There are at least two kinds of sensors used by most digital cameras: the Bayer CFA (Color Filter Array) and the Foveon®, with the Bayer type being much more commonly used. For example, all Nikon and Canon cameras use the Bayer type sensor.
In the Foveon sensors, every photosite can decompose and measure the light hitting it in red, green and blue components.
In the Bayer CFA sensor, each photosite can only measure the amount of one primary color component of the light hitting it; in other words, each photosite is sensitive to only one primary color component in the light. Usually, the primary colors in a Bayer sensor are Red, Green and Blue (RGB).
In a Bayer type sensor, the distinct color-sensitive photosites are usually arranged in a 2x2 pattern repeated all over the sensor surface. Currently, all Nikon® and Canon® D-SLR cameras use this type of sensor. However, there are variations to this pattern; for example, Fujifilm® X-Trans® sensors follow a 6x6 pattern. There are also Bayer-type sensors with photosites sensitive to four colors, like CYGM (cyan, yellow, green and magenta) or RGBE (red, green, blue and emerald).
There are two types of rows alternating along the height of the sensor: a row alternating red and green sensitive photosites, followed by another row alternating green and blue sensitive photosites. As a consequence of this design, the quantity of green photosites is twice that of red or blue ones.
In this exercise we will use a raw photo taken with a Nikon D7000, where the photosites are (starting from the top left corner) red and green in the first row (R, G1) and green and blue (G2, B) in the following row, exactly as in the image above. We can represent this pattern with the term GRBG, where the first G corresponds to the green sensitive photosite in a "red row" (the one containing the red and green photosites) and the last G corresponds to the green photosite in a "blue row". In some sensors, these green photosites —from the red and from the blue rows— do not, by design, have the same sensitivity, aiming for better accuracy in the final image. In this exercise we will assume they have exactly the same sensitivity. However, having access to the raw data, we could —in the future— run some statistical tests and find out whether they in fact have the same sensitivity!
A regular non-raw picture file, such as one in ".tif" or ".jpg" format, is built on the RGB model. That means it saves the red, green and blue primary color components of each pixel in the image, which once mixed reproduce each pixel's color.
The additive mix of the primary color components (RGB) of each pixel produces each pixel's color in the image. Each RGB component value represents the intensity or brightness of that primary color as part of the mix for the pixel's color. For example, for a yellow spot in the color image, the red and green components —for that spot— have relatively greater values than the blue component, because yellow is obtained from the addition of red and green with little or no blue.
A common way to model this is to think of the image as composed of three layers with the same dimensions as the whole color picture, each containing only the red, green or blue (RGB) values of each pixel. With this arrangement, when the software is going to draw the color of a pixel at a given position in the image, it reads that position from each RGB layer to get the pixel's RGB primary color components.
Each of these RGB layers is an image in itself, but as they contain information for only one color, they can only be visualized in a monochromatic way, usually with higher brightness representing higher component values.
However, in a raw file there is only one layer for the whole image, where each pixel on this layer represents the light metering from each photosite in the sensor, so the first pixel value corresponds to the reading from the red sensitive photosite, the following pixel corresponds to the green photosite and so on, according to the Bayer pattern we saw before.
As there is only one layer in the raw file, it can only be visualized as a monochromatic image. However, the brightness from one pixel to its immediate neighbor can vary substantially, because they represent different color components: in a yellow spot, a blue pixel will have a lower value and the immediately following green pixel will have a relatively higher value. This makes for the distinctive appearance of raw files when visualized with monochromatic colors. For example, the following image is a clip at 200% scale showing part of a strawberry from the image we will develop.
The pixel color values in a raw file are technically dimensionless quantities, but in technical papers they are expressed in ADUs (Analog to Digital Units, a "dimensionless unit"). This is in reference to the sensor component called the Analog to Digital Converter (ADC), which converts the electric charge —produced by the light hitting each photosite— into Digital Numbers (DN), another "unit" used for pixel color values.
In the case of the Nikon D7000 raw file we will use, we will work with 14-bit ADU values. However, the 12-bit format is also common for raw files.
In general terms, in an n-bit raw file there are 2^n possible ADU values, ranging from 0 to 2^n − 1, where higher values represent higher brightness. E.g. 12-bit raw files can contain pixel color values from 0 to 4,095 ADUs, and 14-bit raw files from 0 to 16,383 ADUs.
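These ranges follow directly from the bit depth; a trivial Python sketch (the function name is ours, just for illustration):

```python
def adu_range(bits):
    """Return the (min, max) pixel color value of an n-bit raw file."""
    return 0, 2**bits - 1

print(adu_range(12))  # (0, 4095)
print(adu_range(14))  # (0, 16383)
```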
As we said in the previous section, a regular RGB image is built from three layers with the same dimensions as the final image, containing the values of the RGB components of each image pixel. With the help of Iris, having those layers (as monochromatic images) is enough to build a regular color image, which in the end can be saved in ".jpg" or ".tif" format, among others. So the plan is to build the red, green and blue layers from the information in the raw file and then merge them to get our RGB image.
If in a raw file we have information for only one RGB color value per pixel, there are two missing colors per pixel. The regular way to obtain them is through raw photo development software (either in-camera firmware or desktop post-processing software). This software interpolates the colors in the neighborhood of each photosite to compute (or rather estimate) the missing colors. This procedure is called demosaicing, de-Bayering, or just color interpolation for the Bayer filter.
There are many demosaicing algorithms, some of them are publicly known or open source, while others are software vendor trade secrets (e.g. Adobe Lightroom, DxO Optics Pro, Capture One Software). If you are interested, here you have a comparative study of some of them.
In this exercise we will not do the demosaicing; that is something we cannot do "by hand". Instead, we will do something called binning: we have already seen how any 2x2 pixel cluster in the raw file contains information for the three RGB channels —two readings for the green channel, one for the red and one for the blue— so we can imagine each 2x2 pixel cluster as a virtual photosite metering the three RGB channels at the same time. To represent the green channel we will average the two green readings.
From this point of view, we have —in the raw file— information for the three RGB channels for a picture with the dimensions of the real photo downscaled by 2.
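The binning idea can be sketched in a few lines of numpy (the array values here are made-up toy numbers, not data from the actual NEF file):

```python
import numpy as np

# Toy 4x4 GRBG-style mosaic, laid out as in the Nikon D7000 example:
# row 0: R  G  R  G
# row 1: G  B  G  B
raw = np.array([[ 10, 200,  12, 210],
                [190,  30, 195,  35],
                [ 14, 205,  16, 215],
                [200,  40, 205,  45]], dtype=np.float64)

# Extract the four Bayer sub-channels with stride-2 slicing
r  = raw[0::2, 0::2]   # red photosites
g1 = raw[0::2, 1::2]   # green photosites on "red rows"
g2 = raw[1::2, 0::2]   # green photosites on "blue rows"
b  = raw[1::2, 1::2]   # blue photosites

# Average the two green readings, as in the binning described above
g = (g1 + g2) / 2.0

# Each channel has the dimensions of the mosaic downscaled by 2
print(r.shape, g.shape, b.shape)  # (2, 2) (2, 2) (2, 2)
```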
The demosaicing algorithms are —of course— not perfect and may introduce some artifacts in the final image, for example they can blur parts of the image, show some edges with color fringing (false coloring) or very pixelated edges (looking like a zipper: Zippering artifact), show isolated bright or dark dots, show false maze patterns, etc.
By avoiding the demosaicing, we will get an image that can be used as a reference against the same picture developed with demosaicing. This way, we can find out whether some features in the latter were introduced by the algorithm or really came from the camera. This reference image can also help to tune the parameters available in a demosaicing algorithm. To make a fair comparison, the demosaiced image must first be downscaled by 2.
The photo we will develop was taken with a Nikon D7000 using a Sigma 17-50mm f/2.8 lens with an aperture of f/11, 1/200 s exposure time, ISO 100, handheld, and with Vibration Reduction (Optical Stabilization in Sigma terms) on. The target was directly illuminated by the sun at 10:22 AM. The raw file format is "14-bit Lossless Compressed RAW"; the file is named _ODL9241.NEF and is 19.1 MB in size.
The picture formally has 4,928 x 3,264 pixels. We say formally because physically we will find raw data for an image of 4,948 x 3,280 pixels. This is probably because the pixels close to the borders do not have enough neighbors for the interpolation, so 10 pixels at both vertical edges and 8 pixels at both horizontal edges do not count toward the final picture when using regular development software.
The photo we will develop is shown above. This image was developed with Adobe Lightroom 4.4, using the default settings but adjusting the White Balance by picking the gray patch behind the carrot.
In the following image you can click on the headings (e.g "Linear Histogram") to see the corresponding picture.
The histograms above are about the raw photo data, not from the resulting image. That's why there are two green channels. By looking at them, you can see the scene brightness range barely fits in the whole 14-bit ADU range of the file format.
The Log-Linear histogram (logarithmic vertical scale and linear horizontal scale) at first glance seems to show strong shadow clipping, but that is just an effect of the horizontal scale trying to show too much data in very little space. By looking at the statistics at the right of each histogram, we can see such shadow clipping is not possible, because the minimum values do not even reach zero.
In my camera I have the exposure control step set to 1/3 EV, so setting the exposure one notch lower would scale the ADU values down by 2^(1/3) ≈ 1.26 (at least theoretically). This way, the minimum value would be 15/1.26 ≈ 12 and the maximum 16,383/1.26 ≈ 13,003. However, I prefer a very little highlight clipping —without losing any significant detail— to lowering the exposure a notch (Exposure To The Right: ETTR).
To simplify the handling of files in Iris, set in the "File > Settings : Working path" field the folder path where your raw file is and where the intermediate files we will produce will be saved. We will call this our Working Folder.
You can load the raw file in Iris using the menu command "File > Load a raw File". In my case I load the raw photo file "_ODL9241.NEF". Once the image is loaded, at the right end of the photo you will find a black border: this is the "Optical Black" (OB) area.
The pixels in this area of the raw file contain the readings of masked photosites. These photosites have a dark shield over them so light cannot hit them. They are used to determine the relative zero in the readings of the other, regular photosites. In some cameras (e.g. Canon), the black level is a value that must be subtracted from the values of the regular pixels in the raw file; it is either a known value for the camera model or must be computed from the OB pixels. For example, if the value in a regular pixel is 6,200 and the black level is 1,024 (either known or computed), the value of the pixel to be used in the development of the photo would be 6,200 − 1,024 = 5,176.
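As a sketch of this correction (the values below are the hypothetical ones from the example, not from an actual camera; clamping at zero is a common choice because readings below the black level are just noise):

```python
import numpy as np

# Hypothetical raw readings and a hypothetical black level of 1024
raw = np.array([6200, 1500, 900], dtype=np.int32)
black_level = 1024

# Subtract the black level; clamp negative results (noise) at zero
corrected = np.clip(raw - black_level, 0, None)
print(corrected.tolist())  # [5176, 476, 0]
```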
In the case of Nikon camera raw files, the black level has already been subtracted from the photosite readings, so we will directly use the pixel color values we find in the raw file. As you can see, the raw data is not completely raw!
If you hover the mouse cursor above the top right corner of the image, just before the OB pixels, you will see the coordinates x:4948, y:3280 on the Iris status bar. That's why to get rid of the OB pixels we use the "window" Iris command:
>window 1 1 4948 3280
This command crops the image, keeping the area from corner (1,1) to (4948, 3280). Now you will notice the OB pixels are gone.
Now we need to extract the "four channels" of the 2×2 Bayer pattern. The Iris command "split_cfa" collects the pixels of the 2×2 Bayer pattern and saves them as separate images:
>split_cfa raw_g1 raw_b raw_r raw_g2
The four parameters raw_g1, raw_b, etc. are the names of the files where Iris will save the pixel color values from the four pixels of the 2x2 Bayer pattern. Those files will be saved in the Working Folder we set at the beginning. For the GRBG pattern of the Nikon D7000 camera, the parameter raw_g1 stands for the green pixels in "blue rows", raw_b for the blue pixels, raw_r for the red pixels and raw_g2 for the green pixels in "red rows". To view one of the green "channels", load the corresponding file (e.g. "raw_g1.fit") in Iris.
Check how the size of the resulting image is half that of the original raw file.
You can use the sliders in the Threshold control to adjust the brightness and contrast of the displayed image to your taste. Remember that doing so will not change the pixel color values. For a fair representation of the image, you should use 16383, 0 in the Threshold control.
Now we have two "green channels" saved: raw_g1 and raw_g2. In your Working Folder you will find the files "raw_g1.fit" and "raw_g2.fit", among others. We could just use one of them and discard the other, but instead we will take advantage of having two green readings and use their average (pixel by pixel), which will soften the noise in the green channel:
>add_mean raw_g 2
>save raw_g
Finally, we now have the three primary colors of our binned image in the files raw_r, raw_g and raw_b (with an implicit ".fit" name extension).
We can merge our three RGB channels from the raw file to build a RGB image by using the command:
>merge_rgb raw_r raw_g raw_b
Voilà! There we have our hand-developed RGB photo!! (first try). As the pixel color values in the image go from 0 to 16,383 (14-bit data), for a fair representation of the image we should use those values in the Iris Threshold control.
But wait, this picture is too dark, with a green/cyan color cast (look at the white patch behind the carrot)... it doesn't look like the image developed with Lightroom we saw before.
There are two reasons for this situation: one is a White Balance issue (the issue is really something more than just White Balance, but for the moment let's treat it as if it were just that) and the other is the lack of Gamma Correction in the image.
Opening our raw channels ("raw_r.fit", "raw_g.fit" and "raw_b.fit") with ImageJ, we measure the average RGB values in the white patch (on the card behind the carrot), as described here. The approximate mean RGB values are (7522, 14566, 10411).
In order to get a neutral color for this white patch, these three RGB values should be equal to each other. In the RGB model, the neutral values —corresponding to shades from black to white— have the same value in all three RGB components.
As the white patch is the brightest area in the picture, let's scale the lower color values to match the greatest one. We will multiply the red and blue components of the whole picture by the factors required to make the RGB averages on the white patch match. In this sense, the required WB (White Balance) factors are ga/(ra, ga, ba) = (ga/ra, ga/ga, ga/ba), where (ra, ga, ba) are the RGB averages measured on the white patch. This equals 14566/(7522, 14566, 10411) = (14566/7522, 14566/14566, 14566/10411), so the WB factors are (1.936, 1, 1.399).
Notice that we are not sure whether using these WB factors will push some color values off-scale. To be sure, we must find the maximum value in each RGB channel. Using ImageJ in a similar way as we measured the mean RGB values of the white patch, we find the maximum values are (16383, 15778, 16383). We would also find the same values just by looking at the statistics at the right side of the raw-data histograms shown above.
As 16383 is our raw-data maximum value, we don't need any additional information to realize that any factor above 1 will clip the red and blue channels, because their maximum values are both already at the upper end of the scale! So we need to adjust our WB factors in a way that keeps their relative values, which means multiplying all of them by the same factor, one that gives us red and blue WB factors below or equal to one. So we must divide them by the greatest of the red and blue WB factors, 1.936. That way, our final WB factors are (1.936, 1, 1.399)/1.936 = (1, 1/1.936, 1.399/1.936) = (1, 0.5165, 0.7226).
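The whole white-balance factor computation can be reproduced in a few lines (Python used here only as a calculator; the patch averages are the ones measured above with ImageJ, and the variable names are ours):

```python
# Average RGB values measured on the white patch (from ImageJ)
ra, ga, ba = 7522.0, 14566.0, 10411.0

# Scale red and blue so the white patch becomes neutral (equal R, G, B)
factors = (ga / ra, ga / ga, ga / ba)   # ≈ (1.936, 1.000, 1.399)

# Normalize so no factor exceeds 1 and no channel is pushed off-scale
m = max(factors)
wb = tuple(f / m for f in factors)
print([round(f, 4) for f in wb])        # [1.0, 0.5164, 0.7225]
```

The small differences in the last digit versus the values in the text come from rounding the intermediate factors to three decimals there.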
We use these WB factors to white balance our image using the Iris menu command "Digital Photo > RGB balance...". By doing that, we get the following image.
The cyan-blue cast is gone! But we are not there yet, it is still too dark.
You may notice the relative brightness of some parts of the image has changed. As we have lowered the green and blue values, the cyan/near-cyan colors are darker (green + blue = cyan) —look, for example, at the sky-blue dish— and the same has happened with the neutral colors.
Multiplying the RGB channels is a poor man's way to color balance the image. In real image editing software, a full-fledged color balancing algorithm would change the pixels' chromaticity but not their luminance. We will see those concepts in the following sections.
The image in the RAW file is expected to be photometrically correct (this is equivalent to radiometrically, but referring to EMR in the visible spectrum). This means the pixel values are directly proportional to the intensity of the light that hit each photosite. In other words, if in one spot of the sensor the amount of light hitting the photosites was double that in another spot, the pixel values in the first spot would be twice those in the second spot.
This characteristic is also called linear brightness and is required for most image processing. However, this is not always true, and there are standard tests to find the response of a sensor with respect to known levels of luminance (which is what we expect to be linear); the result is a curve called the OECF (Opto-Electronic Conversion Function). For some cameras this chart shows a very linear response, and for others not so much. On the site of the German company Image Engineering you can see the OECF charts for the Canon 7D and Nikon D7000.
However, human vision has a non-linear perceptual response to brightness, and paradoxically, linear brightness is not perceived as linear.
Linear perception of brightness occurs when changes in perceived brightness are proportional to changes in the brightness value. For example, in the image above, the top strip has linear brightness, changing along the horizontal axis from black to white. This way —for example— at one third of its length, the brightness value is one third of the way from black to white. However, the perceived brightness is not linear; e.g. it changes a lot more from 0% to 20% of its length than between 40% and 60% of it.
The bottom strip also has the brightness changing from black to white. But in this case, the brightness is Gamma Corrected, and now the perceived brightness changes more gradually, it is now perceptually linear.
The human eye perceives brightness changes as linear when they occur geometrically. We can detect that two patches have different brightness when they differ by more than about one percent. In other words, our perception of brightness is approximately logarithmic.
In the Gamma Correction model, the input is linear brightness, with values from 0 (black) to 1 (full brightness), and the correction is applied to each RGB color component. The output is the gamma-corrected (non-linear) brightness. The Gamma Correction is a power function, which means it has the form y = x^γ, where the exponent is a numerical constant known as gamma (the function is x raised to the power gamma). The well-known sRGB color space uses approximately a 2.2 gamma. However, even when it is said "the gamma is 2.2", the value of the gamma term in the Gamma Correction formula is 1/2.2: the inverse of the quoted value, not the value itself.
Mathematically speaking, the gamma correction in the sRGB color space is not exactly a power function —you can read the details here in Wikipedia— but numerically speaking it is very close to the function we saw above with a 2.2 gamma.
At the darkest input values, the correction adds rapidly growing amounts of brightness to the input. For example, a brightness of 0.1 is corrected up to 0.35 (+0.25), and the added amount keeps growing until the input reaches 0.236, which is mapped to 0.52 (+0.28). From there, the brightness addition gradually decreases; for example, a brightness of 0.8 is mapped to 0.9 (+0.1).
Using Iris we can correct the gamma of our first image. Based on the metadata in our intermediate files ("raw_r.fit", "raw_g.fit", "raw_b.fit"), Iris will convert the pixel color values to the [0,1] brightness range (required as input by the Gamma Correction model), considering they are 14-bit values with a maximum of 16,383 ADU.
To follow how the gamma correction is applied, we hover the mouse cursor over the gray patch (the middle one behind the carrot) and find the pixel values are around (1516, 1530, 1527).
When the Gamma Correction is applied, the model dictates these values should change to ((1516/16383)^(1/2.2)×16383, (1530/16383)^(1/2.2)×16383, (1527/16383)^(1/2.2)×16383) = (5553, 5576, 5571)
To apply the Gamma Correction we can use the command shown below or we can use the menu option "View > Gamma adjustment...". We choose the command, because it allows better precision when specifying the gamma value.
>gamma 2.2 2.2 2.2
After running the gamma command, we hover the mouse cursor over the gray patch again and indeed find values around what we expected (5553, 5576, 5571), which shows the image data has been transformed the way we said. Also notice how we use the value 2.2 in the command, even though —as we saw— the Gamma Correction raises the brightness to the power 1/2.2.
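We can double-check these numbers with a short calculation (a Python sketch of the same power-law model; this is not how Iris implements it internally, and the function name is ours):

```python
FULL = 16383.0  # 14-bit full scale

def gamma_correct(adu, gamma=2.2):
    """Normalize to [0, 1], apply the power 1/gamma, rescale to ADUs."""
    return (adu / FULL) ** (1 / gamma) * FULL

patch = (1516, 1530, 1527)
print([round(gamma_correct(v)) for v in patch])  # [5553, 5576, 5571]
```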
For a fair display of the image we have to set (16383, 0) in the Iris Threshold control, as shown in the following image. The image is much better now, but looks dull due to the lack of color saturation and contrast.
We can give the final touches to our image by enhancing its color saturation and brightness contrast using the Iris menu options "View > Saturation adjustment..." and "View > Contrast adjustment...":
After these adjustments we finally get the following image.
Compared with our previous results, this image is much better. However, the colors are not right yet; for example, the spinach leaf has a very unsaturated green color. But as a first try, with quite a few steps, it is almost acceptable.
In the picture below you can move the slider and compare our image at pixel level with the one developed with Lightroom (LR). To allow a pixel-by-pixel comparison I have enlarged our image by a factor of 2. By default, LR applies a sharpening filter to the image. Considering that, to allow a fair comparison, I have also applied some sharpening to our image using Photoshop (Amount: 78%, Radius: 2.4, Threshold: 0).
Comparing the pictures side by side, there is —as expected— much more detail in the LR version. The colors in the LR version are also better, though a little bit over-saturated. In the case of the carrot in the background, the LR version is too reddish and the color in our version looks more natural.
At the beginning of our first development try, we said we were doing it wrongly. We said that because we treated the RAW image colors as if they were sRGB colors: we put the raw pixel color values directly into the sRGB image. In proper words, the problem is that the RGB colors from the raw image are not in the same color space as those of the sRGB image we want to create. We will see that more clearly after some definitions.
I think it is funny to realize that colors are just a human invention; they don't really exist as such. Colors are a human categorization of visual experience.
We call light the electromagnetic radiation band between approximately 400 and 700 nm (nanometers: 10^-9 m) of wavelength: the visible spectrum. The light that reaches our eyes has a given mix of relative power at each wavelength in the visible spectrum, called its Spectral Power Distribution (SPD). When the SPD has a dominant power presence of waves close to 450 nm we call it blue, when the dominant waves are close to 515 nm we call it green, and so on for each color we can see.
The RGB way of representing a color as a mixture of red, green and blue values is one of many existing color models. A color model defines a way to uniquely represent a color through the use of numbers. As an analogy, think about how to represent a position in a plane: there are the Cartesian and the Polar coordinate systems, both models to uniquely specify the position of a point in a plane by the use of numbers. In the same sense, there are many models to represent colors, all of which require three values to represent a color. For example, there are the XYZ, xyY, Lab, Luv and CMY color models; RGB is just another color model.
Some color models use absolute coordinates to specify a color, in the sense the coordinates in the model are enough —without any additional information— to uniquely specify a color. However, other color models use relative coordinates: they specify colors as the mix of other (primary) colors (e.g. RGB —Red, Green, Blue— and CMY —Cyan, Magenta, Yellow— color models). Because of that, those relative color coordinates do not uniquely specify a color, their exact meaning depends on the absolute color of the primaries —additional information!—. To get the absolute coordinates from the relative ones, we need the absolute coordinates of the primaries.
The names of all the aforementioned color models are acronyms of the three values used for the color representation: obviously, three coordinates (or three degrees of freedom) are enough to uniquely specify any color. The CMYK color model, which uses coordinates relative to cyan, magenta, yellow and black, has 4 color components just for practical reasons: printing with the addition of black ink has practical advantages over using just the primaries of the CMY model (without the K), which is nevertheless enough to unambiguously specify the same colors (assuming the same primaries).
A color space is the set of all the colors available in a given color model. In other words, the set of colors having a valid representation in that model.
We speak of a color space because a color requires three coordinates to be uniquely specified, so we can visualize the colors in the model as a 3D representation of the model coordinates. In that context, the set of valid colors is usually represented by a solid object (occupying a space) in the model's coordinate system.
Some authors distinguish between absolute and relative (or non-absolute) color spaces. When the color model has absolute coordinates, or when it has relative ones but there is a way to map them to absolute coordinates, they say the color model has an absolute color space. If the color model uses relative coordinates and there is no way to get the absolute coordinates —or there is one but it isn't used— they say the model has a relative color space.
This differentiation is useful: in a relative color space we can only know the relationships between the colors in the space (e.g. 'this color is greener than that one'), but we cannot assess their characteristics for comparison with colors outside the color space, which is something we can do in an absolute color space.
For some authors, what we referred to as an absolute color space is the only accepted color space, and a relative color space is not a color space at all.
Over-simplifying the whole story, the color values and models we actually use are closely related to two "surveys", one with 10 observers and another with 7 observers, conducted independently by William Wright and John Guild in the late 1920s. These "surveys" were based on the question "Give me the RGB combination that matches this color". I find it fascinating that, despite some technical details being different in their experiments, and with so few observers, their results are almost identical.
In the image above, the RGB coordinates are normalized so that r+g+b = 1; this way, they are the fractions of red, green and blue (summing to 1) required to match each (wavelength) color. Guild's values are based on measurements at only 36 wavelengths (approximately every 10 nm); the values at each 5 nm were obtained by interpolation and extrapolation using a graphical technique. Notice there are negative values! I promise to post details about that data in the future.
Their data was averaged, normalized in more than one sense, and finally —in the quest for some desirable properties— transformed to give birth not only to the CIE 1931 standard RGB color space, but also to the CIE 1931 XYZ model and its sibling, the xyY model. These models represent the whole human vision color space.
Notice that between the original collected data and the final published standards there are only mathematical transformations; no new data was added. Many other studies have followed, but in the literature and many web sites you'll often find the CIE 1931 XYZ color space making an appearance.
As the universe of colors needs three values to represent it, we can "see" those models in a 3D space and find the color universe as a contiguous solid object. Bruce Lindbloom's web site has nice and interactive representations of the universe of colors.
In the above image, the points on the surface correspond to the most intense (saturated) colors, whilst those close to the core (where x = y = z) are the neutral ones, from black at the origin to white at the farthest point from the origin. The wall closer to the XZ plane has the red colors. The opposite one, close to the YZ plane, has the green colors. The surface close to the Z axis has the blue colors.
When referring to colors, it is useful to realize each of them can be described through two main components: Luminance and Chromaticity. For example, imagine an image projector showing a scene on a projection screen. If we vary the intensity of the light beam, or move the projector closer to or farther from the screen (while keeping the image in focus), we will clearly see the brightness of the projected image vary: the colors will have a changing luminance while another color quality stays equal, and that quality is the Chromaticity. To be technically correct, we will add that we are assuming the projector beam and the projection screen have the same white color.
As we know, the color specification has three degrees of freedom —in other words, three values are needed to specify a color uniquely— so our decomposition into only two components (Luminance and Chromaticity) seems to be missing something. What happens is that chromaticity is a two-dimensional entity: it is specified with two coordinates. In that way, we uniquely specify a color using three coordinates: two for the chromaticity and a third for the luminance.
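This decomposition is exactly what the CIE xyY model does: x and y carry the chromaticity and Y the luminance. A hedged sketch of the standard conversion (function names are ours):

```python
def xyz_to_xyy(X, Y, Z):
    """Split CIE XYZ into chromaticity (x, y) plus luminance Y."""
    total = X + Y + Z
    return X / total, Y / total, Y

def xyy_to_xyz(x, y, Y):
    """Recover CIE XYZ from chromaticity (x, y) and luminance Y."""
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    return X, Y, Z

# Scaling XYZ (i.e. changing only the luminance) leaves (x, y) untouched,
# which is the formal version of the projector example above:
x1, y1, _ = xyz_to_xyy(0.4, 0.5, 0.3)
x2, y2, _ = xyz_to_xyy(0.8, 1.0, 0.6)
```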
In the above CIE xy Chromaticity space, the points on the horseshoe-shaped border curve —with a blue colored scale from 400 to 700 nm— correspond to the color of each wavelength indicated by the scale; they represent the deepest or purest colors, those of monochromatic light. This curve is called the spectral locus.
The bottom straight border is the locus of purples. These colors cannot be produced by a single wavelength; they represent mixes of light with wavelengths at both ends of the visible spectrum, so they are called "non-spectral".
A point on the way from the border of the space to the position (1/3, 1/3) represents a color of mixed wavelengths, and the color at (1/3, 1/3) itself represents the model's white reference.
A Color Gamut is the complete subset of colors that can be represented in a given circumstance, such as within a color space, the colors a camera "can see", or those an output device can show.
A color gamut is usually represented for comparison with another particular gamut or with the whole human vision gamut, and —as we saw when discussing color spaces— this requires the color gamut to be expressed in absolute terms.
It is very interesting and informative to compare two gamuts in a 3D space. However, that requires an interactive mechanism to examine the objects representing the gamuts, or many views, to allow the reader to see the differences.
To avoid the practical difficulties of showing gamuts in 3D spaces, a usual representation is the gamut projection on the CIE xy chromaticity plane of the CIE xyY color model. However, in this plane the comparison is not really between gamut spaces but between the chromaticities in them. The advantage of this form of comparison is that the gamuts can easily be seen in the context of the whole human vision color gamut.
The gamut comparison in a chromaticity plane can be misleading because a third dimension is missing. At any given level of luminance, only a subset of all the projected chromaticities is in the gamut. Given the absolute coordinates of a color, its xy coordinates may seem to fit inside a gamut whilst the third coordinate shows the point is really out of the gamut.
White is the color we perceive when we see a material —one that does not emit light by itself— reflecting with equal relative power all the light wavelengths striking it, and the white point of an illuminant is the chromaticity of a white object under that illuminant. In simple words, white is the color of the illuminant light, and as with any color, its precise definition must be given through its SPD.
Every color space, to be fully specified, requires a white point reference or —in other words— a standard illuminant. For example, for the CIE 1931 XYZ color space, the standard illuminant is called "E", which has Equal power for all the wavelengths in the visible spectrum. In a diagram of power vs. wavelength this is a flat line. This is a theoretical spectrum, and it was chosen by the CIE looking for some desired numerical properties in the color space.
Daylight is composed of sunlight and skylight. Depending on the balance of these components, daylight can be relatively more reddish or more bluish. Reflecting this, there is no single CIE standard illuminant characterizing daylight illumination, but a series of them ("D" for Daylight).
The CIE illuminant D65 is intended to portray average noon daylight coming from diffuse skylight without sunlight, similar to an overcast sky. According to the CIE, it "is intended to represent average daylight". The illuminant D55 portrays average sunlight with skylight, and the illuminant D50 characterizes direct sunlight. There is also the illuminant D75, portraying "North Sky" light.
The CIE D illuminants are very commonly used; for example, the Adobe RGB and sRGB color spaces use D65 as their reference white. On the other hand, the ISO 3664:2009 standard "Graphic Technology and Photography: Viewing Conditions" specifies the CIE illuminant D50 as the light source for viewing and assessing color in the graphic technologies and photography.
For light-emitting displays, the "illuminant" is the color of their white representation, which for RGB displays is the color shown when all the RGB channels are at 100% intensity.
From a realistic point of view, the CCT characterization of the daylight SPD at a specific site depends —among other factors— on the date of the year, the time of day, the state of the sky, the site altitude, pollution, and the location of the site in the world. In other words, just because the illuminant was direct sunlight doesn't mean CIE illuminant D50 will exactly describe that illumination.
To precisely specify a light source, we need to characterize its SPD. For example, there is data describing the CIE A and CIE D65 illuminants with their relative intensity at each wavelength. But for practical purposes this is too cumbersome. Fortunately, there is a mathematical model for a theoretical light-emitting object called the ideal Planckian black body radiator, which radiates light with a spectrum comparable to that emitted by natural sources.
The SPD of this ideal black body is a function only of its temperature. This way, an SPD is precisely specified by just giving a temperature.
In a similar way as we see glowing colors in an incandescent metal object, the ideal black body —with increasing temperatures— shows colors from reddish/orange ("warm") hues to more neutral ones, continuing to clear-blue ("cool") hues at the highest temperatures. Notice how —ironically— "warm" colors correspond to lower temperatures and, conversely, "cool" colors correspond to higher temperatures. The temperature is customarily given in kelvins (K). In a chromaticity space, the set of points representing the color of a black body at different temperatures falls on a curve called the "Planckian locus".
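The SPD itself comes from Planck's law, which is short enough to sketch directly. The example below only illustrates the "warm vs. cool" behaviour described above; the temperatures and wavelengths chosen are ours:

```python
import math

H = 6.62607015e-34  # Planck constant, J*s
C = 2.99792458e8    # speed of light, m/s
K = 1.380649e-23    # Boltzmann constant, J/K

def planck_radiance(wavelength_nm, temp_k):
    """Spectral radiance of an ideal black body (Planck's law)
    at the given wavelength (nm) and absolute temperature (K)."""
    lam = wavelength_nm * 1e-9
    return (2 * H * C**2 / lam**5) / math.expm1(H * C / (lam * K * temp_k))

# A ~3000 K body radiates much more red (650 nm) than blue (450 nm),
# while at ~6500 K the balance tips the other way:
warm_ratio = planck_radiance(650, 3000) / planck_radiance(450, 3000)
cool_ratio = planck_radiance(650, 6500) / planck_radiance(450, 6500)
```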
The temperature at which a Planckian black body radiator's color most closely matches that of an illumination source is called the correlated color temperature (CCT). Light sources with chromaticities falling exactly on the Planckian locus have a true color temperature, while those with chromaticities near the locus have a correlated color temperature.
Often, the White Balance control interface in photo and image editing software is based on navigation over this space. In this sense, the control interface contains a temperature slider to walk along the Planckian locus —to White Balance the colors resulting from a scene under an illuminant with a given CCT— and a tint slider to walk perpendicular to the Planckian locus, affecting the relative intensity of the green color component.
However, as this control is used to White Balance the effect of an illuminant with the CCT specified by the temperature slider value, the effect of this slider on the image is to tint it with the opposite (complementary) color. This way, an illuminant with a relatively low CCT —with a reddish/orangish color— is balanced with a blue color cast, while illuminants with a high CCT —with a bluish color— are balanced with a yellow color cast. For this reason, the temperature slider is shown with a graduation from blue to yellow, showing the color cast that will be used to White Balance the image.
The tint slider is graduated from green to —its complementary color— magenta, showing again how it will affect the image.
For example, the White Balance controls of Adobe Lightroom, DxO Optics Pro and RawTherapee —in the image above— are based on navigation along the Planckian locus as we described before. RawTherapee has an additional slider based directly on the relative intensities of blue and red, so there is an option to everyone's taste. The Capture NX2 White Balance control shows exactly the factor that will multiply each of the red and blue color components.
There are standards specifying bins along the CIE D and Planckian loci, normalizing tolerances for when a light can be characterized with a specific CCT. For example, ANSI C78.377A standardizes the description of tints in LEDs and Solid State Lighting.
The designations of the CIE series D illuminants are based on their correlated color temperature (e.g. D65 stands for a 6504 K CCT).
In the RGB model, each color is described as a mix of red, green and blue primary colors, where each primary can vary continuously from 0 to 100%.
When there is the same amount of the three primaries (R = G = B), the resulting color is neutral, from black at 0% to white at 100%.
When a spot on an RGB display has one of its primaries at 100% and the others at 0%, the spot will show the purest or deepest possible color corresponding to that primary. For example, (r:0, g:100%, b:0) will show the deepest green possible on that display. As the display can only show colors based on combinations of its primaries, the depth of all the colors it can produce is limited by the depth of its primaries.
The RGB color model is like a "template" for a color model: the absolute specifications of the primary colors —and also the white reference, as for any color space— are required to have an absolute color space. That's why there is one RGB color model but there are many RGB color spaces: Adobe RGB, Apple RGB, CIE RGB, ProPhoto RGB, sRGB, etc.
Technically speaking, the chromaticity of the colors an RGB output device can produce is limited by the chromaticities of its primary colors. This can be clearly seen in the chromaticity space shown in the image above. The surface of each triangle represents the set of chromaticities the corresponding color space can represent.
The vertices of each triangle correspond to the absolute chromaticities of the primaries defined by the represented RGB color spaces. The borders of the triangles correspond to the deepest colors available in each space.
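Since each gamut projects to a triangle in the xy plane, checking whether a given chromaticity falls inside it reduces to a same-side point-in-triangle test. A sketch (the sRGB primary and D65 white chromaticities below are the commonly published values):

```python
def cross(o, a, b):
    """2D cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_gamut_triangle(p, r, g, b):
    """True if chromaticity p = (x, y) lies inside the triangle whose
    vertices are the primaries r, g, b (same-side sign test)."""
    s1, s2, s3 = cross(r, g, p), cross(g, b, p), cross(b, r, p)
    return (s1 >= 0 and s2 >= 0 and s3 >= 0) or (s1 <= 0 and s2 <= 0 and s3 <= 0)

# sRGB primary chromaticities (x, y), as commonly published:
R, G, B = (0.64, 0.33), (0.30, 0.60), (0.15, 0.06)

inside = in_gamut_triangle((0.3127, 0.3290), R, G, B)   # D65 white point
outside = in_gamut_triangle((0.15, 0.80), R, G, B)      # a deep spectral green
```

Remember the caveat above: this only tests chromaticity, not the full 3D gamut.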
The sRGB color space is a very well-known standard, which is assumed by many applications —including web browsers— as the color space of an image when it lacks the corresponding information describing its real color space.
From a practical point of view, we must consider that the RGB displays in use nowadays (e.g. CRT monitors, LCD or LED flat screens, cell phones, tablets, iPads, TVs, Blu-ray discs and so on) use 8 bits per RGB channel (if not less, as all modern analogue and even most digital video standards use chroma sub-sampling, recording a picture's color information at reduced resolution).
This means those RGB displays can only show 256 intensity levels for each primary color (that is, 256³ ≈ 16.78 million combinations). It doesn't matter if you are watching on your computer screen an image coming from a file with 16 bits per channel (e.g. a 16-bit ".tif" file): for display purposes, the image data will be linearly quantized to a maximum of 256 values per channel.
For example, in the development exercise we are doing, the specular reflections on the sky-blue dish show in the Iris status bar RGB coordinates around (16383, 15778, 16383). With a threshold control set to (16383, 0), this is shown on the screen as (16383, 15778, 16383) × 255/16383 = (255, 245, 255).
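That quantization takes only a couple of lines to reproduce (truncating the fractional part, which matches the figures above):

```python
def to_8bit(value, max_in=16383):
    """Linearly quantize a raw pixel value in 0..max_in down to 0..255,
    truncating the fractional part."""
    return value * 255 // max_in

# The specular highlight on the sky-blue dish:
display = [to_8bit(v) for v in (16383, 15778, 16383)]  # -> [255, 245, 255]
```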
From a practical point of view, a color profile is a file associated with an input or output device, containing the information required to transform —absolute— color coordinates to/from the device color space. In the first case it is referred to as an input color profile and in the second as an output color profile.
Let's suppose we have a scanner as input device and a color printer as output device. At a low level, the scanner's and printer's particular native color spaces do not match; it is required to translate colors from one space to the other. To make things simpler and get a more useful solution, the idea behind an input color profile is to convert the input device color coordinates from its particular color space to a universally known one, such as XYZ or Lab, while the output profile takes the colors from that space to the output device's native space. This is like two people with different native languages using two interpreters: one translating from the first native language to English, and another from English to the other native language. However, color profiles are data —information specifying how to make those translations— they are not code or software.
Only with this information can an application accurately represent colors in/from the device. When this information is not available for a screen (an output device), software applications may assume the device color space is equal or at least very close to the sRGB color space; so in this circumstance —when the device has no known color profile— the accuracy of the colors in/from the device depends on how close its color space is to sRGB.
For general use, there are mainly two kinds of color profiles: the Adobe DCP (DNG Camera Profile) and the ICC color profiles. As you might guess, the DCP is an input profile, while ICC profiles can be prepared for input or output devices.
Unfortunately, no one can be told what the Matrix is.
You have to see it for yourself.
Very often, when working at a low level with colors, the need to use a matrix arises. This matrix basically represents the change from one reference color model or space to another, which in generic terms involves the transformation of a point's coordinates from one system to another. I know, this starts to sound like a complicated mathematical subject, but don't worry: we will deal with it here as a notational subject; we just want you to know where everything comes from.
Let's suppose we have the (x, y) coordinates of a point P with respect to the system XY and we want the coordinates (u, v) with respect to the system UV.
If you look at the diagram above, you will notice we can compute u by adding and subtracting the lengths of some line segments. In particular, we can get u this way:
With the help of trigonometric functions we can find the length of the line segments we need to compute u. For example we can say:
We could continue and get the formula for u, but instead, we can look at those formulas at a higher level of abstraction and see that, in the end, u should be a linear function of x and y, like u = m1·x + m2·y + c —for some m1, m2 and c— after all, x and y are the only lengths we have, right? Besides, the coordinate systems XY and UV have the same origin, so when (x, y) = (0, 0) also (u, v) = (0, 0), which gives 0 = m1·0 + m2·0 + c; in other words, c = 0. If we follow the same reasoning, we will get a similar formula for v, and in the end we will have something like:
In the case of colors, we use points with three coordinates, so to transform the (x, y, z) coordinates the formulas will be like:
This is the kind of formula that arises very often when transforming color coordinates from one space or model to another, so it is worth simplifying its notation by making explicit the matrix operation inside it:
In the above formula, as M is a matrix, the multiplication by xyz implicitly denotes a matrix multiplication, not a scalar one. This kind of matrix is often called a Color Matrix. Here you will find this kind of formula used to transform a color from CIE XYZ coordinates to sRGB linear coordinates, and here it is used for chromatic adaptation.
When you have the matrix for transforming from xyz to uvw, you implicitly have the formula for the other way around, from uvw to xyz. You just have to invert the matrix on the other side of the formula:
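Both directions can be sketched with a plain 3×3 multiply and an adjugate-based inverse. This is only an illustration; the sample matrix is the commonly published XYZ-to-linear-sRGB matrix, rounded to four decimals:

```python
def mat_vec(M, v):
    """Apply a 3x3 color matrix M to a 3-component color vector v."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def mat_inv(M):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)
    adj = [[e*i - f*h, c*h - b*i, b*f - c*e],
           [f*g - d*i, a*i - c*g, c*d - a*f],
           [d*h - e*g, b*g - a*h, a*e - b*d]]
    return [[x / det for x in row] for row in adj]

# Commonly published XYZ -> linear sRGB matrix (rounded):
M = [[ 3.2406, -1.5372, -0.4986],
     [-0.9689,  1.8758,  0.0415],
     [ 0.0557, -0.2040,  1.0570]]

xyz = [0.25, 0.30, 0.20]
rgb = mat_vec(M, xyz)                # XYZ -> linear sRGB
back = mat_vec(mat_inv(M), rgb)      # inverse matrix recovers XYZ
```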
The matrix multiplication is not commutative, which means that in general "A×B" is not equal to "B×A". To specify the ordering of matrix multiplication in words; "pre-multiply A by B" means "B×A", while "post-multiply A by B" means "A×B".
In a photo taken of a white object (e.g. a sheet of paper) under white light (e.g. outdoors, directly illuminated by the sun on a clear day), we would expect the RGB values in the raw image file to have the same coordinate values: R = G = B. After all, "it uses the RGB model", where the neutral colors have equal RGB color components. Well, that really doesn't happen at all.
If you look at the histogram of the raw image data for a white object photographed as described in the previous paragraph, you will find something like this:
The red, green and blue sensor photosites have very different relative sensitivities, and this is best noticed when looking at a white object in the photo. The most sensitive is the green photosite, then the blue and finally the red. It is as if the camera had a greenish-cyan lens filter: looking through it, a white object looks greenish, while a red-wine color would have equal RGB values in the raw data.
This is not a matter of sensor quality. If we look at the top of the camera sensor ranking at DxO Mark, we will find this characteristic pretty much alike in all sensor brands.
The following table shows data from that site: the multipliers that must be applied to the red and blue raw values in order to reach the level of the green ones, for pixels corresponding to a white object under the CIE illuminant D50.
| Brand | Model | Red raw | Blue raw |
If you look at the histograms above, you can imagine increasing exposure as shifting these RGB peaks to the right. In such a case, the green peak will —eventually— be the first to hit the raw upper limit (in this case 16,383, the right end of the histogram scale) and won't be able to shift any further to the right. All the "overflowing" green values will be kept at the upper limit of the scale, and as a consequence, all the pixels with this truncated green value will not have the right relative balance among their RGB components to correctly represent the photographed source color.
If the raw conversion process doesn't take the right precautions, the spots on the image corresponding to this highlight clipping will show the relative absence of the green component, and they will look pinkish or magentish. This is because magenta and green are complementary colors, so the color balance shifts toward magenta when the green component is decreased.
In our image development, this kind of highlight raw clipping occurs in the specular reflections on the sky-blue dish. Before color balance, they have a raw RGB value around (16383, 15779, 16383), which is shown on the screen (multiplied by 255/16383, because —remember— RGB displays use 8 bits per channel) as (255, 245, 255): technically a little magentish, but so bright that it almost looks white. After the white balance adjustment, that value changes to (1, 0.5165, 0.7226)·(16383, 15779, 16383) = (16383, 8141, 11845), which on the screen results in (255, 127, 184), looking clearly reddish/pinkish as shown in the following image.
To correct this problem, we have to white balance the image raw channels, find the minimum among the maximum channel values —min(max(red), max(green), max(blue))— and clip the channels so that, in the end, all of them have the same maximum pixel value.
What we are doing with this procedure is dealing with the raw channels as if they had pixel values in different units; by doing the white balance we convert them to the same pixel value unit. After this, we can compare the channels and, knowing the maximum values correspond to the highlights, we clip them to the smallest one in order to get the same maximum value in all the channels, achieving this way neutral highlights.
In the image we are processing, the maximum raw values after white balance are (16383, 8141, 11845), so we must clip the red and blue channels to 8141.
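The arithmetic of this step is easy to check. In the sketch below, the multipliers and raw maxima come from the text; note that the computed green maximum lands a few ADU away from the 8141 quoted above, because the published multipliers are themselves rounded:

```python
wb = (1.0, 0.5165, 0.7226)        # white-balance multipliers from the text
raw_max = (16383, 15779, 16383)   # per-channel raw maxima before balance

# White-balanced per-channel maxima:
balanced = [m * v for m, v in zip(wb, raw_max)]

# The clipping level is the smallest of the balanced maxima (the green one):
clip_level = min(balanced)
```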
We will use the Iris command clipmax <old top> <new max>, which affects the pixels with values greater than <old top>, assigning them the <new max> value. In our case, we want the values above 8141 to take the value 8141, which corresponds to clipmax 8141 8141.
The Iris command mult <factor> multiplies all the image pixel values by the given <factor>. We will use this command to white balance the image, then we will clip the pinkish highlights, and finally we will reverse the white balance to get our raw channels back in the original camera raw domain. This will allow us to work later with different approaches to the subject of different raw channel scales (caused by the different RGB photosite sensitivities) without the issue of pinkish highlights.
** Processing the raw Red channel **
>load raw_r
>mult 1
>clipmax 8141 8141
>mult 1
>save raw_phc_r

** Processing the raw Green channel **
>load raw_g
>mult 0.5165
>clipmax 8141 8141
>mult 1.9361
>save raw_phc_g

** Processing the raw Blue channel **
>load raw_b
>mult 0.7226
>clipmax 8141 8141
>mult 1.3839
>save raw_phc_b
When we multiply the raw channels by (1, 0.5165, 0.7226) we are white balancing the image, and when we multiply them by (1, 1.9361, 1.3839) we are multiplying by the reciprocals of the white balance factors, reversing that white balance: (1, 1.9361, 1.3839) = 1/(1, 0.5165, 0.7226).
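The whole mult/clipmax/mult sequence can be mirrored per channel in a few lines of Python — a sketch only, working on plain lists of pixel values rather than real raw files:

```python
def fix_pinkish_highlights(channel, wb_factor, clip=8141):
    """White balance a raw channel, clip its balanced highlights, then
    undo the white balance so the data stays in the camera raw domain
    (the same mult / clipmax / mult sequence used in the Iris session)."""
    balanced = [v * wb_factor for v in channel]
    clipped = [min(v, clip) for v in balanced]
    return [v / wb_factor for v in clipped]

# A few hypothetical green-channel pixels, the last one a blown highlight:
green = [1200, 8000, 16383]
fixed = fix_pinkish_highlights(green, 0.5165)
# Only the blown value is pulled down; the others come back unchanged.
```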
Now we have our raw channels saved in the files (raw_phc_r, raw_phc_g and raw_phc_b), free of pinkish highlights (the "phc" in the file names stands for "pinkish highlight corrected").
Notice how we must white balance the raw channels as input to our procedure while building the final image. Once we have the sRGB image, we can white balance it —again— this time in the sRGB space and as part of the final touches; but the different raw photosite sensitivities must be handled with white-balanced channels in the camera raw space to avoid the appearance of color artifacts.
Another interesting thing to notice is how —in the previous procedure— we clipped the red and blue channels to a maximum of 8141 ADU, even though they had maximum pixel values of 16383 and 11845 (respectively); this seems a harsh clipping! We can use ImageJ to find out the image areas affected by that clipping. The following images show that only the specular highlights are clipped; there is no loss of significant detail in the image.
Sometimes it is overlooked that image areas with shadow clipping can suffer problems analogous to those we described for highlight clipping: with decreasing exposure, the peaks in the histograms move to the left, and it is possible that only the red channel gets clipped to 0. In a similar way to highlight clipping, if the raw converter does not take the required precautions, the dark areas —with this shadow clipping— will show a color shift toward cyan, the complementary color of red, or will look greenish if both red and blue are clipped.
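A tiny simulation shows the effect (the pixel values are made up for illustration): after black-level subtraction, under-exposure can push the least sensitive channel below zero first, and naively clipping it at 0 shifts the pixel toward cyan.

```python
def clip_shadows(rgb):
    """Clip negative raw values to 0, as a naive converter would."""
    return tuple(max(v, 0) for v in rgb)

# A dark pixel whose red channel went below zero after black-level
# subtraction; clipping removes red, leaving a green/blue (cyan) cast:
dark = (-40, 110, 35)
shadow = clip_shadows(dark)  # -> (0, 110, 35)
```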