[[!meta Error: cannot parse date/time: ]] [[!meta Error: cannot parse date/time: ]]
In a previous article we introduced a "A Simple DSLR Camera Sensor Noise Model" and then we used this model to analyze a camera sensor raw noise in "Use of Linear Regressions to Model Camera Sensor Noise". Those articles deal with the theoretical and practical basis to analyze noise from raw image samples.
The input to that analysis are samples of photographs taken at different exposition levels, from which we compute the variance and mean of the photosite readings. Until now, we have prepared the samples and computed their variance and mean using Iris Software. But that workflow was very labor intensive, error prone, and inhibit us to further analysis like the study of the covariance between channels in the same sample because Iris software does not bring that information. Furthermore, Iris is a windows only software, which prevents users with other OSs to collect noise data using this tool.
To solve those issues, we have prepared —and still is a work in progress— a R package called
imgnoiser (from IMaGe NOISE R package). In this article, we introduce you to its usage, which automatizes the acquisition, analysis noise statistics, letting us to focus in the study, instead of devoting a lot of time to collect the data or to prepare code to study that data.
Without the use of this package I could spent a whole morning just collecting the noise statistics. Now, with the help of
imgnoiser it takes me only few minutes and brings additional and more accurate data than Iris.
You can install the package from github using:
imgnoiser repository wiki, in github, you will also find the package vignettes including the list of the
vvm classes components, along with their functions and its parameters.
If you install this package, you will have in your own R session, detailed documentation of the package functions, their parameters, their default and possible values, the package options and so on. All of that is, or eventually will be, available in your R help system.
This article explains general concepts and the package workflow usage with "real life" examples. In that sense, we will mention the main functions, their main parameters and so on, but the minute technical detail of all the functions and their arguments is not the focus of this post; please read about that in the R help system or the package vignettes.
If using this package you find any issue please file it here.
In the R code excerpts, the function output or messages are prefixed with
#> while the real comments are prefixed just with
- Reproducible Research
- Collecting the Image Samples
- The Package Options
- Getting the Raw Image Samples
- Brief Introduction to the Usage of R6 Classes
- Digesting the image Samples
- Digesting Samples with the hvdvm Class
- Usage Common to the vvm and hvdvm Classes
This package is built having in mind the reproducible research concept. To make a research reproducible, all the code and data to reproduce the research figures must be delivered, including a data dictionary and an explanation of the complete software development environment, its configuration and the whole set of instructions which generated each intermediate or final tables and images.
The focus of
imgnoiser is to factorize most of that software environment, configuration, data definitions and conventions in one single point, to easily share and reuse noise analysis papers without the need to include, define and explain repeatedly the software environment and instructions to reproduce most of the results.
Before the existence of the
imgnoiser package, after taking the photographs for the noise analysis, we cropped from them the region of our interest (ROI) using Iris Software. Now we can use dcraw. Dcraw is a very well known open source and free software product, with its own wikipedia entry, capable to develop raw image files from most of the DSLR camera brands an models. Many open raw photo decoders use dcraw under the hood. For example (quoted from the dcraw web page), ACDSee, Adobe Photoshop, Picasa from Google and Rawtherapee among many others.
Dcraw is developed and maintained by David Coffin using Ansi C code, this make it very portable to any OS. I am not very familiar with Linux, but I understand it can be compiled in just one command line, please check the dcraw web site, which explains that.
You can find in the web the way to get the binaries in other OSs. For example, this page explains how to compile dcraw for OS X Mavericks, and in this other one you can find a link to download the windows executable for 32 and 64-bits systems, this is provided by Axel Rietschin, which is a trustworthy source, and is the version I will use for this post in a Windows 7, 64-bits system.
The way to take the photographs to get the image samples is explained here and also in a vignette in the
imgnoiser package that you can read here. We will assume you have just taken the photographs and you want to know how to handle them in order to process the samples with
imgnoiser package contains a set of options with factory settings you can change to customize the package at your taste.
For example the plotting color palette
'color.pallette' and the default RGGB channel labels
'RGGB.channel.labels' are among those options. Very often, they are used by the package functions as default values. This way you can fit those options and use less arguments while calling those functions.
imgnoiser.option() function without arguments, you will get a list of all the package parameters. If you call it with the name of an option you will get its value, or
NULL if it doesn't exists. If you call it with an option name and a value you will set that option with that value.
# Get the complete list of package options imgnoiser.option() #> $fit.model.family #>  "lm" #> #> $fit.model.data #>  "std.var" #> #> $fit.model.name #>  "standard" #> #> <Option list truncated here> # Get the current confidence level option imgnoiser.option('conf.level') #>  0.95 # Set the confidence level to 98% imgnoiser.option('conf.level', 0.98) # Check the new value imgnoiser.option('conf.level') #>  0.98
You can save the package options and load them in a different R session. Use the
imgnoiser.options.save() function to save the options and
imgnoiser.options.load() to load them. Check the R help system for the details.
Dcraw is mostly used to process the raw file and get the final
.tif image. However, we want it to extract the raw image without any change of this raw data in the process. To get that we just have to use the
-D -4 command options. For example, I will open a command line window and set the folder containing the
.NEF (Nikon raw format) photographs I want to process, as the current working directory: you can shift+right+click on the folder in the Windows file Explorer and choose Open command window here from the popup menu that appears. Then I use:
for %f in (*.NEF) do dcraw -D -4 %f
Previously, I have set the path to the dcraw executable in my system PATH environment variable. If you don't wan't or don't know how to do that, just prefix
dcraw with the path to the 'dcraw.exe' file, as in
... do C:\path\to\dcraw -D ....
We can't just use
dcraw -D -4 *.NEF because dcraw does not understand Windows wildcard characters like
*. If you want to put that command line in a batch file, you must to use
Now my folder with the
.NEF files also contains
.pgm ones, containing the raw images. The
.pgm format is an open format and stands for Portable Gray Map. In opposition to
.NEF, this is a format that
imgnoser can read.
For each processed
.NEF file now there is a
.pgm one with the same base name as in:
05/11/2013 12:39 p.m. 16,331,695 _ODL5762.NEF 08/01/2015 05:58 p.m. 32,458,899 _ODL5762.pgm
Now we should find the region of interest (ROI) in the images to later crop them from each photograph. To do that we should pick a "well exposed" image, this means one with the image data in the third quartile along the horizontal axis in the image histogram. We are looking for the area of the image most evenly lit, otherwise we will measure variance caused by shadows in the image instead of noise variance.
Remember the raw image is composed by only one layer (there are not red , green and blue layers as in RGB images), like in a gray image, where the pixels follow a 2x2 color filter (channels) array pattern. First we will separate those four channels using the
path <- 'path/to/samples' cfa <- imgnoiser::split_channels('_ODL5719.pgm', path) str(cfa) # List of 4 # $ ch1: int [1:1640, 1:2474] 962 986 948 934 949 1008 944 959 984 961 ... # $ ch2: int [1:1640, 1:2474] 1940 1998 1983 1984 1972 1980 1975 2033 1993 1992 ... # $ ch3: int [1:1640, 1:2474] 1963 2015 1997 1990 1994 2035 1982 1955 2034 2027 ... # $ ch4: int [1:1640, 1:2474] 1469 1437 1472 1527 1479 1528 1477 1502 1450 1524 ...
The above list of channels, are labeled according to 2x2 channel pattern in the image starting from the image top left corner:
+-----+-----+--- | ch1 | ch2 | +-----+-----+--- | ch3 | ch4 | +-----------+--- |
Most of the cameras have the RGGB pattern array. This way
(ch1, ch2, ch3, ch4) == (red, greenR, greenB, blue), where
greenR stands for green in the red row and —naturally—
greenB stands for green in the blue row. If you are not sure if that is the case for your camera, remember that for a white target the channels with highest and similar values correspond to the green channels, then the blue and finally the red one. If you look the values in the result from the R code above, it is clear they have the RGGB pattern.
Now we will save one of the green channels using the
writeTIFF function in the
tiff package. That function expects values in the
[0,1] range. Accordingly, we will divide each pixel value in the greenR channel by
16,383 ADU which is the maximum possible in the 14-bit
library(tiff) matx <- cfa$ch2 / 16383 tiff::writeTIFF(matx, 'green_r.tif', bits.per.sample = 16L, compression='LZW') #  1
Now we have a
.tif image which we open using ImageJ Software. We will remove pixel values variance caused by noise to get the overall illumination in the image. Using the FFT bandpass filter we will keep the image features with a size above 6 pixels (menu command "Process » FFT » Bandpass Filter..."), then we will see the resulting image using the menu command "Plugins » 3D » Interactive 3D Surface Plot", with this tool we can interactively rotate and scale the rendered surface, where its elevation represents the light intensity.
It is clear the reddish area is the more evenly lit region. The holes in the surface are dirty spots in the camera sensor.
Using this 3D view tool, we will look for a region, of at least 32x32 pixels, evenly lit and without including a dirt spot. Proceeding that way, I select a rectangle as the best region, and find out this corresponds to an area of 144x114 px with the top left corner at (496,726). From this area, finally I choose a 40x40 px square (1600 pixels) at (496, 726).
Notice we are working with pixels from one channel only, whose dimensions are half the ones of the whole image. To map this ROI in the channel to the whole image we have to multiply the coordinates and the region dimensions by 2. This way we get a 80x80 pixels square with its top left corner at (992, 1452).
If you really want to find the most evenly lit area in your shots, after the bandpass filter described above, you can convert the resulting image to floating point pixel values with the menu command "Image » Type » 32-bit", duplicate that image and apply "Process » Filter » Variance" with 40 pixel radius, then take the square root to that image to get the standard deviation ("Process » Math » Square Root") and divide it by the other bandpass copy, to get the relative standard deviation. Set the brightness and contrast to a proper image presentation ("Image » Adjust » Brightness/Contrast"). Now look for the darkest and flattest area with at least 32x32 pixels, look how flat it is with the 3D viewer tool.
In other article I mentioned finding an area with the smaller variance, but the issue with that technique is that the darkest areas get a lower variances without being necessarily more uniformly lit, computing the relative standard deviation as indicated above, counterbalance that fact.
The coordinate system origin in ImageJ is the (0,0), corresponding to the pixel at the image top left corner. This means that if we choose both coordinates with even values, the samples will also have the RGGB pattern. This is very important to do not later get confused because the samples have an unexpected channel pattern. It is also important to choose an area with even dimensions, that way we will have in each sample the same quantity of pixels from each channel.
Now we have to clip this region from each of the photographs. To that effect we will use an Image Magick tool. This is also free software, available for Linux, Windows and Mac OS. To clip our ROI, from the raw images we got from dcraw, we have to execute the
mogrify command this way:
mogrify -extract <width>x<height>+<horiz pos>+<vert pos> <image file name>
<height> are the clipping dimensions, and
mogrify -extract 40x40+992+1452 _ODL*.pgm
The files resulting from this command, replace the the original ones. Instead of the set of big (32MB)
.pgm files, now I have a set of small 4KB files:
08/01/2015 09:40 p.m. 3,215 _ODL5752.pgm 08/01/2015 09:40 p.m. 3,215 _ODL5753.pgm
If you want to keep your original, big
.pgm files, perhaps to use another image area as an additional set to your samples, you can use the
path option in the
mogrify command. For example, if you add the option
-path crops the resulting crops will be saved in the crops sub-folder, leaving your original
Now we have the sample images ready to be processed by
imgnoiser, you can download them from here (134 KB). Notice that except of locating the ROI, the required activities are just a couple of command lines that take just a few seconds of processing.
Some of the
imgnoiser functions are organized in R6 classes, which means its usage is slightly different to other R packages. However, I am sure you will find this more convenient. In this section we will briefly learn the general usage of this classes. What will see is applicable to any R6 class not just those in
The first step first to be able to access to the services in a class is the creation of an instance of that class. This instance is also called class instance object or just object for short.
Typically the object creation is done through the use of the
new function. For example, the
imgnoiser package contains the
vvm class. So, we can create an instance object of the
vvm class using:
my.object <- vvm$new(<your creation parameters>)
Now, the variable
my.object contains an object of the
vvm class, which we will use to access to the class services. During the usage of this object, it will carry, inside itself, the information resulting from the functions we will call using the object. For example, lets suppose we have already processed some image samples and obtained some noise statistics about them, and now we want to fit a linear model over those statistics. At this point in time,
my.object will already contain the image samples statistics and we can fit the model just using:
my.object$fit.model(<your model fitting parameters>)
The point here is, we don't have to keep R variables holding the image samples statistics. That information is kept inside
my.object, so we can just call the
fit.model() function without the need to provide that information as arguments in that call. Also, the result of the model fitting is now also inside the object. Again, we don't need to keep those fitted models in other R variables.
As a final example, lets suppose we want to plot the image samples statistics and the predictions from the fitted model. As the fitted model, and the image statistics, are kept inside the object, we can just call the plotting function with arguments focused only on the plot. i.e. We don't have to provide the statistics and the fitted model as arguments to the plotting function.
my.object$plot(<your plotting parameters>)
As you may see, this is very convenient and more easier to use than continuously providing the different input data required by the class functions and saving their results.
Notice that even when
plot are functions of the
vvm class (technically they are methods) we call them using the object we created at the very beginning. That makes sense considering each object holds the data produced in its previous usage.
A class can also contain variables. For example, the
vvm class includes the
var.df variable, containing a data frame with the noise variance information. As usual, we access to that class component through an object,
my.object in our examples. As with any R variable, we can read the value in the object variable, for example, to pass it as argument to other function or to set other variable, just by itself or as part of an expression:
my.var <- my.object$var.df other_var <- subset(my.object$var.df, <my subset conditions>) some_function(..., my.object$var.df)
imgnoiser package most of those variables are read-only. However, technically its is possible to write or set values on them, eventually we might end up having some of them.
my.object$some.variable <- some_value
You can print the instance object to get a list of its components to remember what you can do with it.
my.object #> <vvm> #> function: #> digest() #> digest.as.rgb() #> digest.from.rgb() #> exists.model() #> fit.model() #> get.model() #> get.model.predictions() #> initialize() #> plot() #> print.model.summary() #> remove.model() #> variable: #> channel.labels #> cov.df #> green.channels #> merged.var.cov.df #> model.list #> RGGB.indices #> var.df #> wide.var.df
To get detailed documentation about this components, you can use the R help system, for example:
There are two way to process the image samples, each one is represented by one class in the package, the
vvm and the
hvdvm class. In both classes the function that process the raw image samples is named
digest(). Except for the way how they process the samples, both are very similar. However, each way of processing brings different output, in their values as in their meaning.
vvm class computes the regular mean, variance and covariance in the image samples, expecting a quadratic relationship between the pixel values variance and their mean. The theoretical basis of this approach is explained here and the practical application of that model is explained here.
vvm class can digest the raw image samples in more than one way. In the simplest way it process the data as-is in the raw image samples. However, this class can convert the raw images to a RGB space an the process the resulting RGB samples, this help us to track the image noise from its origin in the raw image to the final sRGB or Adobe RGB profile.
We will process the samples with the
vvm class. Lets create a
vvm object with name
vvm.obj. During the instance creation we can specify the channel names, which of those are the green channels and so on. The easiest way to do all that, when the samples have the RGGB pattern —as in our case— is with the parameter
has.RGGB.pattern set to
vvm.obj <- vvm$new(has.RGGB.pattern = TRUE)
In the call to the
digest() function we will specify the samples we want to get processed by giving the alphabetical range of their base name, their path and their file name extension. In the following example we want to process all the images between (and including) the files from 'ODL5719.pgm' to 'ODL5767.pgm' in the './samples/' folder.
vvm.obj$digest(file.name.from = '_ODL5695', file.name.to = '_ODL5767', file.name.ext = '.pgm', file.path = './samples/')
A progress bar will appear while the function is crunching the samples, it takes approximately 5 seconds the processing of 200 samples.
Notice that just by calling three functions we have done all the noise data collection! Now we can sit back and take a look to the plot of the variance against the mean for each channel in the image samples.
You may notice, in the above graph legend, a channel labeled
Green Avg this "channel" shows the values averaging both green channels. In a later section we will see in detail the
The following image shows the kind of image sample we don't want to include in the set of digested samples.
A partially or totally clipped sample have smaller variance than it should have if such clipping would not occurred, also their mean value is distorted.
Of course it is not possible to check always each sample histogram to discard these cases. The
digest() function parameters
max.raw helps to detect this situation. If more than the 2% of the pixel values in a channel sample is outside the given
[min, max] range, the sample is discarded. This parameters also help us to process only the noise in a particular segment of its whole range, for example close to the origin.
These arguments must be numeric vectors, each with one or four values. If only one value is given, the same one is used for all the channels, otherwise each channel have its own minimum or maximum value.
To get the data computed by the
digest() function, part of which is shown in the plot above, we can check the
var.df and the
cov.df object variables. They are data frames containing —correspondingly— the variance and covariance data collected from the images samples.
|_ODL5695.pgm||Green R||Green B||-14.2368|
|_ODL5695.pgm||Green R||Green Avg||301.9246|
Notice those data frames are in a rather narrow form. A sign of that is they use more than one row for data from the same image sample. In the case of the
var.df data, you can get it in a wide form using the
This table contains exactly the same data as
var.df, it has just a wide shape. Both forms are useful for different computation and plotting requirements. For example, using the wide form wen can easily plot the relationship between the mean values in the red and blue channel with respect to the green one.
ggplot(vvm.obj$wide.var.df) + geom_point(aes(x=green.avg.mean, y=red.mean/green.avg.mean), colour ='#BB7070') + geom_point(aes(x=green.avg.mean, y=blue.mean/green.avg.mean), colour ='#7070BB') + xlab('Green Avg mean [ADU]') + ylab('Ratio of mean values to Green average [ADU]') + labs(title='Ratio of mean values to Green average' ) + theme(plot.title = element_text(size = rel(1.5)))
Also, we may need to have the
var.df and the
cov.df information merged in a single data frame. We can get that through the
merged.var.cov.df variable. However, this data frame can be expensive to build (requiring some seconds) and for that reason it is built at the variable first reading, subsequents use of these variable will just reuse the computed data without requiring a perceptible computing time.
|_ODL5695.pgm||Red||Green R||0.3391||696.1||1411||314.5||618.1||Red, Green R|
|_ODL5695.pgm||Red||Green B||36.1396||696.1||1415||314.5||597.3||Red, Green B|
|_ODL5695.pgm||Red||Green Avg||18.2393||696.1||1413||314.5||593.5||Red, Green Avg|
|_ODL5695.pgm||Green R||Green B||-14.2368||1410.8||1415||618.1||597.3||Green R, Green B|
|_ODL5695.pgm||Green R||Blue||4.7311||1410.8||1034||618.1||518.7||Green R, Blue|
|_ODL5695.pgm||Green R||Green Avg||301.9246||1410.8||1413||618.1||593.5||Green R, Green Avg|
|_ODL5695.pgm||Green B||Blue||-19.0844||1414.6||1034||597.3||518.7||Green B, Blue|
There are more components of the
vvm class we haven't explained yet. But as they are very similar in both, the
vvm and the
hvdvm class, we will explain them in a later section common to both classes.
In the previous section we have seen how the
vvm$digest() function process raw image samples and computes some statistics to analyze the raw image noise. Those statistics are in the same color space as the pixel values in the raw image, they are in the camera sensor raw color space. However, we don't want to finish the analysis just there, we want to track the noise to the final image. To allow this, there is a
vvm$digest.as.rgb() function which converts the pixel values from the raw camera space to a chosen RGB space. For example, we can convert the pixel values to
sRGB and then compute statistics over the pixels in that RGB space.
To make possible this conversion, from the camera raw to a RGB color space, we need some camera sensor color information. This camera color information can be easily obtained converting a camera raw image photo file to the
.DNG (Adobe digital negative) format using the Adobe DNG Converter. The Windows version can be downloaded from here and the Mac OS X version from here. The Linux users will need the help of a friend converting any raw photograph. This color information very possible changes with the ISO speed, so you will need a conversion for each ISO you want to work with in
imgnoiser package contains as example the
nikon.d7000.ISO100.colmap list, with the required information for a Nikon D7000 camera at ISO 100. You can use this list as a baseline and replace the data content according to your camera sensor.
str(nikon.d7000.ISO100.colmap) #> List of 6 #> $ cam.matrices.1 :List of 5 #> ..$ illum.cct.k : num 2856 #> ..$ color.matrix : num [1:3, 1:3] 0.8642 -0.3999 -0.0453 -0.3058 1.1033 ... #> ..$ forward.matrix : num [1:3, 1:3] 0.7132 0.2379 0.0316 0.1579 0.8888 ... #> ..$ cam.calib.matrix: num [1:3, 1:3] 1 0 0 0 1 0 0 0 1 #> ..$ reduct.matrix : num [1:3, 1:3] 1 0 0 0 1 0 0 0 1 #> $ cam.matrices.2 :List of 5 #> ..$ illum.cct.k : num 6504 #> ..$ color.matrix : num [1:3, 1:3] 0.82 -0.487 -0.104 -0.224 1.239 ... #> ..$ forward.matrix : num [1:3, 1:3] 0.6876 0.2676 0.0151 0.3081 1.0343 ... #> ..$ cam.calib.matrix: num [1:3, 1:3] 1 0 0 0 1 0 0 0 1 #> ..$ reduct.matrix : num [1:3, 1:3] 1 0 0 0 1 0 0 0 1 #> $ analog.balance : num [1:3] 1 1 1 #> $ black.raw.level: num 0 #> $ white.raw.level: num 15892 #> $ tone.curve :'data.frame': 64 obs. of 2 variables: #> ..$ src : num [1:64] 0 0.0159 0.0317 0.0476 0.0635 ... #> ..$ dest: num [1:64] 0 0.0125 0.0411 0.0841 0.1326 ...
Once you get the
.dng file, you can read the required information using ExifTool, that is a command line application which can be installed on Windows, Mac OS X and Unix systems. If you are note comfortable with this kind of application, there are also GUI interfaces taking advantage of the ExifTool engine, check the ExifTool web page.
For Windows I use ExifToolGUI, where the relevant data for the color transformation from raw to RGB is:
It is also required the, reduction matrices with exif tags 0xC625 and 0xC626. In the case of the Nikon D7000 these tags don't exist in the
.dng file, that's why they don't appear in the image above. So, I fill the corresponding entries in the list with 3x3 identity matrices. Once you get ready your camera color information in that R list, don't forget to save it for future use. If you kindly sent me the saved R object by email, I will gladly include it in the package.
The color information in that R list is needed for the creation of an object instance of the class
cm.obj <- imgnoiser::colmap$new(imgnoiser::nikon.d7000.ISO100.colmap)
Now we need to let the
colmap object to find out the white balance adjustment adequate for our image samples. This means that if our samples are not shots to a neutral surface, we will need to take a shot to a white object to make this white balance calibration.
To compute the white balance, we need the raw channel values of a camera raw reference white. This is like when in a photo processing application we white-balance the image by picking a neutral color area with an eyedropper. As the target in the image samples is a white surface, we can choose any of them, but if you look at the mean ratios in the image above, you will realize the color balance changes a little with the exposition level. Considering this, we will choose for our raw reference white, the images with
Green Avg almost over
7,500 ADU, which is approximately at the middle point of their vertical variation range:
subset(vvm.obj$wide.var.df, abs(green.avg.mean - 7500) < 400)
From this images we compute their mean color values:
sel <- subset(vvm.obj$wide.var.df, abs(green.avg.mean - 7500) < 400) # Keep only the color mean columns sel <- sel[,c('red.mean', 'green.avg.mean', 'blue.mean')] # Get their means apply(sel, 2, mean) #> red.mean green.avg.mean blue.mean #> 3696 7539 5588
We will use this color coordinates as the white reference argument in a call to the function
get.conv.matrix.from.raw() in the
cm.obj object we have just created. That raw white reference must be specified as a vector with the RGB color components.
The destination space,
to.space, is the second parameter required in the call to the
get.conv.matrix.from.raw() function. The possible values are:
We will use
sRGB for our example:
neutral.raw <- c(3696, 7539, 5588) raw.to.sRGB <- cm.obj$get.conv.matrix.from.raw(from.neutral.raw = neutral.raw, to.space = 'sRGB') #> White Balance CCT:5360.34853090509 #> White Balance xy (1931):0.3359028328649360.348287791612151 # Lets print the conversion matrix raw.to.sRGB #> [,1] [,2] [,3] #> [1,] 3.53864420 -0.6478555 -0.1170973 #> [2,] -0.35326743 1.6515491 -0.6454761 #> [3,] 0.03012944 -0.4927949 1.9944290
get.conv.matrix.from.raw() function reports the white balance CCT found for the given white reference (
5,360 °K in or example) and returns (in "invisible" way) the conversion matrix from camera raw to the target RGB space (
sRGB in our example). If we want, we can receive and keep that result in an R variable to print it or do whatever else we want with it. But, as the package keeps this matrix, and other information, inside itself for the following color conversion, we don't have to do that.
Now we are almost ready to call the
digest.as.rgb() class function, through our
vvm.obj object, which makes the color conversion and computes the noise statistics in the resulting RGB images. The parameters for this function are the same we used before in the call to the
vvm.obj$digest() function, in the previous section, plus additional ones regarding to the color conversion. One of them is the
map.to.rgb parameter, to provide a
colmap object with the color conversion know-how and already calibrated to white-balance the samples, it is the
cm.obj in our example.
Furthermore, we have to choose the target scale with the
rgb.scale parameter. This argument determines the range (from zero to this value) of the resulting RGB color values. We will choose
255 as a customary range for RGB color coordinates. However, the resulting color values will not be integers, they will be floating point numbers holding the complete precision of our calculations.
Moreover, with the
rgb.labels parameter, we can choose the labels for the color values in the resulting data frame. By default they are (
blue) but you can change them if you want, for example to your native language.
is.neutral is for a logical argument indicating if the samples come from shoots to a neutral target. If it is true, during the color conversion, the white-balance parameters previously established in the call to
get.conv.matrix.from.raw, will be refined for each sample. This may be required to counterbalance the white balance variation we saw in the the mean ratios plot above: the white balance CCT is kept the same for all the samples, only the raw balance is adjusted. This white balance adjustment is only possible when the samples are neutral.
The very last step in the conversion process from raw to RGB, is the application of a tone curve. Actually, it is the application of two tone curves, one is the camera tone curve and the other is a gamma curve corresponding to the target RGB space. The camera tone curve specification is part of the data supplied in the
colmap object creation. It can be a NULL value, meaning it is not required or wanted a camera tone curve. This camera tone curve is aimed to deal with the camera tone idiosyncrasies in order to get nicer images or with a particular style, this way you will see in the camera reviews, "this camera has a <choose a fancy adjective> contrast".
After the application of the camera tone curve, the gamma curve corresponding to the RGB target space is applied to the image. Every RGB target space has a corresponding gamma correction curve. For example the sRGB color space has a tone curve very close to 2.2. In this article we explain the basic idea behind a tone curve or gamma correction curve.
digest.as.rgb() function has a couple of parameters to let you choose if you want to apply the camera tone curve and the RGB target space tone curve or not.
use.camera.tc (use the camera tone curve) parameter is for a logical value indicating if the camera tone curve should be applied to the image samples or not. If you set this argument to TRUE —and you provided a tone curve in the
colmap object creation data— it will be applied to the images, otherwise it wont be applied.
target.tc parameter deals with the target RGB gamma correction. Its default value is yes, apply the target space tone curve. However, you can specify other value as:
linear: Do not apply a target space gamma curve at all.
sRGB: Apply the sRGB tone curve.
ProPhoto: Apply the ProPhoto tone curve.
BT.709: Apply the ITU-R Recommendation BT.709 tone curve.
Gamma.2.2: Apply a regular 2.2 gamma correction tone curve.
Gamma.1.8: Apply a regular 1.8 gamma correction tone curve.
- A data frame with two columns specifying the tone curve.
This allows to play and use other known color spaces. For example, it is said Adobe Lightroom internally uses the
ProPhoto space with the
sRGB tone curve. We can get that by setting
ProPhoto as the target space in the call to the
get.conv.matrix.from.raw() function and
sRGB as the space tone curve in the call to
As another example, we can set the
digest.as.rgb() function parameters with
use.camera.tc = FALSE and
target.tc = 'linear' to get the pixel values in the RGB target space but keeping them linear.
The tone curve argument itself can be the tone curve. In such case, its first two columns must numeric and they must have at least 8 rows. The column names are not relevant. The first column will represent the source pixel value which will be mapped to the corresponding value in the second column. The values in the first column must contain the
[0,1] range, and those in the second one must be in the
[0,1] range. The missing values, the values between the given points of the curve, will be interpolated using smooth splines, with the required degree to pass over each data point.
As we are learning the use of the
imgnoiser package and not really analyzing the noise data, in our current example, we will maintain the things simple don't using any tone curve and keeping the data linear.
To keep our previous noise statistics about the raw pixel values, we will create another
vvm object, with the name
rgb.vvm, to get the sRGB statistics, then we will plot the resulting sRGB linear statistics.
rgb.vvm <- vvm$new(has.RGGB.pattern = TRUE) rgb.vvm$digest.as.rgb(file.name.from = '_ODL5695', file.name.to = '_ODL5767', file.name.ext = '.pgm', file.path = img.path, # Additional parameters for the color space conversion is.neutral = TRUE, map.to.rgb = cm.obj, rgb.scale = 255, target.tc = 'linear') #> 70 image samples were successfully processed as RGBs. rgb.vvm$plot()
sorted.var.df <- dplyr::arrange(rgb.vvm$wide.var.df, green.mean) sorted.var.df[54:57,]
We can see the red component has the largest amount of noise among all the RGB the colors. A hint to explain this is the red raw values having a very lower scale than the others. From our raw white reference it is
3696/7539 = 49%: the half of the green scale. Also, the conversion matrix to build the red RGB color (first row) amplifies a lot the red raw component. However, as the focus here is to get used to the
imgnoser package we won't get into more details.
Now we have similar data as we saw as result of calling the
vvm.obj$digest() function, as the
merged.var.cov.df data frames. Lets take a look at the last one.
As we have already said, there are more components of the
vvm class we haven't explained yet. They are also available when we digest the images after their conversion to RGB with the
digest.as.rgb() function. But as those components are very similar in both, the
vvm and the
hvdvm class we will explain them later in a section common to both classes.
hvdvm class computes the mean, and the half noise variance and covariance of the difference of two identical raw images. The theoretical basis of this approach is explained here and the practical application of that model is explained here.
Compared to the
vvm$digest() function, the
hvdvm$digest() one, needs additional information to identify the shots taken under identical photographic conditions. To that effect, we have to prepare a
.csv file relating our samples file names with the photographic conditions under which they were taken.
.csv file must contain columns with the following names:
The letter case in the column names is not important. Also instead of period
. you can use a white space. The columns sequence is not significant and there can be also other ones. Except for
crop.file.name the exact values in these columns are not important, the
hvdvm procedure will just distinguish the different values in them. In this sense, for example,
shutter.speed can contain
0.02 or just
50. However, you must be consistent with your value coding and use the same value for the same condition and vice versa.
To build that
.csv file, we will use the already mentioned ExifTool. This time using it —necessarily— as a command line application (not through a GUI interface): Open a command window on the folder containing the original raw image file photographs and run this command:
exiftool.exe -common -ext nef -csv ./ > photo-conds.csv
ExifTool will extract common (
-common) exif metadata from the Nikon '.nef' image files (
-ext nef) in the current directory (
./) using the .csv format (
-csv). The output redirection (
> ISO100Conds.csv) will save the command output into the
"photo-conds.csv" file in the current working directory. Of course, this works fine for Canon
.CR2 and many other proprietary file formats.
Now, we have to edit the
.csv file produced by ExifTool to comply with the column naming and to add a column with the name of our sample files. If you open that file with MS Excel —annoyingly— it will convert the exposition times to dates (e.g
Jan-50). To avoid that you can use LibreOffice. If you want to use MS Excel, rename the file to make it have a
.txt extension, open it with MS Excel and convert the text lines to data with the "Text to Columns" menu command in the Data ribbon. Select the line with the column names and the data below it before to trigger the "Text to Columns" command. In the following dialog windows choose "delimited" by comma and assign the text type to the
ShutterSpeed column. Also remove any other line in the file to just get the column titles in the first row and the photographic info below. After doing that I get what is shown below.
The columns required by
imgnoiser have green background, you can remove the others if you want. Now rename, those columns to
If you want, you can leave the names as they come from ExifTool and set with the
imgnoiser.option() function a map, using the
'conds.col.map' option, from your column names to the expected ones. The advantage of this approach is that you can reuse this mapping for countless
.csv files and avoid renaming the column names every time. Also, the package options can be saved and reused in another R session. We will check that later.
We must add a column titled
crop.file.name with the name of our samples using a formula taking as input the
FileName column: take the first characters before the period
. that separates the file name from its extension (
NEF in my case) and concatenate the
.pgm name extension in our sample files, something like
=LEFT(C2,8)&".pgm". Copy that formula along all the rows.
Finally we need to add the
lighting column with information based on our notes about the changes in the intensity of the light over the target scene. The exact values are not important. The class will be just interested on the different values in this column.
To begin with, you can fill all the column with the value
'Normal' and for those photos with higher lighting we can set the label
'High'. If you increased the lighting twice during the shooting session, you can correspondingly tag those photos with the values
'High A' and
'High B' in that column. Proceed with the same labeling schema for the shoots with lower lighting.
Use the "Save as..." command to save the file with the
.csv format. Now we are ready to
digest() the samples with the
We have to create an
hvdvm object and call the
digest() function giving as arguments the name of the
.csv file we have just prepared and a path to the image samples referred from it:
hvdvm.obj <- hvdvm$new(has.RGGB.pattern = TRUE) hvdvm.obj$digest('path/to/photo-conds.csv', 'path/to/images/')
You must notice the
.csv file, in the
crop.file.name column, contains only the sample file names, i.e. without the path to the folder containing them. That path is given in the second parameter call to the
Now, we can plot the statistics obtained in the digestion:
hvdvm digestion we have —again— those data frames we saw after using the
vvm$digest.as.rgb() function: the
merged.var.cov.df data frames contain the data extracted from the samples using the
hvdvm technique. Lets take a look to
pict2 columns in the above data frame, and also present in other data frames in
hvdvm objects, show the sample image file names to which the
hvdvm procedure was applied: according with the data in the input
.csv file, they come from photographs taken under identical conditions.
cond column, refers —with a sequential number— to the photographic conditions used to shot those samples. Different
cond values represent different photographic conditions and vice versa.
We can get the different photographic conditions through the
photo.conditions.df variable in the
# Get the different photographic conditions head(my.hvdvm$photo.conditions.df)
Now we can see the photographic conditions referred by the column
cond in the other
hvdvm data frame variables. For example, the image samples in the first row in the
wide.var.df variable, shown previously, come from photographs taken at an exposition time of
There are more components of the
hvdvm class we haven't explained yet. But as they are very similar in both, the
vvm and the
hvdvm classes, we will explain them in a following section common to both of them.
In this section we will continue explaining the concepts and general workflow, common to the usage of the
hvdvm or the
vvm class. At this point, we will assume we have already digested some images with those classes.
The functions we will mention here can be called from a
vvm or a
hvdvm instance object. In the code excerpts we will use the one in the
my.obj variable which can be an instance of any of those classes.
The notion of model in the
imgnoiser package involves two concepts:
- Model Data Source: The specific data which will be used as the observations over which we will fit a model or regression.
- Model Family: The mathematical model or procedure allowing us to fit a model or regression over the data mentioned in the previous point.
When we fit a model in
imgnoiser, we use a data source, fit a regression model available in a model family over that data, and give a name to that fitted model.
Actually —by default— only one model data source exists and its named
std.var. This model uses variables from the noise variance
var.df result we get after calling the
digest() function. In particular, in
std.var data model the
mean is the predictor variable (x) and
var is the predicted one (y). This observations are grouped by
You may add more data models at your will by changing the
get.model.predictions option value.
When a function requires as an argument, the name of a model data source, by default, it will use the
fit.model.data option value, which as you might guess, by default is
std.var. Unless you provide more models, leave the default value for those arguments (typically named
model.data.name ), as we will do in the following examples.
We can choose, among different statistical regression models, the one that we want to use on our model data source. We group those models by the R function that actually makes the fitting.
By default, the following model families, from the
stat package, are available:
As these fitting functions allow a wide variety of options, in fact directing them to different regression models or fitting procedures,
we have a broader amount of options than may it seem at first glance. As a trivial example, you can use the simple linear model
lm() with weights.
When a function requires as an argument a model family, it will use by default the
fit.model.data option value, which by default is
'lm'. You can add more fitting function at your taste by changing the
fit.model.family option value.
We can start using a simple (OLS: ordinary least squares) line fitting. To this effect we will just left the default value for all the
fit.model() function arguments:
# Fit the default 'lm' model and save it with the default 'standard' name. my.obj$fit.model() #> The model "standard" was successfully fitted using: #> lm(formula = var ~ mean + I(mean^2), data = model.src("std.var"))
Notice the output after the model fitting mentions the model family (
lm), the formula (
formula) used to relate predicted and predictor variables and the model source data (
data = model.src("std.var")). If there would be any other parameter in the calling to the
lm() function, they will appear in this output. Everything sent to the fitting functions will appear in this output; nothing is sent "under the hood".
Notice we did not input explicitly a formula in the call to the
fit.model() function, it has been inferred from the source data model variables and the default argument for the
degree parameter of that function, which for the
hvdvm class is
1 and for the
vvm one is
2. This argument indicates a linear relationship with the predicted variable
var being predicted by a first or second degree polynomial on
mean. If you set
degree = 3, a cubic polynomial on
mean will be fitted to predict the
var variable, with
4 you will fit a bi-quadratic one and so on.
Instead of the
degree you can directly set the
formula argument for the fitting with the specific formula you want. In such cases, when you specify the formula, the
degree argument is silently ignored. For example, we may want a regression predicting
log2(var) using as predictor
log2(mean), in such case we can fit the model using:
# Fit a model named 'log' using the 'lmrob' family: my.obj$fit.model('log', formula = log2(var) ~ log2(mean), model.family = 'lmrob') #> The model "log" was successfully fitted using: #> lmrob(formula = log2(var) ~ log2(mean, data = model.src("std.var"))
The restrictions for the
formula parameter are those from the model family function. In the previous example, the restrictions are from the `robustbase::lmrob()' function. For this reason, we can not use formulas with the 'smooth.spline' model family.
Other default arguments in the call to the
fit.model() function we just did, is the name to assign to the model we are fitting (
model.name parameter). Its default value is the
fit.model.name option value, with
standard as factory value.
Also, the confidence level (
conf.level parameter) is used to compute the prediction interval limits. Its default value is the
conf.level option value, with 95% as factory value.
We can call the
fit.model() function again but with an argument for the
model.name parameter different to
standard to do not overwrite the model we just fitted, or we can omit again this argument to overwrite that model.
Lets fit a robust linear model using as weight the reciprocal squared values of the mean. We will assign the name
'weighted' to that model:
# This time we will set the model name and model family parameters my.object$fit.model(model.name = 'weighted', model.family = 'lmrob', weights=1/mean^2) #> The model "weighted" was successfully fitted using: #> lmrob(formula = var ~ mean + I(mean^2), data = model.src("std.var"), weights = 1/mean^2)
Notice how we set the
weights argument for the
lmrob function in the call to
fit.model() using the variable names in our model. This works because all the parameters that don't correspond to the
fit.model() function are passed directly to the fitting function,
lmrob in this case. This allows the use any and every parameter available for them.
We can get the predictions from the
standard model we just fit, by calling the
get.model.predictions() function with the default arguments:
# Get a data frame with the model predictions for the 'standard' model preds <- my.obj$get.model.predictions() head(preds,4) tail(preds,4)
The columns in the resulting predictions data frame are:
mean: The predictor variable.
var: The predicted variable.
channel: The grouping factor.
fit.se: The prediction standard error.
lpl: The lower prediction interval limit.
upl: The upper prediction interval limit.
The prediction interval is based on the confidence level argument (
conf.level parameter), which by default is the same one used to fit the model. If you want technical details about these lasts columns, please check the R
predict() function with respect to the
level = conf.level,
se.fit = TRUE and
interval = 'prediction' arguments.
If we don't specify a model name, the predictions are computed using the default model, and the predictor variable is chosen for evenly spaced points in the same range as the source data used to fit the model. However, you can specify those predictor values and select the channels whose predictions you want.
For example, lets get the predictions from the
'weighted' model, for predictor values from
8000 with steps of
1000, for the
Red channels, with prediction intervals with 98% of confidence level:
# Get the predictions from the 'weighted' model and the given predictor values my.obj$get.model.predictions( model.name = 'weighted', x = seq(5000, 8000, 1000), select = c('Blue','Red'), conf.level = 0.98)
To get a model summary, we call the
print.model.summary() function indicating the name of the model whose summary we want. As usual, if we don't give a model name, the default model will be used:
# Get the default model summary my.obj$print.model.summary() #> #------- #> # Model Name: "standard" #> # #> # Call: #> # lm(formula = var ~ mean + I(mean^2), data = model.src("std.var")) #> #------- #> #> ##-- channel : "Red" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -273.30 -5.89 1.24 3.06 281.97 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 4.76e-01 1.08e+01 0.04 0.96 #> mean 4.63e-01 1.40e-02 32.95 < 2e-16 #> I(mean^2) 1.15e-05 2.15e-06 5.35 1.2e-06 #> #> Residual standard error: 68.8 on 67 degrees of freedom #> Multiple R-squared: 0.995, Adjusted R-squared: 0.995 #> F-statistic: 6.84e+03 on 2 and 67 DF, p-value: <2e-16
#> #> ##-- channel : "Green R" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -551.7 -4.6 11.5 14.7 467.6 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -1.30e+01 1.74e+01 -0.75 0.46 #> mean 4.36e-01 1.12e-02 38.94 < 2e-16 #> I(mean^2) 4.40e-06 8.51e-07 5.18 2.2e-06 #> #> Residual standard error: 111 on 67 degrees of freedom #> Multiple R-squared: 0.996, Adjusted R-squared: 0.996 #> F-statistic: 9.17e+03 on 2 and 67 DF, p-value: <2e-16 #> #> ##-- channel : "Green B" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -463.8 -4.0 9.7 13.1 406.4 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -9.19e+00 2.01e+01 -0.46 0.65 #> mean 4.26e-01 1.29e-02 33.06 < 2e-16 #> I(mean^2) 6.25e-06 9.78e-07 6.38 1.9e-08 #> #> Residual standard error: 128 on 67 degrees of freedom #> Multiple R-squared: 0.995, Adjusted R-squared: 0.995 #> F-statistic: 7.3e+03 on 2 and 67 DF, p-value: <2e-16 #> #> ##-- channel : "Blue" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -320.8 -12.7 8.7 12.3 495.4 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -9.78e+00 1.34e+01 -0.73 0.47 #> mean 5.02e-01 1.15e-02 43.78 < 2e-16 #> I(mean^2) 4.95e-06 1.15e-06 4.29 5.8e-05 #> #> Residual standard error: 85.4 on 67 degrees of freedom #> Multiple R-squared: 0.997, Adjusted R-squared: 0.997 #> F-statistic: 1.08e+04 on 2 and 67 DF, p-value: <2e-16 #> #> ##-- channel : "Green Avg" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -426.8 -23.4 16.7 22.5 573.6 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) -2.23e+01 1.95e+01 -1.14 0.257 #> mean 4.59e-01 1.25e-02 36.65 <2e-16 #> I(mean^2) 2.08e-06 9.50e-07 2.19 0.032 #> #> Residual standard error: 124 on 67 degrees of freedom #> Multiple R-squared: 0.995, Adjusted R-squared: 0.995 #> F-statistic: 7.14e+03 on 2 and 67 DF, p-value: <2e-16
print.model.summary() function —by default— prints the summary of each fitted
channel in the model. You can use the
select parameter to choose which specific models summaries you want. You can pick them by index number (between 1 and 4) or by
channel label. For example, lets get the summary for the
Blue channel in the
my.obj$print.model.summary('weighted', select='Blue') #> #------- #> # Model Name: "weighted" #> # #> # Call: #> # lmrob(formula = var ~ mean + I(mean^2), data = model.src("std.var"), weights = 1/mean^2) #> #------- #> #> ##-- channel : "Blue" --## #> #> Residuals: #> Min 1Q Median 3Q Max #> -335.108 -10.786 -0.235 5.168 467.468 #> #> Coefficients: #> Estimate Std. Error t value Pr(>|t|) #> (Intercept) 2.40e+00 1.97e-01 12.19 < 2e-16 #> mean 4.73e-01 5.93e-03 79.79 < 2e-16 #> I(mean^2) 8.33e-06 1.45e-06 5.75 2.4e-07 #> #> Robust residual standard error: 0.0375 #> Multiple R-squared: 0.996, Adjusted R-squared: 0.996 #> Convergence in 11 IRWLS iterations
#> #> Robustness weights: #> 4 weights are ~= 1. The remaining 66 ones are summarized as #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 0.499 0.878 0.957 0.911 0.984 0.999 #> Algorithmic parameters: #> tuning.chi bb tuning.psi refine.tol #> 1.55e+00 5.00e-01 4.69e+00 1.00e-07 #> rel.tol solve.tol eps.outlier eps.x #> 1.00e-07 1.00e-07 1.43e-03 2.13e-08 #> warn.limit.reject warn.limit.meanrw #> 5.00e-01 5.00e-01 #> nResample max.it best.r.s k.fast.s k.max #> 500 50 2 1 200 #> maxit.scale trace.lev mts compute.rd fast.s.large.n #> 200 0 1000 0 2000 #> psi subsampling cov #> "bisquare" "nonsingular" ".vcov.avar1" #> compute.outlier.stats #> "SM" #> seed : int(0)
This function calls the
base::summary() function corresponding to the model family used for the model fitting. Those functions accepts other parameters you can also specify when calling to the
For example, the
summary.lm() function is used for models from the 'lm' model family, and this functions accepts the logical parameter
correlation; we can pass this parameter through a call to the
# The default model uses the 'lm' model family my.obj$print.model.summary(select='Blue', correlation = TRUE)
The R functions that actually fit the models, like
lmrob(), return an object which can be used in other R functions supporting them.
For example, the function
car::avPlots() Added-Variable Plots, constructs and plots added-variable (also called partial-regression) models as an analysis tool to check the good fitness of each component in a linear model. We can use that function to find out if the linear and quadratic component
mean component in the
vvm models have a good fit. However, to be able to do that, we need the object returned by the fitting functions we mentioned above.
get.model() returns a list with the model objects for every fitted channels in a given model. As with the
print.model.summary() function, we can select (
select parameter) the channel(s) whose model we want, and as usual, if we don't provide a model name the default one will be used.
Lets get the
Red model in the default model and use it to get the Added-Variable Plots:
red.fitted.object <- my.obj$get.model(select='Red') car::avPlots(red.fitted.object\ )
get.model() function returns a list even when it is returning only one single object. That is the reason why we have to use
[] to extract the
Red model object from the value returned by
get.model() and pass it to the
model.list variable provides a list with the already fitted models.
my.obj$model.list #> $standard #>  "lm(formula = var ~ mean + I(mean^2), data = model.src("std.var"))" #> #> $weighted #>  "lmrob(formula = var ~ mean + I(mean^2), data = model.src("std.var"), weights = 1/mean^2)"
As you can see in the output above, the names in the list are the model names and the list components are the calls made to fit each model.
exists.model() function tests if a model with a given name already exists. This function returns
TRUE only if the model with the given name exists:
# We have a model named 'weighted' my.obj$exists.model('weighted') #> lmrob(formula = var ~ mean + I(mean^2), data = model.src("std.var"), weights = 1/mean^2) #>  TRUE # But we dont have other one called 'log-log' my.obj$exists.model('log-log') #> The model"log-log" does not exists. #>  FALSE
We call the
remove.model() function when we want to remove a fitted model. In this case, for security reasons, there is no default model name, you must specify it.
# We have a model named 'standard' ... my.obj$exists.model('standard') #> lm(formula = var ~ mean + I(mean^2), data = model.src("std.var")) #>  TRUE # .. but we don't want it any more my.obj$remove.model('standard') #> The model with name"standard"has been removed. # Now, it doesn't exists my.obj$exists.model('standard') #> The model"standard" does not exists. #>  FALSE
There is nothing like a good plot to let the data speak by itself, showing us its features, validating our assumptions or raising new questions.
With the data we get through object variables like
photo.conditions.df and the result from the
get.model.predictions() function, we can built any kind of plot we want. However, the function
plot() provides an easy way to plot the data and a model already fitted.
By default, without any argument, the
plot() function just plots the model observations. Using the
model.name parameter, we can include a fitted model in the plot. As a convenience, if we set the
model.name parameter to
TRUE, the default model (the one with the name equal to the
fit.model.name option value) is plotted.
# Plot the data my.obj$plot()
# rebuild the previously deleted default model my.obj$fit.model() #> The model "standard" was successfully fitted using: #> lm(formula = var ~ mean + I(mean^2), data = model.src("std.var")) # Now, lets plot the default ('standard') model my.obj$plot(model.name = TRUE)
ylim parameters you can delimit the region to plot, zooming-in a particular area in the plot. Those parameters are vector of two numeric values with the desired
y range for the plot.
Behind the scenes the
ggplot2 package is drawing the plots. When we plot this kind of zoomed-in graphs,
ggplot throws some warnings about the data being excluded from the plot because those axes ranges. The
warnings logical parameter is FALSE by default to avoid them.
We may need to look closer some regions of a plot because the scale can hide interesting details. For example, if look closer to the axes origin, in the above plot, we get the following image.
# Get a closer look to the data points and the standard model in the axes origin my.obj$plot(model.name = TRUE, xlim=c(0,300), ylim=c(0,150))
model.name parameter instead of TRUE (for the default plot), we can —of course— use the name of the fitted model we want to plot.
When we include a model in the plot, we can draw the fitted values (logical parameter
fit) and the prediction intervals (logical parameter
pred.int). By default, including a model in the plot causes to draw the fitted values as in the previous plots.
Lets plot the fitted values and the prediction intervals:
# Plot close-up to the data and the weighted model in the axes origin. # Show the values fitted by the model and the prediction intervals. # Mention in the sub title this is the weighted model my.obj$plot(model.name = 'weighted', fit=TRUE, pred.int = TRUE, xlim=c(0,300), ylim=c(0,150), slab='Weighted Model')
In the previous image the prediction intervals are the semi-transparent areas around the lines through the fitted values. Be aware the prediction intervals are not available when the fitted model comes from the
smooth.spline model family.
In the previous plot we have also added a subtitle: we can use the
tlab parameter to set the main title and
slab for the sub title. With the
ylab parameters we can set the axes labels. If for any reason, you don't want any of those labels, set the corresponding parameter to NA. We will see a case for this in the following section.
obs logical parameter determines if the data is shown (as points) in the plot or not, by default it is
Additional plot customization ca be done with the following package options:
plot.line.width: The width of the lines passing through the fitted values.
plot.point.size: The size of the points showing the source data.
plot.point.opacity: The opacity of the points showing the source data.
plot.ribbon.opacity: The opacity of the ribbons showing the prediction intervals.
with parameter allows to specify a formula to filter the data that will be plot. For example, specifying
with = ~ channel != 'Green Avg', will plot a graph excluding the data from the channel with the
'Green Avg' label.
We can use the
y parameters to do some magic tricks. With then we can transform the fitted model and the source data to our wish. For example, lets say we want to plot the SNR (Signal to noise ratio) against the signal in a log-log (base 2) plot. In terms of our model variables, the SNR is the
mean divided by the noise standard deviation, which is the squared root of the
var variable (noise variance).
In other words, we want to plot the
log2(mean) variable in the
x axis and the
log2(mean/sqrt(var)) in the y axis. Also, we should change the plot title because the default one, with these settings, wont correspond to the graph content anymore. In the following code example we will do this, also hiding the data points.
# Set the prediction interval ribbons to 10% opacity imgnoiser.option('plot.ribbon.opacity', 0.1) # Plot the SNR in a log-log graph my.obj$plot(model.name = 'weighted', x=log2(mean), y=log2(mean/sqrt(var)), obs=FALSE, fit=TRUE, pred.int=TRUE, tlab='Signal to Noise Ratio (SNR)', slab='Weighted model')
Notice how easily we have started the analysis processing the image samples and finished within minutes with a nice plot, showing even the prediction intervals.
Despite the flexibility of the
plot() function, it is not unlikely you will need to cook the plot more with your own hands. To this effect, the
plot() function returns the
ggplot object it constructs to render the plot. However, this is an invisible return. You must receive it in a variable and use it the way you want.
plot() function supports this idea by allowing us to set it to
FALSE instructing the
plot() function to build and pass the
ggplot object but without showing or printing the plot. This way we can get that
ggplot object, customize it more to our needing and finally print it.
For example, lets say we want to show side by side the plots of the 'standard' and the 'weighted' models close to the axes origin, sharing the same title, legend and
y axis using this code which requires the
ggplot object of those model plots:
# Get the standard linear regression plot std.plot <- my.objj$plot(model.name=TRUE, print=FALSE, xlab='Mean', ylab=NA, tlab=NA, xlim=c(0,300), ylim=c(0, 150)) # Get the weighted, robust, linear regression plot weighted.plot <- my.obj$plot(model.name='weighted', print=FALSE, xlab='Mean', ylab=NA, tlab=NA, xlim=c(0,300), ylim=c(0, 150)) # Use both plot objects for detailed customization. shared.legend(std.plot, weighted.plot, 'The Standard and Weighted Models', 'Variance (ADU^2)' )
In the example above, there is an implicit
shared.legend() function which based on the
ggplot of two graphs, puts them one above the other sharing a common title and
y axis label. Correspondingly, the 'standard' and 'weighted' plots were built, but not printed, avoiding to have title and
imgnoiser package you can save and load:
- Object Instances
- The whole set of package options.
imgnoiser.save() function we can save an object instance of the
hvdvm or the
vvm class to a disk file, and later —maybe in another R session or computer— retrieve it with the
imgnoiser.load() function. This way you can share your data and findings with a friend or a community. The files are saved to disk have
.imn as name extension.
If when saving an object, there is a file with the same name, it wont be overwritten unless the argument
stop.overwrite = FALSE is specified.
By default, when we save an object, we also save in the same file all the
imgnoiser package options. This makes sense because through the package options you can not only customize cosmetic or user interface preferences, but procedures in the form of functions, like fitting model procedures, source data model extraction, fitted model prediction functions and so on. You can change those options in such a way that the saved data (and meta-data) may be not enough to reproduce the environment when you save your noise analysis objects with the package.
When we load an object, by default, we also load all the
imgnoiser package options saved with the object. The
load.options logical parameters in the
imgnoiser.load() functions allows to change that behavior and save or load the package options only when desired. Besides, you can save your current package options (we will see this in the following section) before to load the options with an object.
As a rule of thumb it is recommended to save the package options with the objects, to be able to load those options, or not, when you load the object.
As a convenience, if when saving an object we don't provide a file name, the object name will be used. This way, for example, the
my.vvm.object will be saved in the 'my.vvm.object.imn' file. Also, if the file name does not have the
.imn extension that will be automatically appended to it.
# List the models in the object my.obj$model.list #> $weighted #>  "lmrob(formula = var ~ mean + I(mean^2), data = model.src(\"std.var\"), weights = 1/mean^2)" #> #> $standard #>  "lm(formula = var ~ mean + I(mean^2), data = model.src(\"std.var\"))" # Save the my.obj object, and the package options, in a file named 'my.obj.imn' imgnoiser.save(my.obj) #> The "my.obj" object was successfully saved in the "my.obj.imn" file. # Destroy this objects to "make sure" we load the disk instance my.obj <- NULL from.disk.obj <- NULL # Some days later, from another computer, load from disk the saved object from.disk.obj <- imgnoiser.load('my.obj') #> An instance of the "vvm" class was successfully load from the "my.obj.imn" file. # List the models in the object from.disk.obj$model.list #> $weighted #>  "lmrob(formula = var ~ mean + I(mean^2), data = model.src(\"std.var\"), weights = 1/mean^2)" #> #> $standard #>  "lm(formula = var ~ mean + I(mean^2), data = model.src(\"std.var\"))"
Other saving and loading examples:
# Save the vvm.obj object in a particular path and file name with # the package options. If there is afile with the same name just overwrite it imgnoiser.save(vvm.obj, 'path/to/d7000-iso100-neutral', stop.overwrite = FALSE) # Load the object but not the package options hvdvm.obj <- imgnoiser.load('path/to/my.object', load.options = FALSE)
It is annoying to discover and set carefully the package options that suits you best, just to start all over again in the next R session.
imgnoiser.options.load() functions you can save and later load the whole set of package options.
# Save the options imgnoiser.options.save('2015-04-12-options', stop.overwrite = FALSE) # Load the options imgnoiser.options.load('2015-04-12-options')
imgnoiser package is a set of tools, concepts and definitions that helps to analyze image noise and makes easier to share the findings without delivering and explaining once and again the software environment, definitions and instructions to reproduce most of the results.