Team:Stony Brook/Model

iGEM SBU 2019

Model

What is image processing and analysis?

Image processing is the use of algorithms to analyze digital images. This can include classification of objects and images, and pattern recognition. It is also useful for quantifying features that are difficult to measure by hand. For our project, we used image processing and analysis to visualize and quantify the amount of mottling and fluorescence in our leaves during our experiments.

Analyzing mottling on leaves

One of the symptoms of TMV is a mottled “mosaic” pattern on the leaf, seen as yellow or light green spots on the leaf surface (figure 1). Since it has been shown that these light areas contain virus, while the dark green areas are resistant to it [1], we expected that if our gene reduced the amount of viral RNA, it would also reduce the amount of mottling seen on the leaf.

Figure 1. An example of mottling due to TMV on a leaf

The algorithm used to measure mottling was adapted from [2], and is summarized in figure 2. First, an image of the leaf is taken on a white background with a standard of known area (step 1). Next, the image is binarized so the leaf and the standard appear black, while the spots appear white (step 2). Then, the image is processed in two ways. First, the background is filled, so only the spots are shown in white (step 3a). Statistics such as the total area of all the spots, the number of spots, and the size of each spot are calculated. In parallel, the image is inverted and filled so only the standard and the entire leaf appear white (step 3b). The areas of the standard and the leaf are then calculated. The standard is distinguished from the leaf by measuring the eccentricity of each object and taking the object with the lower eccentricity as the standard. The leaf is then traced in green and the standard in magenta, so the user can see whether the program identified the objects correctly (step 3b). Finally, using these statistics, metrics about the leaf can be displayed to the user, such as descriptive statistics (step 4) or a histogram of the spots (step 5).
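The object-identification part of step 3b can be illustrated with a short Python sketch. This is not the team's MATLAB code: the function names `label_regions` and `eccentricity`, the toy 8 by 12 mask, and the 2.25 cm² standard area are all hypothetical and chosen only for illustration. The sketch assumes eccentricity is computed from the ellipse with the same second central moments as the object, which is a common convention but is not stated in the text above.

```python
import math
from collections import deque

def label_regions(mask):
    """Return one list of (row, col) pixels per 4-connected True region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions

def eccentricity(pixels):
    """Eccentricity of the ellipse with the same second central moments.

    0 means circle-like; values near 1 mean strongly elongated.
    The 1/12 term is the second moment of a unit pixel (an assumed convention).
    """
    n = len(pixels)
    my = sum(y for y, _ in pixels) / n
    mx = sum(x for _, x in pixels) / n
    syy = sum((y - my) ** 2 for y, _ in pixels) / n + 1 / 12
    sxx = sum((x - mx) ** 2 for _, x in pixels) / n + 1 / 12
    sxy = sum((y - my) * (x - mx) for y, x in pixels) / n
    spread = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (sxx + syy + spread) / 2   # variance along the major axis
    smallest = (sxx + syy - spread) / 2  # variance along the minor axis
    return math.sqrt(1 - smallest / largest)

# Toy filled mask: a square "standard" and an elongated "leaf" (hypothetical).
mask = [[False] * 12 for _ in range(8)]
for r in range(1, 4):
    for c in range(1, 4):
        mask[r][c] = True   # 3 x 3 standard, 9 pixels
for r in range(5, 7):
    for c in range(2, 10):
        mask[r][c] = True   # 2 x 8 leaf, 16 pixels

regions = label_regions(mask)
# The object with the lower eccentricity is taken as the standard.
standard, leaf = sorted(regions, key=eccentricity)

STANDARD_AREA_CM2 = 2.25  # hypothetical known area of the standard
cm2_per_pixel = STANDARD_AREA_CM2 / len(standard)
leaf_area_cm2 = len(leaf) * cm2_per_pixel
print(f"leaf area: {leaf_area_cm2:.2f} cm^2")
```

The same labeling routine could in principle be run on the spot mask from step 3a to count the spots and measure the size of each one.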

Figure 2. TMV mottling algorithm.

This algorithm was implemented using MATLAB. A GUI was created (figure 3) so the user can see each individual image as it is being analyzed. The user also has the option to switch the leaf and the standard if the program identified the objects incorrectly, and the option to download all the data and summary statistics as an Excel file. For large numbers of images, a MATLAB script was also created that runs this algorithm on every image in a specified folder. To download these scripts and functions, please visit our GitHub page.

Figure 3. Image of the MATLAB GUI that can be used to visualize the TMV mottling algorithm described.

Measuring bias in the program

When using the program to measure the leaves, we were concerned about whether it was measuring the leaf areas accurately. To address this, we measured the areas of 127 leaves by hand using graph paper (figure 4), and then photographed the same leaves to measure their areas with the program.

Figure 4. Example of graph paper used to calculate the true areas of leaves by hand, so they could be compared against the areas measured by the program.

First, the eccentricity of the leaves was compared against the eccentricity of the standard (a piece of paper with known area) to make sure that eccentricity was a valid parameter for distinguishing the leaf from the standard. MATLAB code was used to find the eccentricities and perform significance testing. The eccentricities of the standards $(\mu = 0.138995, \sigma = 0.044967)$ were significantly different from the eccentricities of the leaves $(\mu = 0.580446, \sigma = 0.150514)$ $(t_{252} = -31.6696, p < 0.001)$, as illustrated by the histogram in figure 5. Thus, we concluded that eccentricity was a valid parameter for distinguishing between the leaf and the standard. For the few cases where it is not, a “switch objects” button was implemented to recalculate the leaf statistics with the standard and the leaf switched.
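As a consistency check, the reported test statistic can be reproduced from the summary statistics alone. The Python sketch below assumes an equal-variance (pooled) two-sample t-test with 127 images in each group, which is what the 252 degrees of freedom (127 + 127 - 2) suggest; it also assumes the reported $\sigma$ values are sample standard deviations and that the group with the lower mean is the standards. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic using a pooled variance, plus its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Summary statistics quoted in the text (lower-mean group first).
t, df = pooled_t(0.138995, 0.044967, 127, 0.580446, 0.150514, 127)
print(f"t({df}) = {t:.2f}")
```

Under these assumptions the result agrees with the reported value of $t_{252} = -31.6696$ to two decimal places.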

Figure 5. A histogram plot created in MATLAB of the distribution of eccentricities of the leaves and of the standards for all the photos. n=127.

Then, the accuracy of the program was measured by calculating the relative error between the true areas (calculated by hand with the graph paper) and the program areas. As seen in the unshifted line graph (figure 6a), the program tended to overestimate the area and thus needed to be corrected. A dilation shift was then applied to all the data points. The shift was calculated using the formula \[\text{shift} = 1 - \frac{m\,\overline{r}}{\sum_{i} \frac{p_i}{t_i}}\]where $m$ is the number of images, $\overline{r}$ is the average relative error, and $\frac{p_i}{t_i}$ is the ratio of program area to true area for image $i$. For our data, the shift was calculated to be $0.649253$. As seen in the shifted line graph (figure 6b), the shifted program areas align more closely with the true areas. A rank sum test was used to confirm that the distribution of shifted relative errors was not significantly different from a distribution with a mean of 0 $(U = 14001, p = 0.332)$. Thus, in all the mottling programs, a shift of $0.649253$ is applied to all areas before they are output.
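The shift formula can be sketched in a few lines of Python. The sketch assumes that the relative error of image $i$ is $(p_i - t_i)/t_i$ and that $m$ is the number of images; under those assumptions the formula reduces to the reciprocal of the mean ratio $p_i/t_i$, so multiplying every program area by the shift brings the mean relative error to zero. The function name `dilation_shift` and the three area pairs are hypothetical, not the team's data.

```python
def dilation_shift(program_areas, true_areas):
    """Shift = 1 - m * mean_relative_error / sum(p_i / t_i)."""
    ratios = [p / t for p, t in zip(program_areas, true_areas)]
    m = len(ratios)  # assumed meaning of m: number of images
    mean_rel_error = sum(ratio - 1 for ratio in ratios) / m
    return 1 - m * mean_rel_error / sum(ratios)

# Hypothetical areas in which the program overestimates every leaf.
true_areas = [10.0, 20.0, 40.0]
program_areas = [15.0, 32.0, 58.0]

shift = dilation_shift(program_areas, true_areas)
shifted_areas = [shift * p for p in program_areas]

# After shifting, the relative errors average out to (numerically) zero.
mean_shifted_error = sum(
    (s - t) / t for s, t in zip(shifted_areas, true_areas)
) / len(true_areas)
print(f"shift = {shift:.4f}, mean shifted relative error = {mean_shifted_error:.1e}")
```

Because a single multiplicative factor is used, the correction removes the average bias but leaves the spread of the individual errors unchanged.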

Figure 6. Line plots of the relative errors of the program. Figure 6a (left) shows the unshifted data, while Figure 6b (right) shows the data multiplied by the shift mentioned in the paragraph. The red circles represent the individual data points. The blue line represents the line of best fit of these data points. The black dotted line is the reference line of y = x, or program area = true area. Figures were created in MATLAB.

Analyzing fluorescence in leaves

The main use of images in our project was for measuring fluorescence intensity in the leaves. This was used as a quantitative measure of how much of the virus (GFP) and of our gene (RFP) was expressed. For more information about our results, see our results page.

The algorithm is summarized in figure 7. First, an image of the fluorescing leaf is taken (step 1). Next, the image is converted to grayscale based on the filter supplied (step 2). Then, if standards are supplied, they are used to set the lowest and highest intensities (step 3a). The intensity values of the image are adjusted accordingly: values below the low intensity are set to 0, values above the high intensity are set to 255, and values in between are set to $255 \cdot \frac{v - s_o}{s_b - s_o}$, where $v$ is the original intensity value, $s_o$ is the low intensity value, and $s_b$ is the high intensity value. Additionally, if a mask photo is specified, a mask is applied to the photo to find the border of the leaf, which can be used to find the percent infected (step 3b). Then, the intensity of each region is measured in the grayscale image. A histogram of the distribution of the intensities of all the spots (step 4), and a heatmap showing where these spots are located and how intense they are (step 5), can also be created from the raw data.
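The intensity adjustment in step 3a can be written as a small Python function. This is a sketch of the formula as described, not the team's MATLAB code: the name `rescale_intensity` and the example standard values of 50 and 200 are hypothetical, and rounding to the nearest integer is an assumption made so the result stays an 8-bit gray level.

```python
def rescale_intensity(v, s_o, s_b):
    """Map intensity v onto 0..255 using low standard s_o and high standard s_b."""
    if s_b <= s_o:
        raise ValueError("high standard must be greater than low standard")
    if v <= s_o:
        return 0      # at or below the low standard
    if v >= s_b:
        return 255    # at or above the high standard
    # Linear stretch between the two standards (rounding is an assumption).
    return round(255 * (v - s_o) / (s_b - s_o))

# Hypothetical standards: low = 50, high = 200.
row = [10, 50, 110, 200, 250]
adjusted = [rescale_intensity(v, 50, 200) for v in row]
print(adjusted)
```

Anchoring the scale to standards in this way makes intensities comparable between photos taken under different exposure settings, at the cost of discarding detail outside the range spanned by the standards.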

Figure 7. Fluorescence intensity algorithm.

This algorithm was implemented using MATLAB. A GUI was created (figure 8) so the user can see each individual photo as it is being analyzed. The user also has the option to download all the graphs as MATLAB figure files, and the option to download all the data and statistics in an Excel file. For large numbers of images, a MATLAB script was also created that runs this algorithm on all images in a specified folder. To download these scripts and functions, please visit our GitHub page.

Figure 8. Image of the MATLAB GUI that can be used to visualize the fluorescence intensity algorithm described.

Summary of our image processing

We developed MATLAB code for analyzing mottling and fluorescence in leaves, and used it to quantify the amount of expression in our leaves during our experiments. All of our code is open source and is available on our GitHub page for other teams to look at, use, and adapt for their own purposes.

References

  1. Burundukova, O.L., et al. “Dark and Light Green Tissues of Tobacco Leaves Systemically Infected with Tobacco Mosaic Virus.” Biologia Plantarum, Apr. 2007, https://link.springer.com/content/pdf/10.1007/s10535-009-0053-8.pdf

  2. Marathe, Hrushiketh, and Kothe, Prerna. “Leaf Disease Detection Using Image Processing Techniques.” International Journal of Engineering Research & Technology, Mar. 2013, https://www.ijert.org/research/leaf-disease-detection-using-image-processing-techniques-IJERTV2IS3480.pdf
