Vision-based techniques have become increasingly relevant in science over the
past 20 years. The reason is twofold. Firstly, good sensors have
become dramatically cheaper and better over recent years. Secondly,
society is becoming used to visual information, which eases
the introduction of vision tools in production items.
In this context, we consider it important to base our projects on
vision-based tools, more specifically on built-in, phone-like cameras.
This could simplify the distribution of the final
product and make the tools more familiar to final users.
Along this line, we propose a color-based model for the quantitative
prediction of the presence of heavy metals in water. This model, while
relatively simple to produce for in-lab images, becomes significantly
more complex if it is expected to deal with off-lab imagery. In a sense,
our model needs to be robust enough to learn from a small set of in-lab
images, taken in controlled conditions, yet be applicable to a wide range
of data.
Contribution
Computer Vision
Proposal
The proposal is built around a dataset of images taken in-lab.
We did not fix a lighting box or set any other kind of imaging conditions.
This should make the model learn from
different conditions, lighting and ISO-dependent noise.
Then, images are automatically segmented to extract an average color from the
tinted version of the pellet.
For each heavy metal or nitrate, we generate a dataset that relates
the average color of each pellet with the concentration of the target.
A regression algorithm is then used to produce a prediction model able
to quantify the presence of a target substance from an RGB image.
The result of the imaging workflow is a list of datasets (one per heavy metal or nitrate) relating the RGB color of the pellet to the numerical concentration of the material. The representative color is taken as the average color over the colored pixels in the pellet. Each dataset is meant to be used as training data for a predictor of the concentration in untested waters.
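As an illustration of how a representative color could be obtained, the following is a minimal sketch and not the procedure actually used. The segmentation method is not specified in this document, so the sketch assumes that tinted pixels can be told apart from a neutral (gray or white) background by the spread between their largest and smallest RGB channel. The function name `mean_pellet_color` and the parameter `spread_threshold` are hypothetical.

```python
import numpy as np

def mean_pellet_color(image, spread_threshold=40.0):
    """Return the mean RGB color over the pixels assumed to be tinted.

    image: H x W x 3 array of RGB values in [0, 255].
    spread_threshold: hypothetical cut-off; a pixel counts as "colored"
    when its largest and smallest channel differ by more than this value.
    """
    rgb = np.asarray(image, dtype=float)
    # Neutral pixels (gray, white, black) have nearly equal channels
    spread = rgb.max(axis=2) - rgb.min(axis=2)
    mask = spread > spread_threshold
    if not mask.any():
        raise ValueError("no colored pixels found")
    # Boolean indexing yields an N x 3 array of the selected pixels
    return rgb[mask].mean(axis=0)

# Synthetic example: gray background with a uniformly tinted square
img = np.full((20, 20, 3), 200, dtype=np.uint8)
img[5:15, 5:15] = (30, 60, 200)
print(mean_pellet_color(img))
```

The threshold stays within RGB, in keeping with the choice of not using other color spaces; a real deployment would likely need a more careful segmentation.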
The relationship between color and (squared) concentration was captured by a linear (least-squares) regression model. No higher-order regression was needed, and there was no need for color gamut training or adjustment. In fact, for nitrates a fair linear regression can be fitted with only one channel (red for nitrate blue, and blue for nitrate yellow).
Figure 1. Color model.
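The single-channel regression described above can be sketched as follows. This is a hedged illustration under stated assumptions: the squared concentration is regressed on one channel value (the direction of the fit is not stated in this document), the function names are hypothetical, and the calibration points are made up for the example and are not measured data.

```python
import numpy as np

def fit_channel_model(channel_values, concentrations):
    """Least-squares fit of concentration**2 = slope * channel + intercept."""
    x = np.asarray(channel_values, dtype=float)
    y = np.asarray(concentrations, dtype=float) ** 2
    # polyfit returns coefficients from the highest degree down
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

def predict_concentration(channel_value, slope, intercept):
    """Invert the fitted line to recover a concentration estimate."""
    squared = slope * channel_value + intercept
    # Clip at zero so noise cannot yield a negative squared concentration
    return float(np.sqrt(max(squared, 0.0)))

# Made-up calibration points (illustration only, not measured data)
red = [200.0, 180.0, 120.0, 20.0]
conc = [0.0, 1.0, 2.0, 3.0]

slope, intercept = fit_channel_model(red, conc)
print(predict_concentration(120.0, slope, intercept))
```

Fitting on the squared concentration keeps the model linear in its parameters, so ordinary least squares suffices and only a handful of calibration images are needed per substance.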
SWOT Analysis
- The data model is created with minimal constraints, especially regarding the imaging conditions (lighting, positioning).
- No color space other than RGB was used in the experiments. The coloring of the pellets was gradual, yet clear enough to use RGB. Although a conversion to CIELAB or HSV might have been trivial, it is good not to require trained or dataset-specific color spaces.
- The color model is created with simple linear regression models. This implies that (a) the model does not require large amounts of data and (b) the coloring is progressive over the concentration of heavy metals and nitrates.
- No data (image) preprocessing was needed for the data gathering, which speaks for the simplicity of the process.
- However different the in-lab images are, the truth is that real conditions are hard to simulate. A prospective deployment of portable testing kits in the field might require training.
- Although the linear fitting performs numerically well, the prediction error might be relevant when testing real-world waters.
- Mixed models (multiple contaminations) have not been tested, yet might appear in a real-world deployment of the system.
- The model should be trained on increasingly different imaging conditions, while monitoring the validity of the linear regression model.
- The main threat to the project and its prospective developments stems from the variability in the imaging conditions. It is hard to foresee how different camera sensors and lighting conditions might capture the water coloring.
Figure 2. SWOT analysis.