Microarrays are now a widely used tool for probing the set of genes expressed under different conditions. Although the technology has been around for about 15 years, results obtained through microarray experiments are still widely distrusted. Because of this lack of trust, microarray results must often be verified with much lower-throughput techniques. In effect, microarrays, as currently used, are a low-throughput technique impersonating a high-throughput one.
While some researchers attribute the unreliability and lack of reproducibility of microarrays to the physical processes occurring during the experiment, others attribute them to the models used to analyze the experimental data. Indeed, different papers use widely different models to analyze microarray data. The crucial distinction among these many models is the manner in which hypothesized “experimental biases” are “corrected.”
In our work, we have taken a radically different approach. Instead of starting from a standard statistical model for the data and developing an ad hoc correction to it, we modeled the physical process of the experiment itself. We found that we could obtain a closed-form mathematical expression relating what is measured in a microarray experiment (fluorescence intensity) to what one wishes to probe (gene expression), with just four parameters that need to be fit to data on thousands of genes.
Using our model, we obtain greatly increased reliability and reproducibility of microarray results without the need for ad hoc corrections of hypothesized biases. Indeed, we found little evidence for systematic biases in the experimental process. We hope that our results will lead to a more standardized process for analyzing microarray data and to the use of microarrays as a truly high-throughput technique.