How does the PCR detection probability influence experimental observation?
Sometimes PCR fails to amplify the alleles at a locus. One potential reason for PCR failure is a mutation in the primer-binding site. We simulate bad PCR reactions by defining detection probabilities. As mentioned previously, in the laboratory data each mosquito sample has four loci. Each locus either has a number of alleles or is marked “NA”, which indicates that a bad PCR reaction occurred at that locus. For each locus, the number of “NA” entries is counted across all mosquito samples and normalized to give the PCR detection probability. The following are the detection probabilities calculated from the 2005-2007 laboratory data.
||P(Bad PCR reaction detection|locus)|
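The counting and normalization step can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the data layout (one dictionary per mosquito sample, mapping locus name to a list of alleles or the string "NA"), the function name `na_fraction_per_locus`, every locus name other than TPOX, and all allele values are assumptions made for the example.

```python
from collections import Counter


def na_fraction_per_locus(samples, loci):
    """Return, for each locus, the fraction of samples marked "NA" at that locus."""
    na_counts = Counter()
    for sample in samples:
        for locus in loci:
            if sample[locus] == "NA":
                na_counts[locus] += 1
    n_samples = len(samples)
    return {locus: na_counts[locus] / n_samples for locus in loci}


# Hypothetical toy data: only the locus name TPOX comes from the text.
LOCI = ["TPOX", "locus_B", "locus_C", "locus_D"]
toy_samples = [
    {"TPOX": [8, 11], "locus_B": [6, 9], "locus_C": "NA", "locus_D": [12, 14]},
    {"TPOX": [8], "locus_B": [6, 7], "locus_C": [10, 13], "locus_D": [12]},
    {"TPOX": "NA", "locus_B": [9], "locus_C": "NA", "locus_D": [14, 15]},
    {"TPOX": [11], "locus_B": [6, 9], "locus_C": [10], "locus_D": [12, 15]},
]

detection_prob = na_fraction_per_locus(toy_samples, LOCI)
print(detection_prob)
```

With real laboratory data, the resulting dictionary would play the role of the per-locus probabilities in the table above.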
The detection probabilities obtained from the laboratory data are used as the probabilities of assigning “NA” to each locus. For example, suppose a blood meal yields a number of microsatellite alleles at the locus TPOX of sample 1. When the PCR process is simulated, this locus has a 3% chance (its detection probability) of producing a bad PCR reaction. Whenever a bad PCR reaction is assigned, all microsatellite alleles at that locus are replaced with “NA”, meaning that no PCR product could be found at TPOX.
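The assignment of “NA” described above amounts to one independent random draw per locus. The sketch below shows one way to write it, assuming each sample is a dictionary from locus name to a list of alleles. The function name `apply_pcr_failure`, the locus names other than TPOX, and the probabilities other than the 3% quoted for TPOX are hypothetical.

```python
import random


def apply_pcr_failure(sample, fail_prob, rng):
    """Return a copy of `sample` in which each locus is independently
    replaced by "NA" with that locus's bad-PCR probability."""
    observed = {}
    for locus, alleles in sample.items():
        if rng.random() < fail_prob[locus]:
            observed[locus] = "NA"  # bad PCR reaction: all alleles are lost
        else:
            observed[locus] = list(alleles)
    return observed


# 0.03 for TPOX is taken from the text; the other values are placeholders.
fail_prob = {"TPOX": 0.03, "locus_B": 0.05, "locus_C": 0.02, "locus_D": 0.04}
true_sample = {"TPOX": [8, 11], "locus_B": [6, 9], "locus_C": [10], "locus_D": [12, 14]}

rng = random.Random(2005)  # fixed seed so a run can be repeated
observed_sample = apply_pcr_failure(true_sample, fail_prob, rng)
print(observed_sample)
```

Passing the random generator in explicitly keeps the simulation reproducible and makes the function easy to check at the extreme probabilities 0 and 1.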
There are two methods of handling bad PCR reactions. The first calculates the blood meal probabilities after excluding every sample that has one or more bad PCR reactions. The second calculates the blood meal probabilities by excluding only the loci bearing “NA” when the lower bound method is applied.
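The two handling methods can be contrasted in a short sketch. It assumes that the lower bound method estimates the number of blood meals as the largest allele count over the loci, divided by two and rounded up (one diploid human contributes at most two alleles per locus); this passage does not restate that definition, so treat it as an assumption. The function names, locus names other than TPOX, and the toy data are all hypothetical.

```python
import math


def lower_bound_meals(sample):
    """Assumed lower bound: ceil(max allele count / 2) over loci that are not "NA".
    Returns None when every locus is "NA"."""
    counts = [len(set(alleles)) for alleles in sample.values() if alleles != "NA"]
    if not counts:
        return None
    return math.ceil(max(counts) / 2)


def estimates_excluding_samples(samples):
    """Method 1: drop every sample that has at least one "NA" locus."""
    complete = [s for s in samples if "NA" not in s.values()]
    return [lower_bound_meals(s) for s in complete]


def estimates_excluding_loci(samples):
    """Method 2: keep every sample and ignore only its "NA" loci."""
    estimates = [lower_bound_meals(s) for s in samples]
    return [e for e in estimates if e is not None]


toy_samples = [
    {"TPOX": [8, 11], "locus_B": [6, 9]},      # complete sample, one meal
    {"TPOX": "NA", "locus_B": [6, 9, 10]},     # three alleles left: two meals
    {"TPOX": "NA", "locus_B": [6, 9]},         # informative locus may be lost
]

print(estimates_excluding_samples(toy_samples))
print(estimates_excluding_loci(toy_samples))
```

In the third toy sample, if TPOX had actually carried three or four alleles, the second method would report one blood meal where two occurred, because the only locus showing the extra alleles was lost.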
Table 2: Comparison of means and variances of perfect and imperfect detections, given the multiple blood meal distribution: 80% one blood meal, 18% 2 blood meals, and 2% 3 blood meals. Human population = 10000. Mosquito population = 324.
|Data handling (human population = 10000)|Detection|Number of blood meals|Hardy-Weinberg equilibrium|
|Exclude samples with "NA"|Imperfect detection|1|80.50% ± 2.75%|
|Exclude samples with "NA"|Imperfect detection|2|18.71% ± 2.72%|
|All data|Perfect detection|1|80.06% ± 2.17%|
|All data|Perfect detection|2|19.21% ± 2.17%|
|Exclude "NA" locus|Imperfect detection|1|80.57% ± 2.36%|
|Exclude "NA" locus|Imperfect detection|2|18.91% ± 2.25%|
The differences among these variances are consistent with expectations. Excluding samples shrinks the blood meal data set. Under the assumptions that the samples are independent and that the loci are independent, the smaller data set can still be used to predict the multiple blood meal probabilities, but it is not surprising that it yields higher standard deviations. Calculating blood meal probabilities by excluding only the “NA” loci has less impact on the variance. However, the locus marked “NA” might have been the one with the largest number of alleles in that sample, so excluding it is expected to underestimate the multiple blood meal probability, which indirectly raises the variance as well.
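The effect of sample size on the spread can be illustrated with a back-of-the-envelope calculation. The sketch below treats the single-meal proportion as a binomial proportion, which is a simplification rather than the simulation used for Table 2. It also assumes the same 3% failure probability at all four loci; only the TPOX value comes from the text.

```python
import math


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1 - p) / n)


n_all = 324       # mosquito population from the Table 2 caption
p_single = 0.80   # true single-meal probability from the Table 2 caption
fail = 0.03       # TPOX value from the text, assumed here for all four loci

# Expected number of samples with no "NA" at any of the four loci.
keep_prob = (1 - fail) ** 4
n_kept = n_all * keep_prob

se_all = binomial_se(p_single, n_all)
se_kept = binomial_se(p_single, n_kept)

print(f"all data:         n = {n_all}, SE = {se_all:.4f}")
print(f"complete samples: n = {n_kept:.1f}, SE = {se_kept:.4f}")
```

Under these assumptions the standard error for all 324 samples is about 2.2%, close to the perfect-detection row of Table 2, and it grows once incomplete samples are dropped. The simulated value for the exclude-samples row is larger than this simple estimate, so other sources of variation are presumably involved.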