Runs several iterations of a full COI sensitivity analysis with varying parameters.
Usage
cont_sensitivity(
repetitions = 10,
coi = 3,
max_coi = 25,
plmaf = runif(1000, 0, 0.5),
coverage = 200,
alpha = 1,
overdispersion = 0,
relatedness = 0,
epsilon = 0,
seq_error = 0.01,
bin_size = 20,
comparison = "overall",
distance = "squared",
coi_method = "variant",
use_bins = FALSE
)
Arguments
- repetitions
The number of times each sample will be run.
- coi
Complexity of infection.
- max_coi
A number indicating the maximum COI to compare the simulated data to.
- plmaf
Vector of population-level minor allele frequencies at each locus.
- coverage
Coverage at each locus. If a single value is supplied then the same coverage is applied over all loci.
- alpha
Shape parameter of the symmetric Dirichlet prior on strain proportions.
- overdispersion
The extent to which counts are over-dispersed relative to the binomial distribution. Counts are Beta-Binomially distributed, with the beta distribution having shape parameters \(\frac{p}{overdispersion}\) and \(\frac{1-p}{overdispersion}\).
- relatedness
The probability that a strain in mixed infections is related to another. The implementation is similar to relatedness as defined in THE REAL McCOIL simulations (doi:10.1371/journal.pcbi.1005348): "... simulated relatedness (r) among lineages within the same host by sampling alleles either from an existing lineage within the same host (with probability r) or from the population (with probability (1-r))."
- epsilon
The probability of a single read being miscalled as the other allele. This error is applied in both directions.
- seq_error
The assumed level of sequencing error. If no value is supplied, the level of sequencing error is inferred.
- bin_size
This argument is no longer supported; to estimate the COI, all data points are used. Data points are not grouped in bins of changing plaf.
- comparison
This argument is no longer supported; this function will compare the theoretical curve and sample curve for all PLMAFs.
- distance
This argument is no longer supported; this function will solve a weighted least squares minimization problem.
- coi_method
The method we will use to generate the theoretical relationship. The method is either "variant" or "frequency". The default value is "variant".
- use_bins
This argument is no longer supported; to estimate the COI, all data points are used. Data points are not grouped in bins of changing plaf.
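The simulation arguments above (alpha, relatedness, epsilon, overdispersion) describe a generative process for read counts. The following is a minimal base R sketch of that process as described in the argument text. It is not the package's internal code: the helper name sim_sketch, the uniform choice of which existing strain to copy from, and the order in which the steps are applied are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical helper (not part of the package): simulate read counts at each
# locus for one sample, following the argument descriptions above.
sim_sketch <- function(coi, plmaf, coverage, alpha = 1, overdispersion = 0,
                       relatedness = 0, epsilon = 0) {
  n_loci <- length(plmaf)

  # alpha: strain proportions from a symmetric Dirichlet(alpha),
  # drawn as normalized gamma variates.
  w <- rgamma(coi, shape = alpha, rate = 1)
  w <- w / sum(w)

  # Haplotypes: rows are strains, columns are loci; 1 marks the minor allele.
  haps <- matrix(0L, nrow = coi, ncol = n_loci)
  haps[1, ] <- rbinom(n_loci, 1, plmaf)
  for (k in seq_len(coi)[-1]) {
    from_pop <- rbinom(n_loci, 1, plmaf)
    # relatedness: with probability r, copy the allele from a strain already
    # in the host (chosen uniformly here, an assumption); otherwise sample it
    # from the population.
    donor <- sample.int(k - 1, n_loci, replace = TRUE)
    copied <- haps[cbind(donor, seq_len(n_loci))]
    related <- runif(n_loci) < relatedness
    haps[k, ] <- ifelse(related, copied, from_pop)
  }

  # True within-sample minor allele frequency, clamped against rounding error.
  p_true <- pmin(pmax(as.vector(w %*% haps), 0), 1)

  # epsilon: each read is miscalled as the other allele, in both directions.
  p_obs <- p_true * (1 - epsilon) + (1 - p_true) * epsilon

  # overdispersion: Beta-Binomial counts with shape parameters
  # p / overdispersion and (1 - p) / overdispersion; plain binomial when 0.
  if (overdispersion > 0) {
    p_draw <- rbeta(n_loci, p_obs / overdispersion, (1 - p_obs) / overdispersion)
  } else {
    p_draw <- p_obs
  }

  # coverage: a single value is recycled over all loci.
  coverage <- rep_len(coverage, n_loci)
  counts <- rbinom(n_loci, size = coverage, prob = p_draw)

  data.frame(plmaf = plmaf, coverage = coverage, counts = counts,
             wsmaf = counts / coverage)
}

sim <- sim_sketch(coi = 3, plmaf = runif(50, 0, 0.5), coverage = 200,
                  overdispersion = 0.1, relatedness = 0.5, epsilon = 0.01)
head(sim)
```

Setting overdispersion, relatedness, and epsilon to 0 in this sketch reduces it to plain binomial sampling around the mixture of strain haplotypes, which matches the defaults shown in Usage.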
Value
A list of the following:
- predicted_coi: A dataframe of the predicted COIs. COIs are predicted using compute_coi(). Each column represents a separate set of parameters. Each row represents a predicted COI. Predictions are done many times, depending on the value of repetitions.
- probability: A list of matrices containing the probability that our model predicted each COI value. Each row contains the probability for a different run. The first row contains the average probabilities over all the runs.
- param_grid: The parameter grid. The parameter grid is all possible combinations of the parameters inputted. Each row represents a unique combination.
- boot_error: A dataframe containing information about the error of the algorithm. The first column indicates the COI that was fed into the simulation. The other columns indicate the mean absolute error (mae), the lower and upper bounds of the 95% confidence interval, and the bias.
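To make the shape of param_grid and boot_error concrete, here is a small base R sketch with made-up predictions. It only illustrates the ideas of "all combinations" and of an error summary. The column names lower and upper, the helper summarise_error, and the percentile bootstrap used for the 95% interval are assumptions, not necessarily what the function does internally.

```r
set.seed(42)

# All combinations of two hypothetical parameter vectors; each row is one
# unique parameter set.
param_grid <- expand.grid(coi = c(2, 5), coverage = c(100, 200),
                          KEEP.OUT.ATTRS = FALSE)

# Made-up predictions: one column per row of param_grid,
# one row per repetition.
predicted_coi <- data.frame(
  set_1 = c(2, 2, 3, 2),
  set_2 = c(5, 4, 4, 6),
  set_3 = c(2, 2, 2, 2),
  set_4 = c(5, 5, 6, 5)
)

# Hypothetical helper: mean absolute error, bias, and a percentile-bootstrap
# 95% interval for the mean absolute error.
summarise_error <- function(pred, true_coi, n_boot = 1000) {
  err <- pred - true_coi
  boot_mae <- replicate(n_boot, mean(abs(sample(err, replace = TRUE))))
  ci <- quantile(boot_mae, c(0.025, 0.975), names = FALSE)
  data.frame(coi = true_coi, mae = mean(abs(err)),
             lower = ci[1], upper = ci[2], bias = mean(err))
}

# One summary row per parameter set.
boot_error <- do.call(
  rbind,
  Map(summarise_error, predicted_coi, param_grid$coi)
)
boot_error
```

A positive bias means the COI was overestimated on average for that parameter set, and a negative bias means it was underestimated.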