Predict the COI of the sample.
Usage
compute_coi(
  data,
  data_type,
  max_coi = 25,
  seq_error = 0.01,
  bin_size = 20,
  comparison = "overall",
  distance = "squared",
  coi_method = "variant",
  use_bins = FALSE
)
Arguments
- data
The data for which the COI will be computed.
- data_type
The type of the data to be analyzed. One of "sim" or "real".
- max_coi
A number indicating the maximum COI to compare the simulated data to.
- seq_error
The level of sequencing error that is assumed. If no value is supplied, the level of sequencing error is inferred.
- bin_size
This argument is no longer supported; to estimate the COI, all data points are used. Data points are not grouped in bins of changing plaf.
- comparison
This argument is no longer supported; this function will compare the theoretical curve and sample curve for all PLMAFs.
- distance
This argument is no longer supported; this function will solve a weighted least squares minimization problem.
- coi_method
The method we will use to generate the theoretical relationship. The method is either "variant" or "frequency". The default value is "variant".
- use_bins
This argument is no longer supported; to estimate the COI, all data points are used. Data points are not grouped in bins of changing plaf.
Value
A list of the following:
- coi: The predicted COI of the sample.
- probability: A probability density function representing the probability of each COI.
Details
Compare the within-sample minor allele frequency (WSMAF) and the population-level minor allele frequency (PLMAF) of the sample to what a theoretical WSMAF and PLMAF should look like. By comparing the sample's WSMAF and PLMAF to the theoretical WSMAF and PLMAF, the COI of the sample can be estimated. We refer to the sample's WSMAF vs PLMAF as the "sample curve" and to the theoretical WSMAF vs PLMAF as the "theoretical curve." To determine the predicted COI value, one of three different methods can be selected:
end
Determines the distance between the theoretical and sample curve at a PLMAF of 0.5. The COI is whichever theoretical COI curve has the smallest distance to the simulated data.
ideal
Determines the distance between the theoretical and sample curve at the ideal PLMAF. The ideal PLMAF is found by comparing the theoretical curves for a COI of \(i\) and a COI of \(i-1\) and taking the PLMAF at which the gap between them is largest. The COI is whichever theoretical COI curve has the smallest distance to the simulated data at the ideal PLMAF.
overall
Determines the distance between each theoretical curve and the sample curve across all PLMAFs. The COI is whichever theoretical curve has the smallest distance to the sample curve. There is an option to choose one of several distance metrics:
- abs_sum: Absolute value of the sum of differences.
- sum_abs: Sum of absolute differences.
- squared: Sum of squared differences.
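
The "overall" comparison and its three distance metrics can be illustrated with a minimal, self-contained sketch. This is not the package's implementation. The helpers theoretical_curve(), curve_distance() and pick_coi() are hypothetical names introduced here, and the theoretical curve assumes a simple model in which each of the COI strains independently carries the minor allele with probability equal to the PLMAF, so the probability that a locus is variant is 1 - p^k - (1 - p)^k.

```r
# Hypothetical helper (an assumption, not the package's code): probability that
# a locus is variant when `coi` strains each carry the minor allele
# independently with probability `plmaf`.
theoretical_curve <- function(plmaf, coi) {
  1 - plmaf^coi - (1 - plmaf)^coi
}

# The three distance metrics named in the text, applied to two curves
# evaluated at the same PLMAFs.
curve_distance <- function(theory, sample,
                           distance = c("squared", "abs_sum", "sum_abs")) {
  distance <- match.arg(distance)
  gap <- theory - sample
  switch(distance,
    abs_sum = abs(sum(gap)),  # absolute value of the sum of differences
    sum_abs = sum(abs(gap)),  # sum of absolute differences
    squared = sum(gap^2)      # sum of squared differences
  )
}

# "overall" comparison: evaluate every candidate COI from 1 to max_coi and
# return the one whose theoretical curve is closest to the sample curve.
pick_coi <- function(plmaf, sample_curve, max_coi = 25, distance = "squared") {
  cois <- seq_len(max_coi)
  dists <- vapply(
    cois,
    function(k) curve_distance(theoretical_curve(plmaf, k), sample_curve, distance),
    numeric(1)
  )
  cois[which.min(dists)]
}

# Noise-free toy input: a sample curve generated from a COI of 4.
plmaf <- seq(0.05, 0.5, by = 0.05)
sample_curve <- theoretical_curve(plmaf, 4)
pick_coi(plmaf, sample_curve)
```

In this toy input the sample curve coincides with the theoretical curve for a COI of 4, so every metric selects 4. The metrics can disagree on noisy data, because abs_sum lets positive and negative differences cancel while sum_abs and squared do not.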