Generate the simulated COI curve.
Arguments
- sim
Output of sim_biallelic().
- seq_error
The assumed level of sequencing error. If no value is supplied, the level of sequencing error is inferred.
- bin_size
This argument is no longer supported; to estimate the COI, all data points are used. Data points are not grouped in bins of changing plaf.
- coi_method
The method used to generate the theoretical relationship, either "variant" or "frequency". The default value is "variant".
Value
A list of the following:
- data: A tibble with:
  - plmaf_cut: Breaks of the form [a, b).
  - m_variant: The average WSMAF or proportion of variant sites in each segment defined by plmaf_cut.
  - bucket_size: The number of loci in each bucket.
  - midpoints: The midpoint of each bucket.
- seq_error: The sequence error inferred.
- bin_size: The minimum size of each bin.
- cuts: The breaks utilized in splitting the data.
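As a rough illustration of how breaks of the form [a, b), bucket sizes, and midpoints relate to one another, the following base R sketch bins a handful of made-up PLMAF values. The values and the break points are assumptions chosen for the example; they are not what the package computes internally.

```r
# Hypothetical PLMAF values; not produced by sim_biallelic().
plmaf <- c(0.02, 0.07, 0.12, 0.18, 0.26, 0.31, 0.44)

# Hypothetical breaks; the package may choose its cuts differently.
cuts <- seq(0, 0.5, by = 0.1)

# right = FALSE gives left-closed, right-open intervals of the form [a, b).
plmaf_cut <- cut(plmaf, breaks = cuts, right = FALSE)

# Number of loci in each bucket.
bucket_size <- as.vector(table(plmaf_cut))

# Midpoint of each bucket: mean of its lower and upper break.
midpoints <- (head(cuts, -1) + tail(cuts, -1)) / 2

print(data.frame(plmaf_cut = levels(plmaf_cut), bucket_size, midpoints))
```

Using right = FALSE in cut() is what makes each interval include its lower bound and exclude its upper bound.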
Details
This function uses the output of sim_biallelic(), which creates simulated
data. The PLMAF is kept, and the function computes whether a SNP is a
variant site or not, based on the simulated WSMAF at that SNP. This process
additionally accounts for potential sequencing error. To check whether the
simulated WSMAF correctly indicated a variant site or not, the phased
haplotype of the parasites is computed.
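The variant-site step described above can be sketched in base R. This is only an illustration under stated assumptions: the PLMAF and WSMAF values are invented, and the rule used here (a SNP is a variant site when its WSMAF lies strictly between seq_error and 1 - seq_error) is an assumed way of accounting for sequencing error, not necessarily the rule the package applies.

```r
# Hypothetical values; in practice these would come from sim_biallelic().
plmaf <- c(0.05, 0.08, 0.15, 0.17, 0.25, 0.28)
wsmaf <- c(0.00, 0.005, 0.30, 1.00, 0.45, 0.995)

# Assumed level of sequencing error.
seq_error <- 0.01

# Assumption: a SNP is a variant site when both alleles are observed above
# the error level, i.e. seq_error < WSMAF < 1 - seq_error.
is_variant <- as.integer(wsmaf > seq_error & wsmaf < 1 - seq_error)

# Group loci into [a, b) PLMAF buckets (hypothetical breaks) and take the
# proportion of variant sites in each bucket.
plmaf_cut <- cut(plmaf, breaks = c(0, 0.1, 0.2, 0.3), right = FALSE)
m_variant <- tapply(is_variant, plmaf_cut, mean)

print(m_variant)
```

With a threshold of this kind, WSMAF values that differ from 0 or 1 only by an amount within the error level are not counted as variant sites.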
See also
process_real() to process real data.
Other simulated data functions: plot-simulation, sim_biallelic()