Generate the COI curve for real data.
Usage
process_real(
  wsmaf,
  plmaf,
  coverage,
  seq_error = 0.01,
  bin_size = 20,
  coi_method = "variant"
)
Arguments
- wsmaf
The within-sample minor allele frequency.
- plmaf
The population-level minor allele frequency.
- coverage
The read coverage at each locus.
- seq_error
The assumed level of sequencing error. If no value is supplied, the level of sequencing error is inferred.
- bin_size
This argument is no longer supported. All data points are used to estimate the COI; they are not grouped into bins of changing plaf.
- coi_method
The method used to generate the theoretical relationship: either "variant" or "frequency". The default is "variant".
Value
A list of the following:
- data: A tibble with the following columns:
  - plmaf_cut: Breaks of the form [a, b).
  - m_variant: The average WSMAF or proportion of variant sites in each segment defined by plmaf_cut.
  - bucket_size: The number of loci in each bucket.
  - midpoints: The midpoint of each bucket.
- seq_error: The inferred sequencing error.
- bin_size: The minimum size of each bin.
- cuts: The breaks used to split the data.
Details
The function determines whether each SNP is a variant site based on the WSMAF at that SNP, while accounting for potential sequencing error.
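The variant-site idea can be illustrated with a minimal base R sketch. The helper is_variant_site() is hypothetical and not part of the package, and the rule it applies (a SNP counts as a variant site only when its WSMAF exceeds seq_error) is an assumption made for illustration; the package's actual rule may differ.

```r
# Hypothetical helper for illustration only; it is not a package function.
# Assumption: a SNP is a variant site when its WSMAF is strictly greater
# than the assumed sequencing error, so that minor alleles explainable by
# sequencing error alone are not counted.
is_variant_site <- function(wsmaf, seq_error = 0.01) {
  stopifnot(is.numeric(wsmaf), is.numeric(seq_error), length(seq_error) == 1)
  wsmaf > seq_error
}

# Toy within-sample minor allele frequencies at five loci
wsmaf <- c(0, 0.005, 0.02, 0.3, 0.5)
variant <- is_variant_site(wsmaf)

# Proportion of loci classified as variant sites
prop_variant <- mean(variant)
```

Under this assumption, raising seq_error makes the classification more conservative: low-frequency minor alleles are treated as sequencing noise rather than as evidence of a variant site.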
See also
process_sim() to process simulated data.
