Predict the COI of the sample.

Usage

compute_coi(
  data,
  data_type,
  max_coi = 25,
  seq_error = 0.01,
  bin_size = 20,
  comparison = "overall",
  distance = "squared",
  coi_method = "variant",
  use_bins = FALSE
)

Arguments

data

The data for which the COI will be computed.

data_type

The type of the data to be analyzed. One of "sim" or "real".

max_coi

A number indicating the maximum COI to compare the simulated data to.

seq_error

The assumed level of sequencing error. If no value is supplied, the level of sequencing error is inferred.

bin_size

[Deprecated] This argument is no longer supported. All data points are used to estimate the COI; they are not grouped into bins of changing PLAF.

comparison

[Deprecated] This argument is no longer supported; this function will compare the theoretical curve and sample curve for all PLMAFs.

distance

[Deprecated] This argument is no longer supported; this function will solve a weighted least squares minimization problem.

coi_method

The method used to generate the theoretical relationship. One of "variant" or "frequency". The default value is "variant".

use_bins

[Deprecated] This argument is no longer supported. All data points are used to estimate the COI; they are not grouped into bins of changing PLAF.

Value

A list of the following:

  • coi: The predicted COI of the sample.

  • probability: A probability density function representing the probability of each COI.

Details

Compares the sample's within-sample minor allele frequency (WSMAF) and population-level minor allele frequency (PLMAF) with the theoretical relationship between the two. Comparing the sample's WSMAF and PLMAF with their theoretical counterparts yields an estimate of the sample's COI. We refer to the sample's WSMAF vs PLMAF as the "sample curve" and to the theoretical WSMAF vs PLMAF as the "theoretical curve." To determine the predicted COI value, one of three methods can be selected:

end

Determines the distance between the theoretical and sample curves at a PLMAF of 0.5. The predicted COI is that of the theoretical curve with the smallest distance to the sample curve at this point.

ideal

Determines the distance between the theoretical and sample curves at the ideal PLMAF. The ideal PLMAF is found by looking at the difference between the theoretical curves for a COI of \(i\) and a COI of \(i-1\) and finding the PLMAF at which this difference is largest. The predicted COI is that of the theoretical curve with the smallest distance to the sample curve at the ideal PLMAF.

overall

Determines the distance between each theoretical curve and the sample curve across all PLMAFs. The predicted COI is that of the theoretical curve with the smallest distance to the sample curve. One of several distance metrics can be chosen:

  • abs_sum: Absolute value of the sum of the differences.

  • sum_abs: Sum of the absolute differences.

  • squared: Sum of the squared differences.
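
The three comparison methods and the three distance metrics described above can be illustrated with a small base R sketch. It does not reproduce the internals of compute_coi(): toy_curve() is a hypothetical stand-in for the theoretical curve, the PLMAF grid and the candidate COIs are made up, and the sample curve is constructed to match the COI = 3 curve exactly.

```r
# Toy stand-in for a theoretical curve; NOT the package's actual model.
toy_curve <- function(plmaf, coi) 1 - plmaf^coi - (1 - plmaf)^coi

plmaf <- seq(0.05, 0.5, by = 0.05)  # hypothetical PLMAF grid
cois <- 1:5                         # hypothetical candidate COIs

# One row per candidate COI, one column per PLMAF value.
theory <- t(sapply(cois, function(k) toy_curve(plmaf, k)))

# Pretend the sample follows the COI = 3 curve exactly.
sample_curve <- toy_curve(plmaf, 3)

# The three distance metrics, applied to a vector of differences.
distance_fns <- list(
  abs_sum = function(d) abs(sum(d)),
  sum_abs = function(d) sum(abs(d)),
  squared = function(d) sum(d^2)
)

# "overall": distance over all PLMAFs; keep the COI with the smallest one.
overall_coi <- sapply(distance_fns, function(f) {
  dists <- apply(theory, 1, function(row) f(row - sample_curve))
  cois[which.min(dists)]
})

# "end": compare the curves only at the PLMAF closest to 0.5.
end_idx <- which.min(abs(plmaf - 0.5))
end_coi <- cois[which.min(abs(theory[, end_idx] - sample_curve[end_idx]))]

# "ideal": for each COI i >= 2, the PLMAF at which the theoretical curves
# for COI i and COI i - 1 differ the most.
ideal_idx <- sapply(2:length(cois), function(i) {
  which.max(abs(theory[i, ] - theory[i - 1, ]))
})
ideal_plmaf <- plmaf[ideal_idx]
```

Because the toy sample curve coincides with one of the candidate curves, every metric selects the same COI here. With noisy data the metrics can disagree, since abs_sum lets positive and negative differences cancel while sum_abs and squared do not.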