Generate a table of the prevalence of each unique mutation. To ensure confidence in the results, a threshold sets the minimum confidence required of a genotype call. All data that do not meet this threshold are removed before the prevalence is computed.
Arguments
- data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr).
- threshold
A minimum UMI count which reflects the confidence in the genotype call. Data with a UMI count of less than the threshold will be filtered out from the analysis.
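The filtering step described for threshold can be illustrated with a minimal base R sketch. The toy data frame and its umi_count column are hypothetical and are not taken from the package; the real function may apply the threshold to a differently named column.

```r
# Toy data: one row per sample for a single mutation.
# `umi_count` is a hypothetical column name used only for illustration.
toy <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  mutation_name = "crt-Lys76Thr",
  umi_count = c(20, 5, 4, 0)
)

threshold <- 5

# Rows with a UMI count below the threshold are dropped;
# rows that meet the threshold are kept.
kept <- toy[toy$umi_count >= threshold, ]

nrow(kept)
#> [1] 2
```

Samples S3 and S4 fall below the threshold of 5, so only two rows remain for any later calculation.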
Value
A tibble with the extra class mut_prev. The output has the following columns:
- mutation_name: The unique mutation sequenced.
- n_total: The number of samples for which a mutation site was sequenced.
- n_mutant: The number of samples for which a mutation occurred.
- prevalence: The prevalence of the mutation.
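The example output on this page is consistent with prevalence being the ratio n_mutant / n_total. The sketch below checks that relationship on three rows copied from that output; it is an illustration of the arithmetic, not the package's implementation.

```r
# Counts copied from three rows of the example output on this page.
counts <- data.frame(
  mutation_name = c("atp6-Gly639Asp", "atp6-Ser466Asn", "crt-Lys76Thr"),
  n_total = c(26L, 15L, 29L),
  n_mutant = c(19L, 9L, 25L)
)

# Prevalence as the fraction of sequenced samples carrying the mutation.
counts$prevalence <- counts$n_mutant / counts$n_total

round(counts$prevalence, 3)
#> [1] 0.731 0.600 0.862
```

The rounded values match the prevalence column reported for the same mutations in the example output.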
See also
plot_mutation_prevalence() for plotting the table.
Examples
ref_file <- miplicorn_example("reference_AA_table.csv")
alt_file <- miplicorn_example("alternate_AA_table.csv")
cov_file <- miplicorn_example("coverage_AA_table.csv")
data <- read_tbl_ref_alt_cov(
  ref_file,
  alt_file,
  cov_file,
  gene == "atp6" | gene == "crt"
)
mutation_prevalence(data, 5)
#> # A tibble: 16 × 4
#>    mutation_name  n_total n_mutant prevalence
#>    <chr>            <int>    <int>      <dbl>
#>  1 atp6-Ala623Glu      36       NA     NA
#>  2 atp6-Glu431Lys      39       NA     NA
#>  3 atp6-Gly639Asp      26       19      0.731
#>  4 atp6-Ser466Asn      15        9      0.6
#>  5 atp6-Ser769Asn      17       NA     NA
#>  6 crt-Ala220Ser       11        4      0.364
#>  7 crt-Asn326Asp       21        8      0.381
#>  8 crt-Asn326Ser       26       NA     NA
#>  9 crt-Asn75Glu        29       24      0.828
#> 10 crt-Cys72Ser        31       23      0.742
#> 11 crt-His97Leu        47       NA     NA
#> 12 crt-His97Tyr        47       NA     NA
#> 13 crt-Ile356Leu       22       15      0.682
#> 14 crt-Ile356Thr       41       18      0.439
#> 15 crt-Lys76Thr        29       25      0.862
#> 16 crt-Met74Ile        29       24      0.828