Next-generation cancer strategies are based on next-generation sequencing (NGS), which paves the way for new techniques and tools to detect mutations and determine patient treatment. A team of Chinese researchers has proposed a more effective strategy for filtering out false positive results, which would improve the accuracy and efficiency of cancer diagnosis and treatment.
The research team proposed DeepFilter, a deep learning-based filter for removing false positives among somatic variants called from NGS data.
Their study was published on January 6, 2023, in Tsinghua Science and Technology.
Finding somatic mutations, or changes in normal tissues, is key to understanding fatal genetic diseases of the human genome such as cancer. Next-generation sequencing is accelerating the search for somatic mutations through techniques that separate DNA/RNA into multiple fragments and identify sequences in parallel, producing thousands or millions of sequences at the same time. This technology improves accuracy while reducing sequencing cost and time.
Powerful “calling tools” comb through NGS data and detect tumor or other mutations by comparing sequences to a reference genome from related tissue in the same individual.
VarDict is a somatic variant calling tool commonly used in clinical research. Previous studies have shown that VarDict achieves higher accuracy and detects more true variants than similar calling tools. However, VarDict also produces more false positives than other callers, which may skew the results.
“An error rate of 1 in 10,000 in a genome with 3 billion loci will lead to many false calls, which may lead to inaccurate clinical diagnoses,” said Zekun Yin, study author from Shandong University. “However, filtering out true positives may also lead to a missed diagnosis.”
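To put that figure in perspective, the arithmetic behind the quote can be sketched in a few lines of Python. The two numbers are the ones Yin cites; treating every locus as one independent chance for a false call is a simplifying assumption made here for illustration, and the function name is hypothetical.

```python
# Back-of-the-envelope estimate of false calls across a genome.
# Assumption (not from the study): every locus is one independent
# opportunity for a false call at the quoted rate.
GENOME_LOCI = 3_000_000_000  # about 3 billion loci, as quoted
ONE_ERROR_PER = 10_000       # error rate of 1 in 10,000, as quoted


def expected_false_calls(loci: int, one_error_per: int) -> float:
    """Expected false calls = number of loci / loci per error."""
    return loci / one_error_per


if __name__ == "__main__":
    estimate = expected_false_calls(GENOME_LOCI, ONE_ERROR_PER)
    print(f"Expected false calls: {estimate:,.0f}")
```

Under these assumptions the estimate comes to roughly 300,000 false calls per genome, which is why even a small per-locus error rate matters in practice.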
Typically, researchers manually filter out some false positives, a cumbersome and expensive process that the Chinese research team set out to mitigate.
“It would save a lot of time and money if we introduced an automatic method to effectively filter out most false positives,” said Hao Zhang, study author from Shandong University.
Inspired by recent successes in applying machine learning-based methods to call genetic variants from NGS data, the Chinese research team introduced a deep learning-based variant filter. Dubbed DeepFilter, it was designed to filter out false positive variants generated by VarDict while maintaining high calling sensitivity.
DeepFilter treats the task of deciding whether a variant is true or false as a binary classification problem. The researchers used three types of datasets to train and test DeepFilter: real-world tumor-normal sample data, a combination of gold-standard data, and synthetic data.
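The idea of filtering as binary classification can be illustrated with a deliberately small sketch. This is not DeepFilter's architecture: it swaps the deep network for a plain logistic regression written in standard-library Python, and the two features (a scaled allele fraction and a scaled base quality), the toy training data, and all function names are hypothetical.

```python
import math

# Hypothetical toy data: each candidate variant is described by two
# made-up features in [0, 1]. Label 1 = true variant, 0 = false positive.
TRAIN_X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.85],
           [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
TRAIN_Y = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability that a candidate is a true variant."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(samples, labels, epochs=500, lr=0.5):
    """Fit logistic regression with per-sample gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict_proba(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def filter_variants(candidates, weights, bias, threshold=0.5):
    """Keep only candidates classified as true variants."""
    return [c for c in candidates
            if predict_proba(weights, bias, c) >= threshold]


if __name__ == "__main__":
    w, b = train(TRAIN_X, TRAIN_Y)
    calls = [[0.95, 0.9], [0.1, 0.1]]
    print(filter_variants(calls, w, b))
```

The threshold controls the trade-off Yin describes: raising it removes more false positives but risks discarding true variants, while lowering it does the opposite.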
Experimental results based on both synthetic and real-world NGS data have been promising:
“DeepFilter outperforms other filters in terms of false-positive variant filtering tasks, making VarDict more valuable in practical clinical research and greatly facilitating downstream analysis in biological research and patient treatment,” Zhang said.
The team plans to dig deeper into the problem of filtering false positive variants, specifically looking at the imbalance between positive and negative samples and at integrating other machine learning and deep learning filtering methods.
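One common way to handle such an imbalance, shown here only as an illustration and not as the team's chosen approach, is to weight each class inversely to its frequency so that the rarer class counts for more during training. The sketch below uses the same heuristic that scikit-learn documents for its "balanced" class weights; the 90/10 label split is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class as n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical split: 90 false positives (0) and 10 true variants (1).
    labels = [0] * 90 + [1] * 10
    print(balanced_class_weights(labels))
```

With these weights, each sample from the rare class contributes nine times as much to the loss as a sample from the common class, which keeps a classifier from simply predicting the majority label.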
“Our ultimate goal is to solve the problem of the running efficiency and accuracy of variant calling and provide the latest variant detection tool,” Yin said.
Hao Zhang et al., DeepFilter: a deep learning-based variant filter for VarDict, Tsinghua Science and Technology (2023). DOI: 10.26599/TST.2022.9010032
Provided by Tsinghua University Press
Citation: A deep learning filter improves accuracy of detecting cellular mutations and accuracy of cancer diagnosis (2023, February 1). Retrieved February 1, 2023, from https://medicalxpress.com/news/2023-02-deep-learning-filter-precision-cell-mutation.html