Finding optimal threshold for correction error reads in DNA assembling
2009

Finding the Best Way to Correct Errors in DNA Reads

Publication Evidence: high

Author Information

Author(s): Chin Francis YL, Leung Henry CM, Li Wei-Lin, Yiu Siu-Ming

Primary Institution: Department of Computer Science, The University of Hong Kong

Hypothesis

Can an optimal threshold be determined for deciding which read substrings are correct when correcting errors in reads used for DNA assembly?

Conclusion

The study presents a method for calculating the threshold that minimizes the total number of false positives and false negatives when correcting reads for DNA assembly. With this threshold, total errors fell by 77.6% relative to ECINDEL and by 65.1% relative to SRCorr.

Supporting Evidence

  • The authors' method reduced total errors (false positives plus false negatives) by 77.6% compared to ECINDEL and by 65.1% compared to SRCorr.
  • The optimal threshold M was calculated to minimize the combined number of false positives and false negatives.
  • Experimental results matched theoretical calculations for both real and simulated data.
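
The role of the threshold M can be illustrated with a minimal sketch. It assumes that a substring of length k counts as correct when it appears at least M times across the reads, and that a true substring falling below M counts as a false negative. The function names, the toy genome and the toy reads are hypothetical and are not taken from the paper, ECINDEL or SRCorr.

```python
from collections import Counter


def count_substrings(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def trusted_substrings(counts, m):
    """Substrings seen at least m times are treated as correct."""
    return {s for s, c in counts.items() if c >= m}


def error_counts(trusted, genome, k):
    """Compare the trusted set against the true substrings of a known genome.

    Assumption: a false positive is a trusted substring absent from the genome,
    a false negative is a genome substring that is not trusted.
    """
    true_set = {genome[i:i + k] for i in range(len(genome) - k + 1)}
    false_positives = len(trusted - true_set)
    false_negatives = len(true_set - trusted)
    return false_positives, false_negatives


# Toy data: the third read carries one substitution error (A -> T).
genome = "ACGTAC"
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
counts = count_substrings(reads, k=3)

for m in (1, 2, 3):
    fp, fn = error_counts(trusted_substrings(counts, m), genome, k=3)
    print(f"M={m}: false positives={fp}, false negatives={fn}")
```

On this toy input a low threshold accepts the erroneous substrings, while a high threshold rejects true ones, which is the trade-off the optimal M is meant to balance.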

Takeaway

This study helps scientists fix mistakes in DNA reads by working out how many times a piece of DNA must appear in the data before it is treated as correct.

Methodology

The authors calculated the probabilities of false positives and false negatives for different substring lengths k and thresholds M, then determined the threshold M that minimizes the total number of errors.
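
As a rough illustration of this kind of calculation, the sketch below picks the threshold that minimizes expected false positives plus false negatives under a deliberately simplified model. It is not the authors' derivation. It assumes that the count of each true substring follows a Binomial(n, p_true) distribution and the count of each erroneous substring follows Binomial(n, p_err), with one shared probability per class. All names and the numbers in the usage example are hypothetical.

```python
def binom_cdf_below(n, p, m):
    """P(X < m) for X ~ Binomial(n, p), built term by term to avoid huge factorials."""
    term = (1.0 - p) ** n          # P(X = 0)
    total = 0.0
    for j in range(m):
        total += term
        # P(X = j + 1) from P(X = j)
        term *= (n - j) / (j + 1) * p / (1.0 - p)
    return min(total, 1.0)


def expected_errors(m, n, p_true, p_err, num_true, num_err):
    """Expected (false positives, false negatives) at threshold m.

    Assumed model: a true substring is a false negative if it is seen fewer
    than m times; an erroneous substring is a false positive if it is seen
    at least m times.
    """
    false_negatives = num_true * binom_cdf_below(n, p_true, m)
    false_positives = num_err * (1.0 - binom_cdf_below(n, p_err, m))
    return false_positives, false_negatives


def optimal_threshold(n, p_true, p_err, num_true, num_err, max_m):
    """Return the m in 1..max_m with the smallest expected total error."""
    return min(
        range(1, max_m + 1),
        key=lambda m: sum(expected_errors(m, n, p_true, p_err, num_true, num_err)),
    )


# Hypothetical setting: true substrings are sampled ten times as often as erroneous ones.
params = dict(n=1000, p_true=0.02, p_err=0.002, num_true=1000, num_err=5000)
best_m = optimal_threshold(max_m=40, **params)
fp, fn = expected_errors(best_m, **params)
print(f"best M = {best_m}, expected FP = {fp:.2f}, expected FN = {fn:.2f}")
```

Raising m lowers the expected false positives and raises the expected false negatives, so the minimum of their sum lies between the two mean counts. A real data set would need per-substring sampling probabilities rather than the two shared values assumed here.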

Limitations

The method may not eliminate all false positives and false negatives, because sampling probabilities vary in real biological data.

Digital Object Identifier (DOI)

10.1186/1471-2105-10-S1-S15
