Identifying Sample Identity in High-Throughput Sequencing Data
Author Information
Author(s): Rachel L. Goldfeder, Stephen C. J. Parker, Ajay Subramanian, Hatice Ozel Abaan, Elliott H. Margulies
Primary Institution: National Human Genome Research Institute, National Institutes of Health
Hypothesis
Can genotype concordance rates be used to determine whether two lanes of HiSeq 2000 data come from the same sample?
Conclusion
The study demonstrates that genotype concordance rates can effectively distinguish between lanes from the same sample and those from different samples.
Supporting Evidence
- The method uses genotype concordance rates to confirm sample identity.
- Distributions of concordance rates are non-overlapping for same-sample versus different-sample comparisons.
- The approach is robust even with varying numbers of reads analyzed.
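As a rough illustration of the quantity behind these points, a genotype concordance rate can be sketched as the fraction of sites called in both lanes where the two calls agree. This summary does not describe the study's genotype caller, site selection, or file formats, so the dictionary representation, the function name `concordance_rate`, and the toy calls below are all assumptions made for illustration only.

```python
def concordance_rate(calls_a, calls_b):
    """Fraction of sites genotyped in both lanes where the calls agree.

    Each argument maps a (chromosome, position) site to a genotype string.
    This representation is an assumption, not the study's data format.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("no sites genotyped in both lanes")
    matches = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return matches / len(shared)


# Toy genotype calls for two hypothetical lanes.
lane_1 = {
    ("chr1", 1000): "A/G",
    ("chr1", 2000): "C/C",
    ("chr2", 500): "T/T",
    ("chr2", 900): "G/G",   # called only in lane_1, so ignored
}
lane_2 = {
    ("chr1", 1000): "A/G",
    ("chr1", 2000): "C/C",
    ("chr2", 500): "C/T",   # disagrees with lane_1
    ("chr3", 100): "A/A",   # called only in lane_2, so ignored
}

# Three shared sites, two of which agree.
rate = concordance_rate(lane_1, lane_2)
print(f"{rate:.3f}")
```

Restricting the comparison to sites called in both lanes keeps the rate comparable when the two lanes contribute different numbers of reads, which is one plausible reading of the robustness point above.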
Takeaway
The researchers found a way to tell whether two lanes of sequencing data come from the same person by measuring how often their genotype calls agree.
Methodology
The study analyzed 24 lanes of HiSeq 2000 data from three different human samples to compare genotype concordance rates.
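The pairwise design implied here can be sketched as follows: every pair of lanes is labelled same-sample or different-sample, and the two groups of concordance rates are then checked for overlap. With 24 lanes there are 276 possible pairs. The even split of eight lanes per sample, the lane and sample names, and both helper functions are assumptions for illustration; this summary does not say how the lanes were distributed across the three samples.

```python
from itertools import combinations


def split_pairs(lane_to_sample):
    """Partition all lane pairs into same-sample and different-sample lists."""
    same, different = [], []
    for a, b in combinations(sorted(lane_to_sample), 2):
        if lane_to_sample[a] == lane_to_sample[b]:
            same.append((a, b))
        else:
            different.append((a, b))
    return same, different


def non_overlapping(same_rates, different_rates):
    """True if every same-sample rate exceeds every different-sample rate."""
    return min(same_rates) > max(different_rates)


# Hypothetical layout: 24 lanes, assumed to be split evenly across 3 samples.
lane_to_sample = {f"lane{i:02d}": f"sample{i % 3}" for i in range(24)}

same, different = split_pairs(lane_to_sample)
print(len(same), len(different))  # 84 192
```

Once a concordance rate has been computed for each pair, `non_overlapping` expresses the separation criterion: if it holds, any threshold between the two groups distinguishes same-sample from different-sample lanes in that data set.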
Limitations
The method may not be applicable to all sequencing platforms and relies on sufficient data quality.
Participant Demographics
Samples included unrelated male and female individuals of Caucasian descent with known partial consanguinity.