Data Handling Strategies for High Throughput Pyrosequencers
Author Information
Author(s): Gabriele A. Trombetti, Raoul JP Bonnal, Ermanno Rizzi, Gianluca De Bellis, Luciano Milanesi
Primary Institution: Institute for Biomedical Technologies – National Research Council (ITB-CNR)
Hypothesis
How can we effectively handle and analyze data from new high throughput pyrosequencers?
Conclusion
We developed an automated computation pipeline that effectively manages the increased data throughput from new pyrosequencers by utilizing the European Grid for computational power.
Supporting Evidence
- The pipeline successfully analyzed 273 sequenced amplicons from a cancerous human sample.
- The mutations it detected were confirmed either by Sanger resequencing or by matching entries in NCBI dbSNP.
- The Grid platform provided a cost-effective solution for uneven workloads in scientific research.
Takeaway
We built a smart computer program to help scientists analyze DNA data faster and cheaper using powerful computers in Europe.
Methodology
An automated multi-step computation pipeline integrated with a database storage system was created to analyze sequenced amplicons.
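The shape of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation: the step names (trim_read, find_mismatches), the table layout, and the toy reads are all hypothetical, and an in-memory SQLite database stands in for the storage system, which this summary does not specify. The sketch only shows how each amplicon might pass through successive steps, with results written to a database.

```python
import sqlite3
from typing import Dict, List


def trim_read(read: str) -> str:
    """Hypothetical cleaning step: drop trailing ambiguous bases (N)."""
    return read.rstrip("N")


def find_mismatches(read: str, reference: str) -> List[int]:
    """Hypothetical analysis step: 0-based positions where the read differs
    from the reference. Naive and ungapped; a real pipeline would align first."""
    return [i for i, (a, b) in enumerate(zip(read, reference)) if a != b]


def run_pipeline(amplicons: Dict[str, str], reference: str,
                 db: sqlite3.Connection) -> None:
    """Run every amplicon through the steps in order and store the results."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS mismatches (amplicon TEXT, position INTEGER)"
    )
    for name, read in amplicons.items():
        cleaned = trim_read(read)
        for position in find_mismatches(cleaned, reference):
            db.execute("INSERT INTO mismatches VALUES (?, ?)", (name, position))
    db.commit()


# Toy data, invented for illustration only.
db = sqlite3.connect(":memory:")
run_pipeline({"amp1": "ACGTACGT", "amp2": "ACGAACGTNN"}, "ACGTACGT", db)
rows = db.execute(
    "SELECT amplicon, position FROM mismatches ORDER BY amplicon, position"
).fetchall()
print(rows)
```

Keeping each step a plain function and writing results to the database as they are produced means that independent amplicons could, in principle, be dispatched as separate jobs to remote compute nodes, which is the kind of workload the Grid is described as serving.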
Limitations
The study discusses the peculiarities of the new pyrosequencer and the challenges of using the Grid for computation.
Participant Demographics
A cancerous human sample was used for sequencing; no participant demographics are reported.