9/03/2014

Biases in RNA deep sequencing data

Biases in small RNA deep sequencing data

  1. Timofey S. Rozhdestvensky¹,*
Author Affiliations
  1. ¹Institute of Experimental Pathology (ZMBE), University of Muenster, Von-Esmarch-Strasse 56, 48149 Muenster, Germany and ²Advanced Medical and Dental Institute (AMDI), Universiti Sains Malaysia, 13200 Penang, Malaysia
  1. *To whom correspondence should be addressed. Tel: +49 251 8358607; Fax: +49 251 8352134; Email: rozhdest@uni-muenster.de
  2. Correspondence may also be addressed to Carsten A. Raabe. Tel: +49 251 8358615; Fax: +49 251 8352134; Email: raabec@uni-muenster.de

High-throughput RNA sequencing (RNA-seq) is considered a powerful tool for novel gene discovery and fine-tuned transcriptional profiling. The digital nature of RNA-seq is also believed to simplify meta-analysis and to reduce the background noise associated with hybridization-based approaches. The development of multiplex sequencing enables efficient and economical parallel analysis of gene expression. In addition, RNA-seq is of particular value when low RNA expression or modest changes between samples are monitored. However, recent data have uncovered severe bias in the sequencing of small non-protein-coding RNA (small RNA-seq or sRNA-seq), such that the expression levels of some RNAs appear artificially enhanced while others appear diminished or are even undetectable. The use of different adapters and barcodes during ligation, as well as complex RNA structures and modifications, drastically influences cDNA synthesis efficiency; these are examples of sources of bias in deep sequencing. In addition, variation in the G/C content of individual RNAs is associated with unequal polymerase chain reaction (PCR) amplification efficiencies. Given the central importance of RNA-seq to molecular biology and personalized medicine, the authors review recent findings that challenge small non-protein-coding RNA-seq data and suggest approaches and precautions to overcome or minimize bias.

