Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data

Authors / Contributors:
[Federico Comoglio, Cem Sievers, Renato Paro]
Place, Publisher, Year:
2015
Contained in:
BMC Bioinformatics, 16, p. 32
Format:
Article (online)
ID: 528784714
LEADER naa a22 4500
001 528784714
005 20180924065517.0
007 cr unu---uuuuu
008 180924e201502 xx s 000 0 eng
024 7 0 |a 10.3929/ethz-b-000110087  |2 doi 
024 7 0 |a 10.1186/s12859-015-0470-y  |2 doi 
035 |a (ETHRESEARCH)oai:www.research-collecti.ethz.ch:20.500.11850/110087 
100 1 |a Comoglio  |D Federico 
245 1 0 |a Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data  |h [Electronic data]  |c [Federico Comoglio, Cem Sievers, Renato Paro] 
246 0 |a BMC bioinformatics 
506 |a Open access  |2 ethresearch 
520 3 |a Background: PAR-CLIP is a recently developed Next Generation Sequencing-based method enabling transcriptome-wide identification of interaction sites between RNA and RNA-binding proteins. The PAR-CLIP procedure induces specific base transitions that originate from sites of RNA-protein interactions and can therefore guide the identification of binding sites. However, additional sources of transitions, such as cell type-specific SNPs and sequencing errors, challenge the inference of binding sites, and suitable statistical approaches are crucial to control false discovery rates. In addition, a highly resolved delineation of binding sites followed by an extensive downstream analysis is necessary for a comprehensive characterization of the protein binding preferences and the subsequent design of validation experiments. Results: We present a statistical and computational framework for PAR-CLIP data analysis. We developed a sensitive transition-centered algorithm specifically designed to resolve protein binding sites at high resolution in PAR-CLIP data. Our method employs a Bayesian network approach to associate posterior log-odds with the observed transitions, providing an overall quantification of the confidence in RNA-protein interaction. We use published PAR-CLIP data to demonstrate the advantages of our approach, which compares favorably with alternative algorithms. Lastly, by integrating RNA-Seq data, we compute conservative, experimentally based false discovery rates of our method and demonstrate the high precision of our strategy. Conclusions: Our method is implemented in the R package wavClusteR 2.0. The package is distributed under the GPL-2 license and is available from BioConductor at http://www.bioconductor.org/packages/devel/bioc/html/wavClusteR.html. 
540 |a Creative Commons Attribution 4.0 International  |u http://creativecommons.org/licenses/by/4.0  |2 ethresearch 
690 7 |a PAR-CLIP  |2 ethresearch 
690 7 |a RNA  |2 ethresearch 
690 7 |a RNA binding proteins  |2 ethresearch 
690 7 |a Bayesian statistics  |2 ethresearch 
700 1 |a Sievers  |D Cem  |e joint author 
700 1 |a Paro  |D Renato  |e joint author 
773 0 |t BMC Bioinformatics  |d London : BioMed Central  |g 16, p. 32  |x 1471-2105 
856 4 0 |u http://hdl.handle.net/20.500.11850/110087  |q text/html  |z WWW backlink to the repository (Open access) 
908 |D 1  |a Journal Article  |2 ethresearch 
950 |B ETHRESEARCH  |P 856  |E 40  |u http://hdl.handle.net/20.500.11850/110087  |q text/html  |z WWW backlink to the repository (Open access) 
950 |B ETHRESEARCH  |P 100  |E 1-  |a Comoglio  |D Federico 
950 |B ETHRESEARCH  |P 700  |E 1-  |a Sievers  |D Cem  |e joint author 
950 |B ETHRESEARCH  |P 700  |E 1-  |a Paro  |D Renato  |e joint author 
950 |B ETHRESEARCH  |P 773  |E 0-  |t BMC Bioinformatics  |d London : BioMed Central  |g 16, p. 32  |x 1471-2105 
898 |a BK010053  |b XK010053  |c XK010000 
949 |B ETHRESEARCH  |F ETHRESEARCH  |b ETHRESEARCH  |j Journal Article  |c Open access