RGlab alumni Eloi Mercier and Arnaud Droit will be giving a Bioconductor practical at this year’s BioC, held in Seattle from July 27th through 29th.
Transcription factors (TFs) play critical roles in regulating gene expression. Determining transcription factor binding sites (TFBSs) is challenging because the DNA segments recognized by TFs are often short and dispersed in the genome, and the target loci of a TF vary between tissues, developmental stages and physiological conditions. Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) has become the standard for genome-wide profiling of the DNA association of transcription factors. However, analysis of the massive and heterogeneous datasets from these studies poses several challenges, including effective data visualization, seamless connection of low-level (close to raw data) and high-level (close to answering biological questions) analysis, integration of data from multiple technological platforms, and flexibility to customize the analysis so that specific biological questions can be addressed. Although several recently developed programs target some of the individual steps, we describe an integrated, open source, R-based analysis pipeline. The pipeline addresses data input, peak detection, sequence and motif analysis, visualization, and data export, and can readily be extended via other R and Bioconductor packages.
We propose a lab session to present the first complete set of Bioconductor tools for sequence and motif analysis of ChIP-seq data. Its core consists of three Bioconductor packages: PICS calls enriched regions, rGADEM identifies de novo motifs, and MotIV visualizes and annotates motifs. The pipeline is computationally efficient, and its packages have been designed to work together and with other Bioconductor packages (ShortRead, ChIPpeakAnno, rtracklayer, GenomeGraphs, etc.).



