Computational Biology Group
University of Cambridge, Department of Oncology
BayesPeak

BayesPeak is an algorithm for finding enriched locations in ChIP-seq data. A package that provides a genome-wide analysis is being developed and will be released here. Last update: 7/10/09.

C and Perl codes
Instructions

The sequencing procedure that follows chromatin immunoprecipitation produces short sequences representing the ends of the fragments contained in the sample. Once these reads have been aligned back to the reference genome, the resulting bed files need to be converted into counts-per-window before BayesPeak can analyse them. The perl script above does exactly that: for each sample it produces two count files, one for the forward and one for the reverse DNA strand. To use it, copy the scripts and data files to your working directory and type:

  perl forBayesPeak.pl input_filename.bed window_length output_filename_forward.txt output_filename_reverse.txt
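The counts-per-window conversion described above can be sketched in C as follows. This is only an illustration of the idea, not the code of forBayesPeak.pl: the function name `count_bed_line`, the assumption of a 6-column bed line (chrom, start, end, name, score, strand), and the rule of assigning each read to the window containing its 5' end are all assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of the BayesPeak scripts): parse one
 * 6-column bed line and add the read to the forward or reverse count
 * array. Returns 1 if the read was counted, 0 if it was skipped. */
int count_bed_line(const char *line, long window_length,
                   long *forward, long *reverse, size_t n_windows)
{
    char chrom[64], name[64], score[32], strand;
    long start, end;

    assert(line != NULL && forward != NULL && reverse != NULL);
    if (window_length <= 0)
        return 0;

    /* Assumed bed columns: chrom, start, end, name, score, strand. */
    if (sscanf(line, "%63s %ld %ld %63s %31s %c",
               chrom, &start, &end, name, score, &strand) != 6)
        return 0;
    if (start < 0 || end <= start)
        return 0;
    if (strand != '+' && strand != '-')
        return 0;

    /* Assumption: a read is assigned by its 5' end, i.e. its start on
     * the forward strand and its last base (end - 1) on the reverse. */
    long pos = (strand == '-') ? end - 1 : start;
    size_t w = (size_t)(pos / window_length);
    if (w >= n_windows)
        return 0; /* read falls outside the counted region */

    if (strand == '+')
        forward[w]++;
    else
        reverse[w]++;
    return 1;
}
```

Calling this function once per line of the bed file, with window_length set to the chosen window size, fills two arrays that correspond to the forward and reverse count files.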
For example, to use the input and output files that follow with 300bp genomic windows, run forBayesPeak.pl with window_length set to 300.

Then, to do the peak-calling, compile the C code by typing:

  gcc BayesPeak_H3K4me3.c -lm -O3 -o out

and then run it by typing:

  out

or

  ./out

This will produce the files:

  parameter_estimates.txt, which contains the parameters of the model at each simulation (details will follow)

We recommend using a threshold of 0.50 on the output probabilities and then joining the resulting adjacent windows to define the peaks in the data.

In its present state, the code runs only on this dataset and analyses the specific region 92-95Mb on mouse chromosome 16 (mm9). This is a preliminary presentation of our algorithm and modifications will follow.

Example data and files
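The recommended post-processing, thresholding the probabilities at 0.50 and joining adjacent windows, can be sketched in C as below. This is a minimal illustration, not part of the BayesPeak code: the `Peak` struct and the `call_peaks` function are hypothetical names, the probabilities are assumed to be available as one value per window in genomic order, and a window is assumed to pass only when its probability is strictly greater than the threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical peak record: a run of adjacent windows above threshold. */
typedef struct {
    size_t first_window; /* index of the first window in the peak */
    size_t last_window;  /* index of the last window (inclusive) */
} Peak;

/* Keep windows whose probability exceeds the threshold and join runs of
 * adjacent kept windows into peaks. Writes at most max_peaks entries and
 * returns the number of peaks written. */
size_t call_peaks(const double *prob, size_t n_windows, double threshold,
                  Peak *peaks, size_t max_peaks)
{
    size_t n_peaks = 0;
    int in_peak = 0;

    assert(prob != NULL && peaks != NULL);
    for (size_t i = 0; i < n_windows; i++) {
        if (prob[i] > threshold) { /* assumption: strict inequality */
            if (!in_peak) {
                if (n_peaks == max_peaks)
                    break; /* output buffer is full */
                peaks[n_peaks].first_window = i;
                n_peaks++;
                in_peak = 1;
            }
            /* Extend the current peak to include this window. */
            peaks[n_peaks - 1].last_window = i;
        } else {
            in_peak = 0;
        }
    }
    return n_peaks;
}
```

With 300bp windows, a peak spanning window indices a to b covers the bases from a × 300 to (b + 1) × 300 relative to the start of the analysed region.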
This site is being updated. For any enquiries about the above scripts and data, contact C.Spyrou[at]statslab.cam.ac.uk