Dimont sequence extractor using own genome

Description

Dimont sequence extractor prepares an annotated FastA file, as required by Dimont, from a genome (in FastA format) and a tabular file (e.g., BED, GTF, narrowPeak, ...).

This version of the tool allows you to use custom genomes.

Inputs

Parameters

Details

The regions specified in the tabular file determine the center of the extracted sequences. All extracted sequences have the same length, as specified by the parameter "Width".
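
The centering step can be illustrated with a minimal Python sketch. It is not the tool's implementation: the function names center_window and extract are hypothetical, and the sketch assumes 0-based, half-open coordinates (as in BED), the region midpoint as the center, and that windows running past a chromosome end are skipped. The tool itself may handle these cases differently.

```python
def center_window(start, end, width):
    """Return (window_start, window_end) of a fixed-width window.

    Assumption: coordinates are 0-based and half-open, and the
    center is the midpoint of the region.
    """
    center = (start + end) // 2
    window_start = center - width // 2
    return window_start, window_start + width


def extract(genome, chrom, start, end, width):
    """Cut the window out of an in-memory genome {chrom: sequence}.

    Assumption: a window that would run past either chromosome
    end is skipped, signalled here by returning None.
    """
    seq = genome[chrom]
    w_start, w_end = center_window(start, end, width)
    if w_start < 0 or w_end > len(seq):
        return None
    return seq[w_start:w_end]


genome = {"chr1": "ACGTACGTACGTACGTACGT"}  # toy genome, 20 bp
print(extract(genome, "chr1", 8, 12, 6))   # window of width 6 around position 10
```

Under these assumptions, the center always lies at offset width // 2 inside the extracted sequence, so a width of 1000 places it at position 500.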

For ChIP data, the center position could, for instance, be the peak summit. An annotated FastA file for ChIP-seq data comprising sequences of length 1000 centered around the peak summit might look like:

> peak: 500; signal: 515
ggccatgtgtatttttttaaatttccac...
> peak: 500; signal: 199
GGTCCCCTGGGAGGATGGGGACGTGCTG...
...

where the anchor point is given as 500 for the first two sequences, and the confidence (the "signal" value) amounts to 515 and 199, respectively.
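
As a rough sketch of how such annotated headers could be written and read back in Python, consider the following. The helper names format_record and parse_header and the regular expression are illustrative assumptions derived only from the example above; they are not part of Dimont.

```python
import re

# Pattern for headers of the form "> peak: 500; signal: 515"
# (an assumption based on the example, not a format specification).
HEADER = re.compile(r">\s*peak:\s*(\d+);\s*signal:\s*([0-9.eE+-]+)")


def format_record(anchor, signal, sequence):
    """Build one annotated FastA record: header line plus sequence line."""
    return f"> peak: {anchor}; signal: {signal}\n{sequence}\n"


def parse_header(line):
    """Return (anchor, signal) from an annotated header line."""
    match = HEADER.match(line)
    if match is None:
        raise ValueError(f"not an annotated header: {line!r}")
    return int(match.group(1)), float(match.group(2))


record = format_record(500, 515, "ggccatgtgtatttttttaaatttccac")
anchor, signal = parse_header(record.splitlines()[0])
print(anchor, signal)
```

Parsing the signal as a float is a choice made for this sketch, so that non-integer confidence values would also be accepted.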

Output

Reference

If you use Dimont, please cite

J. Grau, S. Posch, I. Grosse, and J. Keilwagen. A general approach for discriminative de-novo motif discovery from high-throughput data. Nucleic Acids Research, 41(21):e197, 2013.