This tool retrieves promoter sequences for a set of genes and finds sequence motifs shared among them. Currently the tool works only for human, mouse, rat, drosophila and yeast genes.
The tool retrieves upstream sequences for the specified genes and submits them to the Weeder program. It needs access to the chip-specific annotations, so it will not work if the chiptype was not specified during normalization (for example, of Illumina data). RefSeq IDs are used to retrieve promoter sequences constructed and annotated by the UCSC Genome Browser staff. The same promoter sequences can be downloaded as a single FASTA-formatted file from the UCSC Golden Path folder. The user can define the length of the promoter sequences used for the analysis:
          Human     Mouse     Rat       Drosophila   Yeast
Small     1000 bp   1000 bp   1000 bp   1000 bp      500 bp
Medium    2000 bp   2000 bp   2000 bp   2000 bp      1000 bp
Large     5000 bp   5000 bp   5000 bp   5000 bp      2500 bp
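The table above can be read as a simple lookup from species and size setting to an upstream length. The following Python sketch only illustrates that mapping; the names PROMOTER_LENGTH_BP and promoter_length are hypothetical and are not part of the tool.

```python
# Upstream (promoter) lengths in base pairs, copied from the table above.
# The dictionary layout and all names here are illustrative assumptions.
PROMOTER_LENGTH_BP = {
    "small":  {"human": 1000, "mouse": 1000, "rat": 1000, "drosophila": 1000, "yeast": 500},
    "medium": {"human": 2000, "mouse": 2000, "rat": 2000, "drosophila": 2000, "yeast": 1000},
    "large":  {"human": 5000, "mouse": 5000, "rat": 5000, "drosophila": 5000, "yeast": 2500},
}


def promoter_length(species: str, size: str) -> int:
    """Return the upstream length in bp for a species and size setting."""
    try:
        return PROMOTER_LENGTH_BP[size.lower()][species.lower()]
    except KeyError:
        # Unknown species or size setting: report it instead of guessing.
        raise ValueError(f"unsupported combination: {species!r}, {size!r}") from None
```

For example, promoter_length("yeast", "medium") looks up the Medium row and the Yeast column, where yeast lengths are half those of the other species.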
Once the promoter sequences have been retrieved, they are submitted to the Weeder program, which finds common motifs (putative transcription factor binding sites). Weeder can find motifs of two different sizes. The sizes are the same for all species. The length of the common motif and the number of mismatches allowed in it vary with the setting:
          Length   Mismatches
Small     6        1
          8        2
Medium    10       3
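To make the meaning of motif length and allowed mismatches concrete, the Python sketch below checks whether a candidate motif occurs in each sequence with at most a given number of mismatches. It is not Weeder's algorithm, only a brute-force illustration of the matching criterion; the function names and the example sequences are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurrences(motif: str, sequence: str, max_mismatches: int) -> list[int]:
    """Return start positions where motif matches within max_mismatches."""
    k = len(motif)
    return [
        i
        for i in range(len(sequence) - k + 1)
        if hamming(motif, sequence[i:i + k]) <= max_mismatches
    ]


def shared_by_all(motif: str, sequences: list[str], max_mismatches: int) -> bool:
    """True if every sequence has at least one approximate occurrence."""
    return all(occurrences(motif, s, max_mismatches) for s in sequences)


# Made-up sequences: a motif of length 6 with 1 allowed mismatch,
# as in the first row of the table above.
promoters = ["GGCTATAAAGC", "CCTATTAACC"]
print(shared_by_all("TATAAA", promoters, max_mismatches=1))
```

With one mismatch allowed, the motif TATAAA is found in both made-up sequences; with no mismatches allowed, it is found only in the first.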
In addition, the user can define:
The tool returns the specified number of common motifs, written to an HTML file. These can then be analyzed further to check whether they are known transcription factor binding sites.