Ensemble Machine Methods for DNA Binding

We introduce three ensemble machine learning methods for analyzing DNA binding by transcription factors (TFs). The goal is to identify both TF target genes and their binding motifs. Subspace-valued weak learners, formed from an ensemble of different motif-finding algorithms, combine candidate motifs as probability weight matrices (PWMs), which are then translated into subspaces of a DNA k-mer (string) feature space. Assessing and then integrating the most informative subspaces with machine learning methods gives more reliable target classification and motif prediction. We compare these target identification methods with PWM rescanning and with support vector machines applied to the full k-mer space of the yeast S. cerevisiae. The SVMotif-PWM method can significantly improve accuracy in the computational identification of TF targets. The software is publicly available at
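The step in which a PWM is translated into a subspace of the k-mer feature space can be illustrated with a small sketch. This is not the authors' SVMotif-PWM implementation: the function names, the toy PWM, the uniform background, and the rule of keeping the top-scoring k-mers are all assumptions made for illustration. It shows one plausible way to rank k-mers by PWM log-odds score, keep the best ones as a subspace, and project a sequence onto that subspace as a count vector that a classifier could consume. A simple PWM rescanning score is included for comparison.

```python
from itertools import product
import math

ALPHABET = "ACGT"

def all_kmers(k):
    """Enumerate every DNA string of length k."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]

def kmer_counts(seq, k):
    """Count occurrences of each k-mer in seq (overlapping windows)."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):  # skip windows with ambiguous bases
            counts[kmer] = counts.get(kmer, 0) + 1
    return counts

def pwm_log_odds(pwm, kmer, background=0.25):
    """Log-odds score of a k-mer under a PWM (assumed uniform background)."""
    return sum(math.log(pwm[i][base] / background) for i, base in enumerate(kmer))

def pwm_subspace(pwm, top_n):
    """Hypothetical selection rule: keep the top_n highest-scoring k-mers."""
    k = len(pwm)
    ranked = sorted(all_kmers(k), key=lambda km: pwm_log_odds(pwm, km), reverse=True)
    return ranked[:top_n]

def project(seq, subspace):
    """Project a sequence onto the chosen k-mer coordinates (count features)."""
    counts = kmer_counts(seq, len(subspace[0]))
    return [counts.get(km, 0) for km in subspace]

def best_pwm_score(pwm, seq):
    """PWM rescanning baseline: best log-odds score over all windows."""
    k = len(pwm)
    return max(pwm_log_odds(pwm, seq[i:i + k]) for i in range(len(seq) - k + 1))

# Toy width-3 PWM that favors the motif ACG (illustrative values only).
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

subspace = pwm_subspace(toy_pwm, top_n=4)
features = project("TTACGACGTT", subspace)
```

In a fuller pipeline, feature vectors like `features` would be computed for each promoter sequence and passed to a classifier such as a support vector machine. Restricting the classifier to a PWM-derived subspace rather than all 4^k k-mers is the design idea the abstract describes. How subspaces from several motif finders are assessed and combined is not specified here, so it is left out of the sketch.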
Yue Fan, Mark A. Kon, Charles DeLisi
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008