Abstract
Gene regulatory programs are orchestrated by proteins called transcription factors (TFs) [...]pression of target genes both through direct binding to g[...]. Accurately modeling the DNA sequence preferences of TFs a[...] problems in regulatory genomics. These efforts have long [...]racy of TF binding site motifs. Today, protein binding microarray [...] chromatin immunoprecipitation followed by sequencing (ChIP-seq) experiments [...] data on in vitro and in vivo TF binding. Moreover, genome[...], including ChIP-seq experiments that profile histone modifications [...]tional states, provide additional information for predicting [...].

We will present a flexible new discriminative framework for [...] using these massive data sets. We will first describe in [...] train support vector regression (SVR) models with a novel [...] probe sequences to binding intensities. In a large data set [...], our SVR models better predicted in vitro binding than popular [...] on enrichment of k-mer patterns.

We will then show how to train kernel-based SVM models di[...] sequence models and investigate the cell-type specificity
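To make the idea of a kernel over probe sequences concrete, the following is a minimal sketch of a plain k-mer spectrum kernel. It is an assumption chosen for illustration and is not the novel kernel referred to above, whose definition does not survive in this text. The function names, the choice k = 3, and the toy probe sequences are all hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of two sequences.

    This is the standard spectrum kernel, used here as a stand-in
    (assumption) for the kernel described in the abstract.
    """
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers absent from the second sequence.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def gram_matrix(seqs, k=3):
    """Pairwise kernel values for a list of sequences."""
    return [[spectrum_kernel(s, t, k) for t in seqs] for s in seqs]


# Hypothetical toy probes, not taken from any real experiment.
probes = ["ACGTACGT", "ACGTTTTT", "GGGGCCCC"]
K = gram_matrix(probes, k=3)
```

A Gram matrix of this kind, paired with measured binding intensities, could be passed to a kernel regressor that accepts precomputed kernels, for example scikit-learn's `SVR(kernel="precomputed")`.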