The BIOKDD '07 workshop was successfully completed on Aug 12, 2007. You may find the electronic proceedings here.
We are editing a new book, "BIOKDD: Knowledge Discovery and Data Mining in Biology", based predominantly on extended work selected from previous BIOKDD contributors. However, if you can write well for a lay audience, have a unique computational technique that works, or want to share unique large-scale discovery results obtained using pure computational techniques, you may consider contributing a BIOKDD book chapter. If so, please send us your presubmission inquiry by email ASAP.
Introduction
Bioinformatics is the science of managing, mining, and interpreting information from biological data. Various genome projects have contributed to an exponential growth in DNA and protein sequence databases. Advances in high-throughput technology such as microarrays and mass spectrometry have further created the fields of functional genomics and proteomics, in which one can monitor quantitatively the presence of multiple genes, proteins, metabolites, and compounds in a given biological state. The ongoing influx of these data, the presence of biological answers within the observed data despite noise, and the gap between data collection and knowledge curation have collectively created exciting opportunities for data mining researchers.
While tremendous progress has been made over the years, many of the fundamental problems in bioinformatics, such as protein structure prediction, gene-environment interaction, and regulatory pathway mapping, are still open. Data mining will play an essential role in understanding these fundamental problems and in developing novel therapeutic and diagnostic solutions in post-genome medicine.
Workshop History (2001-2006)
Data mining approaches seem ideally suited for Bioinformatics, since the field is data-rich but lacks a comprehensive theory of life's organization at the molecular level. The extensive databases of biological information create both challenges and opportunities for developing novel KDD methods. To highlight these avenues, we organized the Workshops on Data Mining in Bioinformatics (BIOKDD 2001-2006), held annually in conjunction with the ACM SIGKDD Conference.
This will be the 7th year for the workshop.
The goal of this workshop is to encourage KDD researchers to take on the numerous challenges that Bioinformatics offers. The workshop will feature invited talks from noted experts in the field, and the latest data mining research in bioinformatics. We encourage papers that propose novel data mining techniques for post-genome bioinformatics studies in areas such as:
Phylogenetics and comparative genomics
DNA microarray data analysis
RNAi and microRNA analysis
Protein/RNA structure prediction
Sequence and structural motif finding
Modeling of biological networks and pathways
Statistical learning methods in bioinformatics
Computational proteomics
Computational biomarker discoveries
Computational drug discoveries
Biomedical text mining
Biological data management techniques
Semantic webs and ontology-driven biological data integration methods
Papers should be at most 10 pages long, single-spaced, in font size 10 or larger, with one-inch margins on all sides. Papers in PDF or PS format can be sent to both co-chairs by email. For examples of the camera-ready format, refer to previous BIOKDD proceedings (e.g., BIOKDD06).
Important Dates
5/25/2007 Submission of Papers
6/23/2007 Notification of Acceptance; Workshop Registration Open
7/14/2007 Submission of Camera Ready Papers
8/12/2007 Full-day Workshop Presentation
Publication
Submission of accepted papers. For accepted workshop papers, we require that each camera-ready paper be formatted strictly according to the official ACM Proceedings Format. Please submit a PDF file only. To prepare the camera-ready PDF file, you may use either the Microsoft Word template or the LaTeX file preparation instructions found here. All final camera-ready submissions must be accompanied by a completed digital copy (a scanned copy is acceptable) of the ACM copyright transfer form, or else the paper cannot be included in the final workshop proceedings.
Publication of proceedings and expanded papers. You may read the workshop editorial (small PDF file) here, and the full workshop proceedings (large PDF file) are also available online here. Expanded versions of selected high-quality papers from the workshop will be invited for publication in a special issue (late spring/summer 2008) of the Journal of Bioinformatics and Computational Biology (JBCB). Details of the journal/book publication will be announced after the workshop on this web site: http://bio.informatics.iupui.edu/biokdd07/
Program Overview
Duration: 1 FULL DAY (08/12/07)
Location: BIOKDD '07 will be held in conjunction with ACM KDD 2007, at the Fairmont San Jose Hotel in San Jose, California, USA. The following is the contact information for the hotel:
Fairmont Hotel
170 South Market Street
San Jose, CA, 95113
Tel: (408) 998-1900
Fax: (408) 287-1648
Email: sanjose@fairmont.com
Web site: http://www.fairmont.com/sanjose/
Workshop registration is required for each accepted paper. The fee covers hospitality and administrative expenses related to the organization of the workshop. The registration fee is $60 for each workshop paper presenter (without printed proceedings), $80 for each workshop paper presentation (with printed proceedings), or $60 for each official full-day participant who needs printed copies of the workshop proceedings. For those who are not presenting and who do not intend to participate in the full-day event, official registration is not required but is recommended.
The registration is now open as of July 25th 2007 and will close on August 10th 2007.
Please also note that the ACM SIGKDD '07 conference has a separate registration process for those interested in the whole ACM KDD conference event. The conference registration ($700), however, is not required for participation in this workshop.
To register officially for the workshop, please use the following Google Checkout to pay the fees.
Workshop Schedule
Note: the allocated time includes presentation time, Q&A time (5min), and transition time from one speaker to the next (2min).
8:50-9:00am: Opening Remarks
Session 1.
9:00-9:30am: Talk 1
• “Gene Selection by Matrix Reordering and Replicator Dynamics”, Wenyuan Li, Xiuwen Zheng, and Ying Liu, University of Texas at Dallas and University of Washington.
9:30-10:00am: Talk 2
• “Investigating the Use of Extrinsic Similarity Measures for Microarray Analysis”, Duygu Ucar, F. Altiparmak, H. Ferhatosmanoglu, and Srinivasan Parthasarathy, The Ohio State University.
10:00-10:30am: Coffee Break
Session 2.
10:30-11:00am: Talk 3
• “Mining Over-Represented 3D Patterns of Secondary Structures in Proteins”, Matteo Comin, Concettina Guerra, and Giuseppe Zanotti, University of Padova, Italy and Georgia Institute of Technology.
11:00am-12:00pm: Invited Talk
• “Exploring Genomic Medicine Using Integrative Biology”, Atul Butte, Stanford University School of Medicine and the Lucile Packard Children's Hospital.
12:00-1:30pm: Lunch
Session 3.
1:30-2:00pm: Talk 4
• “Combining Domain Fusions and Domain-Domain Interactions to Predict Protein-Protein Interactions”, Nguyen Thanh Phuong and Tu Bao Ho, Japan Advanced Institute of Science and Technology.
2:00-2:30pm: Talk 5
• “A Linear-time Algorithm for Predicting Functional Annotations from Protein-Protein Interaction Networks”, Yonghui Wu and Stefano Lonardi, University of California, Riverside.
2:30-3:00pm: Talk 6
• “Profile-feature based Protein Interaction Extraction from Full-Text Articles”, Shilin Ding, Minlie Huang, Hongning Wang, and Xiaoyan Zhu, Tsinghua University, China.
3:00-3:30pm: Talk 7
• “A Decomposition Approach for Discovering Network Building Blocks”, Qiaofeng Yang and Stefano Lonardi, Lawrence Berkeley National Laboratory and University of California, Riverside.
3:30-4:00pm: Coffee Break
Session 4.
4:00-4:20pm: Short Talk 1
• “Use of Gene Ontology as a Tool for Assessment of Analytical Algorithms with Real Data Sets: Impact of Revised Affymetrix CDF Annotation”, Megan Kong, Zhongxue Chen, Yu Qian, Jennifer Cai, Jamie Lee, Eva Rab, Monnie McGee, and Richard H. Scheuermann, University of Texas Southwestern Medical Center and Southern Methodist University.
4:20-4:40pm: Short Talk 2
• “Clustering of Non-Alignable Protein Sequences”, Abdellali Kelil, Shengrui Wang, and Ryszard Brzezinski, University of Sherbrooke, Sherbrooke, QC, Canada.
4:40-5:00pm: Short Talk 3
• “Discovering Ovarian Cancer Biomarkers using Gene Ontology Based Microarray Analysis”, Wei Guan, Alexander Gray, Sham Navathe, Nathan Bowen, John McDonald, and Lilya Matyunina, Georgia Institute of Technology.
5:00pm: Concluding Remarks
Organizers
Program Chairs
Jake Y. Chen
Indiana University School of Informatics
Purdue School of Science Department of Computer & Information Science
Indiana University–Purdue University Indianapolis
Indianapolis, IN 46202
Email: jakechen@iupui.edu
Web site: http://bio.informatics.iupui.edu/CV/
Stefano Lonardi
Department of Computer Science & Engineering
University of California
Riverside, CA 92521
Email: stelo@cs.ucr.edu