Combining multiple clusterings for protein structure prediction

Sakar, C.; Kursun, Olcay; Seker, Huseyin; Gurgen, Fikret

doi:10.1504/ijdmb.2014.064012

Sakar C. O., Kursun O., Seker H., Gurgen F.

INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, vol.10, no.2, pp.162-174, 2014 (SCI-Expanded)

Publication Type: Article / Full Article
Volume Number: 10 Issue: 2
Publication Date: 2014
DOI Number: 10.1504/ijdmb.2014.064012
Journal Name: INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.162-174
Affiliated with Istanbul University: Yes

Abstract

Computational annotation and prediction of protein structure is very important in the post-genome era due to the existence of many different proteins, most of which are yet to be verified. Mutual-information-based feature selection methods can be used to select minimal yet predictive subsets of features. However, protein features are organised into natural partitions (views). Individual feature selection that ignores these views, dismantles them, and intermixes their variables with those of other views results, at best, in a complex, uninterpretable predictive system for such multi-view datasets. In this paper, instead of selecting a subset of individual features, we pass each view (feature subset) through a clustering step so that it is represented in discrete form by its cluster indices; this makes mutual-information-based methods applicable to view selection. We present experimental results on a multi-view protein dataset used to predict protein structure.
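The idea described in the abstract can be illustrated with a minimal sketch: cluster each view separately, replace the view by its cluster indices, and score the view by the mutual information between those indices and the class labels. Everything specific here is an assumption made for illustration and is not taken from the paper: the choice of k-means with a farthest-first initialisation, the number of clusters, the helper names (`kmeans`, `mutual_information`, `rank_views`), and the toy data with its view names.

```python
import math
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-first initialisation.

    Returns one cluster index per point. This is an illustrative stand-in;
    the abstract does not say which clustering algorithm is used.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(nxt))
    assign = None
    for _ in range(iters):
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_views(views, labels, k=2):
    """Score each view by MI(cluster indices; labels), highest first."""
    scores = {
        name: mutual_information(kmeans(X, k), labels)
        for name, X in views.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: two views of the same eight samples.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
views = {
    # Cluster structure follows the class labels.
    "informative": [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [0.1, 0.1],
                    [5.0, 5.1], [5.1, 5.0], [5.0, 5.0], [5.1, 5.1]],
    # Cluster structure is unrelated to the class labels.
    "noise": [[0.0, 0.0], [5.0, 5.0], [0.1, 0.1], [5.1, 5.1],
              [0.0, 0.0], [5.0, 5.0], [0.1, 0.1], [5.1, 5.1]],
}

ranked = rank_views(views, labels, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f} bits")
```

Because each view is reduced to a single discrete variable, the mutual information is estimated from a small contingency table rather than from a high-dimensional continuous feature vector, which is what makes ranking whole views practical in this sketch.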