Nonnegative Matrix Factorization and Its Application to Pattern Analysis and Text Mining


Zurada J. M., Ensari T., Asl E. H., Chorowski J.

Federated Conference on Computer Science and Information Systems (FedCSIS), Krakow, Poland, 8-11 September 2013, pp. 11-16

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume number:
  • City of Publication: Krakow
  • Country of Publication: Poland
  • Pages: pp. 11-16
  • Affiliated with Istanbul University: Yes

Abstract

Nonnegative Matrix Factorization (NMF) is one of the most promising techniques for reducing the dimensionality of data. This presentation compares the method with other popular matrix decomposition approaches for various pattern analysis tasks. NMF has also been widely applied to clustering and latent feature extraction, among other tasks. Several types of objective functions have been used for NMF in the literature. Instead of minimizing the common Euclidean Distance (EucD) error, we review an alternative method that maximizes the correntropy similarity measure to produce the factorization. Correntropy is an entropy-based criterion defined as a nonlinear similarity measure. Following the discussion of correntropy maximization, we use it to cluster a document data set and compare the clustering performance with that of EucD-based NMF. The approach is applied to, and the comparison illustrated on, the clustering of documents in the 20-Newsgroups data set. The results show that our approach produces, on average, better clustering than methods that use EucD as the objective function.
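
To make the two objectives named in the abstract concrete, the following is a minimal sketch and not the authors' algorithm. It factorizes a small synthetic nonnegative matrix with the standard multiplicative updates for the EucD error, then evaluates a correntropy score on the result. The Gaussian kernel form, the bandwidth `sigma=1.0`, the synthetic data, and the function names `eucd_error`, `correntropy`, and `nmf_eucd` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def eucd_error(X, W, H):
    """Squared Euclidean (Frobenius) reconstruction error ||X - WH||^2."""
    return float(np.sum((X - W @ H) ** 2))

def correntropy(X, W, H, sigma=1.0):
    """Sample correntropy between X and WH with an unnormalized Gaussian kernel.
    The bandwidth sigma is an assumed value, not taken from the paper."""
    R = X - W @ H
    return float(np.mean(np.exp(-(R ** 2) / (2.0 * sigma ** 2))))

def nmf_eucd(X, rank, n_iter=200, seed=0, eps=1e-9):
    """NMF by standard multiplicative updates that minimize the EucD error.
    This is the baseline objective, not the correntropy-maximizing method."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Elementwise updates keep W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative data of exact rank 4 (illustrative only).
rng = np.random.default_rng(42)
X = rng.random((30, 4)) @ rng.random((4, 20))

W, H = nmf_eucd(X, rank=4)
print("EucD error: ", eucd_error(X, W, H))
print("Correntropy:", correntropy(X, W, H))
```

A lower EucD error and a higher correntropy both indicate a closer reconstruction, but correntropy saturates for large residuals, so it down-weights outlying entries relative to the squared error.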