Clustering short text using Ncut-weighted non-negative matrix factorization
Non-negative Matrix Factorization
Discriminative model
tf–idf
Document Clustering
DOI: 10.1145/2396761.2398615
Publication Date: 2012-11-15T16:38:00Z
AUTHORS (5)
ABSTRACT
Non-negative matrix factorization (NMF) has been successfully applied to document clustering. However, experiments on short texts, such as microblogs, Q&A documents and news titles, suggest that NMF performs unsatisfactorily. A major reason is that traditional term weighting schemes, such as binary weighting and tf–idf, cannot adequately capture terms' discriminative power and importance in short texts because the data are sparse. To tackle this problem, we propose a novel term weighting scheme for NMF, derived from the Normalized Cut (Ncut) problem on the term affinity graph. Unlike idf, which emphasizes discriminability at the document level, the Ncut weighting measures terms' discriminability at the term level. Experiments on two data sets show that our weighting scheme significantly boosts NMF's performance on short text clustering.
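The abstract does not give the weighting formula, so the following is only a minimal sketch of how an Ncut-style term weighting could be combined with NMF. It assumes that the affinity graph is the term co-occurrence graph S = X Xᵀ built from a term-document matrix X, and that each term is scaled by the inverse square root of its degree in that graph, as in the usual normalized-cut relaxation. The function names (`ncut_term_weights`, `nmf`, `cluster_short_texts`) and the toy matrix are hypothetical, and the solver is the standard multiplicative-update NMF rather than anything specified on this page.

```python
import numpy as np


def ncut_term_weights(X):
    """Return one weight per term: degree ** -0.5 (assumed Ncut-style weighting).

    X is a non-negative (n_terms, n_docs) term-document matrix. The degree of
    term i is the i-th row sum of the assumed affinity matrix S = X @ X.T.
    """
    # S @ 1 == X @ (X.T @ 1), so the full affinity matrix is never formed.
    degrees = X @ X.sum(axis=0)
    weights = np.zeros(degrees.shape, dtype=float)
    nonzero = degrees > 0
    weights[nonzero] = 1.0 / np.sqrt(degrees[nonzero])
    return weights


def nmf(Y, k, n_iter=200, seed=0, eps=1e-9):
    """Factor Y ~= W @ H with multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    n_terms, n_docs = Y.shape
    W = rng.random((n_terms, k)) + eps
    H = rng.random((k, n_docs)) + eps
    for _ in range(n_iter):
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_short_texts(X, k, **nmf_kwargs):
    """Weight the rows (terms) of X, run NMF, assign each document to its top factor."""
    weights = ncut_term_weights(X)
    Y = weights[:, None] * X  # scale each term row by its weight
    _, H = nmf(Y, k, **nmf_kwargs)
    return H.argmax(axis=0)


# Toy term-document matrix: 6 terms (rows) x 4 short documents (columns).
X_toy = np.array([
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
], dtype=float)

toy_weights = ncut_term_weights(X_toy)
toy_labels = cluster_short_texts(X_toy, k=2)
```

Under this assumption, terms that co-occur with many other terms get a high degree and are down-weighted, which is a term-level notion of discriminability, in contrast to idf, which is computed from document frequencies. Multiplicative-update NMF depends on its random initialization, so cluster assignments can vary with `seed`.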