Article ; Online: Cluster analysis of Urdu tweets
Journal of King Saud University: Computer and Information Sciences, Vol 34, Iss 5, Pp 2170-2179 (2022)
Abstract | Document clustering allows a user to group semantically similar documents. It has been an interesting research area for many years, and various methods and techniques have been developed. However, the research has primarily been limited to English and other high-resource languages. For low-resource languages, such as Urdu, the area of document clustering is open to contributions. This work presents an experimental evaluation of clustering techniques on Urdu tweets. Semantically clustering tweets is challenging because of their very short length. In this paper, various features, including sentence- and phrase-level embeddings, TF-IDF features, and document embeddings, are extracted from tweets, and clustering is performed using three different algorithms: K-Means, Bisecting K-Means, and Affinity Propagation. Furthermore, a comparison is performed with the traditional topic modeling approach. The results indicate that TF-IDF features combined with the K-Means clustering algorithm outperformed the other adopted clustering techniques. |
---|---|
Keywords | Document clustering ; Topic modelling ; Unsupervised learning ; Feature extraction methods ; Document embeddings ; Urdu language processing ; Electronic computers. Computer science ; QA75.5-76.95 |
Subject/Category (code) | 004 |
Language | English |
Publication date | 2022-05-01T00:00:00Z |
Publisher | Elsevier |
Document type | Article ; Online |
Data source | BASE - Bielefeld Academic Search Engine (life sciences selection) |