Statistical word sense aware topic models
Authors / Contributors:
[Guoyu Tang, Yunqing Xia, Jun Sun, Min Zhang, Thomas Zheng]
Place, Publisher, Year:
2015
Contained in:
Soft Computing, 19/1(2015-01-01), 13-27
Format:
Article (online)
Online access: https://doi.org/10.1007/s00500-014-1372-z
| Field | Indicators | Content |
|---|---|---|
| LEADER | | caa a22 4500 |
| 001 | | 605468265 |
| 003 | | CHVBK |
| 005 | | 20210128100315.0 |
| 007 | | cr unu---uuuuu |
| 008 | | 210128e20150101xx s 000 0 eng |
| 024 | 7 0 | \|a 10.1007/s00500-014-1372-z \|2 doi |
| 035 | | \|a (NATIONALLICENCE)springer-10.1007/s00500-014-1372-z |
| 245 | 0 0 | \|a Statistical word sense aware topic models \|h [Electronic data] \|c [Guoyu Tang, Yunqing Xia, Jun Sun, Min Zhang, Thomas Zheng] |
| 520 | 3 | \|a LDA has been proved effective in modeling the semantic relation between surface words. This semantic information in the document collection is useful to measure the topic distribution for a document. In general, a surface word may significantly contribute to several topics in a document collection. LDA measures the contribution of a surface word to each topic and considers a surface word to be identical across all documents. However, a surface word may present different signatures in different contexts, i.e., polysemous words can be used with different senses in different contexts. Intuitively, disambiguating word senses for topic models can enhance their discriminative capabilities. In this work, we propose a joint model to automatically induce document topics and word senses simultaneously. Instead of using some pre-defined word sense resources, we capture the word sense information via a latent variable and directly induce them in a fully unsupervised manner from the corpora. Experimental results show that the proposed joint model outperforms the baselines significantly in document clustering and improves the word sense induction as well against a standalone non-parametric model. |
| 540 | | \|a Springer-Verlag Berlin Heidelberg, 2014 |
| 690 | 7 | \|a Topic modeling \|2 nationallicence |
| 690 | 7 | \|a Word sense induction \|2 nationallicence |
| 690 | 7 | \|a Document representation \|2 nationallicence |
| 690 | 7 | \|a Document clustering \|2 nationallicence |
| 700 | 1 | \|a Tang \|D Guoyu \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 700 | 1 | \|a Xia \|D Yunqing \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 700 | 1 | \|a Sun \|D Jun \|u Institute for Infocomm Research, A-STAR, Singapore, Singapore \|4 aut |
| 700 | 1 | \|a Zhang \|D Min \|u Soochow University, Suzhou, China \|4 aut |
| 700 | 1 | \|a Zheng \|D Thomas \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 773 | 0 | \|t Soft Computing \|d Springer Berlin Heidelberg \|g 19/1(2015-01-01), 13-27 \|x 1432-7643 \|q 19:1<13 \|1 2015 \|2 19 \|o 500 |
| 856 | 4 0 | \|u https://doi.org/10.1007/s00500-014-1372-z \|q text/html \|z Online access via DOI |
| 898 | | \|a BK010053 \|b XK010053 \|c XK010000 |
| 900 | 7 | \|a Metadata rights reserved \|b Springer special CC-BY-NC licence \|2 nationallicence |
| 908 | | \|D 1 \|a research-article \|2 jats |
| 949 | | \|B NATIONALLICENCE \|F NATIONALLICENCE \|b NL-springer |
| 950 | | \|B NATIONALLICENCE \|P 856 \|E 40 \|u https://doi.org/10.1007/s00500-014-1372-z \|q text/html \|z Online access via DOI |
| 950 | | \|B NATIONALLICENCE \|P 700 \|E 1- \|a Tang \|D Guoyu \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 950 | | \|B NATIONALLICENCE \|P 700 \|E 1- \|a Xia \|D Yunqing \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 950 | | \|B NATIONALLICENCE \|P 700 \|E 1- \|a Sun \|D Jun \|u Institute for Infocomm Research, A-STAR, Singapore, Singapore \|4 aut |
| 950 | | \|B NATIONALLICENCE \|P 700 \|E 1- \|a Zhang \|D Min \|u Soochow University, Suzhou, China \|4 aut |
| 950 | | \|B NATIONALLICENCE \|P 700 \|E 1- \|a Zheng \|D Thomas \|u Department of Computer Science and Technology, TNList, Tsinghua University, Beijing, China \|4 aut |
| 950 | | \|B NATIONALLICENCE \|P 773 \|E 0- \|t Soft Computing \|d Springer Berlin Heidelberg \|g 19/1(2015-01-01), 13-27 \|x 1432-7643 \|q 19:1<13 \|1 2015 \|2 19 \|o 500 |