Topic segmentation on spoken documents using self-validated acoustic cuts

Authors / Contributors:
[Hongjie Chen, Lei Xie, Wei Feng, Lilei Zheng, Yanning Zhang]
Place, Publisher, Year:
2015
Contained in:
Soft Computing, 19/1(2015-01-01), 47-59
Format:
Article (online)
ID: 605468397
LEADER caa a22 4500
001 605468397
003 CHVBK
005 20210128100316.0
007 cr unu---uuuuu
008 210128e20150101xx s 000 0 eng
024 7 0 |a 10.1007/s00500-014-1383-9  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s00500-014-1383-9 
245 0 0 |a Topic segmentation on spoken documents using self-validated acoustic cuts  |h [Electronic data]  |c [Hongjie Chen, Lei Xie, Wei Feng, Lilei Zheng, Yanning Zhang] 
520 3 |a Topic segmentation serves as a necessary prerequisite for multimedia content analysis and management. The normalized cuts (NCuts) approach has shown superior performance in topic segmentation of spoken documents. However, in this method, the number of topics in a document has to be known prior to segmentation. This is impractical for real-world applications with exponential growth of multimedia data. On the other hand, previous lexical-based spoken document segmentation approaches, including NCuts, work on text transcripts generated by a large vocabulary continuous speech recognizer (LVCSR). As we know, training such a recognizer requires a large amount of transcribed speech data and language-specific knowledge. Moreover, inevitable speech recognition errors and the out-of-vocabulary (OOV) problem apparently affect the segmentation performance. This paper addresses these problems by a self-validated acoustic normalized cuts approach, namely SACuts. First, as compared with NCuts, our approach can determine the topic number in a spoken document automatically without extra computation load. Second, as compared with lexical approaches that rely on a high-resource speech recognizer, our approach can achieve comparable and even better segmentation performance using only acoustic-level information. Evaluation on a broadcast news topic segmentation task shows the superiority of the proposed approach. 
540 |a Springer-Verlag Berlin Heidelberg, 2014 
690 7 |a Topic segmentation  |2 nationallicence 
690 7 |a Story segmentation  |2 nationallicence 
690 7 |a Topic boundary detection  |2 nationallicence 
690 7 |a Spoken document retrieval  |2 nationallicence 
690 7 |a Normalized cuts  |2 nationallicence 
700 1 |a Chen  |D Hongjie  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
700 1 |a Xie  |D Lei  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
700 1 |a Feng  |D Wei  |u School of Computer Science, Tianjin University, Tianjin, People's Republic of China  |4 aut 
700 1 |a Zheng  |D Lilei  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
700 1 |a Zhang  |D Yanning  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
773 0 |t Soft Computing  |d Springer Berlin Heidelberg  |g 19/1(2015-01-01), 47-59  |x 1432-7643  |q 19:1<47  |1 2015  |2 19  |o 500 
856 4 0 |u https://doi.org/10.1007/s00500-014-1383-9  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s00500-014-1383-9  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Chen  |D Hongjie  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Xie  |D Lei  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Feng  |D Wei  |u School of Computer Science, Tianjin University, Tianjin, People's Republic of China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Zheng  |D Lilei  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Zhang  |D Yanning  |u School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi, People's Republic of China  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Soft Computing  |d Springer Berlin Heidelberg  |g 19/1(2015-01-01), 47-59  |x 1432-7643  |q 19:1<47  |1 2015  |2 19  |o 500