Multi-label audio concept detection using correlated-aspect Gaussian Mixture Model

Author / Contributors:
[Cencen Zhong, Zhenjiang Miao]
Place, publisher, year:
2015
Contained in:
Multimedia Tools and Applications, 74/13(2015-07-01), 4817-4832
Format:
Article (online)
ID: 605447306
LEADER caa a22 4500
001 605447306
003 CHVBK
005 20210128100131.0
007 cr unu---uuuuu
008 210128e20150701xx s 000 0 eng
024 7 0 |a 10.1007/s11042-013-1842-9  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s11042-013-1842-9 
245 0 0 |a Multi-label audio concept detection using correlated-aspect Gaussian Mixture Model  |h [Electronic data]  |c [Cencen Zhong, Zhenjiang Miao] 
520 3 |a Although audio concept detection is essentially a multi-label classification problem, it is normally solved by treating concepts independently. Because this discards useful concept correlation information, this paper proposes a new model, the Correlated-Aspect Gaussian Mixture Model (C-AGMM), which exploits that information to enhance multi-label audio concept detection. C-AGMM originates from the Aspect Gaussian Mixture Model (AGMM), which improves GMM by incorporating it into probabilistic Latent Semantic Analysis (pLSA), and likewise learns a probabilistic model of the whole audio clip by regarding concepts as its component elements. However, unlike AGMM, which assumes that concepts are independent of each other, C-AGMM considers their distribution on a sub-manifold embedded in the ambient space. Under the assumption that if two concepts are close in the intrinsic geometry of this distribution then their conditional probability distributions are likely to be similar, a graph regularizer is used to model the correlation between these concepts. Following the maximum likelihood estimation principle, the model parameters of C-AGMM, which encode the concept correlation, are derived and used directly as the detection criterion. Experiments on two datasets show the effectiveness of our proposed model. 
540 |a Springer Science+Business Media New York, 2014 
690 7 |a Aspect Gaussian Mixture Model  |2 nationallicence 
690 7 |a Probabilistic latent semantic analysis  |2 nationallicence 
690 7 |a Audio concept detection  |2 nationallicence 
690 7 |a Concept correlation  |2 nationallicence 
690 7 |a Multi-label classification  |2 nationallicence 
700 1 |a Zhong  |D Cencen  |u Institute of Information Science, Beijing Jiaotong University, Beijing, China  |4 aut 
700 1 |a Miao  |D Zhenjiang  |u Institute of Information Science, Beijing Jiaotong University, Beijing, China  |4 aut 
773 0 |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/13(2015-07-01), 4817-4832  |x 1380-7501  |q 74:13<4817  |1 2015  |2 74  |o 11042 
856 4 0 |u https://doi.org/10.1007/s11042-013-1842-9  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s11042-013-1842-9  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Zhong  |D Cencen  |u Institute of Information Science, Beijing Jiaotong University, Beijing, China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Miao  |D Zhenjiang  |u Institute of Information Science, Beijing Jiaotong University, Beijing, China  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/13(2015-07-01), 4817-4832  |x 1380-7501  |q 74:13<4817  |1 2015  |2 74  |o 11042