Multi-label audio concept detection using correlated-aspect Gaussian Mixture Model
Authors / Contributors:
[Cencen Zhong, Zhenjiang Miao]
Place, Publisher, Year:
2015
Contained in:
Multimedia Tools and Applications, 74/13(2015-07-01), 4817-4832
Format:
Article (online)
Online access: https://doi.org/10.1007/s11042-013-1842-9
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605447306 | ||
| 003 | CHVBK | ||
| 005 | 20210128100131.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150701xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s11042-013-1842-9 |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s11042-013-1842-9 | ||
| 245 | 0 | 0 | |a Multi-label audio concept detection using correlated-aspect Gaussian Mixture Model |h [Electronic data] |c [Cencen Zhong, Zhenjiang Miao] |
| 520 | 3 | |a As an essentially multi-label classification problem, audio concept detection is normally solved by treating concepts independently. Since this process discards useful concept correlation information, this paper proposes a new model, the Correlated-Aspect Gaussian Mixture Model (C-AGMM), which exploits that information to enhance multi-label audio concept detection. Originating from the Aspect Gaussian Mixture Model (AGMM), which improves GMM by incorporating it into probabilistic Latent Semantic Analysis (pLSA), C-AGMM still learns a probabilistic model of the whole audio clip by regarding concepts as its component elements. However, unlike AGMM, which assumes that concepts are independent of each other, C-AGMM considers their distribution on a sub-manifold embedded in the ambient space. Under the assumption that if two concepts are close in the intrinsic geometry of this distribution then their conditional probability distributions are likely to be similar, a graph regularizer is used to model the correlation between these concepts. Following the Maximum Likelihood Estimation principle, the C-AGMM model parameters, which encode the concept correlation, are derived and used directly as the detection criterion. Experiments on two datasets show the effectiveness of the proposed model. | |
| 540 | |a Springer Science+Business Media New York, 2014 | ||
| 690 | 7 | |a Aspect Gaussian Mixture Model |2 nationallicence | |
| 690 | 7 | |a Probabilistic latent semantic analysis |2 nationallicence | |
| 690 | 7 | |a Audio concept detection |2 nationallicence | |
| 690 | 7 | |a Concept correlation |2 nationallicence | |
| 690 | 7 | |a Multi-label classification |2 nationallicence | |
| 700 | 1 | |a Zhong |D Cencen |u Institute of Information Science, Beijing Jiaotong University, Beijing, China |4 aut | |
| 700 | 1 | |a Miao |D Zhenjiang |u Institute of Information Science, Beijing Jiaotong University, Beijing, China |4 aut | |
| 773 | 0 | |t Multimedia Tools and Applications |d Springer US; http://www.springer-ny.com |g 74/13(2015-07-01), 4817-4832 |x 1380-7501 |q 74:13<4817 |1 2015 |2 74 |o 11042 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s11042-013-1842-9 |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s11042-013-1842-9 |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Zhong |D Cencen |u Institute of Information Science, Beijing Jiaotong University, Beijing, China |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Miao |D Zhenjiang |u Institute of Information Science, Beijing Jiaotong University, Beijing, China |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Multimedia Tools and Applications |d Springer US; http://www.springer-ny.com |g 74/13(2015-07-01), 4817-4832 |x 1380-7501 |q 74:13<4817 |1 2015 |2 74 |o 11042 | ||
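
The abstract in field 520 describes a graph regularizer that encourages concepts lying close together to have similar conditional probability distributions. The record does not give the formula, so the following is only a minimal sketch of a common Laplacian-style penalty of that kind, not the authors' formulation. The function names, the weight matrix `W`, the toy distributions, and the trade-off parameter `lam` are all assumptions made for illustration.

```python
def graph_regularizer(weights, dists):
    """Half the weighted sum of squared distances between concept distributions.

    weights[i][j] is an assumed similarity between concepts i and j;
    dists[i] is the (hypothetical) conditional distribution of concept i.
    """
    n = len(dists)
    total = 0.0
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(dists[i], dists[j]))
            total += weights[i][j] * sq_dist
    return 0.5 * total


def penalized_objective(log_likelihood, weights, dists, lam):
    """Log-likelihood minus lam times the graph penalty (assumed form)."""
    return log_likelihood - lam * graph_regularizer(weights, dists)


# Toy setup: concepts 0 and 1 are linked in the graph, concept 2 is isolated.
W = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
similar = [[0.7, 0.3], [0.7, 0.3], [0.1, 0.9]]     # linked concepts agree
dissimilar = [[0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]  # linked concepts disagree

print(graph_regularizer(W, similar))
print(graph_regularizer(W, dissimilar))
```

In this toy setup the penalty is zero when the two linked concepts share a distribution and positive when they differ, so maximizing the penalized objective pulls correlated concepts toward similar distributions. The isolated concept contributes nothing regardless of its distribution.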