Markov random field based fusion for supervised and semi-supervised multi-modal image classification

Authors / Contributors:
[Liang Xie, Peng Pan, Yansheng Lu]
Place, publisher, year:
2015
Contained in:
Multimedia Tools and Applications, 74/2(2015-01-01), 613-634
Format:
Article (online)
ID: 605446725
LEADER caa a22 4500
001 605446725
003 CHVBK
005 20210128100128.0
007 cr unu---uuuuu
008 210128e20150101xx s 000 0 eng
024 7 0 |a 10.1007/s11042-014-2018-y  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s11042-014-2018-y 
245 0 0 |a Markov random field based fusion for supervised and semi-supervised multi-modal image classification  |h [Electronic data]  |c [Liang Xie, Peng Pan, Yansheng Lu] 
520 3 |a In recent years, there has been a massive explosion of multimedia content on the web, and multi-modal examples such as images associated with tags can be easily accessed from social websites such as Flickr. In this paper, we consider two classification tasks, supervised and semi-supervised multi-modal image classification, to take advantage of the increasing number of multi-modal examples on the web. We first propose a Markov random field (MRF) based fusion method, discriminative probabilistic graphical fusion (DPGF), for supervised multi-modal image classification, which makes use of the associated tags to enhance classification performance. Based on DPGF, we then propose a three-step learning procedure, DPGF+RLS+SVM, for semi-supervised multi-modal image classification, which uses both labeled and unlabeled examples for training. Experimental results on two datasets, PASCAL VOC'07 and MIR Flickr, show that our methods exploit the multi-modal data and unlabeled examples well, and that they outperform previous state-of-the-art methods in both multi-modal image classification tasks. Finally, we consider the weakly supervised condition, where class labels come from noisy image tags. Our semi-supervised approach also improves the classification performance in this case. 
540 |a Springer Science+Business Media New York, 2014 
690 7 |a Multi-modal classification  |2 nationallicence 
690 7 |a Image classification  |2 nationallicence 
690 7 |a Semi-supervised learning  |2 nationallicence 
690 7 |a Markov random field  |2 nationallicence 
700 1 |a Xie  |D Liang  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
700 1 |a Pan  |D Peng  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
700 1 |a Lu  |D Yansheng  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
773 0 |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/2(2015-01-01), 613-634  |x 1380-7501  |q 74:2<613  |1 2015  |2 74  |o 11042 
856 4 0 |u https://doi.org/10.1007/s11042-014-2018-y  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s11042-014-2018-y  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Xie  |D Liang  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Pan  |D Peng  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Lu  |D Yansheng  |u School of Computer Science and Technology, Huazhong University of Science and Technology, 430074, Wuhan, China  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/2(2015-01-01), 613-634  |x 1380-7501  |q 74:2<613  |1 2015  |2 74  |o 11042