Hybrid method for modeless Japanese input using N-gram based binary classification and dictionary

Author / Contributors:
[Yukino Ikegami, Setsuo Tsuruta]
Place, Publisher, Year:
2015
Contained in:
Multimedia Tools and Applications, 74/11(2015-06-01), 3933-3946
Format:
Article (online)
ID: 605447675
LEADER caa a22 4500
001 605447675
003 CHVBK
005 20210128100133.0
007 cr unu---uuuuu
008 210128e20150601xx s 000 0 eng
024 7 0 |a 10.1007/s11042-013-1805-1  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s11042-013-1805-1 
245 0 0 |a Hybrid method for modeless Japanese input using N-gram based binary classification and dictionary  |h [Electronic data]  |c [Yukino Ikegami, Setsuo Tsuruta] 
520 3 |a The rapid growth of globalization requires handling a large number of multilingual documents, in which Japanese input coexists with English and other languages that use the Roman alphabet. Conventional Japanese input methods require users to switch the input mode between Japanese and the Latin alphabet. As a current solution, there are modeless Japanese input methods that switch the input mode automatically. However, these need training on a large amount of text data to improve performance. This paper proposes a hybrid modeless Japanese input method that uses a non-Japanese word dictionary and character n-gram sequence features to decide whether or not to convert and switch to Kana input. The non-Japanese word dictionary is used to reduce false positives on non-Japanese words. The dictionary is built from text data available on the Web. The n-gram based discriminative model is learned by a Support Vector Machine from a balanced corpus that contains texts from various domains. The evaluation shows that the method's accuracy, measured by F-measure for the prediction of non-Kana characters, improves by 7.7 % compared with the n-gram-only method. In addition, a real user test showed that the average value for input time was on the agree side for our method, versus the disagree side for the conventional Japanese input method that requires switching the input mode. 
540 |a Springer Science+Business Media New York, 2014 
690 7 |a Multilingual documents  |2 nationallicence 
690 7 |a Modeless Japanese input  |2 nationallicence 
700 1 |a Ikegami  |D Yukino  |u Tokyo Denki University, MuzaiGakuendai, 2-1200, Chiba, Inzai-shi, Japan  |4 aut 
700 1 |a Tsuruta  |D Setsuo  |u Tokyo Denki University, MuzaiGakuendai, 2-1200, Chiba, Inzai-shi, Japan  |4 aut 
773 0 |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/11(2015-06-01), 3933-3946  |x 1380-7501  |q 74:11<3933  |1 2015  |2 74  |o 11042 
856 4 0 |u https://doi.org/10.1007/s11042-013-1805-1  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s11042-013-1805-1  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Ikegami  |D Yukino  |u Tokyo Denki University, MuzaiGakuendai, 2-1200, Chiba, Inzai-shi, Japan  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Tsuruta  |D Setsuo  |u Tokyo Denki University, MuzaiGakuendai, 2-1200, Chiba, Inzai-shi, Japan  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Multimedia Tools and Applications  |d Springer US; http://www.springer-ny.com  |g 74/11(2015-06-01), 3933-3946  |x 1380-7501  |q 74:11<3933  |1 2015  |2 74  |o 11042