<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>     caa a22        4500</leader>
  <controlfield tag="001">467884064</controlfield>
  <controlfield tag="003">CHVBK</controlfield>
  <controlfield tag="005">20180406152729.0</controlfield>
  <controlfield tag="007">cr unu---uuuuu</controlfield>
  <controlfield tag="008">170328e20060401xx      s     000 0 eng  </controlfield>
  <datafield tag="024" ind1="7" ind2="0">
   <subfield code="a">10.1007/s10032-005-0147-6</subfield>
   <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="035" ind1=" " ind2=" ">
   <subfield code="a">(NATIONALLICENCE)springer-10.1007/s10032-005-0147-6</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
   <subfield code="a">Retrieving poorly degraded OCR documents</subfield>
   <subfield code="h">[Electronic data]</subfield>
   <subfield code="c">[Y. Fataicha, M. Cheriet, J. Nie, C. Suen]</subfield>
  </datafield>
  <datafield tag="520" ind1="3" ind2=" ">
   <subfield code="a">A significant portion of currently available documents exist in the form of images, for instance, as scanned documents. Electronic documents produced by scanning and OCR software contain recognition errors. This paper uses an automatic approach to examine the selection and the effectiveness of searching techniques for possible erroneous terms for query expansion. The proposed method consists of two basic steps. In the first step, confused characters in erroneous words are located and editing operations are applied to create a collection of erroneous error-grams in the basic unit of the model. The second step uses query terms and error-grams to generate additional query terms, identify appropriate matching terms, and determine the degree of relevance of retrieved document images to the user's query, based on a vector space IR model. The proposed approach has been trained on 979 document images to construct about 2,822 error-grams and tested on 100 scanned Web pages, 200 advertisements and manuals, and 700 degraded images. The performance of our method is evaluated experimentally by determining retrieval effectiveness with respect to recall and precision. The results obtained show its effectiveness and indicate an improvement over standard methods such as vectorial systems without expanded query and 3-gram overlapping.</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
   <subfield code="a">Springer-Verlag, 2005</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Document processing</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Optical character recognition (OCR)</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Information retrieval (IR)</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Error-grams</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Query expansion</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Fataicha</subfield>
   <subfield code="D">Y.</subfield>
   <subfield code="u">Laboratory for Imagery, Vision, and Artificial Intelligence (LIVIA), École de Technologie Supérieure, 1100 Notre-Dame West, H3C 1K3, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Cheriet</subfield>
   <subfield code="D">M.</subfield>
   <subfield code="u">Laboratory for Imagery, Vision, and Artificial Intelligence (LIVIA), École de Technologie Supérieure, 1100 Notre-Dame West, H3C 1K3, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Nie</subfield>
   <subfield code="D">J.</subfield>
   <subfield code="u">Department Informatique et Recherche opérationnelle, University of Montreal, CP 6128, succursale Centre-ville, H3C 3J7, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Suen</subfield>
   <subfield code="D">C.</subfield>
   <subfield code="u">Centre for Pattern Recognition and Machine Intelligence (CENPARMI), Concordia University, Suite GM-606, 1455 de Maisonneuve Boulevard West, H3G 1M8, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="773" ind1="0" ind2=" ">
   <subfield code="t">International Journal of Document Analysis and Recognition (IJDAR)</subfield>
   <subfield code="d">Springer-Verlag</subfield>
   <subfield code="g">8/1(2006-04-01), 1-99999</subfield>
   <subfield code="x">1433-2833</subfield>
   <subfield code="q">8:1&lt;1</subfield>
   <subfield code="1">2006</subfield>
   <subfield code="2">8</subfield>
   <subfield code="o">10032</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="0">
   <subfield code="u">https://doi.org/10.1007/s10032-005-0147-6</subfield>
   <subfield code="q">text/html</subfield>
   <subfield code="z">Online access via DOI</subfield>
  </datafield>
  <datafield tag="908" ind1=" " ind2=" ">
   <subfield code="D">1</subfield>
   <subfield code="a">research-article</subfield>
   <subfield code="2">jats</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">856</subfield>
   <subfield code="E">40</subfield>
   <subfield code="u">https://doi.org/10.1007/s10032-005-0147-6</subfield>
   <subfield code="q">text/html</subfield>
   <subfield code="z">Online access via DOI</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Fataicha</subfield>
   <subfield code="D">Y.</subfield>
   <subfield code="u">Laboratory for Imagery, Vision, and Artificial Intelligence (LIVIA), École de Technologie Supérieure, 1100 Notre-Dame West, H3C 1K3, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Cheriet</subfield>
   <subfield code="D">M.</subfield>
   <subfield code="u">Laboratory for Imagery, Vision, and Artificial Intelligence (LIVIA), École de Technologie Supérieure, 1100 Notre-Dame West, H3C 1K3, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Nie</subfield>
   <subfield code="D">J.</subfield>
   <subfield code="u">Department Informatique et Recherche opérationnelle, University of Montreal, CP 6128, succursale Centre-ville, H3C 3J7, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Suen</subfield>
   <subfield code="D">C.</subfield>
   <subfield code="u">Centre for Pattern Recognition and Machine Intelligence (CENPARMI), Concordia University, Suite GM-606, 1455 de Maisonneuve Boulevard West, H3G 1M8, Montreal, Quebec, Canada</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">773</subfield>
   <subfield code="E">0-</subfield>
   <subfield code="t">International Journal of Document Analysis and Recognition (IJDAR)</subfield>
   <subfield code="d">Springer-Verlag</subfield>
   <subfield code="g">8/1(2006-04-01), 1-99999</subfield>
   <subfield code="x">1433-2833</subfield>
   <subfield code="q">8:1&lt;1</subfield>
   <subfield code="1">2006</subfield>
   <subfield code="2">8</subfield>
   <subfield code="o">10032</subfield>
  </datafield>
  <datafield tag="900" ind1=" " ind2="7">
   <subfield code="a">Metadata rights reserved</subfield>
   <subfield code="b">Springer special CC-BY-NC licence</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="898" ind1=" " ind2=" ">
   <subfield code="a">BK010053</subfield>
   <subfield code="b">XK010053</subfield>
   <subfield code="c">XK010000</subfield>
  </datafield>
  <datafield tag="949" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="F">NATIONALLICENCE</subfield>
   <subfield code="b">NL-springer</subfield>
  </datafield>
 </record>
</collection>
