<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
 <record>
  <leader>     caa a22        4500</leader>
  <controlfield tag="001">606159908</controlfield>
  <controlfield tag="003">CHVBK</controlfield>
  <controlfield tag="005">20210128100627.0</controlfield>
  <controlfield tag="007">cr unu---uuuuu</controlfield>
  <controlfield tag="008">210128e20151101xx      s     000 0 eng  </controlfield>
  <datafield tag="024" ind1="7" ind2="0">
   <subfield code="a">10.1007/s00521-015-1861-8</subfield>
   <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="035" ind1=" " ind2=" ">
   <subfield code="a">(NATIONALLICENCE)springer-10.1007/s00521-015-1861-8</subfield>
  </datafield>
  <datafield tag="245" ind1="0" ind2="0">
   <subfield code="a">Topological Q-learning with internally guided exploration for mobile robot navigation</subfield>
   <subfield code="h">[Elektronische Daten]</subfield>
   <subfield code="c">[Muhammad Hafez, Chu Loo]</subfield>
  </datafield>
  <datafield tag="520" ind1="3" ind2=" ">
   <subfield code="a">Improving the learning convergence of reinforcement learning (RL) in mobile robot navigation has been the interest of many recent works that have investigated different approaches to obtain knowledge from effectively and efficiently exploring the robot's environment. In RL, this knowledge is of great importance for reducing the high number of interactions required for updating the value function and to eventually find an optimal or a nearly optimal policy for the agent. In this paper, we propose a topological Q-learning (TQ-learning) algorithm that makes use of the topological ordering among the observed states of the environment in which the agent acts. This algorithm builds an incremental topological map of the environment using Instantaneous Topological Map model which we use for accelerating value function updates as well as providing a guided exploration strategy for the agent. We evaluate our algorithm against the original Q-learning and the Influence Zone algorithms in static and dynamic environments.</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
   <subfield code="a">The Natural Computing Applications Forum, 2015</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Reinforcement learning</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Q-learning</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Convergence acceleration</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Topological map</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="690" ind1=" " ind2="7">
   <subfield code="a">Guided exploration</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Hafez</subfield>
   <subfield code="D">Muhammad</subfield>
   <subfield code="u">Faculty of Computer Science and Information Technology, University of Malaya, 50603, Kuala Lumpur, Malaysia</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
   <subfield code="a">Loo</subfield>
   <subfield code="D">Chu</subfield>
   <subfield code="u">Faculty of Computer Science and Information Technology, University of Malaya, 50603, Kuala Lumpur, Malaysia</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="773" ind1="0" ind2=" ">
   <subfield code="t">Neural Computing and Applications</subfield>
   <subfield code="d">Springer London</subfield>
   <subfield code="g">26/8(2015-11-01), 1939-1954</subfield>
   <subfield code="x">0941-0643</subfield>
   <subfield code="q">26:8&lt;1939</subfield>
   <subfield code="1">2015</subfield>
   <subfield code="2">26</subfield>
   <subfield code="o">521</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="0">
   <subfield code="u">https://doi.org/10.1007/s00521-015-1861-8</subfield>
   <subfield code="q">text/html</subfield>
   <subfield code="z">Onlinezugriff via DOI</subfield>
  </datafield>
  <datafield tag="898" ind1=" " ind2=" ">
   <subfield code="a">BK010053</subfield>
   <subfield code="b">XK010053</subfield>
   <subfield code="c">XK010000</subfield>
  </datafield>
  <datafield tag="900" ind1=" " ind2="7">
   <subfield code="a">Metadata rights reserved</subfield>
   <subfield code="b">Springer special CC-BY-NC licence</subfield>
   <subfield code="2">nationallicence</subfield>
  </datafield>
  <datafield tag="908" ind1=" " ind2=" ">
   <subfield code="D">1</subfield>
   <subfield code="a">research-article</subfield>
   <subfield code="2">jats</subfield>
  </datafield>
  <datafield tag="949" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="F">NATIONALLICENCE</subfield>
   <subfield code="b">NL-springer</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">856</subfield>
   <subfield code="E">40</subfield>
   <subfield code="u">https://doi.org/10.1007/s00521-015-1861-8</subfield>
   <subfield code="q">text/html</subfield>
   <subfield code="z">Onlinezugriff via DOI</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Hafez</subfield>
   <subfield code="D">Muhammad</subfield>
   <subfield code="u">Faculty of Computer Science and Information Technology, University of Malaya, 50603, Kuala Lumpur, Malaysia</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">700</subfield>
   <subfield code="E">1-</subfield>
   <subfield code="a">Loo</subfield>
   <subfield code="D">Chu</subfield>
   <subfield code="u">Faculty of Computer Science and Information Technology, University of Malaya, 50603, Kuala Lumpur, Malaysia</subfield>
   <subfield code="4">aut</subfield>
  </datafield>
  <datafield tag="950" ind1=" " ind2=" ">
   <subfield code="B">NATIONALLICENCE</subfield>
   <subfield code="P">773</subfield>
   <subfield code="E">0-</subfield>
   <subfield code="t">Neural Computing and Applications</subfield>
   <subfield code="d">Springer London</subfield>
   <subfield code="g">26/8(2015-11-01), 1939-1954</subfield>
   <subfield code="x">0941-0643</subfield>
   <subfield code="q">26:8&lt;1939</subfield>
   <subfield code="1">2015</subfield>
   <subfield code="2">26</subfield>
   <subfield code="o">521</subfield>
  </datafield>
 </record>
</collection>
