Strategies for simulating pedestrian navigation with multiple reinforcement learning agents

Author / Contributors:
[Francisco Martinez-Gil, Miguel Lozano, Fernando Fernández]
Place, publisher, year:
2015
Contained in:
Autonomous Agents and Multi-Agent Systems, 29/1(2015-01-01), 98-130
Format:
Article (online)
ID: 605514607
LEADER caa a22 4500
001 605514607
003 CHVBK
005 20210128100705.0
007 cr unu---uuuuu
008 210128e20150101xx s 000 0 eng
024 7 0 |a 10.1007/s10458-014-9252-6  |2 doi 
035 |a (NATIONALLICENCE)springer-10.1007/s10458-014-9252-6 
245 0 0 |a Strategies for simulating pedestrian navigation with multiple reinforcement learning agents  |h [Electronic data]  |c [Francisco Martinez-Gil, Miguel Lozano, Fernando Fernández] 
520 3 |a In this paper, a new multi-agent reinforcement learning approach is introduced for the simulation of pedestrian groups. Unlike other solutions, where the behaviors of the pedestrians are coded in the system, in our approach the agents learn by interacting with the environment. The embodied agents must learn to control their velocity, avoiding obstacles and the other pedestrians, to reach a goal inside the scenario. The main contribution of this paper is a new methodology that uses different iterative learning strategies, combining vector quantization (for state space generalization) with the Q-learning algorithm (VQQL). Two algorithmic schemas, Iterative VQQL and Incremental, which differ in how they address the problems, have been designed and used with and without transfer of knowledge. These algorithms are tested and compared with the VQQL algorithm as a baseline in two scenarios where agents need to solve well-known problems in pedestrian modeling. In the first, agents in a closed room need to reach the single exit, producing and solving a bottleneck. In the second, two groups of agents inside a corridor need to reach their goals, which are placed on opposite sides (they need to solve the crossing). In the first scenario, we focus on scalability, use metrics from the pedestrian modeling field, and compare with Helbing's social force model. Emergent collective behaviors, namely the shell-shaped clogging in front of the exit in the first scenario and lane formation as a solution to the crossing problem, have been obtained and analyzed. The results demonstrate that the proposed schemas find policies that carry out the tasks, suggesting that they are applicable and generalizable to the simulation of pedestrian groups. 
540 |a The Author(s), 2014 
690 7 |a Pedestrian simulation  |2 nationallicence 
690 7 |a VQQL  |2 nationallicence 
690 7 |a Collective behaviors  |2 nationallicence 
690 7 |a MARL  |2 nationallicence 
700 1 |a Martinez-Gil  |D Francisco  |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain  |4 aut 
700 1 |a Lozano  |D Miguel  |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain  |4 aut 
700 1 |a Fernández  |D Fernando  |u Department of Computer Science, University Carlos III of Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain  |4 aut 
773 0 |t Autonomous Agents and Multi-Agent Systems  |d Springer US; http://www.springer-ny.com  |g 29/1(2015-01-01), 98-130  |x 1387-2532  |q 29:1<98  |1 2015  |2 29  |o 10458 
856 4 0 |u https://doi.org/10.1007/s10458-014-9252-6  |q text/html  |z Online access via DOI 
898 |a BK010053  |b XK010053  |c XK010000 
900 7 |a Metadata rights reserved  |b Springer special CC-BY-NC licence  |2 nationallicence 
908 |D 1  |a research-article  |2 jats 
949 |B NATIONALLICENCE  |F NATIONALLICENCE  |b NL-springer 
950 |B NATIONALLICENCE  |P 856  |E 40  |u https://doi.org/10.1007/s10458-014-9252-6  |q text/html  |z Online access via DOI 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Martinez-Gil  |D Francisco  |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Lozano  |D Miguel  |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain  |4 aut 
950 |B NATIONALLICENCE  |P 700  |E 1-  |a Fernández  |D Fernando  |u Department of Computer Science, University Carlos III of Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain  |4 aut 
950 |B NATIONALLICENCE  |P 773  |E 0-  |t Autonomous Agents and Multi-Agent Systems  |d Springer US; http://www.springer-ny.com  |g 29/1(2015-01-01), 98-130  |x 1387-2532  |q 29:1<98  |1 2015  |2 29  |o 10458