Strategies for simulating pedestrian navigation with multiple reinforcement learning agents
Authors / Contributors:
[Francisco Martinez-Gil, Miguel Lozano, Fernando Fernández]
Place, publisher, year:
2015
Contained in:
Autonomous Agents and Multi-Agent Systems, 29/1(2015-01-01), 98-130
Format:
Article (online)
| LEADER | caa a22 4500 | ||
|---|---|---|---|
| 001 | 605514607 | ||
| 003 | CHVBK | ||
| 005 | 20210128100705.0 | ||
| 007 | cr unu---uuuuu | ||
| 008 | 210128e20150101xx s 000 0 eng | ||
| 024 | 7 | 0 | |a 10.1007/s10458-014-9252-6 |2 doi |
| 035 | |a (NATIONALLICENCE)springer-10.1007/s10458-014-9252-6 | ||
| 245 | 0 | 0 | |a Strategies for simulating pedestrian navigation with multiple reinforcement learning agents |h [Electronic data] |c [Francisco Martinez-Gil, Miguel Lozano, Fernando Fernández] |
| 520 | 3 | |a In this paper, a new multi-agent reinforcement learning approach is introduced for the simulation of pedestrian groups. Unlike other solutions, where the behaviors of the pedestrians are coded in the system, in our approach the agents learn by interacting with the environment. The embodied agents must learn to control their velocity, avoiding obstacles and the other pedestrians, to reach a goal inside the scenario. The main contribution of this paper is a new methodology that uses different iterative learning strategies, combining vector quantization (state space generalization) with the Q-learning algorithm (VQQL). Two algorithmic schemas, Iterative VQQL and Incremental, which differ in the way they address the problems, have been designed and used with and without transfer of knowledge. These algorithms are tested and compared with the VQQL algorithm as a baseline in two scenarios where agents need to solve well-known problems in pedestrian modeling. In the first, agents in a closed room need to reach the single exit, producing and solving a bottleneck. In the second, two groups of agents inside a corridor need to reach their goals, which are placed on opposite sides (they need to solve the crossing). In the first scenario, we focus on scalability, use metrics from the pedestrian modeling field, and compare with Helbing's social force model. The emergence of collective behaviors, that is, the shell-shaped clogging in front of the exit in the first scenario and the lane formation as a solution to the problem of the crossing, has been obtained and analyzed. The results demonstrate that the proposed schemas find policies that carry out the tasks, suggesting that they are applicable and generalizable to the simulation of pedestrian groups. | |
| 540 | |a The Author(s), 2014 | ||
| 690 | 7 | |a Pedestrian simulation |2 nationallicence | |
| 690 | 7 | |a VQQL |2 nationallicence | |
| 690 | 7 | |a Collective behaviors |2 nationallicence | |
| 690 | 7 | |a MARL |2 nationallicence | |
| 700 | 1 | |a Martinez-Gil |D Francisco |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain |4 aut | |
| 700 | 1 | |a Lozano |D Miguel |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain |4 aut | |
| 700 | 1 | |a Fernández |D Fernando |u Department of Computer Science, University Carlos III of Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain |4 aut | |
| 773 | 0 | |t Autonomous Agents and Multi-Agent Systems |d Springer US; http://www.springer-ny.com |g 29/1(2015-01-01), 98-130 |x 1387-2532 |q 29:1<98 |1 2015 |2 29 |o 10458 | |
| 856 | 4 | 0 | |u https://doi.org/10.1007/s10458-014-9252-6 |q text/html |z Online access via DOI |
| 898 | |a BK010053 |b XK010053 |c XK010000 | ||
| 900 | 7 | |a Metadata rights reserved |b Springer special CC-BY-NC licence |2 nationallicence | |
| 908 | |D 1 |a research-article |2 jats | ||
| 949 | |B NATIONALLICENCE |F NATIONALLICENCE |b NL-springer | ||
| 950 | |B NATIONALLICENCE |P 856 |E 40 |u https://doi.org/10.1007/s10458-014-9252-6 |q text/html |z Online access via DOI | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Martinez-Gil |D Francisco |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Lozano |D Miguel |u Departament d'Informàtica, Escola Tècnica Superior d'Enginyeria (ETSE), Universitat de València, Avda. de la Universidad s/n, 46100, Burjassot, Valencia, Spain |4 aut | ||
| 950 | |B NATIONALLICENCE |P 700 |E 1- |a Fernández |D Fernando |u Department of Computer Science, University Carlos III of Madrid, Avda. de la Universidad 30, 28911, Leganés, Madrid, Spain |4 aut | ||
| 950 | |B NATIONALLICENCE |P 773 |E 0- |t Autonomous Agents and Multi-Agent Systems |d Springer US; http://www.springer-ny.com |g 29/1(2015-01-01), 98-130 |x 1387-2532 |q 29:1<98 |1 2015 |2 29 |o 10458 | ||
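
The abstract in field 520 describes VQQL as vector quantization of the state space combined with Q-learning. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes plain k-means as the quantizer and a standard tabular Q-learning update, and every name and parameter in it (`build_codebook`, `nearest`, `q_update`, `alpha`, `gamma`) is hypothetical. The Iterative VQQL and Incremental schemas and the transfer of knowledge mentioned in the abstract are not covered.

```python
import random


def nearest(codebook, x):
    """Return the index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], x)),
    )


def build_codebook(samples, k, iters=20, seed=0):
    """Build k prototype states with plain k-means.

    Assumption: k-means stands in for the quantizer; the abstract does not
    say which vector quantization procedure the paper uses.
    """
    rng = random.Random(seed)
    codebook = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for s in samples:
            buckets[nearest(codebook, s)].append(s)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old prototype if no sample was assigned to it
                codebook[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return codebook


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step, where s and s_next are codebook indices."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
```

In use, continuous observations collected from the environment would first be passed to `build_codebook`; each new observation is then mapped to a discrete state with `nearest` before `q_update` is applied to a table `q` with one row per codebook entry and one column per action.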