[Toss-devel] Recent papers on multi-agent RL
From: Lukasz S. <luk...@gm...> - 2013-04-21 20:31:36
http://arxiv.org/abs/1304.5159
Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 18 Apr 2013)

A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is undermined by either restrictive assumptions about the other agents' behavior, a failure to account for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than the state-of-the-art algorithms.
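A minimal sketch of the kind of decision rule the abstract describes: best-respond to a predicted opponent policy, with the resulting payoff loss bounded linearly in the prediction error. All names, the payoff matrix, and the bound form are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def best_response(payoff, predicted_opponent):
    """Action maximizing expected payoff under the predicted opponent policy.

    payoff[i, j] is our reward when we play i and the other agent plays j.
    """
    expected = payoff @ predicted_opponent  # expected payoff per own action
    return int(np.argmax(expected))

payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])        # illustrative 2x2 game
prediction = np.array([0.7, 0.3])      # predicted mixed strategy of other agent
a = best_response(payoff, prediction)  # here action 1 has higher expected payoff

# Loss against the true opponent policy is at most a constant times the
# L1 prediction error (an illustrative analogue of the paper's linear bound).
truth = np.array([0.5, 0.5])
loss = (payoff @ truth).max() - (payoff @ truth)[a]
bound = np.abs(payoff).max() * np.abs(truth - prediction).sum()
assert loss <= bound
```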
http://arxiv.org/abs/1304.2024
A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 7 Apr 2013 (v1: http://arxiv.org/abs/1304.2024v1), last revised 18 Apr 2013 (this version, v2))

Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using a Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior, for which FDM's independence and modeling assumptions do not hold. As a result, FDM allows the other agent's behavior neither to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.