[Toss-devel] Recent papers on multi-agent RL
From: Lukasz S. <luk...@gm...> - 2013-04-21 20:31:36
http://arxiv.org/abs/1304.5159
Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 18 Apr 2013)

A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is undermined by either restrictive assumptions about the other agents' behavior, a failure to account for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than the state-of-the-art algorithms.
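A minimal sketch of the kind of decision rule the abstract describes: best-respond to a predicted opponent policy, with the resulting payoff loss bounded linearly in the prediction error. All names, the payoff matrix, and the bound form are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def best_response(payoff, predicted_opponent):
    """Action maximizing expected payoff under the predicted opponent policy.

    payoff[i, j] is our reward when we play i and the other agent plays j.
    """
    expected = payoff @ predicted_opponent  # expected payoff per own action
    return int(np.argmax(expected))

payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])        # illustrative 2x2 game
prediction = np.array([0.7, 0.3])      # predicted mixed strategy of other agent
a = best_response(payoff, prediction)  # here action 1 has higher expected payoff

# Loss against the true opponent policy is at most a constant times the
# L1 prediction error (an illustrative analogue of the paper's linear bound).
truth = np.array([0.5, 0.5])
loss = (payoff @ truth).max() - (payoff @ truth)[a]
bound = np.abs(payoff).max() * np.abs(truth - prediction).sum()
assert loss <= bound
```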
http://arxiv.org/abs/1304.2024
A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 7 Apr 2013 (v1: http://arxiv.org/abs/1304.2024v1), last revised 18 Apr 2013 (this version, v2))

Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using a Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior, for which FDM's independence and modeling assumptions do not hold. As a result, FDM allows the other agent's behavior neither to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.