From: Lukasz Stafiniak <lukstafi@gm...>  2013-04-21 20:31:36

http://arxiv.org/abs/1304.5159
Interactive POMDP Lite: Towards Practical Planning to Predict and Exploit Intentions for Interacting with Self-Interested Agents
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 18 Apr 2013)

A key challenge in non-cooperative multi-agent systems is that of developing efficient planning algorithms for intelligent agents to interact and perform effectively among boundedly rational, self-interested agents (e.g., humans). The practicality of existing works addressing this challenge is being undermined due to either the restrictive assumptions of the other agents' behavior, the failure in accounting for their rationality, or the prohibitively expensive cost of modeling and predicting their intentions. To boost the practicality of research in this field, we investigate how intention prediction can be efficiently exploited and made practical in planning, thereby leading to efficient intention-aware planning frameworks capable of predicting the intentions of other agents and acting optimally with respect to their predicted intentions. We show that the performance losses incurred by the resulting planning policies are linearly bounded by the error of intention prediction. Empirical evaluations through a series of stochastic games demonstrate that our policies can achieve better and more robust performance than the state-of-the-art algorithms.
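The "acting optimally with respect to predicted intentions" idea can be illustrated with a minimal sketch. This is an assumed toy example, not the paper's I-POMDP Lite algorithm: given a predicted mixed strategy for a self-interested opponent, we best-respond to that prediction in a one-shot matrix game. The payoff matrix and prediction values below are hypothetical.

```python
# Toy illustration (not the paper's algorithm): best response to a
# predicted opponent strategy in a one-shot matrix game.
import numpy as np

def best_response(payoff, predicted_opponent):
    """payoff[i, j]: our reward when we play action i and the opponent plays j.
    predicted_opponent: predicted probability distribution over opponent actions."""
    expected = payoff @ predicted_opponent  # expected payoff of each of our actions
    return int(np.argmax(expected))

# Hypothetical 2x2 game (rows: our actions, columns: opponent actions).
payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])
prediction = np.array([0.7, 0.3])  # predicted: opponent plays action 0 w.p. 0.7
print(best_response(payoff, prediction))  # -> 1 (expected payoffs [2.1, 3.8])
```

If the prediction is off, the chosen action may be suboptimal against the opponent's true strategy; the paper's contribution is a guarantee that this loss grows at most linearly in the prediction error.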
http://arxiv.org/abs/1304.2024
A General Framework for Interacting Bayes-Optimally with Self-Interested Agents using Arbitrary Parametric Model and Model Prior
Trong Nghia Hoang, Kian Hsiang Low
(Submitted on 7 Apr 2013 (v1: http://arxiv.org/abs/1304.2024v1), last revised 18 Apr 2013 (this version, v2))

Recent advances in Bayesian reinforcement learning (BRL) have shown that Bayes-optimality is theoretically achievable by modeling the environment's latent dynamics using a Flat-Dirichlet-Multinomial (FDM) prior. In self-interested multi-agent environments, the transition dynamics are mainly controlled by the other agent's stochastic behavior, for which FDM's independence and modeling assumptions do not hold. As a result, FDM does not allow the other agent's behavior to be generalized across different states nor specified using prior domain knowledge. To overcome these practical limitations of FDM, we propose a generalization of BRL to integrate the general class of parametric models and model priors, thus allowing practitioners' domain knowledge to be exploited to produce a fine-grained and compact representation of the other agent's behavior. Empirical evaluation shows that our approach outperforms existing multi-agent reinforcement learning algorithms.
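The FDM limitation the abstract criticizes can be made concrete with a small sketch (an assumed illustration, not the paper's code): under the FDM prior, the other agent's action distribution gets an independent Dirichlet posterior at each state, so evidence observed in one state never generalizes to any other state. The class and parameter names below are invented for the example.

```python
# Sketch of the Flat-Dirichlet-Multinomial (FDM) independence assumption:
# one Dirichlet pseudo-count vector per state, all updated independently.
import numpy as np

class FDMOpponentModel:
    def __init__(self, n_states, n_actions, alpha=1.0):
        # symmetric Dirichlet prior with concentration alpha at every state
        self.counts = np.full((n_states, n_actions), alpha)

    def update(self, state, action):
        self.counts[state, action] += 1.0  # conjugate posterior: add one count

    def predict(self, state):
        # posterior mean of the opponent's action probabilities at `state`
        return self.counts[state] / self.counts[state].sum()

m = FDMOpponentModel(n_states=2, n_actions=2)
for _ in range(8):
    m.update(0, 1)     # opponent repeatedly plays action 1, but only in state 0
print(m.predict(0))    # -> [0.1 0.9]: strongly favors action 1 in state 0
print(m.predict(1))    # -> [0.5 0.5]: still uniform, no generalization to state 1
```

The paper's point is that replacing this flat per-state prior with an arbitrary parametric model (e.g. one whose parameters are shared across states) lets domain knowledge and cross-state structure inform the prediction.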