ABSTRACT
This thesis is split into two independent parts.
The first is an investigation of some practical aspects of Marcus
Hutter's Universal Artificial Intelligence theory. The main
contributions are to show how a very general agent can be built and
analysed using the mathematical tools of this theory. Before the work
presented in this thesis, it was an open question whether this theory
was of any relevance to reinforcement learning. This work suggests
that it is indeed relevant and worthy of future investigation.
The second part of this thesis looks at self-play learning in
two-player, deterministic, adversarial, turn-based games. The main
contribution is the introduction of
contribution is the introduction of
a new technique for training the weights of a heuristic evaluation function from
data collected by classical game tree search algorithms. This method is shown
to outperform previous self-play training routines based on Temporal Difference
learning when applied to the game of Chess. In particular, this
technique was used to construct a Chess program that learnt to play
Chess by tuning a set of initially random weights from self-play games.