Re: [Toss-devel] Payoff discounting
From: Lukasz K. <luk...@gm...> - 2010-03-27 22:25:29
> Wow. It works.

I can confirm (even though it needs 7, not 6, iterations for me in the tic-tac-toe example, but that's not a problem for now).

> I mean, this form of discounting is obviously better than exponential
> discounting. Hutter in his "Universal AI" has a discussion on
> horizons. The reasons for our horizon are practical and human-like:
> intelligently greedy (high resolution for the short term) and
> intelligently persistent (low resolution for time distinctions in the
> long term).

I'm not really sure about the role of horizons for us: in reachability games you normally want them, and this was very obvious in our entanglement tests (this is true reachability). Have a look at a hint run of entanglement now - it is incomparable to what it was before! But beyond the fact that we need a horizon, we might just have to experiment to find which one is best.

Anyway - I like what we have now a lot! I will start a hint run of gomoku or breakthrough to get a feeling for how these are played now.

Lukasz