From: Yuan, J. <jia...@jp...> - 2012-08-02 00:07:15
Dear MMLF support team,

We are trying to design some reinforcement learning models for our equity trading algorithms. We have been running some tests on MMLF, and it seems pretty convenient so far. However, we noticed that a None reward set in the environment appears to be automatically transformed into reward=0 in the agent. More specifically, in the evaluateAction(self, actionObject) function we sometimes return the resultDict with the reward set to None. However, if I set a breakpoint within getAction(self) in td_agent.py and check the value of self.reward, it turns out that self.reward has been automatically transformed to 0.

The reason we need to set the reward to None is that our action space depends on the state. Since evaluateAction may sometimes be passed an actionObject that is outside the valid action space for the current state, we have to reject these steps and do not want the (state, action) pairs to appear in the eligibility trace.

If possible, can you help me check this?

Thanks a lot,
Ralph

________________________________
Jiangchuan Yuan | Linear Quantitative Research | Electronic Client Solutions | Global Equities | J.P. Morgan | 383 Madison Avenue, New York, NY, 10179
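A minimal, self-contained sketch of the setup described above, using a toy environment rather than the actual trading one. The class name, the validActions helper, the "action" key, and the resultDict keys other than "reward" are illustrative assumptions, not confirmed MMLF API:

```python
class StateDependentActionEnv(object):
    """Toy environment sketch: the set of valid actions depends on the state.

    Invalid actions are rejected by returning reward=None, with the intent
    that the agent skips the step; in practice the TD agent reportedly
    sees reward=0 instead.
    """

    def __init__(self):
        self.currentState = 0

    def validActions(self, state):
        # Hypothetical rule: an action must not push the state below zero.
        return [a for a in (-1, 0, 1) if state + a >= 0]

    def evaluateAction(self, actionObject):
        action = actionObject["action"]  # assumed key of the action dict

        if action not in self.validActions(self.currentState):
            # Rejected step: state unchanged, reward deliberately left as None.
            return {"reward": None,
                    "nextState": self.currentState,
                    "terminalState": None,
                    "startNewEpisode": False}

        # Valid step: apply the action and return an ordinary reward.
        self.currentState += action
        return {"reward": 1.0,
                "nextState": self.currentState,
                "terminalState": None,
                "startNewEpisode": False}


if __name__ == "__main__":
    env = StateDependentActionEnv()
    print(env.evaluateAction({"action": -1}))  # invalid at state 0 -> reward None
    print(env.evaluateAction({"action": +1}))  # valid step -> reward 1.0
```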