From: Jeremy Maitin-S. <je...@je...> - 2022-04-05 02:06:21
Each construction of a StateMachine object is fairly expensive, because it calls add_states, which constructs all of the state objects and their transitions. When using Sphinx to process a large number of small documents (though I think the problem also applies to large documents), more than 5% of the total run time is spent just constructing StateMachine objects, e.g. 5 seconds out of 65 seconds.

In general I think it would be better if the StateMachine definition were decoupled from any particular execution of the state machine. However, that might require significant code changes.

There is already a minimal form of caching in RSTState.nested_parse, but StateMachine objects are also created in nested_list_parse and elsewhere. As a hack, I was able to mitigate this by adding a global cache of StateMachine objects, keyed by the derived StateMachine class and by the tuple of state classes, and by modifying StateMachine.unlink to return the object to the cache. However, I'm not sure whether a global cache is an appropriate solution to add to docutils itself.

Please CC me on any replies, as I'm not subscribed to the mailing list.
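
To make the cache idea above concrete, here is a minimal, self-contained sketch. It does not use docutils and is not the actual patch: ToyState, ToyStateMachine, get_state_machine, release and _machine_cache are all hypothetical stand-ins, and release plays the role that a modified StateMachine.unlink would play. It only illustrates keying idle machines by (machine class, tuple of state classes) and reusing them instead of constructing new ones.

```python
from collections import defaultdict

# Hypothetical global cache (an assumption, not part of docutils):
# (machine class, tuple of state classes) -> list of idle machines.
_machine_cache = defaultdict(list)


class ToyState:
    """Stand-in for a docutils state; real states also build transitions."""

    def __init__(self, machine):
        self.machine = machine


class ToyStateMachine:
    """Stand-in for docutils.statemachine.StateMachine (not the real class)."""

    constructed = 0  # counts expensive constructions, for illustration only

    def __init__(self, state_classes, initial_state):
        ToyStateMachine.constructed += 1
        self.state_classes = tuple(state_classes)
        self.initial_state = initial_state
        # The expensive part: one state object per state class.
        self.states = {cls.__name__: cls(self) for cls in self.state_classes}
        self.input_lines = None  # per-run data

    def release(self):
        """Clear per-run data and return this machine to the cache."""
        self.input_lines = None
        _machine_cache[(type(self), self.state_classes)].append(self)


def get_state_machine(machine_class, state_classes, initial_state):
    """Reuse an idle machine with the same key, or construct a new one."""
    idle = _machine_cache[(machine_class, tuple(state_classes))]
    if idle:
        machine = idle.pop()
        machine.initial_state = initial_state
        return machine
    return machine_class(state_classes, initial_state)
```

With this shape, a caller would obtain a machine through get_state_machine, run it, and call release when done; a second request with the same class and state classes then skips construction entirely. A real version would also need to reset whatever per-run state the states themselves hold, which this sketch does not attempt.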