From: Guenter M. <mi...@us...> - 2022-04-07 09:42:50
Dear Jeremy,

On 2022-04-05, Jeremy Maitin-Shepard wrote in gmane.text.docutils.devel:

> Every construction of a Publisher object calls setup_option_parser which is
> rather expensive.

> When processing a small number of large documents the overhead may not be
> too significant, but when using Sphinx to process a large number of small
> documents (e.g. for API documentation where every symbol is a separate
> page), >10% of the total time is spent just in setup_option_parser.

> As a hack I was able to mitigate this problem by monkey patching
> Publisher.setup_option_parser to cache the option_parser based on
> `type(self.parser), type(self.reader), type(self.writer)` but that assumes
> that the options depend only on the class (which is true in practice for my
> use case).

> It would be great to have a proper solution for this, though.

A possible solution would be to cache or pre-fetch the settings object
instead of the option_parser:

  Settings assigned to the settings parameter of the convenience functions
  or the Publisher.settings attribute are used instead of the above sources

  -- https://docutils.sourceforge.io/docs/dev/runtime-settings-processing.html#settings-priority

  If, for a Publisher instance `publisher`, ``publisher.settings`` is
  defined, the call to ``publisher.setup_option_parser()`` is skipped.

  -- https://docutils.sourceforge.io/docs/dev/runtime-settings-processing.html#runtime-settings-processing-for-other-applications

Maybe Sphinx can use this when creating Publisher instances with identical
conditions for the settings (reader/parser/writer components and
docutils.conf file locations).

See docutils/tools/buildhtml.py for an example of settings pre-fetching.
Warning: the details there will change with the upcoming move from optparse
to argparse (https://sourceforge.net/p/docutils/bugs/441).
Using ``Publisher.get_settings()`` should be safer.

Thank you for reporting your findings,

Günter
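
A minimal sketch of the settings pre-fetching suggested above, not a tested
Sphinx patch. It calls ``Publisher.get_settings()`` once per
reader/parser/writer combination and passes a copy of the result to later
runs through the ``settings`` parameter. The names ``SettingsCache``,
``build_docutils_settings`` and ``render`` are hypothetical, and the sketch
assumes that a shallow copy per document is enough to keep per-run changes
out of the cached object.

```python
import copy
from typing import Any, Callable, Dict, Tuple

# (reader name, parser name, writer name), e.g.
# ("standalone", "restructuredtext", "html")
ComponentKey = Tuple[str, str, str]


class SettingsCache:
    """Hypothetical helper: build a settings object once per component key."""

    def __init__(self, factory: Callable[[str, str, str], Any]) -> None:
        self._factory = factory
        self._cache: Dict[ComponentKey, Any] = {}

    def get(self, key: ComponentKey) -> Any:
        if key not in self._cache:
            # The expensive step (option parser setup) happens only here.
            self._cache[key] = self._factory(*key)
        # Assumption: a shallow copy is sufficient to protect the cached
        # object from attributes that are rebound during a single run.
        return copy.copy(self._cache[key])


def build_docutils_settings(reader_name: str, parser_name: str,
                            writer_name: str) -> Any:
    """Pre-fetch settings with Publisher.get_settings()."""
    from docutils.core import Publisher  # imported lazily; needs Docutils

    publisher = Publisher()
    publisher.set_components(reader_name, parser_name, writer_name)
    return publisher.get_settings()


def render(source: str, cache: SettingsCache,
           key: ComponentKey = ("standalone", "restructuredtext", "html")):
    """Publish one document, reusing the pre-fetched settings."""
    from docutils.core import publish_string

    reader_name, parser_name, writer_name = key
    # With `settings` supplied, the Publisher does not set up a new
    # option parser for this run.
    return publish_string(source,
                          reader_name=reader_name,
                          parser_name=parser_name,
                          writer_name=writer_name,
                          settings=cache.get(key))
```

Usage would be ``cache = SettingsCache(build_docutils_settings)`` followed by
``render(text, cache)`` for each document. The key here covers only the
component names; if documents can pick up different docutils.conf files, the
key would have to include that as well.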