We are using pexpect to retrieve large (sometimes >1 MB)
amounts of data over ssh. Pexpect can take several
minutes to read the input, and often times out
without completing. This is due to repeated regex
matching on the entire buffer as it grows. This patch
optionally searches only the last N bytes of the buffer,
which in our case reduced the runtime to ~0.1 seconds, an
improvement of several hundred-fold.
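
To illustrate the idea, here is a minimal sketch of a tail-only search. It is not the patch itself: the helper name search_tail is made up for this example, and only the maxsearchsize name comes from the patch. It relies on the optional pos argument of a compiled pattern's search method, so the buffer is never copied or sliced.

```python
import re

def search_tail(pattern, buffer, maxsearchsize=None):
    """Search buffer with a compiled pattern.

    If maxsearchsize is None, the whole buffer is searched (the old
    behaviour). Otherwise only the last maxsearchsize characters are
    searched, by passing a start offset to pattern.search().
    """
    if maxsearchsize is None:
        return pattern.search(buffer)
    start = max(0, len(buffer) - maxsearchsize)
    return pattern.search(buffer, start)

# A large buffer that ends with a shell prompt.
prompt = re.compile(r"\$ $")
data = "x" * 1000000 + "\n$ "

# Only the last 100 characters are scanned for the prompt.
match = search_tail(prompt, data, maxsearchsize=100)
```

The trade-off is that anything that arrived earlier than the last N bytes is never matched, so N has to be at least as long as the text being expected.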
I discovered when I went to submit this patch that
there is one already submitted, but I am sending this
anyway since it differs in several ways:
1) the default behaviour is unchanged - if
maxsearchsize is not set in either the spawn .__init__
or .expect methods then it will continue to search the
entire buffer. This means that existing client code
will work exactly as before.
2) the code is much simpler because the buffer is
unchanged - it uses the optional 'pos' parameter to the
regex search function.
3) this also makes it more efficient, since it does not
do any string slicing to get the last N bytes.
4) The maxsearchsize can be set globally for all calls
and/or individually for each call to expect. This
allows the search size to be tuned depending on what
you are expecting.
5) It is a patch against the current CVS version of
pexpect.py (1.107)
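
As a rough sketch of points 1 and 4, the per-object default and the per-call override could be resolved as below. The WindowedMatcher class and its feed method are hypothetical stand-ins for this example, not pexpect's spawn class; only the maxsearchsize name is taken from the patch.

```python
import re

class WindowedMatcher:
    """Hypothetical stand-in for an object that buffers input and matches it."""

    def __init__(self, maxsearchsize=None):
        # None means "search the entire buffer", so existing callers
        # that never set this see no change in behaviour.
        self.maxsearchsize = maxsearchsize
        self.buffer = ""

    def feed(self, data):
        # Append newly received data to the buffer.
        self.buffer += data

    def expect(self, pattern, maxsearchsize=None):
        # A per-call value takes precedence over the per-object default.
        window = maxsearchsize if maxsearchsize is not None else self.maxsearchsize
        start = 0 if window is None else max(0, len(self.buffer) - window)
        # Pass the start offset instead of slicing the buffer.
        return re.compile(pattern).search(self.buffer, start)

matcher = WindowedMatcher()                    # default: whole buffer
matcher.feed("login: " + "x" * 500 + "$ ")
full = matcher.expect(r"login: ")              # found, whole buffer searched
tail = matcher.expect(r"login: ", maxsearchsize=10)  # not found in last 10 chars
```

With this arrangement a caller can keep a small window for short prompts and widen it, or leave it unset, for a single call that expects a longer match.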
Dave
Patch to add maxsearchsize - only search the last N bytes of input
The searchwindowsize option has landed.