Menu

Redundant communication for NumPy arrays?

2015-04-08
2015-04-24
  • Michael C. Grant

    Consider the code snippet generated for the MatrixMult UDF:

    If TypeOf x Is Range Then x = x.Value2
    If Not TypeOf x Is Object  Then
        x = NDims(x, 2)
        Set x = py.Call(py.Module("numpy"), "array", py.Tuple(x))
    End If
    ...
    Set args = py.Tuple(x, y)
    ...
    Set MatrixMult = py.Call(func, args)
    

    If I'm reading this correctly, MatrixMult is sending the raw data for x over the connection to the Python side for conversion into a NumPy array, sending that converted result back over the connection to be placed into the argument list, and finally sent again to the Python side for the final computations.

    Am I understanding that right? I cannot help but wonder if it wouldn't be better to avoid this back and forth, which would involve some sort of Python-side wrapper for the function call that combines the NumPy array conversions and the final function call into a single communication.

    I understand that this may indeed be the most expedient approach for now, and for small arrays it might not be worth caring about. But I am working on a module that will facilitate calling arbitrary NumPy/SciPy functions by an Excel formula, and so it might worth the extra effort to determine the most efficient way to communicate the data over the link.

     
  • Eric Reynolds

    Eric Reynolds - 2015-04-08

    Hi Michael,

    You're absolutely right, in fact one of the things which has been on my to do list for a long time (which unfortunately has remained completely static for too long as I have been sidetracked) has been to refactor the ExcelPython so that (a) the add-in's VBA wrapper generator is written in Python instead of VBA and (b) the VBA wrapper does everything which it's more efficient to do on the VBA side then passes the data over to Python and leaves Python to do whatever's more efficient to do on the Python side (like building numpy objects and so on).

    I guess the MatrixMult example serves mostly to illustrate what's possible using the low-level API, i.e. how you can fully manipulate Python objects in VBA. The example is actually a bit anacronistic and made more sense in ExcelPython V1, which ran in-process and which had very efficient cross-language calls, something which V2 sacrificed in the name of reliability.

    If you're interested please go ahead and add your observations to the documentation on GitHub and send me a pull request. Anyone is welcome to contribute to ExcelPython!

    Regards,

    Eric.

     

Log in to post a comment.