Re: [Pyobjc-dev] Bridge Support Performances

On 20 Jan, 2010, at 12:30, Virgil Dupras wrote:

> Hi there,
> 
> I recently started to use PyObjC 2.2 (after having used 1.4 for a long
> time), mainly for 64-bit support, and the biggest problem I have with
> it is bridge support's memory usage. One of my applications which used
> to use 22 MB of memory on launch when built with PyObjC 1.4 used 48 MB
> when built with PyObjC 2.2.

Ouch.

> 
> I have read that other thread about it and Ronald mentioned that this
> was because of bridge support code which he didn't have enough time to
> work on. I plan to try to tackle the issue soon and I'd appreciate
> your feedback, Ronald, so I can know early if I'm just on a fool's
> errand.

I haven't had time yet to measure where memory is going, although I have
done some micro-optimizations (such as making sure that all strings are
interned).
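
To illustrate the interning micro-optimization: a minimal sketch, assuming a
modern Python where the function is spelled sys.intern (in the Python 2
releases current when this thread was written it was the builtin intern).
The signature strings are made up for the example and are not taken from
PyObjC itself.

```python
import sys

# Two equal type-signature strings built at run time are normally
# separate objects, each holding its own copy of the characters.
sig_a = "".join(["@", "@", ":", "i"])
sig_b = "".join(["@", "@", ":", "i"])

# sys.intern() returns one canonical object per distinct value, so
# repeated signatures can share storage and be compared by identity.
shared_a = sys.intern(sig_a)
shared_b = sys.intern(sig_b)
same_object = shared_a is shared_b
```

The saving only matters when the same string value occurs many times, which
is plausible for type encodings but would need to be measured.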

I'm currently working on a py3k port and some other changes, although
much more slowly than I'd like. I haven't managed to spend more than an hour
or two a week on PyObjC this year :-(.

> 
> As a stopgap measure, I plan to simply clone the Cocoa and Foundation
> packages and remove everything I don't use from the bridgeSupport xml
> files. My guess is that it will greatly enhance loading speed and
> memory usage since I only use a tiny fraction of Foundation and
> AppKit.

That would certainly help in the short term, but isn't a workable solution in the longer term.

> 
> Then, I noticed that Foundation and AppKit bridgesupport files are
> huge, not really because there are a lot of symbols in them (though
> there are), but because the XML file format has so much redundancy in it
> (for example, function argument elements require way too much crap;
> the types could be one string for 32-bit and one string for 64-bit
> instead of being this long list of <arg> elements). What I'm thinking
> about is to convert those bridge support files to a more concise
> format, maybe in YAML or something. This wouldn't help memory usage,
> but it might help loading speed.

I don't think that replacing the XML with a YAML file would help, unless there
is a memory leak in the XML parsing code (and that should then be solved
by fixing that memory leak).

I am thinking of "compiling" the bridgesupport files into some other format
though, such as a Python or C file. That's mostly to get rid of the dependency
on libxml, but could also help to reduce overhead (why parse an XML file
when you can load a .pyc file instead?).
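
A rough sketch of what such a "compile" step could look like, assuming a
fragment modelled loosely on the bridgesupport <function> element. The
helper names (encode, compile_functions) and the FUNCTIONS table layout are
hypothetical, not part of PyObjC. It also collapses each function into one
32-bit and one 64-bit string, as suggested in the quoted message.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, not copied from a real bridgesupport file.
SAMPLE = """<signatures version='1.0'>
  <function name='NSMakePoint'>
    <arg type='f' type64='d'/>
    <arg type='f' type64='d'/>
    <retval type='{_NSPoint=ff}' type64='{CGPoint=dd}'/>
  </function>
</signatures>"""


def encode(func, attr):
    """Join the return type and argument types into a single string."""
    retval = func.find("retval")
    nodes = ([] if retval is None else [retval]) + func.findall("arg")
    parts = ["v"] if retval is None else []  # no <retval> means void
    for node in nodes:
        # Fall back to the 32-bit type when no 64-bit variant is given.
        parts.append(node.get(attr) or node.get("type", ""))
    return "".join(parts)


def compile_functions(xml_text):
    """Return Python source that defines FUNCTIONS = {name: (sig32, sig64)}."""
    root = ET.fromstring(xml_text)
    table = {
        func.get("name"): (encode(func, "type"), encode(func, "type64"))
        for func in root.iter("function")
    }
    return "FUNCTIONS = " + repr(table) + "\n"


# The generated source would be written to a .py file once, at build time;
# here it is compiled and executed in memory to show the round trip.
source = compile_functions(SAMPLE)
namespace = {}
exec(compile(source, "<bridgesupport>", "exec"), namespace)
```

At import time only the generated module would be loaded, so neither the
XML parser nor the original file would be needed.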

Compiling into C would make it possible to create pre-constructed objects,
at the cost of much greater coupling between pyobjc-core and the framework
wrappers.  That would only be useful when there is clear evidence that such
a route could save a lot of memory and/or time.

> 
> Last, the most important part of my super plan, would be smart
> importers. I don't know much about them, but I keep seeing that Brett
> Cannon keeps writing articles about custom imports and stuff. Maybe
> by using this, we could make PyObjC only load the requested elements.
> If, for example, someone would type "from AppKit import symbol1,
> symbol2" instead of "from AppKit import *", then only symbol1 and
> symbol2 would be loaded from the bridge support.
> 
> So, what do you think? Is that plan realistic, will there be a
> show-stopper down the road?

One problem with that is that the framework wrappers themselves use
'from Foo import *' to ensure that imports behave similarly to ObjC
includes.  Changing that for most wrappers wouldn't be much of a problem,
but it would be for a more complex one like the Quartz wrappers.
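
The on-demand loading idea from the quoted message can be sketched with a
module subclass whose __getattr__ resolves symbols the first time they are
requested. This is an illustration only: LazyFrameworkModule, the metadata
dict, and the resolver callback are hypothetical names, and "FakeAppKit" is
a stand-in module, not a real wrapper.

```python
import sys
import types


class LazyFrameworkModule(types.ModuleType):
    """Module that materialises a symbol only when it is first accessed."""

    def __init__(self, name, metadata, resolver):
        super().__init__(name)
        self.__dict__["_metadata"] = metadata  # symbol name -> raw description
        self.__dict__["_resolver"] = resolver  # builds the real object

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        metadata = self.__dict__["_metadata"]
        if name not in metadata:
            raise AttributeError(name)
        value = self.__dict__["_resolver"](name, metadata[name])
        setattr(self, name, value)  # cache: later lookups skip __getattr__
        return value


resolved = []  # records which symbols were actually built


def fake_resolver(name, info):
    resolved.append(name)
    return info * 10


module = LazyFrameworkModule(
    "FakeAppKit", {"symbol1": 1, "symbol2": 2, "symbol3": 3}, fake_resolver
)
sys.modules["FakeAppKit"] = module

# Only the two requested names are resolved; symbol3 is never touched.
from FakeAppKit import symbol1, symbol2
```

A 'from Foo import *' in a wrapper would either see none of the lazy names
or, if __all__ listed them, resolve every one of them, which is the problem
described above.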

We'll only know for sure once someone measures what's going on, but
my gut feeling is that it is possible to reduce the amount of memory used
by PyObjC and that it is also possible to reduce the amount of time
that it takes to import a framework wrapper.

BTW. One possible source of memory use is the proxy for NSString
objects: for compatibility with the rest of Python that proxy is a subclass
of Python's unicode type and therefore contains a copy of the content
of the string (with 16 bits per character in the string regardless of how
the string is represented in ObjC).

BTW2. Measuring where memory has gone will be harder than I'd like
due to Python's memory management routines: Python has its own
malloc-like allocator, which means that Instruments won't be very
helpful for the measurements. With some luck a Python build with
'--without-pymalloc' still works...
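
For allocations that go through Python's own allocator, later Python
versions ship a tracemalloc module in the standard library that can
attribute them to source lines. A minimal sketch, assuming such a Python;
the measure helper is a hypothetical name, and memory allocated directly
with malloc by C or ObjC code would not show up in its output.

```python
import tracemalloc


def measure(func, top=5):
    """Run func() and return its result plus the largest allocation diffs."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = func()  # keep the result alive so its memory is still counted
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    stats = after.compare_to(before, "lineno")
    return result, stats[:top]


# Stand-in workload; in practice this could be importing a framework wrapper.
result, stats = measure(lambda: [str(i) for i in range(10000)])
for stat in stats:
    print(stat)
```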

Ronald

> --
> Virgil Dupras
> Hardcoded Software
> http://www.hardcoded.net