Thread: Re: [Pypes-user] Pypes meet Axon ?


Thanks Michael,

Don't be too surprised by the similarities because I've been a fan of  
Kamaelia for some time now. I really like the message-passing style,  
so many of my ideas came from Kamaelia. Although there are  
similarities, I don't think Pypes is quite as ambitious. Pypes started  
out as a way of feeding huge volumes of data to a search index. It was  
originally linear and I was using function composition to chain  
together the new style generators in Python (PEP 342).
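
A minimal sketch of what chaining generators through function composition can look like. This is not the actual Pypes code: `compose` and the three stage functions are hypothetical names chosen for illustration, and the stages are plain pull-style generators rather than PEP 342 coroutines driven with `send()`.

```python
from functools import reduce

def compose(*stages):
    """Chain generator functions left to right into one pipeline function."""
    def pipeline(source):
        # Each stage wraps the stream produced by the previous one.
        return reduce(lambda stream, stage: stage(stream), stages, source)
    return pipeline

def strip_whitespace(docs):
    for doc in docs:
        yield doc.strip()

def drop_empty(docs):
    for doc in docs:
        if doc:
            yield doc

def to_packet(docs):
    # Wrap each surviving document in a simple dict "packet" (hypothetical shape).
    for n, doc in enumerate(docs):
        yield {"id": n, "body": doc}

feed = compose(strip_whitespace, drop_empty, to_packet)
packets = list(feed(["  first doc ", "   ", "second doc"]))
```

Because every stage is lazy, documents flow through one at a time, but the shape of the pipeline is fixed when `compose` is called, which is the linearity described above.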

Although this worked well, it made it a bit tough to set up at runtime  
(thinking in terms of UI work). I also wanted the ability to publish  
to multiple locations and/or branch based on certain attributes in a  
packet. I turned to Stackless because I think it's really cool. I  
love seeing these sorts of old school concepts resurrected (like  
Flow-Based Programming).
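
The branching idea can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it does not use Stackless, and `route`, `publish`, the dict-shaped packets, and the output names are all hypothetical rather than part of the Pypes API.

```python
def route(packet, branches, default=None):
    """Return the name of every output whose predicate matches the packet."""
    matched = [name for name, predicate in branches if predicate(packet)]
    if matched:
        return matched
    return [default] if default is not None else []

def publish(packets, branches, default=None):
    """Deliver each packet to all matching outputs (here, in-memory lists)."""
    outputs = {}
    for packet in packets:
        for name in route(packet, branches, default):
            outputs.setdefault(name, []).append(packet)
    return outputs

# Hypothetical outputs: everything is indexed, PDFs are also archived.
branches = [
    ("search_index", lambda p: True),
    ("archive", lambda p: p.get("type") == "pdf"),
]
packets = [{"id": 1, "type": "pdf"}, {"id": 2, "type": "html"}]
outputs = publish(packets, branches)
```

A packet can match more than one predicate, which covers publishing to multiple locations, while a predicate on a packet attribute covers branching.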

Too many of us become complacent with "what is" and just assume that  
it's the best (or only) way to do things. Both Stackless (coroutines)  
and Flow-Based programming challenge some very fundamental concepts in  
computer science. More importantly, they seem to be ideal concepts  
that are quite applicable to the sort of problems that we're trying to  
solve with these frameworks.

Quite honestly I see pypes (Visual Design Studio) as more of an ETL  
style framework and for this reason, I haven't addressed the issue of  
cycles. I wanted to keep things as simple as possible because lots of  
complexity leads to the potential for more bugs. With this ETL  
mindset, performance is a priority. A typical installation of pypes  
(VDS) in an enterprise setting is 3 quad Xeon machines all processing  
content in parallel. It's not uncommon to push 15 million documents  
through the system.

If something fails after 3 million documents, we have to go back and  
start the feed over again which can really start to eat up time as  
well as the customer's patience. For this reason simplicity is also a  
priority because it allows us to minimize the possibility for errors.

The UI work was really inspired by Yahoo Pipes. I hadn't done any Ajax  
or JavaScript coding prior to this and I would never have thought this  
type of UI was possible had it not been for Yahoo Pipes. The idea of a  
web UI was really cool because I had been writing an HTTP service  
layer on top of pypes that exposed a REST API where external  
applications could inject content into the system by issuing HTTP POST  
commands. This style of processing is easy to scale out using hardware  
load balancing.
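
As a rough sketch of how an external application might inject content with an HTTP POST, using only the standard library. The endpoint URL, the XML payload, and the `build_inject_request` helper are assumptions made for illustration; the actual pypes REST API is not described here.

```python
import urllib.request

def build_inject_request(url, payload, content_type="application/xml"):
    """Build (but do not send) a POST request carrying one document."""
    return urllib.request.Request(
        url,
        data=payload,  # must be bytes
        headers={"Content-Type": content_type},
        method="POST",
    )

# Hypothetical endpoint and document body.
request = build_inject_request(
    "http://localhost:8080/docs",
    b"<doc><title>hello</title></doc>",
)
# urllib.request.urlopen(request) would actually send it.
```

Since each POST is self-contained, a hardware load balancer can spread such requests across several machines without the client knowing which one handles a given document.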

The fact that I was able to build the UI as a web application still  
baffles me. Best of all, it really cuts down on dependencies. Visual  
Design Studio is pure Python with no C-extensions so it's simple to  
install. Of course, C-extensions can be shipped/built separately to  
address specific performance concerns. Right now I'm using ElementTree  
which is shipped with Python 2.6 and it's blazing fast (about twice as  
fast as libxml -- yes I deal with a LOT of XML content).

I have some Bayesian and Fisher classifiers as well as tools for  
creating decision trees since I deal with a lot of taxonomy mapping.  
Information Extraction is another focal point.

If pypes gains a following I suspect we'll see some RSS mashup tools  
surface. I've been in a few organizations that would love to leverage  
Yahoo Pipes but are afraid because there's no SLA. Yahoo Pipes also  
doesn't allow you to write custom components, and both of these  
problems are addressed in pypes.

At any rate, thanks for taking the time to check out pypes and provide  
some feedback. I could definitely see some collaboration in our  
future. Will you be at PyCon this year by any chance? I thought I saw  
a tweet stating that it wasn't looking too promising.

-Eric
