pypes-user Mailing List for pypes
Status: Beta
Brought to you by:
egaumer
From: Eric G. <eg...@py...> - 2009-08-18 03:05:29
Thanks Michael,

Don't be too surprised by the similarities, because I've been a fan of Kamaelia for some time now. I really like the message-passing style, so many of my ideas came from Kamaelia. Although there are similarities, I don't think Pypes is quite as ambitious.

Pypes started out as a way of feeding huge volumes of data to a search index. It was originally linear, and I was using function composition to chain together the new-style generators in Python (PEP 342). Although this worked well, it made it a bit tough to set up at runtime (thinking in terms of UI work). I also wanted the ability to publish to multiple locations and/or branch based on certain attributes in a packet.

I turned to Stackless because I think it's really cool. I love seeing these sorts of old-school concepts resurrected (like Flow-Based Programming). Too many of us become complacent with "what is" and just assume that it's the best (or only) way to do things. Both Stackless (coroutines) and Flow-Based Programming challenge some very fundamental concepts in computer science. More importantly, they seem to be ideal concepts that are quite applicable to the sort of problems that we're trying to solve with these frameworks.

Quite honestly, I see pypes (Visual Design Studio) as more of an ETL-style framework, and for this reason I haven't addressed the issue of cycles. I wanted to keep things as simple as possible because lots of complexity leads to the potential for more bugs.

With this ETL mindset, performance is a priority. A typical installation of pypes (VDS) in an enterprise setting is 3 quad Xeon machines all processing content in parallel. It's not uncommon to push 15 million documents through the system. If something fails after 3 million documents, we have to go back and start the feed over again, which can really start to eat up time as well as the customer's patience. For this reason simplicity is also a priority, because it allows us to minimize the possibility for errors.
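As a rough illustration of the original linear design mentioned in this message (function composition over generators, plus the later wish to branch on a packet attribute), here is a minimal sketch. It is not pypes code: every name in it (`compose`, `branch`, `collector`, the packet fields) is hypothetical, and it is written for Python 3 rather than the Python 2.6 of the thread.

```python
from functools import reduce


def compose(*stages):
    """Chain generator stages left to right: compose(f, g)(src) == g(f(src))."""
    def pipeline(source):
        return reduce(lambda stream, stage: stage(stream), stages, source)
    return pipeline


def strip_whitespace(docs):
    for doc in docs:
        yield doc.strip()


def drop_empty(docs):
    for doc in docs:
        if doc:
            yield doc


def to_packet(docs):
    # Hypothetical packet layout: an id, a crude type guess, and the body.
    for number, doc in enumerate(docs, start=1):
        kind = "xml" if doc.startswith("<") else "text"
        yield {"id": number, "type": kind, "body": doc}


def coroutine(func):
    """Prime a PEP 342 style generator so it is ready to accept send()."""
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return start


@coroutine
def collector(bucket):
    """Sink that appends every packet it is sent to the given list."""
    while True:
        packet = yield
        bucket.append(packet)


@coroutine
def branch(attribute, routes, default):
    """Forward each packet to the sink chosen by one of its attributes."""
    while True:
        packet = yield
        routes.get(packet.get(attribute), default).send(packet)


# Linear part: a fixed chain built by function composition.
feed = compose(strip_whitespace, drop_empty, to_packet)
packets = list(feed(["  <doc>a</doc> ", "   ", "plain note"]))

# Branching part: route on the "type" attribute of each packet.
xml_bucket, other_bucket = [], []
router = branch("type", {"xml": collector(xml_bucket)}, collector(other_bucket))
for packet in packets:
    router.send(packet)
```

The composed chain is fixed when `compose` is called, which hints at why wiring stages together at runtime from a UI is awkward with this approach, while the push-style `branch` shows one way a packet attribute could select a destination.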
The UI work was really inspired by Yahoo Pipes. I hadn't done any Ajax or JavaScript coding prior to this, and I would never have thought this type of UI was possible had it not been for Yahoo Pipes. The idea of a web UI was really cool because I had been writing an HTTP service layer on top of pypes that exposed a REST API where external applications could inject content into the system by issuing HTTP POST commands. This style of processing is easy to scale out using hardware load balancing.

The fact that I was able to build the UI as a web application still baffles me. Best of all, it really cuts down on dependencies. Visual Design Studio is pure Python with no C-extensions, so it's simple to install. Of course, C-extensions can be shipped/built separately to address specific performance concerns.

Right now I'm using ElementTree, which is shipped with Python 2.6, and it's blazing fast (about twice as fast as libxml -- yes I deal with a LOT of XML content). I have some Bayesian and Fisher classifiers as well as tools for creating decision trees, since I deal with a lot of taxonomy mapping. Information Extraction is another focal point.

If pypes gains a following I suspect we'll see some RSS mashup tools surface. I've been in a few organizations that would love to leverage Yahoo Pipes but are afraid because there's no SLA. Yahoo Pipes also doesn't allow you to write custom components, and both these problems are addressed in pypes.

At any rate, thanks for taking the time to check out pypes and provide some feedback. I could definitely see some collaboration in our future. Will you be at PyCon this year by any chance? I thought I saw a tweet stating that it wasn't looking too promising.

-Eric
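To make the "inject content by issuing HTTP POST commands" idea above concrete, here is a minimal client-side sketch using only the Python 3 standard library. The host name, the `/docs` path, and the XML payload are assumptions made for illustration; the message does not describe the actual pypes REST endpoints, and the request is only built here, not sent.

```python
from urllib.request import Request


def build_inject_request(base_url, document_xml):
    """Build (but do not send) a POST request carrying one XML document.

    The "/docs" path is a hypothetical endpoint, not a documented pypes URL.
    """
    return Request(
        url=base_url.rstrip("/") + "/docs",
        data=document_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Hypothetical service address; ".invalid" is a reserved, non-resolving TLD.
request = build_inject_request(
    "http://pypes.example.invalid:5000",
    "<doc><title>hello</title></doc>",
)
# Passing the request to urllib.request.urlopen() would submit the document.
```

Because each POST is self-contained, a load balancer can hand any request to any node, which fits the scale-out point made above.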
From: Michael S. <spa...@gm...> - 2009-08-15 16:22:27
Hi,

Just saw pypes pass through the PyPI RSS feed, and think it's really interesting/cool. Congratulations on getting a release out the door :-)

For the reason why I find it interesting, I'll let code do the talking, with some things I find interesting parallels... :-)

pypes.component.py

    class Component(object):  # stackless.tasklet based

Axon.Component.py

    class component(microprocess):  # generator based (we also have thread based)

pypes.component.py

    def __init__(self):
        self.inputs = {'in' : [None, 'Default input port'] }
        self.outputs = {'out': [None, 'Default output port']}
        self._parameters = {}

Axon.Component.py

    Inboxes = {"inbox" : "Default inbox for bulk data. Used in a pipeline much like stdin",
               "control" : "Secondary inbox often used for signals. The closest analogy is unix signals"
              }
    Outboxes = {"outbox": "Default data out outbox, used in a pipeline much like stdout",
                "signal": "The default signal based outbox - kinda like stderr, but more for sending signal type signals",
               }

    def __init__(self, *args, **argd):
        """(subclass always calls this via something like this:
        super(component, self).__init__()
        self.__dict__.update(argd)
        self.inboxes = dict()
        self.outboxes = dict()
        for boxname in self.Inboxes:
            self.inboxes[boxname] = makeInbox(notify=self.unpause)
        for boxname in self.Outboxes:
            self.outboxes[boxname] = makeOutbox(notify=self.unpause)
        ....
pypes.component.py

    def recv(self, port):

Axon.Component.py

    def recv(self,boxname="inbox"):

pypes.component.py

    def send(self, port, data):

Axon.Component.py

    def send(self,message, boxname="outbox"):

pypes.component.py

    def recvall(self, port):

Axon.Component.py

    def Inbox(self, boxname="inbox"):

You'll see further equivalences between Axon.Box.py and pype.py

Clearly we have some similar thinking here, and I must also say that your visual design programme is very shiny, and similar to our Compose tool, which is currently bust - so yours is by definition far nicer at the moment :-) (Though it shares code with our pygame based run time visualiser, which you can see described here: http://www.kamaelia.org/AxonVisualiser )

Anyway some differences:

-> You're using stackless tasklets & multiprocessing
<- We're using standard generators, threads & pprocessing

-> You appear to deny cyclic graphs. (I could be very wrong here though :)
<- Many of our systems contain cycles.

-> You appear to have been aware of J P Morrison's work before starting
<- I wasn't. (starting points for me were occam, unix, hardware, etc)

Interesting to me:
 * I've wanted to switch from pprocessing to multiprocessing
 * I've been curious about supporting more forms of component, including stackless tasklets and greenlets, for a while :-)
 * You also have a very shiny web based system editor, whereas our compose tool (which uses pygame & tkinter) is currently out of action :-)

ie figuring out some way of interoperating with Pypes strikes me as a really nice thing to do :) (The biggest "issue" I can see there is the switch around of port/box name & data really)

Anyway, I just wanted to say that pypes looks really neat & interesting, and congratulations on getting a release out! :-)

Good luck & have fun :)

:-)

Michael.
--
http://yeoldeclue.com/blog
http://twitter.com/kamaelian
http://www.kamaelia.org/Home
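The argument-order mismatch noted in this message (pypes takes `send(port, data)`, Axon takes `send(message, boxname="outbox")`) could in principle be bridged with a thin adapter. The sketch below only illustrates that idea under stated assumptions: `PypesStyleComponent` is a hypothetical stand-in that copies the signatures quoted in the message, not the real pypes class, and the box-to-port name mapping is invented for the example.

```python
class PypesStyleComponent:
    """Hypothetical stand-in using the pypes-style signatures quoted above."""

    def __init__(self):
        self.inputs = {"in": [None, "Default input port"]}
        self.outputs = {"out": [None, "Default output port"]}
        # Simple per-port queues; the real pypes internals are not shown here.
        self._queues = {"in": [], "out": []}

    def send(self, port, data):
        self._queues[port].append(data)

    def recv(self, port):
        queue = self._queues[port]
        return queue.pop(0) if queue else None


class AxonStyleAdapter:
    """Expose Axon-style send(message, boxname) / recv(boxname) calls."""

    # Assumed mapping from Axon box names to pypes port names.
    BOX_TO_PORT = {"inbox": "in", "outbox": "out"}

    def __init__(self, component, box_to_port=None):
        self._component = component
        self._box_to_port = dict(box_to_port or self.BOX_TO_PORT)

    def send(self, message, boxname="outbox"):
        # Swap the argument order: Axon puts the data first, pypes the port.
        self._component.send(self._box_to_port[boxname], message)

    def recv(self, boxname="inbox"):
        return self._component.recv(self._box_to_port[boxname])


component = PypesStyleComponent()
wrapped = AxonStyleAdapter(component)

wrapped.send("hello")             # Axon order, default box "outbox"
wrapped.send("world", "outbox")   # Axon order, explicit box name
component.send("in", "incoming")  # pypes order: port first, then data
received = wrapped.recv()         # reads from the port mapped to "inbox"
```

Keeping the name mapping in a dictionary means components with extra ports (for example, something playing the role of Axon's "control" and "signal" boxes) could be accommodated by passing a different `box_to_port`.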