From: Nathan I. <nin...@gm...> - 2005-12-21 15:21:46
On 12/20/05, The Rasterman Carsten Haitzler <ra...@ra...> wrote:

> just one thing - with efm it shouldn't be forking 1 process per image all at
> once. it will only be keeping 1 forked child at a time - running along
> generating for images without thumbs if they need one. the parent just gets
> the child exit event then forks off another. so it's 1 fork per image - and
> only per image that needs a thumb AND only once per generation. i think you
> can safely assume even on the worst of posix systems 1 fork is nothing
> compared to the workload of loading, scaling then writing an image file :)

Ahh, I should have read the code more closely. I saw the fork() at the top
of _e_thumb_generate and assumed the worst. Thanks for clarifying that.

> anyway. i personally still favor the fork model as it requires no pthreads,
> likely has no overhead compared to threads, has no concurrency and cache
> issues, and is simple.

Agreed.

> what it does need is an ability to tune how many forked image generators to
> allow at a time (efm allows only 1 so dual cpu systems will be happy, more
> cpus won't benefit - ok maybe 3 as x is probably involved, and you might say
> 4 if you let the kernel run IO cpu instructions on a 4th cpu).

I think single CPU systems could even benefit from spawning a few worker
processes. Each process will reach a state where it's waiting on a read from
the image file, so two or three thumbnailing processes could potentially
interleave their resource usage pretty well: one sits in a blocking read
while the other is processing image data.

> anyway - you DO have a very valid point for when 2 apps start thumbnailing
> the same dir.
> we should definitely put in a locking mechanism for that so either they
> share the workload, or the first guy in gets "lock ownership" and drives the
> thumbnailing until he's done and the other process sits and waits (maybe
> polling the lock file if we use that mechanism - the owner could update the
> timestamp on the lockfile whenever it generates something. or the lock file
> could contain info as to the queue of ungenerated images to go... or maybe
> the simplest case: all processes not owning the thumbnailing for that dir
> hold off until the owner releases (hopefully not too long from now) and then
> do a full update).

I'm actually pretty surprised the fd.o spec didn't address locking at all,
at least not that I saw the last time I read it.

> anyway - i do agree that there is need to unify. i do also think there are 2
> levels here. 1. just generate thumbs and let calling process know (either
> via a blocking api or a fork/event), 2. be able to ask for the thumb path
> for any given file path, and 3. load thumb into a canvas object (another
> level entirely). i do think you want to support blocking and async - both.
> really async can just be a wrapper on top of the blocking api.

I think 1 and 2 are what I'm most concerned with atm. Following the fd.o
spec (to a point, since its lack of jpeg support is just dumb), 3 is not a
large issue since it's just loading a png or jpeg.

> imho it could do with:
>
> 1. add a file path to the thumb gen queue
> 2. delete a file path from the queue
> 3. begin queue processing
> 4. pause/unpause queue processing
> 5. end queue processing
> 6. ask for thumb path from file path
> 7. brute-force blocking-api generate thumb
> 8. get "new thumb available" events
> 9. set parallelism count (how many threads or forked children to allow at a
>    time)

I think we're on the same page here as far as features. So here's the idea I
had in mind for implementation.
First off, we have a lib that provides the blocking API with Epsilon. I
spoke to atmos about all of this and he expressed a desire to keep Epsilon
simple and not expand the functionality much at that level. So that could
provide the lowest level blocking thumbnail generation based on MIME type
with plugins (as mentioned in your next paragraph).

To address the async aspect, we'd wrap Epsilon with the queue processing and
event API with the features you mention above. Then to provide the async
behavior (and address the locking problem), the lib could actually set up an
IPC channel and fork off a small daemon. That daemon would then be available
to all processes owned by that user and fork off processes responsible for
the actual generation of the thumbnails. Since the daemon provides the only
route to thumbnail generation, we don't have to deal with locking or
potential deadlocks, and the race condition is eliminated. If the daemon is
auto-started on demand, it could also exit after a configurable amount of
inactivity. It also knows what's currently being processed so it can
shortcut the requeueing of duplicate items.

The only real downside I see to this is the IPC overhead, though entropy
actually does this to communicate between threads and appears to do so w/o
any significant performance impact. Any other concerns that I'm missing?

> locking can be implemented under the bonnet of such an api. also one thing i
> think might be good here is that the lib actually dynamically adapts to
> whatever libs it can find at RUNTIME - not compile-time. so if it finds
> imlib2, it will use it. if it finds evas, it will use that, if it finds
> epeg, it will use that - it can dlopen the libs just like the runtime
> linker, and thus adapt to whatever is on the system at runtime without
> compile time dependencies (installing more libs just gets faster
> thumbnailing or more format support etc.).
> this allows us to add other things in future under the hood (thumbnailing
> pdf's, html files, text files, svg, etc.)

Yeah, I think this all belongs at the lower level blocking API.

Nathan
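
For reference, the bounded fork model discussed in this thread (one forked
child per image, with a tunable cap on how many run at once) might look
roughly like the sketch below. This is not efm's or Epsilon's actual code:
thumb_queue_run() and generate_thumb() are hypothetical names, and
generate_thumb() is only a placeholder for the real load/scale/write step.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical placeholder for the blocking load/scale/write step.
 * Returns 0 on success, non-zero on failure (here: an empty path). */
static int generate_thumb(const char *path)
{
    return (path != NULL && path[0] != '\0') ? 0 : 1;
}

/* Block until one child exits. Returns 1 if it exited with status 0,
 * 0 if it failed, -1 if wait() itself failed.
 * Note: wait() reaps ANY child of this process, so this sketch assumes
 * the caller has no other children running. */
static int reap_one(void)
{
    int status;
    pid_t pid;

    do {
        pid = wait(&status);
    } while (pid < 0 && errno == EINTR);

    if (pid < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}

/* Fork one worker per path, keeping at most max_children alive at once.
 * Returns the number of thumbnails generated successfully, or -1 if
 * fork() or wait() failed. */
int thumb_queue_run(const char *const *paths, int count, int max_children)
{
    int running = 0, done = 0, failed = 0, i, r;

    if (max_children < 1)
        max_children = 1;

    for (i = 0; i < count && !failed; i++) {
        pid_t pid;

        /* At the cap: wait for a child exit before forking another. */
        if (running >= max_children) {
            r = reap_one();
            if (r < 0)
                return -1;
            done += r;
            running--;
        }

        pid = fork();
        if (pid < 0) {
            failed = 1;          /* stop forking, still reap below */
        } else if (pid == 0) {
            /* Child: do the work, then exit without running the
             * parent's atexit handlers or flushing its buffers. */
            _exit(generate_thumb(paths[i]));
        } else {
            running++;
        }
    }

    /* Drain the remaining workers. */
    while (running > 0) {
        r = reap_one();
        if (r < 0)
            return -1;
        done += r;
        running--;
    }
    return failed ? -1 : done;
}
```

Calling thumb_queue_run(paths, n, 1) gives the one-child-at-a-time behaviour
described for efm above; a larger cap lets workers overlap blocking reads
with image processing, which is the case argued for single-CPU systems.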