Documents/regexes should be processed in parallel

Status: Beta

Brought to you by: billbaumgartner, gregcaporaso, rndlph

#6 Documents/regexes should be processed in parallel

Status: open

Owner: nobody

Labels: None

Priority: 4

Updated: 2007-12-30

Created: 2007-07-27

Creator: David Randolph

Private: No

One thing we have been silent about in the papers is the actual running time of the program. Our approach is considerably more expensive than the Horn approach.

I note that, at a high level at least, it should be trivial to parallelize the processing of the individual documents to improve the running time of the application.

On the Perl side (on UNIX at least), I could use fork() to pull this off, but threads are probably more appropriate. I am not sure how all of this plays out in the other languages/OSes. But I think this is worth pursuing. I have four processors on my current system, and we are only pegging one of them.
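To make the document-level idea concrete, here is a minimal sketch in Python (chosen only for illustration, it is not the project's code). It assumes each document is a plain string and that "processing" just means counting matches per regex. The names count_matches and process_documents are hypothetical. Worker processes are the default because CPython threads share one interpreter lock and so will generally not spread regex matching across cores.

```python
import re
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def count_matches(text, patterns):
    """Return {pattern: number of matches} for a single document."""
    # Hypothetical stand-in for the real per-document work.
    return {p: sum(1 for _ in re.finditer(p, text)) for p in patterns}


def process_documents(documents, patterns, workers=4,
                      executor_cls=ProcessPoolExecutor):
    """Evaluate every pattern against every document, one task per document.

    Results are returned in the same order as `documents`.
    """
    task = partial(count_matches, patterns=patterns)
    with executor_cls(max_workers=workers) as pool:
        # map() hands each document to a free worker and preserves input order.
        return list(pool.map(task, documents))
```

With the default executor this should be called from a script's main guard on platforms that spawn rather than fork worker processes. Passing a different executor class (for example a thread pool) changes only how the tasks are scheduled, not the results.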

Indeed, since I don't think the order in which the regexes are evaluated matters, we could probably parallelize at a lower level as well.
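The lower-level variant could look roughly like the following Python sketch, which submits one task per (document, regex) pair and merges results in whatever order they finish. It only works if the evaluations really are independent, which is the assumption stated above. The names count_one and process_pairs are hypothetical, and documents are assumed to arrive as a dict of id to text.

```python
import re
from concurrent.futures import ProcessPoolExecutor, as_completed
from itertools import product


def count_one(doc_id, text, pattern):
    """Count matches of one pattern in one document."""
    return doc_id, pattern, sum(1 for _ in re.finditer(pattern, text))


def process_pairs(documents, patterns, workers=4,
                  executor_cls=ProcessPoolExecutor):
    """Run one task per (document, pattern) pair.

    `documents` maps a document id to its text. Returns
    {doc_id: {pattern: match_count}}.
    """
    results = {doc_id: {} for doc_id in documents}
    with executor_cls(max_workers=workers) as pool:
        futures = [
            pool.submit(count_one, doc_id, text, pattern)
            for (doc_id, text), pattern in product(documents.items(), patterns)
        ]
        # Completion order is arbitrary; each result is keyed by
        # (doc_id, pattern), so the merge does not depend on it.
        for future in as_completed(futures):
            doc_id, pattern, count = future.result()
            results[doc_id][pattern] = count
    return results
```

Finer-grained tasks balance load better when a few regexes dominate the running time, but each task carries scheduling and data-copy overhead, so whether this beats per-document tasks would need to be measured.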

Thoughts?

Cheers,
Dave

Discussion

Greg C - 2007-12-30

priority: 5 --> 4