See comment at https://sourceforge.net/p/workingwiki/bugs/374/?limit=10&page=2#c9a8.
I don't want to use straight Ajax for requests to do WW operations, because with Ajax you only know when you initiate a request and when you get the result back. I want to provide ongoing feedback while the operation is happening, so I probably want to use Comet. The main alternative is WebSocket, which requires a special server, whereas it looks like I can produce a Comet response from within the MediaWiki extension's PHP code.
Specifically the SSE technique. This will not work on IE, but like, I bet a lot of WW stuff doesn't work on IE anyway. More seriously, I can make IE fall back to the current CGI way of doing actions.
I think I'm going to write a test API action in the extension, to see how well SSE works or doesn't, in context.
Simple Comet test works, at least in Firefox. It sends an update each second, and quits on the server side when the client aborts the Comet connection. Great news!
This seems to mean I can do this within the standard WW api framework (well, by jettisoning the framework in midstream to do my own low-level output, but anyway I can do it while keeping my code within the standard api class hierarchy, which is great).
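For the record, the server side of a test like that is tiny. A minimal standalone sketch (not the actual WW api code - and it assumes nothing between PHP and the browser is buffering the stream):

```php
<?php
// Minimal SSE ("Comet") responder: one update per second.
header( 'Content-Type: text/event-stream' );
header( 'Cache-Control: no-cache' );
while ( ob_get_level() > 0 ) {
	ob_end_flush();  // get PHP's output buffers out of the way
}

// Each SSE message is one or more "data:" lines plus a blank line.
// When the client aborts, the next write fails and (with the default
// ignore_user_abort(false)) PHP terminates the script - which is the
// quit-on-abort behavior seen in the test.
for ( $i = 0; !connection_aborted(); $i++ ) {
	echo "data: update $i\n\n";
	flush();
	sleep( 1 );
}
```

The client side is just an EventSource pointed at that URL, with an onmessage handler.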
One concern is what will happen if the connection gets lost midway - will the client be able to reconnect? Will reconnecting start a second instance of the operation? That would be bad. Will the first instance of the operation think it has been aborted?
If so, maybe it would be worth having a level of indirection: do an Ajax call to initiate the operation, have it return quickly while leaving the server cranking on the operation and dumping its output to a file, and give the client a key it can use to make a Comet connection that follows the progress of the operation, i.e. spools the contents of the output file. In that case, reopening the Comet connection would be harmless. Even then, though, I'd want to distinguish a lost connection from an intentional abort.
In some cases I might be able to spawn a Unix command and return, like when creating a background job, but in other cases I might need to do long-running operations in php. Can I have it return the Ajax response quickly and keep running on the server side? I don't think so, but maybe.
Tried a quick test where the Comet call starts loading updates once per second into the browser, and then I disconnect and reconnect the wireless... The data stops coming, of course, but the job keeps running on the server, and the client never tries to reconnect or anything; it just seems to get stuck waiting forever - the little text still says "Connected to lalashan" and there's a spinner in the tab...
There are some events in the EventSource object that I haven't provided handlers for - maybe one of them will help handle this...
A more precise test. Start up the one-per-second updates; switch to a different wireless network while it's receiving and see what happens; switch back and see what happens.
When I switch away to a different network, I get nothing: no updates, no error event, no reconnect.
When I switch back, after a pause of 10 or 15 seconds I get all the backed-up updates at once, and then they continue.
So I'm not thrilled about this lack of robustness. Also, when it does reconnect, it does it by calling the same URL over again. So I would have to take measures to make sure it doesn't initiate a second instance of the create-background-job or whatever operation.
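(For what it's worth, one of those measures could be SSE's built-in resume mechanism: if the server tags each message with an "id:" line, then on its automatic reconnect the browser sends a Last-Event-ID header, so the server can tell a reconnect from a fresh request. A sketch, with invented helper functions:

```php
<?php
header( 'Content-Type: text/event-stream' );

// On the browser's automatic SSE reconnect, Last-Event-ID carries the
// "id:" of the last message it received; it's absent on a fresh request.
$last = isset( $_SERVER['HTTP_LAST_EVENT_ID'] )
	? intval( $_SERVER['HTTP_LAST_EVENT_ID'] )
	: -1;

if ( $last < 0 ) {
	startOperation();  // invented: kick off the job only on a fresh request
}
foreach ( updatesSince( $last ) as $id => $text ) {  // invented
	echo "id: $id\n";
	echo "data: $text\n\n";
	flush();
}
```

That still requires the server side to separate starting the job from reporting on it, though.)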
I'm thinking of using Ajax polling rather than Comet after all. It's more noise on the wire due to the 2-way interaction, but gives me more control over reconnecting, and so I can do it more robustly.
Either way I'm going to have to separate reporting on jobs' progress from doing the jobs, on the server side, to make it possible to drop and reconnect without restarting the job.
I'm thinking of an architecture where the job says "Started", gives the client a unique key, and closes the connection, then does the job. As it does the job it will dump progress reports into memcache, using the unique key, and the client will use the key to request updates to those reports.
But I'll have to work out whether it's possible to return a response and then finish the job. MW doesn't do that as is. But I think I can do it if I seize control of the HTTP output, as I've been doing for the Comet response.
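The usual PHP trick for that, as a sketch - it assumes nothing downstream re-buffers or holds the connection open, and $key and the long-operation function are placeholders:

```php
<?php
// Send a short, complete response, then keep working server-side.
ignore_user_abort( true );         // don't die when the client moves on
$key = md5( uniqid( '', true ) ); // the key the client will poll with

ob_start();
echo json_encode( array( 'status' => 'started', 'key' => $key ) );
header( 'Connection: close' );
header( 'Content-Length: ' . ob_get_length() );
ob_end_flush();
flush();                           // the client has its whole response now

doTheLongOperation( $key );        // placeholder: spools progress under $key
```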
Or I might be able to fix up Comet reconnection myself (within this indirection framework with memcache keys) - which would give me the advantages of the above scenario plus less noise on the wire, and probably fewer delays because the updates keep coming without the lag for me to request them over and over.
I should also find out what a WebWorker is and whether I should use one to consolidate multiple Comet streams into one. I'll already have whatever Comet thing this becomes happening concurrently with the background-job updates every 60 seconds, and those could be folded into a single stream of updates.
But I should postpone this until I have some faint idea whether multiple things happening at once is a real issue or not.
A WebWorker, by the way, is basically a JS background thread that you can create to do asynchronous work, such as listening for events from an SSE connection. I see no advantage in using this on a single page, since the event framework in the main thread is fine for that.
However, if the number of open SSE connections to the server becomes a problem - there may be a limit to how many it can support - there's the option of using a single shared WebWorker with a single connection for all the WW pages that are open in one's browser, rather than one or more connections per tab.
OK, setting aside whether I'll use Ajax or Comet or Bon Ami or whatever to convey the shit to the client, how am I going to have one process do the work and another one report on its progress?
Time to survey the list of things that will be done this way.
So anyway, some of the output from these operations is generated directly from the PHP ("Waiting for lock from XXX"), and other output comes from Unix commands (the output of make, for instance). It might make more sense to collect the output in a Unix file rather than in memcache, since I can redirect the output of cp and make there with less hassle.
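Mixing the two kinds of output into one file is cheap, something like this (the path and the key are invented):

```php
<?php
$status = "/tmp/ww-status-$key";  // invented status-file location

// PHP-generated messages get written directly...
file_put_contents( $status, "Running make...\n", FILE_APPEND );

// ...and command output gets redirected into the same file.
system( 'make -C ' . escapeshellarg( $projectdir )
	. ' >> ' . escapeshellarg( $status ) . ' 2>&1' );
```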
So a sequence something like:

- client sends a request to initiate the operation
- server starts the operation, hands the client a key, and spools all the operation's output into a file identified by the key
- client uses the key to follow additions to that file until the operation is done
So why don't I mock up a version of this and see what it takes (there's a rough code sketch after the list):

- A fake-job API call that generates a tempfile and takes a while to fill it
- A status-update API call that the client can call to follow additions to the tempfile
- Test it by modifying the Comet-testing JS that I already have
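Roughly what I have in mind for the two server-side calls (names, paths, and details invented):

```php
<?php
// fake-job: create the status file and fill it slowly.
function fakeJob( $key ) {
	$fh = fopen( "/tmp/ww-status-$key", 'w' );
	for ( $i = 0; $i < 30; $i++ ) {
		fwrite( $fh, "step $i done\n" );
		fflush( $fh );
		sleep( 1 );
	}
	fclose( $fh );
}

// status-update: return whatever has been appended to the file
// since the offset the client last saw.
function statusUpdate( $key, $offset ) {
	$path = "/tmp/ww-status-$key";
	clearstatcache();
	$size = filesize( $path );
	if ( $size <= $offset ) {
		return array( 'offset' => $offset, 'data' => '' );
	}
	$fh = fopen( $path, 'r' );
	fseek( $fh, $offset );
	$data = fread( $fh, $size - $offset );
	fclose( $fh );
	return array( 'offset' => $size, 'data' => $data );
}
```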
So how to get an API call to send its response early? What do ApiBase, ApiMain, etc. do after they call my class's execute()?
Well I'm having a hell of a time getting the initial call to send out its response before it does the work - the mw.Api object waits until the server has done all the work and terminated before it registers that it's gotten a response worth processing. I'm controlling the HTTP headers and stuff, but for some reason when I set "Connection: close" I get "Connection: keep-alive" regardless. Maybe the proxy on yushan is overriding my "Connection" header?
Should I try to work with RHPCS to figure out whether it's the proxy? I want this to be robust, without putting special requirements on sysadmins, like details of how the proxy has to be configured.
What if I do a hybrid approach where the initial call produces a key, spools output to a status file, and also streams its output to the client? The client can happily receive it up till it loses the connection, then can reconnect using the key to listen in on the ongoing operation without restarting it.
Weird, but maybe doable. Maybe too complicated though. Might be worth taking it slower and figuring out what's "the right way" to do it.
Ha! Or! I have the long job use Comet-style output to send its response, because that gets processed immediately. But as soon as the client gets the first piece of data from it, it drops the connection and makes a request to listen in on the status file. That way the long job can do its work without having to also produce constant updates for the client. This will be especially important when the long job is making system calls to make, cp, etc., and the output is going to the file only.
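Server-side, that first step is just one SSE message before the real work starts. A sketch - the key generation and long-operation function are placeholders, and ignore_user_abort keeps the job alive through the client's deliberate disconnect:

```php
<?php
header( 'Content-Type: text/event-stream' );
ignore_user_abort( true );  // survive the client hanging up on purpose

$key = md5( uniqid( '', true ) );
echo 'data: ' . json_encode( array( 'key' => $key ) ) . "\n\n";
flush();

// From here on, output goes to the status file only; the client has
// presumably dropped this connection and is following the file instead.
doTheLongOperation( $key );  // placeholder: spools to the status file
```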
This is generally working and seems workable. I am having trouble with NFS though. Even though my code calls fwrite() and fflush() every second while writing to the status file, the other process that's watching it, on a different cluster node, doesn't see anything but a 0-byte file for a long time. It updates about every 30 seconds, dumping out 30 seconds worth of data. If I were writing in C, I think I could call ioctl( FIOSYNC ) to make NFS flush out the file to/from the server, but I don't see how to do that from PHP. More to come about that...
I feel like this is fixable though, and this approach will work.
It looks like I may be able to get NFS to sync the file by closing and reopening it, or by locking it. If I can do it by locking that would be great, because I don't think I can close and reopen it while a long make process is appending to it.
Whoa dude, that locking trick actually does work! This is great because I think I can get it to sync by having my php process lock it over and over while some other make process is spooling data to it...
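So the watcher loop becomes something like this sketch (assuming flock() maps onto a lock the NFS client respects, which is what the test suggests):

```php
<?php
// Follow a file that another process, possibly on another node,
// is appending to; echo new bytes to the client as they appear.
function followFile( $path ) {
	$fh = fopen( $path, 'r' );
	$offset = 0;
	// Note: connection_aborted() only updates when we write something,
	// so an idle watcher lingers until its next output.
	while ( !connection_aborted() ) {
		// Taking and releasing a lock makes the NFS client
		// revalidate its cache - the trick described above.
		flock( $fh, LOCK_SH );
		flock( $fh, LOCK_UN );
		clearstatcache();
		$size = filesize( $path );
		if ( $size > $offset ) {
			fseek( $fh, $offset );
			echo fread( $fh, $size - $offset );
			flush();
			$offset = $size;
		}
		sleep( 1 );
	}
	fclose( $fh );
}
```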
[aside: someday I'm going to understand NFS better... we really want it to sync better on its own, because when one node updates a project file and another node retrieves it, it really would be nice if we could trust that it was getting the updated file. I have problems all the time where I edit and save part of the WW code, and load a page to test it, but it's still seeing the old version of the code... I just have to be patient and give it about 15 or 30 seconds to catch up after I save...]
PS. if you're following along at home, you can watch the test data update all quick style in the browser by clicking on a 1.21 page's main header, the one that gives the title of the page. I'll probably remove that testing easter egg pretty soon, when I get to using the comet code for something real.
Something to think about btw: if the client is connected to a secondary process, not the one that's doing the operation, how is it going to abort the operation?
I started a nice discussion about this on wikitech-l by the way: http://lists.wikimedia.org/pipermail/wikitech-l/2013-November/073116.html
I have developed a Comet demo that does a lot of what I want for this: https://github.com/worden-lee/cometdemo. That's a way of putting the code out into the world and making it available for other uses, and of course I'll use it in WW.
Next is to build this into an implementation of "merge background job", I think, because it's awkward to have "destroy" and "merge" side by side and try to remember that the links behave so differently. The merge action requires:
When that's done, I'll have
This should put me in a good position to upgrade other operations more quickly afterward.
It seems like PE will need to have some kind of dance where it
Using the read bit to signal the writer to stop won't work, because I want to read the file both before and after; using the write bit as a bidirectional signal will cause problems - for instance, if there's a second watcher process, it'll mistakenly think it's time to stop watching and delete the file. My sense of humor wants me to kill the writer process by setting the execute bit, so maybe I'll do it that way.
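A sketch of that signaling, with invented names - the watcher sets the execute bit on the status file, and the writer polls for it between writes:

```php
<?php
// Watcher side: tell the writer to stop by setting the execute bit.
function signalStop( $path ) {
	$perms = fileperms( $path ) & 07777;  // keep just the mode bits
	chmod( $path, $perms | 0100 );        // owner execute = "please stop"
}

// Writer side: check between writes whether we've been signaled.
function stopRequested( $path ) {
	clearstatcache();  // don't trust PHP's cached stat data
	return (bool)( fileperms( $path ) & 0100 );
}
```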
I've got Ajax dynamic loading of project files running, though it's surely not perfect. But I won't be deploying it for testing right away because I don't want to disrupt the MMED meeting with buggy code deployments (since I just did disrupt it twice :().
So I'm going to start work on this Comet version of it, which is the next step. I'll put the Ajax dynamic loading interface out there for testing, while I'm working on this Comet stuff behind the scenes.
As I'm thinking of it now, I think the Comet protocol works something like this.
`.make.log` files and the like will not be deleted: this operation log file is a separate thing. That seems pretty solid, as long as I can figure out how to dump all the relevant messages into a single file - which seems pretty achievable, using a combination of `>`, `| tee`, and direct writing from PHP code.
So actually I think I can simplify that - we don't need to send a request to initiate the operation, then get confirmation back, then start following the operation's progress. Instead, the operation request can go straight to `api.php` with `format=comet` or whatever, and with a randomly generated key included. There may also be a separate comet-updates api action that can be called, as an alternative to resending a bunch of parameters from the original request that don't need to be included once it's underway. So then the client will maybe switch to that, using just the key, if it has to reconnect.
This is not really much simpler, so it might turn out to be better to use the earlier proposal. Let's see how it shapes up in the code.