
#610 making WW code reusable outside of MediaWiki

working markup
open
None
5
2014-11-12
2014-09-29
Lee Worden
No

While I'm working on the working markup project, I need to make fixes to parts of WW that fail when called in a non-MediaWiki environment, so I can reuse them.

Discussion

  • Lee Worden

    Lee Worden - 2014-09-29

    [original ticket description:
    While I'm working on the working markup project, I'm losing my way in the WW code.
    I give it a page with just a single project-file tag, and hope it will try to find some text and put it in place of the tag. It seems to be looking for an archived project file of that name! Why?
    ]

    This doesn't really merit a bug ticket, but this is a convenient place to take notes and make sense out of it. I apologize for the extra email notifications.

    So WWInterface::expand_tokens( $text, null, $markup_filename ) is what I call. It is calling WWStorage::find_file_content( $project_filename, $project_description, $markup_filename, false ). That final 'false' says the project file is not a source file. This means look for an archived project file, because find_file_content is not meaningful for a regular project file - it looks for file contents in the wiki, not in a working directory.

    Why is expand_tokens() calling this?

    • pass 1 of expand_tokens:
      • for each token
        • if ( it's not a source-file tag defining a source file's contents )
          • if ( the project name is known, and the pagename is known )
            • then call find_file_content()
            • if that finds content, record it at $uniq_tokens[$token]['file-content']
            • if that found content and the project doesn't have it as an archived file, produce a warning.
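
    In rough PHP, that pass-1 logic looks something like this (a sketch from reading the code, not the code itself; the key names in $tag_data and the helpers has_archived_file() and record_warning() are placeholders):

        // Sketch of pass 1 of expand_tokens(), per the list above.
        // $uniq_tokens maps each parser token to its tag data; key names are approximate.
        foreach ( $uniq_tokens as $token => &$tag_data ) {
          if ( $tag_data['is-source-file-definition'] ) {
            continue;  // a source-file tag supplies its own contents
          }
          if ( isset( $tag_data['project'] ) && isset( $tag_data['pagename'] ) ) {
            $content = WWStorage::find_file_content(
              $tag_data['filename'], $project_description, $markup_filename, false );
            if ( $content !== null ) {
              $tag_data['file-content'] = $content;
              if ( ! $project_description->has_archived_file( $tag_data['filename'] ) ) {
                WWInterface::record_warning( 'content found for a file the project has not archived' );
              }
            }
          }
        }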

    OK, I see why it would be doing that (in the WW context). It looks like expand_tokens() may be doing a bunch of checks and maybe other things that don't make sense in a simpler markup context. I'll have a look at what they are and how to disentangle them.

     

    Last edit: Lee Worden 2014-10-01
  • Lee Worden

    Lee Worden - 2014-09-29

    In this case with the (non) archived project file, it shouldn't need to do a find_file_content, because the tag content would have been supplied with the rest of the tag data by the parser and I could just record it with the token data. But let's have a look at what else is going on in there.

     
  • Lee Worden

    Lee Worden - 2014-09-30

    It looks like when calling it without a wiki I can pretty much just omit the entire pass 1.

    The "right" way to do that is to use a WWInterface object, and in the markup case, use a subclass to provide an object that does certain methods differently. How hard would it be to do that? WW uses static class methods for WWInterface and WWStorage, because before now I haven't had a reason to instantiate an instance. So I would need to replace all calls to WWInterface::whatever() with $wwinterface->whatever() and make sure there is a $wwinterface variable in scope.

    I can make that object global, or continue moving toward providing a "context" object that is passed around, to increase modularity and reusability. I think passing it around would be better overall, but it would take too much work, so I should probably make it global for now - but at least encapsulate it in a global $wwContext that points to a WWInterface, a WWStorage, and whatever else.
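
    Concretely, I'm picturing something like this (a sketch; the WWContext class, its field names, and the WWMarkupInterface subclass are just what I have in mind, not existing code):

        // Hypothetical context holder; class and field names are illustrative.
        class WWContext {
          public $wwInterface;   // a WWInterface (or a non-MW subclass of it)
          public $wwStorage;     // a WWStorage
        }

        global $wwContext;
        $wwContext = new WWContext();
        $wwContext->wwInterface = new WWInterface();
        $wwContext->wwStorage = new WWStorage();

        // so that calls like
        //     WWInterface::expand_tokens( $text, null, $markup_filename );
        // become
        //     $wwContext->wwInterface->expand_tokens( $text, null, $markup_filename );
        // and a markup-only environment can install a different subclass:
        //     $wwContext->wwInterface = new WWMarkupInterface();  // hypothetical subclass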

     
  • Lee Worden

    Lee Worden - 2014-09-30

    make it global for now - but at least encapsulate it in a global $wwContext that points to a WWInterface, a WWStorage, and whatever else.

    [#611]

     

    Related

    Bugs: #611

  • Lee Worden

    Lee Worden - 2014-10-01

    The expand_tokens issue is resolved - renaming this ticket for ongoing work on making relevant parts of WW compatible with non-MediaWiki uses. (PE should already be compatible.)

     
  • Lee Worden

    Lee Worden - 2014-10-01
    • summary: parsing and puzzlement --> making WW code reusable outside of MediaWiki
     
  • Lee Worden

    Lee Worden - 2014-10-01
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,3 +1 @@
    -While I'm working on the [working](http://lalashan.mcmaster.ca/theobio/projects/index.php/Working_Markup) [markup](https://github.com/worden-lee/working-markdown) project, I'm losing my way in the WW code.
    -
    -I give it a page with just a single project-file tag, and hope it will try to find some text and put it in place of the tag.  It seems to be looking for an archived project file of that name!  Why?
    +While I'm working on the [working](http://lalashan.mcmaster.ca/theobio/projects/index.php/Working_Markup) [markup](https://github.com/worden-lee/working-markdown) project, I need to make fixes to parts of WW that fail when called in a non-MediaWiki environment, so I can reuse them.
    
     
  • Lee Worden

    Lee Worden - 2014-10-01

    WW uses MediaWiki's hooks framework here and there, to let one piece of code extend the behavior of another. For example, the Background code intervenes in project file rendering by making tags with make="background" render as links. And the Preview code intervenes in that by making them not function as links during preview.

    The hooks framework doesn't work when the code isn't called from MW. Replace it with a different, universally usable framework? Probably not too hard. Or bypass those calls, figuring the non-MW uses don't need those features, at least for now?

     
    • Lee Worden

      Lee Worden - 2014-10-01

      What does WW use hooks for?

      • WW-BeforeGetProjectFile: Insert list of background jobs in special page
      • WW-BeforeManageProject: Insert list of background jobs in special page
      • WW-GetProjectFile-LocateFile: apparently not used
      • WW-AllowMakeInSession: make operations are allowed, except in existing background sessions.
      • WW-GetProjectFile-AssumeResourcesDirectory: when in a background session, a filename without a project name means in the session root directory, rather than the resources directory
      • WW-GetProjectFile-Headers: add "preview/background session key is" message to special page
      • WW-ManageProject-Headers: same
      • WW-GetProjectFile-Altlinks: background code adds 'make in background' to links
      • WW-ListDirectorySetup: report the name of the directory being browsed differently if it's in a preview or background session
      • WW-GetProjectFileQuery: when creating a GPF link, add preview or background session info as needed
      • WW-MakeManageProjectQuery: when creating an MP link, add preview or background session info as needed
      • WW-PERequest: when constructing a request to PE, add preview or background session info as needed
      • WW-MakeTarget: disallow make operations during the parse that happens while saving a page after preview
      • WW-OKToSyncFromExternalRepos: 'sync' in a git or svn project causes it to update from the remote repository, unless in a preview session.
      • WW-RenderProjectFile: check for make="background" and produce a background-make link if so. Also don't do project file operations during a save to avoid redundant makes.
      • WW-DynamicProjectFilesPlaceholderMessage: when previewing, display a different message in the placeholder, since reloading isn't the right thing.
      • WW-HiddenActionInputs: add background or preview session info to forms that perform WW actions.
      • WW-Api-Call-Arguments: add background or preview session info to links that invoke the MW/WW API.
      • WW-OKToInsertBackgroundJobsList: it's ok except in preview pages
      • WW-BackgroundMakeOK: it's ok to spin off background jobs, except when in a preview or background session
      • WW-AddToMakeForm: add 'background make' button to the form on MP
      • WW-CachePageFromDB: allow retrieving pages from the database, except if we're currently previewing that page
      • WW-OKToSyncSourceFiles: don't sync files from stored pages during preview, because those are the wrong versions of the files.
      • WW-OKToArchiveFiles: don't archive changes to project files make during preview
      • WW-SequesterArchivedProjectFiles: not currently in use
      • WW-ProactivelySyncIfNeeded: do a special sync operation during preview, to make sure we sync the submitted version of the source files while we have access to it

      ... ok! so all those are for background and preview stuff.

      I implemented preview and background jobs support basically as separate extensions that extend the WW extension. That hooks framework is how extensions are implemented, so I used it. It would probably be simple to reimplement it, by providing a function wwRunHooks that does what wfRunHooks does. Right now I think I'll write a wwRunHooks that does nothing when wfRunHooks is unavailable, and let reimplementation be an option for the future.
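
      As a minimal sketch (assuming the only fallback behavior we need for now is "no hooks registered"):

          // Run a hook through MW's mechanism if it's available; otherwise
          // behave as if every handler ran and none aborted the operation.
          function wwRunHooks( $event, $args = array() ) {
            if ( function_exists( 'wfRunHooks' ) ) {
              return wfRunHooks( $event, $args );
            }
            return true;  // no MW, so no extensions are hooked in
          }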

       
  • Lee Worden

    Lee Worden - 2014-10-01

    write a wwRunHooks that does nothing when wfRunHooks is unavailable, and let reimplementation be an option for the future.

    Done.

    Next up is calls to MW's RequestContext::getMain(), to get info about what's being requested from the wiki. Where's it being used?

    • ProjectEngineConnection looks for a 'logkey' argument in the calling URL or API parameters, and passes it to PE if found
    • ww_setup() checks it for the 'disable-make' and 'ww-static-files' arguments, and uses it to add a noscript-redirect to the HTTP headers, so that non-JavaScript browsers get static file display.
    • ProjectDescription uses it to set the 'MW_PAGENAME' environment variable for make jobs
    • WWInterface uses it to add ResourceLoader modules to the output page
    • WWApiGetProjectFile accesses it and passes it as $context argument to ProjectEngineConnection::make_target().

    Indeed this object is being passed around among various WW functions. The $context argument is used

    • In ProjectEngineConnection's call_project_engine functions and ProjectDescription::fill_pe_request(), including for MW_PAGENAME
    • In WWPreview::OKToSyncSourceFiles_hook(), to compare the title of the page being previewed to the title of the page in question

    So the RequestContext is needed for

    • getting the name of the current page
    • checking for special flags on the URL ('disable-make', etc.)
    • adding ResourceLoader modules to the output.

    The latter two are fine and easily dealt with, since they're WW features that can be ignored outside of MW. As for the name of the page, I guess I should put it into the $wwContext object and stop passing the RequestContext object around.

    Is there a case where we recursively call things using a different RequestContext? Say, when parsing wikitext within wikitext? If so, we'd have a problem where passing the context as an argument works but using the global $wwContext doesn't... looks like yes, since there's code to make it work in ww_setup()...

    • There is a DerivativeRequest in WWApiListResourcesDirectory, used to make a subsidiary call to WWApiListDirectory.
    • Also in WWAction::pass_to_api(), which is the shim code used whenever there's a "?ww-action=" in the URL, to reimplement the CGI-style WW actions as calls to the newer WW API actions.

    There are also uses of $wgRequest and other MW globals here and there (which are supposed to be replaced by use of RequestContext), so those'll be an issue as well.

    I think I may need to split WWInterface into two things - WWMWInterface (or just WWInterface) and WWProcessing, basically - so I can use the core WW stuff and not the MW stuff. Ideally make sure RequestContext is only used in that one file, and info extracted from it is passed to everything else, so all the rest is reusable.

    So for that I'd want to go through the whole thing and inventory what goes where, and look for potential complications. It's only 3600 lines of code :)... :(...

    Along with that, I should probably inventory all the places where $wg* globals are used, since they won't be there. Ideally all that info should be black-boxed into the WWInterface class.

     
  • Lee Worden

    Lee Worden - 2014-10-01

    I think I may need to split WWInterface into two things - WWMWInterface (or just WWInterface) and WWProcessing, basically - so I can use the core WW stuff and not the MW stuff.

    Ideally all that info should be black-boxed into the WWInterface class.

    More conservatively, maybe, move MW-specific things into WWInterface methods as needed, so I can override them in a non-MW subclass. And evolve more gradually toward having all that info black-boxed in WWInterface.
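
    For example (a sketch; everything here other than WWInterface and the MediaWiki calls is a hypothetical name, and this only shows the shape of the split, not the real class):

        class WWInterface {
          // MW-specific detail isolated in one overridable method.
          public function current_page_name() {
            return RequestContext::getMain()->getTitle()->getPrefixedText();
          }
        }

        // Hypothetical subclass for use outside MediaWiki.
        class WWStandaloneInterface extends WWInterface {
          public $page_name = '';   // set by whatever drives the parse
          public function current_page_name() {
            return $this->page_name;
          }
        }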

     
  • Lee Worden

    Lee Worden - 2014-10-04

    Along with that, I should probably inventory all the places where $wg* globals are used, since they won't be there. Ideally all that info should be black-boxed into the WWInterface class.

    Yep, and wf*() functions, starting with wfMsg().

    What a pain! But the result will be a clean piece of processing code that doesn't assume anything about the presence of a wiki.

     
    • Lee Worden

      Lee Worden - 2014-10-04

      I won't clog up this ticket with an endless catalog of wf and wg stuff. But I do want to take a look to see what I'm dealing with...

       
      • Lee Worden

        Lee Worden - 2014-10-04

        I won't clog up this ticket with an endless catalog of wf and wg stuff. But I do want to take a look to see what I'm dealing with...

        Well, there's lots. Missing wf*() functions will cause a crash, so they'll probably get fixed if I take an ad-hoc approach. $wg* globals will just appear as blank values, so I'll need to take some care that they don't cause subtle bugs.

        But right now, I'm just going to look at what's stopping my test code from running. Currently it's wfMsg(), so I'm going to write a wwMessage() function that does something really rudimentary when the wf*() are missing.
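
        Something minimal like this, say (a sketch; when there's no wiki the fallback just hands back the message key and parameters):

            function wwMessage( $key /* , $param1, $param2, ... */ ) {
              if ( function_exists( 'wfMsg' ) ) {
                // delegate to MediaWiki's message lookup
                return call_user_func_array( 'wfMsg', func_get_args() );
              }
              // no wiki, no message catalog: something really rudimentary
              $params = array_slice( func_get_args(), 1 );
              return $params ? $key . ': ' . implode( ', ', $params ) : $key;
            }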

         

        Last edit: Lee Worden 2014-10-04
  • Lee Worden

    Lee Worden - 2014-10-14

    I'm getting to more interesting things: how to tell ProjectEngine to operate in an existing working directory in my home directory, rather than using files stashed in its cache.

    But first, I have this unresolved issue with ProjectDescription::env_for_make_jobs using global MW info to get a value for $(MW_PAGENAME) in certain cases. [#529] is where I worked through that. I made it work by passing a RequestContext object as an argument through various function calls.

    But outside of MW there is no such object. Instead we need to have the WWInterface object know what the value of MW_PAGENAME should be, and we'll ask it, without passing that object around.

    This is trivial, except in the weird cases where it's not. IIRC, the weird stuff is (a) when using the API to make a file that is conceptually within a page, but we're not parsing the page, (b) when parsing .wikitext files that may or may not actually be contained by a page.

    These cases are actually not especially hard. I was worried about recursive execution, where I would need the pagename to be different during an inner call and then revert to its old value. I can't think of any cases where that would happen. I can handle normal parsing, API parsing, and .wikitext parsing all by just setting the page name in the WWInterface object and leaving it.

    So I will do that, and then I'll remove the $context argument from functions that no longer need it.
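
    In other words, something like this (a sketch; the setter and getter names are placeholders):

        // wherever a parse or API call starts (assuming a Title object $title):
        $wwContext->wwInterface->set_page_name( $title->getPrefixedText() );

        // later, in ProjectDescription::env_for_make_jobs():
        $env['MW_PAGENAME'] = $wwContext->wwInterface->get_page_name();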

     

    Related

    Bugs: #529

  • Lee Worden

    Lee Worden - 2014-10-14

    MW_PAGENAME stuff is dealt with. On to how to make ProjectEngine process files in place, rather than off in its own cache directories.

    I think it's like this:

    • Set a flag in the PE request. If we put 'process-in-place' => true, and the project URI is 'file://.' or 'file:///home/wonder/src/files-for-markup/', PE should do it in that location.
    • Use a configuration variable in PE for security. It should require $peAllowProcessingInPlace = true, and refuse to accept the above otherwise. This variable needs to be false by default, and false in any PE installation that runs on a server accepting requests from the internet, for obvious reasons. (See the sketch after this list.)
    • Don't send source file contents to sync in the PE request. The source files are already in the directory.
    • We may want to provide a list of the source files' names, however; some things, like the make rules that create .tex.d files, use that list to tell which files to process. That's for later.
    • We probably want to support syncing source file contents from tags embedded in the markup file, literate-programming style. That's also for later; for now, require the source files to be actual files placed in the working directory by the human.
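
    The first two points might look like this (a sketch; the request keys other than 'process-in-place' are made-up stand-ins for whatever PE's request format actually uses):

        // client side (WW or a command-line driver): the request to PE
        $request = array(
          'project-uri'      => 'file:///home/wonder/src/files-for-markup/',
          'process-in-place' => true,
          'target'           => 'article.html',   // whatever we're asking it to make
        );

        // server side (PE): refuse unless the installation explicitly opts in
        global $peAllowProcessingInPlace;   // must default to false
        if ( ! empty( $request['process-in-place'] ) && ! $peAllowProcessingInPlace ) {
          throw new Exception( 'processing in place is not enabled in this ProjectEngine' );
        }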
     
    • Lee Worden

      Lee Worden - 2014-10-14

      OK, well with the protocol as it is, the 'process-in-place' flag is redundant since it's implied by the use of a 'file:' url. But I'm going to require it anyway, to preserve the option of other uses of file: in the future.

       
  • Lee Worden

    Lee Worden - 2014-10-14

    Not my biggest priority, but I'm noticing it's not doing the 'short name' the way I would like for these file: project URIs. This is the human-readable name it passes to PE along with the unfriendly URI name for each project, used to construct error messages, etc. I went to fix it and the code looks kind of messed up. Should fix.

     
  • Lee Worden

    Lee Worden - 2014-10-15

    PE is being kind of stubborn about assuming files go in its cache directory. When I ask it to do files in /usr/local/src/working-markdown, it complains that it can't create files in /var/cache/ProjectEngine/persistent//usr/local/src/working-markdown....

    I guess that assumption is built into the code in several places. I already have a class hierarchy for kinds of repositories (PERepositoryInterface as base, with PEWorkingWikiRepository, PEGitRepository, PEResourcesDirectoryRepository, etc. derived from it), and I've added PEInPlaceDirectoryRepository to the family, so now I need to get all the assumptions about where to cache things encapsulated into PERepositoryInterface so I can override them in PEInPlaceDirectoryRepository.
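
    What I'm aiming for is roughly this (a sketch; the method and member names and the $peCacheDirectory variable are placeholders, not PE's actual API):

        class PERepositoryInterface {
          // default: work in a directory under PE's cache
          public function working_directory() {
            global $peCacheDirectory;   // e.g. /var/cache/ProjectEngine/persistent
            return $peCacheDirectory . '/' . $this->uri;
          }
        }

        class PEInPlaceDirectoryRepository extends PERepositoryInterface {
          // in-place repositories work directly in the directory named by
          // the file: URI, never in the cache
          public function working_directory() {
            return preg_replace( '|^file://|', '', $this->uri );
          }
        }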

     
    • Lee Worden

      Lee Worden - 2014-10-15

      PEResourcesDirectoryRepository already provides a directory that isn't in the cache. But it bypasses a lot of code by not allowing PE to sync files and make targets in its directory. PEInPlaceDirectoryRepository needs those things to work correctly.

       
  • Lee Worden

    Lee Worden - 2014-10-15

    Also, a worthwhile step towards solving this problem is teaching PE to output diagnostic messages to stdout when I'm running it from the command line. It already knows how to output lots of diagnostics to the comet client, so I should be able to just add a little encapsulation and have them go to multiple places.

     
    • Lee Worden

      Lee Worden - 2014-10-15

      I'd like to have a little class hierarchy, where you either have an output-diagnostics object or you don't, and if you do, you give your messages to it and it outputs them as appropriate.

      What I have now:

      • You call log_sse_message( $text, $request ), where $request is the request data provided by the client. It looks in $request to determine whether the client wants comet updates or not, and if so, it uses the key given to locate the logfile to append the updates into, for spooling by the separate comet server process. If no key is there, it ignores the update.

      So the question is: how does PE know when to output messages to stdout? I can do it either by setting a global $pe variable, or by setting something in the request data. I guess the request data is more flexible. It means more request data, but since this will only be used when the request data isn't actually transmitted over a wire, I don't think that's a concern.

      So I can just extend that $request logic. The SSE (Comet) case is triggered by the presence of $request['sse-log-key']. Semantically it wouldn't be nice to put a special value in there to signify logging to stdout. I could either rename it to a general-purpose flag for how to log output, or introduce a separate 'log-to-stdout' flag. I think if I tried to implement a general-purpose flag, I'd end up needing it to be a (kind of output, output location) pair, which is basically the same as having different flags for different kinds of output, only more verbose. So I think I'll go with 'log-to-stdout'. Come to think of it, this also has the advantage of allowing a client to request both SSE and stdout logging at the same time - and though I can't see why anyone would, there's no reason not to allow it.

      So no class hierarchy, just testing for both flags in log_sse_message().
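
      So the logic comes out something like this (a sketch; only the two flag names and the function's signature are from the notes above, and the logfile helper is a stand-in):

          function log_sse_message( $text, $request ) {
            // Comet/SSE case: append to the logfile named by the client's key,
            // for the separate comet server process to spool out
            if ( isset( $request['sse-log-key'] ) ) {
              $logfile = pe_sse_logfile_for_key( $request['sse-log-key'] );  // stand-in helper
              file_put_contents( $logfile, $text . "\n", FILE_APPEND );
            }
            // command-line case: the same messages, straight to stdout
            if ( ! empty( $request['log-to-stdout'] ) ) {
              print( $text . "\n" );
            }
          }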

       
      • Lee Worden

        Lee Worden - 2014-10-15

        introduce a separate 'log-to-stdout' flag.

        Easy. Done in PE.

        Not so easy: how to get the WW code to insert this flag in every request to PE, when they're made from various points deep in the function call hierarchy.

        I've already faced the corresponding issue of how to get the 'log-to-sse' flag into all the requests, when processing a nest of function calls originating from an API call that wants SSE output. I dealt with that in a kind of ad-hoc way, by putting a test for the SSE key in the original HTTP request data into the ProjectEngineConnection class, which really shouldn't be jumping levels of abstraction like that. So I'd like to improve that as well.

        I could make ProjectEngineConnection an object, with state, allowing me to give it a logging behavior as part of its state and use that for all requests it generates.

        Or I can make the logging behavior be part of the WWInterface object's state. This is simpler to code, so I'll do that for now.

        Actually, rather than logging behavior per se, I'll have ProjectEngineConnection ask the WWInterface object to contribute to the request data before sending it, so it can add logging flags or modify other things. IIRC I already have PEC calling a hook function to let the preview and background code add their flags, so I'll move that functionality into this call as well.
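
        Roughly like this (a sketch of the additions to WWInterface; the method name modify_pe_request() and the $log_to_stdout member are placeholders):

            // Sketch only - really these would be new members on the real class.
            class WWInterface {
              public $log_to_stdout = false;

              // Called by ProjectEngineConnection just before a request goes
              // out, so this object can add logging flags or anything else.
              public function modify_pe_request( &$request ) {
                if ( $this->log_to_stdout ) {
                  $request['log-to-stdout'] = true;
                }
                // preview/background session info could be folded in here too,
                // instead of PEC calling a separate hook for it
              }
            }

            // in ProjectEngineConnection, before sending:
            //     $wwContext->wwInterface->modify_pe_request( $request );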

         
  • Lee Worden

    Lee Worden - 2014-10-15

    I'll have ProjectEngineConnection ask the WWInterface object to contribute to the request data before sending it, so it can add logging flags or modify other things.

    OK, that was easy too. Thinking about things before I do them is great!

    Now to use it to find out where the path is getting set wrong.

     