On Thu, Nov 29, 2012 at 1:06 PM, Tim Lyons <guy.linton@gmail.com> wrote:

thanks but...

The question is not about selection of objects as a whole but about more detailed info.

I do appreciate that I may have misunderstood what you were suggesting, but I don't think it answers the question!

On 29 Nov 2012, at 17:23, Doug Blank wrote:

On Thu, Nov 29, 2012 at 10:00 AM, Tim Lyons <guy.linton@gmail.com> wrote:
Nick, Doug, Thanks very much for your replies.

Unfortunately, my question was not about filters at all. I agree that reusing the "export" selection code would be a good idea, but that seems just to be a matter of implementing it, and my question was different.

Right, but I think that these issues may be more related than one might think at first look. (Nick, please feel free to refine the selection page, and think about a reorg with filters. I thought about combining Proxies with Filters, but there are a number of places that might not work properly. But I'd love to hear more about such an idea.)

I am sorry that I was not clear, and will try again with a specific example. The problem is: how does one plugin determine the URL for hyperlinks to pages produced by other plugins?

There are a couple of related points: how does code get the *right* URL, and how does code know what has been selected. The *right* URL should always be relative (relative allows the root to be placed anywhere), and so code needs to know how to construct the URL based on how "relatively deep" the link is. There is currently a hacky "up" parameter in NarrativeWeb where "../" gets appended a number of times. We can probably come up with a better scheme.

The other issue: how does another plugin know if there are valid links? Part of this confusion is caused by the current WebCalendar, which can create links to be used in combination with NarrativeWeb. However, there is no guarantee that those links are valid when you run these two reports separately. I think what Benny is suggesting is that such use (creating a Calendar with NarrativeWeb) should be done together, within a global organizer (like a Book). In this view, perhaps all webreports are created through a single interface where multiple ones can be selected at once.

(One can select HTML output for other reports, but I think that this is different from creating a webreport. But perhaps having such HTML output have links is a related discussion point).

So, if we centralize the selection, we know what is available for a link. If we centralize the creation of the pages, then we know exactly what has a link.

Well, no, I don't think this addresses the question. Are you assuming that the 'right' url for a given object will always be the same? I was trying to point out that this is not necessarily the case, and anyway, even if an object has been selected, not all the information about that object will be output. What do you do about the info that is not output (and backlinks to that info)? See the detailed discussion in the original email.

Ok, sorry if I am dense :) Some of the confusion is no doubt based on all of us attempting to understand what the problems are, and how we can go about solving them. Let me state some of my assumptions. Not all of these are germane to the current discussion, but they might be related:

1. There should be one definitive location for each object (primary or otherwise) at a site where a webreport resides.
2. Those locations are referenced from other objects, and in lists of objects, always as relative paths.
3. We should use the handle as the identifying key, as we do now (it is currently more complicated than that, with sets of objects appearing in subdirectories because Windows had a problem with too many files in one folder.) We should not change this, as these links should be consistent between runs of the reports. The handle becomes a public UID for that person. If you do not want people to know that source "746743746347346" is the same source at another site (say, because of privacy issues) then an alternate handle should be used, and we will keep the other in a set of UIDs.
4. Because we are using relative URLs, a link might look like "../../../../../source/746743746347346.html" in one place and "../source/746743746347346.html" in another. Both refer to the same object.
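
To make assumptions 1 to 4 concrete, here is a minimal sketch in Python. It assumes every page is known by its path relative to the site root; the names `page_path` and `relative_link` and the flat `source/<handle>.html` layout are hypothetical, chosen only to match the example above, and are not the existing NarrativeWeb code. The point is that the relative link can be computed from the two locations instead of counting "../" by hand.

```python
import posixpath


def page_path(object_class, handle):
    """Hypothetical definitive location of an object's page, relative to the site root."""
    return posixpath.join(object_class, handle + ".html")


def relative_link(from_page, to_page):
    """Relative URL from one page to another; both arguments are root-relative paths."""
    # A page in the site root has an empty dirname, so fall back to ".".
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_page, start=start)


source = page_path("source", "746743746347346")

# The same object linked from a shallow page and from a deeply nested page.
shallow = relative_link("ppl/index.html", source)
deep = relative_link("a/b/c/d/e/page.html", source)

print(shallow)  # ../source/746743746347346.html
print(deep)     # ../../../../../source/746743746347346.html
```

Both links resolve to the same page, so the site root can be placed anywhere.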

More below:

In an earlier prototype I was working on, I had a two-pass system, where the first pass asked each webreport what they were going to create links for,

I don't see how this works. For a start, it cannot meet the main goal of GEPS022: "The main goal of the refactor is to move each major part of NarrWeb into its own Web Report that can be run as a stand-along webreport, or as a sub-report of the refactored NarrWeb.". If there is no Person plugin, then you are just stuck with an empty list.

Well, I wrote that sentence, but I think I would express it differently after this discussion. I think what we are heading towards describing is a system where all webreports are effectively a sub-report of a centralized system.

For example, I now am thinking that it would work like this:

a) first select which data match, and select privacy settings
b) now select which webreports you want to generate. That might include:
* "Detailed Person Page" per person
* "Person List"
* "Surname List"
* "Calendar" listing each selected person's birthday, each family's anniversary
* "Detailed Source Page" per source
c) The report runs pass1 and asks each of these reports what they will generate. A complete list of pages to be generated is now known, as well as what is a backlink to what.
d) The report runs pass2 and each report is passed the global list of pages to be generated, so it knows if it can link to a page.
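
Steps c) and d) could be sketched roughly as follows. This is only an illustration of the two-pass idea: the classes `PersonPages` and `SourcePages`, the methods `plan` and `render`, and the shape of the `selection` dictionary are all made up for the sketch and do not exist in Gramps. Paths are root-relative and the "rendered page" is just the list of link targets.

```python
class PersonPages:
    """Hypothetical webreport: one detailed page per selected person."""

    def plan(self, selection):
        # Pass 1: promise a page for each selected person.
        return {("person", h): "ppl/%s.html" % h for h in selection["person"]}

    def render(self, selection, site_pages):
        # Pass 2: link to a source only if some report promised a page for it.
        output = {}
        for h in selection["person"]:
            links = [site_pages[("source", s)]
                     for s in selection["cites"].get(h, [])
                     if ("source", s) in site_pages]
            output["ppl/%s.html" % h] = links
        return output


class SourcePages:
    """Hypothetical webreport: one detailed page per selected source."""

    def plan(self, selection):
        return {("source", h): "src/%s.html" % h for h in selection["source"]}

    def render(self, selection, site_pages):
        # Links from source pages are left out to keep the sketch short.
        return {"src/%s.html" % h: [] for h in selection["source"]}


def run_site(reports, selection):
    site_pages = {}
    for report in reports:  # pass 1: collect the global list of pages
        site_pages.update(report.plan(selection))
    output = {}
    for report in reports:  # pass 2: every report sees the global list
        output.update(report.render(selection, site_pages))
    return site_pages, output


selection = {"person": ["P1"], "source": ["S1"], "cites": {"P1": ["S1"]}}
with_sources = run_site([PersonPages(), SourcePages()], selection)[1]
without_sources = run_site([PersonPages()], selection)[1]
```

With both reports selected, the person page gets a link to `src/S1.html`; with only the person report selected, the link is simply omitted rather than left dangling.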

Even if there is a Person plugin, the process will not work, because the graph of links between object classes is not simple directed (I think that is what I mean). There is no order in which you can ask webreports which links they will produce that will give all the answers. In particular, there is a loop between sources and media.

I think I see what you mean. Perhaps the first pass is more like:

c) The report asks each report what they want to include, and each included object will have a set of dependencies of included items. The set of dependencies is traversed (breadth-first or depth-first) until all dependencies have been included.

I initially had a scheme (as currently committed) where each plugin was related to an object class. The main class asked the person plugin which objects it was going to produce links to. The person plugin recursively called the other plugins, to tell them that they should produce pages for the objects that were going to be linked. This enabled all linked objects to be listed.

However, Nick pointed out:
To work towards this you should create a separate class for each section
of the NarrWeb report.  These sections should access a WebSite class for
shared information (not each other).  This will make further development
easier in the future. 

emphasis on "not each other"

I agreed that this would make the plugins too dependent on each other. If a particular plugin was missing, you would never get the objects that were linked from that object.

So I am modifying the mechanism to have the main class determine recursively all the objects that are linked. If I then ask each webreport what they are going to create links for, this means that all the link traversal needs to be done twice, which is very wasteful. Even asking each webreport what the url is for each object is somewhat wasteful (so as I describe, I just assume the url algorithm).

I don't think I see the waste... but, like Benny and Nick said, we can always optimize later. We need to get the behavior right first.

Hopefully this was more helpful; I feel like I understand the problem better :)



and then in the second pass all of the webreports knew the results from the first pass, and did what they promised.

Does that fit with your ideas?


The database consists of Person "Alice", Person "Bob", Citation "Page 5" and Source "Legend". All these objects are selected by the current filtering.

Alice has a Source Citation of Page 5/Legend for herself (i.e. in the Person Editor, under the Source Citation tab). Bob has a Source Citation of Page 5/Legend for his name (i.e. in the Name editor under the Source Citation tab).

Alice -> Page 5 -> Legend
Bob -> Name -> Page 5 -> Legend

The Source plugin is designed to show the Page/Volume for Citations that reference that source and a hyperlink to the pages for the objects that reference the Citation. So, in general, on the "Legend" page you would expect to see "Page 5" and a hyperlink to the pages for Alice and Bob.

Source Legend -> Page 5 -> Person Alice
                        -> Bob

Now consider three different plugins for people.

Plugin X outputs one page for each person, and displays the sources for people (but not the sources for names). So this plugin would produce a page for Alice mentioning the source "Page 5/Legend" for her. It should contain a hyperlink to the Source page "Legend". This plugin would also produce a page for Bob, but there would be no mention of the source, and hence no hyperlink to the Source page.

Plugin Y outputs one page for each person, and displays the sources for names, but not the sources for people. So Alice will not have any sources, and Bob will mention the source "Page 5/Legend" and have a hyperlink to Legend.

Plugin Z outputs one page for each surname, containing details of all people with that surname. So there would be one page for both Alice and Bob, and this might mention the source for both Alice and for Bob's name.

Now the idea of plugins is that it should be possible to run the source plugin and X, Y and Z independently.

At present, what I have done is to implement a central/kernel class, which determines the list of objects to be output, which are Alice, Bob, Page 5 and Legend (this works well, and solves some current bugs where some linked objects are missed).

The question is, how does the Source plugin determine the URL for the backlink hyperlinks to the objects that reference the citation Page 5?

At present, I have implemented the central class to assume that the URL for all pages is derived algorithmically from the handle, so the assumed URLs for Alice and Bob are known.

I have implemented the central class to recursively discover all the objects that might need to be output, so it progresses:
Alice -> Page 5 -> Legend
Bob -> Name -> Page 5 -> Legend

As it follows these links, it remembers the backlinks (so they don't have to be rediscovered later).

So the Source plugin can look at the list of Source objects to be output, and can discover that it should output Legend. It can then look at the backlinks and discover that the citations are Page 5, and then look at its backlinks, and discover that it should output hyperlinks to Alice and Bob, whose assumed URLs it knows. Hence it can output:
Source Legend -> Page 5 -> Person Alice
                        -> Bob
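
A rough Python sketch of this traversal, for illustration only: the `REFERENCES` table, the `collect` function and the "via" labels are invented here to mirror the Alice/Bob example, and the real central class works on Gramps objects and handles. It follows the links breadth-first, records each backlink as it goes, and uses a `seen` set so that a loop (such as the one between sources and media) would still terminate.

```python
from collections import deque

# Hypothetical reference data: object -> list of (referenced object, via).
# "via" records whether the citation hangs off the person or off the name.
REFERENCES = {
    "Alice": [("Page 5", "person")],
    "Bob": [("Page 5", "name")],
    "Page 5": [("Legend", "citation")],
    "Legend": [],
}


def collect(roots, references):
    """Breadth-first closure over references, remembering backlinks on the way."""
    included = []
    backlinks = {}
    seen = set(roots)
    queue = deque(roots)
    while queue:
        obj = queue.popleft()
        included.append(obj)
        for target, via in references.get(obj, []):
            backlinks.setdefault(target, []).append((obj, via))
            if target not in seen:  # the seen set also makes loops terminate
                seen.add(target)
                queue.append(target)
    return included, backlinks


included, backlinks = collect(["Alice", "Bob"], REFERENCES)

# What a Source plugin could do for Legend: citations first, then their referrers.
citations = [c for c, _ in backlinks["Legend"]]
people = [p for c in citations for p, _ in backlinks[c]]

print(included)  # ['Alice', 'Bob', 'Page 5', 'Legend']
print(people)    # ['Alice', 'Bob']
```

Note that the backlink from Page 5 to Bob is recorded as coming via his name, which is exactly the detail that matters in the plugin X and plugin Y cases discussed next.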

However, with plugin X, there will be a backlink hyperlink to Bob, but when the user looks at the output page for Bob, he will find no source, and will wonder why the hyperlink was produced. Conversely with Plugin Y and Alice.

If you run plugin Z, then the assumed URL for people will not apply, so any hyperlinks will be wrong.

Similar problems apply with the hyperlink from Alice and Bob to the Source. At present, the list of objects to be output includes the Citation Page 5, but as there is no plugin which outputs citations, we can arrange for there to be no URL for Page 5. The person page generation then needs to determine the source object (entirely straightforward). At present, the assumed URL for the source object is based on the handle, and this can be included in the hyperlink on the person page. But what if the Source plugin does something different (e.g. there is only one page for all sources)?

Finally, if you decide to run more than one plugin (e.g. plugin X and Z), which should be the target for the backlink?

Sorry this is so long, but I hadn't explained my problem clearly before!