Re: [Plone-developers] publishTraverse, acquisition and multiple urls for the same content


I once had a burning desire to clean up the mess that is the publisher. I spent a long time trying just to document what it does and how it works. The results are here:

http://docs.zope.org/zope_secrets/

I suggest you sit down.

I don't think it's possible to make it much better without breaking some backward compatibility. There are a million old hooks, each used in one or two places, that just need to go.

If we are to clean it up, I'd look at repoze.zope2 as a starting point. That code is much better and much better tested, although now slightly out of date.

Martin

On Wed, Aug 20, 2014 at 12:55 AM, Nathan Van Gheem <na...@va...>
wrote:

> Okay, I'll take a quick crack at this with some thoughts.
> - The traversal side effects shouldn't matter unless you're directly
> linking to them? How come your links are getting messed up?
> - Changing Zope 2 traversal will likely have a lot of side effects. That
> being said, it's unpredictable and unusual behavior for most people. There
> is a lot of history and a lot of code that perhaps depends on it. Shooting
> from the hip, tightening up traversal is something that may be possible in
> the next major release of Plone (6). However, maybe by then we will have
> some ideas on moving the traversal over to Pyramid? :)
> - Some people find acquisition powerful and use it to solve problems in
> some wonky ways...
> - I would think any changes like this to traversal are potentially very
> high risk with low benefits. This might not be an important problem for
> you to tackle right now :(
> Anyways, just some thoughts...
> -Nathan
> On Tue, Aug 19, 2014 at 4:02 AM, Mauro Amico <mau...@gm...> wrote:
>> I want to share a problem that I have with ''publishTraverse'' and
>> ''acquisition''.
>>
>> The Problem
>> -----------
>>
>> My problem with "acquisition" and publishTraverse is that the current
>> method resolves too many different URLs to the same content. For instance,
>> here are some possible URLs for the "kb" page of the plone.org website:
>>
>> https://plone.org/documentation/kb
>> https://plone.org/documentation/manual/kb
>> https://plone.org/documentation/kb/manual/kb
>> https://plone.org/documentation/manual/spinner.gif/kb
>> ...
>>
>> and here is a generic "Plone" site with two content items "a" and "b"
>> (folderish or not)
>>
>> http://example.com/Plone/a
>> http://example.com/Plone/a/b/a
>> http://example.com/Plone/a
>> http://example.com/Plone/b/a
>> ...
>>
>> All the URLs above return 200 with the same content, while I would like
>> the "canonical URL" to return 200 and the others to return 404.
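
[Editor's illustration] The effect described above can be reproduced with a toy model. This is only a sketch, not the Zope Acquisition implementation: the `Node` class, `traverse` and `is_canonical` are hypothetical names invented here, and the "acquisition" is simulated by falling back to the containers when a name is not found directly.

```python
class Node:
    """Toy content object (hypothetical, not a Zope class)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def lookup(self, name):
        # Direct containment wins; otherwise walk up the containers.
        # That fallback is the part that mimics acquisition.
        container = self
        while container is not None:
            if name in container.children:
                return container.children[name]
            container = container.parent
        raise KeyError(name)

    def physical_path(self):
        # The path through real containment only: the "canonical" path.
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(names))


def traverse(root, path):
    """Resolve a URL path one name at a time, starting at the root."""
    names = path.strip("/").split("/")
    if names[0] != root.name:
        raise KeyError(names[0])
    node = root
    for name in names[1:]:
        node = node.lookup(name)
    return node


def is_canonical(root, path):
    """True only if the requested path equals the object's physical path."""
    return traverse(root, path).physical_path() == "/" + path.strip("/")


plone = Node("Plone")
a = Node("a", plone)
b = Node("b", plone)

for url_path in ("/Plone/a", "/Plone/a/b/a", "/Plone/b/a"):
    resolved = traverse(plone, url_path)
    print(url_path, "->", resolved.physical_path(), is_canonical(plone, url_path))
```

In this model every path resolves to the same object, and comparing the requested path with the physical path is one possible way to tell the canonical URL from the others.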
>>
>> The behaviour described above constitutes a problem because:
>>
>> * multiple URLs for the same content are a problem for SEO and are
>>   confusing to people. For SEO, recent versions of Plone introduced the
>>   canonical META, but IMHO it's just a workaround. People are confused.
>>   For example, some of my editors sometimes ask me: "I can't remove the
>>   http://example.com/Plone/a/b/a/ page. Can you do it for me?"
>>
>> * the page is not really the same at every URL: if you open
>>   https://plone.org/documentation/kb and
>>   https://plone.org/documentation/manual/kb, the second has a
>>   portlet that the first is missing
>>
>> * purging a page from an external cache (Varnish or Squid), for example
>>   after a content modification, is a pain, because the same content can
>>   have multiple URLs without any control or rules (collective.purgebyid
>>   solves this)
>>
>> * when using subsites (or multiple Plone sites in the same Zope app) the
>>   problem is even more annoying: suppose that "a" is a subsite (marked
>>   with INavigationRoot) for http://a.example.org and "b" is one for
>>   http://b.example.org; opening the URL http://a.example.org/b will
>>   probably show the homepage of site "a" inside the "b" site
>>   (collective.siteisolation and probably collective.lineage do something
>>   to isolate subsites, but IMHO these are again only workarounds)
>>
>> Are there other people with the same doubts and problems?
>>
>> Does anybody have a good and stable solution for that?
>>
>> My analysis
>> -----------------
>>
>> I looked into this in depth and identified a possible source of the
>> problem in:
>>
>>
>> https://github.com/zopefoundation/Zope/blob/2.13.21/src/ZPublisher/BaseRequest.py#L122
>>
>>                 # And lastly, of there is no view, try acquired attributes, but
>>                 # only if there is no __bobo_traverse__:
>>                 try:
>>                     subobject=getattr(object, name)
>>                     # Again, clear any error status created by __bobo_traverse__
>>                     # because we actually found something:
>>                     request.response.setStatus(200)
>>                 except AttributeError:
>>                     pass
>>
>>
>> I found many solutions (like collective.siteisolation) that work at a
>> higher level with an IPublishTraverse adapter, but in my opinion the
>> problem affects all traversing
>> (e.g. https://plone.org/documentation/manual/spinner.gif/kb), so I think
>> that in the end the best solution could be to modify the default
>> traverser (or something like that).
>>
>> On a production site running Plone 4.2 (all content is Dexterity, no
>> portlets), I added logging:
>>
>> +    import logging
>> +    logger = logging.getLogger('analyze.publishTraverse')
>>
>> ...
>>                 # And lastly, of there is no view, try acquired attributes, but
>>                 # only if there is no __bobo_traverse__:
>>                 try:
>>                     subobject=getattr(object, name)
>> +                    # Record every object that is found only via acquisition
>> +                    logger.warning("obj:%r name:%r meta_type:%r",
>> +                        object, name,
>> +                        getattr(aq_base(subobject), 'meta_type', '-')
>> +                    )
>>                     # Again, clear any error status created by __bobo_traverse__
>>                     # because we actually found something:
>>                     request.response.setStatus(200)
>>                 except AttributeError:
>>                     pass
>>
>> After three weeks I checked the logs: there were some wrong URLs that I
>> would have preferred to answer with 404, and many "false positives",
>> which fortunately were all well known: portal_skins objects
>> ("FileSystem Script", ...), Registry objects ("portal_css",
>> "portal_javascript", ...) and the "Virtual Host Monster" object.
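
[Editor's illustration] A log in the format above can be tallied per meta_type to see which kinds of objects are reached through acquisition. This is a minimal sketch that assumes each record ends with `meta_type:<repr>`, as produced by the `%r` format in the logging call; `count_meta_types` is a hypothetical helper and the sample lines are made up for the example, not taken from the real logs.

```python
import re
from collections import Counter

# Matches the trailing "meta_type:'...'" part of a log record. The optional
# "u" covers Python 2 unicode reprs such as u'Plone Site'.
META_TYPE_RE = re.compile(r"meta_type:u?(['\"])(.*?)\1\s*$")


def count_meta_types(lines):
    """Count how often each meta_type appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = META_TYPE_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Made-up records, only to show the expected input shape.
sample_lines = [
    "obj:<PloneSite at /Plone> name:'portal_css' meta_type:'Stylesheets Registry'",
    "obj:<Item at /Plone/b> name:'a' meta_type:'Dexterity Item'",
    "obj:<Item at /Plone/a> name:'b' meta_type:'Dexterity Item'",
    "a line that does not come from the patched traverser",
]

for meta_type, count in count_meta_types(sample_lines).most_common():
    print(count, meta_type)
```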
>>
>> I will probably extend the logging period up to the second week of
>> September; after that I'm thinking of monkey patching the method with
>> something like:
>>
>>                 # And lastly, of there is no view, try acquired attributes, but
>>                 # only if there is no __bobo_traverse__:
>>                 try:
>>                     subobject=getattr(object, name)
>> +                    # Blacklist: never publish acquired content objects.
>> +                    # Fall back to '' so objects without meta_type are kept.
>> +                    meta_type = getattr(aq_base(subobject), 'meta_type', None) or ''
>> +                    if meta_type.startswith('Dexterity ') or meta_type == 'Plone Site':
>> +                        subobject = None
>> +                        raise AttributeError
>>                     # Again, clear any error status created by __bobo_traverse__
>>                     # because we actually found something:
>>                     request.response.setStatus(200)
>>                 except AttributeError:
>>                     pass
>>
>> Or
>>
>>                 # And lastly, of there is no view, try acquired attributes, but
>>                 # only if there is no __bobo_traverse__:
>>                 try:
>>                     subobject=getattr(object, name)
>> +                    # Whitelist: only publish acquired objects of known types.
>> +                    meta_type = getattr(aq_base(subobject), 'meta_type', None) or ''
>> +                    if meta_type.startswith('Filesystem '):
>> +                        pass  # e.g. object inside portal_skins
>> +                    elif meta_type.endswith(' Registry'):
>> +                        pass  # e.g. portal_css
>> +                    elif meta_type == 'Virtual Host Monster':
>> +                        pass  # e.g. VHM
>> +                    else:
>> +                        subobject = None
>> +                        raise AttributeError
>>                     # Again, clear any error status created by __bobo_traverse__
>>                     # because we actually found something:
>>                     request.response.setStatus(200)
>>                 except AttributeError:
>>                     pass
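
[Editor's illustration] The whitelist rule in the second variant can also be written as a small standalone function, which makes it easy to test outside the publisher. This is a sketch under assumptions: `may_be_acquired` is a hypothetical name, and the three rules are only the ones listed in the message above, not a complete list for every site.

```python
def may_be_acquired(meta_type):
    """Return True if an object reached only via acquisition may be published.

    The rules mirror the whitelist from the proposed patch: skin objects,
    resource registries and the Virtual Host Monster are allowed, everything
    else (including objects without a meta_type) is rejected.
    """
    meta_type = meta_type or ""
    if meta_type.startswith("Filesystem "):
        return True  # e.g. objects inside portal_skins
    if meta_type.endswith(" Registry"):
        return True  # e.g. portal_css
    if meta_type == "Virtual Host Monster":
        return True
    return False


for candidate in ("Filesystem Script (Python)", "Dexterity Item", "Plone Site", None):
    print(repr(candidate), may_be_acquired(candidate))
```

Keeping the decision in one function would let the monkey patch stay a single call, so the rule can change without touching the traversal code again.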
>>
>>
>> Opinions?
>> Ideas?
>> Better solutions (I really don't like monkey patching Zope 2's ZPublisher)?
>>
>> Thanks for having the patience to read this far.
> -- 
> Nathan Van Gheem
> Solutions Architect
> Wildcard Corp