Re: [Phpwiki-talk] Numbers in wikiwords

On 10/1/07, Campo Weijerman <rf...@nl...> wrote:
>
> On Mon, Oct 01, 2007 at 01:35:02PM +0200, Sabri LABBENE wrote:
> > Reini Urban wrote:
> > >Campo Weijerman schrieb:
> > >> On Fri, Sep 28, 2007 at 10:57:24AM +0200, Sabri LABBENE wrote:
> > >>> Hi all,
> > >>> I'm using phpwiki-1.3.12 and I'm trying to make it recognize
> > >>> CamelCase words with numbers inside as wikiwords, for example:
> > >>> - CamelCase2 -> is a wikiword
> > >>> - Camel2Case -> is also a wiki word
> > >>> - 2CamelCase -> is also a wiki word
> > >>>
> > >>> I think there should be a regular expression somewhere in the code
> > >>> that decides if a word is a wikiword. Can someone tell me where to
> > >>> find it? Will there be any side effects if numbers are allowed in
> > >>> wikiwords?
> > >>
> > >> Hi,
> > >>
> > >> We had a similar requirement and solved it back with phpwiki 1.3.3 by
> > >> changing the definition of $WikiNameRegexp in index.php
> > >>
> > >> With more recent releases there is WIKI_NAME_REGEXP in
> > >> config/config.ini
> > >>
> > >> It takes some tweaking to arrive at the right compromise between the
> > >> regex being too wide or too narrow.  I think too wide is worse than
> > >> too narrow: you can always force linking to a page by putting the
> > >> name in [brackets], which is less painful than having to escape
> > >> every other word on a page...
> >
> > I tried the regexp and it still catches CamelCase words without digits
> > inside. I don't understand why you need to escape other words in your
> > page. Maybe your requirement is to link only page names that contain
> > digits.
>
> Sure.  The problem is, if you start tweaking the regexp it is easy to
> come up with something that considers too many words a WikiWord, and
> you'll end up having to escape lots of words.
>
> > >> We have been using this for years now:
> > >>
> > >> WIKI_NAME_REGEXP =
> > >> "(?<![[:alnum:]])[[:upper:]][[:alnum:]]*?[[:lower:]][[:alnum:]]*?[[:upper:]][[:alnum:]]*(?![[:alnum:]])";
> > >>
> > >> Btw, the default is
> > >>
> > >> WIKI_NAME_REGEXP =
> > >> "(?<![[:alnum:]])(?:[[:upper:]][[:lower:]]+){2,}(?![[:alnum:]])"
> > >
> > >config-dist.ini in CVS has these options:
> > >http://phpwiki.cvs.sourceforge.net/phpwiki/phpwiki/config/config-dist.ini?revision=1.83&view=markup
> > >
> > >; Perl regexp for WikiNames ("bumpy words"):
> > >;   (?<!..) & (?!...) used instead of '\b' because \b matches '_' as well
> > >; Allow digits: BumpyVersion132
> > >;   WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:][:digit:]]+){2,}(?![[:alnum:]])"
> > >; Allow lower+digits+dots: BumpyVersion1.3.2
> > >;   WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:][:digit:]\.]+){2,}(?![[:alnum:]])"
> > >; Default old behaviour, no digits as lowerchars.
> > >;WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:]]+){2,}(?![[:alnum:]])"
> >
> > Great, it works !
> > Thanks Reini and Campo.
>
> Actually, the suggestions offered by Reini better match what you asked
> for.  I use phpwiki mostly for documenting IT-related stuff, and as we
> all know there are many acronyms used.  The traditional definition of
> WikiWord will not include anything containing an embedded acronym, like
> for example DocBookXML2LaTeX (at least, I don't think it does).  The
> alternative regexp I am now using will match any sequence of non-blank
> non-punctuation that starts with a Capital letter and alternates
> sufficiently between lower and uppercase.  This works pretty well.

OK, I see.
BTW, did you notice any slowness in document rendering, or any other side
effects after changing the regexp, aside from having to escape other words?

BR
-- Sabri.
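
For anyone who wants to compare the three patterns quoted in this thread before editing config/config.ini, here is a rough sketch in Python. It is not how phpwiki evaluates WIKI_NAME_REGEXP: Python's re module has no POSIX classes, so [[:upper:]], [[:lower:]], [[:digit:]] and [[:alnum:]] are rewritten as ASCII ranges, which is an assumption and ignores non-ASCII letters. The names PATTERNS and is_wikiword are made up for this example, and each word is tested on its own rather than inside running page text.

```python
import re

# ASCII approximations of the POSIX-class patterns quoted in the thread.
# Assumption: [[:upper:]] -> A-Z, [[:lower:]] -> a-z, [[:alnum:]] -> A-Za-z0-9.
PATTERNS = {
    # Default: two or more Capitalized lowercase runs, no digits.
    "default": re.compile(
        r"(?<![A-Za-z0-9])(?:[A-Z][a-z]+){2,}(?![A-Za-z0-9])"
    ),
    # "Allow digits" variant from config-dist.ini.
    "digits": re.compile(
        r"(?<![A-Za-z0-9])(?:[A-Z][a-z0-9]+){2,}(?![A-Za-z0-9])"
    ),
    # Campo's variant: upper, then a lower, then another upper somewhere.
    "campo": re.compile(
        r"(?<![A-Za-z0-9])[A-Z][A-Za-z0-9]*?[a-z][A-Za-z0-9]*?"
        r"[A-Z][A-Za-z0-9]*(?![A-Za-z0-9])"
    ),
}


def is_wikiword(pattern_name, word):
    """Return True if the whole word matches the named pattern."""
    return PATTERNS[pattern_name].fullmatch(word) is not None


if __name__ == "__main__":
    words = ["CamelCase", "CamelCase2", "Camel2Case",
             "2CamelCase", "DocBookXML2LaTeX"]
    for word in words:
        results = ", ".join(
            f"{name}={is_wikiword(name, word)}" for name in PATTERNS
        )
        print(f"{word}: {results}")
```

Under these assumptions, none of the three patterns accepts a name that starts with a digit such as 2CamelCase, since each one requires a leading uppercase letter. Only Campo's variant accepts DocBookXML2LaTeX.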