From: Sabri L. <sab...@gm...> - 2007-10-01 13:05:40
On 10/1/07, Campo Weijerman <rf...@nl...> wrote:
>
> On Mon, Oct 01, 2007 at 01:35:02PM +0200, Sabri LABBENE wrote:
> > Reini Urban wrote:
> > >Campo Weijerman wrote:
> > >> On Fri, Sep 28, 2007 at 10:57:24AM +0200, Sabri LABBENE wrote:
> > >>> Hi all,
> > >>> I'm using phpwiki-1.3.12 and I'm trying to make it recognize
> > >>> CamelCase words with numbers inside as wikiwords, for example:
> > >>> - CamelCase2 -> is a wikiword
> > >>> - Camel2Case -> is also a wikiword
> > >>> - 2CamelCase -> is also a wikiword
> > >>>
> > >>> I think there should be a regular expression somewhere in the
> > >>> code that decides if a word is a wikiword. Can someone tell me
> > >>> where to find it? Will there be any side effects when numbers
> > >>> are allowed in wikiwords?
> > >>
> > >> Hi,
> > >>
> > >> We had a similar requirement and solved it back with phpwiki 1.3.3 by
> > >> changing the definition of $WikiNameRegexp in index.php
> > >>
> > >> With more recent releases there is WIKI_NAME_REGEXP in
> > >> config/config.ini
> > >>
> > >> It takes some tweaking to arrive at the right compromise between the
> > >> regex being too wide or too narrow. I think too wide is worse than
> > >> too narrow: you can always force linking to a page by putting the name
> > >> in [brackets], which is less painful than having to escape every other
> > >> word on a page...
> >
> > I tried the regexp and it still catches CamelCase words without digits
> > inside. I don't understand why you need to escape some other words in your
> > page. Maybe your requirement is to link only page names that contain
> > digits.
>
> Sure. The problem is, if you start tweaking the regexp it is easy to
> come up with something that considers too many words a WikiWord, and
> you'll end up having to escape lots of words.
>
> > >> We have been using this for years now:
> > >>
> > >> WIKI_NAME_REGEXP =
> > >> "(?<![[:alnum:]])[[:upper:]][[:alnum:]]*?[[:lower:]][[:alnum:]]*?[[:upper:]][[:alnum:]]*(?![[:alnum:]])";
> > >>
> > >> Btw, the default is
> > >>
> > >> WIKI_NAME_REGEXP =
> > >> "(?<![[:alnum:]])(?:[[:upper:]][[:lower:]]+){2,}(?![[:alnum:]])"
> > >
> > >config-dist.ini in CVS has these options:
> > >http://phpwiki.cvs.sourceforge.net/phpwiki/phpwiki/config/config-dist.ini?revision=1.83&view=markup
> > >
> > >; Perl regexp for WikiNames ("bumpy words"):
> > >; (?<!..) & (?!...) used instead of '\b' because \b matches '_' as well
> > >; Allow digits: BumpyVersion132
> > >; WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:][:digit:]]+){2,}(?![[:alnum:]])"
> > >; Allow lower+digits+dots: BumpyVersion1.3.2
> > >; WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:][:digit:]\.]+){2,}(?![[:alnum:]])"
> > >; Default old behaviour, no digits as lowerchars.
> > >;WIKI_NAME_REGEXP = "(?<![[:alnum:]])(?:[[:upper:]][[:lower:]]+){2,}(?![[:alnum:]])"
> >
> > Great, it works!
> > Thanks Reini and Campo.
>
> Actually, the suggestions offered by Reini better match what you asked
> for. I use phpwiki mostly for documenting IT-related stuff, and as we
> all know there are many acronyms used. The traditional definition of
> WikiWord will not include anything containing an embedded acronym, like
> for example DocBookXML2LaTeX (at least, I don't think it does). The
> alternative regexp I am now using will match any sequence of non-blank
> non-punctuation that starts with a capital letter and alternates
> sufficiently between lower and uppercase. This works pretty well.

OK, I see.
BTW, did you notice any slowdown in page rendering, or any other side
effects after changing the regexp, aside from having to escape other
words?

BR
--
Sabri.
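
For anyone who wants to compare the patterns quoted above before editing config.ini, here is a minimal sketch in Python rather than PHP. It makes several assumptions. Python's re module has no POSIX classes, so [[:upper:]], [[:lower:]], [[:digit:]] and [[:alnum:]] are replaced by ASCII-only ranges. The names PATTERNS, SAMPLE and find_wikiwords are made up for this illustration and are not part of phpwiki. Results for non-ASCII page names may differ from what PCRE does inside phpwiki.

```python
import re

# ASCII-only stand-ins for the POSIX classes used in WIKI_NAME_REGEXP.
# This is an approximation for illustration, not phpwiki's own code.
PATTERNS = {
    # Default: two or more runs of "one capital + lowercase letters".
    "default": r"(?<![A-Za-z0-9])(?:[A-Z][a-z]+){2,}(?![A-Za-z0-9])",
    # "Allow digits" option from config-dist.ini.
    "digits": r"(?<![A-Za-z0-9])(?:[A-Z][a-z0-9]+){2,}(?![A-Za-z0-9])",
    # Campo's pattern: capital, then a lowercase letter somewhere,
    # then another capital, all inside one alphanumeric run.
    "campo": (
        r"(?<![A-Za-z0-9])[A-Z][A-Za-z0-9]*?[a-z][A-Za-z0-9]*?"
        r"[A-Z][A-Za-z0-9]*(?![A-Za-z0-9])"
    ),
}

# Hypothetical sample text containing the words discussed in the thread.
SAMPLE = (
    "See CamelCase2, Camel2Case, 2CamelCase and "
    "DocBookXML2LaTeX in HelloWorld docs."
)


def find_wikiwords(name, text):
    """Return every substring of text matched by the named pattern."""
    return [m.group(0) for m in re.finditer(PATTERNS[name], text)]


if __name__ == "__main__":
    for name in PATTERNS:
        print(name, find_wikiwords(name, SAMPLE))
```

Under these assumptions, "default" finds only HelloWorld, "digits" also finds CamelCase2 and Camel2Case, and "campo" additionally finds DocBookXML2LaTeX. None of the three matches 2CamelCase: each pattern requires a leading capital that is not preceded by an alphanumeric character, so a page name starting with a digit would still need [brackets].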