Thread: [Python-markdown-discuss] wrapping character ranges in tags

From: Eric A. <gi...@gm...> - 2008-10-01 09:05:09

Hi there,

I've made a custom inline pattern to wrap Chinese characters in <span  
class="char"></span> tags, using a regex (Chinese characters generally  
fall between \u4e00 and \u9fff).

The code below works, but it wraps each individual character in its own
span tags, rather than wrapping consecutive runs of characters in a
single set of tags:

class SpanPattern(markdown.Pattern):
     def handleMatch(self, m, doc):
         el = doc.createElement('span')
         el.appendChild(doc.createTextNode(m.group(2)))
         el.setAttribute('class','char')
         return el

md.inlinePatterns.insert(-1,SpanPattern(ur'([\u4e00-\u9fff]+)'))

The result is the same whether or not the + is included in the regex.
Is there some other trick I can use to make sure that five characters
in a row, for instance, will get wrapped together in one pair of spans?

TIA,

Eric
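
A quick way to check the regex in isolation, outside Markdown, is sketched below. It assumes Python 3 and uses only the standard `re` module; `wrap_cjk_runs` is a hypothetical helper name, not something from the thread or the library. If the pattern with `+` wraps a whole run here, the per-character behaviour is more likely coming from the Markdown version in use than from the regex.

```python
import re

# Same character range as in the question; the + makes each match a whole run.
CJK_RUN = re.compile(r'[\u4e00-\u9fff]+')


def wrap_cjk_runs(text):
    """Wrap every consecutive run of matching characters in a single span."""
    return CJK_RUN.sub(
        lambda m: '<span class="char">%s</span>' % m.group(0), text
    )


wrapped = wrap_cjk_runs("including 沙發 shāfā - sofa")
print(wrapped)
# including <span class="char">沙發</span> shāfā - sofa
```

The pinyin "shāfā" is left alone because "ā" (U+0101) is outside the \u4e00-\u9fff range.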

Re: [Python-markdown-discuss] wrapping character ranges in tags

From: Yuri T. <qar...@gm...> - 2008-10-07 05:42:02

Which version are you using?  When I run this with the last released
version (1.7), I actually get an error.

If you want to try this with the latest code from git, then the
following code does work:

class SpanPattern(markdown.Pattern):
    def handleMatch(self, m):
        el = etree.Element('span')
        el.text=m.group(2)
        el.set('class','char')
        return el

md.inlinePatterns.insert(-1,SpanPattern(ur'([\u4e00-\u9fff]+)'))
print md.convert(u"including 沙發 shāfā - sofa")

produces:

<p>including <span class="char">沙發</span> shāfā - sofa</p>

 - yuri
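
For readers on a current release, a rough equivalent might look like the sketch below. It assumes Python 3 and Python-Markdown 3.x, where inline patterns are usually written as `InlineProcessor` subclasses and added with `register()`. The class name `SpanInlineProcessor`, the registry name `'cjk_span'`, and the priority `175` are arbitrary choices for illustration, not values from the thread. Note that the legacy `Pattern` class wraps the regex in extra groups, which is why the code above reads `m.group(2)`; `InlineProcessor` compiles the regex as given, so the run is in `m.group(1)`.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.inlinepatterns import InlineProcessor


class SpanInlineProcessor(InlineProcessor):
    """Wrap each run matched by the pattern in <span class="char">."""

    def handleMatch(self, m, data):
        el = etree.Element('span')
        el.set('class', 'char')
        el.text = m.group(1)
        # Return the element plus the start and end of the consumed text.
        return el, m.start(0), m.end(0)


md = markdown.Markdown()
# 'cjk_span' and 175 are illustrative; any unused name and priority should do.
md.inlinePatterns.register(
    SpanInlineProcessor(r'([\u4e00-\u9fff]+)', md), 'cjk_span', 175
)
html = md.convert("including 沙發 shāfā - sofa")
print(html)
```

This should wrap the two-character run in one span, matching the output shown above, but it has only been sketched against the 3.x extension API rather than the 2008 code base discussed in this thread.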

-- 
http://sputnik.freewisdom.org/

Re: [Python-markdown-discuss] wrapping character ranges in tags

From: Eric A. <gi...@gm...> - 2008-10-07 10:55:52

On Oct 7, 2008, at 1:39 PM, Yuri Takhteyev wrote:

> Which version are you using?  When I run this with the last released
> version (1.7), I actually get an error.
>
> If you want to try this with the latest code from git, then the
> following code does work:

Brilliant! That does work. I was using 1.7.0 R 66 before, and it was
definitely working (I copied and pasted the code exactly), just not
producing the result I was after. Thanks very much.

Yours,
Eric

