From: Andy H. <and...@gm...> - 2009-10-26 22:27:32
On Fri, Oct 9, 2009 at 6:58 AM, Oehlander, Carl <car...@sa...> wrote:

> Thanks for the reply!
>
> The string “$[$123]” is a reference in a query/template language. It
> appears in normal text but gets substituted before final display.
>
> “foo $[$123]” becomes i.e. “foo bar”
>
> In the documentation describing this language, this string also appears
> (but not substituted). This is where it becomes a problem. I assume the same
> thing happens with regular expressions in normal text.
>
> Is there a way to prohibit line-breaking on syntactical characters which
> are not separated by spaces or explicit break-characters (‘-’, ‘/’)? In most
> cases this is a matter of sequences of Computer-language elements.

You can make a custom break iterator using whatever rules you like. Take the
stock rules from the ICU source file source/data/brkitr/line.txt, modify them
as desired, get the rules into a string in your program, and open a break
iterator with the constructor

    RuleBasedBreakIterator( const UnicodeString &rules, ...

The safest sort of change to make is to move the characters in question from
the character class they are in to some other class that has the desired
behavior. Explicitly subtract them from their default set and add them to
their new set.

  -- Andy

> Would this be considered by ICU as an add-on to LB24/25?
>
> Kind regards,
>
> Carl
>
> *From:* Mark Davis ☕ [mailto:ma...@ma...]
> *Sent:* Friday, October 2, 2009 22:13
> *To:* icu...@li...
> *Subject:* [JUNK]Re: [icu-design] Line breaking Implementation according
> to UAX#14?
>
> We implement what is in Unicode CLDR, and also on
> http://unicode.org/Public/5.2.0/ucd/auxiliary/LineBreakTest.html, which is
> "tailoring of numbers described in Example 7 of Section 8.2 Examples of
> Customization". Is “$[$1234]” a value that occurs in practice, or did this
> just show up in a test?
>
> Mark
>
> On Fri, Oct 2, 2009 at 08:10, Oehlander, Carl <car...@sa...>
> wrote:
>
> Hi,
>
> I am trying to figure out why ICU decides to break this string “$[$1234]”
> between the first dollar-sign and the first bracket. Resulting in:
>
> $[
>
> $1234]
>
> According to LB 25 in UAX#14 (http://unicode.org/reports/tr14/) “$[”
> should never be separated (see “example pairs” under LB 25). Although it
> definitely is in this case. My only guess is that this regular expression:
>
> ( PR | PO) ? ( OP | HY ) ? NU (NU | SY | IS) * (CL | CP) ? ( PR | PO) ?
>
> ..which is available under LB24 takes precedence over anything appearing in
> LB25.
>
> It would be of great help if someone could explain what part of UAX#14 is
> implemented in ICU or point me to the documentation explaining this.
>
> Kind regards,
>
> *Carl Öhlander*
> TD Core AS&DM I18N (AG)
> *SAP AG*
> Dietmar-Hopp-Allee 16
> 69190 Walldorf
> Phone +49 6227 7-40336
> Fax +49 6227 78-62575
> mailto:car...@sa... <car...@sa...>
> http://www.sap.com/netweaver
>
> Sitz der Gesellschaft/Registered Office: Walldorf, Germany
> Vorstand/SAP Executive Board: Léo Apotheker (Sprecher/CEO), Werner Brandt,
> Erwin Gunst, Bill McDermott, Gerhard Oswald, John Schwarz, Jim Hagemann
> Snabe
> Vorsitzender des Aufsichtsrats/Chairperson of the SAP Supervisory Board:
> Hasso Plattner
> Registergericht/Commercial Register Mannheim No HRB 350269
>
> This e-mail may contain trade secrets or privileged, undisclosed, or
> otherwise confidential information.
> If you have received this e-mail in
> error, you are hereby notified that any review, copying, or distribution of
> it is strictly prohibited. Please inform us immediately and destroy the
> original transmittal. Thank you for your cooperation.
>
> _______________________________________________
> icu-design mailing list
> icu...@li...
> To Un/Subscribe: https://lists.sourceforge.net/lists/listinfo/icu-design
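
The "subtract from the default set, add to the new set" step in Andy's reply can be sketched as a small edit on the rule text itself. This is only an illustration under stated assumptions, not ICU code: the helpers `wrapDefinition` and `moveChars` are hypothetical names, the sketch uses plain `std::string` rather than `UnicodeString`, and it assumes the rule source defines each class on its own line in the form `$NAME = <set>;`. Check the definitions in the line.txt of your ICU version before relying on that form.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: wraps the right-hand side of the definition
// "$name = <expr>;" so that <expr> becomes prefix + <expr> + suffix.
// Assumes one definition per variable, starting at the beginning of a
// line, with no ';' inside <expr>.
std::string wrapDefinition(const std::string& rules, const std::string& name,
                           const std::string& prefix, const std::string& suffix) {
    assert(!name.empty());
    const std::string key = "$" + name + " = ";
    std::size_t pos = rules.find(key);
    // Skip matches that do not start a line.
    while (pos != std::string::npos && pos != 0 && rules[pos - 1] != '\n') {
        pos = rules.find(key, pos + 1);
    }
    if (pos == std::string::npos) {
        throw std::runtime_error("no definition found for $" + name);
    }
    const std::size_t exprStart = pos + key.size();
    const std::size_t exprEnd = rules.find(';', exprStart);
    if (exprEnd == std::string::npos) {
        throw std::runtime_error("unterminated definition for $" + name);
    }
    std::string out = rules;
    out.insert(exprEnd, suffix);    // insert at the later position first
    out.insert(exprStart, prefix);  // exprStart is not shifted by the insert above
    return out;
}

// Hypothetical helper: moves the characters in `chars` (set syntax, without
// the surrounding brackets) from class `fromClass` to class `toClass` by
// subtracting them from the first set and adding them to the second.
std::string moveChars(const std::string& rules, const std::string& chars,
                      const std::string& fromClass, const std::string& toClass) {
    const std::string set = "[" + chars + "]";
    std::string out = wrapDefinition(rules, fromClass, "[", " - " + set + "]");
    return wrapDefinition(out, toClass, "[", " " + set + "]");
}
```

For example, calling `moveChars(rules, "\\u0024", "PR", "AL")` on rule text that defines `$PR` and `$AL` would rewrite both definitions so that the dollar sign (U+0024) is removed from `$PR` and added to `$AL`. The class names and the choice of character here are purely illustrative; whether that particular move gives the line-breaking behavior Carl wants has not been verified. The modified text would then be converted to a `UnicodeString` and handed to the `RuleBasedBreakIterator` constructor quoted in the reply.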