Re: [Flex-devel] unicode BOM & endianness
From: Peter M. <pet...@gm...> - 2012-07-09 15:25:13
Am I correct in thinking that if this is compiled on a big-endian machine, it would accept UTF-16BE? (I have an old SPARC box somewhere that I can boot up to test.)

I think the only sane thing we can do is explicitly support only UTF-16BE and UTF-16LE, and provide a function that toggles between the two as a result of processing tokens. My reasoning, after double-checking the Unicode FAQ: the BOM is only valid at the start of a text stream (anywhere else it is a zero width no-break space), and it must be the user's responsibility to define where a text stream starts and ends.

What do you think?

On Mon, Jul 9, 2012 at 11:08 AM, Paul <pa...@pr...> wrote:
> Right now the utf16 flex accepts UTF-16LE.
> It does not handle BOM or endianness.
> Presently in my own work, that is handled by the YY_INPUT macro.
> I suspect there are varying philosophical positions here.
> Would like some input suggestions for this.
>
> Paul Neelands
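
To make the proposal concrete, here is a rough sketch of the two pieces involved: checking for a BOM at the point the user declares to be the start of a text stream, and assembling code units byte by byte so the result does not depend on the host's endianness. The names (`utf16_order`, `utf16_check_bom`, `utf16_code_unit`) are made up for illustration and are not part of flex; how this would be wired into YY_INPUT is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; none of this is flex API. */
typedef enum { UTF16_LE, UTF16_BE } utf16_order;

/* Look for a BOM at the start of a text stream (the caller decides where
 * a stream starts). Returns the number of bytes to skip: 2 if a BOM was
 * found, 0 otherwise. *order is updated only when a BOM is found, so the
 * caller's default (or an earlier toggle) stays in effect without one. */
static size_t utf16_check_bom(const unsigned char *buf, size_t len,
                              utf16_order *order)
{
    assert(buf != NULL && order != NULL);
    if (len >= 2) {
        if (buf[0] == 0xFF && buf[1] == 0xFE) {
            *order = UTF16_LE;
            return 2;
        }
        if (buf[0] == 0xFE && buf[1] == 0xFF) {
            *order = UTF16_BE;
            return 2;
        }
    }
    return 0;
}

/* Build one 16-bit code unit from two bytes. Working on bytes rather than
 * casting to uint16_t keeps the result identical on little-endian and
 * big-endian hosts. */
static uint16_t utf16_code_unit(const unsigned char *p, utf16_order order)
{
    assert(p != NULL);
    return order == UTF16_LE
        ? (uint16_t)(p[0] | (p[1] << 8))
        : (uint16_t)((p[0] << 8) | p[1]);
}
```

With something like this, a scanner action could flip the `utf16_order` value when the grammar says a new stream begins, and a U+FEFF seen anywhere else would simply be passed through as an ordinary character.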