From: Jimmy Z. <cra...@co...> - 2006-08-22 00:02:21
How many different kinds of encodings does Woodstox currently support?
I am thinking about adding more encoding support. Right now VTD-XML
supports UTF-8, ASCII, ISO-8859, UTF-16BE and UTF-16LE.

----- Original Message -----
From: "Tatu Saloranta" <cow...@ya...>
To: "Jimmy Zhang" <cra...@co...>; <vtd...@li...>
Cc: <n96...@ma...>
Sent: Thursday, August 17, 2006 10:11 PM
Subject: Re: [Vtd-xml-users] Fw: Problems with VDT (fwd)

> --- Jimmy Zhang <cra...@co...> wrote:
>
>> One of the early emails from din sush asks about how
>> to split a large file into smaller ones.
>>
>> Then I thought about it and felt that a better
>> solution (than current VTD-XML or Woodstox)
>> can indeed be built....
>>
>> The basic idea is to record only the offset and
>> length of an element when splitting, so it doesn't
>> need to read the whole thing into memory like
>> VTD-XML did, nor does it need to perform
>> decoding/re-encoding and string creation like Pull....
>>
>> After retaining the offset and length, just copy the
>> file segment into separate files....
>>
>> What do you guys think?
>
> For the specific splitting task that would be good.
> Maybe such a utility could be written, perhaps being
> passed an XPath expression defining where to split the
> file (like, defining root nodes of the resulting docs?).
> And you could probably use much of the VTD-XML code as
> the core of such a tool?
>
> That should be a very fast & memory-efficient solution.
>
> -+ Tatu +-
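
The copy step of the offset-and-length idea quoted above can be sketched in plain Java. This is only an illustration, not code from VTD-XML or Woodstox: the class name `FragmentSplitter` and the method `copyRange` are hypothetical, and the sketch assumes the byte offset and length of each element have already been recorded by some parser. It copies the raw bytes of one element into a new file without decoding them or building strings.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of VTD-XML or Woodstox.
final class FragmentSplitter {

    /**
     * Copies `length` raw bytes starting at byte `offset` of `src` into `dst`.
     * The bytes are never decoded, so the fragment keeps the source encoding.
     * The offset and length are assumed to come from an earlier indexing pass.
     */
    static void copyRange(Path src, long offset, long length, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            if (offset < 0 || length < 0 || offset + length > in.size()) {
                throw new IllegalArgumentException("range lies outside the source file");
            }
            long copied = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (copied < length) {
                long n = in.transferTo(offset + copied, length - copied, out);
                if (n <= 0) {
                    throw new IOException("could not copy the full range");
                }
                copied += n;
            }
        }
    }
}
```

Calling `FragmentSplitter.copyRange(source, offset, length, target)` once per recorded element would produce one output file per element. Note that a fragment copied this way carries no XML declaration and none of the namespace declarations made on its ancestors, so a real splitting tool would probably have to write those itself.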