From: Clark C . E. <cc...@cl...> - 2001-12-03 16:46:52
|
Ok. I've begun to think a bit about libyaml... first, the implementation should be done in C and SWIG should be used up-front to provide Java, Perl, and Python bindings. I'd also like Visual Basic bindings, but this requires COM support. So, I think sticking with a COM-like mechanism [1] will work, wrapping with SWIG and then when someone who knows ActiveX a bit better has time, they can probably modify the source to make it fully COM compliant. By a COM-like version, I mean the use of HRESULT as the return value for each function and each struct has a pointer to a vtable for the first member with the first three vtable entries being query, addref, and release. Also, some encapsulation of memory management would be needed so that IMalloc could be plugged in later on. Best, Clark |
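A minimal sketch of the COM-like layout described in this message may make the idea concrete. It is not libyaml code: every name here (yaml_result, yaml_object, yaml_vtable, the YAML_* constants) is hypothetical, yaml_result merely stands in for HRESULT, and plain malloc/free stands in for whatever IMalloc-style allocator would be plugged in later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for COM types; real COM gets HRESULT from <windows.h>. */
typedef int32_t yaml_result;
#define YAML_OK        ((yaml_result)0)
#define YAML_E_NOIFACE ((yaml_result)-1)
#define YAML_E_NOMEM   ((yaml_result)-2)

enum { YAML_IFACE_OBJECT = 0 };   /* the only interface this toy object supports */

typedef struct yaml_object yaml_object;

/* The first three vtable entries are query, addref, and release. */
typedef struct yaml_vtable {
    yaml_result (*query)(yaml_object *self, int iface_id, void **out);
    uint32_t    (*addref)(yaml_object *self);
    uint32_t    (*release)(yaml_object *self);
} yaml_vtable;

struct yaml_object {
    const yaml_vtable *vtbl;   /* vtable pointer is the first member */
    uint32_t refcount;
};

static uint32_t obj_addref(yaml_object *self) { return ++self->refcount; }

static uint32_t obj_release(yaml_object *self) {
    uint32_t left = --self->refcount;
    if (left == 0) free(self);          /* last reference gone: destroy the object */
    return left;
}

static yaml_result obj_query(yaml_object *self, int iface_id, void **out) {
    if (iface_id != YAML_IFACE_OBJECT) { *out = NULL; return YAML_E_NOIFACE; }
    obj_addref(self);                   /* a successful query hands out a new reference */
    *out = self;
    return YAML_OK;
}

static const yaml_vtable obj_vtable = { obj_query, obj_addref, obj_release };

/* Every function returns a result code; the object comes back through a pointer. */
yaml_result yaml_object_create(yaml_object **out) {
    yaml_object *obj = malloc(sizeof *obj);
    if (obj == NULL) { *out = NULL; return YAML_E_NOMEM; }
    obj->vtbl = &obj_vtable;
    obj->refcount = 1;
    *out = obj;
    return YAML_OK;
}
```

A caller would write obj->vtbl->release(obj) when done with a reference. The replies further down the thread argue that a plain C API plus a separate COM wrapper is easier to live with than this layout.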
From: Sven V. <ski...@ko...> - 2003-07-22 21:41:26
|
I'm sorry if this has been asked before, but I couldn't seem to get the search facility to work. http://www.yaml.org/about.html claims that Neil Watkiss is working on libyaml. Does anyone know what this will/does look like or whether/where (part of) it is available ? Neil himself seems to be unreachable (on honeymoon ?). A reply to an earlier mail about a C++ yaml library seems to suggest that nothing is available, but I'd like to know for sure before writing something myself. skimo |
From: Clark C. E. <cc...@cl...> - 2003-07-23 21:35:51
|
Sven, Syck is about as close as you get. It is a native "C" library with extension modules. It isn't exactly what I wanted (it doesn't support streaming, pull-based parsing or does it?) If you were to write a stream-based pull parser, I'd gladly help out. I just don't have the bandwidth to pull it by myself these days. Best, Clark On Tue, Jul 22, 2003 at 11:41:55PM +0200, Sven Verdoolaege wrote: | I'm sorry if this has been asked before, but I couldn't | seem to get the search facility to work. | | http://www.yaml.org/about.html claims that Neil Watkiss | is working on libyaml. | Does anyone know what this will/does look like or | whether/where (part of) it is available ? | Neil himself seems to be unreachable (on honeymoon ?). | | A reply to an earlier mail about a C++ yaml library | seems to suggest that nothing is available, but I'd | like to know for sure before writing something myself. | | skimo |
From: Ned K. <ne...@bi...> - 2003-07-23 22:10:41
|
On Wednesday 23 July 2003 02:37 pm, Clark C. Evans wrote: > Syck is about as close as you get. It is a native "C" > library with extension modules. It isn't exactly what > I wanted (it doesn't support streaming, pull-based parsing > or does it?) > > If you were to write a stream-based pull parser, I'd gladly > help out. I just don't have the bandwidth to pull it by > myself these days. I was just looking at Syck for usage with Squeak. Syck's model is: * you call parse() * it calls back your parse_handler() function with each node * there are a few other optional callbacks, including ones for I/O. So it looks like you could turn Syck "inside out" by calling parse() in another thread, and reading the node stream from the parsing thread in your main thread. But depending on why you wanted a "stream-based pull parser", it might work for you as is without bothering with another thread or buffering. That is: you should be able to feed it from a stream because of its callbacks for I/O, and you get presented with a node at a time. This works for lots of uses. _Why, correct me if I'm wrong here (I haven't actually built anything with it yet...). -- Ned Konz http://bike-nomad.com GPG key ID: BEEA7EFE |
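The callback model described above can be sketched with a toy parser. This is not Syck's API: toy_parser, toy_parse, string_read, and log_node are hypothetical names, and a "node" here is simply one line of text. It only illustrates the shape: the caller supplies an I/O callback and a node handler, calls parse once, and is handed one node at a time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy API for illustration; these are NOT Syck's real names. */
typedef size_t (*toy_read_fn)(void *io, char *buf, size_t max);
typedef void   (*toy_node_fn)(void *user, const char *node);

typedef struct toy_parser {
    toy_read_fn read;     /* I/O callback: where the bytes come from */
    void       *io;
    toy_node_fn on_node;  /* called once per parsed node */
    void       *user;
} toy_parser;

/* Push-style parse: pulls bytes through the I/O callback and pushes each
 * line to the node handler as soon as it is complete. */
void toy_parse(toy_parser *p) {
    char line[128];
    size_t len = 0;
    char ch;
    while (p->read(p->io, &ch, 1) == 1) {
        if (ch == '\n') {
            line[len] = '\0';
            p->on_node(p->user, line);
            len = 0;
        } else if (len + 1 < sizeof line) {
            line[len++] = ch;            /* overly long lines are truncated */
        }
    }
    if (len > 0) {                       /* final line without a newline */
        line[len] = '\0';
        p->on_node(p->user, line);
    }
}

/* An I/O callback that reads from an in-memory string. */
typedef struct string_source { const char *text; size_t pos; } string_source;

size_t string_read(void *io, char *buf, size_t max) {
    string_source *src = io;
    size_t left = strlen(src->text) - src->pos;
    size_t n = left < max ? left : max;
    memcpy(buf, src->text + src->pos, n);
    src->pos += n;
    return n;
}

/* A node handler that counts nodes and remembers the most recent one. */
typedef struct node_log { int count; char last[128]; } node_log;

void log_node(void *user, const char *node) {
    node_log *log = user;
    log->count++;
    strncpy(log->last, node, sizeof log->last - 1);
    log->last[sizeof log->last - 1] = '\0';
}
```

With this shape the handler sees nodes as the input is consumed, which is why no second thread is needed when per-node processing is all the application wants.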
From: Oren Ben-K. <or...@be...> - 2003-07-24 18:03:39
|
Ned Konz [mailto:ne...@bi...] wrote: > > Syck is about as close as you get. It is a native "C" > > library with extension modules. It isn't exactly what > > I wanted (it doesn't support streaming, pull-based parsing > > or does it?) > > > > If you were to write a stream-based pull parser, I'd gladly > > help out. I just don't have the bandwidth to pull it by > > myself these days. > > I was just looking at Syck for usage with Squeak. > > Syck's model is: > > * you call parse() > * it calls back your parse_handler() function with each node > * there are a few other optional callbacks, including ones for I/O. > > So it looks like you could turn Syck "inside out" by calling parse() > in another thread, and reading the node stream from the parsing > thread in your main thread. Multi-threading in C is a PITA. > But depending on why you wanted a "stream-based pull parser", > it might work for you as is without bothering with another thread or > buffering. That is: you should be able to feed it from a stream > because of its callbacks for I/O, and you get presented with a node > at a time. This works for lots of uses. This *might* work but to be useful it would need to be directly supported by the library; it is way too easy to get it wrong... I'm aiming directly at a streaming pull parser... It is tricky but not too much so. I'm disappointed to see nobody bothered to write a pull parser generator so far, so I'm ending up writing my own. Have fun, Oren Ben-Kiki |
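For contrast with the callback model, here is a minimal sketch of what "pull" means: the caller owns the loop and asks for the next event when it is ready for one. The names (pull_parser, pull_event, pull_init, pull_next) are hypothetical, an "event" is just one line of an in-memory string, and nothing here reflects the parser generator mentioned in this message.

```c
#include <stddef.h>

/* Hypothetical pull-style interface; a sketch, not any real YAML parser. */
typedef struct pull_parser { const char *cursor; } pull_parser;

typedef struct pull_event {
    const char *start;   /* points into the caller's input, not NUL-terminated */
    size_t      length;
} pull_event;

void pull_init(pull_parser *p, const char *text) { p->cursor = text; }

/* Returns 1 and fills *ev with the next line, or 0 once the input is exhausted.
 * All parser state lives in *p, so the caller may stop and resume at any time. */
int pull_next(pull_parser *p, pull_event *ev) {
    if (*p->cursor == '\0') return 0;
    const char *start = p->cursor;
    while (*p->cursor != '\0' && *p->cursor != '\n') p->cursor++;
    ev->start  = start;
    ev->length = (size_t)(p->cursor - start);
    if (*p->cursor == '\n') p->cursor++;   /* step over the line terminator */
    return 1;
}
```

Typical use is a plain loop, while (pull_next(&p, &ev)) { ... }, which can be abandoned part-way without threads or buffering. The hard part in a real parser is keeping grammar state between calls, which is presumably what a pull-parser generator would automate.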
From: why t. l. s. <yam...@wh...> - 2003-07-25 05:12:34
|
Oren Ben-Kiki (or...@be...) wrote: > > But depending on why you wanted a "stream-based pull parser", > > it might > > work for you as is without bothering with another thread or > > buffering. That is: you should be able to feed it from a stream > > because of its callbacks for I/O, and you get presented with a node > > at a time. This works for lots of uses. This is the basic state of things. You get kicked out of syck_parse() every time you hit a document separator, so streaming is pretty slick. > > I'm aiming directly at a streaming pull parser... It is tricky but not > too much so. I'm disappointed to see nobody bothered to write a pull > parser generator so far, so I'm ending up writing my own. > This is good news, Oren. :) Best of luck. I admit that my library has its limitations, but I was purposefully aiming for a small set of problems. I wanted a small YAML binding for tight coupling with agile languages. I wasn't even shooting for an emitter, but enough people talked me into it. I wish Syck was suited for direct usage from C, but I don't think it's very friendly in that regard. Does anyone know of a _good_ XML parser for C whose model can be mimicked? Has anyone had success with data serialization in C? _why |
From: Oren Ben-K. <or...@be...> - 2003-07-24 18:01:58
|
Sven Verdoolaege wrote: > I'm sorry if this has been asked before, but I couldn't > seem to get the search facility to work. > > http://www.yaml.org/about.html claims that Neil Watkiss > is working on libyaml. > Does anyone know what this will/does look like or > whether/where (part of) it is available ? > Neil himself seems to be unreachable (on honeymoon ?). > > A reply to an earlier mail about a C++ yaml library > seems to suggest that nothing is available, but I'd > like to know for sure before writing something myself. I've started work on a pull-based C YAML parser. Every time I tried to start it in the past something new came up with regard to the spec itself. Hopefully this time I'll manage to break the spell (Brian, stop laughing at me, I can hear you! :-) I'm actually working on a pull-parser generator. I need to get it working enough to be fed the YAML grammar. Have fun, Oren Ben-Kiki |
From: Kirill S. <xi...@ga...> - 2006-05-22 13:57:34
|
I'd like to announce the start of my libyaml project. It's a C library for parsing and emitting YAML. For more information, please check "http://pyyaml.org/wiki/LibYAML". If you have any comments or suggestions, please post them to the mailing list of the wiki. Kirill. |
From: Jason D. <ja...@in...> - 2001-12-03 17:39:47
|
> By a COM like version, I mean the use of HRESULT as > the return value for each function and each struct > has a pointer to a vtable for the first member with > the first three vtable entries being query, addref, > and release. Also, some encapsulation of memory > management would be needed so that IMalloc could > be plugged in later on. If you want the objects to be used from all versions of the VB family, you have to implement IDispatch. That means there are actually seven entries in the vtable you have to reserve (and implement). That's going to be the least of your problems, though. Automation-compatible interfaces have a slew of restrictions placed on them that you're going to have to hide behind macros in order to get an API that's usable by both C programmers and COM scripters. I've learned from experience that this is not what you want to do. I wrote an RDF parser in Standard C last year. One of my goals was to be able to use it from Automation-compliant environments (like VB and JScript). It was also important to me that "normal" C programmers (using Windows, Linux, or any other platform) could use it as well. So instead of re-inventing a small subset of COM that could compile on all platforms, I wrote the parser with C programmers in mind. The API looks like a fairly typical C API. If you want people to actually use libyaml, this is the approach you should take. Otherwise, somebody else will create their own implementation. For my VB friends, I created a wrapper (or facade) COM object. This handled all the conversions to and from BSTRs, VARIANTs, and many other COM-isms. Its API looks like your typical COM API. Sure, it's an extra layer that could have been avoided by judicious use of macros but since we're talking about VB, I can guarantee this layer will not be the bottleneck. And think about the people who will be helping maintain the libyaml source with you. 
I can bet that many if not most of them will be Unix hackers who won't even be able to compile and test the VB specific parts of the code. COM has already started dying a slow but sure death as it is. The next version of VB (VB.NET) is no longer COM-centric. Your COM objects can still be used but it's no longer the "native" API. The great thing about the whole .NET Framework, though, is that you can write your wrapper in any language supported by the runtime (as long as it can make native calls, of course) and then use that wrapper from any other .NET language. .NET does not use vtables the way that COM does so you _have_ to write a wrapper. But that's a good thing. Hope this helps, Jason. |
From: Clark C . E. <cc...@cl...> - 2001-12-03 18:44:24
|
Jason, This helps a great deal, thank you for taking the time to provide feedback.

| If you want the objects to be used from all versions of the VB family, you
| have to implement IDispatch.

Ok. So you would delegate all COM support (IDispatch) to a wrapper... this gives credit to adding another target language to SWIG (IDispatch).

| The API looks like a fairly typical C API. If you want people
| to actually use libyaml, this is the approach you should take.

Ok. I have three questions in this regard.

1. How do you do exception handling? There are three approaches that I am aware of.
   A. Have each function return an HRESULT that should be checked, provide nice macros to do this checking. Output parameters are done via pointer. This solution is reliable, but very cumbersome.
   B. Use a global variable approach, aka errno, and to keep it thread safe, make it a function so that internally thread local storage can be used. Problem here is most programmers don't check errno; or it results in almost twice as many function calls.
   C. Use a "connect" struct which is passed around in addition to the formal parameters. The connect struct can contain the current result code, error numbers, and other parser specific information...
   If COM compliance isn't required, then option C seems the best as it addresses error handling as a class of problems, not singular.

2. How do you handle interfaces? I'd like to specify the YAML API as an interface, not as a bunch of function calls, so that various implementations can dynamically be pieced together in a processor pipeline.
   A. You provide for objects, where the first item in the struct is a vtbl of functions relevant to that struct. (think SAX interface, via callbacks)
   B. You pass around vtables directly.
   C. You provide first class functions (for the parser and emitter), but also have the interfaces (A) for those who want to build a pipeline.
   I'm thinking C, but don't have any other ideas here.

3. How do you handle memory management?
   A. Go with reference counting (aka addref, release).
   B. Rely upon garbage collection, which is actually almost feasible...
   C. Be a bit more novel... ?
   I think that A is the option.

What would your preference be? 1C, 2C, 3A? Thank you so much Jason. Best, Clark |
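Option 1C, a "connect" struct that carries the result code alongside the formal parameters, could look roughly like the sketch below. All names (yaml_ctx, ctx_status, ctx_init, ctx_check_line) are hypothetical, and the "parsing" is a toy check that each line contains a colon; the point is only how the error state travels with the struct.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; a sketch of option 1C, not a libyaml API. */
typedef enum { CTX_OK = 0, CTX_E_SYNTAX } ctx_status;

typedef struct yaml_ctx {
    ctx_status status;       /* current result code */
    char       message[96];  /* human-readable detail for the first error */
    void      *user_data;    /* room for parser-specific information */
} yaml_ctx;

void ctx_init(yaml_ctx *ctx, void *user_data) {
    ctx->status = CTX_OK;
    ctx->message[0] = '\0';
    ctx->user_data = user_data;
}

/* Records an error; only the first failure is kept. */
static void ctx_fail(yaml_ctx *ctx, ctx_status status, const char *what, int line_no) {
    if (ctx->status != CTX_OK) return;
    ctx->status = status;
    snprintf(ctx->message, sizeof ctx->message, "%s at line %d", what, line_no);
}

/* Toy operation: accepts a line only if it contains a ':'.
 * Once the context has failed, later calls become no-ops, so a caller can
 * issue several calls in a row and check the context once at the end. */
int ctx_check_line(yaml_ctx *ctx, const char *line, int line_no) {
    if (ctx->status != CTX_OK) return 0;
    if (strchr(line, ':') == NULL) {
        ctx_fail(ctx, CTX_E_SYNTAX, "missing ':'", line_no);
        return 0;
    }
    return 1;
}
```

Because the state lives in the struct rather than in a global, each thread can use its own context, which avoids the thread-local-storage workaround mentioned under option B.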
From: Jason D. <ja...@in...> - 2001-12-04 06:38:13
|
I'm not familiar with the term "connect" struct but in Standard C, I almost always write my APIs so that the first parameter to every function is a "this" pointer. To the outside world, this is a void* (hidden behind an appropriately named typedef) but inside the functions, I cast it to a specific struct* so that I can access the "member variables" of that object. This would include any state flags for errors and such. This approach is very similar to a lot of C APIs (like gtk+, for instance, or expat). I don't use vtables in C if I don't have to. If I need some sort of polymorphic behavior (which is pretty rare for my C projects) then I would just use a function pointer as one of the member variables of my C objects. But I would never try to emulate a C++ vtable so that I could use it through something like a COM interface. By making the C API object-oriented in this way, writing a wrapper COM object is really easy. The object will usually just have a void* member variable that represents the C object. The methods in the interface for that object would most likely map one to one to all the C functions that accept that object as its first parameter. So the COM methods really just delegate to the C functions. Instead of passing in this, you pass in m_this (or whatever you call your C object variable). Errors can be checked and converted into HRESULTs, char* parameters and results can be converted to and from BSTRs, etc. I usually implement FinalConstruct on the COM object to allocate a new instance of the C object. This simply invokes one of the few functions that don't accept a "this" pointer as the first parameter (instead it returns your new "this"). FinalRelease deallocates the C object. I'm not sure what type of object model you're creating but reference counting complicates things. I'm glad that's gone in .NET. If you're just wrapping a simple API like a parser or emitter then you shouldn't have any trouble at all. 
You'll only have one object for the parser and one object for the emitter. But if you're creating something like a DOM then you're going to have a hell of a time getting your COM wrappers to not leak. Chris Sells has an interesting approach that you might want to take a peek at: http://discuss.microsoft.com/SCRIPTS/WA-MSD.EXE?A2=ind9909A&L=ATL&P=R13729 C programmers are smart enough to know when to allocate and deallocate nodes in their object model. Don't burden them with the pitfalls of reference counting if you don't have to. If you are creating a DOM-like API, I would definitely recommend making the YAML document object be a factory for all the nodes it contains. This way it can "own" the nodes. Traversing the tree would just return pointers that shouldn't be deallocated. This would actually make the COM wrappers easier to implement as long as you don't use pointers to compare identity. Asking the document for a node could return a new COM object. FinalRelease for that object would NOT deallocate the C node. (If you're worried about efficiency, you can use the flyweight pattern by implementing a custom class factory.) Paul Hollingsworth has some good advice on designing interfaces to model graphs at: http://homepage.interaccess.com/~hollp/Design.htm I hope this helps. I absolutely agree with you that supporting COM-based clients is important but I think it would be a major mistake to make your C API binary compatible with COM (even though it is possible). Your C users will hate the API. They're more important, really. VB users will have no idea that the objects they're using are simply wrappers around a "simple" C API. So let them suffer through an extra function call or two. Jason. |
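The style described in this message (an opaque "this" handle as the first parameter, with the document acting as the factory and owner of its nodes) might be sketched as follows. The names ydoc, ynode, ydoc_new, ydoc_add_scalar, ynode_text, ydoc_last_error, and ydoc_free are hypothetical and only illustrate the shape; they are not taken from any existing library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical API sketch. To callers, handles are just void* behind typedefs. */
typedef void *ydoc;
typedef void *ynode;

/* Private structs; in a real library these would live only in the .c file. */
struct node_impl { char *text; struct node_impl *next; };
struct doc_impl  { struct node_impl *nodes; int last_error; };

/* One of the few functions without a "this" parameter: it returns the new "this". */
ydoc ydoc_new(void) { return calloc(1, sizeof(struct doc_impl)); }

/* The document is the factory for its nodes, so it also owns them. */
ynode ydoc_add_scalar(ydoc self, const char *text) {
    struct doc_impl *doc = self;
    size_t len = strlen(text) + 1;
    struct node_impl *node = malloc(sizeof *node);
    char *copy = malloc(len);
    if (node == NULL || copy == NULL) {
        free(node);
        free(copy);
        doc->last_error = 1;     /* error state is kept on the object itself */
        return NULL;
    }
    memcpy(copy, text, len);
    node->text = copy;
    node->next = doc->nodes;
    doc->nodes = node;
    return node;                 /* borrowed pointer: the caller must not free it */
}

const char *ynode_text(ynode self) { return ((struct node_impl *)self)->text; }

int ydoc_last_error(ydoc self) { return ((struct doc_impl *)self)->last_error; }

/* Freeing the document frees every node it handed out. */
void ydoc_free(ydoc self) {
    struct doc_impl *doc = self;
    if (doc == NULL) return;
    struct node_impl *node = doc->nodes;
    while (node != NULL) {
        struct node_impl *next = node->next;
        free(node->text);
        free(node);
        node = next;
    }
    free(doc);
}
```

A COM or scripting wrapper would then hold one ydoc as a member and forward each method to the matching C function, with no reference counting on individual nodes.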
From: Brian I. <in...@tt...> - 2001-12-03 19:18:04
|
On 03/12/01 11:59 -0500, Clark C . Evans wrote: > Ok. I've begun to think a bit about libyaml... first, > the implementation should be done in C and SWIG should > be used up-front to provide Java, Perl, and Python bindings. As the author of the world's best extension software for Perl, I can pretty much guarantee you that I won't be using SWIG anytime soon. :) Cheers, Brian |
From: Clark C . E. <cc...@cl...> - 2001-12-03 20:02:25
|
Right. And looking at the Python bindings, it's probably a bit more straightforward to make a native Python wrapper. Ok. Thanks for the heads-up. Best, Clark On Mon, Dec 03, 2001 at 11:18:00AM -0800, Brian Ingerson wrote: | On 03/12/01 11:59 -0500, Clark C . Evans wrote: | > Ok. I've begun to think a bit about libyaml... first, | > the implementation should be done in C and SWIG should | > be used up-front to provide Java, Perl, and Python bindings. | | As the author of the world's best extension software for Perl, I can pretty | much guarantee you that I won't be using SWIG anytime soon. :) | | Cheers, Brian -- Clark C. Evans Axista, Inc. http:\\axista.com 800.926.5525 Collaborative Software for Project Management |
From: Clark C . E. <cc...@cl...> - 2001-12-03 20:05:35
|
| As the author of the world's best extension software for Perl, I can pretty | much guarantee you that I won't be using SWIG anytime soon. :) Brian, How do Perl extensions handle: 1. Memory management 2. Object life-time 3. Exceptions Python does it this way: 1. For memory management, you have specific Python allocators. 2. For exceptions, a NULL is returned and then an errno-like mechanism must be populated with the error (if there is one). 3. Object life-time is managed through reference counts. Overall, if this is how Perl's native interface works, then perhaps this is the best structure to build libyaml with. Best, Clark |
From: Neil W. <neilw@ActiveState.com> - 2001-12-03 20:33:11
|
> How do perl extensions handle: > > 1. Memory management Perl provides its own memory allocation macros. Read all about them in the perlapi manpage (or the bottom of the perlguts manpage for Perl < 5.6.0). void Newz(int id, void *ptr, int nitems, type); 'id' is just an integer. The idea is you just pick some constant and use it everywhere in your extension. I've never seen this used for anything. 'ptr' will be initialized to the allocated memory. 'nitems' tells perl how many 'type's it should allocate room for. char **lines; Newz(1352, lines, 20, char*); This example will make 'lines' point at an array of 20 char *s. void Safefree(void *ptr); The corresponding free() macro. I believe Safefree is NULL-safe. > 2. Object life-time Perl uses reference counting, just like Python. > 3. Exceptions From XS, you have to use the G_EVAL flag to trap the exception. After every call or eval, you have to check the "$@" variable, just as you would in a Perl script. Perl's call() macros don't return any special flag to signify that an error occurred - you have to check for it yourself. See the perlcall manpage for more information. |
From: Clark C . E. <cc...@cl...> - 2001-12-03 20:53:34
|
Thanks Neil. Looks like I left two questions out... How does Perl handle...

A. Input/output stream encapsulation. I assume that the Perl binding has specific input/output functions that should be used instead of printf? I'm not sure about Python here... but I assume so.
B. How does Perl handle multiple threads? I don't know how Python's native interface does this, so I'll have to do some research here.

Given the thoughts thus far, here are some libyaml "requirements":

0. Any input/output must be done by a user-defined method (interface). A default input/output method is provided for standard C usage.
1. All heap memory must be allocated by a user-defined method. A default user method is provided for standard C usage using malloc/free.
2. In those cases where data can be strictly owned by the consumer or producer, standard C memory ownership conventions will be used. (aka the owner is the one who allocates the memory -- this is useful for memory allocated on the stack). In all other cases, a hook for a user-defined reference counting mechanism for resident objects must be supported. A default mechanism is provided for standard C usage.
3. For exception handling, an out-of-band error checking mechanism will be used. If an error occurs, the function should return NULL and set the error condition flag and message. Users of libyaml should check an error code on function return values of NULL.
4. Most YAML methods will take a "connection object" handle. This can be used by the provider/consumer for specific data (such as a file or database handle). The connection object will contain the error checking mechanism. The connection object will not be assumed thread safe... i.e., each thread should use its own connection objects.

Further thoughts? Thank you both for helping me to flesh out the requirements of the interface before I go off and code something which isn't cool. Best, Clark |
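Requirements 0 and 1 (host-supplied I/O and allocation, with defaults for plain C) could be expressed as a small table of hooks. The sketch below is hypothetical throughout: yaml_hooks, yaml_default_hooks, emit_pair, and the capture helpers are invented for illustration, and a single shared user pointer is a simplification. It shows a library routine that never calls malloc or stdio directly.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook table; the host language binding fills this in. */
typedef struct yaml_hooks {
    void  *(*alloc)(void *user, size_t size);
    void   (*release)(void *user, void *ptr);
    size_t (*write)(void *user, const char *data, size_t len);
    void   *user;   /* passed back to every hook */
} yaml_hooks;

/* Defaults for standard C usage: malloc/free and stdout. */
static void *default_alloc(void *user, size_t size) { (void)user; return malloc(size); }
static void default_release(void *user, void *ptr) { (void)user; free(ptr); }
static size_t default_write(void *user, const char *data, size_t len) {
    (void)user;
    return fwrite(data, 1, len, stdout);
}

yaml_hooks yaml_default_hooks(void) {
    yaml_hooks h = { default_alloc, default_release, default_write, NULL };
    return h;
}

/* Library-side routine: emits "key: value\n" using only the hooks.
 * Returns the number of bytes written, or 0 if allocation failed. */
size_t emit_pair(const yaml_hooks *h, const char *key, const char *value) {
    size_t len = strlen(key) + 2 + strlen(value) + 1;
    char *line = h->alloc(h->user, len + 1);
    if (line == NULL) return 0;
    snprintf(line, len + 1, "%s: %s\n", key, value);
    size_t written = h->write(h->user, line, len);
    h->release(h->user, line);
    return written;
}

/* Example host: captures output in a buffer and tracks live allocations. */
typedef struct capture { char text[128]; size_t used; int live_allocs; } capture;

static void *capture_alloc(void *user, size_t size) {
    capture *c = user;
    c->live_allocs++;
    return malloc(size);
}
static void capture_release(void *user, void *ptr) {
    capture *c = user;
    c->live_allocs--;
    free(ptr);
}
static size_t capture_write(void *user, const char *data, size_t len) {
    capture *c = user;
    if (c->used + len >= sizeof c->text) return 0;   /* buffer full */
    memcpy(c->text + c->used, data, len);
    c->used += len;
    c->text[c->used] = '\0';
    return len;
}

yaml_hooks capture_hooks(capture *c) {
    yaml_hooks h = { capture_alloc, capture_release, capture_write, c };
    return h;
}
```

A Perl or Python binding would supply hooks backed by its own allocator and I/O layer in the same way the capture host does here, while a plain C program would call yaml_default_hooks().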
From: Neil W. <neilw@ActiveState.com> - 2001-12-03 23:48:34
|
Clark C . Evans [03/12/01 16:05 -0500]: > Thanks Neil. My pleasure. > Looks like I left two questions out... > How does perl handle... > > A. input/output stream encapsulation. I > assume that the perl binding has specific > input/output functions that should be used > instead of printf? I'm not sure about > Python here... but I assume so. Good assumption. Perl has an internal IO abstraction interface documented in the perlapio manpage. Before Perl 5.7.0, these were just aliases to the stdio functions, but after Perl 5.7.0 they are real functions. Basically, you use PerlIO* instead of FILE*, and PerlIO_<function> instead of f<function>. PerlIO_printf() <=> fprintf(), etc. PerlIO *f = PerlIO_open("/tmp/foo", "w"); PerlIO_printf(f, "Hello, %s\n", "World"); PerlIO_close(f); > B. How does perl handle multiple threads? Well, you can't currently use multiple threads from pure Perl -- you have to do it from XS. Basically, if your Perl was compiled with threading, all of the Perl API macros magically grab the PerlInterpreter* object from thread-local storage. Each thread always has its own copy of the Perl Interpreter, so shared data is a PITA (but possible, I think). Basically you can just write standard XS code, and as long as you don't use any static variables and you make sure everything's reentrant, moving to a threaded Perl should just work. But you won't have to worry about it, since there are _no_ threaded Perls out there. None. People who want threads use IPC (or Inline::Java) instead. Later, Neil |