[opendemo-devel] SAX vs. proprietary XML api
From: Conor D. <co...@ma...> - 2000-08-08 21:04:24
I got to thinking about the demo format for OpenDemo. I can see two different ways to go about reading/writing an odxml document:

a) Read/write plain-text XML using the SAX interface (for reading, at least). The wbxml format could be used to shrink the XML file a bit. This approach has the benefit of using standard, established APIs, so utility writers might have an easier time developing their programs.

b) Use a proprietary API to write text/binary XML files. This would expand on the wbxml concept. In addition to storing tags in binary format, the actual content (such as configstring indices and vectors) could be stored in binary, so the demo file is even smaller. Also, if we change the API a bit we could greatly improve the speed of parsing the file. Instead of doing a strcmp() for each tag and attribute name, we could compare the tag's index (supplied when opening the xml file) to the index of the tag we are looking for.

And because I can code better than I can talk, here is an example:

With SAX:

    void startElement(void *userdata, const CHAR *name, const CHAR **attrs)
    {
        ...
        if (!strcmp(name, "mytag")) {
            /* attrs is a NULL-terminated list of name/value pairs */
            for (i = 0; attrs[i] != NULL; i += 2) {
                const CHAR *aname = attrs[i];
                const CHAR *avalue = attrs[i + 1];
                if (!strcmp(aname, "my_integer_attr")) {
                    int myint = (int)strtol(avalue, NULL, 10);
                } else if (!strcmp(aname, "my_string_attr")) {
                    const CHAR *mystring = avalue;
                } else {
                    printf("Unknown attribute name: %s\n", aname);
                }
            }
        }
        ...
    }

Proprietary API:

    typedef struct {
        int id;
        const char *string;
    } odxml_tag_t;

    /*
       Helper functions GetInteger, GetFloat, and GetString can convert
       from the type in which the value was stored to the type that was
       requested. For example, calling GetInteger on an odxml_data_t
       object with stored_type == TYPE_STRING would convert the string
       data to an int with strtol().
    */
    typedef struct {
        enum {
            TYPE_INT,     /* stored in binary format in xml file */
            TYPE_FLOAT,   /* ditto */
            TYPE_STRING   /* stored in text format in xml file */
        } stored_type;
        union {
            int integer;
            float real;   /* "float" is a keyword, so it cannot name a member */
            char *string;
        } data;
    } odxml_data_t;

    void startElement(void *userdata, const odxml_tag_t *name,
                      const odxml_tag_t *attr_names,
                      const odxml_data_t *attr_values)
    {
        ...
        if (name->id == MYTAG_INDEX) {
            for (i = 0; attr_names[i].string != NULL; i++) {
                const odxml_tag_t *aname = attr_names + i;
                const odxml_data_t *avalue = attr_values + i;
                switch (aname->id) {
                case MY_INTEGER_INDEX: {
                    int myint = GetInteger(avalue);
                    break;
                }
                case MY_STRING_INDEX: {
                    const CHAR *mystring = GetString(avalue);
                    break;
                }
                default:
                    /* we can compare with strcmp() here if we wanted to */
                    printf("Unknown attribute name: %s\n", aname->string);
                }
            }
        }
    }

This piece of code could work for both binary XML files and plain-text XML files (though not as fast for plain text). Since most of the data we are storing is numbers anyway, this would help to avoid many strtol() and itoa() calls (if the file is in binary format) in addition to reducing the file size.

My big concern here is that programs such as Keygrip already take a while to load large dm2 files (which are compact), and a text-only format might make load times unbearable.

Conor Davis
ce...@pl...
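
To make the index comparison in (b) concrete, here is a minimal sketch of how tag names might be resolved to indices once, so that handlers can switch on an integer instead of calling strcmp() per element. The enum values, the tag_names table, and the functions lookup_tag_id() and describe_tag() are all hypothetical names made up for illustration; nothing here is an agreed odxml API.

```c
#include <string.h>

/* Hypothetical tag indices. In a real odxml reader these would be
   supplied (or negotiated) when the file is opened. */
enum {
    TAG_UNKNOWN = -1,
    TAG_MYTAG = 0,
    TAG_MY_INTEGER_ATTR,
    TAG_MY_STRING_ATTR,
    TAG_COUNT
};

/* Name table, in the same order as the enum above. */
static const char *const tag_names[TAG_COUNT] = {
    "mytag",
    "my_integer_attr",
    "my_string_attr"
};

/* Resolve a name to its index. The strcmp() cost is paid here once
   per distinct name (e.g. while reading a plain-text file), not in
   every handler. A binary file could store the index directly and
   skip this step. */
static int lookup_tag_id(const char *name)
{
    int i;

    for (i = 0; i < TAG_COUNT; i++) {
        if (strcmp(name, tag_names[i]) == 0)
            return i;
    }
    return TAG_UNKNOWN;
}

/* Handler-side dispatch: a plain integer switch, no string compares. */
static const char *describe_tag(int id)
{
    switch (id) {
    case TAG_MYTAG:
        return "element";
    case TAG_MY_INTEGER_ATTR:
        return "integer attribute";
    case TAG_MY_STRING_ATTR:
        return "string attribute";
    default:
        return "unknown";
    }
}
```

With a table like this, a plain-text reader would call lookup_tag_id() as it parses each name and hand the resulting index to the handler, while a binary reader could pass the stored index straight through. That is roughly why the same handler code could serve both formats.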