From: James H. <ja...@is...> - 2003-08-07 13:33:57
|
Hi everyone.

I'm going to be doing the modelling code, and while I was getting to know the project, I thought I'd take the liberty of cleaning up names.xml into proper XML with a DTD (document type definition) so it validates. I uploaded it to the patches area ( http://sourceforge.net/tracker/index.php?func=detail&aid=784360&group_id=67048&atid=516699 ) - partly because I don't have CVS access and partly because I don't know the project well enough to make fairly major changes like that.

There are a few things I'd like to note:

1. XML is case- and whitespace-sensitive. The author of this file has done a good job here, so they obviously know this. Nonetheless I think it's worth pointing out in any introduction to XML, because most people don't expect it to be either (I guess because HTML isn't).

2. XML comments. XML comments go like this: <!-- this is a comment -->. Technically they're a little more involved than this, but everyone I know thinks that using complex comments is a bad idea. C and C++ comments are not valid.

3. Elements vs. attributes. There are no hard and fast rules on when to use attributes and when to use elements. (An element is a tag - <element>, and an attribute is a key-value pair in the element - <element key="value">.) I think it makes sense to use an attribute rather than an element only when it is a fundamental property of the element that is guaranteed to have only one value. For example, in an email XML, the <to> structure would be a bad candidate for being an attribute, as you often want to send email to more than one person. On the other hand, in a bottle-of-wine XML, the vintage would be a good candidate: <wine vintage="1986">.

Here's a rundown of the things I changed in names.xml (roughly in order from top to bottom):

1. Put a DTD in. An XML document needs a DTD before it can be validated (without one it can only be checked for well-formedness) - it serves roughly the same purpose as a C/C++ header file.

2. Put a root element in. All XML documents require exactly one root element. I decided to make this <database>, with an attribute called "type". This way we can write a tool that can operate on any of our XML files, check what type they are, and display them appropriately.

3. Separated the document out into two logical sections - a name database and country rules. There didn't seem to me to be any reason to define names inside continents. English names are English names, wherever you are - if America uses different names, then they should be called American names.

4. Removed the quantity attribute on group. I couldn't see any point to it.

Here are some things I'd like to do with names.xml, but didn't:

1. Improve the type-checking in the DTD. Most of the data isn't checked at all, and it could be - for example the <probability group="country"> could be checked to make sure country was a country that had been defined in <namegroups>.

2. Change the probability structures. At the moment they're very difficult to parse. However, as the document is just a place to put names and there are no tools that use it, this wasn't necessary.

3. Write a tool so that designers don't have to edit the XML manually. It's my opinion that the only people who ever see text files should be programmers; everyone else should have tools.

4. Add a namespace. All our XML should be put in our own namespace, eventually.

5. (And I'm an idiot for forgetting to do this) Put a version attribute in the <database> element.

If you're interested in learning more about XML and DTDs and so forth, there are good tutorials at www.w3schools.com. The parser I used to check names.xml is at http://www.stg.brown.edu/service/xmlvalid/.

james. |
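As a rough illustration of the "one tool for all our XML files" idea in the message above, here is a minimal Python sketch. Python is used only for brevity. Only the <database> root element and its "type" attribute come from the message; the sample documents, the type values "names" and "research", and the function name database_type are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents. Only <database> and its "type" attribute are
# taken from the message; the type values are invented for illustration.
NAMES_XML = '<database type="names"><namegroups/></database>'
OTHER_XML = '<database type="research"/>'


def database_type(xml_text):
    """Return the "type" attribute of the <database> root element."""
    root = ET.fromstring(xml_text)
    if root.tag != "database":
        raise ValueError(f"unexpected root element <{root.tag}>")
    kind = root.get("type")
    if kind is None:
        raise ValueError('<database> has no "type" attribute')
    return kind


if __name__ == "__main__":
    for text in (NAMES_XML, OTHER_XML):
        print(database_type(text))
```

A viewer or editor could use the returned string to choose how to display the file, and reject anything whose root element is not <database>.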
From: <red...@pr...> - 2003-08-07 20:32:17
|
Hi,

Quoting James Harlow <ja...@is...>:

> Hi everyone. I'm going to be doing the modelling code, and while I was
> getting to know the project, I thought I'd take the liberty of cleaning
> up names.xml into proper xml with a DTD (document type definition) so
> it validates.

It was me who made it, and it was supposed to be XML-like, not strict XML (because I had never done it before). The idea was to have a simple database to store the names to be used in the random name generation, in a simple fashion. However, the changes are not dramatic and I think it would complicate the parsing too much (less so if we can use a proper XML parser to do that)... So good job.

> 2. XML Comments. XML comments go like this: <!-- this is a comment -->.
> Technically they're a little more involved than this, but everyone I
> know thinks that using complex comments is a bad idea. C and C++
> comments are not valid.

Part of the XML-like decisions.

> 1. Improve the type-checking in the DTD. Most of the data isn't checked
> at all, and it could be - for example the <probability group="country">
> could be checked to make sure country was a country that had been
> defined in <namegroups>.

And how do you do that? Excuse my maybe lame questions, but I've never done XML; DB and business applications are not my strong point...

> 3. Write a tool so that designers don't have to edit the XML manually.
> It's my opinion that the only people who ever see text files should be
> programmers, everyone else should have tools.

Yes, we all agree on that, and the tools we are making come from that philosophy. For instance, the PAQ Explorer and the repository are just two of them...

> 4. Add a namespace. All our XML should be put in our own namespace,
> eventually.

Again, how?

> 5. (And I'm an idiot for forgetting to do this) Put a version attribute
> in the <database> element.

Greetings
Red Knight |
From: <red...@pr...> - 2003-08-08 15:07:33
|
> >>Hi everyone. I'm going to be doing the modelling code, and while I was
> >>getting to know the project, I thought I'd take the liberty of cleaning
> >>up names.xml into proper xml with a DTD (document type definition) so
> >>it validates.
> >It was me who made it, and it was supposed to be XML-like, not strict XML
> >(because I had never done it before). The idea was to have a simple
> >database to store the names to be used in the random name generation, in
> >a simple fashion. However, the changes are not dramatic and I think it
> >would complicate the parsing too much (less so if we can use a proper XML
> >parser to do that)... So good job.
> >
> I should stress that I wasn't in any way criticising, I'm sure it was
> just something that was just quickly laid out to get something "on
> paper" - I do that all the time.

Don't worry. It was supposed to say: "and I think it wouldn't complicate the parsing too much"; my mistake.

Greetings
Red Knight |
From: mamutas <ma...@pr...> - 2003-08-13 03:14:09
|
Hi guys,

First, make sure that when you reply to a post, the reply goes to the Xen...@li... address. I have a feeling that James has answered Red Knight's questions, but the answers did not make it to the mailing list. So I will try to answer them, and James will correct me if I am wrong, ok?

1) Type checking in XML could be enforced by using XML schemas instead of DTDs. But I guess that was not what was meant for the 'country-namegroup' link. That issue could be resolved by using references to an element.

4) Namespaces could be specified in the DTD or XML header (or again, in XML schemas).

That is what I remember so far without digging through my memory, since it has been about 2 years since I worked with XML.

In my understanding, if we are going to use proper XML for our datafiles, then we should pick up some open source XML parser. I do not have a preference so far as to whether it will be DOM or SAX; it all depends on what memory management we are going to have for those DBs.

Regards,
mamutas |
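The "references to an element" idea mentioned above corresponds to the ID and IDREF attribute types in a DTD: a validating parser reports an error when an IDREF value does not match any ID in the document. The sketch below is only an illustration under stated assumptions. The element and attribute names other than <database>, <namegroups>, <group> and <probability group="..."> are hypothetical, as is the sample data. Python's standard parser does not validate against a DTD, so the same check is done by hand here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document. The DTD declares group/@id as ID and
# probability/@group as IDREF, which is what a validating parser would
# use to check the link. The <countries> element is invented.
DOC = """<?xml version="1.0"?>
<!DOCTYPE database [
  <!ELEMENT database (namegroups, countries)>
  <!ATTLIST database type CDATA #REQUIRED>
  <!ELEMENT namegroups (group*)>
  <!ELEMENT group (#PCDATA)>
  <!ATTLIST group id ID #REQUIRED>
  <!ELEMENT countries (probability*)>
  <!ELEMENT probability (#PCDATA)>
  <!ATTLIST probability group IDREF #REQUIRED>
]>
<database type="names">
  <namegroups>
    <group id="english">John Mary</group>
    <group id="spanish">Juan Maria</group>
  </namegroups>
  <countries>
    <probability group="english">0.7</probability>
    <probability group="german">0.3</probability>
  </countries>
</database>
"""


def dangling_references(xml_text):
    """Return the probability/@group values that name no defined group."""
    root = ET.fromstring(xml_text)
    defined = {g.get("id") for g in root.iter("group")}
    referenced = {p.get("group") for p in root.iter("probability")}
    return sorted(referenced - defined)


if __name__ == "__main__":
    # "german" is referenced but never defined in <namegroups>.
    print(dangling_references(DOC))
```

With a validating parser the manual check would be unnecessary, since the IDREF declaration alone makes the undefined "german" reference a validity error.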
From: James H. <ja...@is...> - 2003-08-13 03:58:25
|
On Tue, Aug 12, 2003 at 10:05:04PM -0500, mamutas wrote:
> I got a feeling that James has answered Red Knight's questions, but the
> answers did not make to the mailing list. So, I will try to answer
> them and James will correct me if I am wrong, ok?

*ahem* :-) Sorry, my bad. I've resent it.

> 1) Type checking in XML could be enforced via using of XML schemas
> instead of DTDs. But I guess that was not meant for 'country-namegroup' link.
> This issue could be resolved by using references to an element.
> 4) Namespaces could be specified in the DTD or XML header (or again,
> in XML schemas).

It can be done with DTDs. The problem is that I didn't want to alter the XML *too* much, which precludes strict enough type-checking. But yes, in general Schemas are a powerful alternative to DTDs.

Namespaces should be specified in the top-level element you want them to apply to - in this case, the root element.

> That is what I remember so far without digging through my memory since
> it was about 2 years I worked with XML.
>
> In my understanding, if we are going to use proper XML for our
> datafiles, then we should pick up some open source XML parser. I do not have
> preferences whether it will DOM or SAX so far, it all depends on what
> memory management we are going to have for those DBs.

You're absolutely right. I've used Qt's SAX2 implementation so far, but as Qt is problematic to use on Windows I'd suggest fixing on Xerces - http://xml.apache.org/#xerces.

> mamutas

james. |
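To make the namespace and SAX points above concrete, here is a minimal sketch that declares a default namespace on the root element and reads the document through a namespace-aware SAX parser. It uses Python's standard xml.sax module rather than Qt or Xerces, purely for brevity. The namespace URI, the version value and the class and function names are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Hypothetical namespace URI; the thread does not name one.
NS = "http://example.org/xenocide/names"

# The default namespace is declared once, on the root element, and is
# inherited by every unprefixed child element.
DOC = (
    f'<database xmlns="{NS}" type="names" version="1">'
    '<namegroups><group id="english"/></namegroups>'
    "</database>"
)


class ElementCollector(ContentHandler):
    """Record the (namespace URI, local name) pair of every element."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        self.seen.append((uri, local))


def collect(xml_text):
    parser = xml.sax.make_parser()
    # Namespace processing must be switched on for startElementNS to fire.
    parser.setFeature(feature_namespaces, True)
    handler = ElementCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.seen


if __name__ == "__main__":
    for uri, local in collect(DOC):
        print(uri, local)
```

A SAX handler like this streams through the file without building a tree, which is the memory trade-off against DOM raised earlier in the thread.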