Re: [Linkbat-devel] Re: question on moreinfo.data (Everyone please read)
From: Shanta M. <sh...@fo...> - 2002-11-20 20:26:21
P.S. I have already added a Category field in anticipation of Jim's idea of
giving the user more questions in a category if they make errors.

On Wed, 2002-11-20 at 12:18, Shanta McBain wrote:
> On Wed, 2002-11-20 at 12:05, Luan Luu wrote:
> > Hi All,
> >
> > The separator ":" is good for now, in my opinion, since the other
> > characters will also appear in some of the other data files.
> >
> > The Reason tag should be there and stay empty, not removed, because when
> > you extract the XML to the DB, I guess we expect a REASON tag to be there.
> >
> > Is it possible to modify the tyk.data file a little bit? As you
> > mentioned, there are a couple of characters in there which cannot be
> > converted into valid XML, such as the lone '&' on line 45, which needs to
> > change to '&amp;'. The other '&' characters in the data file were already
> > in converted form, so they are all right. Only the one mentioned above
> > causes a problem. Also, I tried to convert all the < and > into '&lt;'
> > and '&gt;' respectively. Is that OK?
> >
>
> This file is already converted to MySQL.
> Add whatever field you like.
> To see it, go to my devel site:
>
> http://webcthelpdesk.com/cgi-bin/Linkbat/linkbat.cgi?site=Linkbat
> Create an account and log in to see the link to the table.
>
> >
> > You mentioned that the Topics and TopicRef tags should be inserted into
> > all the data files? Where do you think they should go?
> >
> > For the new dyk.data.NEW you sent me, it would produce this:
> >
> > <KnowledgeUnit>
> >   <Attributes>
> >     <Type>Concept</Type>
> >     <Text>Linux can be started from any partition.</Text>
> >   </Attributes>
> >   <Pages>
> >     <PageRef primary="true">106</PageRef>
> >   </Pages>
> >   <Questions>
> >
> >   </Questions>
> > </KnowledgeUnit>
> >
> > Where should the Topics and TopicRef go?
> >
> > Thanks.
> >
> > Best Regards
> > -Luu
> >
> > >From: James Mohr <lin...@ji...>
> > >Reply-To: lin...@ji...
> > >To: lin...@li...
> > >Subject: [Linkbat-devel] Re: question on moreinfo.data (Everyone please
> > >read)
> > >Date: Tue, 19 Nov 2002 11:03:51 +0100
> > >
> > >(Note this was sent to the list.)
> > >
> > >Hey everyone, the conversion is almost done (well, at least the code for
> > >it). Thanks Luu! However, there are some important questions to answer
> > >NOW, before we continue. PLEASE, please, please read this and give me
> > >your input.
> > >
> > >On Tuesday 19 November 2002 00:31, Luan Luu wrote:
> > > > According to moreinfo.data, the format is:
> > > > ID#:TYPE:DESCRIPTION:LOCATION
> > > >
> > > > The XML is:
> > > >
> > > > <KnowledgeUnit>
> > > >   <Attributes>
> > > >     <Type sub-type="[TYPE]" location="[LOCATION]">MoreInfo</Type>
> > > >     <Text>[DESCRIPTION]</Text>
> > > >   </Attributes>
> > > > </KnowledgeUnit>
> > > >
> > > > Regarding the brackets: are they pointers to the type, location,
> > > > and description, like that?
> > >
> > >Perfect. The only question is whether we should actually do it that way
> > >or not. That is, should the sub-type and location be attributes within
> > >the <Type> tag or should they be separate tags, i.e. <SubType>
> > ><Location>?
> > >
> > >My gut feeling is that they should be attributes within the <Type> tag.
> > >They are not necessarily attributes of the KU, but rather provide
> > >additional info for the type.
> > >
> > >Comments anyone? This needs to be answered before we continue.
> > >
> > > > Should the URL be an absolute path with the http in front?
> > >
> > >Yes. You will note that in the data file, they just begin with // and
> > >not http://. This was because I made an unwise decision to use the colon
> > >(:) as the field separator. However, the colon appears frequently in
> > >Linux (especially with URLs), so it became a problem.
> > >
> > >I was going to change to something like a pipe (|), which occurs less
> > >frequently.
> > >Regardless, we will have a problem, since the odds are that whatever
> > >character we use will appear in text somewhere in the data.
> > >
> > >Obviously this is not a problem when we import directly into a database.
> > >As I said to Shanta, I don't have a problem with importing the files
> > >directly into a database for the first release. However, eventually I
> > >want the system to be independent of the data source (CSV, database) and
> > >independent of the presentation (eXtropia, other portal). Therefore, we
> > >need to consider a new separator.
> > >
> > >Suggestions?
> > >
> > > > Inside the tyk XML question tags, there is a TopicRef tag, which is
> > > > the reference to the PAGE_ID. So, do you want to put the Page_id in
> > > > the TopicRef tags or the actual topic name from page.data?
> > >
> > >The TopicRef is a reference to a topic, such as Administration,
> > >Networking, Security, etc. This is just text. There are PageRef tags and
> > >these contain the page *name* from page.data. However, I am no longer
> > >sure we should do it that way (see below). This needs to be changed in
> > >tykToXML.pl.
> > >
> > >Keep in mind that the questions will exist only within a KU. This KU
> > >will have a primary page, so we automatically have the primary page for
> > >the question. However, the question could reference multiple topics.
> > >
> > >We have a problem with some of the questions where an angle bracket is
> > >one of the answers ("What symbol is used to "pipe" two commands
> > >together?"). This means we have two angle brackets together (<< or >>),
> > >which could confuse the XML parser. There are only a few and we can
> > >change them by hand. We just need to be aware of them.
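The escaping concerns raised above (a lone '&' next to entries that are already converted, and angle brackets appearing as answers) can be handled in code rather than by hand. What follows is only a minimal sketch in Python, not the project's tykToXML.pl; the helper names `normalize_entities` and `text_element` are hypothetical. It assumes the data never needs a literal "&amp;" to survive as five visible characters, because already-converted entities are first decoded and then re-encoded so nothing is escaped twice.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET


def normalize_entities(text: str) -> str:
    """Return text with &, < and > escaped exactly once.

    unescape() first decodes any &amp;, &lt; and &gt; that are already
    present, so a file mixing raw and converted characters does not end
    up double-escaped (assumption: no literal "&amp;" is ever intended).
    """
    return escape(unescape(text))


def text_element(tag: str, value: str) -> str:
    """Wrap value in <tag>...</tag> (hypothetical helper for illustration)."""
    return f"<{tag}>{normalize_entities(value)}</{tag}>"


mixed = "cut &amp; paste & run"           # one converted '&', one lone '&'
print(normalize_entities(mixed))          # cut &amp; paste &amp; run

answer = text_element("Text", ">>")       # an answer made of angle brackets
print(answer)                             # <Text>&gt;&gt;</Text>
print(ET.fromstring(answer).text)         # >>
```

The last line parses the generated element again to show that the original characters come back unchanged, which is the property the XML-to-database step would rely on.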
> > >
> > >Also watch the format of the answers, even for the T/F questions:
> > >
> > >  <Correct>
> > >    <Text>T</Text>
> > >    <Reason>why this answer is correct</Reason>
> > >  </Correct>
> > >
> > >not just
> > >
> > ><Correct>T</Correct>
> > >
> > >I think that as much as possible it is better to have the same format
> > >for all types of questions. More than likely, "fill in the blank" type
> > >questions won't have an <Incorrect> answer, but I still want to have a
> > ><Reason> tag to provide an explanation of why the answer is correct.
> > >
> > >However, since we do not yet have the reasons, I think you should simply
> > >leave it as <Reason></Reason>. I think if we leave the text as "put your
> > >reason why is correct/incorrect. " we might forget to change it, and
> > >then displaying that text would look silly. If the <Reason> tag is
> > >empty, we can just ignore it. OR we could simply not include the
> > ><Reason> tag at this point. What do you think?
> > >
> > >With the Glossary KUs, please create a <GlossaryTerms> container with
> > >the GlossaryRefs to the other terms. These are the numbers at the end of
> > >each line in glossary.data. They are the ID numbers of the other
> > >glossary terms. Therefore, instead of reading in each line from
> > >glossary.data and processing it, you will need to read it all at once
> > >and put it into an array, then parse that array.
> > >
> > >EVERYONE PLEASE READ AND COMMENT:
> > >Currently the glossary.idx file contains a list of pages that contain
> > >each glossary item. This is created by an external script and is **not**
> > >done when the glossary item is loaded. That would take way too much
> > >time. The question is whether we should have PageRefs within the
> > >Glossary KU.
> > >
> > >Personally, I do not think so. We can create the index of
> > >glossary-page_id along with everything else.
> > >If we include the page ID/page name within the Glossary KU and add a
> > >new glossary item, then we would need to go looking for all of the pages
> > >that have that glossary term. Obviously we need to search for the pages
> > >to add the <Glossary> tag within the page. However, I just see it as
> > >unnecessary work to add PageRefs within the Glossary KU, since we can
> > >create the index by other, more efficient means.
> > >
> > >
> > >EVERYONE PLEASE READ AND COMMENT:
> > >It just hit me that we might be building a trap for ourselves. If we use
> > >the full path instead of the ID number, we will have problems if we ever
> > >rename the file, move it to a different directory, etc. I **expect** to
> > >be moving files to different directories real soon! I want to change the
> > >order of the files and their locations. As we get more content, I can
> > >imagine that we change locations again.
> > >
> > >I see three options:
> > >
> > >Use the full path as the PageRef:
> > >- Easy to find/insert the reference we want.
> > >- Tracking down the actual page from a KU is easy.
> > >- In the display code we don't need to do a look-up to display the page.
> > >- PROBLEM: moving/renaming the file. Since the XML files are text, we
> > >  can use sed/perl to make a global change.
> > >
> > >Use just the page name without the path:
> > >- Once the file name is defined, it is less likely that the page name
> > >  will be changed.
> > >- PROBLEM: We must have *completely unique* page names. We cannot have a
> > >  "Known Problems" in the Network section AND in the Printing section.
> > >  They must be named "Known Problems-Network" and "Known
> > >  Problems-Printing" (or something like that).
> > >
> > >Use the ID as the PageRef:
> > >- Remains constant, independent of the actual name of the file.
> > >- PROBLEM: Need to do a look-up to find the correct file.
> > >  However, since page.data is currently sorted by chapter/section, I
> > >  have found that it is not all that hard. For the existing moreinfo and
> > >  DYK entries, one PageRef can be inserted automatically. Still, if we
> > >  want to include more PageRefs, we will have to do it by hand and look
> > >  up the ID, but we will have to look it up anyway to get the full path.
> > >  So whether we look up and insert the page name or the page ID, it's
> > >  the same amount of work.
> > >
> > >I still like the idea of using the full path and NOT an ID number. You
> > >need to do a look-up anyway to find the ID or the correct text for the
> > >full path. Tracking down the original page from the XML file is
> > >straightforward. Making a change would be a simple matter of running a
> > >sed/perl script. We could even write it in advance and it becomes a part
> > >of our "utility" package:
> > >
> > >rename_page.pl [-f filename] original_name new_name
> > >
> > >It then scans all PageRefs in the named file and changes them
> > >accordingly.
> > >
> > >Using an ID number bothers me because it makes the construct dependent
> > >on an external file, or we are imposing a structure on it unnecessarily.
> > >Therefore, the knowledge base is not self-contained.
> > >
> > >EVERYONE PLEASE READ AND COMMENT:
> > >We have a similar problem with the MoreInfoRefs for the Page KUs.
> > >Currently they are referenced by their ID number and Luu did the same
> > >thing in her code. However, once again, I am not happy with the idea of
> > >using ID numbers instead of text. So, do we reference the text of the
> > >MoreInfo KUs??
> > >
> > >I have pretty much decided to go through the existing data files and add
> > >up to three topics. I will add these to the **end** of each line for all
> > >of the data files. So, Luu, could you change the code to create a
> > ><Topics> container and <TopicRefs> for all of the data files?
> > >Note that I will probably not list three topics for everything.
> > >Therefore, the code will need to be smart enough to recognize this.
> > >Since you are probably asleep already, I can work on it today and send
> > >you at least one file with the topics, so you will see the format.
> > >
> > >Regards,
> > >
> > >jimmo
> > >--
> > >---------------------------------------
> > >"Be more concerned with your character than with your reputation. Your
> > >character is what you really are while your reputation is merely what
> > >others think you are." -- John Wooden
> > >---------------------------------------
> > >Be sure to visit the Linux Tutorial: http://www.linux-tutorial.info
> > >
> > >_______________________________________________
> > >Linkbat-devel mailing list
> > >Lin...@li...
> > >https://lists.sourceforge.net/lists/listinfo/linkbat-devel

--
Shanta McBain <sh...@fo...>
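
The moreinfo.data conversion discussed in the thread (ID#:TYPE:DESCRIPTION:LOCATION mapped to a KnowledgeUnit with sub-type and location as attributes of <Type>) can be sketched as follows. This is an illustration in Python under stated assumptions, not the project's Perl converter: the function name `moreinfo_to_ku`, the sample record, and the example URL are all hypothetical. Limiting the split to three separators lets LOCATION carry a full http:// URL, which addresses only part of the separator problem, since a colon inside DESCRIPTION would still break the record.

```python
import xml.etree.ElementTree as ET


def moreinfo_to_ku(line: str, sep: str = ":") -> ET.Element:
    """Convert one 'ID#:TYPE:DESCRIPTION:LOCATION' record to a KnowledgeUnit.

    Hypothetical helper. The ID is parsed but not emitted, mirroring the
    XML shown in the thread.
    """
    # maxsplit=3: any further separators stay inside LOCATION, so a URL
    # such as http://host/path is kept whole.
    fields = line.rstrip("\n").split(sep, 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    _record_id, subtype, description, location = fields

    ku = ET.Element("KnowledgeUnit")
    attributes = ET.SubElement(ku, "Attributes")
    type_el = ET.SubElement(
        attributes, "Type", {"sub-type": subtype, "location": location}
    )
    type_el.text = "MoreInfo"
    # ElementTree escapes &, < and > in text and attributes on output.
    ET.SubElement(attributes, "Text").text = description
    return ku


# Made-up record for illustration only.
record = "12:URL:Kernel docs & HOWTOs:http://www.example.org/docs"
print(ET.tostring(moreinfo_to_ku(record), encoding="unicode"))
```

Building the tree with a library instead of concatenating strings means special characters in the description are escaped on output without extra hand edits to the data file.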