I think we have reached the point where we need to be talking about the actual
data/table structure in CSV text or a database. For now let's just work on
the database model, to make things simpler. However, we all need to keep in
mind that I eventually want to have the system work with CSV text files.
Rama has volunteered to work on the code to convert the XML files into
something the database can handle. Shanta is working on the "presentation
layer", so obviously you two will be working closely together. When the code
is done for the CSV->XML conversion, I would like Luu to start working on the
code to validate the XML files. That is, making sure the right tags are
there, tags match, etc.
Following are some thoughts and ideas about how the tables can be created. I
have no particular affection for any of these ideas, so feel free to tell me
The first thing we need to decided for the data model is how to store the KUs
themselves. Since different KU types have different attributes, I do not see
a way of storing them all in a single table. Instead, I would see one table
for each KU type and then a central table that contains the unique KU ID and
then the KU type. When making a cross reference from one KU to another, we
would then go through this central table.
Next is the inter-relationship between the KUs, which is a core aspect of
linkbat. There is a potential that each KU (regardless of type) can reference
every other type of KU. However, I cannot really imagine an index for each
relationship. For example, the Concept KU has an index listing all of the
Concept KUs and with pointers to all of the referenced Glossary KUs. Then
another index with the MoreInfo KUs. In my mind that would mean too many
tables. If there are five KU types and each has a index for the relationship
to the others, then we would have 20 tables (5x4). With 6 KU tables, 30
tables (6x5). Is my math right?
However, the question is whether 20 or 30 indexes is "too many". Does having
seperate indexes for each relationship provide any advantages? Quicker
access? If so, does it compensate for the extra work to manage 20-30 tables??
Alternatively, I could imagine one index that keeps tracks of the type of KUs,
UNIQUE_KU_ID - The ID of the current KU
REFERENCED_TYPE - The type of KU that this KU is referencing
REFERENCED_KU_ID - The ID of that KU that is being refereced.
A query could be done sorted by UNIQUE_KU_ID and then REFERENCED_TYPE. Once
we have that, the code to present a list of referenced KUs grouped by KU type
would be easy. Currently we are only dealing with a few thousands KUs.
However, what about when we get to tens of thousands?? Hopefully MySQL should
be in a position to deal with that **few** records. So, I would not see it as
a problem to have all of the KUs referenced in a single table like that.
Is the REFERENCED_TYPE even necessary here? We obviously need a table that
contains the attributes of each KU and a unique ID, so the REFERENCED_KU_ID
can be used to get the type through the central table that I mentioned above
and we have the KU type. On the other hand, to get a list of all referenced
KUs in the correct order we could have a query like this:
select REFERENCED_KU_ID from WHATEVER_TABLE where UNIQUE_KU_ID =
'CURRENT_KU_ID" sort by REFERENCED_TYPE
This is a single table access and is obviously faster than mutliple access.
Therefore, I think this would more than make up for having an extra field in
Since we can expect that new data files will have mixed KU types, I am
thinking that a single index which also contains the KU type would be more
efficient. Each time we read a new KU, it is inserted into that single index,
so we don't have to deal with mulitple indexes. Comments?
Although not KUs in the true sense, I think we need an index by topic and
skill level (if we decide to implement that). This would be used in
generateing the technical FAQs or lists of all info on a particular topic.
What about the reverse references. For example a list of all Concept KUs that
reference a particular Glossary KU. This is not the same as the list of
Glossary KUs that this Concept KU references. We click on a link that
displays a Glossary KU, with a link to all the KUs that it references. I
would also like links to all of the KUs that reference this glossary term.
Although it seems logical that each Concept KU that references a particular
Glossary KU should also be referenced by that Glossary KU, it is likely that
we will forget the reverse references when create the XML files. Think about
the glossary terms on the content pages. Each time we add a new page we have
to update the referenced glossary KUs to point back to this page.
I'm sure there is a lot more, but I should get this off to start the
"Be more concerned with your character than with your reputation. Your
character is what you really are while your reputation is merely what others
think you are." -- John Wooden
Be sure to visit the Linux Tutorial: http://www.linux-tutorial.info