Thread: [Linkbat-devel] Linkbat Dataflow
From: James M. <lin...@ji...> - 2002-11-24 15:32:28
Attachments:
dataflow.png
Hi All! I have included a diagram of how I envision the dataflow within linkbat. It is also available online (linkbat.sourceforge.net/dataflow.html).

At the bottom we have the data management layer, which, as its name implies, is where the management of the data occurs. That is, new data, new KUs, new KU types, changes, etc. are inserted into the system at this level. Above that is the data access layer, which provides the data to the presentation layer. Between the data management and data access layers are the Perl routines that read the XML files and convert the data into the appropriate files. Note that there are two lines from the XML files into the data access layer, one into CSV and one into SQL. My ultimate goal is to give the user a choice as to which format the data is stored in.

My perception is that it will be unavoidable to store the data in intermediate files during the conversion, or at the very least the data will be in a form in memory that allows for easy conversion to CSV. That is, the KUs and reference indexes will probably be stored in arrays, and it would be a simple matter of looping through the arrays and writing the data out to text files. However, instead of writing the data to text files, it could simply be inserted into the database tables. Therefore, I could imagine a command line option that determines where the data is written: CSV file or DB.

One important aspect of the code is adding data to the system. Once it is in the database, adding new KUs becomes a key issue. Each KU will need to have a unique ID, which I think should be added during the conversion to CSV/DB and not stored within the XML file. This ID will be used for the references between KUs. I see a problem there when we add data to the system, since the new data will not have any knowledge of the KU IDs. I see one solution as a table that contains the relationship between KU text and KU ID, which probably should also contain the KU type. So each row looks like this:

KUID:KU_TEXT_FROM_XML_FILE:KU_TYPE

When a new KU is added we can reference existing KUs using this table. However, how do we add the references from existing KUs to the new ones? We cannot simply add them to the existing database, as they will not be in the original XML files. Therefore, I see the only way to add KUs is to redo the **complete** data import each time we add a new KU. Comments? Ideas?

In the transition from data access to presentation I see an "issue" with the different data sources. I would like the code that the presentation layer uses to be **independent** of the data source. That is, the presentation layer makes a request of the data access layer to deliver a specific piece or set of data, and the data access layer takes care of the rest. For example, the presentation layer would call functions similar to this:

get_single_content_ku(KU_ID)
get_all_moreinfo_ku(TOPIC)
get_all_related_moreinfo_ku(KU_ID)

In each case the presentation layer wants a piece or set of data and asks the data access layer to deliver it. It is then the responsibility of the data access layer to determine the data source and the proper method to access it in order to deliver the requested data. By standardizing the interface, I see it being a lot easier to have different delivery methods. Each can load the necessary Perl modules (for example) and simply call the appropriate functions. How the delivery process interacts with the web server depends on the delivery method and the web server. Comments? Ideas?
Regards, jimmo

--
---------------------------------------
"Be more concerned with your character than with your reputation. Your character is what you really are while your reputation is merely what others think you are." -- John Wooden
---------------------------------------
Be sure to visit the Linux Tutorial: http://www.linux-tutorial.info
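[To make the standardized interface described in the message above concrete, here is a minimal Perl sketch of how the data access layer could dispatch between a CSV and an SQL back end. The package name, file and table names, column layout, and the LINKBAT_SOURCE switch are illustrative assumptions, not existing Linkbat code.]

    package LinkbatData;

    use strict;
    use warnings;
    use Text::CSV;
    use DBI;

    # Which back end to use -- in the real system this could come from the
    # command line option mentioned above.  Either 'csv' or 'db'.
    my $source = $ENV{LINKBAT_SOURCE} || 'csv';

    # Presentation-layer entry point: return one content KU by its ID,
    # independent of where the data actually lives.
    sub get_single_content_ku {
        my ($ku_id) = @_;
        return $source eq 'db' ? _ku_from_db($ku_id) : _ku_from_csv($ku_id);
    }

    # CSV back end: scan the file written by the conversion step.
    sub _ku_from_csv {
        my ($ku_id) = @_;
        my $csv = Text::CSV->new({ binary => 1 });
        open my $fh, '<', 'content_ku.csv' or die "content_ku.csv: $!";
        while ( my $row = $csv->getline($fh) ) {
            # Assumed column layout: KUID, KU_TEXT, KU_TYPE
            return { id => $row->[0], text => $row->[1], type => $row->[2] }
                if $row->[0] eq $ku_id;
        }
        return undef;
    }

    # SQL back end: the same request answered from a database table.
    sub _ku_from_db {
        my ($ku_id) = @_;
        my $dbh = DBI->connect( 'dbi:mysql:linkbat', 'linkbat', '',
                                { RaiseError => 1 } );
        return $dbh->selectrow_hashref(
            'SELECT kuid AS id, ku_text AS text, ku_type AS type
               FROM content_ku WHERE kuid = ?',
            undef, $ku_id );
    }

    1;

[The presentation layer would then only need to use LinkbatData and call LinkbatData::get_single_content_ku($ku_id); switching from CSV to SQL would not touch the presentation code at all.]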
From: Hal F. G. <hgo...@pr...> - 2002-11-25 03:03:38
Attachments:
Hal F Gottfried (hgottfried@protechpts.com).vcf
I have been on the road, I'll read through all the emails this week. What are my tasks?
From: James M. <lin...@ji...> - 2002-11-26 20:51:02
Hi All! Sorry, I'm doing it again. I forgot to keep a discussion with Hal Gottfried on the list. Hal and I have been discussing the XML implementation in the data files, and he has said he can write the DTD once we get the model finalized.

So, to sum up our discussions, we both seem to agree that accessing the data will be sped up by a conversion to either CSV or a database. Hal made a couple of comments that made me think we could actually create a lot of the pages in advance rather than at run time, thus speeding things up even more. Since the data is static once copied to the server, there is no reason to parse most of it on the fly. For example:

- Content pages. They are parsed at run time to create the links and popups for the glossary, links to other tutorial pages, etc.
- Glossary pages. Created completely at run time and include links to other glossary terms and to the pages that reference the glossary term.
- MoreInfo pages. Created completely at run time and include links to other sites.

I see no reason why these cannot be created in advance. We would just need to make sure that whatever display mechanism we use knows how to load the correct page. To me that's a heck of a lot easier than creating the pages on the fly.

This brings up a **huge** question, which is particularly of interest to Shanta. If we create all of the pages in advance, there does not seem to be a pressing need for an SQL database. The reason for it was efficiency during data access. So, the question is whether we gain anything by creating one. In my mind, it doesn't matter if the pre-processing takes longer, as it would with an SQL DB. How do the rest of you see this?

Hal suggested including the PageID as an attribute of the <Type> tag, and we were both thinking that we could make the sub-type an attribute of the <Type> tag as well:

<Type id="45">Page</Type>
<Type url="http://www.linux.org">MoreInfo</Type>
<Type subtype="verb">Glossary</Type>

This makes sense, as the sub-type is a characteristic of the type and not of the actual KU. For the questions, we have the issue of an empty <Reason> tag as a reminder to add something. It might be simpler to have the text of the Reason as an attribute of the tag, like this:

<reason answer="why this is correct" />

If it's empty we don't carry around the extra baggage. Did I forget anything important, Hal?

Regards, jimmo

--
---------------------------------------
"Be more concerned with your character than with your reputation. Your character is what you really are while your reputation is merely what others think you are." -- John Wooden
---------------------------------------
Be sure to visit the Linux Tutorial: http://www.linux-tutorial.info
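[To make the pre-generation idea in the message above concrete, here is a minimal Perl sketch that writes static glossary pages ahead of time. The directory layout, field names, and the inline sample record are placeholders standing in for whatever the XML conversion actually delivers; none of this is existing Linkbat code.]

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Path qw(make_path);

    # Placeholder record standing in for a glossary KU delivered by the
    # XML conversion: an ID, the term, its definition, and the IDs of
    # content pages that reference the term.
    my @glossary = (
        {   id            => 7,
            term          => 'inode',
            definition    => 'The on-disk structure that describes a file.',
            referenced_by => [ 12, 45 ],
        },
    );

    make_path('html/glossary');

    for my $ku (@glossary) {
        my $file = "html/glossary/$ku->{id}.html";
        open my $out, '>', $file or die "$file: $!";

        print $out "<html><head><title>$ku->{term}</title></head><body>\n";
        print $out "<h1>$ku->{term}</h1>\n<p>$ku->{definition}</p>\n";

        # Links back to the content pages that reference this glossary term.
        print $out "<ul>\n";
        print $out qq{<li><a href="../content/$_.html">Page $_</a></li>\n}
            for @{ $ku->{referenced_by} };
        print $out "</ul>\n</body></html>\n";

        close $out;
    }

[The display mechanism would then only have to map a glossary KU ID to html/glossary/<id>.html rather than assembling the page on every request.]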
From: Hal F. G. <hgo...@pr...> - 2002-11-26 21:07:39
Nope James, sounds good to me, I think you covered everything. I'm also reviewing all of the other mails that were sent to see if there is anything that needs my attention. BTW: I am Hal, an XML instructor with an IT training company.