[Gusdev-gusdev] GUS to PostgreSQL

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

ok, to follow on this with a new subject heading...

For those groups that are gearing up to port GUS to PostgreSQL, here are 
some quick thoughts on getting started.

The two main tasks, to start with, I think, are:
   - porting the schema creation scripts to postgres
   - remove the oracle dependencies from the perl object layer

Porting the Schema
As mentioned below, one way to go here is using a system-neutral syntax 
(xml) to create the schemas.   I think this should be the long term 
goal.  But, for a quick solution, there must be some way for us to 
manually transform the existing oracle-specific syntax to postgres.   I 
envision some kind of ad hoc scripts to do the transform.   This 
shouldn't be too big a deal.

I would say the first step is to identify what the transforms will be.

Object Layer
I just did a quick look in the object layer code (Perl).  (we can defer 
for now on the java object layer since its not in wide use).   Only two 
files seem to use oracle system tables, both in 
 $PROJECT_HOME/GUS/ObjRelP/lib/perl:  DbiTable.pm and DbiDatabase.pm.   
 There are about 10 separate sql queries that use the oracle system 
tables (listed below).   A quick fix would be to supply an Oracle.pm and 
a PostgreSQL.pm which have methods that provide the correct sql for 
those queries.   IF postgres has parallel tables, then this should be 
trivial.  if not...?    here are the tables used:

all_sequences
all_cons_columns
all_constraints
all_tab_columns
all_views

steve

:
Angel Pizarro wrote:

> Steve,
> This is a very accurate and complete synopsis of the issues we have 
> discussed so far. Without actually starting the project, I don't 
> foresee us coming up with anything else.
>
> As for translation between DB platforms, there is a Sourceforge.net 
> project that Allen Day belongs to named SQLFairy. See more about it at 
> http://sqlfairy.sourceforge.net/
>
> They claim to translate between Oracle and PostgreSQL, but this has 
> not worked in practice for me when I tried to translate the GUS schema 
> files to postgresql. Part of the problem is that we separate tables, 
> views and constraints into their own files in addition to separate 
> file groups for schemas. It also had trouble with multi-line comments.
>
> They have listed a platform independent XML syntax, but they as yet do 
> not have a formal DTD or XML schema (I asked already) . They said they 
> were willing to formalize on an XML schema if we provide it and is 
> close to what they have already. I don't see any reason to re-invent 
> the wheel here.
>
> Angel
>
> Steve Fischer wrote:
>
>> (Cesare and Giuseppe, are you on the gusdev mailing list?  If not, 
>> you can subscribe at: 
>> http://lists.sourceforge.net/lists/listinfo/gusdev-gusdev)
>>
>> Folks-
>>
>> Here is a quick outline of what might need to be done to make GUS 
>> RDBMS independent.    I discuss the two major subsystems of GUS:  the 
>> schema and the application framework.
>>
>> Please bear in mind that i am not a DB expert.   Comments and 
>> corrections encouraged.
>>
>> GUS Schema
>>   - schema definition.     it appears that different RDBMSs have 
>> differences in their data types and in the way they create tables and 
>> specify constraints.   it seems that this we need to define the 
>> schema in a system neutral syntax, say, xml, and then transform that 
>> into system-specific syntax.   i wonder if there are third party 
>> tools to do this.
>>
>>    - schema spaces.   the definition of the schema currently assumes 
>> "schema name spaces."     for example, we have RAD.Process.    For 
>> systems that don't support this, we may need to define the schema as, 
>> say, RAD_Process.   In other words, prepend the schema name to the 
>> table names.
>>
>>   - schema installation.    this will need to be system-specific, and 
>> will probably be generated by transforming a schema definition in a 
>> neutral syntax
>>
>>
>> GUS Application Framework
>>    - the application framework currently uses DBI and JDBC.   these 
>> are fine.
>>
>>   - the object layer API (and all the code that uses it) assumes that 
>> there is a schema space.   worse still, it hard codes the names as 
>> DoTS, RAD, Core, TESS and SRes.   The changes I envision here are:
>>           - removing hard coded references to schema names, replacing 
>> them with logical names.
>>           - allow the API to *keep* the schema space assumption.    
>> however, inside the object layer, use a RDBMS-specific "driver" to 
>> handle them in a system specific way.   for example, it would use '.' 
>> as the concatenation character for oracle and '_' for mySql.
>>            - the object layer's code generator (generates all the 
>> objects) has a system-specific module for discovering 
>> meta-information about the tables.   this is segregated into system 
>> specific modules.   for any new system, we would need to write new 
>> ones.    note:  any system that we support would need to offer this 
>> meta information somehow!!!!
>>
>>    - sql code.   there is plenty of direct SQL in our application 
>> code and in plugins.   here are the problems:
>>           - Wherever this sql refers to table names, the sql would 
>> need to be modified to get the table names from an API which can 
>> provide them in correct syntax.            - some of the code relies 
>> on system specific optimization directives.   i don't know how we'll 
>> handle this.
>>           - are there any other sql incompatibilities?   how about 
>> join and/or outer join syntax?
>>
>> Any other ideas?
>>
>> Steve
>>
>>
>>
>> Giuseppe Jurman wrote:
>>
>>>> Thanks Elisabetta,
>>>>
>>>> An important question would be also why PostgreSQL and not MySQL, the
>>>> latter appears to be evolved well and have now a good support to
>>>> transactions via InnoDB, etc. I am very inclined to use Gmod and chado
>>>> por our project, but would like to have a final check on the potential
>>>> of GUS before starting things.
>>>>   
>>>
>>>
>>> ITC-irst has started a working group for a full open source version of
>>> GUS, so to allow the substitution of the Oracle database management
>>> system for data tables and relations within GUS with an open source
>>> solution. In principle we are not oriented to the substitution
>>> of Oracle with another specific DMBS.
>>>
>>> It might be worthwhile to consider the possibility to turn DB
>>> independent the whole object-related part of GUS and then choosing the
>>> suitable DB-dedicated driver depending on the user's system. The first
>>> (heavy) step towards this option requires to check the existing code
>>> looking for the Oracle-specific SQL, and then designing an ad-hoc
>>> strategy to transform into either a DBI (better) or into a DBD (here
>>> the choice of the DBMS becomes a constraint).
>>>
>>> An important issue, for us, is how to align the DBD portions of the
>>> code along the next releases of GUS. In this view, we see the need of
>>> a DBI-DBD separation really relevant.
>>>
>>> We would like to perform the Oracle-PostgreSQL conversion in the most
>>> automated way possible. Before, the Oracle proprietary SQL code has to
>>> be somehow moved to the upper Perl Object Layer level, or to a new
>>> Perl Object Layer.
>>>
>>> Comments welcomed
>>>
>>>
>>> Giuseppe Jurman
>>> Cersare Furlanello
>>>
>>>  
>>>
>
>
>
> -------------------------------------------------------
> The SF.Net email is sponsored by EclipseCon 2004
> Premiere Conference on Open Tools Development and Integration
> See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
> http://www.eclipsecon.org/osdn
> _______________________________________________
> Gusdev-gusdev mailing list
> Gus...@li...
> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev