From: Laurie K. <lk...@mo...> - 2004-02-20 20:52:51
|
Steve wrote on Feb 13: >ok, to follow on this with a new subject heading... > >For those groups that are gearing up to port GUS to PostgreSQL, here are >some quick thoughts on getting started. > >The two main tasks, to start with, I think, are: > - porting the schema creation scripts to postgres > - remove the oracle dependencies from the perl object layer > >Porting the Schema >As mentioned below, one way to go here is using a system-neutral syntax >(xml) to create the schemas. I think this should be the long term >goal. But, for a quick solution, there must be some way for us to >manually transform the existing oracle-specific syntax to postgres. I >envision some kind of ad hoc scripts to do the transform. This >shouldn't be too big a deal. > >I would say the first step is to identify what the transforms will be. > > >Object Layer >I just did a quick look in the object layer code (Perl). (we can defer >for now on the java object layer since its not in wide use). Only two >files seem to use oracle system tables, both in > $PROJECT_HOME/GUS/ObjRelP/lib/perl: DbiTable.pm and DbiDatabase.pm. > There are about 10 separate sql queries that use the oracle system >tables (listed below). A quick fix would be to supply an Oracle.pm and >a PostgreSQL.pm which have methods that provide the correct sql for >those queries. IF postgres has parallel tables, then this should be >trivial. if not...? here are the tables used: > >all_sequences >all_cons_columns >all_constraints >all_tab_columns >all_views > > I am joining this discussion late, but would like to get involved as soon as possible in order to get GUS up and running here at Princeton. on PostgreSQL 7.3 (is that what others are using?) Steve and Angel, have either of you looked into the ora2pg perl module? It is at http://www.samse.fr/GPL/ora2pg/ and requires an Oracle connection in order to run. In response to Steve's identification of the Oracle metadata queries which need to be converted, I have completed and tested all of them and only have trouble with this one from DbiTable.pm: select column_name, data_type, nullable, column_id, data_precision, data_length, data_scale from all_tab_columns where table_name = '$table_name' and owner = '$owner' order by column_id Here is my best attempt at a conversion: select a.attname, t.typname, case when a.attnotnull then 'N' else 'Y' end as nullable, a.attnum, case when t.typlen < 1 then null else t.typlen end as data_precision, case when a.atttypmod < 0 then null else a.atttypmod-4 end as data_length, null as data_scale from pg_attribute a, pg_class c, pg_type t where a.attrelid = c.oid and c.relname = 'assay' and a.atttypid = t.oid and a.attnum > 0 order by a.attnum; Using RAD3.assay, the Oracle results yield the following: COLUMN_NAME DATA_TYPE NULLABLE col_id precision length scale ---------------------------- --------- -------- ------ --------- ------ ---------- ASSAY_ID NUMBER N 1 8 22 0 ARRAY_ID NUMBER N 2 4 22 0 PROTOCOL_ID NUMBER Y 3 10 22 0 ASSAY_DATE DATE Y 4 null 7 null ARRAY_IDENTIFIER VARCHAR2 Y 5 null 100 null ARRAY_BATCH_IDENTIFIER VARCHAR2 Y 6 null 100 null OPERATOR_ID NUMBER N 7 10 22 0 EXTERNAL_DATABASE_RELEASE_ID NUMBER Y 8 5 22 0 SOURCE_ID VARCHAR2 Y 9 null 50 null NAME VARCHAR2 Y 10 null 100 null DESCRIPTION VARCHAR2 Y 11 null 500 null MODIFICATION_DATE DATE N 12 null 7 null USER_READ NUMBER N 13 1 22 0 USER_WRITE NUMBER N 14 1 22 0 GROUP_READ NUMBER N 15 1 22 0 GROUP_WRITE NUMBER N 16 1 22 0 OTHER_READ NUMBER N 17 1 22 0 OTHER_WRITE NUMBER N 18 1 22 0 ROW_USER_ID NUMBER N 19 12 22 0 ROW_GROUP_ID NUMBER N 20 3 22 0 ROW_PROJECT_ID NUMBER N 21 4 22 0 ROW_ALG_INVOCATION_ID NUMBER N 22 12 22 0 while the PostgreSQL results yield: attname | typname | nullable | attnum | data_precision | data_length | data_scale ------------------------------+---------+----------+--------+----------------+-------------+------------ assay_id | int8 | N | 1 | 8 | | array_id | int4 | N | 2 | 4 | | protocol_id | int8 | Y | 3 | 8 | | assay_date | date | Y | 4 | 4 | | array_identifier | varchar | Y | 5 | | 100 | array_batch_identifier | varchar | Y | 6 | | 100 | operator_id | int8 | N | 7 | 8 | | external_database_release_id | int4 | Y | 8 | 4 | | source_id | varchar | Y | 9 | | 50 | name | varchar | Y | 10 | | 100 | description | varchar | Y | 11 | | 500 | modification_date | date | N | 12 | 4 | | user_read | int2 | N | 13 | 2 | | user_write | int2 | N | 14 | 2 | | group_read | int2 | N | 15 | 2 | | group_write | int2 | N | 16 | 2 | | other_read | int2 | N | 17 | 2 | | other_write | int2 | N | 18 | 2 | | row_user_id | int8 | N | 19 | 8 | | row_group_id | int4 | N | 20 | 4 | | row_project_id | int4 | N | 21 | 4 | | row_alg_invocation_id | int8 | N | 22 | 8 | | The PostgreSQL numeric data types are different from those in Oracle and this raises the issue of how to convert them. There is no equivalent of the NUMBER(10) or NUMBER(12) in the assay table above, for example, in PostgreSQL. The common integer types are int2, int4 and int8. There is a numeric type which supports precision and scale but it is much slower than the integer types. Hence, we *must* come to agreement on this conversion issue. Do I hear any volunteers? Here are the other queries, successfully converted. ============ DbiTable.pm ============ Oracle: select text from all_views where owner = '$owner' and view_name = '$table' PostgreSQL: select definition from pg_views where schemaname = '$owner' and viewname = '$table' ------------------------------------------- Oracle: select ac.owner, accs.table_name, accs.column_name, accr.owner, accr.table_name, accr.column_name from all_cons_columns accr, all_cons_columns accs, all_constraints ac where accs.owner = '$owner' and ac.table_name = '$table' and ac.owner = '$owner' and ac.constraint_type = 'R' and accr.constraint_name = ac.r_constraint_name and ac.r_owner = accr.owner and accs.constraint_name = ac.constraint_name PostgreSQL: select n1.nspname, r1.relname, a1.attname, n2.nspname, r2.relname, a2.attname from pg_namespace n1, pg_namespace n2, pg_class r1, pg_class r2, pg_attribute a1, pg_attribute a2, pg_constraint c where r1.relname = '$table' and n1.nspname = '$owner' and r1.relnamespace = n1.oid and r1.oid = c.conrelid and r1.oid = a1.attrelid and c.contype = 'f' and c.confrelid = r2.oid and r2.relnamespace = n2.oid and c.conkey[1] = a1.attnum and c.confkey[1] = a2.attnum and a2.attrelid = r2.oid ------------------------------------------- Oracle: select distinct acc.column_name from all_cons_columns acc, all_constraints ac where acc.owner = '$owner' and ac.table_name = '$queryTable' and ac.constraint_type = 'P' and acc.constraint_name = ac.constraint_name PostgreSQL: select distinct a.attname from pg_namespace n, pg_class r, pg_attribute a, pg_constraint c where n.nspname = '$owner' and r.relname = '$table' and n.oid = r.relnamespace and r.oid = a.attrelid and r.oid = c.conrelid and c.contype = 'p' and c.conkey[1] = a.attnum; ============ DbiDatabase.pm ============ Oracle: select sequence_owner,sequence_name from all_sequences PostgreSQL: select schemaname, relname from pg_statio_user_sequences; ------------------------------------------- Oracle: select acon.owner,acc1.owner,acc1.table_name,acc1.column_name, acon.table_name ,acc2.column_name from all_cons_columns acc1, all_constraints acon,all_cons_columns acc2 where acon.r_constraint_name = acc1.constraint_name and acc1.owner = '$owner' and acon.r_owner = acc1.owner and acc2.constraint_name = acon.constraint_name and acc2.owner = acon.owner PostgreSQL: select n1.nspname, n2.nspname, r1.relname, a1.attname, r2.relname, a2.attname from pg_namespace n1, pg_namespace n2, pg_class r1, pg_class r2, pg_attribute a1, pg_attribute a2, pg_constraint c where r1.nspname = '$owner' and r1.relnamespace = n1.oid and r1.oid = c.conrelid and r1.oid = a1.attrelid and c.contype = 'f' and c.confrelid = r2.oid and r2.relnamespace = n2.oid and c.conkey[1] = a1.attnum and c.confkey[1] = a2.attnum and a2.attrelid = r2.oid The next topic to tackle is CLOBs. Laurie ------------------------------- Laurie Kramer Department of Molecular Biology Princeton University lk...@mo... 609.258.5657 |