Re: [cx-oracle-users] speeding up TB data download
Brought to you by:
atuining
From: Chuck W. <chu...@ch...> - 2010-07-27 20:56:24
|
Thanks again for your response and code. We do have a view called all_tab_cols. The issue I am running into is for some of our tables have 3billion+ records and 80+columns. It is taking 10+mins to compute the max on one column. It is trivial to generate SQL which will get the information for all columns: select min(sz.fld1), max(sz.fld1), min(sz.fld2), max(sz.fld2), ... from ( select length(to_char(fld1)) fld1, length(fld2) fld2, ... from tbl1 ) sz appropriately depending on the type. Is there an alternative/cleaner approach? Thanks. ---- Anthony Tuininga <ant...@gm...> wrote: > Sure. You want something like this -- untested but it should give you > the right idea. > > connection = cx_Oracle.Connection(....) > columnCursor = connection.cursor() > queryCursor = connection.cursor() > columnCursor.execute(""" > select table_name, column_name, data_type > from user_tab_columns > order by table_name, column_id""") > for tableName, columnName, dataType in columnCursor: > if dataType in ("CHAR", "VARCHAR2"): > selectItem = "length(%s)" % columnName > elif dataType == "NUMBER": > selectItem = "length(to_char(%s)) % columnName > else: > continue > queryCursor.execute("select max(%s) from %s" % (selectItem, tableName)) > result, = queryCursor.fetchone() > print "Maximum length for %s.%s is %s" % (tableName, columnName, result) > > Anthony > > On Mon, Jul 26, 2010 at 10:46 AM, Chuck White <chu...@ch...> wrote: > > Thanks for your response. Can you please send me examples which query those tables (user_tab_columns, all_tab_columns or dba_tab_columns)? > > > > ---- Anthony Tuininga <ant...@gm...> wrote: > >> Hi, > >> > >> Just a suggestion: avoid the parsing completely if you can do it. If > >> all you're trying to do is calculate the maximum size of fields, you > >> can easily write something that does this: > >> > >> select max(length(<COL_NAME>)) from <TABLE_NAME> > >> > >> or for numbers > >> > >> select max(length(to_char(<COL_NAME>)) from <TABLE_NAME> > >> > >> This sort of query is far more efficient and can be done independently > >> of your extracts. You can quite easily generate all of these queries > >> by reading from user_tab_columns, all_tab_columns or dba_tab_columns. > >> If you need help with that, let me know. > >> > >> Anthony > > > > |