Array Limits for SORT_ARRAY and TRIM_ARRAY?

Help
2013-08-13
2014-01-19
  • I have created an aggregate function to calculate the 90th percentile of a data set:

    CREATE AGGREGATE FUNCTION perc90(IN val DOUBLE, IN flag boolean, INOUT buffer DOUBLE ARRAY, INOUT counter INT)
       RETURNS Double
       CONTAINS SQL
       BEGIN ATOMIC
       DECLARE positionRank DOUBLE;
       DECLARE sortedArray DOUBLE ARRAY;
         IF flag THEN
           SET positionRank = round(((0.90*counter)+0.5),0); 
           SET sortedArray = SORT_ARRAY(buffer);
           RETURN sortedArray[positionRank-1];
         ELSE
           IF val IS NULL THEN RETURN NULL; END IF;
           IF counter IS NULL THEN SET counter = 0; END IF;
           SET counter = counter + 1;
           IF counter = 1 THEN SET buffer = ARRAY[val];
           ELSE SET buffer[counter] = val; END IF;
           RETURN NULL;
         END IF;
       END
    

    this worked fine with a small data set of around 300 records but when increasing the sample to a 150,000 I encountered the hard limit of 1024 for the array so I tried the following:

     CREATE AGGREGATE FUNCTION perc90(IN val DOUBLE, IN flag boolean, INOUT buffer DOUBLE ARRAY[10000000], INOUT counter INT)
       RETURNS Double
       CONTAINS SQL
       BEGIN ATOMIC
       DECLARE positionRank DOUBLE;
       DECLARE sortedArray DOUBLE ARRAY;
       DECLARE trimedArray DOUBLE ARRAY;
         IF flag THEN
           SET positionRank = round(((0.9*counter)+0.5),0); 
           SET trimedArray = TRIM_ARRAY(buffer,(10000000 - counter));
           SET sortedArray = SORT_ARRAY(trimedArray);
           RETURN sortedArray[positionRank-1];
         ELSE
           IF val IS NULL THEN RETURN NULL; END IF;
           IF counter IS NULL THEN SET counter = 0; END IF;
           SET counter = counter + 1;
           IF counter = 1 THEN SET buffer = ARRAY[val];
           ELSE SET buffer[counter] = val; END IF;
           RETURN NULL;
         END IF;
       END
    

    and

     CREATE AGGREGATE FUNCTION perc90(IN val DOUBLE, IN flag boolean, INOUT buffer DOUBLE ARRAY[10000000], INOUT counter INT)
       RETURNS Double
       CONTAINS SQL
       BEGIN ATOMIC
       DECLARE positionRank DOUBLE;
       DECLARE sortedArray DOUBLE ARRAY;
       DECLARE trimedArray DOUBLE ARRAY;
         IF flag THEN
           SET positionRank = round(((0.9*counter)+0.5),0);
           SET sortedArray = SORT_ARRAY(buffer ASC NULLS LAST);
           RETURN sortedArray[1];
         ELSE
           IF val IS NULL THEN RETURN NULL; END IF;
           IF counter IS NULL THEN SET counter = 0; END IF;
           SET counter = counter + 1;
           IF counter = 1 THEN SET buffer = ARRAY[val];
           ELSE SET buffer[counter] = val; END IF;
           RETURN NULL;
         END IF;
       END
    

    but I seem to keep getting a "data exception: array data, right truncation / Error Code: -3491 / State: 2202F" error.

    I've tried looking for limits to these functions in the documentation  but I couldn't find anything.

    Am I doing something wrong or is it a limitation of the db?

     
  • Fred Toussi
    Fred Toussi
    2013-08-13

    Have you tried specifying the size for the ARRAY variables? For example DECLARE sortedArray DOUBLE ARRAY.

     
  • that works thank you :) I can't believe I missed it.

    I have it working but it takes around 20s for 150K rows which is not really great, the query:

    select 
    t
    from testtable
    order by t
    

    is pretty much instant and that's doing the majority of what the aggregate function does I just need to be able to go to a direct row.