Menu

Possible bug with SELECT INTO inside procedures?

Keith
2017-05-31
2017-06-01
  • Keith

    Keith - 2017-05-31

    I think I found a strange bug with the SELECT ... INTO ... FROM statement that can be used within procedures. It mostly works, but I found a case where my variable is never set even when the query returns a single row result.

    In my example, I have tables for drivers, test drives, and events that occur during the test drive. In addition, for quick access and data existential checking, I have a cache table that stores the most recent event for a driver. When an event is added to the event table, a trigger calls a procedure to update the cache table. (In normal production, the procedure will check if it needs to insert, update, or delete entries, but just an insert is enough to reproduce the problem in this example.) The SELECT ... INTO query used to find the id of the most recent event returns the correct result, but it fails to set the value into the variable.

    Is this a known problem?

    CREATE TABLE driver
    (
      id UUID NOT NULL,
      name VARCHAR(64) NOT NULL,
      PRIMARY KEY (id)
    );
    
    CREATE TABLE test_drive
    (
      id UUID NOT NULL,
      datetime TIMESTAMP NOT NULL,
      driver_id UUID NOT NULL,
      PRIMARY KEY (id),
      CONSTRAINT fk_test_drive_driver
        FOREIGN KEY (driver_id)
        REFERENCES driver (id)
    );
    
    CREATE TABLE event
    (
      id UUID NOT NULL,
      offset DOUBLE NOT NULL,
      test_drive_id UUID NOT NULL,
      PRIMARY KEY (id),
      CONSTRAINT fk_event_test_drive
        FOREIGN KEY (test_drive_id)
        REFERENCES test_drive (id)
    );
    
    -- Cache table to hold the most recent event for a driver. This comes in handy
    -- when the tables start to become really large and all we care about is the
    -- most recent event or if an event for a driver exists.
    CREATE TABLE mostrecentevent
    (
      driver_id UUID NOT NULL,
      event_id UUID NOT NULL,
      PRIMARY KEY (driver_id),
      CONSTRAINT fk_mostrecentevent_driver
        FOREIGN KEY (driver_id)
        REFERENCES driver (id)
    );
    
    -- Updates the cache table when a new event is inserted. Note that this
    -- doesn't do UPDATE or DELETE for sake of brevity. In the production version,
    -- other checks are done to either INSERT, UPDATE, or DELETE values from the
    -- cache table. For this example, the INSERT is enough to reproduce the
    -- problem.
    CREATE PROCEDURE update_mostrecentevent(event_id UUID)
    MODIFIES SQL DATA
    BEGIN ATOMIC
      DECLARE newest_event_id UUID;
      DECLARE driver_id UUID;
    
      -- Get the id of the driver. This always works.
      SELECT d.id
      INTO driver_id
      FROM event e
      JOIN test_drive td on td.id=e.test_drive_id
      JOIN driver d on d.id=td.driver_id
      WHERE e.id = event_id;
    
      -- Get the id of the newest event. The query works and returns the correct
      -- result, but it never assigns the result to the variable. In addition, I
      -- had to wrap the inner query into an outer query because the SELECT ... 
      -- INTO statement fails to parse if the ORDER BY or LIMIT follows the 
      -- FROM/JOIN clause.
      SELECT id
      INTO newest_event_id    -- THIS IS NEVER SET. WHY?????
      FROM
      (
        SELECT e.id, td.datetime + INTERVAL '1' SECOND(6) * e.offset AS e_datetime
        FROM test_drive td
        JOIN event e ON e.test_drive_id = td.id
        WHERE td.driver_id = driver_id
    
        -- This ORDER BY (and the LIMIT) fails to parse unless this inner SELECT
        -- is wrapped by the outer SELECT.
        ORDER BY e_datetime DESC
        LIMIT 1 
      ) AS tmp;
    
      -- This fails because newest_event_id is NULL and the table doesn't allow
      -- NULL.
      INSERT INTO mostrecentevent VALUES (driver_id, newest_event_id);
    END;
    
    -- After inserting an event, update the cache table.
    CREATE TRIGGER after_insert_event
    AFTER INSERT ON event
    REFERENCING NEW ROW AS NEW
    FOR EACH ROW
    BEGIN ATOMIC
      CALL update_mostrecentevent(NEW.id);
    END;
    
    INSERT INTO driver VALUES (UUID(), 'TEST');
    INSERT INTO test_drive VALUES (UUID(), NOW(), (SELECT id FROM driver));
    
    -- This throws an integrity constraint violation because newest_event_id in
    -- the procedure is NULL, and the mostrecentevent table does not allow NULL
    -- for event_id.
    INSERT INTO event VALUES(UUID(), 5.0, (SELECT id FROM test_drive));
    
     

    Last edit: Keith 2017-05-31
  • Fred Toussi

    Fred Toussi - 2017-05-31

    This happens when the SELECT returns no rows.

     
  • Keith

    Keith - 2017-05-31

    I know the SELECT will not assign anything if no rows are returned, but in this case, the SELECT does return a row. I can verify that is does by manually running the query myself (without the INTO) or by changing the SELECT into an INSERT INTO ... SELECT and put the id into a separate debugging table. In this example, there is only one event being inserted. The SELECT returns one row with that event's id, but it is not assigning the value to the newest_event_id variable.

     
  • Keith

    Keith - 2017-06-01

    I even have a work around for this problem by creating a temporary table, inserting the row returned by the SELECT, and then doing a simpler SELECT to get the value back. This clearly shows that the INTO part of my original SELECT is not setting the variable.

    CREATE TABLE tmp
    (
      event_id UUID NOT NULL
    );
    
    CREATE PROCEDURE update_mostrecentevent(event_id UUID)
    MODIFIES SQL DATA
    BEGIN ATOMIC
      DECLARE newest_event_id UUID;
      DECLARE driver_id UUID;
    
      SELECT d.id
      INTO driver_id
      FROM event e
      JOIN test_drive td on td.id=e.test_drive_id
      JOIN driver d on d.id=td.driver_id
      WHERE e.id = event_id;
    
      -- This is the same query, but inserting result in table instead.
      INSERT INTO tmp
      SELECT id
      FROM
      (
        SELECT e.id, td.datetime + INTERVAL '1' SECOND(6) * e.offset AS e_datetime
        FROM test_drive td
        JOIN event e ON e.test_drive_id = td.id
        WHERE td.driver_id = driver_id
        ORDER BY e_datetime DESC
        LIMIT 1 
      ) AS tmp;
    
      -- WHY DOES THIS WORK, BUT NOT MY ORIGINAL SELECT?
      SELECT event_id
      INTO newest_event_id
      FROM tmp;
    
      INSERT INTO mostrecentevent VALUES (driver_id, newest_event_id);
    END;
    
     

    Last edit: Keith 2017-06-01
  • Fred Toussi

    Fred Toussi - 2017-06-01

    I have checked it. It doesn't work because it is getting the value from the subquery and for SELECT INTO any subqueries are not processed and no rows are returned.

    Use your workaround.

     

    Last edit: Fred Toussi 2017-06-01
  • Keith

    Keith - 2017-06-01

    So subqueries are not processed for SELECT INTO (you said INSERT INTO, but I'm assuming you meant SELECT INTO)? This seems like an important feature to have. Can this be added as a feature request?

     
  • Fred Toussi

    Fred Toussi - 2017-06-01

    Issue fixed and committed to SVN for the next release.

     
  • Keith

    Keith - 2017-06-01

    Sweet! Could you tell me what the commit number is? Also, do you know when the next release is scheduled?

     
  • Fred Toussi

    Fred Toussi - 2017-06-01

    The SVN base/trunk head is fine. Next release perhaps later this month.

     
  • Keith

    Keith - 2017-06-01

    Thank you!

     

Log in to post a comment.