Parsing JSON for db insert


  • Anonymous

    I have about 1000 pages of raw json-formatted clobs, each page of which has about 1000 identically structured records.  I wrote a pl/sql block of code to extract the fields of each record, and each page, then shove the record into a table structured like the JSON records.  The problem is, it takes about 5 minutes to parse out a single record.  According to my calculations, that means 6years of run-time to extract all the records and shove them back into the db.

    Maybe there's a way I can shorten the parsing so that it only reads each JSON "record" once?  Is this an opportunity to use JSON_LIST?

    Thanks for your help!

    Here is my very simple PL/SQL:

        jsonObj        json;
        myClob clob;
        myPage number;
        myRecord number;
    type ConRec_Entry is RECORD (
        mySpeaker_State varchar2(4),
        mySpeaker_First varchar2(128),
        myCongress number,
        myTitle varchar2(2048),
        myOrigin_Url varchar2(1024),
        myCR_Number number,
        myCR_Order number,
        myCR_Volume number,
        myChamber varchar2(32),
        myCR_Session varchar2(32),
        myCR_ID varchar2(256),
        mySpeaking clob,
        myCapitolwords_URL varchar2(1024),
        mySpeaker_Party varchar2(32),
        myChar_Date varchar2(32),
        myBills varchar2(2048),
        myBioguide_ID varchar2(64),
        myCR_Pages varchar2(32),
        mySpeaker_Last varchar2(128),
        mySpeaker_Raw varchar2(128),
        myID number,
        PageNo number,
        ItemNo number
        my_Rec ConRec_Entry;
        my_State varchar2(4);
      for MyPage in 14..100
        for myRecord in 1..1000
          select jsoncontent into myClob from capitolwordsdata where id=myPage;
          my_Rec.mySpeaker_State:=json_ext.pp(jsonObj, '');
          my_Rec.mySpeaker_First:=json_ext.pp(jsonObj, '');
          my_Rec.myCongress:=      json_ext.pp(jsonObj, '');
          my_Rec.myTitle:=        json_ext.pp(jsonObj, '');
          my_Rec.myOrigin_URL:=   json_ext.pp(jsonObj, '');
          my_Rec.myCR_Number:=    json_ext.pp(jsonObj, '');
          my_Rec.myCR_Order:=     json_ext.pp(jsonObj, '');
          my_Rec.myCR_Volume:=    json_ext.pp(jsonObj, '');
          my_Rec.myChamber:=      json_ext.pp(jsonObj, '');
          my_Rec.myCR_Session:=   json_ext.pp(jsonObj, '');
          my_Rec.myCR_ID:=        json_ext.pp(jsonObj, '');
          my_Rec.mySpeaking:=     json_ext.pp(jsonObj, '');
          my_Rec.myCapitolWords_Url:=json_ext.pp(jsonObj, '');
          my_Rec.mySpeaker_Party:=json_ext.pp(jsonObj, '');
          my_Rec.myChar_Date:=    json_ext.pp(jsonObj, '');
          my_Rec.myBills:=        json_ext.pp(jsonObj, '');
          my_Rec.myBioguide_ID:=  json_ext.pp(jsonObj, '');
          my_Rec.myCR_Pages:=     json_ext.pp(jsonObj, '');
          my_Rec.mySpeaker_Last:= json_ext.pp(jsonObj, '');
          my_Rec.mySpeaker_Raw:=  json_ext.pp(jsonObj, '');
          -my_Rec.myID:=         json_ext.pp(jsonObj, '');
       insert into congressionalrecord values my_Rec;

        end loop;
      end loop;
      dbms_output.put_Line('Of Page' || myPage);

  • James Sumners
    James Sumners

    I'm going to suggest you parse the JSON in a more appropriate language. While PL/JSON is great, PL/SQL limits its ability to be efficient.

    I would dump all of the "pages" to plain text files (e.g. "page01.json"…"pageXX.json"). Then I would use a language that can import the JSON data into proper objects. Since your end goal is to update records in an Oracle database, I'd use Java (or Groovy for simplicity) as that language.