parser creates duplicate DB records

Brian King
Created: 2005-06-29
Updated: 2013-04-25

  • Brian King
    2005-06-29

    When using Hibernate persistence and re-parsing a feed, I get duplicate entries in the database.  I think this can be avoided, and the parser can be used to keep the database up-to-date, if hibernate.ChannelBuilder is rewritten to look up objects by their natural keys before creating new ones.  Here is an example for Channel:

    public ChannelIF createChannel(Element channelElement,
                                   String location, String title) {
      ChannelIF obj;
      try {
        // Look up the channel by its natural key (the feed location)
        // before creating a new one, so re-parsing updates in place
        // instead of inserting a duplicate row.
        List l = session.find(
            "from Channel as channel where channel.locationString = ?",
            location, Hibernate.STRING);
        if (l.size() > 0) {
          // Existing channel: refresh its title and update it.
          obj = (ChannelIF) l.get(0);
          obj.setTitle(title);
          session.update(obj);
        } else {
          // No channel with this location yet: create and save one.
          obj = new Channel(channelElement, title);
          try {
            obj.setLocation(new URL(location));
          } catch (MalformedURLException e) {
            throw new RuntimeException(e);
          }
          session.save(obj);
        }
      } catch (HibernateException e) {
        throw new RuntimeException(e);
      }

      return obj;
    }

    I rewrote the ChannelBuilder methods for Channel, Image, and Item, and now I no longer get duplicates.  The full set of changes is:

    1. Implement equals() in Category and Image.
    2. Pass the channel location to createChannel().
    3. Set unsaved-value="-1" in the mappings for Channel, Category, Item, and Image.
    4. Add a String-based API for the Item link.
    5. Add a String-based API for the Image location.

    The other classes may be harder, because they do not have fields that can function as a natural key.
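    As a reference for step 1 above, natural-key equality could look roughly like the following.  This is a minimal sketch that assumes the category's title acts as its natural key; the field name and constructor are illustrative, not informa's actual Category API.

    ```java
    // Sketch of natural-key equality for a Category-like class.
    // Assumption: `title` is the natural key; informa's real Category
    // class may key on different fields.
    public class Category {
      private String title;

      public Category(String title) {
        this.title = title;
      }

      public String getTitle() {
        return title;
      }

      // Two categories are equal if their natural keys match,
      // regardless of database identity.
      public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Category)) return false;
        Category other = (Category) o;
        return title == null ? other.title == null
                             : title.equals(other.title);
      }

      // equals() and hashCode() must agree, or hash-based
      // collections will still treat equal categories as distinct.
      public int hashCode() {
        return title == null ? 0 : title.hashCode();
      }
    }
    ```

    With equality defined on the natural key rather than object identity, a re-parsed category compares equal to the one already persisted, which is what lets the find-before-create pattern above avoid inserting duplicates.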

    • Josh K
      2006-01-03

      Is there any resolution to this issue?  I want to use informa for my own application, but the hibernate duplication is a major issue.

      For me, I have the issue whenever I restart my JVM.  I do not get duplicates during the initial run, but on subsequent runs, items which are already in the database get duplicated.  This makes informa pretty much unusable as anything but a feed parser.