
Meta-blocking on DBpedia

Anonymous
2015-10-03
2016-01-15
  • Anonymous

    Anonymous - 2015-10-03

    Hi George,

    I’m trying to run Meta-blocking on DBpedia (Clean-Clean; 2009vs2007).
    Notes:
    I’m running on an Intel Xeon (Ten-Core) E5-2670v2 2.50 GHz with 48 GB RAM.
    Workflow:
    TokenBlocking -> SizeBlockPruning (-> ComparisonBasedBlockPruning) -> Meta-blocking WNP (Jaccard Similarity)

    Since I ran into memory issues when trying to produce the actual blocks with Meta-blocking, I’m using an OnTheFlyWNP (inspired by an answer you gave in another thread; I attach the file).
    Now, the problem is that the running time is huge. I see from your thesis that the Materialisation + Restructure Time should be ~10 hours, but in my setting it takes more than 24 hours (I capped the execution time at 24 hours, so I can’t give the final running time).
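
    For readers following along, here is a minimal Python sketch of what an on-the-fly WNP with Jaccard weights could look like. It is not the attached OnTheFlyWNP file and not the framework's code: the function name, the block representation (pairs of sets of entity ids) and the simplification of pruning only from the point of view of the first dataset's entities are all assumptions made for illustration. The point is that each neighbourhood is weighted and pruned in turn, so the blocking graph is never held in memory.

```python
from collections import Counter


def on_the_fly_wnp(blocks):
    """Hypothetical sketch: blocks is a list of (ids_from_d1, ids_from_d2) set pairs.

    Yields (e1, e2, weight) for every comparison that survives pruning.
    """
    # Entity index: for each entity, the positions of the blocks that contain it.
    index1, index2 = {}, {}
    for pos, (left, right) in enumerate(blocks):
        for e in left:
            index1.setdefault(e, set()).add(pos)
        for e in right:
            index2.setdefault(e, set()).add(pos)

    for e1, own_blocks in index1.items():
        # Count the blocks shared with each co-occurring entity of the other dataset.
        common = Counter()
        for pos in own_blocks:
            common.update(blocks[pos][1])
        if not common:
            continue
        # Jaccard weight: shared blocks over the union of both block lists.
        weights = {
            e2: shared / (len(own_blocks) + len(index2[e2]) - shared)
            for e2, shared in common.items()
        }
        # Local threshold (assumed here): mean edge weight in this neighbourhood.
        threshold = sum(weights.values()) / len(weights)
        for e2, w in weights.items():
            if w >= threshold:
                yield e1, e2, w


blocks = [({"a1", "a2"}, {"b1"}), ({"a1"}, {"b1", "b2"}), ({"a2"}, {"b2"})]
retained = {(e1, e2) for e1, e2, _ in on_the_fly_wnp(blocks)}
```

    In this toy input, a1 shares two blocks with b1 and one with b2, so only the a1-b1 comparison passes a1's local threshold.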

    Additional notes:
    After the ComparisonBasedBlockPruning the aggregate cardinality is ~3.5E10, which is even less than the 5.68E10 baseline in your thesis. So, maybe I’m missing something.
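
    For reference, the aggregate cardinality of a Clean-Clean block collection is the total number of cross-dataset comparisons over all blocks. A minimal sketch, assuming each block is a hypothetical pair of entity-id sets (one per dataset), which is not necessarily how the framework stores blocks:

```python
def aggregate_cardinality(blocks):
    """Sum of |b1| * |b2| over all bilateral blocks (Clean-Clean setting)."""
    return sum(len(left) * len(right) for left, right in blocks)


# Toy collection: three blocks with 2*1, 1*2 and 1*1 comparisons.
toy_blocks = [({"a1", "a2"}, {"b1"}), ({"a1"}, {"b1", "b2"}), ({"a2"}, {"b2"})]
total = aggregate_cardinality(toy_blocks)  # 5
```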

    Thank you,
    Giovanni

     
    • gpapadis

      gpapadis - 2016-01-14

      Hi Giovanni,

      For some reason, I only saw your post just now. Do you still have problems with Meta-blocking? If so, let me know how I can help. Yesterday, I uploaded a new version of Meta-blocking that is much faster (https://sourceforge.net/p/erframework/svn/HEAD/tree/trunk/BlockingFramework/src/MetaBlocking/FastImplementations/).

      Kind regards,
      George

       
