
Main Page



JINSECT - N-gram Graph Based Toolkit

The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.

What does JINSECT stand for?

OK. You got me, it is an acronym: Java INteroperable Semantic Extraction Context-based Toolkit

The idea is that JINSECT is an open toolkit for NLP that analyzes texts while taking their context into account. This notion of context is fundamental to the n-gram graph framework and is applied throughout the applications and code of JINSECT.
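
To make the notion of context a bit more concrete, here is a minimal sketch of a character n-gram graph in plain Java. It is not JInsect code and does not use the JInsect API: the class `NGramGraphSketch`, the methods `build` and `edgeOverlap`, and the `window` parameter are all hypothetical names chosen for illustration. The sketch assumes a simplified, directed variant in which each n-gram is a node, and an edge links an n-gram to every n-gram that starts within `window` positions after it, weighted by how often that happens. The overlap score at the end is a simplified stand-in for a graph comparison, not one of the toolkit's own similarity measures.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a simplified n-gram graph, NOT the JInsect implementation.
class NGramGraphSketch {

    /**
     * Builds a character n-gram graph as a map of weighted, directed edges.
     * Keys look like "src -> dst"; each value counts how often the n-gram
     * dst starts within `window` positions after the n-gram src.
     */
    public static Map<String, Integer> build(String text, int n, int window) {
        Map<String, Integer> edges = new HashMap<>();
        int last = text.length() - n; // start index of the last n-gram
        for (int i = 0; i <= last; i++) {
            String src = text.substring(i, i + n);
            // The "context" of src: the n-grams that follow it inside the window.
            for (int j = i + 1; j <= Math.min(i + window, last); j++) {
                String dst = text.substring(j, j + n);
                edges.merge(src + " -> " + dst, 1, Integer::sum);
            }
        }
        return edges;
    }

    /** Fraction of the edges of graph a that also exist in graph b (weights ignored). */
    public static double edgeOverlap(Map<String, Integer> a, Map<String, Integer> b) {
        if (a.isEmpty()) {
            return 0.0;
        }
        long shared = a.keySet().stream().filter(b::containsKey).count();
        return (double) shared / a.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> first = build("abcabc", 2, 1);
        Map<String, Integer> second = build("abcd", 2, 1);
        System.out.println(first);
        System.out.println(edgeOverlap(second, first));
    }
}
```

With `n = 2` and `window = 1`, the text "abcabc" gives three edges, and the edge from "ab" to "bc" gets weight 2 because that neighborhood occurs twice. A larger window captures a wider context around each n-gram at the cost of a denser graph.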


How can I use the toolkit?

See the Code Snippets section for examples of how to use or integrate JInsect.

Presentations and more (friendly) info on n-gram graphs

Videos - Greek

  1. Ένα γραφικό παραμύθι ν-γραμμάτων [A graphic fairy tale of n-grams] (2012)

Presentations - English

  1. N-gram graphs and proximity graphs: bringing summarization, machine learning and bioinformatics to the same neighborhood (2014)
  2. N-gram Graphs: A generic machine learning tool in the arsenal of NLP, Video Analysis and Adaptive Systems (Part 1) (2010)
  3. N-gram Graphs: A generic machine learning tool in the arsenal of NLP, Video Analysis and Adaptive Systems (Part 2) (2010)

Presentations - Greek

  1. Ένα γραφικό παραμύθι ν-γραμμάτων [A graphic fairy tale of n-grams] (2012)
  2. Παρουσίαση διδακτορικής διατριβής στην αυτόματη εξαγωγή περιλήψεων [PhD thesis presentation on automatic summarization] (2009)

Videos - English

(Not yet...)

Related literature and publications

Summarization and Summary Evaluation

  1. George Giannakopoulos, George Kiomourtzis, & Vangelis Karkaletsis. (2014). NewSum: “N-Gram Graph”-Based Summarization in the Real World. In Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding. IGI. Retrieved from http://www.igi-global.com/chapter/newsum/96746
  2. George Giannakopoulos, & Vangelis Karkaletsis. (2013). Together we stand NPowER-ed. Presented at the CICLing 2013, Karlovasi, Samos, Greece: Springer Berlin / Heidelberg.
  3. George Giannakopoulos, & Vangelis Karkaletsis. (2011). AutoSummENG and MeMoG in Evaluating Guided Summaries. In TAC 2011 Workshop. Gaithersburg, MD, U.S.A.
  4. Giannakopoulos, G., Vouros, G., & Karkaletsis, V. (2010). MUDOS-NG: Multi-document Summaries Using N-gram Graphs (Tech Report). 1012.2042. Retrieved from http://arxiv.org/abs/1012.2042
  5. Giannakopoulos, G., & Karkaletsis, V. (2010). Summarization system evaluation variations based on n-gram graphs. Presented at the Text Analysis Conference (TAC) 2010, Gaithersburg, MD, U.S.A.
  6. Giannakopoulos, G., & Karkaletsis, V. (2009). N-gram graphs: Representing documents and document sets in summary system evaluation. In Proceedings of Text Analysis Conference TAC2009.
  7. Giannakopoulos, G., Karkaletsis, V., & Vouros, G. (2008). Testing the use of N-gram Graphs in Summarization Sub-tasks. In Proceedings of Text Analysis Conference TAC2008. Washington, U.S.A.
  8. Giannakopoulos, G., Karkaletsis, V., Vouros, G., & Stamatopoulos, P. (2008). Summarization system evaluation revisited: N-gram graphs. ACM Trans. Speech Lang. Process., 5(3), 1–39. doi:10.1145/1410358.1410359

Text Classification

  1. Giannakopoulos, G., Petra Mavridi, Georgios Paliouras, George Papadakis, & Konstantinos Tserpes. (2012). Representation Models for Text Classification: a comparative analysis over three Web document types. Presented at the International Conference on Web Intelligence, Mining and Semantics (WIMS 2012), Craiova, Romania: ACM.

Adaptive Systems and Personalization

  1. George Papadakis, George Giannakopoulos, Claudia Niederee, Themis Palpanas, & Wolfgang Nejdl. (2011). Detecting and exploiting stability in evolving heterogeneous information spaces. In Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries (pp. 95–104). Retrieved from http://disi.unitn.it/~themis/publications/jcdl11-DetectStabil.pdf
  2. Giannakopoulos, G., & Palpanas, T. (2009). Adaptivity in Entity Subscription Services. In Proceedings of ADAPTIVE2009. Athens, Greece.
  3. Giannakopoulos, G., & Palpanas, T. (2010). Content and Type as Orthogonal Modeling Features: a Study on User Interest Awareness in Entity Subscription Services. International Journal of Advances on Networks and Services, 3(2).


Bioinformatics

  1. Dimitris Polychronopoulos, Anastasia Krithara, Christoforos Nikolaou, Giorgos Paliouras, Yannis Almirantis, & George Giannakopoulos. (2014). Analysis and Classification of Constrained DNA Elements with N-gram Graphs and Genomic Signatures. Presented at the AlCoB 2014, Tarragona, Spain.

What's next?

The next steps will be:

  1. Add code snippets for some basic tasks. (Ongoing - Check the Code Snippets page)
  2. Give a user's manual for the AutoSummENG method (which is implemented in JInsect).
  3. Give the rationale of the n-gram graphs in a simple-to-understand text.
  4. Rewrite JInsect to support newer programming paradigms and distributed processing.

And that's enough for a start I think... Feel free to contact me at ggianna ( a t ) iit (d o t) demokritos (d o t) gr, for any questions or support.
