The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.
What does JINSECT stand for?
OK, you got me, it is an acronym: Java INteroperable Semantic Extraction Context-based Toolkit.
The idea is that JINSECT is an open toolkit for NLP that analyzes texts while taking context into account. This notion of context is fundamental to the n-gram graph framework and is applied throughout the applications and code of JINSECT.
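To make the notion of context a bit more concrete, here is a minimal sketch of a character n-gram graph. It assumes a simplified reading of the framework: every n-gram is a node, and an edge links an n-gram to each n-gram that starts within a fixed window after it, weighted by how often the pair co-occurs. The class name `NGramGraphSketch`, the `build` method, and the `"source|target"` edge keys are hypothetical and exist only for this illustration; they are not part of the JInsect API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: this class is NOT part of the JInsect API.
final class NGramGraphSketch {

    // Builds a character n-gram graph as a map from "source|target" edge keys to weights.
    // Each n-gram is linked to the n-grams that start up to `window` positions after it;
    // the weight counts how often that ordered pair co-occurs within the window.
    // Assumption: the text does not contain the '|' separator used in the edge keys.
    static Map<String, Integer> build(String text, int n, int window) {
        Map<String, Integer> edges = new LinkedHashMap<>();
        int count = text.length() - n + 1; // number of n-grams in the text (may be <= 0)
        for (int i = 0; i < count; i++) {
            String source = text.substring(i, i + n);
            for (int j = i + 1; j <= i + window && j < count; j++) {
                String target = text.substring(j, j + n);
                edges.merge(source + "|" + target, 1, Integer::sum);
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // Print every edge of the bigram graph of "abcabc" with a window of 1.
        build("abcabc", 2, 1)
            .forEach((edge, weight) -> System.out.println(edge + " : " + weight));
    }
}
```

In this sketch the neighbourhood of an n-gram, rather than the n-gram alone, is what gets recorded, which is the sense in which the representation is context-based. A larger `window` captures wider context at the cost of a denser graph.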
Check the [Code Snippets] section on how to use or integrate JInsect.
Presentations and more (friendly) info on n-gram graphs
Videos - Greek
Presentations - English
Presentations - Greek
Videos - English
Summarization and Summary Evaluation
- George Giannakopoulos, George Kiomourtzis, & Vangelis Karkaletsis. (2014). NewSum: “N-Gram Graph”-Based Summarization in the Real World. In Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding. IGI. Retrieved from http://www.igi-global.com/chapter/newsum/96746
- George Giannakopoulos, & Vangelis Karkaletsis. (2013). Together we stand NPowER-ed. Presented at the CICLing 2013, Karlovasi, Samos, Greece: Springer Berlin / Heidelberg.
- George Giannakopoulos, & Vangelis Karkaletsis. (2011). AutoSummENG and MeMoG in Evaluating Guided Summaries. In TAC 2011 Workshop. Gaithersburg, MD, U.S.A.
- Giannakopoulos, G., Vouros, G., & Karkaletsis, V. (2010). MUDOS-NG: Multi-document Summaries Using N-gram Graphs (Tech Report). 1012.2042. Retrieved from http://arxiv.org/abs/1012.2042
- Giannakopoulos, G., & Karkaletsis, V. (2010). Summarization system evaluation variations based on n-gram graphs. Presented at the Text Analysis Conference (TAC) 2010, Gaithersburg, MD, U.S.A.
- Giannakopoulos, G., & Karkaletsis, V. (2009). N-gram graphs: Representing documents and document sets in summary system evaluation. In Proceedings of Text Analysis Conference TAC2009.
- Giannakopoulos, G., Karkaletsis, V., & Vouros, G. (2008). Testing the use of N-gram Graphs in Summarization Sub-tasks. In Proceedings of Text Analysis Conference TAC2008. Washington, U.S.A.
- Giannakopoulos, G., Karkaletsis, V., Vouros, G., & Stamatopoulos, P. (2008). Summarization system evaluation revisited: N-gram graphs. ACM Trans. Speech Lang. Process., 5(3), 1–39. doi:10.1145/1410358.1410359
- Giannakopoulos, G., Petra Mavridi, Georgios Paliouras, George Papadakis, & Konstantinos Tserpes. (2012). Representation Models for Text Classification: a comparative analysis over three Web document types. Presented at the International Conference on Web Intelligence, Mining and Semantics (WIMS 2012), Craiova, Romania: ACM.
Adaptive Systems and Personalization
- George Papadakis, George Giannakopoulos, Claudia Niederee, Themis Palpanas, & Wolfgang Nejdl. (2011). Detecting and exploiting stability in evolving heterogeneous information spaces. In Proceeding of the 11th annual international ACM/IEEE joint conference on Digital libraries (pp. 95–104). Retrieved from http://disi.unitn.it/~themis/publications/jcdl11-DetectStabil.pdf
- Giannakopoulos, G., & Palpanas, T. (2009). Adaptivity in Entity Subscription Services. In Proceedings of ADAPTIVE2009. Athens, Greece.
- Giannakopoulos, G., & Palpanas, T. (2010). Content and Type as Orthogonal Modeling Features: a Study on User Interest Awareness in Entity Subscription Services. International Journal of Advances on Networks and Services, 3(2).
- Dimitris Polychronopoulos, Anastasia Krithara, Christoforos Nikolaou, Giorgos Paliouras, Yannis Almirantis, & George Giannakopoulos. (2014). Analysis and Classification of Constrained DNA Elements with N-gram Graphs and Genomic Signatures. Presented at the AlCoB 2014, Tarragona, Spain.
The next steps will be:
- Add code snippets for some basic tasks. (Ongoing - Check the [Code Snippets] page)
- Give a user's manual for the AutoSummENG method (which is implemented in JInsect).
- Give the rationale of the n-gram graphs in a simple-to-understand text.
- Rewrite JInsect to support newer programming paradigms and distributed processing.
And that's enough for a start, I think. Feel free to contact me at ggianna ( a t ) iit (d o t) demokritos (d o t) gr with any questions or support requests.