Sanchay is a platform for working on languages (especially South Asian languages) using computers. It is still in development, but components such as a text editor with customizable support for languages and encodings, annotation interfaces, etc. are ready.


http://nlp-sanchay.sourceforge.net





Release Date:

2008-11-04

Project Feed

  • Forum thread added

    created the can sanchay run from a server? forum thread

    posted 18 days ago

  • Forum thread added

    sanchay created the It's a bit lonely here forum thread

    posted by sanchay 33 days ago

  • Project Information Updated

    sanchay changed the public information on the Sanchay project

    posted by sanchay 100 days ago

  • nlp-sanchay Sanchay-0.3.0 file released: Sanchay-0.3.0.zip

    New: Some more features have been added to the Syntactic Annotation Interface. Most repetitive actions can now be performed with keys or through the context (right-click) menu, which should make annotation much easier and faster.

    Many new applications are included in this release. There is now a single GUI from which all the applications can be started, and it is possible to open multiple windows for one or more applications. The following applications are included in this release:

    - Sanchay Text Editor, which is connected to some other NLP/CL components of Sanchay.
    - Table Editor with all the usual facilities.
    - A more intelligent Find-Replace-Extract Tool (it can search over annotated data and lets you see the matching files in the annotation interface).
    - Word List Builder.
    - Word List FST (Finite State Transducer) Visualizer, which can be useful for anyone working on morphological analysis etc.
    - One of the most accurate Language and Encoding Identifiers (currently trained for 54 language-encoding pairs, including most of the major Indian languages).
    - A user-friendly Syntactic Annotation Interface, perhaps the most heavily used part of Sanchay so far. Hopefully there will be an even more user-friendly version soon.
    - A Parallel Corpus Annotation Interface, which is another heavily used component. (Don't take that 'heavily' too seriously.)
    - An N-gram Language Modeling Tool that lets you compile models in terms of bytes, letters and words.
    - A Discourse Annotation Interface that is yet to be actually used.
    - A more intelligent File Splitter.
    - An Automatic Annotation tool for POS (Part Of Speech) tagging, chunking and Named Entity Recognition. The first two should work reasonably well, but the last one may not be that useful for practical purposes. This is a CRF (Conditional Random Fields) based tool, and it has been trained for Hindi for these three purposes. If you have annotated data, you can use it to train your own taggers and chunkers.

    posted 370 days ago

  • File released: /nlp-sanchay/Sanchay-0.3.0/Sanchay-0.3.0.zip

    posted 370 days ago

  • nlp-sanchay Sanchay-0.2 file released: Sanchay-0.2.zip

    This version was finished more than a year ago, but for various reasons it was not released on SourceForge at the time. Since the current version (0.3.0) is being released soon, this version is not recommended for new users.

    posted 404 days ago

  • File released: /nlp-sanchay/Sanchay-0.2/Sanchay-0.2.zip

    posted 404 days ago
