TXM

Unicode-XML-TEI text/corpus analysis platform


Description

TXM is a free, open-source, cross-platform text/corpus analysis environment and graphical client based on Unicode and XML. It runs on Windows, Linux and Mac OS X. It can also be used online as a J2EE-compliant web portal (GWT-based) with built-in access control.

It offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org).
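
To make two of the tools named above concrete, here is a minimal sketch of a frequency list and a keyword-in-context concordance. It is written in plain Python and is only an illustration of the concepts: it does not use TXM, CQP or R, and the names `tokenize` and `concordance` are hypothetical helpers, not part of any TXM API.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Toy corpus, invented for this illustration.
sample = "The cat sat on the mat. The dog saw the cat."
tokens = tokenize(sample)

freq = Counter(tokens)                        # frequency list
lines = concordance(tokens, "cat", width=2)   # concordance lines

for left, kw, right in lines:
    print(f"{left:>10} [{kw}] {right}")
```

A real search engine such as CQP works on an indexed corpus and supports queries over word properties and structures, so this linear scan only conveys the shape of the output.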

Read a full description at the TEI Tools wiki http://wiki.tei-c.org/index.php/TXM.

Read the scientific background at the Textométrie project web site http://textometrie.ens-lyon.fr/?lang=en.

Features

  • Provides qualitative analysis tools: a concordancer for lexical patterns based on word- and structure-level queries, navigation of rich HTML-based text editions, layout display of pattern occurrences
  • Provides quantitative analysis tools: factorial correspondence analysis, contrastive word specificities, hierarchical classification, co-occurrents of patterns
  • Works on any collection of Unicode-encoded documents in various formats: text collections (TXT, XML, XML-TEI P5), transcriptions of recordings (XML-Transcriber), aligned corpora (XML-TMX), press articles (XML-PPS Factiva, Europress) and more.
  • Applies various NLP tools to texts on the fly before analysis (e.g. TreeTagger for lemmatization and POS tagging)
  • Lets users build various subcorpora and partitions (for contrastive analysis between text structures or groups of words)
  • Exports any result in CSV, XML or SVG format
  • Scriptable for automating repetitive tasks or extending the platform (in Groovy/Java)
  • Includes a text editor to edit data sources, results and scripts
  • Runs as a standalone Windows, Mac OS X or Linux application
  • Also runs as a web portal application for accessing and analyzing corpora online through a web browser (with access control management)
  • Open source: built on established open-source components for text analysis: CQP, R, and Java and XSLT libraries
  • Modular architecture (Eclipse RCP/OSGi and J2EE compliant): one toolbox connecting all core components is shared by all the applications
  • Development framework powered by Eclipse or NetBeans
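
As a rough illustration of the partition and CSV export features listed above, the following sketch splits a toy corpus into named parts, counts word frequencies per part, and writes the resulting table as CSV. It is plain Python under stated assumptions: the corpus, the part names and the helpers `partition_frequencies` and `to_csv` are invented for the example, and the statistical specificity scores TXM computes are not reproduced here.

```python
import csv
import io
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def partition_frequencies(texts_by_part):
    """Map each part name to a Counter of its word frequencies."""
    return {part: Counter(tokenize(" ".join(texts)))
            for part, texts in texts_by_part.items()}


def to_csv(freqs):
    """Render a word-by-part frequency table as CSV text."""
    parts = sorted(freqs)
    vocab = sorted(set().union(*(freqs[p] for p in parts)))
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["word"] + parts)
    for word in vocab:
        # A Counter returns 0 for words absent from a part.
        writer.writerow([word] + [freqs[p][word] for p in parts])
    return buf.getvalue()


# Hypothetical two-part partition of a tiny corpus.
corpus = {
    "speeches": ["We will work together", "We will win"],
    "letters": ["I will write soon"],
}
freqs = partition_frequencies(corpus)
table = to_csv(freqs)
print(table)
```

A table like this is the usual starting point for contrastive analysis: each column is one part of the partition, and a statistical model then decides which counts are unexpectedly high or low.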

User Reviews

  • disfunzione
    Thanks very good project! +

    Posted 06/03/2013
  • vfncfaprhpk
    good work, thanx!

    Posted 05/22/2013
  • blaskrusik1978
    very good project, thanks!

    Posted 04/05/2013
  • dillonmatthews
    txm works perfectly, thanks

    Posted 09/20/2012

Additional Project Details

Languages

French, English, Russian

Intended Audience

Science/Research, Advanced End Users, Developers, End Users/Desktop

User Interface

Java SWT, Web-based, Console/Terminal, Eclipse

Programming Language

C, Groovy, Java, S/R

Registered

2008-12-04