Name Modified Size
README.markdown 2019-07-10 8.0 kB
txm-front-teitxm2xmlw.xsl 2019-07-10 5.1 kB
1-default-html.xsl 2019-05-17 19.1 kB
2-default-pager.xsl 2019-05-17 10.3 kB
txm-posttok-addRef.xsl 2018-03-28 4.0 kB
txm-posttok-structure2wordAtt.xsl 2018-03-28 2.7 kB
txm-posttok-unbreakWords.xsl 2018-03-28 3.2 kB
txm-split-xces-ids-corpus2text.xsl 2018-03-28 2.5 kB
txm-front-idsHeader2textAtt.xsl 2018-03-28 3.0 kB
txm-front-teiHeader2textAtt.xsl 2017-10-04 5.5 kB
txm-split-teicorpus.xsl 2017-10-04 2.9 kB
txm-rename-files-no-dots.xsl 2017-10-04 2.4 kB
ts2xmlw.xsl 2016-07-19 2.6 kB
txm-filter-bnc_oral-xmlw.xsl 2016-04-18 6.4 kB
txm-filter-perseustreebank-xmlw.xsl 2016-01-21 2.8 kB
filter-number-act-scene-line.xsl 2015-12-21 2.2 kB
txm-filter-corpusakkadien-xmlw_syllabes-cuneiform.xsl 2015-10-29 7.6 kB
txm-edition-xtz-cuneiform.xsl 2015-10-25 15.2 kB
txm-edition-xtz.xsl 2015-09-30 14.3 kB
txm-edition-page-split.xsl 2015-09-30 7.1 kB
txm-edition-xtz-corpusakkadien-translit.xsl 2015-08-03 7.8 kB
txm-filter-corpusakkadien-xmlw_mots_effaces.xsl 2015-02-23 3.6 kB
txm-filter-corpusakkadien-xmlw_syllabes.xsl 2015-02-23 4.7 kB
txm-filter-teibvh-xmlw.xsl 2014-03-10 31.6 kB
txm-filter-teibvh-xmlw-posttok.xsl 2014-03-10 19.5 kB
txm-filter-teiperseus-xmlw.xsl 2014-02-27 2.4 kB
txm-filter-rnc-xmlw.xsl 2014-02-27 1.2 kB
txm-filter-qgraal_cm-xmlw.xsl 2014-02-17 9.3 kB
txm-edition-xmltxm-textgrid.xsl 2014-02-03 14.0 kB
txm-filter-teitextgrid-xmlw-posttok.xsl 2014-02-03 5.4 kB
txm-filter-teicorpustextgrid-xmlw.xsl 2014-02-03 4.2 kB
txm-filter-teip5-xmlw-simplify.xsl 2013-12-31 8.1 kB
txm-filter-teip5-xmlw-preserve.xsl 2013-12-31 4.1 kB
txm-filter-teip5-teibfm.xsl 2013-12-31 14.0 kB
txm-filter-teifrantext-xmlw.xsl 2013-12-31 3.8 kB
txm-filter-teifrantext-teibfm.xsl 2013-12-31 13.9 kB
filter-keep-only-select.xsl 2013-12-31 3.3 kB
filter-out-sp.xsl 2013-12-07 1.8 kB
filter-out-p.xsl 2013-12-07 1.8 kB
txm-filter-teibrown-xmlw.xsl 2013-07-10 5.0 kB
p4top5_perseus.xsl 2012-12-03 25.9 kB
Totals: 41 Items   312.3 kB

TXM XSLT IMPORT PROCESSING LIBRARY

This is a collection of XSLT (1.0 or 2.0) stylesheets that can be used to prepare various types of XML documents for import into TXM. With the XTZ+CSV import module, place them in the appropriate xsl/step subfolder. With the XML/w+CSV import module, use the "Front XSLT" option in the import parameters interface to select the appropriate filter.

Filters are usually named according to the following pattern: txm-filter-[input format]-[import module](-[option])?

Stylesheets for use with the XML TEI Zero+CSV (XTZ) import module

1-split-merge step

Due to a bug in TXM 0.7.8 and 0.7.9, this processing step does not work properly. The stylesheets listed below must be applied before the import, using the ExecXSL macro or any other XSLT 2.0 processor.

txm-rename-files-no-dots.xsl

This stylesheet is designed for the TXM XTZ+CSV import module. It replaces dots with underscores in source file names. (A bug in TXM 0.7.8 prevented files with dots in their names from being imported; this bug was resolved in TXM 0.7.9.)

txm-split-teicorpus.xsl

This stylesheet may be used to split a single file containing a teiCorpus into individual files for each TEI child.
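
The splitting described above can be sketched as follows. This is a minimal illustration, not the content of txm-split-teicorpus.xsl: it assumes the source uses the TEI P5 namespace, and the outputDir parameter and the file-naming rule (xml:id if present, otherwise a counter) are hypothetical choices. It requires an XSLT 2.0 processor because it relies on xsl:result-document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <xsl:output method="xml" encoding="UTF-8" indent="no"/>

  <!-- Hypothetical parameter: folder that receives the split files -->
  <xsl:param name="outputDir" select="'split'"/>

  <xsl:template match="/tei:teiCorpus">
    <xsl:for-each select="tei:TEI">
      <!-- Assumption: name each file after @xml:id, else text1, text2, ... -->
      <xsl:variable name="name"
          select="(@xml:id, concat('text', position()))[1]"/>
      <!-- Write each TEI child, unchanged, to its own file -->
      <xsl:result-document href="{$outputDir}/{$name}.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that this sketch drops the teiHeader of the teiCorpus itself; a real split may need to carry corpus-level metadata into each output file.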

2-front step

txm-front-teiHeader2textAtt.xsl

This stylesheet may be customized to extract metadata from the teiHeader and create the corresponding attributes on the text element.
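
As an illustration of this kind of customization, the sketch below copies the document unchanged and adds two attributes to the text element. It is not the content of txm-front-teiHeader2textAtt.xsl: the TEI P5 namespace, the attribute names title and author, and the header paths they are read from are all assumptions chosen for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="tei:text">
    <!-- Assumption: metadata lives in the titleStmt of the enclosing TEI -->
    <xsl:variable name="stmt"
        select="ancestor::tei:TEI[1]/tei:teiHeader/tei:fileDesc/tei:titleStmt"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Hypothetical attribute names: adapt to the metadata you need -->
      <xsl:attribute name="title">
        <xsl:value-of select="normalize-space($stmt/tei:title[1])"/>
      </xsl:attribute>
      <xsl:attribute name="author">
        <xsl:value-of select="normalize-space($stmt/tei:author[1])"/>
      </xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Attributes are written before child nodes because XSLT does not allow adding an attribute to an element once it already has content.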

txm-front-teitxm2xmlw.xsl

This stylesheet may be used to import TEI-TXM XML files with the XML-TEI Zero+CSV (or XML/w+CSV) import module. This module is more flexible than the XML-TEI TXM module: it allows re-tokenizing the texts, selecting and renaming annotations, and building synoptic editions.

3-posttok step

txm-posttok-addRef.xsl

This stylesheet may be customized to add a ref attribute to w elements, which will be used as the default reference in TXM concordances.
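
A possible shape for such a customization is sketched below. It is an illustration only, not the content of txm-posttok-addRef.xsl. Elements are matched by local name to avoid assuming a namespace, and the reference format (the id of the text followed by the number of the last page break) is a hypothetical choice.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*[local-name() = 'w']">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Assumption: text carries @id and pages are marked by pb/@n -->
      <xsl:attribute name="ref">
        <xsl:value-of select="concat(
            ancestor::*[local-name() = 'text'][1]/@id,
            ', p. ',
            preceding::*[local-name() = 'pb'][1]/@n)"/>
      </xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The preceding axis is evaluated for every word, which can be slow on large files; a key or an accumulated variable would be a reasonable optimization.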

txm-posttok-unbreakWords.xsl

This stylesheet may be customized to reunite words that were broken by the primary tokenization process (because of line or page breaks, for instance).

txm-posttok-structure2wordAtt.xsl

This stylesheet projects the number of nested selected ancestor elements onto attributes of the w element. Enter the element names, separated by |, as the value of the elementsToProject parameter.
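
One reading of this projection can be sketched as follows, under stated assumptions. The parameter name elementsToProject comes from the description above, but its default value, the naming of the generated attributes (element name plus the suffix n), and the counting rule are hypothetical; the actual stylesheet may differ. The sketch uses XSLT 2.0 (tokenize and the select attribute of xsl:attribute).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Element names separated by | (the default here is an example) -->
  <xsl:param name="elementsToProject" select="'div|p'"/>

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*[local-name() = 'w']">
    <xsl:variable name="word" select="."/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each select="tokenize($elementsToProject, '\|')">
        <xsl:variable name="name" select="."/>
        <!-- Hypothetical naming: divn="2" for a word inside two nested divs -->
        <xsl:attribute name="{$name}n"
            select="count($word/ancestor::*[local-name() = $name])"/>
      </xsl:for-each>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```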

4-edition step

1-default-html.xsl

This is an alternative stylesheet for creating default editions with the XTZ module. It transforms every TEI element into an HTML span with @class. This stylesheet must be used in conjunction with 2-default-pager.xsl.
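
The core idea, turning each element into a span whose class is the element name, can be sketched in a few lines. This is a simplified illustration rather than the content of 1-default-html.xsl, which is considerably longer and presumably handles attributes, page breaks and special cases that this sketch ignores.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Every element becomes a span; its local name becomes the class -->
  <xsl:template match="*">
    <span class="{local-name()}">
      <xsl:apply-templates/>
    </span>
  </xsl:template>

  <!-- Text nodes are copied by the built-in template rule -->

</xsl:stylesheet>
```

With this approach the rendering is controlled entirely by CSS rules attached to the class names, for example a rule on span.head for headings.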

2-default-pager.xsl

This stylesheet must be used in conjunction with 1-default-html.xsl to create edition pages.

Basic stylesheets for filtering XML sources

filter-keep-only-select.xsl

This stylesheet may be customized to filter out all the text and tags except the content of the specified element (select by default) and its ancestors.
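
The behaviour described above can be sketched with three templates. This is an illustration, not the content of filter-keep-only-select.xsl; it assumes the element to keep is named select and is in no namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The selected element is copied with all of its content -->
  <xsl:template match="select" priority="1">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Ancestors of a selected element: keep the tag and its attributes,
       then descend into child elements only (their own text is dropped) -->
  <xsl:template match="*[descendant::select]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Every other element is removed together with its content -->
  <xsl:template match="*"/>

</xsl:stylesheet>
```

The explicit priority on the first template ensures that a select element nested inside another select is still copied whole instead of being treated as an ancestor.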

filter-out-p.xsl

This stylesheet may be customized to filter out any particular xml element (p by default) and its content from the source document.

filter-out-sp.xsl

This stylesheet may be customized to filter out any particular xml element with a specific attribute value (sp with an attribute who with the value 'enqueteur' by default) and its content from the source document.
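
Both filter-out stylesheets above follow a common XSLT idiom: an identity transform plus an empty template for whatever should disappear. The sketch below shows the idiom with the default targets named in the descriptions; it is not a copy of either file, and it assumes the elements are in no namespace.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- As in filter-out-p.xsl: drop every p element and its content -->
  <xsl:template match="p"/>

  <!-- As in filter-out-sp.xsl: drop sp elements spoken by 'enqueteur' -->
  <xsl:template match="sp[@who = 'enqueteur']"/>

</xsl:stylesheet>
```

To target another element or attribute value, only the match patterns of the empty templates need to change. For namespaced sources such as TEI P5, declare the namespace prefix and use it in the patterns.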

Basic stylesheets for adapting XML TEI P5 sources

txm-filter-teip5-teibfm.xsl

This stylesheet may be customized for use with any TEI P5 document in the TEI BFM import module. Note that this module is experimental and may fail on documents that do not follow the BFM encoding guidelines.

txm-filter-teip5-xmlw-preserve.xsl

This stylesheet may be customized for use with any TEI P5 document in the XML/w+CSV import module. By default, it eliminates teiHeader and facsimile elements and their contents and preserves all other elements.

txm-filter-teip5-xmlw-simplify.xsl

This stylesheet may be customized for use with any TEI P5 document in the XML/w+CSV import module. By default, it eliminates teiHeader, facsimile and all note elements and their contents, and filters out all tags in the text body except ab, body, div, front, lb, p, pb, s, TEI, text and w.
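
The default behaviour described above can be sketched as three rules: delete some elements with their content, keep a list of tags, and unwrap everything else so that only its text survives. This is an illustration that assumes the TEI P5 namespace; it is not the content of txm-filter-teip5-xmlw-simplify.xsl.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- Removed together with their content -->
  <xsl:template match="tei:teiHeader | tei:facsimile | tei:note"/>

  <!-- Kept tags: copy the element and its attributes, then recurse -->
  <xsl:template match="tei:ab | tei:body | tei:div | tei:front | tei:lb
                       | tei:p | tei:pb | tei:s | tei:TEI | tei:text | tei:w">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Any other element: drop the tag but keep processing its content -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

The named patterns take precedence over the generic * pattern because of XSLT's default template priorities, so no explicit priority attribute is needed.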

Additional stylesheets for particular corpora

p4top5_perseus.xsl

This stylesheet is needed to convert Perseus TEI P4 files to TEI P5 prior to any import process.

txm-edition-page-split.xsl

This stylesheet should be used to create separate HTML pages for TXM editions.

txm-edition-xmltxm-textgrid.xsl

This stylesheet should be used to customize TXM editions of DARIAH-DE Textgrid texts.

txm-edition-xtz-corpusakkadien-translit.xsl

This stylesheet should be used to customize transliterated TXM editions of cuneiform Akkadian tablets. See the project wiki for more details.

txm-edition-xtz-cuneiform.xsl

This stylesheet should be used to create cuneiform TXM editions of Akkadian tablets. See the project wiki for more details.

txm-filter-corpusakkadien-xmlw_syllabes-cuneiform.xsl

This stylesheet should be used on a corpus of Akkadian tablets with the XML/w+CSV import module. See the project wiki for more details.

txm-filter-perseustreebank-xmlw.xsl

This filter should be used on the Perseus Treebank corpus texts with the XML/w+CSV import module.

txm-filter-qgraal_cm-xmlw.xsl

This stylesheet should be used on the diffracted format of the Queste del Saint Graal source files with the XML/w+CSV import module.

txm-filter-rnc-xmlw.xsl

This filter should be used on the Russian National Corpus texts with the XML/w+CSV import module.

txm-filter-teibrown-xmlw.xsl

This filter should be used on the TEI Brown corpus texts with the XML/w+CSV import module.

txm-filter-teibvh-xmlw.xsl

This filter should be used on the TEI BVH texts with the XML/w+CSV import module.

txm-filter-teibvh-xmlw-posttok.xsl

This stylesheet should be used to fix tokenization errors and to adjust word properties in the tokenized version of TEI BVH texts.

txm-filter-teicorpustextgrid-xmlw.xsl

This stylesheet should be used to prepare DARIAH-DE teiCorpus XML files for the TXM XML/w+CSV import process.

txm-filter-teifrantext-teibfm.xsl

This filter should be used on TEI Frantext texts with the TEI BFM import module. It is automatically applied in the TEI Frantext import module. Note that this module is experimental and may fail on documents that do not follow BFM encoding guidelines.

txm-filter-teifrantext-xmlw.xsl

This stylesheet should be used on TEI Frantext texts with the XML/w+CSV import module.

txm-filter-teiperseus-xmlw.xsl

This filter should be used on the TEI Perseus corpus texts with the XML/w+CSV import module (after conversion to TEI P5).

txm-filter-teitextgrid-xmlw-posttok.xsl

This stylesheet should be used to adjust word properties in the tokenized version of DARIAH-DE Textgrid texts.

txm-front-idsHeader2textAtt.xsl

This stylesheet may be used to project metadata from the idsHeader (Mannheim German Language Institute corpus, IDS-XCES schema) onto text attributes.

txm-split-xces-ids-corpus2text.xsl

This stylesheet splits a single file of an XCES-IDS corpus (Mannheim German Language Institute corpus) into one file per text for the TXM XTZ import module. It is designed for the 1-split-merge step, which is currently buggy, so it should be applied prior to the import process.

Please address any enquiries about the TXM XSLT library to textometrie@groupes.renater.fr

Source: README.markdown, updated 2019-07-10