New project developer and admin, new version coming up!

Hello everybody,

My name is Arne Jans and I am a new contributor to the majix project. Troggan has promoted me to project admin, and I plan to create a project homepage and breathe some life into this project.

I have already created three mailing lists for open communication and have set up some categories for the bug tracker and the documentation area.

There is also a new version coming up soon: v1.3

The code is nearly complete and adds various new features and enhancements to the already excellent majix:

- Cleaner XML output: the elements for character properties are no longer repeated needlessly.
- The majix conversion process can be called from an XSLT stylesheet via Java binding (tested with SAXON 6.5.3). This makes it possible to run an XML transformation and insert converted RTF files into the resulting XML output on the fly.
- New DocScape template that makes it possible to generate XML suitable for print publishing via XMLTeX (support for XSL-FO for direct production of PDF files is planned).
- Recognition of various new character properties: underlined, double-underlined, subscript, superscript, condensed/expanded typesetting, font size.
- Handling of deleted text (compiler flag to either skip it or mark it up as <deleted>...</deleted>). Known issue: when deleted text is generated, deleted paragraphs produce newlines and deleted tables produce empty tables with empty paragraphs inside the cells.
- Tabs, line breaks, section breaks and page breaks are generated as entities (&tab;) or as empty elements (<tab/>). Tabs can be underlined, double-underlined, hidden and deleted.
- The generation of an XML element can be disabled by commenting out the corresponding tagmap definition in the template.
- Commonly used RTF special characters are encoded as Unicode (e.g. nonbreaking spaces, hyphens, quotes, list bullets).
- Compiler flag for disabling the output of the color elements (will soon be a configuration entry).
- Compiler flag for translating the Word stylesheets to the resulting character properties (will soon be a configuration entry).
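
To illustrate the Unicode encoding of RTF special characters mentioned in the list above, here is a minimal sketch. It is not the actual majix code: the class RtfSpecialChars and the method toXml are hypothetical names chosen for this example, and emitting numeric character references is an assumption about the output form. The control words and their Unicode code points themselves are the standard RTF ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the majix code base. */
class RtfSpecialChars {

    // RTF control word or symbol (without the leading backslash) -> Unicode character.
    private static final Map<String, Character> UNICODE = new LinkedHashMap<>();

    static {
        UNICODE.put("~", '\u00A0');         // nonbreaking space
        UNICODE.put("_", '\u2011');         // nonbreaking hyphen
        UNICODE.put("-", '\u00AD');         // optional (soft) hyphen
        UNICODE.put("bullet", '\u2022');    // list bullet
        UNICODE.put("lquote", '\u2018');    // left single quote
        UNICODE.put("rquote", '\u2019');    // right single quote
        UNICODE.put("ldblquote", '\u201C'); // left double quote
        UNICODE.put("rdblquote", '\u201D'); // right double quote
        UNICODE.put("endash", '\u2013');    // en dash
        UNICODE.put("emdash", '\u2014');    // em dash
    }

    /**
     * Returns an XML numeric character reference for a known control word,
     * or null if the control word is not in the table.
     * (Assumption: the output uses numeric references rather than raw characters.)
     */
    static String toXml(String controlWord) {
        Character c = UNICODE.get(controlWord);
        if (c == null) {
            return null;
        }
        return "&#x" + Integer.toHexString(c.charValue()).toUpperCase() + ";";
    }

    public static void main(String[] args) {
        // Print the reference generated for the RTF bullet control word.
        System.out.println(toXml("bullet"));
    }
}
```

A table-driven lookup like this keeps the mapping in one place, so further special characters can be added without touching the parser logic.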

Enhanced paragraph properties:
- Ability to generate a tab definition list for every paragraph. For this, the content of the paragraph element is split into a tabdefinition element and a parcontent element (see the docscape template for details). Global and document-default tab sizes are also supported and are included in the tabdeflists.
- Paragraph alignment: left, right, justified, centered, distributed.
- Paragraph indentation: left indent, right indent, first-line indent (in mm).
- Paragraph line spacing: at least or exactly (in points).
- List paragraphs: numbering style (numeric, alphanumeric), text before and after the numbering, count start, suppress numbering.
- List items: tab distance between item and numbering, plain-text numbering.
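
RTF stores measurements such as indents and line spacing in twips (1/20 of a point, 1/1440 of an inch), so values reported in mm or points have to be converted. The following is a rough sketch of that arithmetic only. The class RtfUnits and its method names are hypothetical and not taken from the majix sources, and how majix itself rounds or formats these values is not shown here.

```java
/** Hypothetical helper for illustration only; not part of the majix code base. */
class RtfUnits {

    private static final double TWIPS_PER_POINT = 20.0;   // 1 point = 20 twips
    private static final double TWIPS_PER_INCH = 1440.0;  // 1 inch = 72 points
    private static final double MM_PER_INCH = 25.4;

    /** Converts a twips value (e.g. a line-spacing amount) to points. */
    static double twipsToPoints(int twips) {
        return twips / TWIPS_PER_POINT;
    }

    /** Converts a twips value (e.g. a left indent) to millimetres. */
    static double twipsToMm(int twips) {
        return twips * MM_PER_INCH / TWIPS_PER_INCH;
    }

    public static void main(String[] args) {
        // A half-inch indent (720 twips) in mm, and 240 twips of line spacing in points.
        System.out.println(twipsToMm(720));
        System.out.println(twipsToPoints(240));
    }
}
```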

Enhanced table properties:
- Parsing and generation of cell borders (as the attribute borderwidth, measured in points).
- Cell margins and row margins.
- Cell vertical alignment (top, center, bottom).
- Cell width configurable in the template, in points or mm.

And of course some bugfixes:
- Fixed a bug in the handling of the "special" attribute in the majixt-template. The attribute now properly reflects the activation of an XmlGeneratorFunctor to fill the dynamic attributes of the result XML.
- Fixed some wrong declarations in the embedded DOCTYPE declarations of the templates.
- The RTF parser should now correctly parse not only Word-generated RTF files, but also files from WordPad, OpenOffice etc. (this still has to be tested).

My TODO list for the near future (hopefully before the release):
- Implement rowspan and colspan for merged table cells.
- Move the compiler flags to the config file.
- Fix some bugs in the automaton definition.
- Rework the documentation to cover the new features and enhancements.
- Test some more conversions of RTF files produced by other word processors.
- Make a project homepage for majix.

TODO for later:
- Implement an integrated GUI for creating and editing automaton definitions (it's hell to edit them by hand... :-). This will make it possible to define completely new XML structures reflecting the document.
- Better stylesheet parsing and generation (paragraph properties, stylesheet definition block in XML).
- Font table parsing.

I hope you find the enhancements and new features useful. Please let me know about any bugs you find and what you think of the new version. Developers who want to participate are also very welcome!

Best Regards,

Arne Jans

About me:
I am a student of computer science at the University of Dortmund, Germany, and work for the company QuinScape GmbH in Dortmund (German website: www.quinscape.de).
QuinScape GmbH is an active supporter of open-source development and the open-source community.

Posted by Arne Jans 2004-09-24
