
#216 hfst-ospell segfaults on empty description element in zhfst index file when built with TinyXML2

future
open
None
1
2014-09-14
2013-12-11
sjurum
No

Steps to reproduce:

  1. build a recent version of hfst-ospell (svn HEAD is fine) with zhfst support and TinyXML2 as the XML parser, then install it
  2. build libvoikko with hfst support, then install it
  3. edit the file $GTHOME/langs/GTLANG/tools/spellcheckers/fstbased/hfst/index.xml so that the description element is empty.
  4. build the zhfst file
  5. run voikkospell using that zhfst file

Expected result: voikkospell should function normally.

Actual result: voikkospell crashes with a segmentation fault:

:::gdb
602 else if (strcmp(info->Name(), "description") == 0)
(gdb)

Breakpoint 1, 0x00007fff86c11c00 in strlen ()
(gdb)
Single stepping until exit from function strlen,
which has no line number information.

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x00007fff86c11c00 in strlen ()
(gdb)

This happens around line 602 in the function ZHfstOspellerXmlMetadata::parse_info in ZHfstOspellerXmlMetadata.cc, or in the function ZHfstOspellerXmlMetadata::parse_description in the same file.
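
The null address in the strlen frame is consistent with a null C string reaching strlen or strcmp. TinyXML2's XMLElement::GetText() returns a null pointer when an element has no text child, so an empty description element is one plausible source of that pointer. The following is a minimal sketch of a guard, not the actual hfst-ospell code: FakeElement is a hypothetical stand-in for the parser's element type, and required_text, MetadataParseError and self_check are illustrative names that do not exist in hfst-ospell or TinyXML2.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an XML element; NOT the TinyXML2 API.
// `text` is a null pointer when the element is empty, mirroring what
// tinyxml2::XMLElement::GetText() returns for an element with no text child.
struct FakeElement {
    const char* name;  // assumed non-null in this sketch
    const char* text;  // may be null
};

// Hypothetical error type for malformed metadata.
struct MetadataParseError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Returns the element text, or throws instead of handing a null pointer
// to strlen, strcmp or the std::string constructor.
std::string required_text(const FakeElement& e) {
    if (e.text == nullptr || e.text[0] == '\0') {
        throw MetadataParseError(std::string("empty <") + e.name + "> element");
    }
    return std::string(e.text);
}

// Small self-check: a filled element yields its text, an empty one throws.
void self_check() {
    FakeElement filled{"description", "example speller"};
    assert(required_text(filled) == "example speller");

    FakeElement empty{"description", nullptr};
    bool threw = false;
    try {
        required_text(empty);
    } catch (const MetadataParseError&) {
        threw = true;
    }
    assert(threw);
}
```

Whether an empty element should be rejected, as in this sketch, or silently treated as an empty string is a design choice; the sketch only shows that the null check has to happen before any C string function sees the pointer.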

Running hfst-ospell through libvoikko is necessary, since hfst-ospell itself doesn't parse the metadata (and has no means of displaying the metadata). It would be good to add a simple option to print metadata in the zhfst file, just to be able to verify the integrity of both hfst-ospell and the zhfst file.

Discussion

  • sjurum

    sjurum - 2013-12-11

    It is likely that the same error also exists when using libxml2++; the bug has been there for longer than the TinyXML2 support, IIRC.

     
  • sjurum

    sjurum - 2013-12-11

    Forget the text in the final paragraph about adding an option to print metadata - it is already there (= --verbose).

     
  • sjurum

    sjurum - 2013-12-12

    Forgot to mention the system: MacOSX 10.6 and 10.9.

     
  • Flammie Pirinen

    Flammie Pirinen - 2013-12-13

    Empty descriptions should now throw parsing errors in [r3643].

     

    Related

    Commit: [r3643]

  • sjurum

    sjurum - 2014-03-20

    I guess this bug report can be closed now.

     