#39 XML entities with underscores not fully recognized

closed-fixed
None
5
2010-10-12
2010-10-09
Ewald Arnold
No

The HTML parser only accepts a-z in entity names, so entities in DocBook (XML) files like

&sect_abc_def;

appear in suggestions as "abc" and "def".

My quick hack changes the state only after the next semicolon. I have no idea about further consequences, but it now works as I expected :-)

Index: src/parsers/htmlparser.cxx

--- src/parsers/htmlparser.cxx (revision 20394)
+++ src/parsers/htmlparser.cxx (working copy)
@@ -138,12 +138,12 @@
token = head;
} else if (line[actual][head] == '&') {
state = ST_CHAR_ENTITY;
- }
+ }
break;
case ST_CHAR_ENTITY: // SGML element
- if ((tolower(line[actual][head]) < 'a') || (tolower(line[actual][head]) > 'z')) {
+ if ((tolower(line[actual][head]) == ';')) {
state = prevstate;
- head--;
+ /* head--; */
}
}
if (next_char(line[actual], &head)) return NULL;
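
One possible consequence of waiting for the semicolon is that a stray '&' with no terminating ';' would keep the parser in the entity state until some later semicolon. A more conservative variant, sketched below, accepts only typical name characters and treats the run as an entity only if it ends in ';'. This is a standalone illustration, not the code that was checked in: `skip_entity` is a hypothetical helper, and the accepted character set is an assumed subset of what XML allows in names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the parser): given text[pos] == '&',
// return the index just past a well-formed entity reference such as
// "&sect_abc_def;" or "&#169;". If no such reference starts at pos,
// return pos + 1 so that only the '&' itself is skipped.
std::size_t skip_entity(const std::string& text, std::size_t pos) {
    std::size_t i = pos + 1;

    // Allow the '#' that introduces a numeric character reference.
    if (i < text.size() && text[i] == '#') ++i;
    const std::size_t name_start = i;

    // Assumed subset of XML name characters: letters, digits, '_', '-', '.', ':'.
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (!(std::isalnum(c) || c == '_' || c == '-' || c == '.' || c == ':')) break;
        ++i;
    }

    // Accept only a non-empty name that is terminated by ';'.
    if (i > name_start && i < text.size() && text[i] == ';') return i + 1;
    return pos + 1;
}
```

With this rule, "&sect_abc_def;" is skipped as a whole, while the '&' in "AT&T rocks; ok" is not treated as the start of an entity.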

Discussion

  • Seems basically reasonable; checked the relevant bit into CVS.

     
    • assigned_to: nobody --> caolan
    • status: open --> closed-fixed