
#4 String concatenation bug, and GID feature request

Status: open
Owner: nobody
Labels: None
Priority: 3
Updated: 2004-12-04
Created: 2004-12-04
Private: No

The latest day to reach the bound volume edition
(2004-05-27) has exposed a bug: the parser concatenates
two words when a new column number falls between them.

For example, see "amatter" in
http://www.theyworkforyou.com/debate/?id=2004-05-27.1708.2
compared with "a matter" in
http://www.publications.parliament.uk/pa/cm200304/cmhansrd/vo040527/debtext/40527-04.htm#40527-04_spnew9
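
One way this kind of bug can arise is sketched below. It is only an illustration: the `<column number="..."/>` marker and both function names are made up for the example, and the real parser's handling of column numbers is not shown in this ticket. The sketch assumes the marker is being deleted along with the whitespace around it, and shows the marker being replaced by a single space instead.

```python
import re

# Hypothetical column marker. The real Hansard markup and the real parser's
# treatment of it are assumptions made for this illustration.
COLUMN_MARKER = re.compile(r'\s*<column number="\d+"/>\s*')


def strip_markers_buggy(text):
    # Deleting the marker together with its surrounding whitespace
    # joins the words on either side of it.
    return COLUMN_MARKER.sub("", text)


def strip_markers_fixed(text):
    # Replace the marker with one space, then collapse any runs of
    # whitespace so the words stay separate.
    without_markers = COLUMN_MARKER.sub(" ", text)
    return re.sub(r"\s+", " ", without_markers).strip()


sample = 'it is a <column number="1708"/> matter for the House'
print(strip_markers_buggy(sample))  # it is amatter for the House
print(strip_markers_fixed(sample))  # it is a matter for the House
```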

This day also prompted a feature request, which I
began to try to implement, but failed utterly :( If the
columns have been subtly reflowed, so that in a large
number of columns one speech has moved from the beginning
of a column to the end of the previous one, could the
parser not spot that the text and speaker are identical
to the next speech in the old version, and simply update
the GID accordingly? As it is, I have to enter many, many
<stamp parsemess-colnum="..."/> entries, and I'd rather
not. :)

As an example, the GIDs in the daily edition went:
  1706.5 1706.6 1707.0 1707.1 1707.2 1707.3
whereas in the new volume edition they went:
  1706.5 1706.6 1706.7 1706.8 1707.0 1707.1
but the content was identical.
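
The matching I have in mind could look roughly like the sketch below. It assumes each speech is available as a (gid, speaker, text) tuple in document order, which may not be how the parser stores them. `realign_gids` is a hypothetical name, and the speakers and texts in the example are placeholders, not the real content of that day. Only the GIDs are taken from the example above.

```python
def realign_gids(old, new):
    """Map each old GID to the new GID of the identical speech.

    old, new: lists of (gid, speaker, text) tuples in document order.
    Speeches are matched in order, so a repeated speech is paired with
    its next unused occurrence rather than an earlier one.
    """
    mapping = {}
    start = 0  # first position in `new` not yet matched
    for gid, speaker, text in old:
        for k in range(start, len(new)):
            if new[k][1] == speaker and new[k][2] == text:
                mapping[gid] = new[k][0]
                start = k + 1
                break
        # An old speech with no identical counterpart is left unmapped
        # and would still need a manual stamp.
    return mapping


# Placeholder speakers and texts; only the GIDs come from the ticket.
speeches = [("Speaker %d" % i, "Text %d" % i) for i in range(6)]
daily_gids = ["1706.5", "1706.6", "1707.0", "1707.1", "1707.2", "1707.3"]
volume_gids = ["1706.5", "1706.6", "1706.7", "1706.8", "1707.0", "1707.1"]
daily = [(g, s, t) for g, (s, t) in zip(daily_gids, speeches)]
volume = [(g, s, t) for g, (s, t) in zip(volume_gids, speeches)]

mapping = realign_gids(daily, volume)
print(mapping["1707.0"])  # 1706.7
```

Matching on exact equality of speaker and text is the simplest choice. It would miss speeches whose text was corrected between editions, so those would fall back to the manual stamps.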
