Thanks Wolfgang, that's what we ended up doing, and you are right that we should probably upgrade to a more modern version of eXist.
We are using it in combination with Orbeon 4.2. Which eXist version would you recommend to go with that?

We have now re-run the reindex and got 4 exist.log files (1-3) of size 5244330 bytes, containing the following. Is this something to be concerned about?

2013-11-27 12:44:48,971 [http-apr-8080-exec-3] ERROR ( [flush]:246) - not a data-page: 0 not a data-page: 0

This indicates a problem in the structural index, which may be an effect of earlier problems. If you hit this, I would suggest stopping eXist and removing all secondary indexes (everything except dom.dbx, symbols.dbx, and collections.dbx) before you restart and reindex.
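
For anyone who wants to script the cleanup step, here is a minimal sketch in Python. It is not part of eXist: the function name and both directory arguments are hypothetical, and it assumes that the secondary indexes are the remaining *.dbx files sitting directly in the data directory. It moves them into a backup directory instead of deleting them, so they can be restored if needed. Run it only while eXist is stopped.

```python
from pathlib import Path
import shutil

# Core files named in the advice above; they must stay in place.
CORE_FILES = {"dom.dbx", "symbols.dbx", "collections.dbx"}


def set_aside_secondary_indexes(data_dir, backup_dir):
    """Move every *.dbx file except the core ones from data_dir to backup_dir.

    Assumption: secondary indexes are the other *.dbx files at the top level
    of data_dir. Files with other extensions are left untouched.
    Returns the sorted list of file names that were moved.
    """
    data_dir = Path(data_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in sorted(data_dir.glob("*.dbx")):
        if path.name in CORE_FILES:
            continue  # keep dom.dbx, symbols.dbx, collections.dbx
        shutil.move(str(path), str(backup_dir / path.name))
        moved.append(path.name)
    return moved
```

Call it with the path of your own data directory and a backup location of your choice, check the returned list against what you expected to be moved, then restart eXist and reindex.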

In any case, upgrading to 1.4.3 is highly recommended. 1.4.1 is known to have some indexing bugs, which were fixed a long time ago. Your issue reminds me of them.