Extracts text from corrupt DOCX files where Word itself fails.
This GUI program extracts text from damaged or corrupted Word files in the newer docx format in cases where Word itself fails to open them.
Docx files are actually zipped collections of XML files. XML as a format is unforgiving of data corruption. The main text of a docx file is stored in the document.xml file within that collection. Damaged docx2txt uses 7Zip, an unzipper that can sometimes extract a partially corrupt document.xml file even while reporting an error.
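To illustrate why a partially corrupt archive can still give up some of its text, here is a minimal Python sketch. It is not the program's own code: the program relies on 7Zip, whereas this sketch swaps in Python's standard zipfile and zlib modules. The function names (make_sample_docx, read_document_xml, inflate_tolerantly, salvage_member) are hypothetical, and the sketch assumes the usual docx layout in which the main text lives at word/document.xml.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG = b"PK\x03\x04"          # signature that starts every local file header
LOCAL_HEADER = "<4sHHHHHIIIHH"     # fixed 30-byte part of a local file header


def make_sample_docx(paragraphs: int = 2000) -> bytes:
    """Build a tiny stand-in docx (hypothetical helper, for the demo only)."""
    body = "".join(
        f"<w:p><w:r><w:t>Paragraph number {i}</w:t></w:r></w:p>"
        for i in range(paragraphs)
    )
    xml = f"<w:document><w:body>{body}</w:body></w:document>"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", xml)
    return buffer.getvalue()


def read_document_xml(docx_bytes: bytes) -> bytes:
    """Strict read: raises zipfile.BadZipFile when the archive is damaged."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read("word/document.xml")


def inflate_tolerantly(raw: bytes, chunk_size: int = 1024) -> bytes:
    """Inflate a raw deflate stream, keeping whatever decodes before an error."""
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)
    out = bytearray()
    for i in range(0, len(raw), chunk_size):
        try:
            out += inflater.decompress(raw[i:i + chunk_size])
        except zlib.error:
            break  # corrupt data: stop, but keep what was already recovered
        if inflater.eof:
            break  # clean end of the compressed stream
    return bytes(out)


def salvage_member(data: bytes, name: str = "word/document.xml") -> bytes:
    """Scan local file headers directly, ignoring the central directory."""
    target = name.encode("utf-8")
    pos = data.find(LOCAL_SIG)
    while pos != -1 and pos + 30 <= len(data):
        fields = struct.unpack_from(LOCAL_HEADER, data, pos)
        method, csize, name_len, extra_len = fields[3], fields[7], fields[9], fields[10]
        start = pos + 30 + name_len + extra_len
        if data[pos + 30:pos + 30 + name_len] == target:
            if method == 0:   # stored without compression
                return data[start:start + csize]
            if method == 8:   # deflate
                return inflate_tolerantly(data[start:])
            return b""        # other methods are not handled in this sketch
        pos = data.find(LOCAL_SIG, pos + 4)
    return b""
```

Cutting the sample archive in half makes read_document_xml fail, while salvage_member still returns the leading part of document.xml. That partial, no longer well-formed XML is the kind of input the text extraction step has to cope with.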
Additionally, the Perl routine used to extract the text from the document.xml file does not care whether the XML is well-formed, which is the stumbling block for Word 2007 and 2010.
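The idea of pulling text out of XML without requiring it to be well-formed can be sketched as follows. This is a Python illustration under stated assumptions, not the program's actual Perl routine: it assumes the text sits in w:t elements and that paragraphs end with a closing w:p tag, as in standard docx markup, and the names extract_text, TEXT_RUN and PARAGRAPH_END are hypothetical.

```python
import html
import re

# End-of-paragraph marker in docx markup.
PARAGRAPH_END = re.compile(r"</w:p>")
# A text run: <w:t> or <w:t attr="...">, but not <w:tab/> or <w:tbl>.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>(.*?)</w:t>", re.DOTALL)


def extract_text(xml: str) -> str:
    """Collect text runs with regular expressions instead of an XML parser,
    so truncated or otherwise malformed markup does not abort the extraction."""
    paragraphs = []
    for chunk in PARAGRAPH_END.split(xml):
        runs = TEXT_RUN.findall(chunk)
        if runs:
            # Decode entities such as &amp; back into plain characters.
            paragraphs.append(html.unescape("".join(runs)))
    return "\n".join(paragraphs)
```

Given a document.xml that breaks off in the middle of its third paragraph, a strict XML parser rejects the whole file, while extract_text still returns the first two paragraphs and simply drops the unfinished run.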
Recent changes include pretreating docx files with InfoZip's zip.exe -FF repair command, which improves success rates. Also added are links to the commercial WordFix, which I, the author, recommend in case this program fails, and a link to an upload page where the user can send the file to me for manual repair for only $22.
It couldn't fix ALL my files, but it certainly went a long way towards recovering everything it possibly could. Very happy with it.