Corrupt DOCX Salvager

Extracts text from corrupt DOCX files where Word itself fails.

7 Recommendations
338 Downloads (This Week)

Description

This GUI program extracts text from damaged or corrupted Word files in the docx format (Word 2007 and later) in cases where Word itself fails to open them.

Docx files are zipped collections of XML files, and XML is a format that is unforgiving of data corruption. The main text of a docx file is stored in the document.xml file within that collection. Damaged docx2txt uses 7Zip, an unzipper that can sometimes extract a partially corrupt document.xml file even though it reports an error.
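To illustrate the idea of keeping whatever part of document.xml can still be unzipped, here is a minimal Python sketch. It is not the program's own code (the program relies on 7Zip); the function name `read_member_lenient` and the chunk size are made up for this example, and it assumes the archive's directory is intact enough for Python's `zipfile` module to locate the entry, which in a docx is stored at `word/document.xml`.

```python
import io
import zipfile
import zlib


def read_member_lenient(docx_bytes, member="word/document.xml", chunk_size=4096):
    """Return as many bytes of `member` as decompress cleanly.

    Hypothetical helper: reads the entry in small chunks and stops at the
    first decompression or CRC error, keeping everything read so far.
    """
    recovered = bytearray()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        with archive.open(member) as stream:
            while True:
                try:
                    chunk = stream.read(chunk_size)
                except (zipfile.BadZipFile, zlib.error, EOFError):
                    # Corruption reached: keep the data recovered up to here.
                    break
                if not chunk:
                    break
                recovered.extend(chunk)
    return bytes(recovered)
```

A strict reader would discard the whole entry on the first error; reading in chunks keeps the leading portion, which is often where most of the text lives.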

Additionally, the Perl routine that extracts the text from the document.xml file does not require the XML to be well-formed, which is a stumbling block for Word 2007 and 2010.
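The following Python sketch shows one way text can be pulled out of document.xml without parsing it as XML. It is an assumption about the general technique, not a port of the program's Perl routine: the name `salvage_text` and both regular expressions are invented for this example. It relies on the fact that WordprocessingML stores visible text in `<w:t>` elements and closes each paragraph with `</w:p>`.

```python
import html
import re

# Paragraph boundaries: each paragraph ends with a closing </w:p> tag.
PARAGRAPH_END = re.compile(r"</w:p>")
# Text runs: <w:t> or <w:t xml:space="preserve">, but not <w:tab/> or <w:tbl>.
# The capture stops at the next tag or at the end of a truncated file.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>([^<]*)")


def salvage_text(xml_fragment):
    """Extract paragraph text from possibly truncated or malformed XML."""
    paragraphs = []
    for para in PARAGRAPH_END.split(xml_fragment):
        runs = TEXT_RUN.findall(para)
        if runs:
            # Decode entities such as &amp; that XML escapes in text content.
            paragraphs.append(html.unescape("".join(runs)))
    return "\n".join(paragraphs)
```

Because the function only looks for tag patterns, it still returns text when closing tags are missing or the file ends mid-element, cases in which a conforming XML parser would reject the whole document.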

Recent changes include pretreating docx files with the -FF repair option of InfoZip's zip.exe, which improves success rates. Also added is a link to the commercial WordFix, which I, the author, recommend if this program fails, along with a link to an upload page where users can send me the file for manual repair for only $22.

User Reviews

  • Posted by Nathan Glenn 2012-08-05

    It couldn't fix ALL my files, but it certainly went a long way towards recovering everything it possibly could. Very happy with it.

Additional Project Details

Languages

English

Intended Audience

Advanced End Users, Education, End Users/Desktop, Non-Profit Organizations, Religion, Science/Research

User Interface

Win32 (MS Windows)

Programming Language

Perl, Tcl

Registered

2009-03-31
