docx2txt


Perl-based utility to extract formatted text content from MS Docx files

Download: docx2txt-1.4.tgz
Platforms: Windows, Mac, Linux



Docx2txt is a Perl-based command-line utility that converts Microsoft docx documents, even corrupted ones, to reasonably formatted text files, applying appropriate character conversions along the way. Apart from Perl, it requires a command-line unzipping program such as unzip, 7z, pkzipc, or wzunzip.
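The unzip requirement follows from the file format: a docx file is a zip archive whose main text lives in the member `word/document.xml`. The sketch below illustrates that idea in Python using only the standard library. It is not docx2txt's code and does none of its formatting, list handling, or damage recovery; the function names `docx_to_text` and `make_minimal_docx` are hypothetical and exist only for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the text of a docx file, one output line per paragraph.

    `source` is a file path or a binary file-like object.
    Hypothetical helper for illustration; not part of docx2txt.
    """
    # Step 1: the "unzip" step. Read the main document part from the archive.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    # Step 2: walk the XML and collect text runs, tabs, and line breaks.
    root = ET.fromstring(xml_bytes)
    lines = []
    for para in root.iter(W + "p"):
        parts = []
        for node in para.iter():
            if node.tag == W + "t":
                parts.append(node.text or "")
            elif node.tag == W + "tab":
                parts.append("\t")
            elif node.tag == W + "br":
                parts.append("\n")
        lines.append("".join(parts))
    return "\n".join(lines)


def make_minimal_docx(paragraphs):
    """Build a tiny in-memory archive holding only word/document.xml.

    Enough for trying out docx_to_text; not a complete, Word-valid docx.
    """
    body = "".join(
        "<w:p><w:r><w:t>{}</w:t></w:r></w:p>".format(escape(p))
        for p in paragraphs
    )
    xml = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>'
        + body
        + "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer
```

For example, `docx_to_text(make_minimal_docx(["First", "Second"]))` yields the two paragraphs on separate lines. A real converter such as docx2txt goes well beyond this, for instance by handling lists, hyperlinks, and character conversions.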



  • Consists of a core Perl script, Unix/Windows wrapper shell scripts, and a configuration file, with provision for a separate system-wide configuration file and individual user-level configuration files.
  • The Perl script also works with input/output redirection, which makes it useful for viewing docx file content directly in editors like vim and emacs and in file browsers like mc (Midnight Commander).
  • Can recover text from damaged docx documents in many cases.
  • Supports short-line justification, hyperlink display, and many character conversions (missing in MS text conversion).
  • Handles bullet, decimal, letter, and roman lists, along with indentation.
  • Installs via Makefiles or a Windows batch file. On non-Windows systems, the scripts and the configuration file can be installed in separate directories.
  • Can conveniently be used to build a web-based docx document conversion service.



User Reviews

  • Docx2txt works perfectly.

    Posted 05/13/2013
  • Very useful project!

    Posted 06/01/2012
  • This is an excellent extractor of text from docx files. If you use CakeCMD or No-Frills Command Unzipper to unzip the docx files, it will even extract text from corrupt docx files. This works well in a CGI script providing a text extraction web service of even corrupt docx files. See my instance at

    Posted 09/24/2009

Additional Project Details

Intended Audience

End Users/Desktop

Programming Language

Perl, Unix Shell


