Doc2Html is a command-line program that strips HTML files produced by Word (by opening a document and saving it as HTML), leaving plain text plus a minimum of HTML code. It also has a mode that converts data between different charsets: DOS, Windows-1250 and ISO-8859.
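
The two operations described above can be illustrated with a rough Python sketch. This is not Doc2Html's actual code or interface: the function names are hypothetical, and the codec choices (cp852 for "DOS", iso8859_2 for "ISO-8859") are assumptions, picked only because Windows-1250 is a Central European code page. The tag stripping is deliberately crude and drops every attribute, including link targets.

```python
import re

# Assumed codec names; the description only says "DOS", "Windows-1250", "ISO-8859".
CODECS = {"dos": "cp852", "win": "cp1250", "iso": "iso8859_2"}


def convert_charset(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode raw bytes from one charset to another (hypothetical helper)."""
    return data.decode(CODECS[src]).encode(CODECS[dst])


def strip_word_html(html: str) -> str:
    """Reduce Word-exported HTML to bare tags and text (hypothetical helper)."""
    # Remove comments, which Word uses for conditional blocks.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.S)
    # Remove embedded <style> and <xml> blocks together with their content.
    html = re.sub(r"<(style|xml)\b.*?</\1>", "", html, flags=re.S | re.I)
    # Remove namespaced Office tags such as <o:p> and </o:p>.
    html = re.sub(r"</?\w+:\w+[^>]*>", "", html)
    # Strip all attributes from the remaining opening tags.
    html = re.sub(r"<(\w+)\s[^>]*>", r"<\1>", html)
    return html
```

For example, under these assumptions `strip_word_html('<p class=MsoNormal>Hello<o:p></o:p></p>')` gives `<p>Hello</p>`, and `convert_charset(data, "win", "iso")` re-encodes Windows-1250 bytes as ISO-8859-2.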