From: GitHub <no...@gi...> - 2012-07-11 14:57:53
Branch: refs/heads/master
Home:   https://github.com/moses-smt/mosesdecoder

Commit: be1f959a1ab6f483107b90b63bb447287c8fd131
    https://github.com/moses-smt/mosesdecoder/commit/be1f959a1ab6f483107b90b63bb447287c8fd131
Author: Rico Sennrich <ric...@gm...>
Date:   2012-07-11 (Wed, 11 Jul 2012)

Changed paths:
  M scripts/recaser/train-recaser.perl

Log Message:
-----------
truecase corpus before training recaser gives better results in (small) test, and the code already had a placeholder for it.
(without truecasing, the recaser is more likely to uppercase words like "the" if they are often sentence-initial in the training corpus)

If people don't want the default behavior changed, I can disable truecasing by default and add a command line parameter to enable it.

Commit: bed4bc08ad3d0c223647ae91c7ddf4d640487068
    https://github.com/moses-smt/mosesdecoder/commit/bed4bc08ad3d0c223647ae91c7ddf4d640487068
Author: Rico Sennrich <ric...@gm...>
Date:   2012-07-11 (Wed, 11 Jul 2012)

Changed paths:
  M scripts/recaser/recase.perl

Log Message:
-----------
distortion limit for recaser should be 0

Compare: https://github.com/moses-smt/mosesdecoder/compare/cfc1a7167059...bed4bc08ad3d
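
The first log message argues that sentence-initial capitals mislead the recaser about words like "the". A minimal sketch of that idea follows, assuming a toy frequency-based truecaser. It is not the Moses truecaser or the code in train-recaser.perl; the function names and the tiny corpus are hypothetical and only illustrate why sentence-initial tokens are normalized before training.

```python
from collections import Counter, defaultdict


def train_casing_model(sentences):
    """Count the surface forms of each word, ignoring sentence-initial
    tokens, whose capitalization says nothing about the word's true case."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for token in tokens[1:]:
            counts[token.lower()][token] += 1
    return counts


def truecase(sentence, counts):
    """Replace the sentence-initial token with its most frequent form seen
    in non-initial positions; leave it unchanged if the word is unseen."""
    tokens = sentence.split()
    if not tokens:
        return sentence
    forms = counts.get(tokens[0].lower())
    if forms:
        tokens[0] = forms.most_common(1)[0][0]
    return " ".join(tokens)


# Hypothetical tokenized training corpus.
corpus = [
    "The recaser restores case .",
    "The decoder reads the input .",
    "Moses trains the model .",
    "Moses is a decoder .",
]
model = train_casing_model(corpus)
truecased = [truecase(s, model) for s in corpus]
```

In this toy corpus, "The" opens two sentences, so a recaser trained on the raw text would see the capitalized form as often as the lowercase one. After truecasing, only "the" remains, while "Moses" is left alone because it never appears in a non-initial position.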