#171 "Re-activate" output of word_count

Milestone: 1.6
Status: closed-fixed
Labels: None
Priority: 6
Updated: 2006-07-02
Created: 2006-03-10
Private: No

On older versions of OmegaT (it was there in 1.4.0),
just after the project was loaded, a
file "word_count" was created in the "omegat" folder.

This file gives the total word count of the project,
the total number of unique segments, plus the details
for each file.

The function was called buildWordCounts() in
CommandThread.java.
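
To make the contents of the report concrete, here is a minimal sketch of the three figures described above (total word count, number of unique segments, per-file details). It is not the original buildWordCounts() code: the class name, the method names, and the whitespace-based word rule are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a word_count-style report; not OmegaT code. */
final class WordCountReportSketch {

    /** Naive rule (an assumption): a word is a whitespace-separated token. */
    static int countWords(String segment) {
        String trimmed = segment.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Word count of each file, keyed by file name, in input order. */
    static Map<String, Integer> wordsPerFile(Map<String, List<String>> segmentsByFile) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : segmentsByFile.entrySet()) {
            int words = 0;
            for (String segment : entry.getValue()) {
                words += countWords(segment);
            }
            result.put(entry.getKey(), words);
        }
        return result;
    }

    /** Total word count of the project: the sum over all files. */
    static int totalWords(Map<String, List<String>> segmentsByFile) {
        int total = 0;
        for (int words : wordsPerFile(segmentsByFile).values()) {
            total += words;
        }
        return total;
    }

    /** Number of distinct segments across the whole project. */
    static int uniqueSegments(Map<String, List<String>> segmentsByFile) {
        Set<String> seen = new HashSet<>();
        for (List<String> segments : segmentsByFile.values()) {
            seen.addAll(segments);
        }
        return seen.size();
    }
}
```

A segment repeated in two files is counted twice in the total word count but only once among the unique segments.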

At some point, this function was commented out,
with the note "temporary removed".

In 1.6 RC6, it was still possible to reactivate this
function (only two imports were missing), but the
word count was far too low (it had something to do
with searches, I seem to remember).

This is no longer possible in RC8: too many functions
have changed.

Could it be possible to have this function back?

Didier

Discussion

  • Maxym Mykhalchuk

    • labels: 481012 -->
    • priority: 5 --> 6
    • assigned_to: nobody --> mihmax
    • status: open --> open-accepted
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    (clearly a regression, so marking it as a bug)

     
  • Didier Briel

    Didier Briel - 2006-03-10

    Logged In: YES
    user_id=1343245

    >clearly a regression, so marking it as a bug
    Great, I didn't dare... :)

    Didier

     
  • Maxym Mykhalchuk

    • milestone: --> 1.6
    • status: open-accepted --> open-fixed
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Fixed in CVS.

    Checking in omegat/src/org/omegat/Bundle.properties
    new revision: 1.52

    Checking in
    omegat/src/org/omegat/core/threads/CommandThread.java
    new revision: 1.33

    done

     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    OK, it was actually reactivated, but the fix does not work
    properly. Thanks a lot to Didier for testing...

    Here are the comments from our email conversation (DB
    marks Didier Briel's comments, which are most of them;
    MM is myself):
    DB> The word count is very low, compared to most
    DB> other counting methods

    MM>Current implementation of word_count indeed
    MM>does not count stop words.

    MM>BTW, what about "Here're 2 of them"? Is it
    MM>5 words ("Here", "re" (are),
    MM>"2", "of", "them"), or fewer?

    DB> It's 4 words in Word and Writer,
    DB> 3 in OmegaT 1.4.0. That's because OmegaT
    DB> doesn't count numbers.

    DB> OmegaT 1.4.0 is higher than OOo and Word,
    DB> but within reasonable range.
    DB> ...
    DB> Whatever the differences, between 1.4.0
    DB> and Word, I have always found them
    DB> to be reasonable.
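
The figures quoted for "Here're 2 of them" can be reproduced with a very simple model. The sketch below is only an illustration under stated assumptions: it treats a word as a whitespace-separated token and, in the second method, skips tokens made up only of digits. It is not the tokenizer used by Word, Writer, or any OmegaT version, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of two counting rules; not OmegaT's tokenizer. */
final class TokenCountSketch {

    /** Splits on whitespace; returns no tokens for blank input. */
    private static String[] tokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Counts every whitespace-separated token, numbers included. */
    static int countTokens(String text) {
        return tokens(text).length;
    }

    /** Counts tokens but skips those consisting only of digits. */
    static int countTokensSkippingNumbers(String text) {
        int count = 0;
        for (String token : tokens(text)) {
            if (!token.matches("\\d+")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "Here're 2 of them";
        System.out.println(countTokens(sample));                // prints 4
        System.out.println(countTokensSkippingNumbers(sample)); // prints 3
    }
}
```

Under these assumptions, "Here're" stays a single token, which gives 4 when numbers are counted and 3 when they are not.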

     
  • Maxym Mykhalchuk

    • status: open-fixed --> open-accepted
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Closing...
    This bug was fixed in OmegaT version 1.6.0 RC10 just released.

     
  • Maxym Mykhalchuk

    • summary: "Re-activate" output of word_count --> \"Re-activate\" output of word_count
    • status: open-accepted --> closed-accepted
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Oops, closed by mistake...

     
  • Maxym Mykhalchuk

    • status: closed-accepted --> open-accepted
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Fixed in CVS
    Sorry that it is not in RC10.

    Checking in
    omegat/src/org/omegat/core/threads/CommandThread.java
    new revision: 1.35;
    Checking in omegat/src/org/omegat/util/StaticUtils.java
    new revision: 1.31;
    done

     
  • Maxym Mykhalchuk

    • status: open-accepted --> open-fixed
     
  • Didier Briel

    Didier Briel - 2006-06-21

    Logged In: YES
    user_id=1343245

    <<Fixed in CVS
    Sorry that not in RC10>>
    Never mind.
    Testing from the CVS, the results seem very good on a
    basic text (1000 words of Lorem Ipsum...).

    Since it is not closed:
    Any specific reason the "file name" column changed
    position (from 1st to 3rd)?
    I liked it better the other way round, it seems more
    natural to me (and more convenient when adding columns) to
    have "file name, length" than the reverse.

    Didier

     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Didier, the reason is that file names vary A LOT in
    length, while the numbers of segments do not.

     
  • Didier Briel

    Didier Briel - 2006-06-23

    Logged In: YES
    user_id=1343245

    Since the file is tab separated, I thought it was intended
    mainly to be used in a spreadsheet (that's how I use it),
    in which case the length of a column is of no importance.
    But I can understand the point.

    Didier

     
  • Didier Briel

    Didier Briel - 2006-06-27

    Logged In: YES
    user_id=1343245

    Another thing I noticed.

    In the old version, the list of files was sorted
    alphabetically.
    Now, it isn't sorted anymore, which makes it a little more
    difficult to use.

    Didier

     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Didier, do you mean alphabetically by file name, or by full path?

     
  • Maxym Mykhalchuk

    • status: open-fixed --> open-accepted
     
  • Nobody/Anonymous

    Logged In: NO

    Last time I checked (RC9?), the output looked similar to the project
    files output: sorted by path.

    I think a sort by file name would make little sense, especially in complex
    projects where a lot of subdirectories are involved. For simple "flat"
    projects, a sort by path is equivalent to a sort by name anyway. So I suggest
    we keep (or implement, if it is not done) a sort by path.

    JC

     
  • Didier Briel

    Didier Briel - 2006-06-28

    Logged In: YES
    user_id=1343245

    As JC wrote, the files were sorted by path/file.
    I.e., the same exact list you get in the Project files
    window.

    It is (at least for me) the best choice when used for
    tracking progress in a large project (e.g., 1000 HTML
    files in 10 directories).

    Didier

     
  • Maxym Mykhalchuk

    • status: open-accepted --> open-fixed
     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Fixed in CVS; will be released in 1.6.0 RC11.
    The files are now explicitly sorted by path.
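
As an illustration of what sorting by path means for a tab-separated listing, here is a minimal sketch. It is not the code checked into CVS: the class and method names, the two-column layout (count first, path last), and the use of plain String ordering are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a path-sorted, tab-separated listing; not OmegaT code. */
final class SortedStatsSketch {

    /**
     * Returns one "count<TAB>path" line per file, ordered by full path
     * using the natural String ordering (an assumption).
     */
    static List<String> toTabSeparatedLines(Map<String, Integer> wordsByPath) {
        List<String> paths = new ArrayList<>(wordsByPath.keySet());
        Collections.sort(paths);

        List<String> lines = new ArrayList<>();
        for (String path : paths) {
            lines.add(wordsByPath.get(path) + "\t" + path);
        }
        return lines;
    }
}
```

With this ordering, all files of one directory stay together, and a flat project with no subdirectories ends up sorted by file name.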

     
  • Maxym Mykhalchuk

    Logged In: YES
    user_id=488500

    Closing,
    new statistics have appeared in RC11 just released.

     
  • Maxym Mykhalchuk

    • status: open-fixed --> closed-fixed
     
