
Doubts on corpora/datasets with CC licenses

Help
2016-08-01
  • Robert Shenoy

    Robert Shenoy - 2016-08-01

    Hi
    I represent a very small transcription company. We want to train acoustic models on the TEDLIUM corpus and other datasets, since we cannot afford the paid-for corpora. TEDLIUM and other such datasets carry a CC license with noncommercial and nonderivative clauses, so I am unsure whether they can be used for commercial purposes. Since only the acoustic models, not the actual data, would be used commercially, can we say that this does not breach the license terms? Also, since an acoustic model is just a collection of statistical data, can it be considered not a derivative work? What is your take on this?

     
    • Nickolay V. Shmyrev

      This is not the right place to ask such questions. If you are in doubt, ask a lawyer.

       
