File Date Author Commit
 Osman_UN_Corpus 2015-10-14 elhaj elhaj [5d68fc] added in UN Corpus as documents. Rearranged the...
 ar 2015-10-12 elhaj elhaj [690435] First Commit
 bin 2015-10-14 elhaj elhaj [f8a500] Updated faseehCount to only conunt in complex w...
 data 2015-10-12 elhaj elhaj [690435] First Commit
 docs 2015-10-12 elhaj elhaj [690435] First Commit
 javadocs 2015-10-16 elhaj elhaj [36a49d] JavaDocs updated
 resources 2015-10-12 elhaj elhaj [690435] First Commit
 src 2015-10-15 elhaj elhaj [bcd53d] Updated the formula to include up-to-date facto...
 .classpath 2015-10-12 elhaj elhaj [690435] First Commit
 .gitignore 2015-10-11 madhajj madhajj [edd4ee] Initial commit
 .project 2015-10-12 elhaj elhaj [690435] First Commit
 LICENSE 2015-10-11 madhajj madhajj [edd4ee] Initial commit
 Mishkal Softwares-0.2.win32.zip 2015-10-12 elhaj elhaj [690435] First Commit
 README.md 2015-10-19 Dr Mahmoud EL-Haj Dr Mahmoud EL-Haj [f8cdd9] Update README.md
 library.zip 2015-10-12 elhaj elhaj [690435] First Commit
 mishkal-console.exe 2015-10-12 elhaj elhaj [690435] First Commit
 mishkal-gui.exe.log 2015-10-12 elhaj elhaj [690435] First Commit
 mishkal-webserver.exe 2015-10-12 elhaj elhaj [690435] First Commit
 mishkal-webserver.exe.log 2015-10-12 elhaj elhaj [690435] First Commit
 python26.dll 2015-10-12 elhaj elhaj [690435] First Commit
 temp3646987849369476241.txt 2015-10-16 elhaj elhaj [36a49d] JavaDocs updated
 temp850049573066017858.txt 2015-10-16 elhaj elhaj [36a49d] JavaDocs updated
 w9xpopen.exe 2015-10-12 elhaj elhaj [690435] First Commit

Read Me

Osman Readability Metric

About

Open Source tool for Arabic text readability

OSMAN is an open-source Java tool that calculates readability for Arabic text with and without diacritics (Tashkeel).
The tool works better when diacritics are present (we provide a method that lets you add diacritics to plain Arabic text).

How to run

The class TestOSMAN shows how to measure OSMAN readability for text with and without diacritics.
The method calculateOsman(String text) can be called on an instance of the class OsmanReadability.
Users can also add and remove diacritics using addTashkeel(String text) and removeTashkeel(String text).
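
To illustrate what removing diacritics involves, here is a minimal sketch that strips the common Tashkeel marks with a regular expression. It is not the tool's removeTashkeel implementation. The class name TashkeelSketch, the method stripTashkeel, and the choice of the code-point range U+064B to U+0652 (fathatan through sukun) are assumptions made for this example only.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the OSMAN code base.
final class TashkeelSketch {
    // Assumption: Tashkeel is limited to the marks U+064B (fathatan) through U+0652 (sukun).
    private static final Pattern TASHKEEL = Pattern.compile("[\\u064B-\\u0652]");

    private TashkeelSketch() {}

    // Returns the input with every mark in the assumed range removed.
    static String stripTashkeel(String text) {
        return TASHKEEL.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // The word "kataba" written with a fatha (U+064E) after each of its three letters.
        String vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E";
        String plain = stripTashkeel(vowelled);
        // Six code units go in, three base letters come out.
        System.out.println(vowelled.length() + " -> " + plain.length());
    }
}
```

Going the other way (adding diacritics) cannot be done with a simple pattern, because it depends on the word and its context; for that, rely on the addTashkeel method mentioned above.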

The tool allows you to calculate other readability metrics such as ARI and LIX.
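
For reference, the widely published formulas for ARI (Automated Readability Index) and LIX can be sketched as follows. This is only a sketch under stated assumptions: it takes pre-computed counts as input, uses the textbook constants, and treats a "long word" in LIX as one with more than six letters. How the tool itself counts characters, words and sentences in Arabic text, and whether it uses exactly these definitions, is not stated here. The class and method names are hypothetical.

```java
// Hypothetical helper for illustration; not the tool's own implementation.
final class ReadabilityFormulasSketch {
    private ReadabilityFormulasSketch() {}

    // ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    static double ari(int characters, int words, int sentences) {
        requirePositive(words, "words");
        requirePositive(sentences, "sentences");
        return 4.71 * ((double) characters / words)
                + 0.5 * ((double) words / sentences)
                - 21.43;
    }

    // LIX = (words / sentences) + 100 * (longWords / words)
    static double lix(int words, int longWords, int sentences) {
        requirePositive(words, "words");
        requirePositive(sentences, "sentences");
        return (double) words / sentences + 100.0 * longWords / words;
    }

    // Guards against division by zero for empty input.
    private static void requirePositive(int value, String name) {
        if (value <= 0) {
            throw new IllegalArgumentException(name + " must be positive");
        }
    }

    public static void main(String[] args) {
        // Example counts: 100 characters, 20 words (5 of them long), 2 sentences.
        System.out.println("ARI: " + ari(100, 20, 2));
        System.out.println("LIX: " + lix(20, 5, 2));
    }
}
```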
When using Eclipse (or another IDE), make sure the console output encoding is set to UTF-8 (Run Configurations --> Common tab --> Other encoding).
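
If Arabic output still appears garbled, one option is to print through a stream that encodes as UTF-8 explicitly, regardless of the platform default. The sketch below uses only standard JDK classes; the class name Utf8ConsoleSketch and the method utf8Stream are hypothetical names chosen for this example. The console that displays the output must still be configured for UTF-8 as described above.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the OSMAN code base.
final class Utf8ConsoleSketch {
    private Utf8ConsoleSketch() {}

    // Wraps any output stream in an auto-flushing PrintStream that encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform, so this should not occur.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        // Three Arabic letters (kaf, ta, ba) given as Unicode escapes.
        out.println("\u0643\u062A\u0628");
    }
}
```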

Import into Eclipse (step by step for beginners)

  • Install EGit: To install EGit and the GitHub Mylyn Connector from within Eclipse, open the Help menu and select Install New Software. Enter the Juno update site URL and type 'git' in the filter box. Once you have selected the EGit, JGit, and Mylyn GitHub items, click Next to finish the installation.
  • In Eclipse, go to File --> Import and select Git (Import Git repositories from GitHub) in the Select import source window.
  • In the next window, type "OsmanReadability" in the box and click Search.
  • The results box should show drelhaj/OsmanReadability.
  • Select the repository and click Finish.

Download OSMAN UN Corpus

You can click the Download Zip button on the right-hand side.
Alternatively, navigate to Osman_UN_Corpus, open each folder, and download the dataset zip files one by one. To download a zip file, click on it and then click "View Raw".

Contact

Have a question? Get in touch with us at: dr.melhaj@gmail.com