Build gen AI apps with an all-in-one modern database: MongoDB Atlas
MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
Start Free
Simple, Secure Domain Registration
Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.
Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
Text categorization, arabic language processing, language modeling
The Arabic Corpus {compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ) The corpus Khaleej-2004 contains 5690 documents. It is divided to 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora would mention the two main references:
(1) For Watan-2004 corpus
----------------------
M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on...
pyIRDG is a program written in Python to generate relational datasets in Prolog format. It uses data from the Internet Movie Database in combination with IMDbPY as backend. A graphical user interface written in pyQt allows the user to link multiple entities together as model for the generation process. The big four entities are Title, Person, Company and Character. Many attributes can be chosen for adding to the output .pl file. Three types of constraints on attributes are available to limit...