Cross-Language Computational Linguistics

The AFEWC corpus is a multilingual collection of comparable text articles in Arabic, French, and English. Each article triple covers the same topic (the corpus is aligned at the article level). The corpus was collected from Wikipedia and is freely available for research purposes only. It comprises 40K aligned articles containing 91.3M English words, 57.8M French words, and 22M Arabic words, of which 2.8M English, 1.9M French, and 1.5M Arabic words are unique.
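
As a rough illustration of how per-language statistics like those above could be computed over article-level aligned triples, here is a minimal Python sketch. The in-memory layout (one dict per triple, keyed by the hypothetical language codes "en", "fr", "ar"), the function name corpus_stats, and whitespace tokenization are all assumptions made for the example; the corpus's actual file format and the counting method used by its authors are not described here.

```python
from collections import Counter


def corpus_stats(triples):
    """Return {lang: (total_words, unique_words)} for aligned article triples.

    `triples` is an iterable of dicts mapping a language code to article text.
    This layout is hypothetical, not the corpus's real distribution format.
    """
    totals = Counter()
    vocab = {}
    for triple in triples:
        for lang, text in triple.items():
            # Assumption: plain whitespace tokenization, no normalization.
            tokens = text.split()
            totals[lang] += len(tokens)
            vocab.setdefault(lang, set()).update(tokens)
    return {lang: (totals[lang], len(vocab[lang])) for lang in totals}


# Two toy triples; each dict holds one topic in three languages.
sample = [
    {"en": "the cat sat on the mat",
     "fr": "le chat est sur le tapis",
     "ar": "القط على السجادة"},
    {"en": "the dog barks",
     "fr": "le chien aboie",
     "ar": "الكلب ينبح"},
]

stats = corpus_stats(sample)
for lang, (total, unique) in sorted(stats.items()):
    print(lang, total, unique)
```

Counts obtained this way depend heavily on tokenization, so they would not necessarily reproduce the published figures.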

Wikipedia text is available under the Creative Commons Attribution-ShareAlike 3.0 License. https://en.wikipedia.org/wiki/Wikipedia:About
To cite the corpus:
M. Saad, D. Langlois, and K. Smaïli. Extracting Comparable Articles from Wikipedia and Measuring their Comparabilities. Procedia - Social and Behavioral Sciences, 95(0):40–47, 2013. ISSN 1877-0428.

License

Creative Commons Attribution Non-Commercial License V2.0, Other License

Additional Project Details

User Interface: Non-interactive (Daemon)

Registered: 2012-10-04
