textacy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to spaCy, textacy focuses primarily on the tasks that come before and after.

Features

Access and extend spaCy's core functionality for working with one or many documents through convenient methods and custom extensions
Load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
Clean, normalize, and explore raw text before processing it with spaCy
Extract structured information from processed documents, including n-grams, entities, acronyms, keyterms, and SVO triples
Compare strings and sequences using a variety of similarity metrics
Tokenize and vectorize documents, then train, interpret, and visualize topic models
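
To make a few of the features above concrete, here is a minimal sketch in plain Python of three of the listed tasks: normalizing raw text, extracting word n-grams, and comparing two strings with a similarity metric. It deliberately uses only the standard library and is not textacy's API; the function names (normalize_text, word_ngrams, jaccard) are hypothetical, and whitespace splitting stands in for the real tokenization that spaCy would provide.

```python
from __future__ import annotations

import re
import unicodedata


def normalize_text(text: str) -> str:
    """Apply Unicode NFC normalization, collapse whitespace runs, and lowercase."""
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


def word_ngrams(text: str, n: int) -> list[tuple[str, ...]]:
    """Return consecutive n-word tuples from a whitespace-tokenized string."""
    tokens = text.split()  # assumption: naive tokenization, not spaCy's
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of whitespace-separated words."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


raw = "  The quick   brown fox\njumps over the lazy dog. "
clean = normalize_text(raw)
bigrams = word_ngrams(clean, 2)
score = jaccard(clean, normalize_text("The quick brown cat"))
```

A library such as textacy would typically operate on spaCy's tokens rather than on whitespace-split words, so punctuation handling and the resulting scores would differ from this sketch.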

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Natural Language Processing (NLP) Tool

Registered

2025-01-22
