textacy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and follow after.

Features

Access and extend spaCy's core functionality for working with one or many documents through convenient methods and custom extensions
Load prepared datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
Clean, normalize, and explore raw text before processing it with spaCy
Extract structured information from processed documents, including n-grams, entities, acronyms, keyterms, and SVO triples
Compare strings and sequences using a variety of similarity metrics
Tokenize and vectorize documents, then train, interpret, and visualize topic models
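
To make a few of the tasks above concrete (cleaning raw text, extracting n-grams, comparing strings), here is a minimal standard-library sketch. It deliberately does not call textacy or spaCy; the function names normalize, word_ngrams, and jaccard are hypothetical helpers chosen for illustration, and textacy's own implementations are more capable and may behave differently.

```python
from __future__ import annotations

import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def word_ngrams(text: str, n: int) -> list[tuple[str, ...]]:
    """Return all contiguous word n-grams of the normalized text."""
    tokens = normalize(text).split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two strings."""
    set_a = set(normalize(a).split())
    set_b = set(normalize(b).split())
    if not set_a and not set_b:
        return 1.0  # two empty strings are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


if __name__ == "__main__":
    raw = "The cat sat. The cat ran!"
    # Count bigrams after cleaning; the most frequent one is printed.
    print(Counter(word_ngrams(raw, 2)).most_common(1))
    print(jaccard("the cat sat", "the cat ran"))
```

In a real project the same steps would normally run on spaCy-processed documents, so that n-grams and comparisons can use linguistic information such as stop words and parts of speech rather than plain whitespace tokens.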

License

Apache License V2.0

Additional Project Details

Programming Language

Python

Related Categories

Python Natural Language Processing (NLP) Tool

Registered

2025-01-22
