  • 1
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Diffgram is the world’s first truly open source training data platform, focused on giving its users an unlimited experience. It aims to reduce your data labeling bills and increase your training data quality. Training data is the art of supervising machines through data. This includes annotation, which produces structured data ready to be consumed by a machine learning model. Annotation is required because raw media is considered unstructured and is not usable without it. That’s why training data is required for many modern machine learning use cases, including computer vision, natural language processing, and speech recognition.
    Downloads: 6 This Week
  • 2
    textlint

    The pluggable natural language linter for text and markdown

    Textlint is an extensible linting tool for text and markdown files, designed to enforce style guidelines, detect errors, and improve writing quality.
    Downloads: 3 This Week
  • 3
    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. It is a Python library that makes data analysis, monitoring, and sensitive data detection easy. When loading data, a single command automatically formats and loads files into a DataFrame. When profiling the data, the library identifies the schema, statistics, entities (PII / NPI), and...
    Downloads: 6 This Week
  • 4
    Data-Juicer

    Data processing for and with foundation models

    Data-Juicer is an open-source data processing and augmentation framework designed to enhance the quality and diversity of datasets for machine learning tasks. It includes a modular pipeline for scalable data transformation.
    Downloads: 0 This Week
  • 5
    NLG-Eval

    Evaluation code for various unsupervised automated metrics

    NLG-Eval is a toolkit for evaluating the quality of natural language generation (NLG) outputs using multiple automated metrics such as BLEU, METEOR, and ROUGE.
    Downloads: 6 This Week
  • 6
    UniEM

    Unified embedding model

    UniEM is a unified embedding model designed to create high-quality text embeddings for various natural language processing tasks.
    Downloads: 7 This Week
  • 7
    AllenNLP

    An open-source NLP research library, built on PyTorch

    AllenNLP makes it easy to design and evaluate new deep learning models for nearly any NLP problem, and provides the infrastructure to run them in the cloud or on your laptop. It includes reference implementations of high-quality models for both core NLP problems (e.g. semantic role labeling) and NLP applications (e.g. textual entailment). AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional allennlp subcommands. There is an ecosystem of open source plugins, some of which are maintained by the AllenNLP team at AI2, and some of which are maintained by the broader community. ...
    Downloads: 0 This Week
  • 8
    MTBook

    Machine Translation: Foundations and Models

    This tutorial systematically introduces the basic knowledge and modeling methods of machine translation and, on this basis, discusses some cutting-edge machine translation technologies. It was formerly known as "Machine Translation: Statistical Modeling and Deep Learning". Its content is compiled into a book, which is suitable for senior undergraduates and graduate students in computer science and artificial intelligence related majors, and can also be used as...
    Downloads: 0 This Week
  • 9
    XLM (Cross-lingual Language Model)

    PyTorch original implementation of Cross-lingual Language Model

    ...The repository provides preprocessing pipelines, training code, and fine-tuning scripts so you can reproduce benchmark results or adapt models to your own multilingual corpora. Pretrained checkpoints cover dozens of languages and multiple model sizes, balancing quality and compute needs.
    Downloads: 0 This Week
  • 10
    CC-Net

    Tools to download and clean up Common Crawl data

    cc_net provides tools to download, segment, clean, and filter Common Crawl to build large-scale text corpora, including monolingual datasets and the multilingual CC-100 collection introduced in the associated paper. It includes pipelines to fetch snapshots, extract text, de-duplicate, identify language, and apply quality filtering based on heuristics and language models. The outputs are intended for pretraining language models and for creating standardized corpora that can be reproduced or updated with new crawls. The repository documents practical concerns like HTTP failures, snapshot differences, and stats JSONs, reflecting community use across many languages. ...
    Downloads: 0 This Week
  • 11
    NLP Best Practices

    Natural Language Processing Best Practices & Examples

    In recent years, natural language processing (NLP) has seen rapid growth in quality and usability, which has helped drive business adoption of artificial intelligence (AI) solutions. Over the same period, researchers have been applying newer deep learning methods to NLP, and data scientists have started moving from traditional methods to state-of-the-art (SOTA) deep neural network (DNN) algorithms that use language models pretrained on large text corpora.
    Downloads: 0 This Week
  • 12
    Chatito

    Dataset generation for AI chatbots, NLP tasks

    Chatito is a tool that helps generate datasets for training and validating chatbot models using a simple domain-specific language (DSL).
    Downloads: 6 This Week
  • 13
    NeuroNER

    Named-entity recognition using neural networks

    ...Enables users to create or modify annotations for a new or existing corpus, train the neural network that performs the NER, and monitor the network during training. Users can also evaluate the quality of the predictions made by NeuroNER: performance metrics can be calculated and plotted by comparing the predicted labels with the gold labels.
    Downloads: 0 This Week