• Cloud tools for web scraping and data extraction Icon
    Cloud tools for web scraping and data extraction

    Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.

    Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
    Explore 10,000+ tools
  • Say goodbye to broken revenue funnels and poor customer experiences Icon
    Say goodbye to broken revenue funnels and poor customer experiences

    Connect and coordinate your data, signals, tools, and people at every step of the customer journey.

    LeanData is a Demand Management solution that supports all go-to-market strategies such as account-based sales development, geo-based territories, and more. LeanData features a visual, intuitive workflow native to Salesforce that enables users to view their entire lead flow in one interface. LeanData allows users to access the drag-and-drop feature to route their leads. LeanData also features an algorithms match that uses multiple fields in Salesforce.
    Learn More
  • 1
    PaperAI

    PaperAI

    Semantic search and workflows for medical/scientific papers

    PaperAI is an open-source framework for searching and analyzing scientific papers, particularly useful for researchers looking to extract insights from large-scale document collections.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Underthesea

    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    BEIR

    BEIR

    A Heterogeneous Benchmark for Information Retrieval

    BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    DOLMA

    DOLMA

    Data and tools for generating and inspecting OLMo pre-training data

    DOLMA (Data Optimization and Learning for Model Alignment) is a framework designed to manage large-scale datasets for training and fine-tuning language models efficiently.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Network Management Software and Tools for Businesses and Organizations | Auvik Networks Icon
    Network Management Software and Tools for Businesses and Organizations | Auvik Networks

    Mapping, inventory, config backup, and more.

    Reduce IT headaches and save time with a proven solution for automated network discovery, documentation, and performance monitoring. Choose Auvik because you'll see value in minutes, and stay with us to improve your IT for years to come.
    Learn More
  • 5
    FastRAG

    FastRAG

    Efficient Retrieval Augmentation and Generation Framework

    fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool set for advancing retrieval augmented generation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Milvus Bootcamp

    Milvus Bootcamp

    Dealing with all unstructured data, such as reverse image search

    Milvus Bootcamp is a collection of tutorials, examples, and best practices for using Milvus, an open-source vector database designed for AI-powered similarity search and retrieval applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    SparseML

    SparseML

    Libraries for applying sparsification recipes to neural networks

    SparseML is an optimization toolkit for training and deploying deep learning models using sparsification techniques like pruning and quantization to improve efficiency.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Data-Juicer

    Data-Juicer

    Data processing for and with foundation models

    Data-Juicer is an open-source data processing and augmentation framework designed to enhance the quality and diversity of datasets for machine learning tasks. It includes a modular pipeline for scalable data transformation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    AutoGPTQ

    AutoGPTQ

    An easy-to-use LLMs quantization package with user-friendly apis

    AutoGPTQ is an implementation of GPTQ (Quantized GPT) that optimizes large language models (LLMs) for faster inference by reducing their computational footprint while maintaining accuracy.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution Icon
    Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution

    K-12 Schools, Higher Education, Businesses, Restaurants

    Rise Vision is the #1 digital signage company, offering easy-to-use cloud digital signage software compatible with any player across multiple screens. Forget about static displays. Save time and boost sales with 500+ customizable content templates for your screens. If you ever need help, get free training and exceptionally fast support.
    Learn More
  • 10
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    NNCF

    NNCF

    Neural Network Compression Framework for enhanced OpenVINO

    NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    API-for-Open-LLM

    API-for-Open-LLM

    Openai style api for open large language models

    API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    DeepSparse

    DeepSparse

    Sparsity-aware deep learning inference runtime for CPUs

    A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Text Generation Inference

    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Keras Hub

    Keras Hub

    Pretrained model hub for Keras 3

    Keras Hub is a repository of pre-trained models for Keras 3, offering a collection of ready-to-use models for various machine-learning tasks. KerasHub is an extension of the core Keras API; KerasHub components are provided as Layer and Model implementations. If you are familiar with Keras, congratulations. You already understand most of KerasHub.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    SetFit

    SetFit

    Efficient few-shot learning with Sentence Transformers

    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DataProfiler

    DataProfiler

    Extract schema, statistics and entities from datasets

    DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. The DataProfiler is a Python library designed to make data analysis, monitoring, and sensitive data detection easy. Loading Data with a single command, the library automatically formats & loads files into a DataFrame. Profiling the Data, the library identifies the schema, statistics, entities (PII / NPI), and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Adapters

    Adapters

    A Unified Library for Parameter-Efficient Learning

    Adapters is an add-on library to HuggingFace's Transformers, integrating 10+ adapter methods into 20+ state-of-the-art Transformer models with minimal coding overhead for training and inference. Adapters provide a unified interface for efficient fine-tuning and modular transfer learning, supporting a myriad of features like full-precision or quantized training (e.g. Q-LoRA, Q-Bottleneck Adapters, or Q-PrefixTuning), adapter merging via task arithmetics or the composition of multiple adapters...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    TorchDistill

    TorchDistill

    A coding-free framework built on PyTorch

    torchdistill (formerly kdkit) offers various state-of-the-art knowledge distillation methods and enables you to design (new) experiments simply by editing a declarative yaml config file instead of Python code. Even when you need to extract intermediate representations in teacher/student models, you will NOT need to reimplement the models, which often change the interface of the forward, but instead specify the module path(s) in the yaml file. In addition to knowledge distillation, this...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    torchtext

    torchtext

    Data loaders and abstractions for text and NLP

    We recommend Anaconda as a Python package management system. Please refer to pytorch.org for the details of PyTorch installation. LTS versions are distributed through a different channel than the other versioned releases. Alternatively, you might want to use the Moses tokenizer port in SacreMoses (split from NLTK). You have to install SacreMoses. To build torchtext from source, you need git, CMake and C++11 compiler such as g++. When building from source, make sure that you have the same C++...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Colossal-AI

    Colossal-AI

    Making large AI models cheaper, faster and more accessible

    The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing. Together with better performance come larger model sizes. This imposes challenges to the memory wall of the current accelerator hardware such as GPU. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine. There is an urgent demand to train models in a distributed environment....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    PaddleNLP

    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP It is a natural language processing development library for flying paddles, with Easy-to-use text area API, Examples of applications for multiple scenarios, and High-performance distributed training Three major features, aimed at improving the modeling efficiency of the flying oar developer's text field, aiming to improve the developer's development efficiency in the text field, and provide rich examples of NLP applications. Provide rich industry-level pre-task capabilities...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    DeepPavlov

    DeepPavlov

    A library for deep learning end-to-end dialog systems and chatbots

    DeepPavlov makes it easy for beginners and experts to create dialogue systems. The best place to start is with user-friendly tutorials. They provide quick and convenient introduction on how to use DeepPavlov with complete, end-to-end examples. No installation needed. Guides explain the concepts and components of DeepPavlov. Follow step-by-step instructions to install, configure and extend DeepPavlov framework for your use case. DeepPavlov is an open-source framework for chatbots and virtual...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Chinese-LLaMA-Alpaca 2

    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    This project is developed based on the commercially available large model Llama-2 released by Meta. It is the second phase of the Chinese LLaMA&Alpaca large model project. The Chinese LLaMA-2 base model and the Alpaca-2 instruction fine-tuning large model are open-sourced. These models expand and optimize the Chinese vocabulary on the basis of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve the basic semantics and command understanding of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Wikipedia2Vec

    Wikipedia2Vec

    A tool for learning vector representations of words and entities

    Wikipedia2Vec is an embedding learning tool that creates word and entity vector representations from Wikipedia, enabling NLP models to leverage structured and contextual knowledge.
    Downloads: 1 This Week
    Last Update:
    See Project