Showing 42 open source projects for "gpu max performance"

  • 1
    fastai

    Deep learning library

    fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying...
    Downloads: 0 This Week
  • 2
    SSD in PyTorch 1.0

    High quality, fast, modular reference implementation of SSD in PyTorch

    This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for research based on SSD. Multi-GPU training and inference: We use DistributedDataParallel, so you can train or test with an arbitrary number of GPUs, and the training schema will change accordingly. Add your own modules without pain: we abstract backbone, Detector, BoxHead, BoxPredictor, etc. You can...
    Downloads: 0 This Week
  • 3
    FEDML Open Source

    The unified and scalable ML library for large-scale training

    ...Highly integrated with the TensorOpera open source library, TensorOpera AI provides holistic support of three interconnected AI infrastructure layers: user-friendly MLOps, a well-managed scheduler, and high-performance ML libraries for running any AI jobs across GPU Clouds. A typical workflow is shown in the figure above. When a developer wants to run a pre-built job in Studio or Job Store, TensorOpera Launch swiftly pairs AI jobs with the most economical GPU resources, auto-provisions them, and runs the job, eliminating complex environment setup and management.
    Downloads: 0 This Week
  • 4
    SIG Rust

    Rust language bindings for TensorFlow

    SIG Rust provides idiomatic Rust bindings for TensorFlow, making it possible for developers to work with TensorFlow functionality from within the Rust programming language. Rather than replacing TensorFlow itself, it acts as an integration layer that connects Rust applications to the TensorFlow C API. The repository is designed for developers who want Rust’s performance, safety, and systems programming strengths while still accessing TensorFlow’s machine learning capabilities. It includes...
    Downloads: 0 This Week
  • 5
    MLPACK is a C++ machine learning library with emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and flexibility for expert users.
    * More info + downloads: https://mlpack.org
    * Git repo: https://github.com/mlpack/mlpack
    Downloads: 0 This Week
  • 6
    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...
    Downloads: 3 This Week
  • 7
    YOLOv4-large

    Scaled-YOLOv4: Scaling Cross Stage Partial Network

    YOLOv4-large is an open-source implementation of the Scaled-YOLOv4 object detection architecture, designed to improve both the accuracy and scalability of real-time computer vision models. The project provides a PyTorch implementation of the Scaled-YOLOv4 framework, which extends the original YOLOv4 architecture using Cross Stage Partial (CSP) networks and new scaling techniques. Unlike earlier object detection systems that only scale depth or width, this architecture scales multiple aspects...
    Downloads: 0 This Week
  • 8
    TensorLayer

    Deep learning and reinforcement learning library for scientists

    TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extensive collection of customizable neural layers to build advanced AI models quickly. Building on this, the community has open-sourced a large number of tutorials and applications. TensorLayer was awarded the 2017 Best Open Source Software by the ACM Multimedia Society. This project can also be found at OpenI and Gitee. 3.0.0 has been pre-released, the current version...
    Downloads: 0 This Week
  • 9
    BytePS

    A high performance and generic framework for distributed DNN training

    BytePS is a high-performance, general-purpose distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run on either TCP or RDMA networks. BytePS outperforms existing open-source distributed training frameworks by a large margin. For example, on BERT-large training, BytePS can achieve ~90% scaling efficiency with 256 GPUs (see below), which is much higher than Horovod+NCCL.
    Downloads: 0 This Week
  • 10
    imgaug

    Image augmentation for machine learning experiments

    imgaug is a library for image augmentation in machine learning experiments. It supports a wide range of augmentation techniques, makes it easy to combine them and to execute them in random order or on multiple CPU cores, has a simple yet powerful stochastic interface, and can augment not only images but also key points/landmarks, bounding boxes, heatmaps and segmentation maps. Affine transformations, perspective transformations, contrast changes, gaussian noise, dropout of regions,...
    Downloads: 0 This Week
  • 11
    Tensorpack

    A Neural Net Training Interface on TensorFlow, with focus on speed

    ...On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably get faster if written with Tensorpack. A scalable data-parallel multi-GPU / distributed training strategy is available off the shelf. Squeeze the best data loading performance out of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not offer the data processing flexibility needed in research. Tensorpack squeezes the most performance out of pure Python with various auto parallelization strategies. ...
    Downloads: 0 This Week
  • 12
    DIGITS

    Deep Learning GPU training system

    The NVIDIA Deep Learning GPU Training System (DIGITS) puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train highly accurate deep neural networks (DNNs) for image classification, segmentation and object detection tasks. DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best performing model from the results browser for deployment. ...
    Downloads: 1 This Week
  • 13
    Intel neon

    Intel® Nervana™ reference deep learning framework

    neon is Intel's reference deep learning framework committed to best performance on all hardware. Designed for ease of use and extensibility. See the new features in our latest release. We want to highlight that neon v2.0.0+ has been optimized for much better performance on CPUs by enabling Intel Math Kernel Library (MKL). The DNN (Deep Neural Networks) component of MKL that is used by neon is provided free of charge and downloaded automatically as part of the neon installation. ...
    Downloads: 0 This Week
  • 14
    NNVM

    Open deep learning compiler stack for CPU, GPU

    The vision of the Apache NNVM Project is to host a diverse community of experts and practitioners in machine learning, compilers, and systems architecture to build an accessible, extensible, and automated open-source framework that optimizes current and emerging machine learning models for any hardware platform. Compilation of deep learning models into minimum deployable modules. Infrastructure to automatically generate and optimize models on more backends with better performance....
    Downloads: 0 This Week
  • 15
    Caffe2

    Caffe2 is a lightweight, modular, and scalable deep learning framework

    Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind. Caffe2 is a deep learning framework that provides an easy and straightforward way for you to experiment with deep learning and leverage community contributions of new models and algorithms. You can bring your creations to scale using the power of GPUs in the cloud or to the masses on mobile with Caffe2’s cross-platform...
    Downloads: 0 This Week
  • 16
    GPU Machine Learning Library. This library aims to provide machine learning researchers and practitioners with a high-performance library by taking advantage of the GPU's enormous computational power. The library is developed in C++ and CUDA.
    Downloads: 0 This Week
  • 17

    LBP in multiple platforms

    LBP implementation in multiple computing platforms (ARM, GPU, DSP...)

    The Local Binary Pattern (LBP) is a texture operator that is used in several different computer vision applications and implemented on a variety of platforms. When selecting a suitable LBP implementation platform, the specific application and its requirements in terms of performance, size, energy efficiency, cost and development time have to be carefully considered. This is a software toolbox that collects software implementations of the Local Binary Pattern operator on several platforms:
    - OpenCL for CPU & GPU
    - OpenCL for GPU (branchless)
    - C code optimized for ARM
    - OpenGL ES 2.0 shaders for mobile GPUs
    - C code for TI C64x DSP core (branchless)
    - C code for TTA processor synthesis
    If you use the code somewhere, please cite: Bordallo López M., Nieto A., Boutellier J., Hannuksela J., and Silvén O. ...
    Downloads: 0 This Week
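
    For readers unfamiliar with the Local Binary Pattern operator mentioned in the last entry, the following is a minimal plain-Python sketch of the basic 8-neighbour variant. It is not taken from the toolbox itself: the 3x3 neighbourhood, the clockwise bit order starting at the top-left neighbour, the "neighbour >= centre" threshold rule, and the function names lbp_code, lbp_image and lbp_histogram are all assumptions chosen for illustration.

```python
# Basic 8-neighbour Local Binary Pattern (LBP) on a grayscale image.
# Assumptions (not taken from the toolbox): 3x3 neighbourhood, clockwise bit
# order starting at the top-left neighbour, and "neighbour >= centre" sets a bit.

# (row, col) offsets of the 8 neighbours, clockwise from the top-left.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for d_row, d_col in NEIGHBOUR_OFFSETS:
        bit = 1 if image[row + d_row][col + d_col] >= centre else 0
        code = (code << 1) | bit  # first neighbour ends up as the top bit
    return code


def lbp_image(image):
    """Compute LBP codes for all interior pixels (border pixels are skipped)."""
    height, width = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, width - 1)]
            for r in range(1, height - 1)]


def lbp_histogram(codes):
    """Count how often each of the 256 possible codes occurs."""
    histogram = [0] * 256
    for row in codes:
        for code in row:
            histogram[code] += 1
    return histogram


if __name__ == "__main__":
    sample = [
        [1, 2, 3],
        [8, 5, 4],
        [7, 6, 9],
    ]
    codes = lbp_image(sample)
    print(codes)
    print(sum(lbp_histogram(codes)))
```

    The histogram of codes over an image region is what is typically used as the texture descriptor. The branchless variants listed in the entry would replace the conditional with arithmetic on the comparison result; this sketch keeps the conditional for readability.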