Showing 28 open source projects for "linux is"

View related business solutions
  • Context for your AI agents Icon
    Context for your AI agents

    Crawl websites, sync to vector databases, and power RAG applications. Pre-built integrations for LLM pipelines and AI assistants.

    Build data pipelines that feed your AI models and agents without managing infrastructure. Crawl any website, transform content, and push directly to your preferred vector store. Use 10,000+ tools for RAG applications, AI assistants, and real-time knowledge bases. Monitor site changes, trigger workflows on new data, and keep your AIs fed with fresh, structured information. Cloud-native, API-first, and free to start until you need to scale.
    Try for free
  • Desktop and Mobile Device Management Software Icon
    Desktop and Mobile Device Management Software

    It's a modern take on desktop management that can be scaled as per organizational needs.

    Desktop Central is a unified endpoint management (UEM) solution that helps in managing servers, laptops, desktops, smartphones, and tablets from a central location.
    Learn More
  • 1
    Synthea Patient Generator

    Synthea Patient Generator

    Synthetic Patient Population Simulator

    SyntheaTM is an open-source, synthetic patient generator that models the medical history of synthetic patients. Our mission is to provide high-quality, synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. The models used to generate synthetic patients are informed by...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 2
    Copulas

    Copulas

    A library to model multivariate data using copulas

    Copulas is a Python library for modeling multivariate distributions and sampling from them using copula functions. Given a table of numerical data, use Copulas to learn the distribution and generate new synthetic data following the same statistical properties. Choose from a variety of univariate distributions and copulas – including Archimedian Copulas, Gaussian Copulas and Vine Copulas. Compare real and synthetic data visually after building your model. Visualizations are available as 1D...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    Mimesis

    Mimesis

    High-performance fake data generator for Python

    Mimesis is an open source high-performance fake data generator for Python, able to provide data for various purposes in various languages. It's currently the fastest fake data generator for Python, and supports many different data providers that can produce data related to people, food, transportation, internet and many more. Mimesis is really easy to use, with everything you need just an import away. Simply import an object, called a Provider, which represents the type of data you...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    CTGAN

    CTGAN

    Conditional GAN for generating synthetic tabular data

    CTGAN is a collection of Deep Learning based synthetic data generators for single table data, which are able to learn from real data and generate synthetic data with high fidelity. If you're just getting started with synthetic data, we recommend installing the SDV library which provides user-friendly APIs for accessing CTGAN. The SDV library provides wrappers for preprocessing your data as well as additional usability features like constraints. When using the CTGAN library directly, you may...
    Downloads: 0 This Week
    Last Update:
    See Project
  • The complete IT asset and license management platform Icon
    The complete IT asset and license management platform

    Gain full visibility and control over your IT assets, licenses, usage and spend in one place with Setyl.

    The platform seamlessly integrates with 100+ IT systems, including MDM, RMM, IDP, SSO, HR, finance, helpdesk tools, and more.
    Learn More
  • 5
    Synthetic Data Kit

    Synthetic Data Kit

    Tool for generating high quality Synthetic datasets

    Synthetic Data Kit is a CLI-centric toolkit for generating high-quality synthetic datasets to fine-tune Llama models, with an emphasis on producing reasoning traces and QA pairs that line up with modern instruction-tuning formats. It ships an opinionated, modular workflow that covers ingesting heterogeneous sources (documents, transcripts), prompting models to create labeled examples, and exporting to fine-tuning schemas with minimal glue code. The kit’s design goal is to shorten the “data...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Gretel Synthetics

    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that are as good or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    YData Synthetic

    YData Synthetic

    Synthetic data generators for tabular and time-series data

    A package to generate synthetic tabular and time-series data leveraging state-of-the-art generative models. Synthetic data is artificially generated data that is not collected from real-world events. It replicates the statistical components of real data without containing any identifiable information, ensuring individuals' privacy. This repository contains material related to Generative Adversarial Networks for synthetic data generation, in particular regular tabular data and time-series. It...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    BlenderProc

    BlenderProc

    Blender pipeline for photorealistic training image generation

    A procedural Blender pipeline for photorealistic training image generation. BlenderProc has to be run inside the blender python environment, as only there we can access the blender API. Therefore, instead of running your script with the usual python interpreter, the command line interface of BlenderProc has to be used. In general, one run of your script first loads or constructs a 3D scene, then sets some camera poses inside this scene and renders different types of images (RGB, distance,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    SDGym

    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You also customize the process to include your own work. Select any of the publicly available datasets from the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Field Sales+ for MS Dynamics 365 and Salesforce Icon
    Field Sales+ for MS Dynamics 365 and Salesforce

    Maximize your sales performance on the go.

    Bring Dynamics 365 and Salesforce wherever you go with Resco’s solution. With powerful offline features and reliable data syncing, your team can access CRM data on mobile devices anytime, anywhere. This saves time, cuts errors, and speeds up customer visits.
    Learn More
  • 10
    Zylthra

    Zylthra

    Zylthra: A PyQt6 app to generate synthetic datasets with DataLLM.

    Welcome to Zylthra, a powerful Python-based desktop application built with PyQt6, designed to generate synthetic datasets using the DataLLM API from data.mostly.ai. This tool allows users to create custom datasets by defining columns, configuring generation parameters, and saving setups for reuse, all within a sleek, dark-themed interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Twinify

    Twinify

    Privacy-preserving generation of a synthetic twin to a data set

    twinify is a software package for the privacy-preserving generation of a synthetic twin to a given sensitive tabular data set. On a high level, twinify follows the differentially private data-sharing process introduced by Jälkö et al.. Depending on the nature of your data, twinify implements either the NAPSU-MQ approach described by Räisä et al. or finds an approximate parameter posterior for any probabilistic model you formulated using differentially private variational inference (DPVI)....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Synth

    Synth

    The Declarative Data Generator

    Synth is an open-source data-as-code tool that provides a simple CLI workflow for generating consistent data in a scalable way. Use Synth to generate correct, anonymized data that looks and quacks like production. Generate test data fixtures for your development, testing, and continuous integration. Generate data that tells the story you want to tell. Specify constraints, relations, and all your semantics. Seed development and environments and CI. Anonymize sensitive production data. Create...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    DBFeeder

    DBFeeder

    Highly Customizable Test Data Generator

    DBFeeder is a great tool to generate synthetic testdata for Oracle Databases and it is ideal for companies who wants to outsource development. Thanks to his original approach, data can be highly customizable and it even fits primary and foreign keys constraints of tables.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    nITROGEN
    Internet of Things RandOm GENerator
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Synthetic Mixed Data Generator
    A Synthetic Data Generator for producing mixed datasets described by relevant, irrelevant, and redundant features.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    ML for Trading

    ML for Trading

    Code for machine learning for algorithmic trading, 2nd edition

    On over 800 pages, this revised and expanded 2nd edition demonstrates how ML can add value to algorithmic trading through a broad range of applications. Organized in four parts and 24 chapters, it covers the end-to-end workflow from data sourcing and model development to strategy backtesting and evaluation. Covers key aspects of data sourcing, financial feature engineering, and portfolio management. The design and evaluation of long-short strategies based on a broad range of ML algorithms,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Tofu

    Tofu

    Tofu is a Python tool for generating synthetic UK Biobank data

    Tofu is a Python library for generating synthetic UK Biobank data. The UK Biobank is a large open-access prospective research cohort study of 500,000 middle-aged participants recruited in England, Scotland and Wales. The study has collected and continues to collect extensive phenotypic and genotypic detail about its participants, including data from questionnaires, physical measures, sample assays, accelerometry, multimodal imaging, genome-wide genotyping and longitudinal follow-up for a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18

    rpackage conjurer

    Synthetic data generation using R

    Builds synthetic data applicable across multiple domains. This package also provides flexibility to control data distribution to make it relevant to many industry examples.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    TGAN

    TGAN

    Generative adversarial training for generating synthetic tabular data

    We are happy to announce that our new model for synthetic data called CTGAN is open-sourced. The new model is simpler and gives better performance on many datasets. TGAN is a tabular data synthesizer. It can generate fully synthetic data from real data. Currently, TGAN can generate numerical columns and categorical columns. TGAN has been developed and runs on Python 3.5, 3.6 and 3.7. Also, although it is not strictly required, the usage of a virtualenv is highly recommended in order to avoid...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    benerator is a framework for creating realistic and valid high-volume test data, used for load and performance testing and showcase setup. Data is generated from an easily configurable metadata model and exported to databases, XML, CSV or flat files.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21

    OraMasking

    Data masking tool for Oracle database

    For Oracle database, mask sensitive data by replacement (static or expression), substitution (synthetic data set is included), random values (Lorum Ipsum) or deletion. Generates update/delete statements or triggers or runs directly against the database.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Ava: Testdata Xsl

    Ava: Testdata Xsl

    generates Testdata on base of excel: creates xml,excel,csv,html,sql,+

    this tool for test-data-generation receives an 'excel-sheet' as primary input. second important paramter is the 'number of test-records to produce'. The excel-data will be reused as long data is needed. This tool is hightly paramatrisazable by the use of 'xsl scripts'. data can be created, updated, modified and finally exported in a format of your choice Main Fuctions: (1) Generates Testdata (excel, xsl, xml) (2) Exports generated testdata in multiple formats (csv, excel, html,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    A Data Generator

    A tool to generate synthetic test data useful to Record matchers

    With growing amount of information from multiple sources it has become very hard to relate information to the correct real life entities. Record matching software try to solve this by machine learning techniques. To do this effectively, its necessary to train the record matcher with proper test data which is identical to real life data. Hence, there is a need for a data generator to create the synthetic data to be used for evaluating the quality and capability of record matching software....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    DATA Gen™

    DATA Gen™

    DATA Gen™ - Test Data Generator to generate realistic test data.

    DATA Gen™ Test Data Generator offers facilities to automate the task of creating test data for new or existing data bases. It helps lower the programming effort required, while reducing manual test data generation errors and the ripple effect that they cause on production systems, users and maintenance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    This software allows for a user to generate test data. This is useful for testing Hadoop or other data processing clusters.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next