Compare the Top Data Science Software that integrates with Apache Arrow as of June 2025

This is a list of Data Science software that integrates with Apache Arrow. Use the filters on the left to narrow the results to products that integrate with Apache Arrow. View the products that work with Apache Arrow in the table below.

What is Data Science Software for Apache Arrow?

Data science software is a collection of tools and platforms designed to facilitate the analysis, interpretation, and visualization of large datasets, helping data scientists derive insights and build predictive models. These tools support various data science processes, including data cleaning, statistical analysis, machine learning, deep learning, and data visualization. Common features of data science software include data manipulation, algorithm libraries, model training environments, and integration with big data solutions. Data science software is widely used across industries like finance, healthcare, marketing, and technology to improve decision-making, optimize processes, and predict trends. Compare and read user reviews of the best Data Science software for Apache Arrow currently available using the table below. This list is updated regularly.

  • 1
    Daft

    Daft is a framework for ETL, analytics, and ML/AI at scale. Its familiar Python dataframe API is built to outperform Spark in both speed and ease of use. Daft plugs directly into your ML/AI stack through efficient zero-copy integrations with essential Python libraries such as PyTorch and Ray. It also lets you request GPUs as a resource for running models. Daft runs locally with a lightweight multithreaded backend. When your local machine is no longer sufficient, it scales seamlessly to run out-of-core on a distributed cluster. Daft supports User-Defined Functions (UDFs) on columns, allowing you to apply complex expressions and operations to Python objects with the full flexibility required for ML/AI.