Showing 14 open source projects for "define"

View related business solutions
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 1
    Toapi

    Toapi

    Convert websites into structured APIs automatically with Python tool

    ...Instead of building a traditional web crawler that collects and stores data before exposing it through an API, Toapi simplifies the process by allowing developers to define data structures that automatically generate an API layer from existing web pages. It works by parsing HTML content from a source site and mapping selected elements into structured data that can be returned as JSON through API endpoints. Developers define items and routes that determine how web pages are parsed and how the resulting data is exposed through the API interface. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    gain

    gain

    Asyncio-based Python framework for building fast web crawling spiders

    ...It is built on top of asynchronous technologies such as asyncio, aiohttp, and uvloop to support high-performance crawling with concurrent network requests. It provides a structured framework for creating spiders that can navigate websites, extract structured data, and process the collected results. Developers define crawlers using components such as spiders, parsers, and items, allowing them to organize crawling logic and data extraction rules clearly. Gain supports CSS selectors and XPath expressions for parsing page content and extracting specific elements. Gain also allows developers to configure headers, concurrency levels, and proxy settings to control how crawlers interact with target websites. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    Geziyor

    Geziyor

    Blazing fast Go framework for web crawling and data scraping tasks

    ...It focuses on speed and scalability, allowing large numbers of requests to be processed concurrently. Geziyor supports use cases such as data mining, monitoring web content, and automated testing workflows. It provides a flexible architecture where developers define parsing functions that process responses and extract the desired data. Geziyor includes features for managing requests, handling cookies, respecting robots rules, and exporting collected data in multiple formats. With built-in tools for caching, metrics collection, and proxy management, it enables developers to build robust and customizable scraping systems using Go.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    skycaiji

    skycaiji

    Open source web scraping system for automated data collection tasks

    SkyCaiji is an open source web scraping and data collection system designed to gather information from websites through configurable extraction rules. It focuses on simplifying the process of building crawlers by allowing users to visually define scraping rules rather than writing complex code. It can collect structured or unstructured data from many types of webpages and automate the extraction process for large datasets. SkyCaiji is designed to run on a variety of hosting environments including local machines, shared hosting environments, and cloud servers. It integrates with content management systems so collected data can be published automatically without manual intervention. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 5
    kimuraframework

    kimuraframework

    AI-first Ruby framework for building fast, flexible web scraping spide

    Kimurai is an open source web scraping framework written in Ruby that simplifies the process of building automated data extraction tools. It provides a clean domain-specific language that allows developers to define scraping logic and data schemas with minimal boilerplate code. Kimurai can use AI-assisted extraction to identify where data resides in HTML pages, automatically generating selectors that are cached for future use so subsequent scraping runs operate with pure Ruby performance. Kimurai supports scraping both static and JavaScript-rendered websites by working with multiple engines, including headless browsers and simple HTTP-based approaches. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Roach

    Roach

    The complete web scraping toolkit for PHP

    Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package on its own or install one of the framework-specific adapters. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    JobFunnel

    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    ...If you have a job website you'd like to write a scraper for, you are welcome to implement it, Review the Base Scraper for implementation details. JobFunnel supports scraping jobs from the same job website across locales & domains. If you are interested in adding support, you may only need to define session headers and domain strings, Review the Base Scraper for further implementation details.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    crawly

    crawly

    High-level web crawling and scraping framework for Elixir apps

    ...Crawly follows the Elixir and OTP architecture model, enabling concurrent and fault-tolerant crawling processes that can handle many requests efficiently. Developers define specialized components called spiders to control how pages are visited and how information is extracted from them. It also supports extensibility through middlewares, pipelines, and fetchers that allow customization of request handling, data processing, and crawling behavior.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    appcrawler

    appcrawler

    Automated mobile app crawler and testing tool built on Appium

    ...AppCrawler works by traversing the interface structure of an application and executing predefined or dynamically discovered actions on clickable components. Its behavior can be customized using configuration files that define traversal rules, element selection logic, and specific actions triggered by conditions encountered during testing. AppCrawler supports rule-based filtering such as blacklists and whitelists to control which elements are explored and which are ignored.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    ruia

    ruia

    Async Python framework for fast and flexible web scraping spiders

    ...Ruia follows a “write less, run faster” philosophy, emphasizing concise code and streamlined spider development. It provides a structured approach to building scraping projects through components such as data items, spiders, middleware, and plugins. Developers can define structured fields to extract information from HTML content and process responses asynchronously to improve crawling performance. It also supports middleware and plugin systems that allow customization of request handling, response processing, and additional functionality.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    django-dynamic-scraper

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    ...While preserving many of the features of Scrapy it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things DDS is not usable for all kinds of scrapers, but it is well suited for the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then dig into the detail page to scrape some more infos for each item. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Gecco

    Gecco

    Lightweight Java web crawler framework with jQuery-style extraction

    ...It integrates several well-known Java libraries and frameworks, including tools for HTTP requests, HTML parsing, JSON processing, and application development. Through its annotation-based design, developers can define crawling rules and data extraction logic directly within Java classes, reducing boilerplate code and improving readability. Gecco also provides mechanisms for handling dynamic web content, including support for asynchronous requests and extraction of JavaScript variables from pages. Gecco emphasizes extensibility and follows an open design that allows additional components and integrations to be added without modifying the core codebase.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    This is an advanced web scraper with user friendly GUI which let the user define rules and web addresses to extract data from one time or periodically and a target database filed that the data should be saved in.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Wadsworth is a java based web scripting engine. It uses user-defined XML scripts to define its actions. It can be used as a web testing tool, or as a web scraper, or to automate any web actions you wish. It can also be invoked and controlled by another
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB