Showing 18 open source projects for "build"

  • 1
    crawlee

    A web scraping and browser automation library for Node.js

    Crawlee is a web scraping and browser automation library. It helps you build reliable crawlers. Fast. Crawlee won't fix broken selectors for you (yet), but it helps you build and maintain your crawlers faster. When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back.
    Downloads: 0 This Week
  • 2
    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages. DouyinCrawler supports both automated scraping and batch operations to process multiple targets efficiently. ...
    Downloads: 9 This Week
  • 3
    MechanicalSoup

    A Python library for automating interaction with websites

    A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn't do JavaScript. MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019 and its development stalled for several years. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and...
    Downloads: 0 This Week
  • 4
    BrowserBox

    Remote isolated browser API for security

    ...BrowserBox lets you provide controllable access to web resources in a way that's both more sandboxed than, and less restricted than, traditional web <iframe> elements. Build applications that need cross-origin access, while delivering complex user stories that benefit from an encapsulated browser abstraction. Since the whole stack is written in JavaScript, you can easily extend it to suit your needs. The technology that puts unrestricted browser capabilities within reach of a web app has never before existed in the open.
    Downloads: 3 This Week
  • 5
    skycaiji

    Open source web scraping system for automated data collection tasks

    ...It integrates with content management systems so collected data can be published automatically without manual intervention. SkyCaiji also supports automated workflows that continuously gather data and process it based on defined collection rules. Its architecture enables users to build scalable web scraping pipelines that can run unattended once configured.
    Downloads: 1 This Week
  • 6
    furl

    The easiest way to parse and modify URLs in Python

    furl is a small Python library that makes it easier to parse, inspect, build, and modify URLs without relying on tedious low-level URL handling code. It wraps URL components into convenient objects, so developers can work directly with schemes, usernames, passwords, hosts, ports, paths, queries, and fragments. The library supports simple path editing, query argument changes, fragment manipulation, inline method chaining, and URL joining.
    Downloads: 0 This Week
  • 7
    newspaper4k

    Python library for scraping and analyzing online news articles easily

    ...Newspaper4k also includes natural language processing capabilities that can generate summaries and identify keywords from extracted article text. Newspaper4k supports both single-article extraction and full news site processing, allowing users to build sources representing entire publications and iterate through their articles. It maintains compatibility with the original project so that existing code written for newspaper3k can continue working with minimal changes.
    Downloads: 0 This Week
  • 8
    Geziyor

    Blazing fast Go framework for web crawling and data scraping tasks

    ...Geziyor includes features for managing requests, handling cookies, respecting robots rules, and exporting collected data in multiple formats. With built-in tools for caching, metrics collection, and proxy management, it enables developers to build robust and customizable scraping systems using Go.
    Downloads: 0 This Week
  • 9
    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a framework of sorts, along with many ready-to-use so-called steps that you can use as building blocks for your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those documents as well. ...
    Downloads: 0 This Week
  • 10
    dxy-covid-19-crawler

    Realtime crawler for COVID-19 outbreak statistics from DXY data

    ...The crawler periodically retrieves pandemic statistics and stores them in a database so that historical changes in the outbreak can be preserved and analyzed later. It was created to make up-to-date infection data more accessible for developers, researchers, and analysts who wanted to build visualizations or conduct data analysis during the early stages of the pandemic. DXY-COVID-19-Crawler automatically crawls data at regular intervals, typically every minute, ensuring that newly published statistics are captured as quickly as possible. Retrieved data is stored in MongoDB and archived so that the entire progression of the outbreak can be traced over time. ...
    Downloads: 0 This Week
  • 11
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from the terminal/shell/command line by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 12
    pspider

    Simple Python framework for building multithreaded web crawlers

    ...PSpider defines a set of modules and utility classes that help developers manage crawling tasks, filter URLs, and process scraped content. By organizing crawling tasks into structured stages, PSpider allows developers to build scalable spiders while keeping the codebase relatively compact and readable. Its modular design also makes it easier to extend the framework with additional features or integrate it into existing Python projects.
    Downloads: 0 This Week
  • 13
    gocrawl

    Polite concurrent web crawler library for Go with flexible hooks

    gocrawl is a lightweight web crawling library written in the Go programming language that enables developers to build custom web crawlers and data extraction tools. gocrawl focuses on providing a minimal yet powerful crawling engine that can be easily extended and adapted for different web scraping or indexing tasks. It is designed to be polite when accessing websites by respecting crawling rules such as robots.txt policies and applying crawl delays for each host.
    Downloads: 0 This Week
  • 14
    serverless-chrome

    Run headless Chrome/Chromium on AWS Lambda

    Serverless Chrome contains everything you need to get started running headless Chrome on AWS Lambda (possibly Azure and GCP Functions soon). The aim of this project is to provide the scaffolding for using Headless Chrome during a serverless function invocation. Serverless Chrome takes care of building and bundling the Chrome binaries and making sure Chrome is running when your serverless function executes. In addition, this project also provides a few example services for common patterns...
    Downloads: 0 This Week
  • 15
    ECommerceCrawlers

    Collection of Python ecommerce and website crawler example projects

    ...It aggregates many independent crawler examples created by contributors and organized into separate subprojects that target specific sites or data sources. These examples demonstrate how to build and operate web scrapers capable of collecting structured information such as product listings, news content, job postings, social media data, and other publicly available web data. It aims to help developers understand the full workflow of web scraping, including request simulation, data extraction, storage, and handling anti-scraping techniques. ...
    Downloads: 3 This Week
  • 16
    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    Django Dynamic Scraper (DDS) is an app for Django built on top of the scraping framework Scrapy. While preserving many of the features of Scrapy, it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project.
    Downloads: 0 This Week
  • 17
    mzitu

    Python crawler that downloads image galleries and analyzes titles

    mzitu is a Python-based web crawling project designed to automatically download and organize image galleries from a specific photography site. It demonstrates how to build a scraper that navigates gallery pages, retrieves image links, and saves the images locally in a structured directory layout. It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. ...
    Downloads: 0 This Week
  • 18
    PGBuild

    Compile your mobile web pages into mobile apps via build.phonegap.com

    PGBuild is a PhoneGap development system that automates the development process by connecting your CMS/web server with the online service [Phonegap Build](http://build.phonegap.com). PGBuild is essentially a web spider that makes offline versions of web pages. The offline version is zipped and sent to the PhoneGap Build service. The spider is controlled by a project file that sets the rules for the spider and the options for the PhoneGap Build service. You may create and manage your PhoneGap project source files manually on your web server or use PGBuild to connect to a CMS to extract content. ...
    Downloads: 0 This Week