Showing 19 open source projects for "web extract"

  • 1
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 33 This Week
  • 2
    ScrapeGraphAI

    Python scraper based on AI

    Extracts content from websites and local documents using LLMs. ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 4 This Week
  • 3
    Tarsier

    Vision utilities for web interaction agents

    At Reworkd, we iterated on all these problems across tens of thousands of real web tasks to build a powerful perception system for web agents... Tarsier! In the video below, we use Tarsier to provide webpage perception for a minimalistic GPT-4 LangChain web agent. Tarsier visually tags interactable elements on a page via brackets + an ID e.g. [23]. In doing this, we provide a mapping between elements and IDs for an LLM to take actions upon (e.g. CLICK [23]). We define interactable elements...
    Downloads: 1 This Week
  • 4
    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a powerful free tool designed to rapidly extract valuable insights from your image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.
    Downloads: 0 This Week
  • 5
    Wapiti

    Wapiti is a web-application vulnerability scanner

    Wapiti is a vulnerability scanner for web applications. It currently searches for vulnerabilities such as XSS, SQL and XPath injections, file inclusions, command execution, XXE injections, CRLF injections, Server Side Request Forgery, Open Redirects... It uses the Python 3 programming language.
    Downloads: 37 This Week
  • 6
    justniffer
    justniffer is a TCP sniffer. It reassembles and reorders packets and displays the TCP flow in a customizable way. It can log network traffic in web server log format. It can also log network service performance (e.g. web server response times) and extract HTTP content (images, HTML, scripts, etc.).
    Downloads: 2 This Week
  • 7
    scraper-with-chatgpt
    A powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with the scrape-it.cloud API.
    Downloads: 0 This Week
  • 8
    IMPORTANT: This project has been moved to GitHub at https://github.com/clstoulouse/motu-client-python. Download the latest version from the release page: https://github.com/clstoulouse/motu-client-python/releases. Motu is a highly efficient and robust web server that bridges the gap between heterogeneous data providers and end users. Motu handles, extracts, and transforms huge volumes of oceanographic data without performance collapse. This client enables you to extract and download...
    Downloads: 0 This Week
  • 9
    Requests-HTML

    Pythonic HTML Parsing for Humans

    This library intends to make parsing HTML (e.g. scraping the web) as simple and intuitive as possible. When using this library you automatically get full JavaScript support! (Using Chromium, thanks to puppeteer.) CSS Selectors (a.k.a. jQuery-style, thanks to PyQuery). XPath Selectors, for the faint of heart. Mocked user-agent (like a real web browser). Automatic following of redirects. Connection pooling and cookie persistence. The Requests experience you know and love, with magical parsing...
    Downloads: 0 This Week
  • 10
    Xplico

    Xplico is a Network Forensic Analysis Tool (NFAT)

    Xplico is a Network Forensic Analysis Tool (NFAT). The goal of Xplico is to extract the application data contained in an internet traffic capture. For example, from a pcap file Xplico extracts each email (POP, IMAP, and SMTP protocols), all HTTP contents, each VoIP call (SIP, MGCP, MEGACO, RTP), IRC, WhatsApp... Xplico is able to classify more than 140 (application) protocols. Xplico can be used as a sniffer-decoder if used in "live mode" or in conjunction with netsniff-ng. Xplico is used...
    Downloads: 47 This Week
  • 11
    GreenOdoo

    Portable Odoo (formerly OpenERP) for Windows and Linux x64

    Portable Odoo (formerly OpenERP) for Windows and Linux x64. Usage: extract the zip file, run start.bat (Windows) or start.sh (Linux), then open http://127.0.0.1:8069 in a browser. Source repository: https://github.com/buke/GreenOdoo (author: wangbuke <wangbuke@gmail.com>).
    Downloads: 3 This Week
  • 12
    Cloud Export is a tool to automatically extract your data from web applications and save it to your local file system for backup purposes, but more extensive than Google Takeout. Plans are based on http://www.dataliberation.org.
    Downloads: 0 This Week
  • 13
    Ymap - Yeast Mapping Analysis Pipeline

    Pipeline for large-scale genome changes analysis of genome datasets.

    The active use repository has migrated over to: https://github.com/darrenabbey/ymap The repository here was errantly created with some large binary files included. Attempts to extract the files from the history here have failed. A copy of the history was successfully scrubbed and then hosted at github. -------- Eukaryotic pathogens have complicated and dynamic genomes. To facilitate analysis of copy number variations (CNV), single nucleotide polymorphisms (SNPs), and loss...
    Downloads: 0 This Week
  • 14

    Penetration-Testing-Toolkit v1.0

    A web interface for various penetration testing tools

    Penetration-Testing-Toolkit is a web-based project that automates network scanning, CMS exploration, generation of undetectable Metasploit payloads, DNS queries, IP-related information lookup, information gathering, domain-related lookups, etc.
    Downloads: 0 This Week
  • 15

    Spondulas

    Spondulas is a browser emulator designed to retrieve web pages for hunting malware

    Spondulas is a browser emulator and parser designed to retrieve web pages for hunting malware. It supports generation of browser user agents, GET/POST requests, and SOCKS5 proxy. It can be used to parse HTML files sent via e-mail. Monitor mode allows a website to be monitored at intervals to discover changes in DNS or content over time. Autolog mode creates an investigation file that documents redirection chains. The retrieved web pages are parsed for links and reported to an output file. More...
    Downloads: 0 This Week
  • 16
    Lioness (Languages Interop Framework)
    Framework for making Windows applications that are a single .exe file in AutoHotKey_L, C++, C#, VB.NET, Java, Groovy, Common Lisp, Nemerle, Ruby, Python, PHP, Lua, Tcl, Perl, Jint, S#, WSH VBScript, HTML/JavaScript/CSS, COM, and PowerShell, without compiling. For .NET 4.
    Downloads: 2 This Week
  • 17
    A framework to extract web data and store it in a local RDBMS, generate assessment reports on the quality of the extracted data, and publish the quality reports on the web.
    Downloads: 0 This Week
  • 18
    PySMBSearch is a crawler and search engine for SMB shares. It consists of a crawler script, which creates an index and stores it in an SQL database, and a CGI script that can be used to extract queries from the database.
    Downloads: 0 This Week
  • 19
    litersta

    Litersta - textual analytics - software

    Unstructured text is no match for Litersta. See further details here: https://litersta.com Working with text becomes effortless when paired with Litersta textual analytics software. Unlike database fields, which are easily queried, text contains unstructured data that must be parsed for key objects that can be transformed into powerful metrics. Litersta textual analytics software leverages statistical algorithms to programmatically locate and extract overall document...
    Downloads: 0 This Week