Showing 51 open source projects for "json-java"

View related business solutions
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • Business Automation Software for SMBs Icon
    Business Automation Software for SMBs

    Fed up with not having the time, money and resources to grow your business?

    The only software you need to increase cash flow, optimize resource utilization, and take control of your assets and inventory.
    Learn More
  • 1
    changedetection.io

    changedetection.io

    The best free open source website change detection and restock service

    ...Monitor and track PDF file changes, and know when a PDF file has text changes. Know when your favourite product is on sale, or other special deals are announced before anyone else. Detect and monitor changes in JSON API responses.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 2
    ScrapydWeb

    ScrapydWeb

    Web app for Scrapyd cluster management

    ...Add your Scrapyd servers, both formats of string and tuple are supported, you can attach basic auth for accessing the Scrapyd server, as well as a string for grouping or labeling. You can select any number of Scrapyd servers by grouping and filtering, and then invoke the HTTP JSON API of Scrapyd on the cluster with just a few clicks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Scrapy

    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 4
    ScrapeGraphAI

    ScrapeGraphAI

    Python scraper based on AI

    Extracting content from websites and local documents using LLM. ScrapeGraphAI is a web scraping python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Financial reporting cloud-based software. Icon
    Financial reporting cloud-based software.

    For companies looking to automate their consolidation and financial statement function

    The software is cloud based and automates complexities around consolidating and reporting for groups with multiple year ends, currencies and ERP systems with a slice and dice approach to reporting. While retaining the structure, control and validation needed in a financial reporting tool, we’ve managed to keep things flexible.
    Learn More
  • 5
    CyberScraper 2077

    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    html-metadata

    html-metadata

    MetaData html scraper and parser for Node.js (supports Promises

    The aim of this library is to be a comprehensive source for extracting all HTML-embedded metadata. Currently, it supports Schema.org microdata using a third-party library, a native BEPress, Dublin Core, Highwire Press, JSON-LD, Open Graph, Twitter, EPrints, PRISM, and COinS implementation, and some general metadata that doesn't belong to a particular standard (for instance, the content of the title tag, or meta description tags). Planned is support for RDFa, AGLS, and other yet unheard-of metadata types. Contributions and requests for other metadata types are welcome! ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    jsoup

    jsoup

    Java library for working with real-world HTML

    ...The parser will make every attempt to create a clean parse from the HTML you provide, regardless of whether the HTML is well-formed or not. You have HTML in a Java String, and you want to parse that HTML to get at its contents, or to make sure it's well formed, or to modify it. The String may have come from user input, a file, or from the web.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    WebMagic

    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a scalable crawler framework. It covers the whole lifecycle of crawler, downloading, url management, content extraction and persistent. It can simplify the development of a specific crawler. WebMagic is a simple but scalable crawler framework. You can develop a crawler easily based on it. WebMagic has a simple core with high flexibility, a simple API for html extracting. It also provides annotation with POJO to customize a crawler, and no configuration is needed. Some other...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    WFDownloader App

    WFDownloader App

    Free batch downloader for image, wallpaper, video, audio, document,

    ...It's a programmable downloader and also works with password protected sites. Say goodbye to downloading one by one. Go to the Help menu or check out website to get started. Note that this cross-platform version requires Java (minimum version Java 8) to be installed on your Operating System. For non-java required OS specific versions, check app's website.
    Leader badge
    Downloads: 249 This Week
    Last Update:
    See Project
  • BoldTrail Real Estate CRM Icon
    BoldTrail Real Estate CRM

    A first-of-its-kind homeownership solution that puts YOU at the center of the coveted lifetime consumer relationship.

    BoldTrail, the #1 rated real estate platform, is built to power your entire brokerage with next-generation technology your agents will use and love. Showcase your unique brand with customizable websites for your company, offices, and every agent. Maximize lead capture with a modern, portal-like consumer search experience and intelligent behavior tracking. Hyper-local area pages, home valuation pages and options for rich lifestyle data keep customers searching with your brokerage as the local experts. The most robust lead gen tools on the market help your brokerage, teams & agents effectively drive new business - no matter their budget. Empower your agents to generate free leads instantly with our simple to use landing pages & IDX squeeze pages. Drive more leads with higher quality and lower cost through in-house tools built within the platform. Diversify lead sources with our automated social media posting, integrated Google and Facebook advertising, custom text codes and more.
    Learn More
  • 10
    ai-scrapper
    🚀 Discover AI Web Scraper! 🚀 Tired of copying and pasting data from websites? I developed a desktop application with Electron and Gemini AI to extract structured data easily and efficiently! 🤖✨
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    ... - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender : https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk
    Downloads: 17 This Week
    Last Update:
    See Project
  • 12
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 13
    Crawlab

    Crawlab

    Distributed web crawler admin platform for spiders management

    Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium. Please use docker-compose to one-click to start up. By doing so, you don't even have to configure MongoDB database. The frontend app interacts with the master node, which communicates with other components such as MongoDB, SeaweedFS and worker nodes. Master node and worker nodes communicate with each other via gRPC (a RPC framework). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    ACHE Focused Crawler

    ACHE Focused Crawler

    ACHE is a web crawler for domain-specific search

    ACHE is a focused web crawler. It collects web pages that satisfy some specific criteria, e.g., pages that belong to a given domain or that contain a user-specified pattern. ACHE differs from generic crawlers in sense that it uses page classifiers to distinguish between relevant and irrelevant pages in a given domain. A page classifier can be defined as a simple regular expression (e.g., that matches every page that contains a specific word) or a machine-learning-based classification model....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Scrapyd

    Scrapyd

    A service daemon to run Scrapy spiders

    Scrapyd can manage multiple projects and each project can have multiple versions uploaded, but only the latest one will be used for launching new spiders. A common (and useful) convention to use for the version name is the revision number of the version control tool you’re using to track your Scrapy project code. For example: r23. The versions are not compared alphabetically but using a smarter algorithm (the same packaging uses) so r10 compares greater to r9, for example. Scrapyd is an...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    ... - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby Database - Written in Java Cross Platform See also Free Email Sender in this link: https://sourceforge.net/projects/gitst-free-email-ender/ Please install Microsoft OpenJDK to start the application https://www.microsoft.com/openjdk
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    GitGet

    GitGet

    Ever wanted to download only a part of a Git repository.

    Ever wanted to download only a part of a Git repository. Just paste the URL of the repo you want to download and sit back and enjoy. This simple java application makes use of Web Scraping and downloads only those files you need, thus helping you save your precious bandwidth and space.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    crawler4j

    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can setup a multi-threaded web crawler in few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. shouldVisit function decides whether the given URL should be crawled or not.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    ...While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify JASP for advanced data editing and RapidMiner for advanced prediction modeling. DSTK is written in C#, Java and Python to interface with R, NLTK, and Weka. It can be expanded with plugins using R Scripts. We have also created plugins for more statistical functions, and Big Data Analytics with Microsoft Azure HDInsights (Spark Server) with Livy. License: R, RStudio, NLTK, SciPy, SKLearn, MatPlotLib, Weka, ... each has their own licenses.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Save For Offline

    Save For Offline

    Android app for saving webpages for offline reading

    Android app for saving webpages for offline reading. Save For Offline is an Android app for saving full web pages for offline reading, with lots of features and options. In you web browser selects 'Share', and then 'Save For Offline'. Saves real HTML files which can be opened in other apps/devices. Download & save entire web pages with all assets for offline reading & viewing. Save HTML files in a custom directory. Save in the background, no need to wait for it to finish saving. Night mode,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 22
    phoneutria
    A Java Web crawler: multi-threaded, scalable, with high performance, extensible and polite. It can be used to crawl and index any web or enterprise domain and is configurable through a XML configuration file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    WebCollector

    WebCollector is an open source web crawler framework based on Java.

    WebCollector is an open source web crawler framework based on Java.It provides some simple interfaces for crawling the Web,you can setup a multi-threaded web crawler in less than 5 minutes. Github: https://github.com/CrawlScript/WebCollector Demo: https://github.com/CrawlScript/WebCollector/blob/master/YahooCrawler.java
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, the extraction and the storage of information available on the web, including images. The scripting language uses elements of the Regular Expression and xPath syntax. The webStraktor scripting language has a small instruction set and its syntax is easy...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next