File Date Author Commit
 LICENSE 2025-05-31 somkietacode somkietacode [e842ce] Initial commit
 README.md 2025-05-31 somkietacode somkietacode [c0afd4] Update README.md
 index.html 2025-05-31 somkietacode somkietacode [d3ab40] Add files via upload
 index.js 2025-05-31 somkietacode somkietacode [d3ab40] Add files via upload
 package.json 2025-05-31 somkietacode somkietacode [d3ab40] Add files via upload

Read Me

AI Web Scraper

A desktop application built with Electron and Gemini AI to extract structured data from websites.

Features

  • Enter one or multiple URLs to scrape (one per line).
  • Provide a custom extraction prompt.
  • Define the JSON schema for output.
  • Simulate a real device with a custom User-Agent header.
  • Display scraping status and results in a table.
  • Export results as a JSON file or copy to clipboard.
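The one-URL-per-line input could be turned into a clean list along the following lines. This is a minimal sketch, not the code in index.js: `parseUrlList` is a hypothetical helper name, and skipping blank lines and rejecting anything that is not HTTP(S) are assumptions about the desired behavior.

```javascript
// Hypothetical helper: split a multi-line text value into valid http(s) URLs.
function parseUrlList(text) {
  const valid = [];
  const rejected = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // ignore blank lines
    try {
      const url = new URL(line); // throws on malformed input
      if (url.protocol === "http:" || url.protocol === "https:") {
        valid.push(url.href); // normalized form of the URL
      } else {
        rejected.push(line); // parsable, but not a web page address
      }
    } catch {
      rejected.push(line);
    }
  }
  return { valid, rejected };
}

const { valid, rejected } = parseUrlList(
  "https://example.com\n\nnot a url\nftp://example.org/file"
);
console.log(valid);
console.log(rejected);
```

Returning the rejected lines separately would let the status table report bad input instead of silently dropping it.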

Prerequisites

  • Node.js (>=14.x)
  • npm (comes with Node.js)

Installation

  1. Clone this repository:
    git clone https://github.com/somkietacode/ai-scrapper.git
    cd ai-scrapper

  2. Install dependencies:
    npm install

Usage

  1. Start the application:
    npm start

  2. In the app window:
    • Paste your Gemini API key.
    • Enter one or more website URLs, one per line.
    • Customize the extraction prompt and the output JSON schema.
    • Click Start Scraping.
    • After scraping completes, view the results, then export them as a JSON file or copy them to the clipboard.
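The workflow above amounts to combining the prompt, the schema, and each fetched page into one request, then serializing whatever comes back. The sketch below is a guess at that shape, not the app's actual code: `buildExtractionPrompt` and `toExportJson` are hypothetical names, the prompt wording is invented for illustration, and the call to Gemini itself is deliberately left out.

```javascript
// Hypothetical helper: merge the user's prompt, the output schema, and the
// page content into a single instruction string for the model.
function buildExtractionPrompt(userPrompt, schemaText, pageContent) {
  const schema = JSON.parse(schemaText); // throws if the schema is not valid JSON
  return [
    userPrompt.trim(),
    "Return only JSON matching this schema:",
    JSON.stringify(schema, null, 2),
    "Page content:",
    pageContent,
  ].join("\n\n");
}

// Hypothetical helper: pretty-print per-URL results for file export or clipboard copy.
function toExportJson(results) {
  return JSON.stringify(results, null, 2);
}

const examplePrompt = buildExtractionPrompt(
  "Extract the page title.",
  '{"type":"object","properties":{"title":{"type":"string"}}}',
  "<h1>Hello</h1>"
);
console.log(examplePrompt);
console.log(
  toExportJson([{ url: "https://example.com/", status: "done", data: { title: "Hello" } }])
);
```

Parsing the schema before sending anything means a malformed schema fails immediately, before any page is fetched or any API quota is spent.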

Configuration

  • The Electron window auto-sizes to the content and is non-resizable.
  • CORS checks are disabled in the main process, and all page fetches run in Node and are requested over IPC, so no external proxy is required.
  • The User-Agent header simulates an iPhone Safari browser.
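The three settings above could map onto Electron and Node roughly as follows. This is a sketch under assumptions, not a copy of index.js: `resizable`, `useContentSize`, and `webPreferences.webSecurity` are real `BrowserWindow` options, but whether the app uses exactly these is a guess. The User-Agent string, the `IPHONE_USER_AGENT` constant, the `fetchPage` function, and the `"fetch-page"` channel name are all illustrative. The Electron calls appear only as comments so the snippet runs in plain Node 18 or later, which provides a global `fetch`.

```javascript
// Assumed BrowserWindow options: a fixed-size window with web security
// (and with it CORS enforcement) turned off.
const windowOptions = {
  useContentSize: true, // width/height refer to the page area, not the window frame
  resizable: false,
  webPreferences: { webSecurity: false },
};
// In the Electron main process: new BrowserWindow(windowOptions)

// Illustrative iPhone Safari User-Agent; the exact string the app sends is an assumption.
const IPHONE_USER_AGENT =
  "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15 " +
  "(KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1";

// Build request options that carry the custom User-Agent header.
function buildRequestOptions(userAgent = IPHONE_USER_AGENT) {
  return { headers: { "User-Agent": userAgent }, redirect: "follow" };
}

// Hypothetical page fetcher. In the app this would run in the main process behind
// an IPC handler, e.g. ipcMain.handle("fetch-page", (_event, url) => fetchPage(url)).
async function fetchPage(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url, buildRequestOptions());
  if (!response.ok) throw new Error(`HTTP ${response.status} for ${url}`);
  return response.text();
}
```

Because the request is made from Node, not from the renderer, the browser's same-origin restrictions do not apply to it, which is why no external proxy is needed.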

License

This project is licensed under the MIT License.


Built with ❤️ by Somkieta Rahim Alex
