Search Results for "html source extractor" - Page 3

Showing 518 open source projects for "html source extractor"

  • 1
    notebooker

    Productionise & schedule your Jupyter Notebooks

    Productionise and schedule your Jupyter Notebooks, just as interactively as you wrote them. Notebooker is a webapp which can execute and parametrise Jupyter Notebooks as soon as they have been committed to git. The results are stored in MongoDB and searchable via the web interface, essentially turning your Jupyter Notebook into a production-style web-based report in a few clicks.
    Downloads: 7 This Week
  • 2
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 13 This Week
  • 3
    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 28 This Week
  • 4
    Offline HTML Viewer

    Fast offline HTML viewer for opening local HTML files on Windows

    Echo Offline Viewer is a lightweight offline HTML viewer for Windows designed to open and browse local HTML files without requiring an internet connection or a full web browser. The application provides a simple and clean interface for viewing offline web pages, making it useful for archived websites, documentation, and locally stored HTML content. Key advantages include fast startup, minimal system resource usage, and a fully read-only design that ensures files and system data remain...
    Downloads: 59 This Week
  • 5
    Shapash

    Explainability and Interpretability to Develop Reliable ML models

    Shapash is a Python library dedicated to the interpretability of Data Science models. It provides several types of visualization that display explicit labels that everyone can understand. Data Scientists can more easily understand their models, share their results and easily document their projects in an HTML report. End users can understand the suggestion proposed by a model using a summary of the most influential criteria.
    Downloads: 6 This Week
  • 6
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ...Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.
    Downloads: 29 This Week
  • 7
    Unstructured.IO

    Open source libraries and APIs to build custom preprocessing pipelines

    The unstructured library provides open-source components for ingesting and pre-processing images and text documents, such as PDFs, HTML, Word docs, and many more. The use cases of unstructured revolve around streamlining and optimizing the data processing workflow for LLMs. Its modular bricks and connectors form a cohesive system that simplifies data ingestion and pre-processing, making it adaptable to different platforms and efficient at transforming unstructured data into structured outputs.
    Downloads: 5 This Week
  • 8
    Frontend Slides

    Create beautiful slides on the web using Claude's frontend skills

    Frontend Slides is a lightweight tool that enables users to create visually appealing, animation-rich web presentations without requiring knowledge of CSS or JavaScript by leveraging a guided, interactive workflow. It operates on a “show, don’t tell” philosophy, generating visual previews of styles so users can select their preferred design rather than describing it abstractly. The system produces fully self-contained HTML presentations with inline CSS and JavaScript, eliminating the need...
    Downloads: 1 This Week
  • 9
    WeasyPrint

    The awesome document factory

    WeasyPrint is a smart solution helping people to create PDF documents. You can generate gorgeous statistical reports, invoices, tickets, and anything you want as long as you have some web design skills! Design your documents just as you design your websites! WeasyPrint follows the widely used HTML and CSS specifications from the W3C. You can use your usual web tools, languages and frameworks, but for print. Creating high-quality digital documents requires features that you love to use as...
    Downloads: 19 This Week
  • 10
    Best-of Web Development with Python

    A ranked list of awesome python libraries for web development

    This curated list contains 570 awesome open-source projects with a total of 2.4M stars grouped into 26 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
    Downloads: 13 This Week
  • 11
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models.
    Downloads: 4 This Week
  • 12
    voila

    Voilà turns Jupyter notebooks into standalone web applications

    From notebooks to standalone web applications and dashboards. Voilà allows you to convert a Jupyter Notebook into an interactive dashboard that allows you to share your work with others. It is secure and customizable, giving you control over what your readers experience. Unlike the usual HTML-converted notebooks, each user connecting to the Voilà tornado application gets a dedicated Jupyter kernel which can execute the callbacks to changes in Jupyter interactive widgets. To render the bqplot...
    Downloads: 7 This Week
  • 13
    Tally

    Let agents classify your bank transactions

    Tally is an open-source, AI-assisted tool designed to automate the classification of personal financial transactions, helping users turn raw bank data into meaningful categories without manual tagging. At its core, Tally pairs a local rule engine with large language models so that an AI assistant (like Claude Code, Copilot, or any CLI agent) interprets, suggests, and categorizes expenses, savings, subscriptions, and income events based on your own rules and behavior.
    Downloads: 0 This Week
  • 14
    CodeChecker

    CodeChecker is an analyzer tooling, defect database

    CodeChecker is a static analysis infrastructure built on the LLVM/Clang Static Analyzer toolchain, replacing scan-build in a Linux or macOS (OS X) development environment. Executes Clang-Tidy and Clang Static Analyzer with Cross-Translation Unit analysis, Statistical Analysis (when checkers are available). Creates the JSON compilation database by wiretapping any build process (e.g., CodeChecker log -b "make"). Automatically analyzes GCC cross-compiled projects: detecting GCC or Clang...
    Downloads: 8 This Week
  • 15
    ipyvizzu

    Build animated charts in Jupyter Notebook and similar environments

    ipyvizzu is an animated charting tool for Jupyter, Google Colab, Databricks, Kaggle and Deepnote notebooks, among other platforms. It enables data scientists and analysts to use animation for storytelling with data using Python. It's built on the open-source JavaScript/C++ charting library Vizzu. A newer extension, ipyvizzu-story, lets the animated charts be presented right from the notebooks. Since ipyvizzu-story's syntax differs slightly from ipyvizzu's, we suggest starting from the ipyvizzu-story repo if you're interested in using animated charts to present your findings live or to share your presentation as an HTML file.
    Downloads: 9 This Week
  • 16
    Chandra

    OCR model for complex documents with layout-aware structured outputs

    Chandra is an advanced OCR model designed to extract and structure information from complex documents such as tables, forms, handwritten notes, and mathematical content. It focuses on preserving full document layout, meaning that extracted text is accompanied by positional metadata like bounding boxes for each element. Chandra supports multiple output formats including Markdown, HTML, and JSON, making it suitable for downstream processing and integration into data pipelines. It is capable of...
    Downloads: 0 This Week
  • 17
    Trame

    Weave various components and technologies into a Web App

    ...With best-in-class platforms at its core, trame provides complete control of 3D visualizations and data processing. Developers benefit from a write-once environment from trame. trame is an open source project licensed under Apache License Version 2.0 which allows users to create open source or commercial applications without any licensing worries. By relying simply on Python and HTML, trame focuses on one's data and associated analysis and visualizations while hiding the complications of web development.
    Downloads: 7 This Week
  • 18
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.
    Downloads: 4 This Week
  • 19
    UI UX Pro Max

    AI skill that provides design intelligence

    UI UX Pro Max is an open-source AI skill designed to provide intelligent design assistance for professional user interfaces and user experiences across web, mobile, and cross-platform frameworks. It uses an AI reasoning engine to generate complete design systems tailored to project requirements, recommending layouts, typography, colors, spacing, and component structures automatically based on natural language prompts.
    Downloads: 9 This Week
  • 20
    MkDocs

    Project documentation with Markdown

    MkDocs is a fast, simple and downright gorgeous static site generator that's geared towards building project documentation. Documentation source files are written in Markdown, and configured with a single YAML configuration file. Start by reading the introductory tutorial, then check the User Guide for more information. There's a stack of good-looking themes available for MkDocs. Choose between the built in themes: mkdocs and readthedocs, select one of the third-party themes listed on the...
    Downloads: 6 This Week
  • 21
    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the...
    Downloads: 12 This Week
  • 22
    screenshot-to-code

    Drop in a screenshot and convert it to clean code

    screenshot-to-code converts UI screenshots or design images into working front-end code, accelerating the path from concept to prototype. It uses modern vision-capable or code-generating models to infer layout structure, typography, and components, then outputs clean HTML/CSS (often Tailwind) or framework code. A web interface lets you upload images, tune options, and preview generated results, while a backend service orchestrates the model calls and post-processing. The tool focuses on...
    Downloads: 0 This Week
  • 23
    Mercury

    Convert a Python notebook to a web app and share it with non-technical users

    Turn Python notebooks into web applications with the open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute the notebook with new parameters. The core of Mercury is open source under AGPLv3. We provide Mercury Pro with additional features, dedicated support and a friendly commercial license. Mercury is a perfect tool to convert a Python notebook into an interactive web application and share it with non-programmers.
    Downloads: 6 This Week
  • 24
    Pysheeet

    Python Cheat Sheet

    Pysheeet is a community-driven collection of Python code snippets covering common patterns and tasks like sockets, file I/O, data structures, and more. Each snippet is concise and battle-tested, designed to save coding time and reduce boilerplate. With documentation hosted on Read the Docs and an active GitHub repo, it’s a go-to resource for Python developers.
    Downloads: 1 This Week
  • 25
    SublimeLinter-eslint

    This linter plugin for SublimeLinter provides an interface to ESLint

    This linter plugin for SublimeLinter provides an interface to ESLint. It will be used with "JavaScript" files, but since eslint is pluggable, it can actually lint a variety of other files as well. SublimeLinter will detect some installed local plugins, and thus it should work automatically for e.g. .vue or .ts files. If it works on the command line, there is a chance it works in Sublime without further ado. Make sure the plugins are installed locally, colocated with eslint itself. T.i.,...
    Downloads: 6 This Week