Showing 16 open source projects for "structured text"

  • 1
    news-please

    Python tool for crawling and extracting structured data from news sites

    news-please is an open source news crawler and information extraction tool designed to collect and structure articles from online news websites. It provides an integrated pipeline that crawls news sites, retrieves article pages, and extracts structured information such as headlines, authors, publication dates, and article text. news-please can recursively follow internal links and read RSS feeds to gather both recent and archived articles from a news outlet when given only the root URL of a site. It combines several established technologies and libraries to perform web crawling and content extraction, enabling reliable processing across a wide range of news sources. ...
    Downloads: 0 This Week
  • 2
    Powerline

    Statusline plugin for vim with prompts for several other applications

    Powerline is a statusline plugin for vim, and provides statuslines and prompts for several other applications, including zsh, bash, tmux, IPython, Awesome, i3 and Qtile. Powerline was completely rewritten in Python to get rid of as much vimscript as possible. This has allowed much better extensibility, leaner and better config files, and a structured, object-oriented codebase with no mandatory third-party dependencies other than a Python interpreter. Using Python has allowed unit testing of...
    Downloads: 0 This Week
  • 3
    Weibo Crawler

    Python crawler for collecting and downloading Sina Weibo user data

    weibo-crawler is a Python-based data collection tool designed to retrieve information from Sina Weibo user accounts. It automates the process of gathering posts, user profile details, and engagement metrics from one or more target accounts. weibo-crawler can extract comprehensive information about users, including profile attributes such as nickname, follower count, following count, and account metadata. It also captures detailed data about each post, including the content, publishing time,...
    Downloads: 4 This Week
  • 4
    Mini QR

    Create & scan cute QR codes easily

    Mini QR is a web app focused on making QR codes feel friendly and design-forward, combining a polished QR generator with a built-in scanner so you can both create and decode codes in the same place. It emphasizes customization so the QR you generate can match a brand, event theme, or personal style, including color and styling controls, framed layouts with labels, and the ability to add a logo image. Because QR reliability matters as much as looks, it exposes practical settings like error...
    Downloads: 5 This Week
  • 5
    SuperSocket

    Extensible socket server application framework for .NET

    ...SuperSocket is designed to be flexible enough for custom binary or text protocols while still offering reusable abstractions for common server patterns. It is most useful for .NET teams that need robust networking infrastructure with room for domain-specific protocol logic.
    Downloads: 0 This Week
  • 6
    Distpicker

    Plugin for picking provinces, cities and districts of China

    ...The plugin is customizable in terms of initial values, placeholder text, and which dropdowns to include, and it also supports dynamic updates if you need to reset or reload region data based on custom logic. It’s especially handy for applications that require structured address input across multiple locales without building and maintaining your own region datasets.
    Downloads: 0 This Week
  • 7
    TreeGraph

    Information Manager (split/analyze/compare/combine).

    For Homepage, Blog, Family Tree, Database, C#|hjt|js|chm Editor. Convert hjt2xml, (c#)cs2xml, chm2xml, js2xml, xml2cs, xml2js, xml2hjt, cs2hjt, hjt2cs, cs2chm, hjt2chm. IE/Opera/Firefox/PocketPC supported.
    Downloads: 0 This Week
  • 8
    mzitu

    Python crawler that downloads image galleries and analyzes titles

    mzitu is a Python-based web crawling project designed to automatically download and organize image galleries from a specific photography site. It demonstrates how to build a scraper that navigates gallery pages, retrieves image links, and saves the images locally in a structured directory layout. It focuses on automating the collection of large sets of images by programmatically parsing page content and iterating through gallery entries. mzitu also includes a simple analysis script that processes downloaded folder names to generate statistics and visualizations. Using text segmentation and frequency analysis, the project can create a word cloud representing common keywords found in the dataset. ...
    Downloads: 0 This Week
  • 9
    Content Repository 5, Content-driven CMS
    The Content Repository 5 middleware contains a fully conforming implementation of the Content Repository for Java Technology API (JCR, specified in JSR 170 and JSR 283). It's a hierarchical content store with support for structured and unstructured content, full-text search, versioning, transactions, observation, and more.
    Downloads: 0 This Week
  • 10
    Expose

    A simple static site generator for photoessays

    Expose is a lightweight static site generator designed specifically for creating photoessays from folders of images and videos. Implemented as a Bash script, it converts directories of media files into cleanly structured static websites with built-in themes. By default, it includes both a blog-style layout and a Medium-inspired theme, but users can also build their own templates. Expose reads associated text files, YAML metadata, and folder structures to automatically generate navigation menus, captions, and styling for each gallery. It supports image and video customization through ImageMagick and FFmpeg, enabling batch effects, filters, watermarks, and even video stabilization. ...
    Downloads: 0 This Week
  • 11
    Piggydb

    Piggydb helps you have more fun with knowledge creation.

    Piggydb is a flexible and scalable knowledge building platform that supports a heuristic or bottom-up approach to discover new concepts or ideas based on your input. You can begin with using it as a flexible outliner, diary or notebook, and as your database grows, Piggydb helps you to shape or elaborate your own knowledge. Piggydb is a Web application provided as a self-contained package that contains a Web server and database engine. With Piggydb, you can create highly structured content...
    Downloads: 1 This Week
  • 12
    ANts P2P
    ANts P2P implements a third-generation P2P network. It protects your privacy while you are connected and makes you untraceable by hiding your identity (IP address) and encrypting everything you send to or receive from others.
    Downloads: 3 This Week
  • 13
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 14
    This module provides a framework for adding advanced articles containing text (structured in sections) together with related images, links and/or additional information to the OpenCms (version 6) content management system.
    Downloads: 0 This Week
  • 15
    High-performance software for information retrieval research. Emphasis on semi-structured text retrieval, especially for HTML and XML. The goal is to facilitate information retrieval research by providing an interchangeable toolkit of functions.
    Downloads: 0 This Week
  • 16
    Chaperon is an LALR(1) parser that parses structured text documents and generates XML documents as output. It includes a parser generator like yacc and a regex scanner like lex. As input, Chaperon uses a grammar written in XML.
    Downloads: 0 This Week