XUProxy is an extensible multi-protocol proxy based on the Twisted framework. It supports multiple protocol plugins (currently only HTTP), and multiple "filter" plugins for things like logging, caching, and Proxomitron-compatible ad filtering.
An HTTP functional and non-functional (load and performance) testing toolkit based on Jython/Grinder (http://grinder.sf.net). It includes support for SOA services, REST, JSON/XML encoding, AES, and WS-Security, plus a stub to collect requests.
A Python package that finds repetitive format patterns in HTML pages and uses them to extract information. The idea is that a page containing some kind of list will show a repetitive pattern visible to the human eye (the page format).
Now hosted at: https://github.com/plastex/plastex
plasTeX is a Python-based LaTeX document processing framework. It gives DOM-like access to a LaTeX document, as well as the ability to generate multiple output formats (e.g. HTML, DocBook, tBook, etc.).
html2wordml is a Python application for converting HTML pages to a WordML (Microsoft Word XML) document. The application can be used to create a new WordML document or to merge content into an existing template.
A Python module that lets you output HTML code from within a Python script efficiently and conveniently. Code your web page like a GUI! Create tags and modify their attributes at any time during your script. http://pyh/googlecod
This Python script takes an exported WordPress XML file and outputs a single HTML document containing all posts in order of entry, plus a table of contents broken down by category. CSS classes are added for easy formatting.
An HTML scraper that uses machine learning frameworks to extract labelled fields from raw HTML. The project also involves the development of a tool to display the semi-structured data generated by the scraper component.
ZML, the Zeitung Markup Language, is a simple CMS for small newspapers. It was specifically designed to publish a student newspaper in print and on the Web. It uses LaTeX and XHTML. So far, it is documented in German only.
A Python tool for creating websites or project documentation. Pages can be stored as reST (text) or HTML. With a simple templating and macro system, it can autogenerate index pages and navigation links. It also provides facilities for multiple translations.
A client-side blog tool written in Python. It features categories, auto-FTP, support for three different entry markups, auto-generated archives and RSS feeds, and a flexible templating and macro system. Its plugin system includes an emailer, a spell checker, etc.
PyBookmark manipulates bookmark files. It can sync files (no server required), merge, sort, remove duplicates, and check links. Its library, pybookmarklib, provides access to these operations, data structures, and a parser for further extensibility.
SciBook is a framework for XSLT transformation from XHTML to HTML. The transformation can be extended by adding plugins. The standard LaTeX plugin can convert LaTeX code to images.