Crawl websites, sync to vector databases, and power RAG applications. Pre-built integrations for LLM pipelines and AI assistants.
Build data pipelines that feed your AI models and agents without managing infrastructure. Crawl any website, transform content, and push directly to your preferred vector store. Use 10,000+ tools for RAG applications, AI assistants, and real-time knowledge bases. Monitor site changes, trigger workflows on new data, and keep your AIs fed with fresh, structured information. Cloud-native, API-first, and free to start until you need to scale.
Turn traffic into pipeline and prospects into customers
For account executives and sales engineers looking for a solution to manage their insights and sales data
Docket is an AI-powered sales enablement platform designed to unify go-to-market (GTM) data through its proprietary Sales Knowledge Lake™ and activate it with intelligent AI agents. The platform helps marketing teams increase pipeline generation by 15% by engaging website visitors in human-like conversations and qualifying leads. For sales teams, Docket improves seller efficiency by 33% by providing instant product knowledge, retrieving collateral, and creating personalized documents. Built for GTM teams, Docket integrates with over 100 tools across the revenue tech stack and offers enterprise-grade security with SOC 2 Type II, GDPR, and ISO 27001 compliance. Customers report improved win rates, shorter sales cycles, and dramatically reduced response times. Docket’s scalable, accurate, and fast AI agents deliver reliable answers with confidence scores, empowering teams to close deals faster.
The csvdatamix project randomizes CSV input data files to conceal the original state of the data, similar to data masking or data transformation. It also has mapping abilities to translate the data back to its original state.
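To illustrate the idea of randomizing CSV data while keeping a mapping back to the original, here is a minimal Python sketch. It is not csvdatamix's actual code or interface: the function names `mix_rows` and `unmix_rows`, the per-column shuffle, and the sample data are all assumptions made for illustration.

```python
import csv
import io
import random


def mix_rows(rows, seed):
    """Shuffle each column independently (hypothetical helper).

    Returns the mixed rows plus one permutation per column, which
    serves as the mapping needed to restore the original data.
    """
    rng = random.Random(seed)
    n = len(rows)
    mapping = []
    mixed_columns = []
    for col in zip(*rows):
        perm = list(range(n))
        rng.shuffle(perm)
        mapping.append(perm)
        # Position new_pos in the mixed column holds the value from perm[new_pos].
        mixed_columns.append([col[i] for i in perm])
    return [list(r) for r in zip(*mixed_columns)], mapping


def unmix_rows(mixed, mapping):
    """Invert mix_rows using the stored permutations (hypothetical helper)."""
    n = len(mixed)
    restored_columns = []
    for col, perm in zip(zip(*mixed), mapping):
        original = [None] * n
        for new_pos, old_pos in enumerate(perm):
            original[old_pos] = col[new_pos]
        restored_columns.append(original)
    return [list(r) for r in zip(*restored_columns)]


# Made-up sample data, parsed with the standard csv module.
sample_csv = "alice,30\nbob,41\ncarol,25\n"
rows = list(csv.reader(io.StringIO(sample_csv)))
mixed, mapping = mix_rows(rows, seed=7)
restored = unmix_rows(mixed, mapping)
```

In this sketch the mapping has to be stored separately from the mixed file, since anyone holding both can recover the original rows.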
A Python script for getting information on TV shows and movies from thetvdb.org and themoviedb.org. This is a learning experience, and anybody can chime in on anything.
Epicxml is an XQL-like command-line interpreter for managing XML files. Used as a command, Epicxml is a shell-friendly, XPath-like query tool that lets you navigate, find, create, update, delete, and print nodes. It is based on picxml.
Proactively monitor, manage, and support client networks with ConnectWise Automate
Out-of-the-box scripts. Around-the-clock monitoring. Unmatched automation capabilities. Start doing more with less and exceed service delivery expectations.
pyxser stands for python xml serialization and is a python object to XML serializer that validates every XML deserialization against the pyxser 1.0 XML Schema. pyxser is written entirely in C as a python extension.
XUProxy is an extensible multi-protocol proxy based on the Twisted framework. It supports multiple protocol plugins (currently only HTTP), and multiple "filter" plugins for things like logging, caching, and Proxomitron-compatible ad filtering.
SamChanEd is a command-line tool for organizing the channel list on Samsung TVs. It currently supports only analog channels on C-series TV sets. TV icon by http://cemagraphics.deviantart.com/
Inventors: Validate Your Idea, Protect It and Gain Market Advantages
SenseIP is ideal for individual inventors, startups, and businesses
senseIP is an AI innovation platform for inventors that automates any aspect of IP from the moment you have an idea. You can have your idea researched for uniqueness and protected quickly and effortlessly, without expensive attorneys. It is built for business success while securing your competitive edge.
HTTP functional and non-functional (load and performance) toolkit based on jython/grinder (http://grinder.sf.net) ...includes capabilities to support: SOA services, REST, json/xml encoding, AES and WS security ... and a stub to collect requests
The purpose of this project is to define, design, and code a model-sharing system for the RepRap replication machines, so that end users and designers can easily find, create, modify, or build object models.
The London Datastore (http://data.london.gov.uk) was created by the Greater London Authority (GLA) as an innovation towards freeing London’s data. This SourceForge Project will be used to Open Source our development efforts surrounding data formats
Redland is a set of object-based, modular, and portable C RDF libraries providing RDF APIs for the graph and triple storage (librdf), RDF/XML parsing and serializing (Raptor), and SPARQL RDF querying (Rasqal). Language APIs are available in Perl, PHP, Python, Ruby, and others.
now here: https://github.com/plastex/plastex
plasTeX is a Python-based LaTeX document processing framework. It gives DOM-like access to a LaTeX document, as well as the ability to generate multiple output formats (e.g. HTML, DocBook, tBook).
C4me aims to provide a convenient way of editing XML files (and, in the distant future, other modding-related files) for modifications of Sid Meier's Civilization 4. It is in its infancy and not really usable yet. Join and help change that!
With our library you can easily deploy and consume services available on the web. PyServices is a Pythonic library that provides a default interface to web services written in many different protocols. Our objective is to describe and implement
Parse, analyze and -- most importantly -- use COBOL data definitions. This gives you access to COBOL data from Python programs. Write data analyzers, one-time data conversion utilities and Python programs that are part of COBOL systems. Really.
This is an ETL utility that loads data from DBF/XBase files into MySQL. It has a command-line interface and is designed to work without user interaction.
Jeszra is a visual design tool that combines 2D vector graphics and GUI design.
Jeszra is written in Tcl/Tk and creates reusable code for:
* Tcl, Ruby, Lisp and Python.
* DocBook based reference pages.
* SVG import and export.
SDict Viewer is a viewer for dictionaries in the open format developed by AXMASoft (free dictionaries are available for download at http://sdict.com). The primary goal of the project is to provide a usable dictionary app for Nokia Internet Tablets running Maemo.
This is a pure-Python XPath evaluator based on ElementTree. It supports a substantial fraction of the XPath 1.0 specification, but only the self, child, and attribute axes. The parser underlying the evaluator attempts to handle all of XPath 1.0.
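For a sense of what queries limited to the self, child, and attribute axes look like, the sketch below uses the XPath subset built into Python's standard `xml.etree.ElementTree` module. It does not use the project's own evaluator, whose API is not shown here, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration.
doc = ET.fromstring(
    "<library>"
    "<book id='b1' lang='en'><title>Dune</title></book>"
    "<book id='b2' lang='fr'><title>Candide</title></book>"
    "</library>"
)

# Self step (".") followed by two child steps.
titles = [t.text for t in doc.findall("./book/title")]

# Child step with an attribute predicate.
french_ids = [b.get("id") for b in doc.findall("./book[@lang='fr']")]

print(titles)      # ['Dune', 'Candide']
print(french_ids)  # ['b2']
```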
3PT (short for Python-based PowerPoint) is a project that implements basic, native Python-based PowerPoint import and export. The created PPT files are compatible with PPT 97-2003.