Showing 90 open source projects for "python web crawler"

  • 1
    Full Stack FastAPI Couchbase

    Full stack, modern web application generator

    Full stack, modern web application generator. Using FastAPI, Couchbase as a database, Docker, automatic HTTPS, and more. Couchbase has a great set of features that is not easily or commonly found in alternatives. REST backend tests based on Pytest, integrated with Docker, so you can test the full API interaction, independent of the database. As it runs in Docker, it can build a new data store from scratch each time (so you can use ElasticSearch, MongoDB, or whatever you want, and just test...
    Downloads: 0 This Week
  • 2
    Bots open source edi translator

    Bots is a complete translator for edi: edifact, x12, xml, tradacoms

    Bots is a complete translator for EDI (Electronic Data Interchange). Supported EDI data formats include edifact, x12, tradacoms, and xml. Mail: http://groups.google.com/group/botsmail Web-site: http://bots.sourceforge.net Wiki: http://bots.readthedocs.io Develop: https://github.com/eppye-bots/bots
    Downloads: 40 This Week
  • 3
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful Spider (Web Crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process or thread and is replaceable. That means that when processing is slow, you can run many instances of the processor and make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast. Since pyspider has various components, you can just run pyspider to start a standalone...
    Downloads: 0 This Week
  • 4

    survol

    RDF-based framework monitoring business systems activity

    A Python agent and a web interface to help analyze and investigate a legacy application: a set of machines, processes, databases, programs, etc., all communicating with each other, manipulating your data, and whose software architecture has become, over time, complicated, difficult to understand, and undocumented. Data are aggregated with an RDF inference engine, creating a global view of the business information processing.
    Downloads: 0 This Week
  • 5
    Extended Memory Semantics (EMS)

    Persistent shared object memory and parallelism for Node.js and Python

    EMS makes possible persistent shared memory parallelism between Node.js, Python, and C/C++. Extended Memory Semantics (EMS) unifies synchronization and storage primitives to address several challenges of parallel programming. A modern multi-core server has 16-32 cores and nearly 1TB of memory, equivalent to an entire rack of systems from a few years ago. As a consequence, jobs formerly requiring a Map-Reduce cluster can now be performed entirely in shared memory on a single server without using...
    Downloads: 0 This Week
  • 6

    redfish-lab

    Get started with the Redfish RESTful API from the DMTF

    Redfish-lab allows a smooth ramp-up with the Redfish RESTful API on an HPE ProLiant server, including UEFI/BIOS configuration with various scripting languages like PowerShell and Python. Small tutorials and articles are also provided in the Wiki section.
    Downloads: 3 This Week
  • 7
    PyXB (“pixbee”) is a pure Python package that generates Python source code for classes that correspond to data structures defined by XMLSchema. In concept it is similar to JAXB for Java and CodeSynthesis XSD for C++.
    Downloads: 15 This Week
  • 8
    Smart-M3 is a functional platform that provides a cross domain search extent for triple based information. Smart-M3 enables smart cross domain applications that rely on information level interoperability.
    Downloads: 1 This Week
  • 9
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript, and suitable for use in any modern web browser, and any page on your web site. Current version is 4.0-2016-08-29
    Downloads: 2 This Week
  • 10
    QAL

    Query Abstraction Layer

    Project has moved to: https://github.com/OptimalBPM/qal QAL is a collection of libraries for mining, transforming and writing data from and to a number of places. Sources and destinations include different SQL and NoSQL backends, file formats like .csv, XML and Excel, and even untidy HTML web pages. It has a database abstraction layer that supports connectivity to Postgres, MySQL, DB2, Oracle, and MS SQL Server. JSON and MongoDB support is coming. It uses XML/JSON formats (self-generated SQL schemas...
    Downloads: 0 This Week
  • 11
    Another plain-text format, aimed at easy editing of outlines.
    Downloads: 0 This Week
  • 12
    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
  • 13

    acp245

    ACP245 Suite: Brazil Telematics protocol implementation

    This "suite" includes a portable reference implementation and testing tools for ACP245, the automotive telematics protocol defined by the Brazilian government. See: http://www.denatran.gov.br/simrav/simrav.asp
    Downloads: 0 This Week
  • 14
    FlightFeather's goal is "social networking for everyone". This means that anyone should have a chance to run a popular social networking site -- on minimal hardware, and without wasting bandwidth.
    Downloads: 0 This Week
  • 15

    SPARQL Endpoint interface to Python

    This project has been moved to http://rdflib.github.io/sparqlwrapper/

    A library for querying a SPARQL endpoint from Python
    Downloads: 0 This Week
  • 16
    Easy Equations

    Hand Written Equation Creator

    Easy Equations is a utility for writing mathematical equations. Its focus is user-friendly equation entry, which is helpful for students, lecturers, mathematicians and researchers who use mathematical equations in documents, PowerPoint or web sites. Works on Windows as well as Linux platforms. Software Requirements: JDK 7 or higher. Linux platform with kernel version 2.7 or higher (for Linux users).python...
    Downloads: 0 This Week
  • 17
    Framework (scripts, configuration, code) to build free and public services around travel and leisure data. The project makes extensive use of existing data sources such as Geonames and dbPedia, and adds some glue around them (e.g., links).
    Downloads: 0 This Week
  • 18
    Meresco is both an OAI Data Provider and a Service Provider. SourceForge is only used to host the source control (subversion). Sources: http://sources.meresco.org/ Binaries: http://repository.cq2.org/ Mail: http://groups.google.com/group/meresco
    Downloads: 0 This Week
  • 19
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor and svn/cvs control to write static web sites, scientific articles or even blogs.
    Downloads: 0 This Week
  • 20
    Aurora Application Server is a new Python Web Application Server and Framework. The main goal of the project is to provide the developer with a complete set of tools to speed up the application development process. See project wiki for more information.
    Downloads: 1 This Week
  • 21

    JS SmartM3 KP API

    JavaScript API for SmartM3

    A port of the SmartM3 Triple Space KP to JavaScript. Due to limitations in JS connectivity, a "WebSocket to TCP" relay has been developed to enable a JavaScript KP to communicate with a SmartM3 SIB. The relay is based on jWebSocketServer. The user manual is currently available only in Italian :(
    Downloads: 0 This Week
  • 22
    HeWIT helps you create and fill out forms. It tells you if you've missed anything or made any mistakes. You can then send the form over email, upload to a web site, or pass it by memory stick to whoever needs it next.
    Downloads: 0 This Week
  • 23
    Angosso

    Performance and stability

    Develop: domain secure, and performing secure Domain Name System (DNS) dynamic updates. Domain Name System Security Extensions. Servlet API Package: the javax.servlet.http package contains a number of classes and interfaces that describe and define the contracts between a servlet class running under the HTTP protocol and the runtime environment provided for an instance of such a class by a conforming servlet container.
    Downloads: 0 This Week
  • 24
    This is the Open Source RESTful client for the take.io platform.
    Downloads: 0 This Week
  • 25
    Python XML Serialization
    pyxser stands for Python XML serialization and is a Python-object-to-XML serializer that validates every XML deserialization against the pyxser 1.0 XML Schema. pyxser is written entirely in C as a Python extension.
    Downloads: 0 This Week
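
The pyspider entry in this list describes components connected by a message queue, where a slow stage can be scaled by running several processor instances. The sketch below illustrates that general pattern with the Python standard library only. It is not pyspider's API: the names `fetcher`, `processor`, and `run_pipeline` are hypothetical, the fetcher fabricates page bodies instead of making HTTP requests, and threads stand in for the separate processes a real deployment might use.

```python
import queue
import threading


def fetcher(urls, fetched):
    """Hypothetical fetcher: emits fake page bodies instead of doing HTTP."""
    for url in urls:
        fetched.put((url, f"<html>{url}</html>"))


def processor(fetched, results):
    """Consume fetched pages until a None sentinel arrives, then stop."""
    while True:
        item = fetched.get()
        if item is None:
            break
        url, body = item
        # Stand-in for real parsing: record the body length per URL.
        results.put((url, len(body)))


def run_pipeline(urls, n_processors=3):
    """Wire a fetcher to several processor workers through queues."""
    fetched = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=processor, args=(fetched, results))
        for _ in range(n_processors)
    ]
    for worker in workers:
        worker.start()
    fetcher(urls, fetched)
    for _ in workers:
        fetched.put(None)  # one sentinel per processor instance
    for worker in workers:
        worker.join()
    collected = {}
    while not results.empty():
        url, size = results.get()
        collected[url] = size
    return collected


if __name__ == "__main__":
    print(run_pipeline(["http://example.com/a", "http://example.com/b"]))
```

Because the stages share nothing but the queues, the number of processor workers can be changed without touching the fetcher, which is the property the description attributes to a queue-connected design.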