Showing 82 open source projects for "python web crawler"

  • 1
    Budou

    Budou is an auto organizer tool for beautiful line breaking in CJK

    Budou is a Python library developed by Google to improve web typography for CJK (Chinese, Japanese, Korean) languages by producing semantically meaningful line breaks. Unlike English, CJK scripts lack spaces or hyphenation cues, often resulting in awkward or unreadable text wrapping on web pages. Budou addresses this issue by segmenting sentences into logical lexical chunks and wrapping each chunk in non-breaking HTML <span> tags.
    Downloads: 0 This Week
  • 2
    Full Stack FastAPI Couchbase

    Full stack, modern web application generator

    A full-stack, modern web application generator using FastAPI, Couchbase as the database, Docker, automatic HTTPS, and more. Couchbase has a great set of features that are not easily or commonly found in alternatives. REST backend tests are based on Pytest and integrated with Docker, so you can test the full API interaction, independent of the database. As it runs in Docker, it can build a new data store from scratch each time (so you can use ElasticSearch, MongoDB, or whatever you want, and just test...
    Downloads: 0 This Week
  • 3
    Bots open source edi translator

    Bots is a complete translator for edi: edifact, x12, xml, tradacoms

    Bots is a complete translator for EDI (Electronic Data Interchange). Supported EDI data formats include EDIFACT, X12, TRADACOMS, and XML. Mail: http://groups.google.com/group/botsmail Web site: http://bots.sourceforge.net Wiki: http://bots.readthedocs.io Development: https://github.com/eppye-bots/bots
    Downloads: 20 This Week
  • 4
    pyspider

    A powerful Spider(Web Crawler) system in Python

    pyspider is a powerful spider (web crawler) system in Python. Components are connected by a message queue. Every component, including the message queue, runs in its own process or thread and is replaceable. That means that when processing is slow, you can run many instances of the processor to make full use of multiple CPUs, or deploy to multiple machines. This architecture makes pyspider really fast.
    Downloads: 0 This Week
  • 5

    survol

    RDF-based framework monitoring business systems activity

    A Python agent and a web interface that help analyze and investigate a legacy application: a set of machines, processes, databases, programs, etc., all communicating with each other, manipulating your data, and whose software architecture has become, over time, complicated, difficult to understand, and undocumented. Data are aggregated with an RDF inference engine, creating a global view of the business information processing.
    Downloads: 0 This Week
  • 6

    redfish-lab

    Get started with the Redfish RESTful API from the DMTF

    Redfish-lab allows a smooth ramp-up with the Redfish RESTful API on an HPE ProLiant server, including UEFI/BIOS configuration with various scripting languages such as PowerShell and Python. Short tutorials and articles are also available in the Wiki section.
    Downloads: 0 This Week
  • 7
    Extended Memory Semantics (EMS)

    Persistent shared object memory and parallelism for Node.js and Python

    EMS makes possible persistent shared memory parallelism between Node.js, Python, and C/C++. Extended Memory Semantics (EMS) unifies synchronization and storage primitives to address several challenges of parallel programming. A modern multi-core server has 16-32 cores and nearly 1TB of memory, equivalent to an entire rack of systems from a few years ago. As a consequence, jobs formerly requiring a Map-Reduce cluster can now be performed entirely in shared memory on a single server without...
    Downloads: 0 This Week
  • 8
    PyXB (“pixbee”) is a pure Python package that generates Python source code for classes that correspond to data structures defined by XMLSchema. In concept it is similar to JAXB for Java and CodeSynthesis XSD for C++.
    Downloads: 5 This Week
  • 9
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. The current version is 4.0-2016-08-29.
    Downloads: 4 This Week
  • 10
    QAL

    Query Abstraction Layer

    ...Of course, custom SQL statements are also supported. It is currently distributed as a Python 3 library (pip3 install python3-qal) and as a Debian .deb package. It is related to the Optimal BPM project; see its Optimal Sync application for usage examples. The text of this page is released under the Creative Commons Zero Waiver 1.0 (CC0).
    Downloads: 0 This Week
  • 11
    DoCookBook

    Cookbook Style Document for DocBook Customizations

    This project has been moved to GitHub: https://github.com/tomschr/dbcookbook/ The DoCookBook project aims to create an open source book about DocBook and the DocBook XSL stylesheets written as a cookbook and released under a Creative Commons license.
    Downloads: 0 This Week
  • 12

    acp245

    ACP245 Suite: Brazil Telematics protocol implementation

    This "suite" includes a portable reference implementation and testing tools for ACP245, the automotive telematics protocol defined by the Brazilian government. See: http://www.denatran.gov.br/simrav/simrav.asp
    Downloads: 0 This Week
  • 13

    SPARQL Endpoint interface to Python

    This project has been moved to http://rdflib.github.io/sparqlwrapper/

    A library for querying a SPARQL endpoint from Python
    Downloads: 0 This Week
  • 14
    Easy Equations

    Hand Written Equation Creator

    ...The focus of this utility is to provide user-friendly access to writing mathematical equations, which is helpful for students, lecturers, mathematicians, and researchers who use mathematical equations in documents, PowerPoint presentations, or web sites. Works on Windows as well as Linux platforms. Software requirements: JDK 7 or higher; for Linux users, a Linux platform with kernel version 2.7 or higher. Python is needed only in a Linux environment, to use the COPY functionality; it is preinstalled in recent Linux distributions.
    Downloads: 0 This Week
  • 15
    Framework (scripts, configuration, code) to build free and public services around travel and leisure data. The project makes extensive use of existing data sources such as GeoNames and DBpedia, and adds some glue around those (e.g., links).
    Downloads: 0 This Week
  • 16
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor, and svn/cvs version control to write static websites, scientific articles, or even blogs.
    Downloads: 0 This Week
  • 17
    Aurora Application Server is a new Python Web Application Server and Framework. The main goal of the project is to provide the developer with a complete set of tools to speed up the application development process. See project wiki for more information.
    Downloads: 0 This Week
  • 18

    JS SmartM3 KP API

    JavaScript API for SmartM3

    A port of the SmartM3 Triple Space KP to JavaScript. Due to limitations on JS connectivity, a "WebSocket to TCP" relay has been developed to enable a JavaScript KP to communicate with a SmartM3 SIB. The relay is based on jWebSocketServer. The user manual is currently available only in Italian.
    Downloads: 0 This Week
  • 19
    HeWIT helps you create and fill out forms. It tells you if you've missed anything or made any mistakes. You can then send the form over email, upload to a web site, or pass it by memory stick to whoever needs it next.
    Downloads: 0 This Week
  • 20
    Angosso

    Performance and stability

    Develop: domain secure, and performing secure Domain Name System (DNS) dynamic updates. Domain Name System Security Extensions Servlet API Package The javax.servlet.http package contains a number of classes and interfaces that describe and define the contracts between a servlet class running under the HTTP protocol and the runtime environment provided for an instance of such a class by a conforming servlet container.
    Downloads: 0 This Week
  • 21
    This is the Open Source RESTful client for the take.io platform.
    Downloads: 0 This Week
  • 22
    Efficient web UI for the wget utility, written in Python (Twisted). It works without any web server. The script consists of a single file.
    Downloads: 0 This Week
  • 23
    XUProxy is an extensible multi-protocol proxy based on the Twisted framework. It supports multiple protocol plugins (currently only HTTP), and multiple "filter" plugins for things like logging, caching, and Proxomitron-compatible ad filtering.
    Downloads: 0 This Week
  • 24
    Scripts and tools for OpenStreetMap (OSM). Sample maps can be found at: <http://www.leretourdelautruche.com/map/index.html>
    Downloads: 0 This Week
  • 25
    A tool for autonomous and virtual topical data integration using the focused web-harvesting method.
    Downloads: 0 This Week