Showing 25 open source projects for "python web crawler"

  • 1
    Archivematica

    Free and open-source digital preservation system

    Archivematica is a web- and standards-based, open-source application which allows your institution to preserve long-term access to trustworthy, authentic, and reliable digital content. Our target users are archivists, librarians, and anyone working to preserve digital objects. You are free to copy, modify, and distribute Archivematica with attribution under the terms of the AGPLv3 license. Archivematica is an open-source application based on recognized standards that makes it possible to...
    Downloads: 5 This Week
  • 2
    Zero Install
    Zero Install is a decentralised cross-distribution software installation system. Create one package that works everywhere! With dependency handling and automatic updates, full support for shared libraries, and integration with native package managers.
    Downloads: 2,793 This Week
  • 3
    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library that automates solving ReCAPTCHA v2 image and audio challenges using Mozilla's DeepSpeech, PocketSphinx, and the Microsoft Azure, Google Speech, and Amazon Transcribe speech-to-text APIs. It also uses image recognition to detect the object suggested in the captcha. Built with Pyppeteer (a Chrome automation framework similar to Puppeteer), PyDub for converting MP3 files to WAV, aiohttp for a minimal async web server, and Python's built-in asyncio.
    Downloads: 0 This Week
  • 4
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
  • 5
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 6 This Week
  • 6
    MarcXimiL is a flexible multi-platform bibliographic similarity analysis framework. Features: deduplication, information monitoring, visual analysis, plagiarism detection. Supported: MARCXML, OAI-PMH2 harvesting, and importation of text MARC.
    Downloads: 0 This Week
  • 7
    iCamp is a research and development project funded by the European Commission. The project aims at creating an infrastructure for collaboration and networking in Higher Education across systems.
    Downloads: 1 This Week
  • 8
    A tool for autonomous and virtual topical data integration using the focused web-harvesting method.
    Downloads: 0 This Week
  • 9
    Trident Project
    A digital repository and metadata editing initiative of Duke University Libraries
    Downloads: 0 This Week
  • 10
    A web platform to host a virtual library for public and free books.
    Downloads: 0 This Week
  • 11
    This is an attempt to make an open source bookmarking system that supports tagging, distributed data storage, genetic "splicing" of strains of bookmarking tags and much more!
    Downloads: 0 This Week
  • 12
    Make AsciiDoc part of your literate programming tool set. With eWEB you can weave and tangle literate programs written as AsciiDoc documents, using embedded WEB code snippets.
    Downloads: 1 This Week
  • 13
    This project will provide translation of mathematical content, from TeX to MathML and vice-versa, and to graphics formats, as a web service. TeX, running as a daemon, is used for mathematical typography.
    Downloads: 0 This Week
  • 14
    Museum portal based on Plone and PostgreSQL that presents archive, photo, subject-matter, and book materials in addition to online articles. Supports importing data from museum systems in CIDOC XML format.
    Downloads: 0 This Week
  • 15
    The Flamenco search and browse interface framework uses hierarchical faceted metadata to allow users to both refine and expand queries, while maintaining a consistent representation of a collection and seamlessly integrating keyword queries.
    Downloads: 1 This Week
  • 16
    Whiki is a hierarchic data structure. This project contains a web content management system to handle a whiki database.
    Downloads: 0 This Week
  • 17
    The Open Archive Cataloguer (zOAC) project applies the OAI-PMH protocol to automatic metadata harvesting and aggregation of bibliographic records. It is built on the Zope web application server and based on Pentila's ZOpenArchives Zope Product.
    Downloads: 0 This Week
  • 18
    A web-based search interface tailored to the New Zealand Gazette PDF archive for the NZ library community. A generic Python-based Swish-e search interface.
    Downloads: 0 This Week
  • 19
    CiteULike is a free service to help academics share, store, and organise the papers they're reading. This open source project contains the code to scrape citations from publishers' web sites.
    Downloads: 0 This Week
  • 20
    A collection of useful code tips gathered into a utility package, especially for Chinese users, together with a design for a mail-based blog system in which blogging is done entirely by email. As an open project, it lets the Chinese Python community easily share code tips in one uniform package and enjoy open source...
    Downloads: 0 This Week
  • 21
    Methodios aims to be library software built on well-known free software such as PostgreSQL and Python.
    Downloads: 0 This Week
  • 22

    open-tamil

    Tamil Tools, Tamil Library for Python 2, 3

    Open-Tamil is a full-featured Tamil text processing library for Python. It works fully in Python 2 and 3 and is published on the Python Package Index (installable via pip). See: https://pypi.python.org/pypi/Open-Tamil/0.67
    Downloads: 0 This Week
  • 23
    cosmos

    Algorithms that run our universe | Your personal library of every algo

    Cosmos (by OpenGenus Foundation) is your personal offline collection of every algorithm and data structure one will ever encounter and use in a lifetime. This provides solutions in various languages spanning C, C++, Java, JavaScript, Swift, Python, Go and others. This work is maintained by a community of hundreds of people and is a massive collaborative effort to bring the readily available coding knowledge offline.
    Downloads: 0 This Week
  • 24
    PyShelf

    FOSS Ebook Server, With no windowing requirements

    PyShelf is an open source, Python-based ebook server that does not, and never will, require a windowing system.
    Downloads: 0 This Week
  • 25
    abelujo

    Free software for bookshops.

    Abelujo helps bookshops and publishers manage their stock of books. It provides a quality bibliographic search, works with a barcode scanner, lets you distribute books across multiple places, records sales, handles deposits, can export lists to txt, pdf or csv (Excel or LibreOffice), has statistics, and more.
    Downloads: 0 This Week