Showing 24 open source projects for "python web crawler"

  • 1
    Archivematica

    Free and open-source digital preservation system

    Archivematica is a web- and standards-based, open-source application which allows your institution to preserve long-term access to trustworthy, authentic, and reliable digital content. Our target users are archivists, librarians, and anyone working to preserve digital objects. You are free to copy, modify, and distribute Archivematica with attribution under the terms of the AGPLv3 license. Archivematica is an open-source application based on recognized standards that makes it possible to...
    Downloads: 5 This Week
  • 2
    Zero Install
    Zero Install is a decentralised cross-distribution software installation system. Create one package that works everywhere! With dependency handling and automatic updates, full support for shared libraries, and integration with native package managers
    Downloads: 3,663 This Week
  • 3
    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library that automates solving reCAPTCHA v2 image and audio challenges using Mozilla's DeepSpeech, PocketSphinx, and the Microsoft Azure, Google Speech, and Amazon Transcribe speech-to-text APIs. It also uses image recognition to detect the object named in the captcha. Built with Pyppeteer (a Chrome automation framework similar to Puppeteer), PyDub for converting MP3 files to WAV, aiohttp for a minimal async web server, and Python's built-in asyncio.
    Downloads: 0 This Week
  • 4
    mediaTUM is free software written in Python for archiving and retrieval of images, documents and other research data. It was originally developed in the framework of the DFG project IntegraTUM and is continuously expanded with new functionalities as required.
    Downloads: 0 This Week
  • 5
    The archive-crawler project is building Heritrix: a flexible, extensible, robust, and scalable web crawler capable of fetching, archiving, and analyzing the full diversity and breadth of internet-accessible content.
    Downloads: 8 This Week
  • 6
    MarcXimiL is a flexible multi-platform bibliographic similarity analysis framework. Features: deduplication, information monitoring, visual analysis, plagiarism detection. Supported: MARCXML, OAI-PMH2 harvesting, and importation of text MARC.
    Downloads: 0 This Week
  • 7
    iCamp is a research and development project funded by the European Commission. The project aims at creating an infrastructure for collaboration and networking in Higher Education across systems.
    Downloads: 1 This Week
  • 8
    A tool for autonomous and virtual topical data integration using the focused web-harvesting method.
    Downloads: 0 This Week
  • 9
    Trident Project
    A digital repository and metadata editing initiative of Duke University Libraries
    Downloads: 0 This Week
  • 10
    This is an attempt to make an open source bookmarking system that supports tagging, distributed data storage, genetic "splicing" of strains of bookmarking tags and much more!
    Downloads: 0 This Week
  • 11
    Make AsciiDoc part of your literate programming tool set. With eWEB you can weave and tangle literate programs written as AsciiDoc documents, using embedded WEB code snippets.
    Downloads: 1 This Week
  • 12
    This project will provide translation of mathematical content, from TeX to MathML and vice-versa, and to graphics formats, as a web service. TeX, running as a daemon, is used for mathematical typography.
    Downloads: 0 This Week
  • 13
    Museum portal based on Plone and PostgreSQL presenting archive-, photo-, subject matter and book materials in addition to online articles. Supports importing of data from museum systems in CIDOC XML format.
    Downloads: 0 This Week
  • 14
    The Flamenco search and browse interface framework uses hierarchical faceted metadata to allow users to both refine and expand queries, while maintaining a consistent representation of a collection and seamlessly integrating keyword queries.
    Downloads: 1 This Week
  • 15
    Whiki is a hierarchic data structure. This project contains a web content management system to handle a whiki database.
    Downloads: 0 This Week
  • 16
    The Open Archive Cataloguer (zOAC) project applies the OAI-PMH protocol for automatic metadata harvesting and aggregation of bibliographic records and has been developed over the web application server Zope. Based on Pentila's ZOpenArchives Zope Product.
    Downloads: 0 This Week
  • 17
    A web-based search interface tailored to the New Zealand Gazette PDF archive for the NZ library community. A generic Python-based Swish-e search interface.
    Downloads: 0 This Week
  • 18
    CiteULike is a free service to help academics share, store, and organise the papers they're reading. This open source project contains the code to scrape citations from publishers' web sites.
    Downloads: 0 This Week
  • 19
    A collection of useful code tips gathered into a utility package, especially for Chinese users, together with a mail-centered blog system: posts are published to the blog entirely by email. As an open project, it lets the Chinese Python community easily share tip code in a uniform package and enjoy open source...
    Downloads: 0 This Week
  • 20
    Methodios intends to be a library software based on well known background free software such as PostgreSQL and Python.
    Downloads: 0 This Week
  • 21

    open-tamil

    Tamil Tools, Tamil Library for Python 2, 3

    Open-Tamil is a full-featured Tamil text processing library in Python. It works fully in Python 2 and 3 and is published on the Python Package Index (installable via pip). See: https://pypi.python.org/pypi/Open-Tamil/0.67
    Downloads: 0 This Week
  • 22
    cosmos

    Algorithms that run our universe | Your personal library of every algo

    Cosmos (by OpenGenus Foundation) is your personal offline collection of every algorithm and data structure one will ever encounter and use in a lifetime. This provides solutions in various languages spanning C, C++, Java, JavaScript, Swift, Python, Go and others. This work is maintained by a community of hundreds of people and is a massive collaborative effort to bring the readily available coding knowledge offline.
    Downloads: 0 This Week
  • 23
    PyShelf

    FOSS Ebook Server, With no windowing requirements

    PyShelf is an open-source, Python-based ebook server that does not and never will require a windowing system.
    Downloads: 0 This Week
  • 24
    abelujo

    Free software for bookshops.

    Abelujo helps bookshops or publishers manage their stock of books. It provides a quality bibliographic search, works with a barcode scanner, lets you distribute books across many places, records sales, understands deposits, can export lists to txt, pdf or csv (Excel or LibreOffice), has statistics, and more.
    Downloads: 0 This Week