Showing 54 open source projects for "python web crawler"

  • 1
    PowerTalk automatically speaks Microsoft PowerPoint presentations. It is intended for presenters who find speaking difficult, for audiences that include people with visual impairments, and for fun educational uses. It uses the synthesised computer speech provided with Windows.
    Downloads: 21 This Week
  • 2
    A collection of tools for working with the comparative data analysis ontology, including import/export facilities for common phylogenetic file formats and a triple-store framework.
    Downloads: 0 This Week
  • 3
    The Virtual Commons (http://commons.asu.edu) is an open software initiative devoted to computational experiments on collective action and resource governance, funded by Arizona State University's Center for Behavior, Institutions, and the Environment (http://cbie.asu.edu). NOTE: development has moved to GitHub at https://github.com/virtualcommons; please look for the latest versions there.
    Downloads: 0 This Week
  • 4
    This software enables easy creation and sharing of district maps online.
    Downloads: 0 This Week
  • 5
    Web-as-corpus tools in Java: a simple crawler (with Nutch and Heritrix integration), an HTML cleaner that removes boilerplate code, language recognition, and a corpus builder.
    Downloads: 0 This Week
  • 6
    Crawl a set of files, accumulating information on the temporal and spatial extent of the data in each file, for later search and retrieval.
    Downloads: 0 This Week
  • 7
    This project has moved to https://sourceforge.net/projects/synbiowave/ because the name GeneWave is a registered trademark. Please do not use this project anymore.
    Downloads: 0 This Week
  • 8
    This project aims to provide an open source fleet management system with a special focus on modularity and integration.
    Downloads: 0 This Week
  • 9
    The Stats Jam project is an extension to MediaWiki that allows users to embed database queries and visualisations in their wiki pages.
    Downloads: 0 This Week
  • 10
    JLink lets users author flowcharts based on ISO 5807 and IBM standards. Developers can use JLink to add flowcharts to applications, serve a flowchart over the web as PDF or PNG, or create a flowchart dynamically with JavaScript, Python, or Ruby scripts.
    Downloads: 14 This Week
  • 11
    iDocs is an intellectual document workflow project with text mining options.
    Downloads: 0 This Week
  • 12
    Crow - Computational Representation Of Whatever. A platform for the integration and mining of complex and distributed data. It represents cross-linked semantic web documents as a network of software objects and offers easy ways to filter and sort them.
    Downloads: 0 This Week
  • 13
    Wattos is a collection of mostly Java programs for Structural Biology and NMR Spectroscopy. Its programs analyze, annotate, parse, archive, and disseminate experimental NMR data deposited by authors worldwide into the PDB and BMRB.
    Downloads: 0 This Week
  • 14
    The goal of the zAutomation project is to design and implement hardware, firmware, and software for remote control and monitoring of physical objects, using ZigBee technology and the internet. The fields of application are robotics and domotics.
    Downloads: 0 This Week
  • 15
    Design and develop recommendation and adaptive prediction engines to address eCommerce opportunities. Build a portfolio of engines by creating algorithms and porting them from multiple disciplines into a usable form. Try to solve the Netflix challenge and other challenges.
    Downloads: 0 This Week
  • 16
    Crawl-By-Example runs a crawl that classifies the processed pages by subject and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin for the Heritrix crawler and was developed as part of the GSoC06 program.
    Downloads: 0 This Week
  • 17
    iROS is a meta-operating system for technology-rich "interactive rooms". The core components (Event Heap, DataHeap, iCrafter) provide communication, data storage, and service management for an iRoom.
    Downloads: 1 This Week
  • 18
    BioMa is a specimen-based biodiversity database manager. It is designed to store, organize, and manipulate biodiversity-related scientific data for museums, scientific collections, or research projects.
    Downloads: 0 This Week
  • 19
    A configurable knowledge management framework. It works out of the box, but it is meant mainly as a framework for building complex information retrieval and analysis systems. The three major components (Crawler, Analyzer, and Indexer) can also be used separately.
    Downloads: 0 This Week
  • 20
    A dialect of XUL implementing most of Mozilla XUL's Fourth Draft. XML User Interface Language (XUL) is a method for easily creating GUI applications. Lux XUL supports Python scripting via Jython 2.1.
    Downloads: 0 This Week
  • 21
    Pödznsatch is an open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis, and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine, and a query interface.
    Downloads: 0 This Week
  • 22
    A.I. security app. Development ceased.
    Downloads: 0 This Week
  • 23
    Python enterprise integration framework project. A powerful class library based on EAI patterns, plus a modeling and simulation tool.
    Downloads: 0 This Week
  • 24
    Healthcare Xchange Protocol for interoperable communications. Data exchange/transfer, platform independent, XML-RPC, HL7, SOAP, EDIFACT, simple, easy, authenticated, secure, transparent, no geo-restrictions, open source, peer reviewed, collaborative development.
    Downloads: 0 This Week
  • 25
    The Comparative Toxicogenomics Database (under development) will be a publicly available, web-based database of genes and proteins of human toxicological significance. It is being developed using an Oracle 9i database, Tomcat, and Python.
    Downloads: 0 This Week