Showing 17 open source projects for "python web crawler"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    Autoplot

    Autoplot

    Autoplot is an interactive browser for data on the web

    Autoplot is an interactive browser for data on the web. Give Autoplot a URL or local file name and it creates a sensible plot of the data. Autoplot allows you to interactively browse data stored in ascii, .cdf, netcdf, and many other formats. Autoplot's source has been moved to GitHub. Thanks to SourceForge for many years of hosting!
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    aseryla

    aseryla

    Aseryla code repositories

    This project describes a model of how the semantic human memory represents the information relevant to the objects of the world in text format. It provides a system and a GUI application capable of extracting and managing concepts and relations from English texts. https://aseryla2.sourceforge.io/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    scipion-xmipp

    scipion-xmipp

    Image processing framework to integrate EM software packages.

    Scipion is an image processing framework to obtain 3D models of macromolecular complexes using Electron Microscopy (3DEM). It integrates several software packages and presents an unified interface for both biologists and developers. Scipion allows to execute workflows combining different software tools, while taking care of formats and conversions. Additionally, all steps are tracked and can be reproduced later on. Xmipp is a well-known package in the EM image processing. It is integrated...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    A web-based interface for the Hadoop MapReduce framework that simplifies the process of writing and running MapReduce jobs. Aimed at introducing parallelism concepts in introductory computer science courses.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Simple, Secure Domain Registration Icon
    Simple, Secure Domain Registration

    Get your domain at wholesale price. Cloudflare offers simple, secure registration with no markups, plus free DNS, CDN, and SSL integration.

    Register or renew your domain and pay only what we pay. No markups, hidden fees, or surprise add-ons. Choose from over 400 TLDs (.com, .ai, .dev). Every domain is integrated with Cloudflare's industry-leading DNS, CDN, and free SSL to make your site faster and more secure. Simple, secure, at-cost domain registration.
    Sign up for free
  • 5
    This project aims to be a easy-to-use toolkit of algorithms and utilities for semantic data mining. So far all algorithms are implemented as web services and we provide widgets for their use in the Orange4WS data mining platform.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Ex-Crawler
    Ex-Crawler is divided into 3 subprojects (Crawler Daemon, distributed gui Client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More informations: http://ex-crawler.sourceforge.net
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Web-as-corpus tools in Java. * Simple Crawler (and also integration with Nutch and Heritrix) * HTML cleaner to remove boiler plate code * Language recognition * Corpus builder
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Crawl a set of files, accumulating information on the temporal and spatial extent of the data in each file, for later search and retrieval.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    The Stats Jam project is an extension to Mediawiki that allows users to embed database queries and visualisations into their wiki pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 10
    iDocs is a intellectual document work flow with text mining options project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Design and develop Recommendation and Adaptive Prediction Engines to address eCommerce opportunities. Build a portfolio of engines by creating and porting algorithms from multiple disciplines to a usable form. Try to solve NetFlix and other challenges.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Crawl-By-Example runs a crawl, which classifies the processed pages by subjects and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin to the Heritrix crawler, and was done as a part of GSoC06 program.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    A configurable knowledge management framework. It works out of the box, but it's meant mainly as a framework to build complex information retrieval and analysis systems. The 3 major components: Crawler, Analyzer and Indexer can also be used separately.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Pödznsatch is a open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine and a query interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    A.I. security app. Development ceased.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    HORUS is a system for knowledge acquisition, hypothesis generation, inference and learning. It is an interactive, internet environment accessible to a diverse community of users (public-access or membership basis) - see also UMKAILASH project for more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Green Cloud Online portal

    Green Cloud Online portal

    Green cloud: A versatile sensor based precision agriculture system, in

    In Green Cloud here is a central global server which does the work of global database management along with providing complex analysis services for the users (farmers, scientists) through the client machines and the public access web portal. The existing agricultural databases throughout the world can also be used for inference purposes and their data may also be updated by our server. The meteorological servers can be accessed for climate data.This server-client architecture mandatory...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next