Showing 146 open source projects for "x86_64-linux-gnu"

  • 1
    crowdspider is a multi-threaded web crawler. It is (just) a web crawler, NOT an indexer: you have to write some code yourself to save pages or index them in a database.
    Downloads: 0 This Week
  • 2
    ItSucks
    This project is a Java web spider (web crawler) with the ability to download (and resume) files. It is also highly customizable with regular expressions and download templates. All backend functionality is also available in a separate library.
    Downloads: 1 This Week
  • 3
    Folksonomy Web Crawler
    A Web crawler prototype designed to index pages of certain resource sharing platforms based on folksonomy tags. The results are displayed in an Excel spreadsheet.
    Downloads: 0 This Week
  • 4
    An easy-to-set-up web scraper written in Java. It uses a modified regex syntax so that you can quickly write complex patterns to parse data out of a website. It includes a GUI tool for testing your configuration scripts and can be fully automated from the command line.
    Downloads: 1 This Week
  • 5
    Other spiders have a limited link depth, follow links in a non-randomized order, or are combined with heavy indexing machinery. This spider has no link depth limit and picks the next URL to check for new URLs at random.
    Downloads: 0 This Week
  • 6
    bee-rain is a web crawler that harvests and indexes files over the network. You can see the results on the bee-rain website: http://bee-rain.internetcollaboratif.info/
    Downloads: 0 This Week
  • 7
    Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database. It is a great tool for adding search functionality to your web site or building your custom search engine. Sphider is small, easy to set up and...
    Downloads: 0 This Week
  • 8
    Methanol is a scriptable multi-purpose web crawling system with an extensible configuration system and speed-optimized architectural design. Methabot is the web crawler of Methanol.
    Downloads: 0 This Week
  • 9
    elk is a powerful open-source, Python-based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
  • 10
    APC Anti Crawler is a PHP 5 class based on APC that can be used to limit the number of HTTP requests per IP. It stops web crawlers from downloading your entire website.
    Downloads: 0 This Week
  • 11
    The DeDuplicator is an add-on module (plug-in) for the web crawler Heritrix. It offers a means to reduce the amount of duplicate data collected in a series of snapshot crawls.
    Downloads: 0 This Week
  • 12
    This project will provide a tool for users to get a better understanding of the content and structure of an existing website. It will do this by providing a customised web spider as well as extensions to the GUESS graph visualisation application.
    Downloads: 0 This Week
  • 13
    This is a simple link checker. It can crawl any site and help find broken links. It also has an option to download a CSV report. The CSV file includes the URL, the parent page URL, and the status of the page [broken or ok]. It is very useful for search engine optimization.
    Downloads: 0 This Week
  • 14
    Crawler.NET is a component-based distributed framework for web traversal intended for the .NET platform. It consists of loosely coupled units, each realizing a specific web crawler task. The main design goals are efficiency and flexibility.
    Downloads: 0 This Week
  • 15
    WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can do a site-specific extraction to extract the actual news content only, filtering out the advertising and other cruft.
    Downloads: 0 This Week
  • 16
    Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used to accomplish a variety of tasks, for example, screen scraping and link integrity checking.
    Downloads: 0 This Week
  • 17
    NightCrawler is a multithreaded web spider which uses MIME types to download files.
    Downloads: 0 This Week
  • 18
    HtmlClient provides an SGML/HTML/XHTML parser and connection client making web-spidering as easy for developers as actually surfing the web with a premade browser. Based on Apache's HttpClient.
    Downloads: 0 This Week
  • 19
    J-Obey is a Java library/package that gives people writing their own crawlers a stable robots.txt parser. If you are writing a web crawler of some sort, you can use J-Obey to take out the hassle of writing a robots.txt parser/interpreter.
    Downloads: 0 This Week
  • 20
    Funnel is a project for use on intranets, or on selected sites on the Internet, to gather and index information from several different sources and make it available through a sane, usable interface.
    Downloads: 0 This Week
  • 21
    A web spider, based on the availability of URL APIs for most web-based databases, that maps web pages to two-dimensional FreeMind mind maps. Mapp.it runs locally like a web application and uses a small-footprint CherryPy web server.
    Downloads: 0 This Week
  • 22
    Larbin is a Web crawler intended to fetch a large number of Web pages; it should be able to fetch more than 100 million pages on a standard PC with much u/d. This set of PHP and Perl scripts, called webtools4larbin, can handle the output of Larbin and p
    Downloads: 0 This Week
  • 23
    DirIndexFaker is a PHP script designed to produce fake Apache directory listings in order to slow down the web spiders used by the RIAA, MPAA, and other Copyright Cartel members and to overload them with false positives.
    Downloads: 0 This Week
  • 24
    A basic Perl web spider with grandiose aspirations. Supports XML log file output and resumable spidering sessions.
    Downloads: 0 This Week
  • 25
    Robust, featureful, multi-threaded CLI web spider written in Java, using Apache Commons HttpClient v3.0. ASpider downloads any files matching your given MIME types from a website. By default it tries to match email addresses with regular expressions, logging all results using log4j.
    Downloads: 1 This Week