Showing 10 open source projects for "command line search text"

  • 1
    Trafilatura

    Python & command-line tool to gather text on the Web

    Trafilatura is a Python package and command-line tool designed to gather text on the Web. It includes discovery, extraction and text-processing components. Its main applications are web crawling, downloads, scraping, and extraction of main texts, metadata and comments. It aims at staying handy and modular: no database is required, the output can be converted to various commonly used formats. Going from raw HTML to essential parts can alleviate many problems related to text quality, first...
    Downloads: 0 This Week
  • 2
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. Its design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from the terminal/shell/command line by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 3
    Easyspider - Distributed Web Crawler

    Easy Spider is a distributed Perl Web Crawler Project from 2006

    .../ https://www.artikelschreiben.com/ https://buzzerstar.com/ https://easyperlspider.sourceforge.io/ http://artikelschreiber.net/ http://sebastianenger.com/ http://unaique.de/ http://unaique.org/ It is fun to look at code written a few years ago and to see how one has improved. If you want to write text automatically, try https://www.artikelschreiber.com/en/ or https://www.unaique.net/en/!
    Downloads: 0 This Week
  • 4
    Yet another web crawler? Yes, but this one uses the full power of regular expressions to accept or reject, examine or ignore, save or refuse pages. You can also use MIME types to do all this. Powerful and flexible.
    Downloads: 0 This Week
  • 5
    An easy-to-set-up web scraper written in Java. It uses a modified regex syntax that lets you quickly write complex patterns to parse data out of a website. It includes a GUI tool for testing your configuration scripts and can be fully automated through the command line.
    Downloads: 1 This Week
  • 6
    Other spiders have a limited link depth, do not follow links in random order, or are combined with heavy indexing machines. This spider has no link depth limit and randomizes which URL is checked next for new URLs.
    Downloads: 0 This Week
  • 7
    elk is a powerful open-source, Python-based command-line web crawler that can recursively search for files and text on websites.
    Downloads: 0 This Week
  • 8
    WebNews Crawler is a specialized web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can perform site-specific extraction to pull out only the actual news content, filtering out the advertising and other cruft.
    Downloads: 0 This Week
  • 9
    A robust, feature-rich, multi-threaded CLI web spider written in Java, using Apache Commons HttpClient v3.0. ASpider downloads any files matching your given MIME types from a website. By default it also tries to match email addresses with a regular expression, logging all results using log4j.
    Downloads: 1 This Week
  • 10
    Webhunter is a distributed, multi-threaded web crawler designed for both general indexing and crawling the web for focused content.
    Downloads: 0 This Week