Showing 25 open source projects for "processing"

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 2
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that sorts documents by codepage.
    Downloads: 4 This Week
  • 3
    This package contains tools that add NLP capabilities to Lucene 4.x (it has been tested with Lucene versions 4.6.x to 4.8.1). Although it was originally developed for German, it is mostly language independent. It allows the user to lemmatize words to be indexed, to weight terms by their parts of speech (e.g. weighting nouns more heavily than pronouns), and to add synonyms taken from GermaNet or a list you provide to the search index, thereby increasing the recall of Lucene.
    Downloads: 0 This Week
  • 4
    SEO & SEM - Marketing Text Writer

    Open Source SEO & SEM Text Creation Tools for free Article Writer

    Open Source Tool for Search Engine Optimization (SEO & SEM) used for automatic content processing. These SEO content generators and article writers are based on Text Writer: https://www.artikelschreiber.com/en/ https://www.unaique.net/en/ https://www.unaique.com/ https://www.artikelschreiben.com/ https://www.buzzerstar.com/ https://googleduplicatecontentsolver.sourceforge.io/ https://inkassos.github.io/inkasso/ https://www.artikelschreiber.com/opensource/ https://www.sebastianenger.com/ https://www.artikelschreiber.com/marketing/review/ https://muckrack.com/markus-muller https://linktr.ee/textgenerator Code contains: Perl source code, language databases, and more.
    Downloads: 0 This Week
  • 5
    This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
    Downloads: 0 This Week
  • 6
    The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations, a promising resource for NLP and related research.
    Downloads: 0 This Week
  • 7
    A system to perform analysis of large documents for the purpose of cataloging similar documents. Similarity is based upon contextual analysis of these documents done by identifying common words and proper nouns.
    Downloads: 0 This Week
  • 8
    Provides a robust and efficient implementation of n-gram based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and there are many more applications.
    Downloads: 0 This Week
  • 9
    Pypes is a framework that allows users to break complex data processing logic down into a series of smaller, less complex tasks. These tasks, referred to as components, can then be connected so that the output of one becomes the input to another.
    Downloads: 0 This Week
  • 10
    SYRAH aims to discover and represent concepts expressed in natural languages. Keywords: NLP, lemma, lemma dictionary, Italian, network, semantics, clustering.
    Downloads: 0 This Week
  • 11
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 5 This Week
  • 12
    Switchboard is a conceptual-level interface to many web and network related functions (SOAP, REST, XML parsing, screen-scraping, FTP, network sniffing), designed for the Processing environment.
    Downloads: 0 This Week
  • 13
    (Almost) everything a scholar in the Humanities needs (polytonic Greek fonts, stylistic and metrical analysis tools, search engines for TLG and PHI) collected on a single Linux Live CD, ready to use anywhere, at home or at university, without installation.
    Downloads: 0 This Week
  • 14
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 15
    Estraier is a personal full-text search system for web sites, local file systems, mailboxes, and so on. Estraier has a flexible interface and can handle multilingual documents and various file formats with external plug-ins.
    Downloads: 0 This Week
  • 16
    The Infomap NLP software performs automatic indexing of words and documents from free-text corpora, using a variant of LSA to enable information retrieval and other applications. It was developed by the Infomap Project at Stanford University's CSLI.
    Downloads: 0 This Week
  • 17
    Fast local file search using Lucene, HTMLParser, and Highlighter. Now supports Chinese.
    Downloads: 0 This Week
  • 18
    TM4J is a topic map engine implemented entirely in Java. Topic maps are a standard paradigm for the interchange of knowledge structures. This project aims to produce a complete suite of tools for creating, processing and publishing topic map information.
    Downloads: 5 This Week
  • 19
    The SimpleRDF/XSL template simplifies RDF/XML sources as much as possible to allow easy processing. The SimpleRDF/PHP5 parser takes advantage of SimpleRDF/XSL. It has an extremely simple API. You can parse any RDF/XML compatible document (incl. RSS) and much more...
    Downloads: 0 This Week
  • 20
    The "Universal Content Evaluation and Categorisation Software" is a program for analysing a website’s, or more generally, a text’s content. The text is arranged in dozens of categories, permitting more efficient web searches and information processing.
    Downloads: 0 This Week
  • 21
    A C++ library for processing Internet Archive ARC, CDX, and DAT files.
    Downloads: 0 This Week
  • 22
    The DocConversion project provides a distributed document conversion solution with a well-defined API, which makes use of existing conversion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
    Downloads: 0 This Week
  • 23
    This code supplies miniature pedagogical Java implementations of information retrieval, spidering, and text-processing software. It was initially developed for an introductory course on Intelligent Information Retrieval and Web Search at UT Austin.
    Downloads: 0 This Week
  • 24
    palbum is a Perl script that turns a directory (or directory hierarchy) full of images into a nice image gallery. It generates thumbnails and index.html files, and requires no configuration. It uses the ubiquitous netpbm library for image processing.
    Downloads: 0 This Week
  • 25
    ...Uses regular expressions to search a set of DOM nodes, and transparently handles highlighting matches that span multiple elements. Highlight events are passed to a user-defined highlighter for processing.
    Downloads: 0 This Week