Compliant and Reliable File Transfers Backed by Top Security Certifications
Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.
Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
Start Free Trial
Train ML Models With SQL You Already Know
BigQuery automates data prep, analysis, and predictions with built-in AI assistance.
Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and textprocessing techologies in order to easely extract useful data from arbitrary web pages.
cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techinques. A command line executeable is shipped that allows to sort documents by codepage.
Open Source SEO & SEM Text Creation Tools for free Article Writer
Open Source Tool for Search Engine Optimization (SEO & SEM) used for automatic content processing.
These SEO Content Genrators and Article Writers based on Text Writer:
https://www.artikelschreiber.com/en/
https://www.unaique.net/en/
https://www.unaique.com/
https://www.artikelschreiben.com/
https://www.buzzerstar.com/
https://googleduplicatecontentsolver.sourceforge.io/
https://inkassos.github.io/inkasso/
https://www.artikelschreiber.com/opensource/
https://www.sebastianenger.com/
https://www.artikelschreiber.com/marketing/review/
https://muckrack.com/markus-muller
https://linktr.ee/textgenerator
Code Contains:
- Perl Source code, language databases and more
This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
Provide a robust and efficient implementation of n-gram based classifiers to Java. N-Gram algorithms have shown to be surprisingly good at tasks like guessing the language/encoding from an arbitrary text file. And there are many more applications.
PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
(Almost) all a scholar in the Humanities needs (polytonic Greek fonts, stylistic and metrical analysis tools, search engines on TLG and PHI) concentrated in only one Linux Live CD, ready to use everywhere at home or at University, without installation
The "Universal Content Evaluation and Categorisation Software" is a program for analysing a websites, or more generally, a texts content. The text is arranged in dozens of categories, permitting more efficient web searches and information processing.
The DocConversion project provides a distributed document conversion solution with a well defined API which makes use of existing convstion tools and/or a centralized conversion server. This is part of the PRONIR research at http://www.pronir.nl
This code supplies miniature pedagogical Java implementations of information retrieval, spidering, and text-processing software. It was initially developed for an introductory course on Intelligent Information Retrieval and Web Search in UT Austin.
A highlighter for XML documents, written in Java. Uses regular expressions to search a set of DOM nodes, and transparently handles highlighting matches that span multiple elements. Highlight events are passed to a user defined highlighter for processing.