Search Results for "text processing" - Page 5

718 projects for "text processing" with 1 filter applied:

  • 1
    Microsoft Works format import library
    libwps is a Microsoft Works file format import filter built on top of librevenge (see https://sourceforge.net/p/libwpd/wiki/librevenge/). Currently, libwps can import all Works word processing formats since about 1995 with some success. It may also be able to import some basic database and spreadsheet files.
    Downloads: 354 This Week
  • 2
    biblatex
    Biblatex is a LaTeX package which provides full-featured bibliographic facilities
    Downloads: 32 This Week
  • 3
    Articlefox is a workflow system that can be used to prepare the articles of a small journal.
    Downloads: 0 This Week
  • 4
    FileCut

    Simple cross-platform application to cut and join any text file.

    FileCut is a simple, easy-to-use cross-platform application that cuts any text file at a given line and joins text files, in normal or reverse order. It also works from the command line, e.g. 'java -jar filecut.jar -c file.txt . 10' to cut 'file.txt' at line 10, and 'java -jar filecut.jar -j . >file.txt' to join the files in the current directory into 'file.txt'. FileCut is portable, does not need installation and is developed in Java, so needs the Java Virtual Machine...
    Downloads: 2 This Week
  • 5
    Script Echo Color

    Colorizes terminal text and simplifies script coding.

    ScriptEchoColor simplifies Linux terminal text colorizing, formatting and several steps of script coding.
    Downloads: 6 This Week
  • 6
    GATE
    Note that the source code and issue tracker have now moved to GitHub. Find us at https://github.com/GateNLP/. GATE (General Architecture for Text Engineering) is an architecture, framework and development environment for developing, evaluating and embedding Human Language Technology. See http://gate.ac.uk for full details.
    Downloads: 1 This Week
  • 7

    xmlj

    XMLJ is a Java XML Editor and validator project.
    Downloads: 0 This Week
  • 8
    Unihanconver

    Traditional/Simplified Chinese conversion with CLI or GUI

    Tool to convert between Traditional/Simplified Chinese directly in Unicode (not GB/Big5 conversion). It is written in Perl and does not use any external libraries. It provides a command-line utility as well as a GTK+ interface for X Window.
    Downloads: 0 This Week
  • 9

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 0 This Week
  • 10
    SubLin

    Software tool to subtract lines of any text file from another.

    SubLin is a simple, easy-to-use cross-platform application to subtract the lines of one text file from another. Matching can be case-sensitive or case-insensitive. It also works from the command line, e.g. "java -jar sublin.jar -s file1.txt file2.txt >new_file1.txt" to create the output file "new_file1.txt", or "java -jar sublin.jar -s file1.txt file2.txt >>new_file1.txt" to create or append to the output file "new_file1.txt". SubLin is portable, does not need installation and is developed in...
    Downloads: 0 This Week
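The line subtraction that the SubLin entry describes can be sketched in a few lines of Python. This is an illustration only, not SubLin's code: the function name `subtract_lines`, the `ignore_case` flag, and the reading that the result is the first file's lines minus any line found in the second file are all assumptions.

```python
def subtract_lines(first, second, ignore_case=False):
    """Return the lines of `first` that do not occur in `second`.

    Both arguments are iterables of lines without trailing newlines.
    The order of `first` is preserved (an assumption, not documented
    SubLin behavior).
    """
    # Normalize lines for comparison only; the original text is what we return.
    normalize = str.casefold if ignore_case else (lambda line: line)
    to_remove = {normalize(line) for line in second}
    return [line for line in first if normalize(line) not in to_remove]
```

To apply it to files, the lines could be read with `Path("file1.txt").read_text().splitlines()` from `pathlib` and the result written back joined with newlines.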
  • 11
    DupRem

    Simple application to remove duplicate and empty lines from text files.

    DupRem is a simple, easy-to-use cross-platform application to remove duplicate and empty lines from any text file. Matching can be case-sensitive or case-insensitive. It also works from the command line, e.g. "java -jar duprem.jar -r input_file.txt >output_file.txt" to create the output file, or "java -jar duprem.jar -r input_file.txt >>output_file.txt" to create or append to the output file. DupRem is portable, does not need installation and is developed in Java, so needs the Java Virtual...
    Downloads: 0 This Week
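As a rough illustration of what the DupRem entry describes, the following Python sketch removes empty and duplicate lines from a list of lines. It is not DupRem's implementation. The name `remove_duplicates`, the `ignore_case` flag, keeping the first occurrence of each line, and treating whitespace-only lines as empty are assumptions made for the example.

```python
def remove_duplicates(lines, ignore_case=False):
    """Drop empty lines and repeated lines, keeping first occurrences in order."""
    seen = set()
    result = []
    for line in lines:
        # Assumption: a line containing only whitespace counts as empty.
        if not line.strip():
            continue
        key = line.casefold() if ignore_case else line
        if key not in seen:
            seen.add(key)
            result.append(line)
    return result
```

A set gives constant-time membership checks, so the whole pass stays linear in the number of lines while the separate result list preserves the input order.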
  • 12
    PCSecrets

    Encrypt and manage secret text data

    PCSecrets is a PC application that holds secret text data - protected by a master password and strong encryption. Use it as a password manager or just somewhere to hold any text data securely in one place. It can hold a second, hidden set of secrets that is undetectable and plausibly deniable. The program is also a PC counterpart of the Secrets for Android app. It uses the same data structure and provides synchronization that allows easy transfer of secrets between the two. For those who...
    Downloads: 1 This Week
  • 13
    CSM (Conversational Speech Model)

    A Conversational Speech Generation Model

    The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.
    Downloads: 5 This Week
  • 14
    Agena

    Agena is an interpreted procedural programming language.

    Agena is an easy-to-learn procedural programming language designed for science, scripting, and many other applications. Binaries are available for Windows, Linux, Solaris, OS/2, Mac OS X, Raspberry Pi and DOS.
    Downloads: 209 This Week
  • 15
    Universal Sentence Encoder

    Encoder of greater-than-word length text trained on a variety of data

    The Universal Sentence Encoder (USE) is a pre-trained deep learning model designed to encode sentences into fixed-length embeddings for use in various natural language processing (NLP) tasks. It leverages Transformer and Deep Averaging Network (DAN) architectures to generate embeddings that capture the semantic meaning of sentences. The model is designed for tasks like sentiment analysis, semantic textual similarity, and clustering, and provides high-quality sentence representations in a...
    Downloads: 0 This Week
  • 16
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 4 This Week
  • 17
    Mindwtr

    A complete Getting Things Done (GTD) productivity system for desktop a

    Mindwtr: The Privacy-First GTD System. Mindwtr is a Getting Things Done (GTD) productivity tool designed for "Mind Like Water." It runs completely offline: no accounts, no tracking, and no subscriptions. The Core GTD Workflow. Capture: Instantly offload thoughts to your Inbox. Clarify: Process tasks rapidly with the built-in "2-Minute Rule" timer. Organize: Sort tasks by Contexts (@work, @home), Areas, and Projects. Reflect: Keep your system trustworthy with a guided Weekly...
    Downloads: 7 This Week
  • 18
    Transformers4Rec

    Transformers4Rec is a flexible and efficient library

    Transformers4Rec is an advanced recommendation system library that leverages Transformer models for sequential and session-based recommendations. The library works as a bridge between natural language processing (NLP) and recommender systems (RecSys) by integrating with one of the most popular NLP frameworks, Hugging Face Transformers (HF). Transformers4Rec makes state-of-the-art transformer architectures available for RecSys researchers and industry practitioners. Traditional recommendation...
    Downloads: 0 This Week
  • 19
    YAYI

    Repo for YaYi Chinese LLMs based on LlaMA2 & BLOOM

    YAYI is an open-source large language model project developed to provide a multilingual conversational AI system capable of performing a wide variety of natural language processing tasks. The model is trained on diverse datasets covering multiple languages and domains so that it can support applications ranging from dialogue systems to text analysis and knowledge retrieval. The architecture is based on transformer-style language models optimized for conversational understanding and generation. In addition to producing coherent responses, the system is designed to handle tasks such as summarization, translation, question answering, and text classification. ...
    Downloads: 0 This Week
  • 20
    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly, free and open-source GUI-based tool for combining, splitting and annotating PDF files, converting PDF to Excel or Word and images to PDF, and zipping and unzipping files. It supports inserting, deleting and processing multiple files, and allows you to adjust the order of files before combining.
    Downloads: 2 This Week
  • 21
    GPT-2 Output Dataset

    Dataset of GPT-2 outputs for research in detection, biases, and more

    The GPT-2 Output Dataset is a large collection of model-generated text, released by OpenAI alongside the GPT-2 research paper to study the behaviors and limitations of large language models. It contains 250,000 samples of GPT-2 outputs, generated with different sampling strategies such as top-k truncation, to highlight the diversity and quality of model completions. The dataset also includes corresponding human-written text for comparison, enabling researchers to explore methods for...
    Downloads: 0 This Week
  • 22
    towhee

    Framework that is dedicated to making neural data processing

    ...Towhee includes a pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.
    Downloads: 0 This Week
  • 23

    Change File Encoding

    Change encoding of text files.

    Change File Encoding is a utility that allows you to change the encoding of text files. For example, files saved in US-ASCII can be converted to UTF-8. Over 170 encodings are supported. Requires Java 1.8 or higher.
    Downloads: 2 This Week
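The kind of conversion the Change File Encoding entry describes (for example US-ASCII to UTF-8) can be sketched with Python's built-in codecs. This is a minimal sketch under stated assumptions, not the utility's own code: the function name `change_encoding` and its parameters are hypothetical, and the set of encodings available is whatever the local Python installation supports rather than the list the utility ships with.

```python
from pathlib import Path


def change_encoding(src, dst, from_encoding, to_encoding="utf-8"):
    """Re-encode the text file at `src` and write the result to `dst`."""
    # Work on raw bytes so line endings are carried over unchanged.
    raw = Path(src).read_bytes()
    # decode() raises UnicodeDecodeError if `raw` is not valid in from_encoding.
    text = raw.decode(from_encoding)
    # encode() raises UnicodeEncodeError if a character has no representation
    # in the target encoding.
    Path(dst).write_bytes(text.encode(to_encoding))
```

A call such as `change_encoding("in.txt", "out.txt", "ascii", "utf-8")` would cover the US-ASCII example. Because every ASCII byte sequence is already valid UTF-8, the output bytes in that particular case are identical to the input.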
  • 24
    funNLP

    Resources, corpora, and tools for Chinese natural language processing

    FunNLP is a large, curated collection of resources, corpora, and tools for Chinese natural language processing (NLP). It aggregates datasets, lexicons, wordlists, sentiment dictionaries, knowledge graphs, and pretrained model references, serving as a one-stop resource hub for Chinese NLP practitioners. The repository is organized into categories such as sentiment analysis, text classification, named entity recognition, knowledge graphs, and various lexicons (e.g. sensitive words, emotion dictionaries, stopwords). ...
    Downloads: 0 This Week
  • 25
    ekho

    Chinese text-to-speech engine

    ekho is a project with relatively sparse documentation, but from the repository it appears to be a small-scale tool for audio processing and playback, possibly with features for speech synthesis or manipulation. The repo includes scripts and configuration files suggesting interactions with media/audio handling libraries. Because of limited README detail, it seems targeted at users comfortable reading and modifying code, rather than end users expecting polished UIs. The code structure implies...
    Downloads: 11 This Week