Showing 144 open source projects for "text encoding"

  • 1
    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese LLaMA & Alpaca large language model + local CPU/GPU training

    This project has open-sourced the Chinese LLaMA model and the Alpaca large model with instruction fine-tuning to further promote the open research of large models in the Chinese NLP community. Based on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves the basic semantic understanding of Chinese. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, which...
    Downloads: 0 This Week
  • 2
    EnigmaLike
    EnigmaLike is an Enigma-like encoding tool for text files that encodes word-by-word using a dictionary/code book/encryption reel set-up. EnigmaLike is written using PerlTk and has instructions in PDF format. For Linux.
    Downloads: 0 This Week
  • 3

    Language-Aware String Extractor

    multi-encoding strings(1) replacement with language identification

    Enhanced version of the standard Unix strings(1) program which uses language models for automatic language identification and character-set identification, supporting over 1400 languages, dozens of character encodings, and 4800+ language/encoding pairs.
    Downloads: 0 This Week
  • 4
    Penumbra

    Penumbra Color Theme

    ...The palette consists of nine nearly symmetric base colors, which are used to build the main light and dark themes, along with two additional high-contrast dark variants tailored for people with mild to moderate visual impairments. Its design focuses on functionality first, while maintaining an aesthetic quality that draws from familiar natural tones. Beyond its use in text editors and terminal environments, Penumbra’s carefully structured accent palettes are also suited for encoding information in data visualizations, where perceptual uniformity and hue differentiability are critical.
    Downloads: 3 This Week
  • 5
    VideoSrt

    Windows-GUI

    ...Open source software tool that recognizes speech in video and automatically generates SRT subtitle files. It is suited to business scenarios that need fast, batch generation of Chinese/English subtitles and text files for media (video/audio). Features: recognize video/audio speech to generate subtitle files (with Chinese-English translation and bilingual subtitles); extract speech text from video/audio; batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, so accuracy is high: the recognition rate for standard Mandarin/English is over 95%. ...
    Downloads: 37 This Week
  • 6
    OpenPrompt

    An Open-Source Framework for Prompt-Learning

    ...The template is one of the most important modules in prompt learning; it wraps the original input with a textual or soft-encoding sequence. Use the implementations of current prompt-learning approaches: various prompting methods are implemented, including templating, verbalizing, and optimization strategies, under a unified standard. You can easily call and understand these methods.
    Downloads: 0 This Week
  • 7
    q - Text as Data

    Run SQL directly on CSV or TSV files

    q is a command line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files). q treats ordinary files as database tables, and supports all SQL constructs, such as WHERE, GROUP BY, JOINs etc. It supports automatic column name and column type detection, and provides full support for multiple encodings. Use -e data-encoding to set the input data encoding, -Q query-encoding to set the query encoding, and -E output-encoding to set the output encoding. ...
    Downloads: 1 This Week
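
To illustrate the idea behind the q entry above (SQL over a delimited text file, with separate input and output encodings), here is a minimal Python sketch. It is not q's implementation: the function `query_csv`, the table name `t`, and the sample data are all hypothetical, and SQLite from the standard library stands in for q's query engine.

```python
import csv
import io
import sqlite3


def query_csv(data: bytes, data_encoding: str, sql: str, output_encoding: str) -> bytes:
    """Load CSV bytes into an in-memory table named `t`, run `sql`, return encoded CSV rows."""
    # Decode the input with the caller-supplied encoding (the role -e plays in the description).
    text = data.decode(data_encoding)
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    # Column names come from the first row; this sketch assumes they contain no double quotes.
    columns = ", ".join(f'"{name}"' for name in header)
    conn.execute(f"CREATE TABLE t ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", body)

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(conn.execute(sql))
    conn.close()
    # Encode the result with the output encoding (the role -E plays in the description).
    return out.getvalue().encode(output_encoding)


# Hypothetical Latin-1 input, queried and re-emitted as UTF-8.
sample = "name,city\nJos\u00e9,M\u00e1laga\nAnna,Oslo\n".encode("latin-1")
result = query_csv(sample, "latin-1", "SELECT name FROM t WHERE city = 'M\u00e1laga'", "utf-8")
```

The point of the sketch is only that decoding, querying, and encoding are three independent steps, which is why a tool can expose a separate option for each.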
  • 8
    SwiftMailer

    Comprehensive mailing tools for PHP

    SwiftMailer is a flexible, object-oriented PHP library for sending emails via SMTP, Sendmail, or other transports (including third-party APIs). It supports features essential to robust email sending: attachments, HTML vs. plain text bodies, inline images, MIME types, message queues, and signed messages via DKIM. The API lets developers build richly composed messages with multiple parts, headers, and encodings without dealing with raw mail formatting. With built-in retry logic, logging, and...
    Downloads: 0 This Week
  • 9

    file-splitter-rejoiner

    file splitter and rejoiner

    Freeware and open source. Two tools in one application, built on .NET 4.8: (1) a simple file splitter and rejoiner that uses a memory buffer; (2) a simple file base64 encoder and decoder that uses random-sized streams for GB/TB+ data sizes. A good tool for an essentials inventory, for use just when required. Simple, precise, short, and straightforward code. Tested bug-free at the time of development and release. Developer: Tushar Jain. Release Time: 09:33 PM...
    Downloads: 0 This Week
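
The stream-based base64 encoding mentioned in the entry above can be sketched as follows. This is an illustration in Python, not the project's .NET code; the function names and chunk sizes are assumptions. The key detail is that the encoder reads in multiples of 3 bytes and the decoder in multiples of 4 characters, so padding can only appear at the very end and memory use stays constant regardless of file size.

```python
import base64
import io

ENCODE_CHUNK = 3 * 1024  # multiple of 3: no "=" padding appears mid-stream
DECODE_CHUNK = 4 * 1024  # multiple of 4: every block is a whole number of base64 quanta


def encode_stream(src, dst, chunk_size=ENCODE_CHUNK):
    """Base64-encode a binary stream chunk by chunk.

    Assumes src.read(n) returns n bytes until end of data, as regular
    files and BytesIO do.
    """
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(base64.b64encode(block))


def decode_stream(src, dst, chunk_size=DECODE_CHUNK):
    """Decode a base64 stream (no line breaks) chunk by chunk."""
    while True:
        block = src.read(chunk_size)
        if not block:
            break
        dst.write(base64.b64decode(block))


# Round trip on in-memory streams; real use would pass open binary files.
payload = bytes(range(256)) * 50
encoded, decoded = io.BytesIO(), io.BytesIO()
encode_stream(io.BytesIO(payload), encoded)
decode_stream(io.BytesIO(encoded.getvalue()), decoded)
```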
  • 10

    S-DES Crypto App

    Encryption/Decryption demonstration app using the S-DES algorithm

    ...A GUI pops up that allows entering a 10-bit encryption/decryption key in binary (0, 1 digits) and the plaintext/ciphertext in the same form (8-bit). When the algorithm is executed, a description of the encoding/decoding functionality is provided in the three main text boxes.
    Downloads: 0 This Week
  • 11
    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    ...It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings. By throwing these strings at forms, APIs, databases, and UIs, teams can discover encoding bugs, sanitizer gaps, rendering issues, and security oversights early. The list is language-agnostic and repository-friendly, meaning you can consume it from CI pipelines or local scripts with minimal setup. Because it’s crowdsourced, it reflects real issues practitioners have faced in production, not just theoretical cases. Using the list regularly helps harden applications against the fragile edges of text processing and user input.
    Downloads: 0 This Week
  • 12
    JDS: Data Security

    JDS is a powerful data protection program

    JDS is a powerful tool for data protection. Documents are encrypted using a special algorithm. The data encryption algorithm is constantly updated to improve security. The program also provides other functions, such as sending a file to a vault with password access, three types of fast text encryption, and much more. The program is easy to use and does not require special knowledge. The standard version has 2 languages (Russian and English) and 9 themes. You can download the language...
    Downloads: 1 This Week
  • 13
    Unix2Dos

    DOS/Mac to Unix and vice versa text file format converter

    The Dos2unix package includes utilities dos2unix and unix2dos to convert plain text files in DOS or Mac format to Unix format and vice versa. In DOS/Windows text files a line break, also known as newline, is a combination of two characters: a Carriage Return (CR) followed by a Line Feed (LF). In Unix text files a line break is a single character: the Line Feed (LF). In Mac text files, prior to Mac OS X, a line break was a single Carriage Return (CR) character. ...
    Downloads: 12 This Week
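
The three line-break conventions described in the entry above can be illustrated with a short Python sketch. It is not the dos2unix implementation: the function names are hypothetical, and unlike the real utilities, which treat DOS and Mac formats as separate modes, this version normalizes both in one pass. It works on bytes and assumes an ASCII-compatible encoding such as UTF-8 (UTF-16 text would need decoding first).

```python
def to_unix(data: bytes) -> bytes:
    """Convert DOS (CR LF) and old Mac (CR) line breaks to Unix (LF)."""
    # Replace CR LF pairs first so the lone-CR pass does not double the breaks.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def to_dos(data: bytes) -> bytes:
    """Convert any of the three line-break styles to DOS (CR LF)."""
    # Normalize to LF first; otherwise existing CR LF pairs would become CR CR LF.
    return to_unix(data).replace(b"\n", b"\r\n")


mixed = b"dos line\r\nmac line\runix line\n"
unix_text = to_unix(mixed)
dos_text = to_dos(mixed)
```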
  • 14
    YouTokenToMe

    Unsupervised text tokenizer focused on computational efficiency

    YouTokenToMe is a fast and efficient unsupervised text tokenization library designed for training subword embeddings, particularly useful for NLP models.
    Downloads: 0 This Week
  • 15
    Texar

    Toolkit for Machine Learning, Natural Language Processing

    ...Texar-TensorFlow (this repo) and Texar-PyTorch have mostly the same interfaces. Both further combine the best design of TF and PyTorch. Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc, for encoding, classification, generation, and composing complex models with other Texar components!
    Downloads: 0 This Week
  • 16
    Universal Encoder Decoder - AyaN Softwar

    84 Type Encoding/Decoding Options And Full Offline - AyaN Software

    In the era of digital communication, data security, and computer management, character encoding and decoding systems play an important role. You can encode and decode data easily with online tools, but Universal Encoder Decoder can do all types of encoding and decoding quickly. Some of the most advanced features, which make this software different from others, are given below. First of all, it is easy to install and easy to use: just one click and it will start. ...
    Downloads: 10 This Week
  • 17
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application lets the user download chapters from a website in three ways: from a table of contents; from a range (first chapter address, last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your language, add a new class to the strings package, and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 9 This Week
  • 18
    he

    A robust HTML entity encoder/decoder written in JavaScript

    he is a JavaScript library that provides robust HTML entity encoding and decoding, with full Unicode support. It supports all standardized named character references (e.g., &copy;, &mdash;), handles numeric and hex entities, and deals properly with astral Unicode symbols (i.e., code points outside the BMP). The library is designed so that he.decode(input) will safely convert HTML-entity encoded strings into proper Unicode text, and he.encode(text, options) will encode non-ASCII or special characters into safe entity references. ...
    Downloads: 0 This Week
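
As a rough analogue of the encode/decode behavior described in the entry above, the sketch below uses Python's standard `html` module. It does not reproduce he's API or options: `encode_entities` and `decode_entities` are hypothetical names, and the output format (decimal numeric references for non-ASCII characters) is a choice made for this example.

```python
import html


def encode_entities(text: str) -> str:
    """Escape HTML special characters and turn non-ASCII characters into numeric references."""
    escaped = html.escape(text, quote=True)  # handles &, <, >, " and '
    # "xmlcharrefreplace" emits one decimal reference per code point,
    # so astral symbols (outside the BMP) become a single reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def decode_entities(text: str) -> str:
    """Convert named, decimal, and hexadecimal character references back to Unicode text."""
    return html.unescape(text)


original = "a < b \u00a9 \U0001D306"  # includes U+1D306, an astral symbol
encoded_text = encode_entities(original)
decoded_text = decode_entities("&copy; &mdash; &#x1D306;")
```

Because Python strings are sequences of code points rather than UTF-16 code units, the astral symbol needs no special surrogate handling here; in JavaScript that handling is part of what a library like he takes care of.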
  • 19

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions:
    a. Frequency list for single word and N-Grams
    b. Concordance
    c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient)
    d. Lexical patterns search
    e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient
    f. Accept Windows and UTF-8 character...
    Downloads: 0 This Week
  • 20
    TeXML is an XML vocabulary for TeX. The processor transforms the TeXML markup into the TeX markup, escaping special and out-of-encoding characters. The intended audience is developers who automatically generate [La]TeX or ConTeXt files.
    Downloads: 0 This Week
  • 21
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command line executable is shipped that allows sorting documents by codepage.
    Downloads: 21 This Week
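
The delegation pattern described in the entry above (one front end that asks several detectors in turn) can be sketched in Python as follows. This is not cpDetector's code or API: the detector functions, their order, and the `"unknown"` fallback are all assumptions chosen for illustration, and real detectors use far richer heuristics.

```python
import codecs
from typing import Callable, Optional, Sequence

Detector = Callable[[bytes], Optional[str]]

# UTF-32 BOMs are listed before UTF-16 because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Technique 1: look for a byte order mark at the start of the data."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def detect_by_strict_decode(encoding: str) -> Detector:
    """Technique 2: accept an encoding only if the data decodes without errors."""
    def detector(data: bytes) -> Optional[str]:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            return None
    return detector


DEFAULT_CHAIN: Sequence[Detector] = (
    detect_bom,
    detect_by_strict_decode("ascii"),
    detect_by_strict_decode("utf-8"),
)


def detect(data: bytes, detectors: Sequence[Detector] = DEFAULT_CHAIN) -> str:
    """Delegate to each detector in order; the first confident answer wins."""
    for detector in detectors:
        result = detector(data)
        if result is not None:
            return result
    return "unknown"
```

Sorting documents by codepage, as the shipped command line tool is said to do, would then amount to calling `detect` on each file's bytes and grouping by the returned name.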
  • 22

    CryptedNotepad

    Lightweight Microsoft Notepad replacement

    Lightweight Microsoft Notepad replacement: UAC support; automatic encoding recognition; *nix/MS newline recognition; Rijndael-encrypted text files; regex search/replace.
    Downloads: 0 This Week
  • 23
    A Generic Platform for Iris Recognition

    A framework that allows iris recognition algorithms to be evaluated

    ...Summary results are output in plain text log files, detailed result values in CSV files, and graphs in JPG and enhanced metafile formats. Several open source algorithms are included, from Libor Masek, Xin Li and JIRRM. Several implementations from papers are also included.
    Downloads: 4 This Week
  • 24
    MStorage

    MStorage - storage for notes.

    MStorage is storage for notes: it lets you save and catalog notes, articles, and tips on a local drive in a directory tree. If you usually keep your notes in simple text files but get frustrated walking through the directory tree every time you need to find something specific, this app will help. What's new in v.1.1.*: search in file; search in directory; read-only option for files; choice of font, style, and font size in the text editor; carousel navigation in the picture view window; AES 128-bit encryption for files; filter for files in the tree; check for file updates on the local drive; automatic and manual checking for a new version of the app; view file in the OS file explorer; fixed a bug with the default OS encoding. Full documentation: https://sourceforge.net/p/mstorage/wiki/Home/ Tested on Win7x64, Win10x32, Ubuntu 16.04x32.
    Downloads: 0 This Week
  • 25
    Stringy

    A PHP string manipulation library with multibyte support

    Stringy is a PHP library that provides a set of string manipulation functions inspired by the String class in other programming languages. It offers a fluent interface for common string operations, including case conversion, trimming, and formatting. Stringy is designed to simplify string handling by providing a consistent and expressive API, making it a valuable tool for text processing in PHP applications.
    Downloads: 0 This Week