Showing 82 open source projects for "python text parser"

  • 1
    crossplane

    Quick and reliable way to convert NGINX configurations into JSON

    Reliable and fast NGINX configuration file parser and builder. Since crossplane is usually used to create payloads that are sent to different servers, it's important to keep security in mind. For that reason, the --ignore option was added. It can be used to keep certain sensitive directives out of the payload output entirely.
    Downloads: 0 This Week
  • 2
    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read...
    Downloads: 0 This Week
  • 3
    ADFILT

    Web filter lists for countless different topics

    This is the place where I, Imre Kristoffer Eilertsen, host my web filter lists for countless different topics, for use in adblock tools and the like. GitHub was in mid-2017 by far the easiest way for laymen like me to store pure text files, which is a necessity to create subscribable lists. This is a hobby project of mine, in which I work just as much on these lists and this repo as I feel like. But don't be fooled by the appearance, as these are nevertheless some lists that I've placed lots...
    Downloads: 0 This Week
  • 4
    PersonGen

    A minor project in Python that uses the RandomUser API.

    A small Python program that uses the RandomUser API to generate random person data.
    Downloads: 0 This Week
  • 5
    pylatexenc

    Simple LaTeX parser providing latex-to-unicode and unicode-to-latex

    Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion. Python 3.4 or 2.7. The library is designed to be as backward-compatible as reasonably possible and is able to run on old Python versions should it be necessary. (Use the setup.py script directly if you have Python 3.7, poetry doesn't seem to work with old Python versions.) The pylatexenc.latexencode module provides a function unicode_to_latex() which converts a Unicode string into LaTeX text and escape sequences...
    Downloads: 0 This Week
  • 6
    Rdbtools

    Parse Redis dump.rdb files, Analyze Memory, and Export Data to JSON

    Rdbtools is a parser for Redis' dump.rdb files. The parser generates events similar to an XML SAX parser and is very efficient memory-wise. Rdbtools is written in Python, though there are similar projects in other languages. Every run of RDB Tool requires specifying a command to indicate what should be done with the parsed RDB data. Valid commands are JSON, diff, justkeys, justkeyvals and protocol. The JSON command output is UTF-8 encoded JSON. By default, the callback tries to parse RDB data...
    Downloads: 1 This Week
  • 7
    jsonfield

    A reusable Django model field for storing ad-hoc JSON data

    ... to be database-agnostic, or when the built-in JSONField's extended querying is not being leveraged. e.g., a configuration field. JSONField is not intended to provide extended querying capabilities. That said, you may perform the same basic lookups provided by regular text fields (e.g., exact or regex lookups). Since values are stored as serialized JSON, it is highly recommended that you test your queries to ensure the expected results are returned.
    Downloads: 0 This Week
  • 8
    Budou

    Budou is an auto organizer tool for beautiful line breaking in CJK

    Budou is a Python library developed by Google to improve web typography for CJK (Chinese, Japanese, Korean) languages by producing semantically meaningful line breaks. Unlike English, CJK scripts lack spaces or hyphenation cues, often resulting in awkward or unreadable text wrapping on web pages. Budou addresses this issue by segmenting sentences into logical lexical chunks and wrapping each chunk in non-breaking HTML <span> tags. These spans can be styled with CSS to ensure smooth, visually...
    Downloads: 3 This Week
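
As a rough illustration of the wrapping step described for Budou above (this is not Budou's own API), the sketch below takes chunks that are assumed to be already segmented and wraps each one in a <span>. The function name `wrap_chunks` and the class name `chunk` are hypothetical; real segmentation needs a language-aware backend, which is not shown here.

```python
from html import escape


def wrap_chunks(chunks, class_name="chunk"):
    """Wrap each pre-segmented chunk in a <span> element.

    Assumption: `chunks` is already split into meaningful units;
    this sketch does no segmentation of its own.
    """
    cls = escape(class_name, quote=True)
    # Escape chunk text so characters such as "<" cannot break the markup.
    return "".join(f'<span class="{cls}">{escape(chunk)}</span>' for chunk in chunks)


if __name__ == "__main__":
    # Hypothetical, hand-segmented Japanese sentence.
    print(wrap_chunks(["今日は", "良い天気です。"]))
    # <span class="chunk">今日は</span><span class="chunk">良い天気です。</span>
```

Styling the hypothetical `chunk` class with `display: inline-block` in CSS keeps each span from being split across lines.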
  • 9
    HTMLMinifier

    JavaScript-based HTML compressor/minifier (with Node.js support)

    HTMLMinifier is a highly configurable, well-tested, JavaScript-based HTML minifier. Minifier options like sortAttributes and sortClassName won't impact the plain-text size of the output. However, they form long repetitive chains of characters that should improve compression ratio of gzip used in HTTP compression. SVG tags are automatically recognized, and when they are minified, both case-sensitivity and closing slashes are preserved, regardless of the minification settings used for the rest...
    Downloads: 0 This Week
  • 10
    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staffs, polyphony, MIDI playback of notes, chord markings, lyrics, and import/export filters for formats like MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond.
    Downloads: 22 This Week
  • 11

    dpanalyzer

    postprocessing tool for Project Gutenberg Distributed Proofreaders

    Specialized tool for PostProcessors of books produced by Project Gutenberg Distributed Proofreaders. Parses the markup structure of a project file out of the formatting rounds; reports about the text structure found, and identifies markup errors. Planned future features: generation of normalized dp output by rejoining split paragraphs and moving around footnotes, renumbering of pages; conversion to basic LaTeX and basic HTML markup for further processing.
    Downloads: 0 This Week
  • 12
    htmlarea

    Small, powerful, full featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. The current version is 4.0-2016-08-29.
    Downloads: 1 This Week
  • 13

    JSONjuicer

    JSON parser and encoder

    A Java open-source library which makes encoding and decoding Java data-structures to and from JSON text easy and intuitive.
    Downloads: 0 This Week
  • 14
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 4 This Week
  • 15
    Gumbo

    An HTML5 parsing library in pure C99

    ... into a parse tree, and free that parse tree all at once. To install the python bindings, make sure that the C library is installed first, and then sudo python setup.py install from the root of the distro. This installs a 'gumbo' module; pydoc gumbo should tell you about it. Tested on over 2.5 billion pages from Google's index. Passes all html5lib tests, including the template tag.
    Downloads: 0 This Week
  • 16

    Analyzer for Files

    A tool to look into file contents

    Analyzer for Files (AoF) is a tool to look into file contents, analyze the structure with installed plug-ins, and show the results in several split windows, including converted data and a tree if successful. It was designed as a workbench with a core and plug-in extensions. It can handle ordinary plain-text files and data, as well as complex binaries supported by the corresponding plug-ins. What's more, developers can deploy and release their own plug-ins according to the plug-in developing...
    Downloads: 0 This Week
  • 17
    Human Speakable Programming Language

    foundation of the General Intelligence Operating System

    HSPL is Human Speakable Programming Language, allowing for human-to-computer and human-to-human communication in the same language. This project has moved to http://sourceforge.net/p/spel. We are currently working on a human-to-computer programming language with a mostly English base vocabulary. Once we have that, we plan to add support for other world languages, including Chinese, Spanish, Russian, Arabic, and Hindi, among others. Eventually HSPL shall be the...
    Downloads: 0 This Week
  • 18
    Simple Java delimited and fixed width file parser. Handles CSV, Excel CSV, Tab, Pipe delimiters, just to name a few. Maps column positions in the file to user friendly names via XML. See "FlatPack Feature List" under News for complete feature list.
    Downloads: 7 This Week
  • 19
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 2 This Week
  • 20
    A Python script that uses wxWidgets. View or edit delimited data.
    Downloads: 0 This Week
  • 21
    Stand-alone Java library implementing a parser, formatter, comparator, and validator for JSON/XML-like text formats, oriented on a JSON-like object model (list, map, scalar + reflection). The library is designed to maximize adaptivity via a set of extendable modules.
    Downloads: 1 This Week
  • 22
    Wiko, the wiki compiler, compiles wiki-like files into HTML and LaTeX, combining easy wiki syntax, your preferred non-web text editor, and svn/cvs version control to write static websites, scientific articles, or even blogs.
    Downloads: 0 This Week
  • 23

    LIstFOrmatCONverter

    A tool to convert an exported text file from one format to another.

    Sometimes an upgraded application changes data formats which can break compatibility with previous versions. If import/export text files are of the following format: "descript1","descript2","descript3" "data1","data2","data3" "data1","data2","data3" then this program can rearrange the data of large exported files in order to be imported into another version or application with little effort.
    Downloads: 1 This Week
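
The kind of rearrangement described for LIstFOrmatCONverter above can be sketched with Python's standard `csv` module. This is not the tool's own code: the function name `reorder_columns` is hypothetical, and the sketch assumes one fully quoted record per line with the header names in the first row.

```python
import csv
import io


def reorder_columns(text, new_order):
    """Return the quoted, comma-separated `text` with columns in `new_order`.

    Assumption: the first row holds the column names, and every name in
    `new_order` appears in that header.
    """
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    # QUOTE_ALL reproduces the "field","field" style of the input.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    header = next(reader)
    index = [header.index(name) for name in new_order]
    writer.writerow(new_order)
    for row in reader:
        writer.writerow([row[i] for i in index])
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        '"descript1","descript2","descript3"\n'
        '"data1","data2","data3"\n'
    )
    print(reorder_columns(sample, ["descript3", "descript1", "descript2"]), end="")
    # "descript3","descript1","descript2"
    # "data3","data1","data2"
```

Looking columns up by header name rather than by fixed position means the same call still works if the exporting application changes its column order.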
  • 24
    PROJECT HAS MOVED: https://github.com/wiki2beamer
    Downloads: 0 This Week
  • 25

    ExiProcessor

    Command-line program for processing Efficient XML Interchange (EXI)

    ExiProcessor is a command-line program that encodes text XML files into binary EXI and decodes EXI files into XML. It uses the open source Java-based library EXIficient (http://exificient.sourceforge.net) as the EXI parser. In essence, ExiProcessor is a command-line interface to EXIficient. ExiProcessor can help people learn about the various EXI encoding and decoding options and how those options affect compression ratios. The source code itself can also be used as an example of how...
    Downloads: 1 This Week