Showing 94 open source projects for "python text parser"

  • 1
    EpiDoc: Epigraphic Documents in TEI XML

    XML text markup for ancient documents

    The EpiDoc Collaborative is developing specifications and tools for standards-based, digital publication and interchange of scholarly and educational editions of documentary and literary texts like inscriptions and papyri. The link below will take you to the EpiDoc home page on this site.
    Downloads: 3 This Week
  • 2
    stml

    Indentation Procedure in HTML

    Functionalities of STML: indentation procedure in HTML; better implementation for Python coders and others; reduced usage of closing tags.
    Downloads: 3 This Week
  • 3
    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read...
    Downloads: 0 This Week
  • 4
    ADFILT

    Web filter lists for countless different topics

    This is the place where I, Imre Kristoffer Eilertsen, host my web filter lists for countless different topics, for use in adblock tools and the like. GitHub was in mid-2017 by far the easiest way for laymen like me to store pure text files, which is a necessity to create subscribable lists. This is a hobby project of mine, in which I work just as much on these lists and this repo as I feel like. But don't be fooled by the appearance, as these are nevertheless some lists that I've placed lots...
    Downloads: 0 This Week
  • 5
    PersonGen

    A minor project in Python that uses the RandomUser API.

    A small Python program that uses the RandomUser API to generate random person data.
    Downloads: 0 This Week
  • 6
    pylatexenc

    Simple LaTeX parser providing latex-to-unicode and unicode-to-latex

    Simple LaTeX parser providing latex-to-unicode and unicode-to-latex conversion. Python 3.4 or 2.7. The library is designed to be as backward-compatible as reasonably possible and is able to run on old Python versions should it be necessary. (Use the setup.py script directly if you have Python 3.7, poetry doesn't seem to work with old Python versions.) The pylatexenc.latexencode module provides a function unicode_to_latex() which converts a Unicode string into LaTeX text and escape sequences...
    Downloads: 1 This Week
  • 7

    HTML parser in Delphi

    A Delphi class with functions to read and dissect an HTML file

    THTMLdom is a (Delphi) class with functions to read an HTML source file and dissect it into a tree of THTMLelement. The attributes of the HTML tags are stored in the elements. Functions are provided to select elements on the basis of the attribute values or tag names. The structure of the tree can be shown and it can be rendered as plain text. The source is plain Delphi Pascal, requiring a version that supports Tdictionary. There is no dependency on 3rd party units. The file to be parsed must...
    Downloads: 17 This Week
  • 8
    Rdbtools

    Parse Redis dump.rdb files, Analyze Memory, and Export Data to JSON

    Rdbtools is a parser for Redis' dump.rdb files. The parser generates events similar to an XML SAX parser and is very memory-efficient. Rdbtools is written in Python, though there are similar projects in other languages. Every run of RDB Tool requires a command that indicates what should be done with the parsed RDB data. Valid commands are JSON, diff, justkeys, justkeyvals and protocol. The JSON command output is UTF-8 encoded JSON. By default, the callback tries to parse RDB data...
    Downloads: 1 This Week
  • 9
    jsonfield

    A reusable Django model field for storing ad-hoc JSON data

    ... to be database-agnostic, or when the built-in JSONField's extended querying is not being leveraged. e.g., a configuration field. JSONField is not intended to provide extended querying capabilities. That said, you may perform the same basic lookups provided by regular text fields (e.g., exact or regex lookups). Since values are stored as serialized JSON, it is highly recommended that you test your queries to ensure the expected results are returned.
    Downloads: 0 This Week
  • 10
    Budou

    Budou is an auto organizer tool for beautiful line breaking in CJK

    Budou is a Python library developed by Google to improve web typography for CJK (Chinese, Japanese, Korean) languages by producing semantically meaningful line breaks. Unlike English, CJK scripts lack spaces or hyphenation cues, often resulting in awkward or unreadable text wrapping on web pages. Budou addresses this issue by segmenting sentences into logical lexical chunks and wrapping each chunk in non-breaking HTML <span> tags. These spans can be styled with CSS to ensure smooth, visually...
    Downloads: 0 This Week
  • 11
    HTMLMinifier

    JavaScript-based HTML compressor/minifier (with Node.js support)

    HTMLMinifier is a highly configurable, well-tested, JavaScript-based HTML minifier. Minifier options like sortAttributes and sortClassName won't impact the plain-text size of the output. However, they form long repetitive chains of characters that should improve compression ratio of gzip used in HTTP compression. SVG tags are automatically recognized, and when they are minified, both case-sensitivity and closing slashes are preserved, regardless of the minification settings used for the rest...
    Downloads: 0 This Week
  • 12
    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staffs, polyphony, MIDI playback of notes, chord markings, lyrics, and import/export filters for formats like MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond.
    Downloads: 17 This Week
  • 13
    AST explorer

    A web tool to explore the ASTs generated by various parsers

    Paste or drop code into the editor and inspect the generated AST. Depending on the parser settings, it supports more than just ES5/CSS3. Since future syntax is supported, the AST explorer is a useful tool for developers who want to create AST transforms. In fact, transformers are included so you can prototype your own plugins. Save and fork code snippets. Copy the URL to share them. Copying an AST or dropping a file containing an AST into the window will parse the AST and update the code using...
    Downloads: 0 This Week
  • 14

    TFieldedText

    Fielded Text (CSV) file parser/generator

    TFieldedText is a component that lets you easily generate and parse Fielded Text files (e.g., CSV files) and create and edit Fielded Text Meta files. For more information about the Fielded Text standard see http://www.fieldedtext.org
    Downloads: 0 This Week
  • 15

    dpanalyzer

    postprocessing tool for Project Gutenberg Distributed Proofreaders

    Specialized tool for PostProcessors of books produced by Project Gutenberg Distributed Proofreaders. Parses the markup structure of a project file out of the formatting rounds; reports about the text structure found, and identifies markup errors. Planned future features: generation of normalized dp output by rejoining split paragraphs and moving around footnotes, renumbering of pages; conversion to basic LaTeX and basic HTML markup for further processing.
    Downloads: 0 This Week
  • 16
    htmlarea

    Small, powerful, full-featured WYSIWYG editor

    HTMLArea 4 is a browser-based WYSIWYG editor that easily replaces the TEXTAREA in your web pages. It is written in JavaScript and is suitable for use in any modern web browser and on any page of your web site. The current version is 4.0-2016-08-29.
    Downloads: 1 This Week
  • 17

    JSONjuicer

    JSON parser and encoder

    A Java open-source library which makes encoding and decoding Java data-structures to and from JSON text easy and intuitive.
    Downloads: 0 This Week
  • 18
    QAL

    Query Abstraction Layer

    ...) for representing queries, transformation and merging, making it scriptable. This means that QAL can be backend agnostic about a subset of SQL features and data types. Custom SQL statements are also supported. It is currently distributed as a Python 3 library (pip3 install python3-qal) and a Debian .deb package. It is related to the Optimal BPM project; see its Optimal Sync application for usage examples. The text of this page is released under the Creative Commons Zero Waiver 1.0 (CC0).
    Downloads: 0 This Week
  • 19
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 5 This Week
  • 20

    Analyzer for Files

    A tool to look into file contents

    Analyzer for Files (AoF) is a tool to look into file contents, analyze the structure with installed plug-ins, and show the results with several split windows including converted data and a tree if successful. It was designed as a workbench with a core and plug-in extensions. It can handle the normal plain-text file and data, complex binaries supported with the corresponding plug-ins. What's more, the developers can deploy and release their own plug-ins according to the plug-in developing...
    Downloads: 0 This Week
  • 21

    HTML XHTML Parser + XPath

    Delphi HTML XHTML Parser +XPath

    Delphi HTML Parser. This module lets you work with HTML documents as a DOM tree and use XPath to search for tags. It is a very simple way to parse HTML. It was tested with Delphi XE5 and XE6. Usage: add parser.pas to Uses; begin HtmlTxt:= ''; //here your html NodeList:= TNodeList.Create; ValueList:= TStringList.Create; DomTree:= TDomTree.Create; DomTreeNode:= DomTree.RootNode; If DomTreeNode.RunParse(HtmlTxt) then begin {your code example
    Downloads: 3 This Week
  • 22
    Human Speakable Programming Language

    foundation of the General Intelligence Operating System

    HSPL is the Human Speakable Programming Language, allowing human-to-computer and human-to-human communication in the same language. This project has moved to http://sourceforge.net/p/spel We are currently working on a human-to-computer programming language with a mostly English base vocabulary. Once we have that, we plan to add support for other world languages, including Chinese, Spanish, Russian, Arabic, and Hindi, among others. Eventually HSPL shall be the...
    Downloads: 0 This Week
  • 23
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
    Downloads: 1 This Week
  • 24
    Simple Java delimited and fixed width file parser. Handles CSV, Excel CSV, Tab, Pipe delimiters, just to name a few. Maps column positions in the file to user friendly names via XML. See "FlatPack Feature List" under News for complete feature list.
    Downloads: 0 This Week
  • 25
    A Python script that uses wxWidgets to view or edit delimited data.
    Downloads: 0 This Week