Announcing: GEDCOM Plugin for Notepad++ (28 Jan 2014)

I have been looking at raw GEDCOM files quite a lot lately. I usually use the Notepad++ text editor for this because it is good at recognizing file encodings. Notepad++ is popular because it is open source, general-purpose, robust, and lightweight. Many plugins have been developed to extend it to more specialized uses, especially for programmers. That prompted me to cook up a GEDCOM plugin that provides syntax highlighting and folding.

Technically, this is a lexer plugin (GedcomLexer.dll), that is, a program that performs lexical analysis of GEDCOM files. The lexer follows the data representation grammar of the GEDCOM specification, version 5.5.1. It recognizes the possible tokens in a line: level, xref_id, tag, user tag, pointer, value, and escape. Each token has a default style supplied by the plugin, which can be customized through Notepad++'s Style Configurator. When the lexer detects an invalid character in a token, it enters the Invalid state and outputs the remainder of the line in the Invalid style (red by default).
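
To make the token breakdown concrete, here is a rough Python sketch of how one GEDCOM line could be split into those token types. It is not the plugin's code (the plugin is a compiled DLL). The name tokenize_line and the regular expressions are assumptions made for this illustration. It is also simpler than the real lexer: a malformed line is marked invalid as a whole, rather than from the offending character onward.

```python
import re

# Simplified line shape: level [xref_id] tag [value]
# (an approximation of the GEDCOM 5.5.1 grammar, assumed for this sketch)
LINE_RE = re.compile(
    r"^(?P<level>\d{1,2})"
    r"(?: (?P<xref_id>@[A-Za-z0-9][^@]*@))?"
    r" (?P<tag>[A-Za-z0-9_]+)"
    r"(?: (?P<value>.*))?$"
)
POINTER_RE = re.compile(r"^@[A-Za-z0-9][^@]*@$")  # e.g. @F1@
ESCAPE_RE = re.compile(r"@#[^@]*@")               # e.g. @#DJULIAN@


def tokenize_line(line):
    """Return a list of (token_type, text) pairs for one GEDCOM line."""
    text = line.rstrip("\r\n")
    m = LINE_RE.match(text)
    if m is None:
        # Simplification: the whole line is flagged, not just its remainder.
        return [("invalid", text)]

    tokens = [("level", m.group("level"))]
    if m.group("xref_id"):
        tokens.append(("xref_id", m.group("xref_id")))

    tag = m.group("tag")
    # Tags beginning with an underscore are user-defined tags.
    tokens.append(("user_tag" if tag.startswith("_") else "tag", tag))

    value = m.group("value")
    if value:
        if POINTER_RE.match(value):
            tokens.append(("pointer", value))
        else:
            # Split the value around any escape sequences it contains.
            pos = 0
            for esc in ESCAPE_RE.finditer(value):
                if esc.start() > pos:
                    tokens.append(("value", value[pos:esc.start()]))
                tokens.append(("escape", esc.group()))
                pos = esc.end()
            if pos < len(value):
                tokens.append(("value", value[pos:]))
    return tokens
```

For example, tokenize_line("1 FAMC @F1@") yields a level, a tag, and a pointer, while a line that does not start with a level number comes back as a single invalid token.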

In the current release (0.1.0), folding (hiding detail text) is based on the line level. In GEDCOM files, logical records begin at line level 0. Subordinate lines with levels 1 or higher belong to the logical record started by the most recent level 0 line. Folding therefore lets a user see only level 0 lines (logical record starts), or level 0 lines plus selected additional levels, giving some control over the amount of detail displayed.
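
The level-based folding idea can be sketched in a few lines of Python. This only illustrates the principle and is not how the plugin talks to Notepad++. The helper names line_level, fold_headers, and collapse_to are hypothetical, and lines without a leading level number are simply skipped.

```python
def line_level(line):
    """Return the leading level number of a GEDCOM line, or None if absent."""
    head = line.strip().split(" ", 1)[0]
    return int(head) if head.isascii() and head.isdigit() else None


def fold_headers(lines):
    """Indices of lines followed by a deeper line, i.e. lines that can be folded."""
    levels = [line_level(line) for line in lines]
    return [
        i for i in range(len(levels) - 1)
        if levels[i] is not None
        and levels[i + 1] is not None
        and levels[i + 1] > levels[i]
    ]


def collapse_to(lines, max_level):
    """Keep only the lines whose level is max_level or lower."""
    kept = []
    for line in lines:
        level = line_level(line)
        if level is not None and level <= max_level:
            kept.append(line)
    return kept


sample = [
    "0 HEAD",
    "1 SOUR Demo",
    "0 @I1@ INDI",
    "1 NAME John /Doe/",
    "1 BIRT",
    "2 DATE 1 JAN 1900",
    "0 TRLR",
]
```

With this sample, collapse_to(sample, 0) keeps just the three record starts, and collapse_to(sample, 1) adds the level 1 lines while still hiding the DATE line.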

The plugin has been tested with a variety of GEDCOM files (*.ged), including files encoded as UTF-8, UTF-16, ANSI, and ASCII. Release 0.1.0 does not support the ANSEL character set.


Posted by smitchell 2019-08-30
