Viewing British Library Basic RDF-XML open datasets in a dedicated GUI application

Curt Selak

This Wiki page shows you how to use the software you have downloaded for viewing datasets available from the British Library. (Please note that neither this SourceForge project nor its creator is affiliated with or endorsed by the Library in any way: the Library is only the source of the data.) Whether you downloaded the viewer for the British National Bibliography (BNB), the British Library Integrated Catalogue (BLIC), or selected Cataloguing in Publication (CIP) records (for trying the viewer without an enormous footprint on one's storage device), the applications work identically. The viewer in the examples will usually be the one for BNB.

If you haven't downloaded an application yet, here are the links:

  • BNB (approximately 632 MB)
  • BLIC (approximately 1.1 GB)
  • CIP (approximately 32 MB)

As noted elsewhere, the downloads are large but not nearly as large as those one obtains directly from the Library (they contain the Library's bibliographic records, but the records are only decompressed at runtime).

I'll explain further below, but the applications do require Java to run. If your operating system is Windows and you do not wish to install Java, you can still download Java and run it on Windows 8 or greater for the limited purpose of using the viewers, without having to uninstall anything later or having to permit a Windows installer to make changes to your computer's Registry. Here is the link:

The next section has instructions for using the .zip file.

Starting the viewer

I recommend starting the viewer from a terminal. The first thing to do is change directories (the "cd" command) to the location of the file that you downloaded.

There are two distinct ways of starting the application. The first runs it without the ability to save or print records:

java -jar BNB.jar

I figured that was a good idea in case the viewer needed to run on a computer configured to forbid Java applications from accessing the file system.

The other way opens the viewer with greater functionality. It will save records to a file, print records, and also produce PDF output. The command is the same, except that the letter p is typed after it, and the name of a file to which to save records follows the letter p:

java -jar BNB.jar p myrecords

The name of the file can be whatever you like. What is important is that it's not already the name of an existing file: if it is, the viewer will append data to that file whenever you tell it to save records.

For either of the above commands to start the viewer, Java needs to be installed on your computer. The URL for obtaining Java is https://java.com.

If you don't wish to install Java but would like to use the viewer on a computer running Windows, you can try to run Java without installing it. First, click this link to download a Java runtime that I made to that end. Unzip the archive (runtime.zip) to the same directory as you saved the viewer in. The directory ultimately should look practically the same as this (except for British Library Records Viewers, which stands in here for the name you'll have given the directory on your computer):

The contents of the .zip archive need to be in a sub-directory called runtime: Windows is likely to offer to create the runtime directory when it unzips the archive. Then open a terminal (i.e. PowerShell: I don't think the command below will run in a DOS-type terminal), navigate to the directory where both the downloaded Java and the viewer are stored, and type the following:

runtime/bin/java -jar BNB.jar

An alternative is to double-click BNB.bat, one of three files extracted when you unzipped the archive (BLIC.bat or CIP.bat respectively run BLIC.jar [which is huge: I'm not taking it for granted that you've downloaded it yet] and CIP.jar).

Either ought to work on Windows 8 and up. Please ask for help (curtthomasselak@gmail.com) if it does not.

To start the application with a view to saving and printing records rather than just displaying them, one types:

runtime/bin/java -jar BNB.jar p myfile

or, one can double-click one of three further files that were extracted when the archive was unzipped: BNBPrint.bat, BLICPrint.bat, or CIPPrint.bat. The name of the file created will be blbibliographicrecords.bnb, blbibliographicrecords.blic, or blbibliographicrecords.cip.

Basic concepts

After the viewer opens, you may wish to click the maximize icon in the upper right-hand corner one or more times so that the viewer's size on your screen is optimal, or do the same by means of the maximize/unmaximize commands that usually appear when you use the ALT + SPACE key combination.

The menus that you see at the very top of the window depend on whether or not you started the viewer with the "p" option. If the application was started without the "p" option, only the Information menu appears.

Immediately beneath the menus, a bibliographic record is displayed. It is blank until a record is selected.

The text area beneath the bibliographic record is for finding records. While the text area's presence might suggest that something happens once the ENTER key is pressed, nothing does. It's a key combination, CTRL + O (the letter O), that accepts the input instead.

The records themselves are in the drop-down box immediately below the text area. The portion of each bibliographic record that is visible is the title. To display a record, the UpArrow or DownArrow key is sufficient.

Below the drop-down box containing the records is another drop-down box which loads a set of records when the ENTER key is pressed. There can be a pause whilst the viewer loads another set of records.

Finally, at the very bottom of the window there is a series of panes, four of which display information about using the viewer, and two of which contain what are in effect toolbars.

Navigation


To go from one part of the screen to another in either direction, one uses the TAB key, or the SHIFT and TAB key combination.

At the very bottom of the window, one changes tabs either by using the LeftArrow and RightArrow keys, or by using CTRL + PAGEUP or CTRL + PAGEDOWN.

To navigate from a tab to its corresponding pane (i.e., gain access to scroll bars or buttons), one uses CTRL + DownArrow; to go back (from a pane to its corresponding tab), one uses CTRL + UpArrow.

More navigation


If you tab to the drop-down box that contains the records, you can type the first few letters that you are looking for to proceed to a record the title of which starts with the letters. If many records are loaded at once, as can happen if the set of records is from the 21st century, or if the records overall are larger, it might require a little patience, but will ultimately work. Something of note is that although the titles are displayed, what is inside the drop-down box genuinely is the records themselves. You can conceivably modify each viewer's source code (included in the respective .jar archives) in order to change what part of the record the drop-down box displays, as well as the behavior to expect when the alphanumeric keys are used to locate specific items.
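The kind of lookup described above can be sketched in a few lines of Java. This is only an illustration of the idea, not the viewer's actual code: the class name, the method name, and the sample titles are all invented here, and the sketch assumes a simple case-insensitive scan from the top of the list, which would also explain why a very large set of records calls for a little patience.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class PrefixJumpDemo {
    // Returns the index of the first title that starts with the typed
    // letters (ignoring case), or -1 when no title does.
    static int firstTitleWithPrefix(List<String> titles, String typed) {
        String prefix = typed.toLowerCase(Locale.ROOT);
        for (int i = 0; i < titles.size(); i++) {
            if (titles.get(i).toLowerCase(Locale.ROOT).startsWith(prefix)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical titles, invented for illustration only.
        List<String> titles = Arrays.asList("Apes of God", "Tarr", "Time and Western Man");
        // "ti" skips "Tarr" and lands on the third title (index 2).
        System.out.println(firstTitleWithPrefix(titles, "ti"));
    }
}
```

Changing what the drop-down box matches on (for instance the creator rather than the title) would amount to changing which part of each record a method like this compares against.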

Keyboard focus

The mouse is not particularly important when using the viewer. Keyboard shortcuts will prove far more convenient. Here, I'll list all of the key combinations, partly in order to assure you that there's one place where you can find every one, and also in order to underscore that the shortcuts' availability can depend on what part of the window has your keyboard's focus and can currently receive its input.

Window

The TAB key will move to another component regardless of where you are in the window. A summary will look like:

Move between components              Tab
Move backwards between components    Shift + Tab

There is more. You can use any of the following regardless of which part of the window you last tabbed to or clicked on with the mouse:

Scroll record down             CTRL + , (COMMA)
Scroll record up               CTRL + . (PERIOD/FULL STOP)
Go to next matching record     CTRL + O (letter O, not zero)
Increase record text's size    CTRL + =
Decrease record text's size    CTRL + - (HYPHEN/MINUS)
Display Legend panel           CTRL + L
Display Highlights panel       CTRL + ; (SEMICOLON)

Bottom of window

The following, which I have discussed under 'Navigation', work when the focus within the viewer is on a tab or on a panel that is connected to one:

Cycle through bottom panel's tabs              CTRL + PageUp or CTRL + PageDown
Move keyboard focus from tab to panel          CTRL + DownArrow
Move keyboard focus from panel to tab          CTRL + UpArrow
Scroll panel when it has the keyboard focus    PageUp or PageDown

As indicated under "Window", the keyboard shortcuts that work globally include one that can display the Highlights panel (CTRL + ;) and one that can display the Legend panel (CTRL + L).

Records and sets of records

The commands below work when your keyboard is focused on the drop-down box containing the currently available records:

Display record count           CTRL + I
Save current record to file    CTRL + \
Save matching records          CTRL + [
Save non-matching records      CTRL + /
Save all loaded records        CTRL + ]

If the program was started without the "p" option, the commands for saving records will cause an alert box to be displayed.

The built-in key presses for navigating the combo box could already be familiar, and can also be mentioned here:

Display list of records         SPACE bar or DownArrow
Page down in list of records    PageDown
Page up in list of records      PageUp
Display preceding record        UpArrow
Display following record        DownArrow

The built-in keys also work when the keyboard focus is on the drop-down box for choosing which set of records to load. All that I think is different is that the ENTER key does more than close the drop-down box:

Load records    ENTER

Highlighting fields

The keyboard shortcuts for highlighting specific parts of a record are in a class somewhere in between the key combinations that work globally and those that are available depending on the keyboard focus. They can be used as long as the Highlights panel is visible (as noted earlier, it's made visible by pressing CTRL + ;, regardless of which part of the window possesses the keyboard focus), and are as follows:

Title (dcterms:title)                              ALT + 1 (the number 1)
Total extent of resource (dcterms:extent)          ALT + 2
Contributor (dcterms:contributor)                  ALT + 3
Creator (dcterms:creator)                          ALT + 4
Date issued (dcterms:issued)                       ALT + 5
Alternative title (dcterms:alternative)            ALT + 6
Note I (dcterms:description)                       ALT + 7
Table of contents (dcterms:tableOfContents)        ALT + 8
Copyright date (dcterms:dateCopyrighted)           ALT + 9
Edition statement (isbd:P1008)                     ALT + 0
Note II (dates; isbd:P1038)                        ALT + -
Note III (language[s]; isbd:P1073)                 ALT + =
Series statement (rda:seriesStatement)             ALT + q
Terms of availability (rda:termsOfAvailability)    ALT + w
Number of volumes (bibo:numVolumes)                ALT + e
Abstract (dcterms:abstract)                        ALT + r
ISBN-10 (bibo:isbn10)                              ALT + t
ISBN-13 (bibo:isbn13)                              ALT + y
Control number (etc.; dcterms:identifier)          ALT + u
Place of publication (isbd:P1016)                  ALT + k
Publisher (dcterms:publisher)                      ALT + o
Type of resource (dcterms:type)                    ALT + p
Audience (dcterms:audience)                        ALT + [
Language (dcterms:language)                        ALT + ]
Dewey code (dcterms:subject)                       ALT + \
Subject heading (dcterms:subject)                  ALT + /

The records

Within each record, the metadata is delimited by a single character followed by two space characters. Pressing CTRL + L displays the pane labelled "Legend" which lists the delimiters.

The manner in which the records are displayed was devised as a compromise between the format one associates with human-readable MARC records and formats with minimal markup provided for differentiating between fields.

Locating and filtering records

The viewer can display one set of records at a time. In the BNB viewer, there are sets of records for each year (based on the BNB number), and one set each for the 1950s and 1960s. The BLIC viewer's sets of records are simply based on an upper threshold, and the CIP viewer's sets of records are simply weekly releases of BNB records.

We can see how it works by loading the set of records from the 1950s, and typing in the text field with a view to locating particular records:

If I press CTRL + O, the viewer will display the next record that contains the string "Lewis, Wyndham". It is of note that what I really typed in the box was ".*Lewis, Wyndham.*". If I delete everything that is in the text field but ".*", each time that I press CTRL + O, the next record that is loaded will be displayed. That's because the expression ".*" matches each and every record.

Regular expressions

In the viewers, what are known as regular expressions need to be used to locate records. You construct a regular expression by combining what are known as literals with what are known as metacharacters. ".*Lewis, Wyndham.*", in the above example, is a regular expression inasmuch as the part that reads "Lewis, Wyndham" is a literal, that is to say, it will always be present when there is a match, while the "." and "*" are metacharacters that mean, respectively, any possible character (spaces included) and zero or more of the same in a row. The expression as a whole stands for any number (including zero) of any character excepting newlines (which our records never contain, so newlines are moot), followed by "Lewis, Wyndham", followed by any number (including zero) of any character excepting newlines. The syntax that we're using here sometimes occurs in a simpler form in word processing or other productivity software.

To find what you are looking for, all that you typically need to do is put ".*" before and after the string that you wish to locate. It won't matter if the characters you expressly wish to find are at the very beginning of the record, at the very end of it, or somewhere in the middle. What will not work is entering only literals. The rudimentary regular expression we have been using matches the literals only because it also matches anything and everything that either precedes or follows them.
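Why a bare literal finds nothing can be seen in a short Java sketch. It assumes, as the paragraph above implies, that the expression has to match the record from its first character to its last; whether the viewer calls exactly these methods is not something this page states. The class name, method name, and sample record are invented for illustration.

```java
import java.util.regex.Pattern;

class FullMatchDemo {
    // Returns true only when the expression matches the record from its
    // first character to its last, not merely some part of it.
    static boolean matchesWholeRecord(String regex, String record) {
        return Pattern.compile(regex).matcher(record).matches();
    }

    public static void main(String[] args) {
        // Hypothetical record text, invented for illustration only.
        String record = "An invented title  Lewis, Wyndham  London";

        System.out.println(matchesWholeRecord(".*Lewis, Wyndham.*", record)); // true
        System.out.println(matchesWholeRecord("Lewis, Wyndham", record));     // false
        System.out.println(matchesWholeRecord(".*", record));                 // true
    }
}
```

The second call fails because the literal accounts for only the middle of the record; the leading and trailing ".*" are what absorb everything else.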

There are countless places on the web to learn about regular expressions. The viewers contain a panel inside of which there's an introduction; it is a menu item as well. If you look elsewhere for information, the type of regular expression that is relevant is the Java regular expression, introduced around 2002. Once one is using Java regular expressions, as one is in the viewers, it is rarely necessary to consider the differences from those used in other programming languages or platforms such as Perl.

Again, you are covered just as long as what you enter in the viewer's text input area begins with ".*" and also ends with ".*". Where regular expressions are used below, it needn't be presumed that you need to know them in order to really use the viewer. It's that regular expressions that are a little more precise than the rudimentary form I have just presented can help demonstrate the difference that your input makes when you are using the viewers to filter large numbers of records.

More about literals

I mentioned that in regular expressions, some of what one finds often amounts to literals, that is to say: characters that appear in the regular expression exactly as they will appear in the records that are matched. Nowhere in the context of the viewers is this more so than where the characters used to keep the fields apart are concerned.

If one types CTRL + L, the list of delimiters within the records is displayed. They aren't on your keyboard, and the most convenient means of using them in your searches is to type what is known as the delimiter's code point preceded by an escape sequence, which is \u. To include the bullet that precedes the title in a record, for instance, one types \u2022; it is a good idea to follow it by two space characters (which also fall under literals), because the field delimiter is never used in a record absent two trailing space characters.
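A small Java sketch shows how the typed code point behaves. Java's regular expression syntax accepts \u followed by four hexadecimal digits as a stand-in for the character itself. The record text below is invented, and the class and method names are assumptions made for this example only.

```java
import java.util.regex.Pattern;

class DelimiterEscapeDemo {
    // What a user would type into the text input area: the code point of
    // the bullet (2022) after the escape, followed by two space characters.
    static final String TYPED = ".*\\u2022  .*";

    static boolean matches(String typedRegex, String record) {
        return Pattern.matches(typedRegex, record);
    }

    public static void main(String[] args) {
        // Hypothetical records, invented for illustration only.
        String withDelimiter = "\u2022  An invented title";
        String withoutDelimiter = "An invented title";

        System.out.println(matches(TYPED, withDelimiter));    // true
        System.out.println(matches(TYPED, withoutDelimiter)); // false
    }
}
```

In Java source the backslash is doubled inside the string literal; in the viewer's text input area it is typed once.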

An exercise that can help accustom one to the way that the viewer works entails nothing more than typing .*\u21C7.* into the text input area and pressing CTRL + O one or more times. The field delimiter for abstracts incorporated in a bibliographic record is \u21C7, or ⇇, so each successive press of CTRL + O takes you to the next record to contain an abstract. It is of note that abstracts are found mostly in records from the 21st century, and to a lesser extent in those from the 1990s.

In the above screenshot, the records dated 2014 are loaded and I have tabbed to the drop-down box containing the records themselves. When the mouse hovered over the drop-down box, it was indicated that the box contained 143,195 records. If I press CTRL + I (remember, the drop-down box currently possesses the keyboard focus, as it must in order for CTRL + I to be available), a message/alert box pops up indicating that 4,510 records matched .*\u21C7.*, which is to say that 4,510 of the 143,195 records had a field containing an abstract.
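The count that CTRL + I reports can be thought of as a filter over the loaded records. The Java sketch below illustrates that idea under stated assumptions: the three records are invented, the class and method names are hypothetical, and nothing here is taken from the viewer's source code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class MatchCountDemo {
    // Counts the records that the typed expression matches in full.
    static long countMatching(String typedRegex, List<String> records) {
        Pattern pattern = Pattern.compile(typedRegex);
        return records.stream()
                      .filter(r -> pattern.matcher(r).matches())
                      .count();
    }

    public static void main(String[] args) {
        // Hypothetical loaded records; two of the three carry the
        // abstract delimiter (code point 21C7).
        List<String> loaded = Arrays.asList(
            "\u2022  First invented title  \u21C7  An invented abstract",
            "\u2022  Second invented title",
            "\u2022  Third invented title  \u21C7  Another invented abstract");

        long withAbstract = countMatching(".*\\u21C7.*", loaded);
        // Prints: 2 of 3 records matched
        System.out.println(withAbstract + " of " + loaded.size() + " records matched");
    }
}
```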

To display the abstract in crimson, one can press CTRL + ; in order to make certain that the Highlights panel is visible, and then press ALT + R. In order to scroll the record and continue reading the abstract, one presses CTRL + , (comma) to go down, and CTRL + . (period/full stop) to go back up. To increase the size of the text, one presses CTRL + =, and to decrease it, CTRL + -.

Filtering records

The item selected in the drop-down box listing the available sets of records corresponds to what the drop-down box with each record's title contains. A second way to determine what records are loaded is to type a regular expression in the text input area, as we did earlier, and look for matching records in each and every set. The drop-down box will then be populated with the matching records from all of BNB, BLIC, or CIP, depending on our viewer.

Below, I've typed a rudimentary regular expression like the one we used earlier into the text input area. Then, I've tabbed to the bottom of the window, selected the tab labelled Perfected Regular Expression (in this case by pressing ALT + F, once I was at the bottom of the window), and pressed CTRL + DownArrow to go inside the panel where the Start button is found. Then I've pressed Start (by means of the SPACE bar, not the ENTER key as one might have thought) and waited about a minute.

All of the results are in the drop-down box. If I'd like, I can type a different regular expression in the text input area with a view to narrowing the search criteria. (We'll do as much shortly). If I wish to discard the results, I can press the Perfected Regular Expression tab's Restore button.

Saving records

You saw earlier that it's necessary when starting any of the three viewers to type p on the command line followed by the name of a file in order to save records once the viewer is running. I've also provided a Java runtime that should do the same thing when a .bat file's icon is double-clicked, or when the appropriate command is entered.

Each viewer can not only save records to a file, but can also print them to paper or to PDF; to do either, however, the records must first be saved.

To save one or more records, keyboard shortcuts (not the Saved records menu) are used.

The shortcut that saves the record currently being viewed is CTRL + \. To look at the file to which the record has been saved, one can use the Saved records menu. To modify the file prior to or after viewing it, you need to open it in a different application: the viewer can append data to it, but cannot edit it.

Adding records to a file one at a time by using CTRL + \ is prudent, but it might not be convenient if you've done as we were doing earlier and searched with a view to finding as many records as matched. We'll return to our previous example.

Where we left our viewer, the topmost drop-down box was populated with 89 matching records.

Above, I've changed the regular expression .*Bauman, Zygmunt.* to .*[\u204C\u204D] Bauman, Zygmunt.* (which is equivalent to .*[⁌⁍] Bauman, Zygmunt.*). Pressing CTRL + I indicates that 79 out of the 89 records in the drop-down box match .*[\u204C\u204D] Bauman, Zygmunt.*. The old regular expression looked for Bauman, Zygmunt anywhere in a given record, whereas the new one specifies that the name must occur in either the dcterms:contributor or dcterms:creator field.
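The difference between the two expressions can be tried out in Java. The sketch below rests on several assumptions: the two records are invented, the class and constant names are hypothetical, and the delimiter is followed by two space characters as described under "The records" (so the expression here is written with two spaces after the bracket). The square brackets form a character class, which matches exactly one of the characters listed inside it.

```java
import java.util.regex.Pattern;

class NameFieldDemo {
    // The name anywhere in the record.
    static final String ANYWHERE = ".*Bauman, Zygmunt.*";
    // The name directly after either of the two name-field delimiters
    // (code points 204C and 204D), each followed by two spaces.
    static final String IN_NAME_FIELD = ".*[\\u204C\\u204D]  Bauman, Zygmunt.*";

    static boolean matches(String regex, String record) {
        return Pattern.matches(regex, record);
    }

    public static void main(String[] args) {
        // Hypothetical records, invented for illustration only.
        String inNameField = "\u2022  An invented title  \u204D  Bauman, Zygmunt";
        String mentionedOnly = "\u2022  Another invented title  \u21C7  An abstract citing Bauman, Zygmunt";

        System.out.println(matches(ANYWHERE, inNameField));        // true
        System.out.println(matches(IN_NAME_FIELD, inNameField));   // true
        System.out.println(matches(ANYWHERE, mentionedOnly));      // true
        System.out.println(matches(IN_NAME_FIELD, mentionedOnly)); // false
    }
}
```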

Where saving the records to look at later is concerned, the alternatives available here are to save all 89 records, save only the 79 that match the regular expression entered in the text input area, or save only those that do not match the regular expression in the text input area.

Pressing CTRL + ] will save all 89 records. CTRL + ] obviously needs to be used with care: you saw before that the number of records dated 2014 amounted to well over 100,000, and although the footprint on the disc or other storage device might not be that bad, reviewing that many records by means of the Saved records menu would prove touch and go at best (a tabbed view remains an option, but could take a very long time).

Pressing CTRL + [ will only save the 79 records that match .*[\u204C\u204D] Bauman, Zygmunt.*, the regular expression currently in the text input area.

Those that don't match could be saved instead by pressing CTRL + /.

In this scenario, pressing CTRL + ] will do the same thing as pressing CTRL + [ followed by CTRL + /.
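The three choices amount to splitting the loaded records into two groups. A minimal Java sketch of that split follows; it assumes full-match semantics, uses invented records, and the class and method names are hypothetical rather than anything found in the viewer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class SaveChoicesDemo {
    // Selects either the records that match the expression (wantMatches
    // is true) or those that do not (wantMatches is false).
    static List<String> select(String regex, List<String> loaded, boolean wantMatches) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String record : loaded) {
            if (pattern.matcher(record).matches() == wantMatches) {
                selected.add(record);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical loaded records, invented for illustration only.
        List<String> loaded = Arrays.asList(
            "\u2022  Title one  \u204D  Bauman, Zygmunt",
            "\u2022  Title two  \u21C7  An abstract citing Bauman, Zygmunt",
            "\u2022  Title three  \u204C  Bauman, Zygmunt");
        String regex = ".*[\\u204C\\u204D]  Bauman, Zygmunt.*";

        List<String> matching = select(regex, loaded, true);      // like CTRL + [
        List<String> nonMatching = select(regex, loaded, false);  // like CTRL + /

        // Every loaded record falls into exactly one of the two groups.
        System.out.println(matching.size() + " + " + nonMatching.size()
                + " = " + loaded.size());
    }
}
```

Saving the two groups one after the other yields the same records as saving everything at once, though not necessarily in the same order within the file.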

Printing records

The Saved records menu is principally for viewing the file containing your saved records, or for printing it. Depending on what you choose, it either shows the contents of the entire file, or breaks it up into tabs holding 100 records each. There is a menu option that can be used to write ("spin off") either the currently viewed tab or the file in the state one's currently viewing it to a separate file.
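Breaking a file of saved records into tabs of 100 is a matter of slicing a list into fixed-size pieces. The Java sketch below shows one way to do that; it is an illustration only, with a hypothetical class and method name, and makes no claim about how the viewer itself builds its tabs.

```java
import java.util.ArrayList;
import java.util.List;

class TabChunkDemo {
    // Splits the items into consecutive groups of at most `size` items;
    // the last group holds whatever is left over.
    static <T> List<List<T>> chunk(List<T> items, int size) {
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            groups.add(new ArrayList<>(items.subList(start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        // 250 placeholder records, invented for illustration only.
        List<String> saved = new ArrayList<>();
        for (int i = 1; i <= 250; i++) {
            saved.add("record " + i);
        }

        List<List<String>> tabs = chunk(saved, 100);
        // Three tabs: 100, 100, and 50 records.
        for (List<String> tab : tabs) {
            System.out.println(tab.size());
        }
    }
}
```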

Once a means of viewing the file has been chosen, an Actions menu is available and the file can be sent to a printer or saved to PDF.

The PDF file that is produced is not the type that consists of a series of raster images useful only for printing or viewing. It is searchable, and text can be copied from it and pasted into other applications.

I hope that each viewer, once it is used, will help demonstrate the breadth of the bibliographic records that the Library has so generously shared with one and all.