Challenges Using HDF5 and Java

Scott Forest Hull II Eric Lingerfelt Dasha

This article details several technical challenges (and their solutions) encountered during the development of NiCE plugins that utilize HDF5 and HDF5 Java libraries.

Writing a String Attribute to a Group's Metadata

The example below shows how to write an Attribute with a String Datatype (or its subclass H5Datatype) to the metadata of a Group (or its subclass H5Group). This approach could not be found on the HDF5 website or in any associated forums. In the example, a name/value pair is written to an H5Group.

//Create a custom String data type for the value
H5Datatype datatype = (H5Datatype) h5File.createDatatype(Datatype.CLASS_STRING,
        value.length(), Datatype.NATIVE, Datatype.NATIVE);

//1D of size 1
long[] dims = { 1 };

//Create a String array of size one to hold the value
String[] values = new String[1];

//Assign the value to the first array index
values[0] = value;

//Create a byte array from values using the stringToByte method
byte[] bvalue = Dataset.stringToByte(values, value.length());

//Create an attribute object
Attribute attribute = new Attribute(name, datatype, dims);

//Set the value of the attribute to bvalue
attribute.setValue(bvalue);

//Write the attribute to the group's metadata
h5Group.writeMetadata(attribute);
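
To make the byte conversion step less opaque, here is a minimal pure-Java sketch of the idea behind fixed-length string packing: every String occupies exactly `length` bytes in one flat array. It is an illustration only and not the library's implementation of Dataset.stringToByte; the class name FixedLengthPacking, the NUL padding, and the ASCII encoding are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the HDF5 Java library.
final class FixedLengthPacking {

    /** Packs each string into exactly `length` bytes, truncating or NUL-padding as needed. */
    static byte[] pack(String[] strings, int length) {
        // A new byte array is zero-filled, so unused slots act as NUL padding (assumption).
        byte[] packed = new byte[strings.length * length];
        for (int i = 0; i < strings.length; i++) {
            byte[] raw = strings[i].getBytes(StandardCharsets.US_ASCII);
            int n = Math.min(raw.length, length); // truncate strings that are too long
            System.arraycopy(raw, 0, packed, i * length, n);
        }
        return packed;
    }

    public static void main(String[] args) {
        byte[] packed = pack(new String[] {"abc", "de"}, 3);
        System.out.println(packed.length); // two slots of three bytes each
    }
}
```

Note that the datatype size in the example above is value.length(), which counts characters rather than bytes. For values containing non-ASCII characters the byte length may be larger, so it is worth checking the size against your own data.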

Reading a String Attribute from a Group's Metadata

The example below shows how to read the value of an Attribute with a String Datatype (or its subclass H5Datatype) from the metadata of a Group (or its subclass H5Group). This approach could not be found on the HDF5 website or in any associated forums. In the example, the String value of an Attribute belonging to an H5Group is printed to stdout.

//Get the first attribute from the metadata of a group
Attribute attribute = (Attribute) group.getMetadata().get(0);

//Get the attribute's value which is a byte array
byte[] attributeByteArrayValue = (byte[])attribute.getValue();

//Get the attribute's value as a String array of length 1
String[] attributeValueStringArray = Dataset.byteToString(attributeByteArrayValue, attributeByteArrayValue.length);

//Get the only value from the string array
String attributeValue = attributeValueStringArray[0];

//Print the value to stdout
System.out.println("The Attribute's Value is " + attributeValue);
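
The reverse direction can be pictured the same way. The sketch below splits a flat byte array into fixed-length slots and cuts each slot at its first NUL byte. It is a pure-Java illustration of the idea, not the library's Dataset.byteToString; the class name FixedLengthUnpacking, the NUL terminator handling, and the ASCII encoding are assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the HDF5 Java library.
final class FixedLengthUnpacking {

    /** Splits `bytes` into slots of `length` bytes and decodes each slot up to its first NUL. */
    static String[] unpack(byte[] bytes, int length) {
        int count = bytes.length / length;
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            int offset = i * length;
            int end = 0;
            // Stop at the first NUL byte or at the end of the slot.
            while (end < length && bytes[offset + end] != 0) {
                end++;
            }
            result[i] = new String(bytes, offset, end, StandardCharsets.US_ASCII);
        }
        return result;
    }
}
```

Passing the full array length as the slot length, as the example above does, treats the whole byte array as a single slot and therefore yields an array with exactly one String.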

HDFView and Viewing String Datatypes

When exploring an H5File with the HDFView Java program, every Dataset with a String Datatype (or its subclass H5Datatype) is presented in the TextView perspective. For small Datasets, only the first column of Strings is shown. For larger Datasets, only lines are shown. This can give the false impression that a bug has been encountered. The h5dump utility program shows that the String Datasets have in fact been written successfully.

Writing Data to Compound Datasets

When writing data to a compound Dataset, the init() method of the Dataset object must be called immediately before calling the write(Object data) method. If the init() method is not called, then all values within the Dataset will be zeroes.

Reading a Multidimensional String Array from a Dataset

The sample code below shows how to read a 2D array of Strings from a Dataset in an HDF file.

public static String[][] get2dStringArray(FileFormat hdfFile, String path) throws Exception
{
    String[][] stringArray = null;
    Dataset dataset = (Dataset) hdfFile.get(path);

    if (dataset != null)
    {
        dataset.init();

        long[] dims = dataset.getDims();
        long[] start = dataset.getStartDims();
        stringArray = new String[(int) dims[0]][(int) dims[1]];

        for (int i = 0; i
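
A related step that often comes up when reading multidimensional data is turning one flat, row-major array into a String[][]. The sketch below assumes the values have already been read into a flat String[] of dims[0] * dims[1] elements. That assumption, along with the names StringGrid and reshape, is illustrative and not taken from the original listing.

```java
// Hypothetical helper, not part of the HDF5 Java library.
final class StringGrid {

    /** Copies a flat, row-major array into a rows x cols grid. */
    static String[][] reshape(String[] flat, int rows, int cols) {
        if (flat.length != rows * cols) {
            throw new IllegalArgumentException(
                    "expected " + (rows * cols) + " values but got " + flat.length);
        }
        String[][] grid = new String[rows][cols];
        for (int r = 0; r < rows; r++) {
            // Row r starts at offset r * cols in the flat array.
            System.arraycopy(flat, r * cols, grid[r], 0, cols);
        }
        return grid;
    }
}
```

With dims taken from dataset.getDims(), a call would look like StringGrid.reshape(flat, (int) dims[0], (int) dims[1]).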

Other interesting facts

No matter in what order HDF5 groups or attributes are added, HDF5 will always return them sorted by name in ASCII order.
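
ASCII order differs from dictionary order in two ways that are easy to overlook: uppercase letters sort before lowercase ones, and numeric suffixes are compared character by character. The short sketch below reproduces that ordering with plain Java sorting so the effect on names can be predicted in advance. The class name AsciiOrderDemo and the sample names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; it does not touch any HDF5 file.
final class AsciiOrderDemo {

    /** Returns a sorted copy of the names; for ASCII names this matches ASCII order. */
    static String[] sortedCopy(String... names) {
        String[] copy = names.clone();
        // String.compareTo compares UTF-16 code units, which equals ASCII order for ASCII text.
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        String[] sorted = sortedCopy("beta", "Zeta", "alpha", "Group2", "Group10");
        System.out.println(Arrays.toString(sorted));
        // prints [Group10, Group2, Zeta, alpha, beta]
    }
}
```

If a particular listing order matters, one option is to zero-pad numeric suffixes (for example Group02 and Group10) so that ASCII order and numeric order agree.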

C++ and Java equivalence calls

Because the reactor model is managed through two HDF5 APIs, a Java implementation and a C++ implementation, equivalent operations must be expressed consistently in both. The sections below pair the corresponding calls.

File Handling

Create and close a file in Java:

FileFormat fileFormat = FileFormat.getFileFormat(FileFormat.FILE_TYPE_HDF5);
//uri is the given location of the file
File file = new File(uri);
//Create the file, replacing any existing file at that path
H5File h5File = (H5File) fileFormat.createFile(file.getPath(), FileFormat.FILE_CREATE_DELETE);

//To close a file
h5File.close();

In C++ (using the HDF5 C API):

//Create a file, truncating it if it already exists
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

//Close the file
status = H5Fclose(file);

Create a group

In Java:

//Create a Group
H5Group h5Group = (H5Group) h5File.createGroup(name, parentH5Group);

In C++:

//Will need a way to maintain the listing order for createDataGroup, 
//as it looks like local pathing is required to create subsets of groups

Group* group = new Group(file->createGroup("/Data"));

Write Attributes - Metadata(Double, Integer, String)

In Java:

//Shared operations. The type of the values array depends on the attribute type
long[] dims = { 1 };
//Use an int[] for an integer attribute; Strings are handled separately below
double[] values = { value };

//Arguments: name of attribute, datatype (see above), dimensions, values
Attribute attribute = new Attribute(name, datatype, dims, values);
h5Group.writeMetadata(attribute);

String attribute:

    //Assume dims is { 1 } and values is a String[] of length 1

    //Convert the String array to a byte array first
    byte[] bvalue = Dataset.stringToByte(values, value.length());

    Attribute attribute = new Attribute(name, datatype, dims);
    attribute.setValue(bvalue);
    h5Group.writeMetadata(attribute);

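In C++: Integer and Double (change the type accordingly):

    //ds is the object (for example a DataSet or Group) that receives the attribute
    IntType int_type(PredType::NATIVE_INT64);
    DataSpace att_space(H5S_SCALAR);
    Attribute att = ds.createAttribute("myAttribute", int_type, att_space);
    //The buffer must be 64 bits wide to match the memory type passed to write()
    int64_t data = 77; //int64_t comes from <cstdint>

    att.write(int_type, &data);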

String:

    // Create a fixed-length string
    StrType fls_type(0, len_of_your_attribute); // 0 is a dummy argument

    // Open your group

    // Create dataspace for the attribute
    DataSpace att_space(H5S_SCALAR);

    // Create an attribute for the group
    Attribute your_attribute = your_group.createAttribute(attr_name, fls_type, att_space);

    // Write data to the attribute
    your_attribute.write(fls_type, buffer_for_attribute);

    //You can replace the first statement with this for variable-length string
    StrType vls_type(0, H5T_VARIABLE);

Write a Dataset

In Java:

//See GridLabelProvider.write/readDatasets 
//for a more thorough explanation 
Dataset dataSet = h5File.createScalarDS(name, h5Group, dataType, dims, null, null, 0, null);

In C++:

http://www.hdfgroup.org/HDF5/doc/Intro/IntroExamples.html#CreateExtendWrite

Get a Group

In Java:

//The file format method provides a direct, O(1)
//grab whereas utilizing parentH5Group.getMemberList() requires O(n) time.
//HDF5 object paths always use "/" as the separator, regardless of operating system
H5Group h5Group = (H5Group) parentH5Group.getFileFormat().get(
        parentH5Group.getFullName() + "/" + groupName);
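
HDF5 object paths are separated by "/" on every platform, so the lookup key can be built without consulting the operating system's file separator. The helper below is a small sketch of that path handling; it also avoids a doubled slash when the parent is the root group, whose full name is assumed here to be "/". The names Hdf5Paths and childPath are hypothetical.

```java
// Hypothetical helper, not part of the HDF5 Java library.
final class Hdf5Paths {

    /** Joins a parent group's full name and a child name with a single "/". */
    static String childPath(String parentFullName, String childName) {
        if (parentFullName.endsWith("/")) {
            // Covers the root group "/" and any name that already ends with a slash.
            return parentFullName + childName;
        }
        return parentFullName + "/" + childName;
    }

    public static void main(String[] args) {
        System.out.println(childPath("/", "Data"));     // prints /Data
        System.out.println(childPath("/Data", "Grid")); // prints /Data/Grid
    }
}
```

The result can then be passed to the get(...) call of the file format, for example with childPath(parentH5Group.getFullName(), groupName) as the argument.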

In C++:

//open Group takes a name of the group.  
//If a subgroup of "Data", must use "/Data/otherGroup" as argument
group = new Group(file->openGroup("Data"));

Get an Attribute

In Java:

//Iterate over the attributes in the group to get the attribute
for(int i = 0; i

Get a Dataset

In Java:

//The file format method provides a direct, O(1)
//grab whereas utilizing parentH5Group.getMemberList() requires O(n) time.
//HDF5 object paths always use "/" as the separator, regardless of operating system
Dataset dataset = (Dataset) parentH5Group.getFileFormat().get(
        parentH5Group.getFullName() + "/" + datasetName);

In C++:

Use this file: www.hdfgroup.org/ftp/HDF5/current/src/unpacked/c++/examples/compound.cpp

Other notes

The C and C++ libraries are distinct but can be used interchangeably to some extent. Note that the C library does not throw exceptions on errors; failures are reported only through return values, so terrible things can happen if those are not checked carefully.

Other Articles

http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2010-April/002979.html


Related

Wiki: Developer Documentation
Wiki: NiCE and HDF5
