The h5lite package provides a simplified, user-friendly
interface for interacting with HDF5 files in R. While HDF5 is a complex
hierarchical data format, h5lite is designed to feel
familiar to R users by mapping HDF5 concepts directly to R’s native data
structures.
If you are new to HDF5, the easiest way to understand it is through R analogues. HDF5 files function like a file system within a single file, containing “groups” (folders) and “datasets” (files).
| R Concept | HDF5 Concept | Description |
|---|---|---|
| List | Group | A container that holds other objects (datasets or other groups). |
| Vector / Matrix | Dataset | A multidimensional array of data (numeric, character, etc.). |
| Data Frame | Compound Dataset | A table where each column can have a different data type. |
| Attribute | Attribute | Metadata attached to a specific object (e.g., units, timestamps). |
| Factor | Enum | An integer vector with associated string labels. |
The package uses two primary functions: h5_write() to
save data and h5_read() to load it.
You can write standard R arrays and vectors directly to an HDF5 file.
h5lite automatically handles dimensions.
```r
library(h5lite)
file <- tempfile(fileext = ".h5")
# 1. 1D Array (Vector)
vec <- c(1.5, 2.3, 4.2)
h5_write(vec, file, "examples/vector")
# 2. 2D Array (Matrix)
mat <- matrix(1:9, nrow = 3, ncol = 3)
h5_write(mat, file, "examples/matrix")
# 3. 3D Array
arr <- array(1:24, dim = c(4, 3, 2))
h5_write(arr, file, "examples/array_3d")
# 4. Scalar
# By default, h5lite writes length-1 vectors as 1-element arrays.
# Wrap the value in I() to write a true HDF5 scalar.
val <- I(42)
h5_write(val, file, "examples/scalar")
```
Note: While h5lite supports preserving row and
column names for matrices and vectors, these examples omit them for
simplicity. See vignette('attributes-in-depth') for details
on how dimension names are stored.
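To complement the write examples, here is a minimal round-trip sketch. It assumes that h5_read() takes the file path and the object name in the same order as h5_write(); the scratch file and the name "roundtrip/matrix" are hypothetical and kept separate from the examples above.

```r
library(h5lite)

# Hypothetical scratch file, so the main example file stays unchanged.
scratch <- tempfile(fileext = ".h5")

# A small 2 x 3 matrix with non-integer values.
m <- matrix(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5), nrow = 2)
h5_write(m, scratch, "roundtrip/matrix")

# Assumption: h5_read(file, name) returns the stored object.
back <- h5_read(scratch, "roundtrip/matrix")

# Compare shape and values with the original.
print(dim(back))
print(all.equal(as.numeric(back), as.numeric(m)))
```

If the dimensions and values match, the matrix layout survived the trip through HDF5 without any manual transposition.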
h5lite automatically selects the appropriate HDF5 data
type based on the content of your R objects.
| R Data Type | HDF5 Equivalent | Description |
|---|---|---|
| Numeric | variable | Selects optimal type: int8, float32, etc. |
| Logical | H5T_STD_U8LE | Stored as 0 (FALSE) or 1 (TRUE) (uint8). |
| Character | H5T_STRING | Variable or fixed-length UTF-8 strings. |
| Complex | H5T_COMPLEX | Native HDF5 2.0+ complex numbers. |
| Raw | H5T_OPAQUE | Raw bytes / binary data. |
| Factor | H5T_ENUM | Integer indices with label mapping. |
| integer64 | H5T_STD_I64LE | 64-bit signed integers via bit64 package. |
| POSIXt | H5T_STRING | ISO 8601 string (YYYY-MM-DDTHH:MM:SSZ). |
| List | H5O_TYPE_GROUP | Recursive container structure. |
| Data Frame | H5T_COMPOUND | Table of mixed types. |
| NULL | H5S_NULL | Creates a placeholder. |
You can use the as argument to explicitly set the HDF5
data type for numeric, logical, and character vectors. See
vignette('data-types') for details.
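As a rough illustration of the as argument, the sketch below writes the same vector twice, once with automatic type selection and once with an explicit type. The type name "float32" is an assumption based on the type names used in the table above; check vignette('data-types') for the accepted values. The scratch file and dataset names are hypothetical.

```r
library(h5lite)

# Hypothetical scratch file, separate from the main example file.
scratch <- tempfile(fileext = ".h5")

x <- c(1, 2, 3)

# Let h5lite choose the storage type.
h5_write(x, scratch, "auto_type")

# Request a specific storage type ("float32" is an assumed type name).
h5_write(x, scratch, "forced_float32", as = "float32")

# Inspect the resulting types.
h5_str(scratch)

# The values should read back unchanged either way.
forced <- h5_read(scratch, "forced_float32")
```

Forcing a type is mainly useful when another program expects a particular on-disk representation.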
R lists are naturally hierarchical, making them perfect for creating HDF5 groups.
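A minimal sketch of writing a nested list as a group follows. The names (experiment, config, id, status, data) are chosen to match the file listing shown later in this document; the values themselves are made up for illustration.

```r
library(h5lite)

# Reuse the 'file' path from the earlier examples. When this block is run
# on its own, 'file' resolves to the base R function, so create a path.
if (!is.character(file)) file <- tempfile(fileext = ".h5")

# Illustrative nested list: sub-lists become groups, leaves become datasets.
experiment <- list(
  config = list(id = 1L, status = "active"),
  data   = seq(0.5, 5, by = 0.5)  # ten numeric values
)

h5_write(experiment, file, "experiment")
```

Each list element ends up at a path built from its names, for example experiment/config/status.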
Data frames are written as native HDF5 compound datasets, allowing efficient storage of tabular data with mixed types.
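A minimal sketch of writing a data frame follows. The dataset name study_data and the columns id and val match the file listing shown later in this document; the values are made up, and reading the table back with h5_read(file, name) is an assumption about the argument order.

```r
library(h5lite)

# Reuse the 'file' path from the earlier examples. When this block is run
# on its own, 'file' resolves to the base R function, so create a path.
if (!is.character(file)) file <- tempfile(fileext = ".h5")

# Illustrative table with one integer and one floating-point column.
df <- data.frame(
  id  = 1:5,
  val = c(0.5, 1.5, 2.5, 3.5, 4.5)
)

h5_write(df, file, "study_data")

# Assumption: the compound dataset reads back as a data frame.
back <- h5_read(file, "study_data")
str(back)
```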
For more details on these structures, including how to handle factors
and nested lists, refer to vignette('data-frames').
Attributes are small pieces of metadata attached to objects.
h5lite writes R attributes (like units or
description) as HDF5 attributes.
```r
# Write a dataset
h5_write(1:10, file, "measurements")
# Attach an attribute to it
h5_write("meters", file, "measurements", attr = "units")
```
See vignette('attributes-in-depth') for information on
reading specific attributes and how special R attributes like
dimnames are handled.
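For orientation, a single attribute can presumably be read back by mirroring the write call. The sketch below assumes h5_read() accepts the same attr argument as h5_write(); treat it as an illustration rather than the documented interface, and note that the scratch file is hypothetical.

```r
library(h5lite)

# Hypothetical scratch file, separate from the main example file.
scratch <- tempfile(fileext = ".h5")

# Recreate the dataset and its attribute.
h5_write(1:10, scratch, "measurements")
h5_write("meters", scratch, "measurements", attr = "units")

# Assumption: attr = "units" selects the attribute instead of the dataset.
units <- h5_read(scratch, "measurements", attr = "units")
print(units)
```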
You can inspect the contents of an HDF5 file without reading the data
into memory using h5_ls() and h5_str().
```r
# List contents
h5_ls(file)
#>  [1] "examples"                 "examples/vector"
#>  [3] "examples/matrix"          "examples/array_3d"
#>  [5] "examples/scalar"          "experiment"
#>  [7] "experiment/config"        "experiment/config/id"
#>  [9] "experiment/config/status" "experiment/data"
#> [11] "study_data"               "measurements"
# Print structure tree (like R's str())
h5_str(file)
#> /
#> ├── examples/
#> │   ├── vector <float64 × 3>
#> │   ├── matrix <uint8 × 3 × 3>
#> │   ├── array_3d <uint8 × 4 × 3 × 2>
#> │   └── scalar <uint8 scalar>
#> ├── experiment/
#> │   ├── config/
#> │   │   ├── id <uint8 × 1>
#> │   │   └── status <utf8[6] × 1>
#> │   └── data <float64 × 10>
#> ├── study_data <compound[2] × 5>
#> │   ├── $id <uint8>
#> │   └── $val <float64>
#> └── measurements <uint8 × 10>
#>     └── @units <utf8[6] × 1>
```
For advanced file operations, including moving, deleting, and
verifying objects, refer to
vignette('data-organization').