topolow
is an R package that implements a novel,
physics-inspired algorithm for antigenic cartography mapping and
analysis. The algorithm addresses critical challenges in mapping
antigenic relationships from incomplete experimental data, particularly
for rapidly evolving pathogens like influenza, SARS-CoV-2, HIV, and
dengue viruses.
You can install the development version of topolow directly from GitHub:
# Install devtools if needed
if (!require("devtools")) install.packages("devtools")
# Install topolow
devtools::install_github("omid-arhami/topolow")
Alternatively, you can install using the single source file:
# For Windows binary
install.packages("path/to/topolow_X.zip", repos = NULL)
# For source package
install.packages("path/to/topolow_X.tar.gz", repos = NULL, type = "source")
For 3D visualization capabilities, install the rgl
package:
install.packages("rgl")
Note for macOS users: The rgl
package requires XQuartz
to be installed for proper OpenGL support. You can download it from https://www.xquartz.org/, then install the downloaded
package and restart your computer.
Even without rgl, you can use all core functionality of topolow. The package will automatically fall back to 2D visualizations.
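To see in advance which visualization mode applies on a given machine, a small check like the following can help. It is a minimal sketch that uses only base R's `requireNamespace()`; the variable name `has_rgl` and the messages are illustrative and are not part of topolow.

```r
# Check whether the optional rgl package is installed, without attaching it
has_rgl <- requireNamespace("rgl", quietly = TRUE)

if (has_rgl) {
  message("rgl is installed: 3D visualizations should be available.")
} else {
  message("rgl is not installed: expect the 2D fallback.")
}
```
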
Here’s a simple example to check if Topolow is working and to analytically validate its result.
Let us take 4 points in a 2D space, two reference antigens S/1 and S/2 and two test antigens V/1 and V/2.
S/1 at (0, 0)
S/2 at (3, 0)
V/1 at (2, 2)
V/2 at (0, 4)
The pairwise Euclidean distances between these points are computed as follows:
\(d(S/1,S/2) = \sqrt{(3-0)^2 + (0-0)^2} = \sqrt{9 + 0} = \sqrt{9} = 3.\)
\(d(S/1,V/1) = \sqrt{(2-0)^2 + (2-0)^2} = \sqrt{4 + 4} = \sqrt{8} = 2\sqrt{2} \approx 2.828.\)
\(d(S/1,V/2) = \sqrt{(0-0)^2 + (4-0)^2} = \sqrt{0 + 16} = \sqrt{16} = 4.\)
\(d(S/2,V/1) = \sqrt{(2-3)^2 + (2-0)^2} = \sqrt{1 + 4} = \sqrt{5} \approx 2.236.\)
\(d(S/2,V/2) = \sqrt{(0-3)^2 + (4-0)^2} = \sqrt{9 + 16} = \sqrt{25} = 5.\)
\(d(V/1,V/2) = \sqrt{(0-2)^2 + (4-2)^2} = \sqrt{4 + 4} = \sqrt{8} = 2\sqrt{2} \approx 2.828.\)
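The distances above can be reproduced in base R as a quick sanity check. This sketch uses only `rbind()` and `dist()`; the object names `coords` and `true_dist` are chosen here for illustration.

```r
# Coordinates of the four antigens, one row per point
coords <- rbind(
  "S/1" = c(0, 0),
  "S/2" = c(3, 0),
  "V/1" = c(2, 2),
  "V/2" = c(0, 4)
)

# dist() computes pairwise Euclidean distances by default
true_dist <- as.matrix(dist(coords))
print(round(true_dist, 3))
```
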
Imagine we have measured the distances of V/1 against S/1 and S/2, and V/2 against S/1 and S/2. We use Topolow to find the distance between V/1 and V/2 which is missing in the distance matrix (dist_mat in code below). From the analytical calculations we expect d(V/1,V/2) = 2.828.
Keep in mind that this is the simplest possible example, chosen because its analytical solution lets us verify the result. Topolow is most useful when the data contain many points and many missing distances.
library(topolow)
# Create a simple 4×4 distance matrix (NA marks the unmeasured V/1 vs. V/2 pair)
dist_mat <- matrix(c(
  # S/1   S/2    V/1    V/2
  0,     3,     2.828, 4,   # S/1
  3,     0,     2.236, 5,   # S/2
  2.828, 2.236, 0,     NA,  # V/1
  4,     5,     NA,    0    # V/2
), nrow = 4)
rownames(dist_mat) <- colnames(dist_mat) <- c("S/1", "S/2", "V/1", "V/2")
# Run TopoLow in 2D
result <- create_topolow_map(dist_mat, ndim = 2, mapping_max_iter = 1000,
                             k0 = 1, cooling_rate = 0.0001, c_repulsion = 0.001,
                             write_positions_to_csv = FALSE, verbose = TRUE)
# Investigate the results
print(dist_mat)
print(result$est_distances)
         S/1      S/2      V/1      V/2
S/1 0.000000 3.000027 2.827970 4.000056
S/2 3.000027 0.000000 2.235928 5.000045
V/1 2.827970 2.235928 0.000000 2.828457
V/2 4.000056 5.000045 2.828457 0.000000
All of the estimated distances are close to the analytical solution, including the model’s estimate for the missing distance between V/1 and V/2.
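To put a number on "close", the estimates can be compared with the analytical distances. The sketch below does not call topolow: it hard-codes the estimated values from the printed output above and recomputes the true distances with base R. The names `est`, `truth`, `max_err`, and `missing_err` are illustrative.

```r
ids <- c("S/1", "S/2", "V/1", "V/2")

# Estimated distances, copied from the printed output above
est <- matrix(c(
  0.000000, 3.000027, 2.827970, 4.000056,
  3.000027, 0.000000, 2.235928, 5.000045,
  2.827970, 2.235928, 0.000000, 2.828457,
  4.000056, 5.000045, 2.828457, 0.000000
), nrow = 4, byrow = TRUE, dimnames = list(ids, ids))

# Analytical distances from the known coordinates
coords <- rbind(c(0, 0), c(3, 0), c(2, 2), c(0, 4))
rownames(coords) <- ids
truth <- as.matrix(dist(coords))

# Absolute error per pair, overall and for the originally missing pair
abs_err <- abs(est - truth)
max_err <- max(abs_err)
missing_err <- abs_err["V/1", "V/2"]
print(c(max_err = max_err, missing_err = missing_err))
```

For this output, every pairwise error is below 0.001. Part of the error on the measured pairs comes from the inputs being rounded to three decimals (2.828 and 2.236).
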
This package includes computationally intensive examples in the
inst/examples
directory. These examples demonstrate the
complete use cases from the paper but require substantial computation time and
resources.
To run these studies after installing Topolow, you can copy all associated files, subdirectories, and the Rmd files to your machine. Then read through the markdown notebooks and choose which parts you wish to run. There are usually options to use the provided parameters to bypass some parts of the simulations.
Note: Results of time-intensive sections are also provided in csv files and explained at the beginning of each Rmd file.
Topolow employs a novel physical model where:
This approach allows Topolow to effectively optimize antigenic positions through a series of one-dimensional calculations, eliminating the need for complex gradient computations required by traditional MDS methods.
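To give a feel for what a "one-dimensional calculation" per pair can look like, here is a toy relaxation in base R. It is not Topolow's implementation: treating each measured pair as a spring-like constraint, the function name `relax_pairs`, the `step` factor, and the starting layout are all assumptions made for illustration, and unmeasured pairs are simply ignored.

```r
# Toy pairwise relaxation (illustrative only; NOT the topolow algorithm).
# Each measured pair is adjusted along the line joining its two points,
# so every update is a one-dimensional calculation on a single distance.
relax_pairs <- function(pos, target, n_sweeps = 2000, step = 0.5) {
  # Index pairs (upper triangle) that have a measured target distance
  pairs <- which(upper.tri(target) & !is.na(target), arr.ind = TRUE)
  for (sweep in seq_len(n_sweeps)) {
    for (p in seq_len(nrow(pairs))) {
      i <- pairs[p, 1]
      j <- pairs[p, 2]
      delta <- pos[i, ] - pos[j, ]           # vector from j to i
      r <- sqrt(sum(delta^2))                # current distance
      # Signed error along the i-j line, split equally between both points
      shift <- step * (r - target[i, j]) / 2 * delta / r
      pos[i, ] <- pos[i, ] - shift
      pos[j, ] <- pos[j, ] + shift
    }
  }
  pos
}

ids <- c("S/1", "S/2", "V/1", "V/2")
target <- matrix(c(
  0,     3,     2.828, 4,
  3,     0,     2.236, 5,
  2.828, 2.236, 0,     NA,
  4,     5,     NA,    0
), nrow = 4, byrow = TRUE, dimnames = list(ids, ids))

# Arbitrary non-degenerate starting layout (a unit square)
start <- matrix(c(0, 0,  1, 0,  1, 1,  0, 1), ncol = 2, byrow = TRUE,
                dimnames = list(ids, NULL))

pos <- relax_pairs(start, target)
est <- as.matrix(dist(pos))
measured_err <- max(abs(est - target), na.rm = TRUE)
print(measured_err)
```

Only the measured pairs are constrained in this sketch, so it says nothing about how Topolow treats missing pairs, cooling, or repulsion.
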
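What it is
Computes, for each antigen, a velocity vector showing the rate and direction of its drift. The velocity \(v_i\) of antigen \(i\) is built from kernel weights \(K_{ij}\) accumulated over the earlier antigens, \(\sum_{j:\, t_j < t_i} K_{ij}\).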
Key parameters
sigma_x (antigenic bandwidth) and sigma_t (temporal bandwidth): by default, auto-estimated via Silverman’s rule
clade_depth: depth (in tree edges) for phylo-aware clade filtering (Average Leaf-to-Backbone Distance)
The algorithm can handle input data in various formats. If the raw
input consists of one or more long tables with references on columns
and challenges on rows, they are converted to the standard matrix form.
(See the example scripts in inst/examples.)
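As a rough illustration of the target matrix form, the sketch below converts a small table with one measurement per row into a symmetric distance matrix using base R only. The column names `challenge`, `reference`, and `distance` are hypothetical, and this is not the package's own conversion routine; see the scripts in inst/examples for the supported workflow.

```r
# Hypothetical input: one row per measured challenge/reference pair
long_tbl <- data.frame(
  challenge = c("V/1", "V/1", "V/2", "V/2", "S/2"),
  reference = c("S/1", "S/2", "S/1", "S/2", "S/1"),
  distance  = c(2.828, 2.236, 4, 5, 3),
  stringsAsFactors = FALSE
)

# Square matrix over all antigens; unmeasured pairs stay NA
ids <- sort(unique(c(long_tbl$reference, long_tbl$challenge)))
dist_mat <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
                   dimnames = list(ids, ids))
diag(dist_mat) <- 0

# Fill both triangles by indexing with (row name, column name) pairs
idx <- cbind(long_tbl$challenge, long_tbl$reference)
dist_mat[idx] <- long_tbl$distance
dist_mat[idx[, 2:1]] <- long_tbl$distance

print(dist_mat)
```

The V/1 and V/2 pair has no row in the table, so it remains NA in the matrix.
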
The package accepts distance matrices with the following characteristics:
Key parameters for the TopoLow algorithm:
The optimal values for each dataset can be determined through adaptive
Monte Carlo simulations performed by the functions
initial_parameter_optimization
and
run_adaptive_sampling.
(See the example scripts in inst/examples.)
Topolow demonstrates significant improvements over traditional MDS approaches:
Topolow is particularly valuable for:
When using topolow on HPC systems with SLURM (only available in Topolow v0.3.2), additional setup might be needed:
module load R/4.4.1
install.packages(c("reshape2", "data.table", "dplyr", "ggplot2"))
initial_parameter_optimization(
# ... other parameters ...
r_module = "R/4.4.1", # Set this to match your cluster's R module
use_slurm = TRUE
)
The full documentation of the package and all of its functions is available at https://github.com/omid-arhami/topolow/blob/main/build/topolow-manual.pdf
For detailed documentation of a specific function in the Topolow package:
# View documentation
?function_name
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is protected by a pre-publication license.
The license will transition upon publication - see the LICENSE file for details.
If you use this package, please cite the article:
Omid Arhami, Pejman Rohani, Topolow: A mapping algorithm for antigenic cross-reactivity and binding affinity assays, Bioinformatics, 2025, btaf372, https://doi.org/10.1093/bioinformatics/btaf372