designit: a flexible engine to generate experiment layouts

Juliane Siebourg-Polster, Iakov Davydov, Guido Steiner, Balazs Banfai


The examples in this vignette were used in our presentation.

They use a subset of the longitudinal_subject_samples dataset.


library(designit)
library(dplyr)

data("longitudinal_subject_samples")

dat <- longitudinal_subject_samples |>
  filter(Group %in% 1:5, Week %in% c(1, 4)) |>
  select(SampleID, SubjectID, Group, Sex, Week)

# for simplicity: remove two subjects that don't have both visits
dat <- dat |>
  filter(SubjectID %in%
    (dat |> count(SubjectID) |> filter(n == 2) |> pull(SubjectID)))

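# one row per subject; the last pipeline step was cut off in the source,
# so deduplication with distinct() is an assumption
subject_data <- dat |>
  select(SubjectID, Group, Sex) |>
  distinct()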

Batch effects matter

Here is an example of a plate effect. Both the top and bottom rows of the plate are used as controls.

This is the experiment design:

These are the readouts:

Due to the plate effect, the control rows are affected differently. It is virtually impossible to normalize readouts in a meaningful way.

Go fully random?

Gone wrong: Random distribution of 31 grouped subjects into 3 batches turns out unbalanced:
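
The effect is easy to reproduce in base R. The sketch below is only an illustration: the group sizes are made up (they are not the vignette's data), the seed is arbitrary, and the 31 subjects are split purely at random into batches of 11, 10 and 10.

```r
# Hypothetical data: 31 subjects in 5 groups (sizes are made up for illustration)
set.seed(1)
group <- rep(1:5, times = c(7, 6, 6, 6, 6))

# Purely random allocation into 3 batches of (almost) equal size
batch <- sample(rep(1:3, length.out = length(group)))

# Observed counts: rows are batches, columns are groups
counts <- table(batch = batch, group = group)
print(counts)

# Counts a perfectly proportional allocation would give in each cell
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)
print(round(expected, 1))
```

Comparing the two printed tables shows how far a single random draw can drift from the proportional allocation, which is the motivation for blocking.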

“Block what you can and randomize what you cannot.” (G. Box, 1978)


designit is an R package that helps avoid batch or gradient effects in complex experiments by offering flexible ways to allocate a given set of samples to experiment layouts. Its strength is that it implements a very general framework that can easily be customized and extended to fit specific constrained layouts.

Sample Batching


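bc <- BatchContainer$new(
  dimensions = list("batch" = 3, "location" = 11)
) |>
  # the assignment step was cut off in the source; random assignment is assumed
  assign_random(subject_data)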

Batch composition before optimization

batch  location  SubjectID  Group  Sex
1      1         NA         NA     NA
1      2         P32        5      M
1      3         P10        3      F
...    ...       ...        ...    ...
3      9         P31        3      F
3      10        P33        5      M
3      11        P24        5      F


bc <- optimize_design(
  bc,
  scoring = list(
    group = osat_score_generator(
      batch_vars = "batch",
      feature_vars = "Group"
    ),
    sex = osat_score_generator(
      batch_vars = "batch",
      feature_vars = "Sex"
    )
  ),
  n_shuffle = 1,
  acceptance_func =
    ~ accept_leftmost_improvement(..., tolerance = 0.01),
  max_iter = 150,
  quiet = TRUE
)
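
To make this optimization less of a black box, here is a minimal base R sketch of the general idea. It assumes an OSAT-style score (the sum of squared differences between observed and expected counts per batch and factor level) and a greedy loop that swaps two samples and keeps the swap unless the score gets worse. The helpers `osat_like_score` and `greedy_swap` and the toy data are hypothetical; this is not designit's implementation, which among other things can weigh several scores in order.

```r
# OSAT-style score (assumed form): squared deviation of observed from expected counts
osat_like_score <- function(batch, feature) {
  observed <- table(batch, feature)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2)
}

# Greedy optimization: swap two random samples, keep the swap unless the score worsens
greedy_swap <- function(batch, feature, max_iter = 150) {
  best <- osat_like_score(batch, feature)
  for (i in seq_len(max_iter)) {
    idx <- sample(length(batch), 2)
    proposal <- batch
    proposal[idx] <- batch[rev(idx)]
    score <- osat_like_score(proposal, feature)
    if (score <= best) {
      batch <- proposal
      best <- score
    }
  }
  batch
}

set.seed(17)
feature <- rep(1:5, times = c(7, 6, 6, 6, 6)) # made-up group labels
start <- sample(rep(1:3, length.out = length(feature)))
optimized <- greedy_swap(start, feature)

score_before <- osat_like_score(start, feature)
score_after <- osat_like_score(optimized, feature)
print(c(before = score_before, after = score_after))
```

Because only swaps are proposed, batch sizes never change; the loop can only redistribute which samples sit in which batch.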

Batch composition after optimization

batch  location  SubjectID  Group  Sex
1      1         NA         NA     NA
1      2         P01        1      F
1      3         P10        3      F
...    ...       ...        ...    ...
3      9         P29        5      F
3      10        P33        5      M
3      11        P12        3      F

Plate layouts

Continuous confounding

Assays are often performed in well plates (24, 96, or 384 wells).

Observed effects

Since plate effects often cannot be avoided, we aim to distribute sample groups of interest evenly across the plate and adjust for the effect computationally.



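bc <- BatchContainer$new(
  dimensions = list("plate" = 3, "row" = 4, "col" = 6)
) |>
  # the assignment step and the plot_plate() call heads were cut off in the
  # source; in-order assignment of `dat` is an assumption
  assign_in_order(dat)

plot_plate(bc,
  plate = plate, row = row, column = col,
  .color = Group, title = "Initial layout by Group"
)
plot_plate(bc,
  plate = plate, row = row, column = col,
  .color = Sex, title = "Initial layout by Sex"
)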

2-step optimization

Across-plate optimization using the OSAT score, as before

bc1 <- optimize_design(
  bc,
  scoring = list(
    group = osat_score_generator(
      batch_vars = "plate",
      feature_vars = "Group"
    ),
    sex = osat_score_generator(
      batch_vars = "plate",
      feature_vars = "Sex"
    )
  ),
  n_shuffle = 1,
  acceptance_func =
    ~ accept_leftmost_improvement(..., tolerance = 0.01),
  max_iter = 150,
  quiet = TRUE
)

Within-plate optimization using a distance-based sample scoring function

bc2 <- optimize_design(
  bc1,
  scoring = mk_plate_scoring_functions(
    bc1,
    plate = "plate", row = "row", column = "col",
    group = "Group"
  ),
  # only propose swaps between wells on the same plate
  shuffle_proposal_func = shuffle_with_constraints(dst = plate == .src$plate),
  max_iter = 150,
  quiet = TRUE
)
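
The constraint `dst = plate == .src$plate` limits each swap to wells on the same plate as the source well, so the across-plate balance reached in the first step is left untouched. The base R sketch below illustrates that property on a toy layout; `propose_within_plate_swap` and the group labels are hypothetical and do not reproduce designit's internals.

```r
# Toy layout with the same dimensions as above: 3 plates x 4 rows x 6 columns
layout <- expand.grid(col = 1:6, row = 1:4, plate = 1:3)
set.seed(4)
layout$Group <- sample(rep(1:5, length.out = nrow(layout))) # made-up groups

# Pick a random source well, then a destination well on the same plate
propose_within_plate_swap <- function(layout) {
  src <- sample(nrow(layout), 1)
  candidates <- setdiff(which(layout$plate == layout$plate[src]), src)
  dst <- candidates[sample(length(candidates), 1)]
  c(src, dst)
}

# Group counts per plate before and after 100 constrained swaps
before <- table(layout$plate, layout$Group)
for (i in 1:100) {
  swap <- propose_within_plate_swap(layout)
  layout$Group[swap] <- layout$Group[rev(swap)]
}
after <- table(layout$plate, layout$Group)

print(all(before == after))
```

Whatever the within-plate score does with these swaps, the per-plate group counts cannot change, which is why the two steps can be run one after the other.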

2-step optimization with optimize_multi_plate_design()

We perform the same optimization as before, but use the optimize_multi_plate_design() function to combine the two steps.

bc <- optimize_multi_plate_design(
  bc,
  across_plates_variables = c("Group", "Sex"),
  within_plate_variables = c("Group"),
  plate = "plate", row = "row", column = "col",
  n_shuffle = 2,
  max_iter = 500 # 2000
)
#> 1 ... 2 ... 3 ...


A glimpse of a more complex application



See the vignette invivo_study_design for the full story.


