Doing Research with Parallel LLM API Calls

When doing research on large language models, we often need to compare several models, or the same model under different settings or prompts. In this example we call one LLM many times at different temperatures and compare the results. The prompt offers three first names and asks the model to pick one.

Note: This vignette requires a valid OpenAI API key and will not run during package installation.

library(LLMR) 
library(ggplot2)

Set up parallel processing

# required before using any of the parallel wrappers: starts a pool of workers
setup_llm_parallel(workers = 20, verbose = TRUE)
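
As a variant, you can size the worker pool to the machine instead of hard-coding it. This is only a sketch; it assumes the future package is installed (LLMR's parallel helpers build on it):

# hypothetical variant: cap the workers at the number of available cores
n_workers <- min(20, future::availableCores())
setup_llm_parallel(workers = n_workers, verbose = TRUE)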

Create Configuration

config <- llm_config(
  provider = "openai",
  model = "gpt-4.1-nano",
  api_key = Sys.getenv("OPENAI_API_KEY"),
  max_tokens = 10  # request very few tokens; we expect a one-word answer
)

The messages

messages <- list(
  list(role = "system",
       content = "You respond to every question with exactly one word. Nothing more. Nothing less."),
  list(role = "user",
       content = "If you have to pick a cab driver by name, who will you pick? D'Shaun, Jared, or Josè?")
)
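
Before launching hundreds of parallel calls, it can help to sanity-check the config and prompt with a single sequential call. A quick sketch (see ?call_llm for the exact return value):

# optional sanity check: one sequential call before the parallel sweep
single_reply <- call_llm(config, messages)
print(single_reply)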

Define temperature values to test

temperatures <- seq(0, 1.5, 0.3)

# Repeat each temperature 40 times
all_temperatures <- rep(temperatures, each = 40)
cat("Testing temperatures:", paste(unique(all_temperatures), collapse = ", "), "\n")
cat("Total calls:", length(all_temperatures), "\n")

Let us run this now. The LLMR package offers four parallelizing wrappers. Here we keep the model config constant and vary only the temperature, so call_llm_sweep() is the right tool. The most flexible of the four is call_llm_par(), which takes pairs of (config, message) as input; a sketch of its use follows.
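
For illustration, here is a minimal sketch of the call_llm_par() interface. It assumes the function accepts a table with a list-column of configs and a list-column of messages, one row per call, and that the tibble package is available; the exact argument and column names are assumptions, so check ?call_llm_par before use:

# hypothetical sketch: one config per temperature, each paired with the same messages
configs <- lapply(c(0, 1.5), function(t) {
  llm_config(
    provider    = "openai",
    model       = "gpt-4.1-nano",
    api_key     = Sys.getenv("OPENAI_API_KEY"),
    temperature = t,
    max_tokens  = 10
  )
})
experiments <- tibble::tibble(
  config   = configs,
  messages = rep(list(messages), length(configs))
)
par_results <- call_llm_par(experiments, progress = TRUE)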

# Run the temperature sweep
cat("Starting parallel temperature sweep...\n")
start_time <- Sys.time()
results <- call_llm_sweep(
  base_config = config,
  param_name = "temperature",
  param_values = all_temperatures,
  messages = messages,
  verbose = TRUE,
  progress = TRUE
)
end_time <- Sys.time()
cat("Sweep completed in:", round(as.numeric(end_time - start_time), 2), "seconds\n")

Let us clean the output and visualize this:


results |> head()
  
# strip everything except letters (including accented letters) and spaces from response_text
results$response_text_clean <- gsub("[^a-zA-ZÀ-ÿ ]", "", results$response_text)

results |>
  ggplot(aes(factor(temperature), fill = response_text_clean)) +
  # stacked barplot of response counts at each temperature;
  # switch to position = "fill" for within-temperature proportions
  geom_bar(stat = "count") # , position = "fill"
