GERDA: German Election Data for R

This R package provides data on German elections since 1953, together with helpers for merging socioeconomic covariates. As of v0.6 it exposes 39 datasets.

GERDA was compiled by Vincent Heddesheimer, Florian Sichart, Andreas Wiedemann, and Hanno Hilbig. See the GERDA website and the accompanying publication: doi.org/10.1038/s41597-025-04811-5. The package is under active development; comments and bug reports are welcome at hhilbig@ucdavis.edu or via GitHub issues.

Installation

install.packages("gerda")                  # from CRAN
devtools::install_github("hhilbig/gerda")  # development version

To install the vignette along with the development version, pass build_vignettes = TRUE:

devtools::install_github("hhilbig/gerda", build_vignettes = TRUE)

Then read it with vignette("gerda"). The CRAN release ships the vignette by default.

Main functions

Data access:

Bundled data (no download required):

Merging helpers:

Party mapping:

Example

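library(gerda)
library(dplyr)

# Download a federal election dataset by name, then append the
# INKAR county covariates and the Zensus 2022 indicators.
federal <- load_gerda_web("federal_muni_harm_25") |>
  add_gerda_covariates() |>
  add_gerda_census()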

County covariates (INKAR, 1995–2022)

add_gerda_covariates() appends 30 county-level indicators to federal, state, or local election data. Variables cover demographics, GDP and sectoral structure, unemployment (overall, youth, long-term), education, income, healthcare, childcare, housing, transport, and municipal public finances. Coverage is strongest for 1998–2021; newer indicators are available only for recent years. Use gerda_covariates_codebook() for per-variable detail including original INKAR codes and missing-data rates.
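To make the idea of "appending" covariates concrete, here is a minimal base-R sketch of a left join on toy data. It is not the package's implementation: it assumes the join keys are county_code and election_year (the names the Deprecations note says work with add_gerda_covariates()), and the columns turnout and unemployment_rate, like all values, are hypothetical.

```r
# Toy county-level election results (all values are made up).
elections <- data.frame(
  county_code   = c("01001", "01001", "01002"),
  election_year = c(2017, 2021, 2021),
  turnout       = c(0.71, 0.74, 0.69),  # hypothetical column
  stringsAsFactors = FALSE
)

# Toy covariate table; `unemployment_rate` is a hypothetical column name.
covariates <- data.frame(
  county_code       = c("01001", "01002"),
  election_year     = c(2021, 2021),
  unemployment_rate = c(6.1, 4.8),
  stringsAsFactors = FALSE
)

# Left join: keep every election row, attach covariates where a match exists.
merged <- merge(
  elections, covariates,
  by = c("county_code", "election_year"),
  all.x = TRUE
)

# County-years without a covariate record come back as NA.
print(merged)
```

Under these assumptions, incomplete coverage shows up as NA in the merged rows rather than as dropped observations, so it is worth checking missing-data rates before modelling.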

Zensus 2022 (municipality-level)

add_gerda_census() appends 14 indicators from the German Zensus 2022. Because the census is a single 2022 snapshot, the same values are attached to every election year, so these variables cannot support analyses that rely on within-unit variation over time.

Indicators cover population and age structure, migration background, household size, and housing (dwellings, vacancy, ownership, rent per square metre, single-family share). Most variables are available for more than 95% of municipalities. avg_household_size_census22 is missing for about 12.5% of municipalities because Destatis suppresses small-cell values.
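The single-snapshot caveat can be illustrated with a toy panel in base R. This is a sketch under stated assumptions, not the package's code: the key muni_id and all values are hypothetical, and only avg_household_size_census22 is a name taken from the text above.

```r
# Toy municipality-by-election panel (made-up identifiers and years).
panel <- data.frame(
  muni_id       = rep(c("m1", "m2"), each = 3),  # hypothetical key
  election_year = rep(c(2013, 2017, 2021), times = 2),
  stringsAsFactors = FALSE
)

# Toy census table: one row per municipality, no year dimension.
census <- data.frame(
  muni_id                     = c("m1", "m2"),
  avg_household_size_census22 = c(2.1, 1.9),
  stringsAsFactors = FALSE
)

# Attaching the snapshot repeats the same value for every election year.
merged <- merge(panel, census, by = "muni_id", all.x = TRUE)

# Count distinct census values per municipality; 1 means no variation over time.
n_distinct <- tapply(
  merged$avg_household_size_census22,
  merged$muni_id,
  function(x) length(unique(x))
)
print(n_distinct)
```

Since every municipality carries exactly one value, any estimator that uses only within-unit variation (for example, one with municipality fixed effects) cannot identify a coefficient on these variables.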

Deprecations

As of v0.6, federal_cty_unharm exposes both the upstream columns (ags, year) and the canonical GERDA county-level names (county_code, election_year). The ags and year aliases will be removed in v0.7. New code should use county_code and election_year, which match the rest of the county-level datasets and work directly with add_gerda_covariates().
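A migration might look like the following sketch, which uses a toy stand-in for federal_cty_unharm rather than the real dataset. The four key columns follow the description above; the turnout column and all values are hypothetical, and dropping the aliases only simulates the announced v0.7 change.

```r
# Toy stand-in for the v0.6 layout: upstream aliases and canonical names coexist.
cty <- data.frame(
  ags           = c("01001", "01001", "01002"),
  year          = c(2017, 2021, 2021),
  county_code   = c("01001", "01001", "01002"),
  election_year = c(2017, 2021, 2021),
  turnout       = c(0.71, 0.74, 0.69),  # hypothetical column
  stringsAsFactors = FALSE
)

# Old style: relies on the aliases scheduled for removal.
old <- cty[cty$year == 2021, c("ags", "turnout")]

# New style: uses the canonical county-level names.
new <- cty[cty$election_year == 2021, c("county_code", "turnout")]

# Simulate the announced v0.7 layout by dropping the aliases.
cty_v07 <- cty[, setdiff(names(cty), c("ags", "year"))]

# The new-style code runs unchanged on the simulated v0.7 layout.
new_v07 <- cty_v07[cty_v07$election_year == 2021, c("county_code", "turnout")]
print(new_v07)
```

In this toy setup the old-style and new-style selections return the same rows, and only the new-style version still works once the aliases are gone.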
