pickmax: Split and Coalesce Duplicated Records

Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates a completeness score for each row, and compares values within each group of duplicates. When the differences between duplicates exceed a user-defined threshold, the records are split and given unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data where duplicated IDs may reflect minor data entry inconsistencies.
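
The split-or-coalesce rule described above can be illustrated with a minimal base R sketch. This is not the package's code or API: the function name coalesce_or_split, its arguments, the "_1", "_2" ID suffixes, and the choice to measure "differences" as the number of columns with conflicting non-missing values are all assumptions made for illustration. The sketch also returns only the final data frame, not the three-element list the package describes. See the reference manual for the actual interface.

```r
# Hypothetical helper for illustration only; NOT part of pickmax's API.
# Assumption: "difference" = number of columns whose non-missing values
# disagree within a group of rows sharing the same key.
coalesce_or_split <- function(df, key, threshold = 1) {
  df[[key]] <- as.character(df[[key]])
  value_cols <- setdiff(names(df), key)

  groups <- split(df, df[[key]])
  out <- lapply(groups, function(g) {
    if (nrow(g) == 1) return(g)  # not duplicated, keep as-is

    # Completeness score: count of non-missing values per row;
    # put the most complete row first.
    completeness <- rowSums(!is.na(g[value_cols]))
    g <- g[order(-completeness), , drop = FALSE]

    # Count columns with more than one distinct non-missing value.
    n_conflicts <- sum(vapply(value_cols, function(col) {
      length(unique(stats::na.omit(g[[col]]))) > 1
    }, logical(1)))

    if (n_conflicts > threshold) {
      # Split: treat rows as distinct records and give each a unique ID.
      g[[key]] <- paste0(g[[key]], "_", seq_len(nrow(g)))
      g
    } else {
      # Coalesce: start from the most complete row and fill each column
      # with the first non-missing value found in the group.
      merged <- g[1, , drop = FALSE]
      for (col in value_cols) {
        non_missing <- g[[col]][!is.na(g[[col]])]
        if (length(non_missing) > 0) merged[[col]] <- non_missing[1]
      }
      merged
    }
  })

  result <- do.call(rbind, unname(out))
  rownames(result) <- NULL
  result
}

# Made-up example data: "A" rows differ only by missing values,
# "B" rows disagree in every column, "C" is not duplicated.
survey <- data.frame(
  id     = c("A", "A", "B", "B", "C"),
  age    = c(34, NA, 51, 27, 40),
  sex    = c(NA, "F", "M", "F", "M"),
  region = c("North", "North", "East", "West", NA),
  stringsAsFactors = FALSE
)

result <- coalesce_or_split(survey, key = "id", threshold = 1)
print(result)
```

In this example the two "A" rows complement each other and are merged into one record, while the two "B" rows conflict in three columns, exceed the threshold of 1, and are kept as separate records with new IDs.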

Version:	0.1.0
Imports:	dplyr, rlang, magrittr
Published:	2025-07-15
DOI:	10.32614/CRAN.package.pickmax
Author:	Sbonelo Chamane [aut, cre] (ORCID: 0000-0001-5350-5203), Musawenkosi Mabaso [aut], Ronel Sewpaul [aut], Sean Jooste [aut], Kutloano Skhosana [aut], Khangelani Zuma [aut]
Maintainer:	Sbonelo Chamane <SChamane at hsrc.ac.za>
License:	GPL-3
NeedsCompilation:	no
CRAN checks:	pickmax results

Documentation:

Reference manual:	pickmax.html, pickmax.pdf

Downloads:

Package source:	pickmax_0.1.0.tar.gz
Windows binaries:	r-devel: pickmax_0.1.0.zip, r-release: pickmax_0.1.0.zip, r-oldrel: pickmax_0.1.0.zip
macOS binaries:	r-release (arm64): pickmax_0.1.0.tgz, r-oldrel (arm64): pickmax_0.1.0.tgz, r-release (x86_64): pickmax_0.1.0.tgz, r-oldrel (x86_64): pickmax_0.1.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=pickmax to link to this page.