mintyr is a high-performance data processing toolkit
designed specifically for animal breeding and genomic
selection. Leveraging the zero-copy and multi-threading
capabilities of data.table, it significantly simplifies the
construction of automated data pipelines in large-scale commercial
breeding programs (e.g., coordinating data across nucleus and multiplier
farms, or handling multi-trait growth test records).
The package is not only highly optimized for iterative analysis
workflows with the ASReml-R package (supporting dynamic
modeling and multi-trait/multi-breed nested grouping), but is also
capable of generating and batch-exporting formatted phenotypic data
files required for automated pipeline analyses in other mainstream
command-line breeding software (e.g., HIBLUP, DMU).
mintyr covers five critical stages in the lifecycle of
breeding data analysis:
High-Performance Data I/O (import_xlsx, import_csv, export_xlsx)
A transparent round-trip for multi-file, multi-sheet tabular data:
import many files into one tidy data.table, transform
freely, then write the original file/sheet structure back out, with no
bookkeeping required.
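
The round-trip idea can be sketched with plain data.table calls. This is only an illustration of the concept under stated assumptions, not mintyr's own API: the CSV files, the `file_name` tracking column, and the directory names are all hypothetical, and CSV stands in for the multi-sheet .xlsx case.

```r
library(data.table)

# Hypothetical demo input: two small CSV files standing in for per-farm records.
in_dir <- file.path(tempdir(), "io_sketch_in")
dir.create(in_dir, showWarnings = FALSE)
fwrite(data.table(id = 1:2, weight = c(101.5, 98.2)), file.path(in_dir, "farm_a.csv"))
fwrite(data.table(id = 3:4, weight = c(110.0, 95.7)), file.path(in_dir, "farm_b.csv"))

# Import: read every file and stack the results; idcol records each row's source.
files <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
names(files) <- tools::file_path_sans_ext(basename(files))
dt <- rbindlist(lapply(files, fread), idcol = "file_name")

# Transform freely on the combined table.
dt[, weight_g := weight * 1000]

# Export: write one file per source, dropping the tracking column first.
out_dir <- file.path(tempdir(), "io_sketch_out")
dir.create(out_dir, showWarnings = FALSE)
for (nm in unique(dt$file_name)) {
  part <- dt[file_name == nm]
  part[, file_name := NULL]
  fwrite(part, file.path(out_dir, paste0(nm, ".csv")))
}
```

Because the source name travels with every row, the split on export needs no separate lookup table, which is the "no bookkeeping" property described above.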

- import_xlsx, import_csv: native support for merging multiple
files and sheets simultaneously, with source tracking columns
(excel_name, sheet_name) appended
automatically to prevent data confusion across different farms or
batches. In-place data.table conversion keeps the memory
footprint minimal, and import_xlsx can spread the per-sheet
parse across CPU cores on demand (opt-in via workers): a
fork pool on Linux/macOS, a PSOCK cluster on Windows.
- export_xlsx: the round-trip
companion, where a single path argument decides the destination.
A directory writes one .xlsx per
excel_name value (one sheet per sheet_name); a
.xlsx file path writes everything into one
workbook. Worksheet splitting follows the data automatically, and the
tracking columns are stripped by default so exported sheets match the
originals.

Automated Data Reshaping & Nesting (w2l_nest, c2p_nest, r2p_nest)
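
As a rough illustration of what wide-to-long reshaping with nesting by a grouping variable looks like, here is a sketch in plain data.table. It does not call mintyr's functions or claim to reproduce their output format; the table, trait names, and column names are hypothetical.

```r
library(data.table)

# Hypothetical wide phenotype table: one row per animal, one column per trait.
wide <- data.table(
  id    = 1:4,
  breed = c("Duroc", "Duroc", "Landrace", "Landrace"),
  adg   = c(850, 910, 780, 820),
  bf    = c(11.2, 10.5, 13.1, 12.4)
)

# Wide to long: one row per animal-trait combination.
long <- melt(
  wide,
  id.vars       = c("id", "breed"),
  measure.vars  = c("adg", "bf"),
  variable.name = "trait",
  value.name    = "value"
)

# Nest: one row per trait-by-breed group, with that group's records
# stored as a data.table inside a list column.
nested <- long[, .(data = list(copy(.SD))), by = .(trait, breed)]
```

Each row of `nested` can then be passed to a model-fitting function in a loop, which is the usual reason for nesting by trait and group.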

- c2p_nest: Column-to-pairs nested transformation that
automatically renames feature columns, providing standard uniform inputs
for iterative multi-trait genetic correlation evaluations.
- w2l_nest / w2l_split: Wide-to-long format
transformations with subsetting and nesting by grouping variables (e.g.,
farm, breed, or line).

Cross-Validation & Model Evaluation (split_cv, nest_cv)
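
To make the cross-validation stage concrete, the sketch below builds k-fold training and validation sets as a nested data.table. It assumes a simple random k-fold scheme and uses plain data.table rather than mintyr's functions; the data, the `fold` column, and the output layout are hypothetical.

```r
library(data.table)

set.seed(42)  # make the fold assignment reproducible

# Hypothetical phenotype records for 20 animals.
dt <- data.table(id = 1:20, y = rnorm(20))

# Assign each record to one of k folds at random, with near-equal fold sizes.
k <- 5
dt[, fold := sample(rep_len(seq_len(k), .N))]

# One row per fold: the held-out validation set and the remaining training set.
cv <- rbindlist(lapply(seq_len(k), function(f) {
  data.table(
    fold       = f,
    train      = list(dt[fold != f]),
    validation = list(dt[fold == f])
  )
}))
```

A model would be fitted on each `train` element and its predictions compared with the observed values in the matching `validation` element.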

data.table structures,
facilitating the evaluation of breeding value prediction accuracy
(GP).

Batch Exporting for Breeding Software (export_nest, export_list)
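
A minimal sketch of group-wise export to a nested directory tree, written with plain data.table. It is not mintyr's implementation: the grouping columns, the output root, and the tab-separated format are assumptions chosen for illustration.

```r
library(data.table)

# Hypothetical records with two grouping variables.
dt <- data.table(
  Line  = c("L1", "L1", "L2", "L2"),
  Breed = c("Duroc", "Landrace", "Duroc", "Duroc"),
  id    = 1:4,
  adg   = c(850, 780, 910, 820)
)

out_root <- file.path(tempdir(), "export_sketch")

# For each Line-by-Breed group, create <out_root>/<Line>/<Breed>/ and write
# that group's records (without the grouping columns) to data.txt.
paths <- dt[, {
  dir_path <- file.path(out_root, Line, Breed)
  dir.create(dir_path, recursive = TRUE, showWarnings = FALSE)
  file_path <- file.path(dir_path, "data.txt")
  fwrite(.SD, file_path, sep = "\t")
  .(path = file_path)
}, by = .(Line, Breed)]
```

The returned `paths` table lists one file per group, which a driver script could hand to a command-line program one at a time.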

tempdir()/Line/Breed/data.txt), providing seamless
text-file preparation to bridge the gap with command-line driven
breeding evaluation software like HIBLUP and DMU.

Phenotypic Statistics & Preprocessing (top_perc, format_digits, get_path_info)

You can install this package from either CRAN or GitHub:
### From CRAN
install.packages("mintyr")
### From GitHub
pak::pak("tony2015116/mintyr")

Special thanks to AI assistance for helping transform the initial
concepts and inspirations for the mintyr package into
reality. This contribution has been invaluable in refining ideas,
improving code structure, and crafting documentation.