| Type: | Package | 
| Title: | A Curated Collection of Digestive System and Gastrointestinal Disease Datasets | 
| Version: | 0.2.0 | 
| Maintainer: | Renzo Caceres Rossi <arenzocaceresrossi@gmail.com> | 
| Description: | Provides an extensive and curated collection of datasets related to the digestive system, stomach, intestines, liver, pancreas, and associated diseases. This package includes clinical trials, observational studies, experimental datasets, cohort data, and case series involving gastrointestinal disorders such as gastritis, ulcers, pancreatitis, liver cirrhosis, colon cancer, colorectal conditions, Helicobacter pylori infection, irritable bowel syndrome, intestinal infections, and post-surgical outcomes. The datasets support educational, clinical, and research applications in gastroenterology, public health, epidemiology, and biomedical sciences. Designed for researchers, clinicians, data scientists, students, and educators interested in digestive diseases, the package facilitates reproducible analysis, modeling, and hypothesis testing using real-world and historical data. | 
| License: | GPL-3 | 
| Language: | en | 
| URL: | https://github.com/lightbluetitan/digestivedatasets, https://lightbluetitan.github.io/digestivedatasets/ | 
| BugReports: | https://github.com/lightbluetitan/digestivedatasets/issues | 
| Encoding: | UTF-8 | 
| LazyData: | true | 
| Suggests: | ggplot2, testthat (≥ 3.0.0), dplyr, knitr, rmarkdown | 
| Depends: | R (≥ 4.1.0) | 
| Imports: | utils | 
| RoxygenNote: | 7.3.2 | 
| Config/testthat/edition: | 3 | 
| VignetteBuilder: | knitr | 
| NeedsCompilation: | no | 
| Packaged: | 2025-09-06 08:41:56 UTC; Renzo | 
| Author: | Renzo Caceres Rossi | 
| Repository: | CRAN | 
| Date/Publication: | 2025-09-07 22:20:09 UTC | 
DigestiveDataSets: A Curated Collection of Digestive System and Gastrointestinal Disease Datasets
Description
This package provides a wide variety of datasets focused on the digestive system, stomach, intestines, liver, pancreas, and associated diseases, including clinical trials, observational studies, experimental datasets, cohort data, and case series involving gastrointestinal disorders such as gastritis, ulcers, pancreatitis, liver cirrhosis, colon cancer, colorectal conditions, Helicobacter pylori infection, irritable bowel syndrome, intestinal infections, and post-surgical outcomes.
Details
DigestiveDataSets: A Curated Collection of Digestive System and Gastrointestinal Disease Datasets
Author(s)
Maintainer: Renzo Caceres Rossi arenzocaceresrossi@gmail.com
See Also
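Useful links:
- https://github.com/lightbluetitan/digestivedatasets
- https://lightbluetitan.github.io/digestivedatasets/
- Report bugs at https://github.com/lightbluetitan/digestivedatasets/issues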
Anorexia Weight Change
Description
This dataset, anorexia_weight_change_df, is a data frame containing weight change data for young female anorexia patients. It includes pre- and post-treatment weights, along with the type of treatment administered.
Usage
data(anorexia_weight_change_df)
Format
A data frame with 72 observations and 3 variables:
- Treat
 Factor indicating the treatment type (3 levels)
- Prewt
 Numeric vector indicating the patient's weight before treatment (in pounds, as documented in MASS)
- Postwt
 Numeric vector indicating the patient's weight after treatment (in pounds, as documented in MASS)
Details
The dataset name has been kept as 'anorexia_weight_change_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the MASS package version 7.3-65.
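Because the dataset records a weight before and after treatment for each patient, the quantity of interest is usually the difference between the two. The sketch below is only an illustration: `toy` is a hypothetical miniature with the documented column names, its values and treatment labels are made up, and the same steps should apply to anorexia_weight_change_df once the package is attached.

```r
# Hypothetical miniature with the documented columns; values are made up.
toy <- data.frame(
  Treat  = factor(c("A", "A", "B", "B")),  # placeholder treatment labels
  Prewt  = c(80, 82, 79, 85),
  Postwt = c(84, 81, 83, 90)
)

# Weight change per patient: positive values mean weight was gained.
toy$change <- toy$Postwt - toy$Prewt

# Mean change within each treatment group.
mean_change <- aggregate(change ~ Treat, data = toy, FUN = mean)
mean_change
```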
Recurrent Bleeding from Ulcers
Description
This dataset, bleeding_ulcers_df, is a data frame containing data from 40 experiments designed to compare a new surgery for stomach ulcer with an older surgery.
Usage
data(bleeding_ulcers_df)
Format
A data frame with 80 observations and 9 variables:
- author
 Factor indicating the author of the study (20 levels)
- year
 Integer indicating the year of the study
- quality
 Integer representing the quality score of the experiment
- age
 Integer indicating the age of the patients
- r
 Integer indicating the number of recurrent bleeds
- m
 Integer indicating the total number of patients
- bleed
 Integer indicating bleeding events
- treat
 Factor indicating treatment type (6 levels)
- table
 Factor representing the experiment table (40 levels)
Details
The dataset name has been kept as 'bleeding_ulcers_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the SMPracticals package version 1.4-3.1.
Campylobacter Infections Time Series
Description
This dataset, campylobacter_infections_ts, is a time series object containing the number of cases of campylobacter infections in northern Quebec (Canada), recorded in four-week intervals from January 1990 to October 2000. Campylobacteriosis is an acute bacterial infectious disease attacking the digestive system.
Usage
data(campylobacter_infections_ts)
Format
A time series object ('ts') with 140 observations:
- Start
 c(1990, 1)
- End
 c(2000, 10)
- Frequency
 13 (observations per year)
Details
The dataset name has been kept as 'campylobacter_infections_ts' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'ts' indicates that the dataset is a time series object. The original content has not been modified in any way.
Source
Data taken from the tscount package version 1.4.3. Original source: Ferland, R., Latour, A. and Oraichi, D., "Integer-valued GARCH process". Journal of Time Series Analysis, 2006; 27(6): 923–942.
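The start, end, and frequency listed in the format are consistent with 140 observations: ten full years of 13 four-week periods (130) plus 10 periods in 2000. As a minimal sketch, the following builds a placeholder series of zeros (not the real counts) with the same attributes, to show how `ts()` encodes this layout.

```r
# Placeholder series: 140 zeros stand in for the real case counts.
placeholder <- ts(rep(0, 140), start = c(1990, 1), frequency = 13)

start(placeholder)      # first period: year 1990, period 1
end(placeholder)        # last period: year 2000, period 10
frequency(placeholder)  # 13 four-week periods per year
```

The accessors `start()`, `end()`, and `frequency()` can be called on campylobacter_infections_ts in the same way.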
Cholera Daily Deaths in England, 1849
Description
This dataset, cholera_deaths_1849_tbl_df, is a tibble containing daily deaths from Cholera and Diarrhaea in England for each day of the 12 months of 1849. It includes the month, cause of death, day of month, number of deaths, date, and day of week for each observation.
Usage
data(cholera_deaths_1849_tbl_df)
Format
A tibble with 730 observations and 6 variables:
- month
 Character indicating the month of observation
- cause_of_death
 Factor with 2 levels indicating cause of death (Cholera or Diarrhaea)
- day_of_month
 Character indicating the day of the month
- deaths
 Numeric value indicating the number of deaths
- date
 Date object indicating the exact date
- day_of_week
 Ordered factor with 7 levels indicating the day of week
Details
The dataset name has been kept as 'cholera_deaths_1849_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
Data taken from the HistData package version 0.9-3. Original source: Bingham P., Verlander, N. Q., Cheal M. J. (2004). "John Snow, William Farr and the 1849 outbreak of cholera that affected London: a reworking of the data highlights the importance of the water supply". Public Health, 118(6), 387–394, Table 2.
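The 730 rows correspond to one row per calendar day of 1849 for each of the two causes of death (365 × 2). The sketch below reconstructs only that date-by-cause layout, without any death counts. The object names are illustrative, and the row order of the real tibble may differ.

```r
# Every calendar day of 1849 (not a leap year, so 365 days).
days_1849 <- seq(as.Date("1849-01-01"), as.Date("1849-12-31"), by = "day")

# One row per day and cause of death, matching the documented dimensions.
grid <- expand.grid(
  date = days_1849,
  cause_of_death = c("Cholera", "Diarrhaea"),
  stringsAsFactors = TRUE
)

length(days_1849)  # 365
nrow(grid)         # 730
```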
Chemotherapy for Stage B/C Colon Cancer
Description
This dataset, colon_stageBC_chemo_df, is a data frame containing data from one of the first successful trials of adjuvant chemotherapy for stage B/C colon cancer. The dataset includes 1858 observations (with two records per patient: one for recurrence and one for death) and 16 clinical variables.
Usage
data(colon_stageBC_chemo_df)
Format
A data frame with 1858 observations and 16 variables:
- id
 Numeric patient identifier
- study
 Numeric study code
- rx
 Factor with 3 levels indicating treatment group
- sex
 Numeric gender code
- age
 Numeric age in years
- obstruct
 Numeric obstruction status
- perfor
 Numeric perforation status
- adhere
 Numeric adhesion status
- nodes
 Numeric count of lymph nodes
- status
 Numeric event status
- differ
 Numeric differentiation grade
- extent
 Numeric tumor extent
- surg
 Numeric surgery code
- node4
 Numeric node4 status
- time
 Numeric follow-up time
- etype
 Numeric event type
Details
The dataset name has been kept as 'colon_stageBC_chemo_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the OncoDataSets package version 0.1.0.
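Since each patient contributes two records, most analyses first pick one event type. The sketch below uses a hypothetical three-patient miniature (all values made up) and assumes the coding `etype == 1` for recurrence and `etype == 2` for death. That coding is not stated in this manual and should be checked against the data before use.

```r
# Hypothetical miniature with the same id/etype layout; values are made up.
toy <- data.frame(
  id     = rep(1:3, each = 2),
  etype  = rep(c(1, 2), times = 3),  # assumed coding: 1 = recurrence, 2 = death
  time   = c(100, 150, 300, 300, 50, 80),
  status = c(1, 1, 0, 0, 1, 1)
)

# Number of distinct patients.
n_patients <- length(unique(toy$id))

# Keep one record per patient: the death records under the assumed coding.
death_rows <- subset(toy, etype == 2)
nrow(death_rows)

# Patients implied by the documented row count of the full dataset.
1858 / 2
```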
Features from Colonoscopic Video
Description
This dataset, colonoscopy_features_tbl_df, is a tibble containing features extracted from 76 colonoscopic videos. Each video was recorded using both White Light (WL) and Narrow Band Imaging (NBI). The dataset includes histology results (classification ground truth), the opinion of endoscopists (4 experts and 3 beginners), and 698 features derived from patients with gastrointestinal lesions.
Usage
data(colonoscopy_features_tbl_df)
Format
A tibble with 76 observations and 7 variables:
- feature 294
 Numeric feature extracted from colonoscopic videos
- feature 441
 Numeric feature extracted from colonoscopic videos
- feature 472
 Numeric feature extracted from colonoscopic videos
- feature 486
 Numeric feature extracted from colonoscopic videos
- class_agreement
 Numeric score representing agreement among endoscopists
- missinglabel_indicator
 Numeric indicator for missing labels
- ground truth
 Character string representing the histology-based classification
Details
The dataset name has been kept as 'colonoscopy_features_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
Data taken from the gmmsslm package version 1.1.6.
PubMed Data of miRNAs in Colorectal Cancer
Description
This dataset, crc_mirnas_pubmed_tbl_df, is a tibble containing information from PubMed abstracts related to microRNAs (miRNAs) in colorectal cancer. The data provides publication metadata, article abstracts, and associated miRNAs across 508 observations with 8 variables.
Usage
data(crc_mirnas_pubmed_tbl_df)
Format
A tibble with 508 observations and 8 variables:
- PMID
 Numeric PubMed identifier
- Year
 Numeric publication year
- Title
 Character article title
- Abstract
 Character full abstract text
- Language
 Character publication language
- Type
 Character article type
- Topic
 Character research topic
- miRNA
 Character microRNA identifiers
Details
The dataset name has been kept as 'crc_mirnas_pubmed_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
Data taken from the OncoDataSets package version 0.1.0.
Cystic Fibrosis SNP
Description
This dataset, cystic_fibrosis_snps_df, is a data frame containing genetic association data for cystic fibrosis, including a case-control indicator and 23 single nucleotide polymorphisms (SNPs) with specified inter-marker distances. The dataset contains 186 observations across 24 variables.
Usage
data(cystic_fibrosis_snps_df)
Format
A data frame with 186 observations and 24 variables:
- y
 Integer case-control indicator
- loc1
 Integer SNP genotype at location 1
- loc2
 Integer SNP genotype at location 2
- loc3
 Integer SNP genotype at location 3
- loc4
 Integer SNP genotype at location 4
- loc5
 Integer SNP genotype at location 5
- loc6
 Integer SNP genotype at location 6
- loc7
 Integer SNP genotype at location 7
- loc8
 Integer SNP genotype at location 8
- loc9
 Integer SNP genotype at location 9
- loc10
 Integer SNP genotype at location 10
- loc11
 Integer SNP genotype at location 11
- loc12
 Integer SNP genotype at location 12
- loc13
 Integer SNP genotype at location 13
- loc14
 Integer SNP genotype at location 14
- loc15
 Integer SNP genotype at location 15
- loc16
 Integer SNP genotype at location 16
- loc17
 Integer SNP genotype at location 17
- loc18
 Integer SNP genotype at location 18
- loc19
 Integer SNP genotype at location 19
- loc20
 Integer SNP genotype at location 20
- loc21
 Integer SNP genotype at location 21
- loc22
 Integer SNP genotype at location 22
- loc23
 Integer SNP genotype at location 23
Details
The dataset name has been kept as 'cystic_fibrosis_snps_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the gap.datasets package version 0.0.6. Original source: Liu JS, Sabatti C, Teng J, Keats BJB, Risch N (2001). "Bayesian Analysis of Haplotypes for Linkage Disequilibrium Mapping". Genome Research, 11:1716–1724.
Digestive Cancer Survival Times
Description
This dataset, digestive_cancer_survival_df, is a data frame containing survival times (in days) of cancer patients with advanced cancer of the stomach, bronchus, colon, ovary, or breast. All patients included in this dataset received treatment that involved supplemental ascorbate.
Usage
data(digestive_cancer_survival_df)
Format
A data frame with 17 observations and 5 variables:
- stomach
 Integer values indicating survival times (in days) for patients with stomach cancer
- bronchus
 Integer values indicating survival times (in days) for patients with bronchial cancer
- colon
 Integer values indicating survival times (in days) for patients with colon cancer
- ovary
 Integer values indicating survival times (in days) for patients with ovarian cancer
- breast
 Integer values indicating survival times (in days) for patients with breast cancer
Details
The dataset name has been kept as 'digestive_cancer_survival_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the RbyExample package version 0.0.100.
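The data frame is in wide form, with one column per cancer site. If the groups differ in size, the shorter columns are presumably padded with `NA`. That padding is an assumption here, not something this manual states. A minimal sketch of reshaping such a layout to long form with base R, using a hypothetical miniature with made-up values:

```r
# Hypothetical wide miniature; values are made up, NA pads shorter groups.
toy <- data.frame(
  stomach  = c(100, 200, NA),
  bronchus = c(50, 60, 70),
  colon    = c(300, NA, NA)
)

# stack() returns one row per cell with columns 'values' and 'ind'.
long <- stack(toy)
names(long) <- c("days", "site")

# Drop the padding rows so each row is one patient.
long <- long[!is.na(long$days), ]

table(long$site)
```

The long form is what functions such as `boxplot(days ~ site, data = long)` expect.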
E. coli Infections Time Series
Description
This dataset, ecoli_infections_df, is a data frame containing the weekly number of reported disease cases caused by Escherichia coli in the state of North Rhine-Westphalia (Germany) from January 2001 to May 2013, excluding cases of EHEC and HUS.
Usage
data(ecoli_infections_df)
Format
A data frame with 646 observations and 3 variables:
- year
 Numeric value indicating the year of observation
- week
 Numeric value indicating the week of observation
- cases
 Numeric value indicating the number of reported E. coli cases
Details
The dataset name has been kept as 'ecoli_infections_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the tscount package version 1.4.3.
Gastric Cancer Clinical Trial
Description
This dataset, gastric_cancer_trial_df, is a data frame containing data from a randomized clinical trial conducted by the Gastrointestinal Tumor Study Group on patients with gastric cancer. It includes survival time, event occurrence, and group assignment.
Usage
data(gastric_cancer_trial_df)
Format
A data frame with 90 observations and 3 variables:
- time
 Numeric vector representing survival time
- event
 Numeric vector indicating event occurrence (e.g., death or relapse)
- group
 Factor with 2 levels representing treatment groups
Details
The dataset name has been kept as 'gastric_cancer_trial_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the coin package version 1.4-3.
Gastrointestinal Damage Prevention
Description
This dataset, gi_damage_prevention_df, is a data frame containing results from four randomised clinical trials on the prevention of gastrointestinal damage by Misoprostol, reported by Lanza et al. (1987–1989).
Usage
data(gi_damage_prevention_df)
Format
A data frame with 198 observations and 3 variables:
- study
 Factor indicating the clinical trial (4 levels)
- treatment
 Factor indicating the treatment group (2 levels: control or Misoprostol)
- classification
 Ordered factor indicating the degree of gastrointestinal damage (5 levels)
Details
The dataset name has been kept as 'gi_damage_prevention_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the HSAUR3 package version 1.0-15.
Helicobacter pylori Infection in Preschoolers
Description
This dataset, helicobacter_children_tbl_df, is a tibble containing the prevalence of Helicobacter pylori infection in preschool children according to parental history of duodenal or gastric ulcer.
Usage
data(helicobacter_children_tbl_df)
Format
A tibble with 863 observations and 2 variables:
- ulcer
 Factor with 2 levels indicating parental history of duodenal or gastric ulcer
- infected
 Factor with 2 levels indicating Helicobacter pylori infection status
Details
The dataset name has been kept as 'helicobacter_children_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
Data taken from the pubh package version 2.0.0.
Colic Horse Surgery
Description
This dataset, horse_colic_surgery_df, is a data frame containing clinical observations of horses with colic, where the primary task is to determine if the lesion requires surgery. The data consists of 300 cases with 31 clinical variables, modified from the original UCI repository version with adjusted factor levels.
Usage
data(horse_colic_surgery_df)
Format
A data frame with 300 observations and 31 variables:
- surgery
 Factor with 2 levels indicating surgical requirement
- age
 Factor with 1 level (age group)
- hospitalID
 Integer hospital identifier
- temp_rectal
 Numeric rectal temperature
- pulse
 Numeric pulse rate
- respiratory_rate
 Numeric respiratory rate
- temp_extreme
 Factor with 4 levels (temperature extremes)
- pulse_peripheral
 Factor with 4 levels (peripheral pulse)
- capillayr_refill_time
 Factor with 3 levels (capillary refill time)
- pain
 Numeric pain score
- peristalsis
 Numeric peristalsis measure
- abdominal_distension
 Numeric distension score
- nasogastric_tube
 Numeric tube measure
- nasogastric_reflux
 Numeric reflux quantity
- nasogastric_reflux_PH
 Numeric reflux pH
- rectal_examination
 Numeric exam result
- abdomen
 Numeric abdomen assessment
- cell_volume
 Numeric cell volume
- protein
 Numeric protein level
- abdominocentesis_appearance
 Numeric appearance score
- abdomcentesis_protein
 Numeric protein measure
- outcome
 Factor with 3 levels (outcome status)
- surgical_lesion
 Factor with 2 levels (lesion type)
- lesion_type1
 Factor with 60 levels (primary lesion type)
- lesion_type2
 Integer secondary lesion code
- lesion_type3
 Integer tertiary lesion code
- cp_data
 Factor with 2 levels (CP data)
- temp_extreme_ordered
 Ordered factor with 4 levels (temperature)
- temp_extreme_num
 Numeric temperature measure
- mucous_membranes_col
 Factor with 6 levels (membrane color)
- mucous_membranes_group
 Factor with 5 levels (membrane group)
Details
The dataset name has been kept as 'horse_colic_surgery_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way beyond factor level adjustments.
Source
Data taken from the VIM package version 6.2.2 (originally from UCI repository).
Studies on CAM for Irritable Bowel Syndrome
Description
This dataset, ibs_cam_trials_df, is a data frame containing results from 19 clinical trials examining complementary and alternative medicine (CAM) interventions for irritable bowel syndrome (IBS). The dataset includes 12 variables characterizing each trial and its outcomes.
Usage
data(ibs_cam_trials_df)
Format
A data frame with 19 observations and 12 variables:
- id
 Integer trial identifier
- study
 Character study name/location
- year
 Integer publication year
- country
 Character country where study was conducted
- ibs.crit
 Character IBS diagnostic criteria used
- days
 Integer study duration in days
- visits
 Integer number of study visits
- jadad
 Integer Jadad score for study quality
- x.a
 Integer active treatment events
- n.a
 Integer active treatment sample size
- x.p
 Integer placebo group events
- n.p
 Integer placebo group sample size
Details
The dataset name has been kept as 'ibs_cam_trials_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the metadat package version 1.4-0.
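The event counts and sample sizes in the two arms are what a meta-analysis needs to compute a per-trial effect size. As a minimal sketch, the following computes the risk ratio and the usual large-sample standard error of its logarithm for a hypothetical two-trial miniature. The trial names and counts are made up, and what counts as an "event" in the real data should be checked in the metadat documentation.

```r
# Hypothetical miniature with the documented count columns; values are made up.
toy <- data.frame(
  study = c("Trial A", "Trial B"),
  x.a = c(30, 12), n.a = c(50, 40),  # active arm: events, sample size
  x.p = c(20, 10), n.p = c(50, 40)   # placebo arm: events, sample size
)

# Event proportion in each arm.
risk_active  <- toy$x.a / toy$n.a
risk_placebo <- toy$x.p / toy$n.p

# Risk ratio (active vs placebo) and standard error of log(RR).
toy$rr        <- risk_active / risk_placebo
toy$se_log_rr <- sqrt(1 / toy$x.a - 1 / toy$n.a + 1 / toy$x.p - 1 / toy$n.p)

toy[, c("study", "rr", "se_log_rr")]
```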
SmartPill Intestinal Transit
Description
This dataset, intestinal_smartpill_df, is a data frame from a prospective cohort study evaluating gastric emptying, small bowel transit time, and total intestinal transit time using a SmartPill motility capsule. The study involved 8 critically ill trauma patients and 87 healthy volunteers. The capsule wirelessly transmitted pH, pressure, and temperature to a recorder attached to each subject's abdomen.
Usage
data(intestinal_smartpill_df)
Format
A data frame with 95 observations and 22 variables:
- Group
 Numeric indicator of group membership
- Gender
 Numeric indicator of gender
- Race
 Numeric code indicating racial background
- Height
 Height in centimeters
- Weight
 Weight in kilograms
- Age
 Age in years
- GE.Time
 Gastric emptying time (minutes)
- SB.Time
 Small bowel transit time (minutes)
- C.Time
 Colon transit time (minutes)
- WG.Time
 Whole gut transit time (minutes)
- S.Contractions
 Number of contractions in the stomach
- S.Sum.of.Amplitudes
 Sum of contraction amplitudes in the stomach
- S.Mean.Peak.Amplitude
 Mean peak amplitude in the stomach
- S.Mean.pH
 Mean pH level in the stomach
- SB.Contractions
 Number of contractions in the small bowel
- SB.Sum.of.Amplitudes
 Sum of contraction amplitudes in the small bowel
- SB.Mean.Peak.Amplitude
 Mean peak amplitude in the small bowel
- SB.Mean.pH
 Mean pH level in the small bowel
- Colon.Contractions
 Number of contractions in the colon
- Colon.Sum.of.Amplitudes
 Sum of contraction amplitudes in the colon
- C.Mean.Peak.Amplitude
 Mean peak amplitude in the colon
- C.Mean.pH
 Mean pH level in the colon
Details
The dataset name has been kept as 'intestinal_smartpill_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the medicaldata package version 0.2.0. Original source: Rauch et al., "Use of Wireless Utility Capsule to Determine Gastric Emptying and Small Intestinal Transit Times in Critically Ill Trauma Patients". Journal of Critical Care, 2012; 27(5): 534.e7–534.e12.
Satellite Tumors in GI Surgery
Description
This dataset, intestinal_surgery_df, is a data frame containing intestinal surgery data from 844 cancer patients. The data consists of pairs (n_i, s_i) where n_i is the number of satellites removed and s_i is the number of satellites found to be malignant.
Usage
data(intestinal_surgery_df)
Format
A data frame with 844 observations and 2 variables:
- n
 Numeric value representing the number of satellites removed
- s
 Numeric value representing the number of malignant satellites found
Details
The dataset name has been kept as 'intestinal_surgery_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the deconvolveR package version 1.2-1. Original source: Efron, B. (2016). "Empirical Bayes deconvolution estimates". Biometrika, 103(1), 1–20.
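Each row is a pair (n_i, s_i), so a natural first summary is the proportion of removed satellites that were malignant, per patient and pooled. A minimal sketch on a hypothetical three-patient miniature (values made up):

```r
# Hypothetical miniature with the documented columns; values are made up.
toy <- data.frame(
  n = c(3, 5, 10),  # satellites removed
  s = c(0, 2, 10)   # satellites found malignant
)

# Per-patient proportion of malignant satellites.
toy$prop <- toy$s / toy$n

# Pooled proportion across all patients.
pooled <- sum(toy$s) / sum(toy$n)
pooled
```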
Prednisone vs Placebo in Liver Cirrhosis
Description
This dataset, liver_cirrhosis_prednisone_df, is a data frame containing data from a randomized controlled trial comparing prednisone (n=251) versus placebo (n=237) in 488 liver cirrhosis patients. The dataset includes both survival and longitudinal measurements of prothrombin index development over time, with 2968 total observations across 9 variables.
Usage
data(liver_cirrhosis_prednisone_df)
Format
A data frame with 2968 observations and 9 variables:
- ID
 Integer patient identifier
- Time
 Numeric time measurement
- death
 Integer death indicator
- obstime
 Numeric observation time
- proth
 Integer prothrombin index value
- Trt
 Factor with 2 levels indicating treatment group (prednisone/placebo)
- start
 Numeric start time
- stop
 Numeric stop time
- event
 Numeric event indicator
Details
The dataset name has been kept as 'liver_cirrhosis_prednisone_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the JSM package version 1.0.1.
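With repeated measurements, each patient occupies several rows, and the `start`/`stop` columns delimit the interval each row covers. The sketch below uses a hypothetical two-patient miniature (values made up) to show how to count patients and pull out each patient's final interval. The interpretation of `start`, `stop`, and `event` as a counting-process layout is an assumption based on the column names.

```r
# Hypothetical miniature in the same long layout; values are made up.
toy <- data.frame(
  ID    = c(1, 1, 1, 2, 2),
  start = c(0, 0.5, 1.2, 0, 0.8),
  stop  = c(0.5, 1.2, 2.0, 0.8, 1.5),
  event = c(0, 0, 1, 0, 0)
)

# Number of distinct patients (rows are measurements, not patients).
n_patients <- length(unique(toy$ID))

# Last interval per patient: the final row carries the end-of-follow-up status.
last_rows <- toy[!duplicated(toy$ID, fromLast = TRUE), ]
last_rows

# The two documented arm sizes add up to the documented patient total.
251 + 237
```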
Ontario Lynch Syndrome Families
Description
This dataset, lynch_ontario_families_df, is a data frame containing data from 32 Lynch Syndrome families segregating mismatch repair mutations selected from the Ontario Familial Colorectal Cancer Registry. The dataset includes 765 individuals (both probands and relatives) with 11 variables per observation.
Usage
data(lynch_ontario_families_df)
Format
A data frame with 765 observations and 11 variables:
- famID
 Integer family identifier
- indID
 Integer individual identifier
- fatherID
 Integer father's identifier
- motherID
 Integer mother's identifier
- gender
 Integer gender code
- status
 Integer disease status
- time
 Integer time variable
- currentage
 Integer current age
- mgene
 Integer mutation gene status
- proband
 Integer proband indicator
- relation
 Integer relationship code
Details
The dataset name has been kept as 'lynch_ontario_families_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the FamEvent package version 3.2.
Norovirus Outbreak in Derbyshire
Description
This dataset, norovirus_derbyshire_df, is a data frame describing an outbreak of norovirus in the summer of 2001 in a primary school and nursery in Derbyshire, England. It contains 492 observations across 5 variables tracking illness patterns among students.
Usage
data(norovirus_derbyshire_df)
Format
A data frame with 492 observations and 5 variables:
- class
 Factor with 15 levels representing school classes
- day_absent
 Integer day of absence
- start_illness
 Integer day when illness started
- end_illness
 Integer day when illness ended
- day_vomiting
 Integer day when vomiting occurred
Details
The dataset name has been kept as 'norovirus_derbyshire_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the outbreaks package version 1.9.0. Original source: O'Neill and Marks (2005).
Pancreatic Cancer Clinical Trial
Description
This dataset, pancreatic_cancer_df, is a data frame containing data from a Phase II clinical trial of patients with locally advanced or metastatic pancreatic cancer. It includes time-to-event data for disease progression and death, as well as staging information.
Usage
data(pancreatic_cancer_df)
Format
A data frame with 41 observations and 4 variables:
- stage
 Factor indicating disease stage (locally advanced or metastatic)
- onstudy
 Factor indicating time (in days) from enrollment
- progression
 Factor indicating time (in days) to disease progression
- death
 Factor indicating time (in days) to death
Details
The dataset name has been kept as 'pancreatic_cancer_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the asaur package version 0.50.
Mayo Clinic Primary Biliary Cirrhosis
Description
This dataset, pbc_mayo_survival_df, is a data frame containing data from a randomized controlled trial conducted at Mayo Clinic from 1974 to 1984, studying the progression of primary biliary cirrhosis. The dataset includes both survival and longitudinal measurements with 1945 observations across 16 clinical variables.
Usage
data(pbc_mayo_survival_df)
Format
A data frame with 1945 observations and 16 variables:
- ID
 Integer patient identifier
- Time
 Numeric time measurement
- death
 Numeric death indicator
- obstime
 Numeric observation time
- serBilir
 Numeric serum bilirubin measurement
- albumin
 Numeric serum albumin measurement
- alkaline
 Integer alkaline phosphatase level
- platelets
 Integer platelet count
- drug
 Factor with 2 levels indicating treatment group
- age
 Numeric age in years
- gender
 Factor with 2 levels indicating patient sex
- ascites
 Factor with 2 levels indicating presence of ascites
- hepatom
 Factor with 2 levels indicating presence of hepatomegaly
- start
 Numeric start time for interval
- stop
 Numeric stop time for interval
- event
 Numeric event indicator
Details
The dataset name has been kept as 'pbc_mayo_survival_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the JSM package version 1.0.1.
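The start, stop, and event columns suggest a counting-process layout, with several interval rows per patient. The source does not spell this out, so the sketch below treats it as an assumption and uses an invented toy data frame with a subset of the column names.

```r
# Toy stand-in using a subset of the column names of pbc_mayo_survival_df.
# All values are invented for illustration only.
toy <- data.frame(
  ID    = c(1L, 1L, 1L, 2L, 2L),
  start = c(0.0, 0.5, 1.0, 0.0, 0.6),
  stop  = c(0.5, 1.0, 1.4, 0.6, 2.0),
  event = c(0, 0, 1, 0, 0)
)

# Number of interval rows (repeated measurements) per patient.
rows_per_patient <- table(toy$ID)

# Sanity check: every interval should end after it starts.
intervals_ok <- all(toy$stop > toy$start)

# One summary row per patient: end of follow-up and whether an event occurred.
per_patient <- aggregate(cbind(stop, event) ~ ID, data = toy, FUN = max)
per_patient
```

The same three steps can be run on pbc_mayo_survival_df once the package is loaded, provided the columns follow the assumed layout.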
Indomethacin for Post-ERCP Pancreatitis
Description
This dataset, post_ercp_pancreatitis_tbl_df, is a tibble containing results from a randomized, placebo-controlled, prospective 2-arm trial of rectal indomethacin (100 mg) versus placebo to prevent post-ERCP pancreatitis in 602 participants, as reported by Elmunzer, Higgins, et al. (2012) in the New England Journal of Medicine.
Usage
data(post_ercp_pancreatitis_tbl_df)
Format
A tibble with 602 observations and 33 variables:
- id
 Numeric subject identifier
- site
 Factor indicating study site (4 levels)
- age
 Numeric age of the participant
- risk
 Numeric risk score
- gender
 Factor indicating gender (2 levels)
- outcome
 Factor indicating the study outcome, post-ERCP pancreatitis (2 levels)
- sod
 Factor indicating presence of sphincter of Oddi dysfunction (2 levels)
- pep
 Factor indicating previous post-ERCP pancreatitis (2 levels)
- recpanc
 Factor indicating recurrent pancreatitis (2 levels)
- psphinc
 Factor indicating pancreatic sphincterotomy (2 levels)
- precut
 Factor indicating precut sphincterotomy (2 levels)
- difcan
 Factor indicating difficult cannulation (2 levels)
- pneudil
 Factor indicating pneumatic dilation (2 levels)
- amp
 Factor indicating ampullary interventions (2 levels)
- paninj
 Factor indicating contrast injection into the pancreatic duct (2 levels)
- acinar
 Factor indicating acinarization (2 levels)
- brush
 Factor indicating brushing procedures (2 levels)
- asa81
 Factor indicating aspirin (ASA) 81 mg use (3 levels)
- asa325
 Factor indicating ASA 325 mg use (3 levels)
- asa
 Factor indicating ASA status (3 levels)
- prophystent
 Factor indicating prophylactic stent placement (2 levels)
- therastent
 Factor indicating therapeutic stent use (2 levels)
- pdstent
 Factor indicating pancreatic duct stent (2 levels)
- sodsom
 Factor indicating whether sphincter of Oddi manometry was performed (2 levels)
- bsphinc
 Factor indicating biliary sphincterotomy (2 levels)
- bstent
 Factor indicating biliary stent (2 levels)
- chole
 Factor indicating cholecystectomy (2 levels)
- pbmal
 Factor indicating presence of pancreaticobiliary malignancy (2 levels)
- train
 Factor indicating if performed by trainee (2 levels)
- status
 Factor indicating outpatient status (2 levels)
- type
 Factor indicating sphincter of Oddi dysfunction type (4 levels)
- rx
 Factor indicating treatment group: placebo or indomethacin (2 levels)
- bleed
 Numeric bleeding indicator
Details
The dataset name has been kept as 'post_ercp_pancreatitis_tbl_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'tbl_df' indicates that the dataset is a tibble. The original content has not been modified in any way.
Source
Data taken from the medicaldata package version 0.2.0.
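A natural first look at this trial is the outcome rate in each treatment arm. The sketch below is one way to compute it and makes no claim about the published analysis. outcome_rate_by_arm is a hypothetical helper, the toy values are invented, and the level labels ("0_placebo", "1_yes", and so on) are assumptions that should be checked with levels() on the real data.

```r
# Hypothetical helper: proportion of participants with the outcome in each arm.
# `yes_level` is the factor level meaning "outcome occurred" (an assumption).
outcome_rate_by_arm <- function(df, yes_level) {
  tapply(df$outcome == yes_level, df$rx, mean)
}

# Toy stand-in with invented values; the level labels are assumed.
toy <- data.frame(
  rx      = factor(rep(c("0_placebo", "1_indomethacin"), each = 4)),
  outcome = factor(c("1_yes", "1_yes", "0_no", "0_no",
                     "1_yes", "0_no", "0_no", "0_no"))
)

rates <- outcome_rate_by_arm(toy, yes_level = "1_yes")

# Risk difference: indomethacin arm minus placebo arm.
risk_difference <- unname(rates["1_indomethacin"] - rates["0_placebo"])
risk_difference
```

With the package loaded, the helper can be called on post_ercp_pancreatitis_tbl_df in the same way, after confirming which level of outcome marks an event.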
H2 Antagonists in UGIB
Description
This dataset, ugi_bleeding_df, is a data frame containing results from 27 studies examining the effectiveness of histamine H2 antagonists (cimetidine or ranitidine) in treating acute upper gastrointestinal hemorrhage, with 14 variables per study.
Usage
data(ugi_bleeding_df)
Format
A data frame with 27 observations and 14 variables:
- id
 Integer study identifier
- trial
 Character trial name/location
- year
 Integer publication year
- ref
 Integer reference number
- trt
 Character treatment description
- ctrl
 Character control description
- nti
 Integer treatment group sample size
- b.xti
 Integer treatment group bleeding events
- o.xti
 Integer number of treatment group patients in need of operation
- d.xti
 Integer treatment group deaths
- nci
 Integer control group sample size
- b.xci
 Integer control group bleeding events
- o.xci
 Integer number of control group patients in need of operation
- d.xci
 Integer control group deaths
Details
The dataset name has been kept as 'ugi_bleeding_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the metadat package version 1.4-0.
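The group sizes and event counts above are enough to compute a per-study effect size. The following sketch computes the log odds ratio for bleeding and its standard error in base R. log_or_bleeding is a hypothetical helper, the toy rows are invented, and the 0.5 continuity correction for studies with a zero cell is one common convention, not something the source prescribes.

```r
# Hypothetical helper: per-study log odds ratio for bleeding (treatment vs
# control) and its standard error. Adds 0.5 to all four cells of a study's
# 2x2 table when any of them is zero.
log_or_bleeding <- function(df) {
  ev_t  <- df$b.xti            # bleeding events, treatment group
  non_t <- df$nti - df$b.xti   # non-events, treatment group
  ev_c  <- df$b.xci            # bleeding events, control group
  non_c <- df$nci - df$b.xci   # non-events, control group

  has_zero <- (ev_t == 0) | (non_t == 0) | (ev_c == 0) | (non_c == 0)
  adj <- ifelse(has_zero, 0.5, 0)
  ev_t <- ev_t + adj; non_t <- non_t + adj
  ev_c <- ev_c + adj; non_c <- non_c + adj

  data.frame(
    yi  = log((ev_t * non_c) / (non_t * ev_c)),
    sei = sqrt(1 / ev_t + 1 / non_t + 1 / ev_c + 1 / non_c)
  )
}

# Toy stand-in with invented counts; the second study has a zero cell.
toy <- data.frame(
  nti = c(20L, 30L), b.xti = c(5L, 0L),
  nci = c(20L, 30L), b.xci = c(10L, 3L)
)
res <- log_or_bleeding(toy)
res
```

If a count is missing (NA) for a study, the helper returns NA for that study rather than dropping it, so the output always has one row per input row.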
View Available Datasets in DigestiveDataSets
Description
This function lists all datasets available in the 'DigestiveDataSets' package. If the package is not loaded, the function stops with an error message. If no datasets are available, it shows a message and returns an empty character vector.
Usage
view_datasets_DigestiveDataSets()
Value
A character vector with the names of the available datasets. If no datasets are found, it returns an empty character vector.
Examples
if (requireNamespace("DigestiveDataSets", quietly = TRUE)) {
  library(DigestiveDataSets)
  view_datasets_DigestiveDataSets()
}
Obese Patient Weight Loss Data
Description
This dataset, weight_loss_df, is a data frame containing the weight, in kilograms, of an obese patient measured at 52 time points over an 8-month period as part of a weight rehabilitation programme.
Usage
data(weight_loss_df)
Format
A data frame with 52 observations and 2 variables:
- Days
 Integer vector indicating the number of days since the beginning of the programme
- Weight
 Numeric vector indicating the weight (in kilograms) of the patient at each time point
Details
The dataset name has been kept as 'weight_loss_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the DigestiveDataSets package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Source
Data taken from the MASS package version 7.3-65.
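Since the data are described as an unmodified copy of the MASS data, the sketch below uses MASS::wtloss as a stand-in so that it runs without DigestiveDataSets installed; weight_loss_df can be substituted once the package is loaded. The asymptotic decay model and starting values follow the example given in the MASS documentation for wtloss and are shown here only as an illustration.

```r
# MASS ships with standard R installations; wtloss is the upstream copy
# of the data described above.
library(MASS)

# Asymptotic decay model: Weight = b0 + b1 * 2^(-Days / th)
#   b0 = long-run (asymptotic) weight in kg
#   b1 = excess weight above b0 at day 0, in kg
#   th = number of days needed to lose half of the remaining excess weight
fit <- nls(
  Weight ~ b0 + b1 * 2^(-Days / th),
  data  = wtloss,
  start = list(b0 = 90, b1 = 95, th = 120)
)

coef(fit)
```

Writing the decay as a power of 2 makes th directly interpretable as a half-life in days, which is easier to read than a rate constant.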