Type: Package
Title: Efficiently Access Pro Golf Data
Version: 0.1.5
Description: Fetch Professional Golfers' Association (PGA) Tour tournament data from ESPN <https://www.espn.com/golf/>, including leaderboards and hole-by-hole scoring. Data is returned in tidy tibble format ready for analysis. Supports local storage via RDS or 'Apache Arrow' Parquet files for fast repeated access. Designed for golf analytics, data journalism, and fantasy sports research.
License: MIT + file LICENSE
Depends: R (≥ 4.0)
Encoding: UTF-8
RoxygenNote: 7.3.3
Imports: httr2, dplyr, tibble
Suggests: testthat (≥ 3.0.0), arrow, ggplot2
Config/testthat/edition: 3
URL: https://github.com/array-carpenter/golfastr
BugReports: https://github.com/array-carpenter/golfastr/issues
NeedsCompilation: no
Packaged: 2026-02-10 01:05:18 UTC; raycarpenter
Author: Ray Carpenter [aut, cre, cph]
Maintainer: Ray Carpenter <raymondcarpenter1@gmail.com>
Repository: CRAN
Date/Publication: 2026-02-12 08:00:08 UTC

golfastr: Efficiently Access Pro Golf Data

Description

Fetch Professional Golfers' Association (PGA) Tour tournament data from ESPN <https://www.espn.com/golf/>, including leaderboards and hole-by-hole scoring. Data is returned in tidy tibble format ready for analysis. Supports local storage via RDS or 'Apache Arrow' Parquet files for fast repeated access. Designed for golf analytics, data journalism, and fantasy sports research.

Author(s)

Maintainer: Ray Carpenter <raymondcarpenter1@gmail.com> [copyright holder]

See Also

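Useful links:

https://github.com/array-carpenter/golfastr

Report bugs at https://github.com/array-carpenter/golfastr/issues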


ESPN API Request Helpers

Description

Internal functions for making requests to ESPN's Golf APIs. Supports multiple tours via the tour parameter.


Build Season Data File

Description

Incrementally load all tournaments for a season into a local file. Skips tournaments already saved. Safe to interrupt and resume.

Usage

build_season(year, file_path, tour = "pga")

Arguments

year

Season year

file_path

Path to data file (.rds or .parquet). Must be specified by user.

tour

Tour name (default: "pga")

Value

Invisibly returns number of tournaments added (integer).

Examples

## Not run: 
# Build 2025 season (run multiple times if needed)
build_season(2025, file_path = tempfile(fileext = ".rds"))

## End(Not run)
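
The skip-and-resume behaviour described above can be illustrated with a minimal base R sketch. It is not the package's implementation: `fetch_event()` and `build_incremental()` are hypothetical names, the fetcher is a stub that makes no network request, and the event IDs are made up.

```r
# Hypothetical stand-in for a network fetch; returns one row per event.
fetch_event <- function(event_id) {
  data.frame(event_id = event_id, player = "Example Player",
             stringsAsFactors = FALSE)
}

# Hypothetical helper: fetch only the events not yet stored in file_path.
build_incremental <- function(event_ids, file_path, fetch = fetch_event) {
  # Start from whatever is already on disk, if anything.
  existing <- if (file.exists(file_path)) readRDS(file_path) else NULL
  done <- if (is.null(existing)) character(0) else unique(existing$event_id)
  todo <- setdiff(event_ids, done)

  added <- 0L
  for (id in todo) {
    existing <- rbind(existing, fetch(id))
    # Write after every event so an interrupted run keeps its progress.
    saveRDS(existing, file_path)
    added <- added + 1L
  }
  invisible(added)
}

path <- tempfile(fileext = ".rds")
first  <- build_incremental(c("a1", "b2"), path)        # fetches both events
second <- build_incremental(c("a1", "b2", "c3"), path)  # fetches only "c3"
```

Saving after each event, rather than once at the end, is what makes a second call cheap after an interruption.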

Check if Data is Cached

Description

Internal function to check if data exists in cache.

Usage

cache_exists(filename)

Arguments

filename

Name of the cached file.

Value

Logical indicating if file exists in cache.


Display Cache Information

Description

Shows information about the current cache including location, number of files, and total size.

Usage

cache_info()

Value

Invisible NULL. Called for side effects (prints to console).

Examples

cache_info()

Load Data from Cache

Description

Internal function to load data from the local cache.

Usage

cache_load(filename)

Arguments

filename

Name of the cached file.

Value

The cached data, or NULL if not found.


Save Data to Cache

Description

Internal function to save data to the local cache.

Usage

cache_save(data, filename)

Arguments

data

Data to cache.

filename

Name for the cached file.

Value

Invisible path to the cached file.


Check Season Progress

Description

See which tournaments are loaded vs missing for a season.

Usage

check_season(year, file_path, tour = "pga")

Arguments

year

Season year

file_path

Path to data file (.rds or .parquet). Must be specified by user.

tour

Tour name (default: "pga")

Value

A tibble showing status of each tournament with columns: event_id, tournament_name, start_date, end_date, and status (either "loaded" or "missing").

Examples

## Not run: 
progress <- check_season(2025, file_path = "my_golf_data.rds")

## End(Not run)
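
As a rough illustration of how a "loaded" versus "missing" status could be derived, the sketch below compares a toy schedule against toy stored data in base R. The data frames and event IDs are invented for the example, and the package may compute the status differently.

```r
# Toy schedule and toy stored data; the event IDs are made up.
schedule <- data.frame(
  event_id = c("e1", "e2", "e3"),
  tournament_name = c("Event One", "Event Two", "Event Three"),
  stringsAsFactors = FALSE
)
stored <- data.frame(event_id = c("e1", "e1", "e3"), stringsAsFactors = FALSE)

# An event counts as loaded if at least one stored row carries its ID.
schedule$status <- ifelse(schedule$event_id %in% stored$event_id,
                          "loaded", "missing")

# Names of the tournaments that still need to be fetched.
missing_events <- schedule$tournament_name[schedule$status == "missing"]
```

The same filter, `progress[progress$status == "missing", ]`, should work on the tibble returned by `check_season()`, since `status` is one of its documented columns.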

Clear golfastr Cache

Description

Removes all cached data files from the local cache directory.

Usage

clear_cache(confirm = TRUE)

Arguments

confirm

Logical. If TRUE (default), prompts for confirmation before clearing. Set to FALSE to skip confirmation.

Value

Invisible NULL. Called for side effects.

Examples


clear_cache()
clear_cache(confirm = FALSE)


Compare Players

Description

Side-by-side comparison of multiple players.

Usage

compare_players(players, year = NULL, file_path)

Arguments

players

Vector of player names

year

Optional year filter

file_path

Path to data file (.rds or .parquet)

Value

Tibble comparing player statistics

Examples

## Not run: 
compare_players(c("Scheffler", "McIlroy", "Hovland"), file_path = "golf_data.rds")

## End(Not run)

ESPN Core API Request

Description

Makes requests to ESPN's core sports API.

Usage

espn_core_request(endpoint, tour = "pga", query_params = NULL)

Arguments

endpoint

API endpoint path

tour

Tour code (default "pga")

query_params

Optional query parameters

Value

Parsed JSON response


ESPN Site API Request

Description

Makes requests to ESPN's site API (for scoreboard, schedule, etc.).

Usage

espn_site_request(endpoint, tour = "pga", query_params = NULL)

Arguments

endpoint

API endpoint path

tour

Tour code (default "pga")

query_params

Optional query parameters

Value

Parsed JSON response


Fast leaderboard fetch using site API

Description

Fast leaderboard fetch using site API

Usage

fetch_leaderboard_fast(event_id, year, tour = "pga")

Fetch hole-by-hole for a single player

Description

Fetch hole-by-hole for a single player

Usage

fetch_player_holes(event_id, player_id, tour = "pga")

Field Strength Analysis

Description

Analyze the strength of a tournament field.

Usage

field_strength(tournament, year, file_path)

Arguments

tournament

Tournament name

year

Season year

file_path

Path to data file (.rds or .parquet)

Value

Tibble with field statistics

Examples

## Not run: 
field_strength("Masters", 2025, file_path = "golf_data.rds")

## End(Not run)

Get Cache Directory Path

Description

Returns the path to the golfastr cache directory using the standard R user directory per CRAN policy.

Usage

get_cache_dir()

Value

Character string with the cache directory path.
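
Examples

The "standard R user directory" most likely refers to `tools::R_user_dir()`, which base R provides from version 4.0. The sketch below assumes that this is the call behind `get_cache_dir()`; the package source is not shown here, so treat the exact arguments as an assumption.

```r
# Assumption: the cache lives under tools::R_user_dir(), the per-user
# location that base R (>= 4.0) provides for package caches.
cache_dir <- tools::R_user_dir("golfastr", which = "cache")

# R_user_dir() only builds the path; it does not create the directory,
# so list files only if the directory already exists.
cached_files <- if (dir.exists(cache_dir)) list.files(cache_dir) else character(0)
```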


Get ESPN tour code

Description

Get ESPN tour code

Usage

get_espn_tour_code(tour = "pga")

Arguments

tour

Tour identifier ("pga", "lpga", "euro", etc.)

Value

ESPN tour code string


Get Major Championships

Description

Get results from the four major championships.

Usage

get_majors(year, file_path)

Arguments

year

Season year

file_path

Path to data file (.rds or .parquet)

Value

Tibble with major championship results

Examples

## Not run: 
# Get 2025 majors
get_majors(2025, file_path = "golf_data.rds")

## End(Not run)

Get Player Results

Description

Look up a player's results across all tournaments in a data file.

Usage

get_player(name, file_path)

Arguments

name

Player name (partial match, case-insensitive)

file_path

Path to data file (.rds or .parquet)

Value

Tibble with player's tournament results

Examples

## Not run: 
# Get Rory McIlroy's results
get_player("McIlroy", file_path = "golf_data.rds")

# Get Scottie Scheffler's results
get_player("Scheffler", file_path = "golf_data.rds")

## End(Not run)
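
A partial, case-insensitive name match of the kind described for `name` can be sketched in base R as follows. `match_player()` and the player names are hypothetical, and the package may match names in another way (for example with regular expressions).

```r
# Toy results; the player names are invented.
results <- data.frame(
  player = c("Alex Example", "Sam Sample", "alexa demo"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lower-case both sides, then look for the query as a
# literal substring (fixed = TRUE disables regular-expression matching).
match_player <- function(data, name) {
  keep <- grepl(tolower(name), tolower(data$player), fixed = TRUE)
  data[keep, , drop = FALSE]
}

hits <- match_player(results, "ALEX")  # matches rows 1 and 3
```

Note that a short query can match more than one player, so it is worth checking the returned names before aggregating.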

Get Tournament Summary (Internal)

Description

Internal function to get leaderboard and scorecards for a tournament.

Usage

get_tournament_summary(event_id, top_n = NULL)

Arguments

event_id

ESPN event ID

top_n

Number of top finishers to get scorecards for

Value

List with leaderboard and scorecards


Get Tournament Winners

Description

Get all tournament winners from a data file.

Usage

get_winners(year = NULL, file_path)

Arguments

year

Optional year filter

file_path

Path to data file (.rds or .parquet)

Value

Tibble with tournament winners

Examples

## Not run: 
# Get all winners
get_winners(file_path = "golf_data.rds")

# Get 2025 winners
get_winners(2025, file_path = "golf_data.rds")

## End(Not run)

Leaderboard Snapshot

Description

Get formatted leaderboard for display.

Usage

leaderboard(tournament, year, top_n = 10, file_path)

Arguments

tournament

Tournament name

year

Season year

top_n

Number of players to show (default: 10)

file_path

Path to data file (.rds or .parquet)

Value

Formatted tibble for display

Examples

## Not run: 
leaderboard("Masters", 2025, file_path = "golf_data.rds")

## End(Not run)

List Available Tournaments

Description

Get a list of all available tournaments for a given year.

Usage

list_tournaments(year, tour = "pga")

Arguments

year

Season year (e.g., 2025)

tour

Tour name: "pga" (default)

Value

Tibble with event_id, tournament_name, start_date, end_date

Examples


# See all 2025 PGA tournaments
list_tournaments(2025)


Load Tournament Data

Description

Load tournament data from a file. Auto-detects format based on extension.

Usage

load_data(file_path, tournament = NULL)

Arguments

file_path

Path to data file (.rds or .parquet)

tournament

Optional tournament name filter (partial match)

Value

A tibble with tournament leaderboard data

Examples

## Not run: 
# Load all data
data <- load_data("golf_data.rds")

# Load and filter to Masters
masters <- load_data("golf_data.rds", tournament = "Masters")

## End(Not run)
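
Format detection by file extension can be sketched as below. `read_by_extension()` is a hypothetical helper written for this illustration, not the body of `load_data()`; it relies only on `tools::file_ext()`, `readRDS()`, and `arrow::read_parquet()`.

```r
# Hypothetical helper: choose a reader from the file extension.
read_by_extension <- function(file_path) {
  ext <- tolower(tools::file_ext(file_path))
  switch(ext,
    rds = readRDS(file_path),
    parquet = {
      # arrow is only a suggested dependency, so check for it at call time.
      if (!requireNamespace("arrow", quietly = TRUE)) {
        stop("Reading .parquet files requires the 'arrow' package.")
      }
      arrow::read_parquet(file_path)
    },
    stop("Unsupported file extension: ", ext)
  )
}

# Round trip through a temporary RDS file with toy data.
path <- tempfile(fileext = ".rds")
saveRDS(data.frame(player = "Example Player"), path)
roundtrip <- read_by_extension(path)
```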

Load Tournament Data from Parquet

Description

Load tournament data from a Parquet file. Requires the arrow package.

Usage

load_from_parquet(file_path)

Arguments

file_path

Path to Parquet file. Must be specified by user.

Value

A tibble with tournament leaderboard data containing columns such as position, player name, scores, and tournament metadata.

Examples

## Not run: 
data <- load_from_parquet(file_path = "my_golf_data.parquet")

## End(Not run)

Load Tournament Data from RDS

Description

Load tournament data from an RDS file.

Usage

load_from_rds(file_path)

Arguments

file_path

Path to RDS file. Must be specified by user.

Value

A tibble with tournament leaderboard data containing columns such as position, player name, scores, and tournament metadata.

Examples

## Not run: 
data <- load_from_rds(file_path = "my_golf_data.rds")

## End(Not run)

Load Hole-by-Hole Scoring

Description

Retrieves hole-by-hole scoring data for tournaments.

Usage

load_holes(
  year = as.integer(format(Sys.Date(), "%Y")),
  tournament = NULL,
  top_n = NULL,
  tour = "pga"
)

get_player_scorecards(event_id, athlete_id)

Arguments

year

Season year (e.g., 2026). Defaults to current year.

tournament

Tournament identifier - either event_id or partial name match.

top_n

Number of top finishers to include. Default NULL returns all.

tour

Tour identifier. Currently supports "pga" (default).

event_id

ESPN event identifier (for legacy function)

athlete_id

ESPN athlete identifier (for legacy function)

Value

A tibble with hole-by-hole scoring data.

Examples


# Load hole-by-hole for top 10 at Sony Open
holes <- load_holes(2026, "Sony", top_n = 10)


Load Tournament Leaderboard

Description

Retrieves leaderboard data for tournaments. Can load a single tournament or all tournaments for a year.

Usage

load_leaderboard(
  year = as.integer(format(Sys.Date(), "%Y")),
  tournament = NULL,
  tour = "pga"
)

get_tournament_leaderboard(event_id)

Arguments

year

Season year (e.g., 2026). Defaults to current year.

tournament

Tournament identifier - either event_id or partial name match. If NULL, returns all tournaments for the year.

tour

Tour identifier. Currently supports "pga" (default).

event_id

ESPN event identifier (for legacy function)

Value

A tibble with leaderboard data.

Examples


# Load specific tournament by name
sony <- load_leaderboard(2026, "Sony")


Load PGA Hole-by-Hole Data

Description

Loads detailed hole-by-hole scoring data for specified tournaments.

Usage

load_pga_hbh(years, tournaments = NULL, top_n = 10, tour = "pga", dir = NULL)

Arguments

years

Numeric vector. Year(s) to load.

tournaments

Character vector. Optional tournament event IDs or names. If NULL, loads all tournaments.

top_n

Numeric. Number of top finishers to include scorecards for. Default is 10. Set to NULL for all players.

tour

Character. Tour type, default "pga".

dir

Character. Optional directory to save CSV files.

Value

A tibble with hole-by-hole scoring data.

Examples


# Load Masters hole-by-hole for top 10
masters_hbh <- load_pga_hbh(2025, tournaments = "401703504")

# Load with more players
masters_hbh <- load_pga_hbh(2025, tournaments = "401703504", top_n = 50)


Load PGA Leaderboards

Description

Loads tournament leaderboard data for specified year(s) and tournament(s). This is similar to nflfastR's data loading pattern.

Usage

load_pga_leaderboards(years, tournaments = NULL, tour = "pga", dir = NULL)

Arguments

years

Numeric vector. Year(s) to load (e.g., 2025 or 2023:2025).

tournaments

Character vector. Optional tournament event IDs or names to filter. If NULL, loads all tournaments.

tour

Character. Tour type, default "pga".

dir

Character. Optional directory to save CSV files.

Value

A tibble with leaderboard data.

Examples


# Load specific tournament
masters <- load_pga_leaderboards(2025, tournaments = "401703504")

# Load all 2025 tournaments
all_2025 <- load_pga_leaderboards(2025)


Load PGA Tour Schedule

Description

Loads the PGA Tour tournament schedule for a given year. Similar to nflfastR's schedule loading functions.

Usage

load_pga_schedule(year, tour = "pga")

Arguments

year

Numeric. The year to load (e.g., 2025).

tour

Character. Tour type, default "pga".

Value

A tibble with tournament schedule data.

Examples


schedule <- load_pga_schedule(2025)


Load Player Directory

Description

Retrieves a directory of players for a tour. Since ESPN doesn't have a dedicated player directory endpoint, this aggregates players from recent tournament leaderboards.

Usage

load_players(year = as.integer(format(Sys.Date(), "%Y")), tour = "pga")

Arguments

year

Season year to pull players from. Defaults to current year.

tour

Tour identifier. Currently supports "pga" (default).

Value

A tibble with player data.

Examples


# Get player directory
players <- load_players()


Load Golf Schedule

Description

Retrieves the tournament schedule for a given year and tour.

Usage

load_schedule(year = as.integer(format(Sys.Date(), "%Y")), tour = "pga")

get_pga_schedule(year = format(Sys.Date(), "%Y"))

Arguments

year

Season year (e.g., 2026). Defaults to current year.

tour

Tour identifier. Currently supports "pga" (default).

Value

A tibble with tournament schedule data.

Examples


# Get current year schedule
schedule <- load_schedule()

# Get specific year
schedule_2025 <- load_schedule(2025)


Load Tournament Data

Description

Fetch leaderboard data for a tournament. You can specify either the event_id or search by tournament name.

Usage

load_tournament(year, tournament, tour = "pga")

Arguments

year

Season year (e.g., 2025)

tournament

Tournament name (partial match) or event_id

tour

Tour name: "pga" (default)

Value

Tibble with tournament leaderboard

Examples


# Load by name (partial match works)
masters <- load_tournament(2025, "Masters")
pga_champ <- load_tournament(2025, "PGA Championship")

# Load by event_id
masters <- load_tournament(2025, "401703504")


Load Tournament with Hole-by-Hole Scores

Description

Fetch full tournament data including hole-by-hole scorecards.

Usage

load_tournament_detail(year, tournament, top_n = 10, tour = "pga")

Arguments

year

Season year (e.g., 2025)

tournament

Tournament name (partial match) or event_id

top_n

Only fetch scorecards for top N finishers (default: 10)

tour

Tour name: "pga" (default)

Value

List with 'leaderboard' and 'scorecards' tibbles

Examples


# Get Masters with top 10 scorecards
masters_detail <- load_tournament_detail(2025, "Masters", top_n = 10)


Made Cuts Percentage

Description

Get players by percentage of cuts made.

Usage

made_cuts_leaders(year = NULL, min_events = 5, top_n = 10, file_path)

Arguments

year

Optional year filter

min_events

Minimum events played

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

Tibble with made cuts leaders

Examples

## Not run: 
made_cuts_leaders(year = 2025, file_path = "golf_data.rds")

## End(Not run)
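
To make the roles of `min_events` and the percentage concrete, here is a base R sketch on toy data. The `made_cut` column is an assumption made for the example; how the stored leaderboard data actually marks a missed cut is not documented here.

```r
# Toy results: one row per player per event; made_cut is a hypothetical column.
results <- data.frame(
  player   = c("A", "A", "A", "B", "B", "C", "C", "C"),
  made_cut = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)
min_events <- 3

# Count events played and cuts made per player.
events <- tapply(results$made_cut, results$player, length)
cuts   <- tapply(results$made_cut, results$player, sum)

summary_tbl <- data.frame(
  player    = names(events),
  events    = as.integer(events),
  cuts_made = as.integer(cuts),
  stringsAsFactors = FALSE
)
summary_tbl$cut_pct <- 100 * summary_tbl$cuts_made / summary_tbl$events

# Drop players below the event threshold, then rank by percentage.
summary_tbl <- summary_tbl[summary_tbl$events >= min_events, ]
summary_tbl <- summary_tbl[order(-summary_tbl$cut_pct), ]
```

Player B made every cut but is excluded because two events fall below `min_events`, which is the purpose of the threshold: it keeps small samples from dominating the ranking.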

Get Field Descriptions for PGA Data

Description

Returns a data frame with field names and descriptions for leaderboard or hole-by-hole data.

Usage

pga_field_descriptions(data_type = c("leaderboard", "holes"))

Arguments

data_type

Character. Either "leaderboard" or "holes".

Value

A tibble with field and description columns.

Examples

pga_field_descriptions("leaderboard")
pga_field_descriptions("holes")

Get PGA Major Championships

Description

Returns a data frame with the four major championships and their typical schedule.

Usage

pga_majors()

Value

A tibble with tournament, month, and course columns.

Examples

pga_majors()

Get PGA Score Types

Description

Returns a data frame with score type classifications used in hole-by-hole data.

Usage

pga_score_types()

Value

A tibble with score_type, strokes_vs_par, and description columns.

Examples

pga_score_types()
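
The idea of classifying a hole score by strokes relative to par can be sketched as follows. `classify_score()` and its labels are illustrative only; the labels and groupings returned by `pga_score_types()` may differ.

```r
# Hypothetical helper: label a hole score by its difference from par.
classify_score <- function(strokes, par) {
  diff <- strokes - par
  labels <- c("eagle or better", "birdie", "par", "bogey",
              "double bogey or worse")
  # Clamp the difference to [-2, 2], then shift to a 1-based index.
  labels[pmin(pmax(diff, -2), 2) + 3]
}

score_labels <- classify_score(strokes = c(3, 4, 5, 7, 2),
                               par     = c(4, 4, 4, 4, 5))
```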

Player Season Summary

Description

Get aggregate statistics for a player's season.

Usage

player_summary(name, year = NULL, file_path)

Arguments

name

Player name (partial match)

year

Optional year filter

file_path

Path to data file (.rds or .parquet)

Value

Tibble with season statistics

Examples

## Not run: 
player_summary("Scheffler", file_path = "golf_data.rds")
player_summary("McIlroy", year = 2025, file_path = "golf_data.rds")

## End(Not run)

Plot Head-to-Head Comparison

Description

Compare two or more players' finishes across tournaments.

Usage

plot_head_to_head(players, year = NULL, file_path)

Arguments

players

Vector of player names

year

Optional year filter

file_path

Path to data file (.rds or .parquet)

Value

ggplot object

Examples

## Not run: 
plot_head_to_head(c("Scheffler", "McIlroy"), file_path = "golf_data.rds")

## End(Not run)

Plot Tournament Leaderboard

Description

Bar chart of tournament leaderboard.

Usage

plot_leaderboard(tournament, year, top_n = 10, file_path)

Arguments

tournament

Tournament name

year

Season year

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

ggplot object

Examples

## Not run: 
plot_leaderboard("Masters", 2025, file_path = "golf_data.rds")

## End(Not run)

Plot Player Results

Description

Visualize a player's finishes across tournaments.

Usage

plot_player(name, year = NULL, file_path)

Arguments

name

Player name

year

Optional year filter

file_path

Path to data file (.rds or .parquet)

Value

ggplot object

Examples

## Not run: 
plot_player("Scheffler", file_path = "golf_data.rds")

## End(Not run)

Plot Scoring Distribution

Description

Histogram of tournament scores.

Usage

plot_scoring(tournament, year, file_path)

Arguments

tournament

Tournament name

year

Season year

file_path

Path to data file (.rds or .parquet)

Value

ggplot object

Examples

## Not run: 
plot_scoring("Masters", 2025, file_path = "golf_data.rds")

## End(Not run)

Plot Win Distribution

Description

Pie/bar chart of wins by player.

Usage

plot_wins(year = NULL, top_n = 10, file_path)

Arguments

year

Optional year filter

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

ggplot object

Examples

## Not run: 
plot_wins(year = 2025, file_path = "golf_data.rds")

## End(Not run)

Save Tournament Data to Parquet

Description

Save tournament data to a Parquet file for cross-language compatibility. Requires the arrow package.

Usage

save_to_parquet(data, file_path, append = TRUE)

Arguments

data

Tibble of tournament data

file_path

Path to Parquet file. Must be specified by user.

append

If TRUE, append to existing data

Value

Invisible NULL. Called for side effects (writes to file).

Examples

## Not run: 
masters <- load_tournament(2025, "Masters")
save_to_parquet(masters, file_path = tempfile(fileext = ".parquet"))

## End(Not run)

Save Tournament Data to RDS

Description

Save tournament data to an RDS file.

Usage

save_to_rds(data, file_path, append = TRUE)

Arguments

data

Tibble of tournament data

file_path

Path to RDS file. Must be specified by user.

append

If TRUE, append to existing data

Value

Invisible NULL. Called for side effects (writes to file).

Examples

## Not run: 
masters <- load_tournament(2025, "Masters")
save_to_rds(masters, file_path = tempfile(fileext = ".rds"))

## End(Not run)
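
One plausible reading of `append = TRUE` is "row-bind the new data onto whatever the file already holds". The sketch below implements that reading in base R with a hypothetical `append_rds()` helper and toy data; whether the package also removes duplicate rows is not stated, so the sketch does not.

```r
# Hypothetical helper: append rows to an RDS file, or overwrite it.
append_rds <- function(data, file_path, append = TRUE) {
  if (append && file.exists(file_path)) {
    # Combine the stored rows with the new ones before writing.
    data <- rbind(readRDS(file_path), data)
  }
  saveRDS(data, file_path)
  invisible(NULL)
}

path <- tempfile(fileext = ".rds")
append_rds(data.frame(event_id = "e1", score = 70), path)
append_rds(data.frame(event_id = "e2", score = 68), path)
combined <- readRDS(path)  # two rows, one per call
```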

Scoring Average Leaders

Description

Get players with best scoring average.

Usage

scoring_avg_leaders(year = NULL, min_events = 5, top_n = 10, file_path)

Arguments

year

Optional year filter

min_events

Minimum events played

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

Tibble with scoring average leaders

Examples

## Not run: 
scoring_avg_leaders(year = 2025, file_path = "golf_data.rds")

## End(Not run)

Top 10 Leaders

Description

Get players with most top 10 finishes.

Usage

top10_leaders(year = NULL, min_events = 5, top_n = 10, file_path)

Arguments

year

Optional year filter

min_events

Minimum events played

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

Tibble with top 10 leaders

Examples

## Not run: 
top10_leaders(year = 2025, file_path = "golf_data.rds")

## End(Not run)

Tournament History

Description

Get historical results for a specific tournament.

Usage

tournament_history(tournament, file_path)

Arguments

tournament

Tournament name (partial match)

file_path

Path to data file (.rds or .parquet)

Value

Tibble with tournament winners by year

Examples

## Not run: 
tournament_history("Masters", file_path = "golf_data.rds")

## End(Not run)

Most Wins

Description

Get players with most wins.

Usage

win_leaders(year = NULL, top_n = 10, file_path)

Arguments

year

Optional year filter

top_n

Number of players to show

file_path

Path to data file (.rds or .parquet)

Value

Tibble with win leaders

Examples

## Not run: 
win_leaders(year = 2025, file_path = "golf_data.rds")

## End(Not run)
