Analyzing Hass USA

The {avocado} package provides weekly summaries of Hass avocado sales from January 2017 through November 2020. The package contains three datasets; we start with hass_usa, which covers weekly avocado sales in the contiguous US.

Let’s start by loading the package, along with {dplyr} (for data wrangling) and {ggplot2} (for data visualization), and exploring its structure:

library(avocado)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

data('hass_usa')

dplyr::glimpse(hass_usa)
#> Rows: 810
#> Columns: 11
#> $ week_ending               <date> 2017-01-02, 2017-01-08, 2017-01-15, 2017-01…
#> $ type                      <chr> "Conventional", "Conventional", "Conventiona…
#> $ avg_selling_price         <dbl> 0.89, 0.99, 0.98, 0.94, 0.96, 0.77, 0.87, 0.…
#> $ total_bulk_and_bags_units <dbl> 38879717, 38049803, 38295489, 42140394, 3937…
#> $ plu4046_units             <dbl> 12707895, 11809728, 12936859, 14254151, 1403…
#> $ plu4225_units             <dbl> 14201201, 13856935, 12625666, 14212882, 1168…
#> $ plu4770_units             <dbl> 549845, 539069, 579347, 908617, 818728, 1664…
#> $ total_bagged_units        <dbl> 11420777, 11844072, 12153619, 12764745, 1283…
#> $ sml_bagged_units          <dbl> 8551134, 9332972, 9445623, 9462854, 9918256,…
#> $ lrg_bagged_units          <dbl> 2802710, 2432260, 2638919, 3231020, 2799961,…
#> $ xlrg_bagged_units         <dbl> 66934, 78841, 69078, 70872, 119096, 112870, …
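
Before plotting, it can help to confirm the time span and the number of weekly records for each avocado type. The following is a minimal sketch, not part of {avocado}: summarise_coverage() is a hypothetical helper name, and it only assumes the week_ending and type columns shown in the glimpse above.

```r
library(dplyr)

# Hypothetical helper (not part of {avocado}): first and last week,
# plus the number of distinct weeks, for each avocado type.
summarise_coverage <- function(data) {
  data |>
    group_by(type) |>
    summarise(
      first_week = min(week_ending),
      last_week  = max(week_ending),
      n_weeks    = n_distinct(week_ending),
      .groups    = 'drop'
    )
}

# Usage, assuming hass_usa has been loaded as above:
# summarise_coverage(hass_usa)
```

If the two types report the same first_week, last_week and n_weeks, the two price series can be compared week by week without further alignment.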

Exploratory Data Analysis

Let’s begin by exploring the following two topics:

Fluctuation of Average Selling Price


hass_usa |> 
  ggplot(aes(x = week_ending)) +
  geom_line(aes(y = avg_selling_price, color = as.factor(type))) +
  scale_color_manual(labels = c('Conventional','Organic'), values = c('steelblue','forestgreen')) +
  scale_x_date(date_breaks = '1 year', date_labels = '%Y') +
  labs(
    x = 'Year',
    y = 'Average Selling Price per Unit (US$)',
    title = 'Fluctuation of Average Selling Price', 
    caption = 'Not adjusted for inflation\nSource: Hass Avocado Board',
    color = ''
  ) +
  ylim(0, 3.0) +
  theme(
    plot.background = element_rect(fill = "grey20"),
    plot.title = element_text(color = "#FFFFFF"),
    axis.title = element_text(color = "#FFFFFF"),
    axis.text.x = element_text(color = 'grey50', angle = 45, hjust = 1),
    axis.text.y = element_text(color = 'grey50'),
    plot.caption = element_text(color = 'grey75'),
    panel.background = element_blank(),
    panel.grid.major = element_line(color = "grey50", linewidth = 0.2),
    panel.grid.minor = element_line(color = "grey50", linewidth = 0.2),
    legend.background = element_rect(fill = 'grey20'),
    legend.key = element_rect(fill = 'grey20'),
    legend.title = element_text(color = 'grey75'),
    legend.text = element_text(color = 'grey75'),
    legend.position = 'inside',
    legend.position.inside = c(0.85, 0.85)
  )

The average selling price of organic avocados tends to be higher than that of non-organic (Conventional) avocados. There also seems to be a fairly large spike in selling price in late 2017. Moreover, the peak average selling price appears to decline over time.
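
These visual impressions can be checked numerically. The following is a minimal sketch, not part of {avocado}: summarise_price_by_year() is a hypothetical helper name, and it assumes the week_ending, type and avg_selling_price columns of hass_usa. It computes the yearly mean and peak of the weekly average selling price for each type.

```r
library(dplyr)

# Hypothetical helper (not part of {avocado}): yearly mean and peak of the
# weekly average selling price, computed separately for each avocado type.
summarise_price_by_year <- function(data) {
  data |>
    mutate(year = as.integer(format(week_ending, '%Y'))) |>
    group_by(year, type) |>
    summarise(
      mean_price = mean(avg_selling_price, na.rm = TRUE),
      peak_price = max(avg_selling_price, na.rm = TRUE),
      .groups    = 'drop'
    ) |>
    arrange(type, year)
}

# Usage, assuming hass_usa has been loaded as above:
# summarise_price_by_year(hass_usa)
```

Comparing mean_price across types within a year tests the organic premium, while following peak_price across years for one type tests whether the yearly peak is in fact declining. Note that the final year only covers the weeks available in the data, so its values are not directly comparable with those of full years.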
