1  Introduction

tidymodels is a framework for creating statistical and machine learning models in R. The framework consists of a set of tightly coupled R packages that are designed in the same way. The project began in late 2016.

The main tidymodels resources are:

We’ll reference these and other resources as needed.

1.1 Installation

tidymodels is built in R, so you’ll need to install that first. We used R version 4.3.2 (2023-10-31 ucrt) for these notes. To install R, you can go to CRAN [1] to download it for your operating system. If you are comfortable at the command line, the rig application is an excellent way to install and manage R versions.

You probably want to use an integrated development environment (IDE); it will make your life much better. We use the RStudio IDE, which is available for download from Posit. Other options include Visual Studio and Emacs.

To use tidymodels, you need to install multiple packages. The core packages are bundled into a “verse” package called tidymodels. When you install that, you get the primary packages as well as some tidyverse packages such as dplyr and ggplot2.

To install it, you can use
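A minimal sketch of the standard approach, assuming a CRAN mirror is configured for your R session and you have an internet connection:

```r
# Install the tidymodels meta-package and its dependencies from CRAN.
install.packages("tidymodels")
```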


We suggest using the pak package for installation. To do this, first install that and then use it for further installations:
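A minimal sketch of that two-step process, assuming a CRAN mirror is configured; `pak::pak()` is pak's general-purpose installation function:

```r
# Step 1: install pak itself with base R's installer.
install.packages("pak")

# Step 2: use pak for this and any later installations.
pak::pak("tidymodels")
```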

1.2 Loading tidymodels

Once you do that, load tidymodels:
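A minimal sketch, assuming the tidymodels package is already installed:

```r
# Attach the core tidymodels packages (plus a few tidyverse packages).
library(tidymodels)
```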

The default output shows the packages that are automatically attached. There are a lot of functions in tidymodels, but by loading this meta-package, you don’t have to remember which functions come from which packages.

Note the lines at the bottom of the output, which contain messages like:

dplyr::filter() masks stats::filter()

This means that two packages, dplyr and stats, have functions with the same name (filter()) [2]. If you were to type filter at an R prompt, the function that you get corresponds to the one in the most recently loaded package. That’s not ideal.
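To make the difference concrete, here is a small sketch that calls each function through its namespace. It assumes dplyr is installed; the mtcars data set (which ships with R) and the moving-average weights are illustrative choices only.

```r
# Explicit namespaces remove any ambiguity about which filter() is used.

# dplyr's filter() keeps the rows of a data frame that satisfy a condition.
four_cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(four_cyl)

# stats' filter() applies a linear filter (here, a 3-point moving average)
# to a numeric series and returns a time series object.
smoothed <- stats::filter(1:10, rep(1 / 3, 3))
class(smoothed)
```

The two functions do unrelated jobs, which is why picking up the wrong one silently can cause confusing errors.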

To handle this, we have a function called tidymodels_prefer(). When you use this, it prioritizes functions from the tidymodels and tidyverse groups so that you get those versions [3].
── Conflicts ──────────────────────────────────────────── tidymodels_prefer() ──
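A minimal sketch of a call that prints a conflicts report like the header above. It assumes tidymodels is installed and that the `quiet` argument controls whether the report is shown; mtcars is used purely for illustration.

```r
library(tidymodels)

# Register preferences for tidymodels/tidyverse functions when names collide.
# quiet = FALSE asks for the list of resolved conflicts to be printed.
tidymodels_prefer(quiet = FALSE)

# After this, an unqualified filter() is expected to resolve to dplyr's version.
four_cyl <- filter(mtcars, cyl == 4)
nrow(four_cyl)
```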

If you want to know more about why tidymodels exists, we’ve written a bit about this in the tidymodels book. The second chapter describes how tidyverse principles can be used for modeling.

1.3 Package Versions and Reproducibility

We will do our best to use versions of our packages corresponding to the CRAN versions. We can’t always do that. For many packages, a version number whose last component is in the 9000 range means that it is a development version of the package and was most likely installed from a GitHub repository.
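As a rough illustration of that convention, the sketch below flags version strings whose fourth component is 9000 or more. The helper name `is_dev_version()` and the example version strings are hypothetical, and the check is only a heuristic, since packages are not required to follow this numbering.

```r
# Heuristic: by convention, development versions carry a fourth component
# of 9000 or more.
is_dev_version <- function(version) {
  # Split the version into its integer components.
  parts <- unlist(unclass(package_version(as.character(version))))
  length(parts) >= 4 && parts[length(parts)] >= 9000
}

is_dev_version("1.2.0.9001")  # hypothetical development version
is_dev_version("1.2.0")       # hypothetical CRAN-style version

# For an installed package, pass the result of packageVersion(), e.g.
# is_dev_version(packageVersion("tidymodels"))
```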

At the end of each session, we’ll show which packages were loaded and used:
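The listing below has the layout printed by the sessioninfo package. A minimal sketch of how to produce such a report, assuming sessioninfo is installed:

```r
# Collect the R version, platform details, and the versions of the attached
# and loaded packages in the current session, then print the report.
info <- sessioninfo::session_info()
info
```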

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 11 x64 (build 22621)
 system   x86_64, mingw32
 ui       RTerm
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       America/Chicago
 date     2024-01-05
 pandoc   3.1.1 @ C:\\Users\\bmrei\\AppData\\Local\\Programs\\Quarto\\bin\\tools/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 ! package      * version    date (UTC) lib source
 P backports      1.4.1      2021-12-13 [?] CRAN (R 4.3.1)
 P broom        * 1.0.5      2023-06-09 [?] CRAN (R 4.3.2)
 P cachem         1.0.8      2023-05-01 [?] CRAN (R 4.3.2)
 P class          7.3-22     2023-05-03 [?] CRAN (R 4.3.2)
   cli            3.6.2      2023-12-11 [1] CRAN (R 4.3.2)
 P codetools      0.2-19     2023-02-01 [?] CRAN (R 4.3.2)
 P colorspace     2.1-0      2023-01-23 [?] CRAN (R 4.3.2)
 P conflicted     1.2.0      2023-02-01 [?] CRAN (R 4.3.2)
 P data.table     1.14.10    2023-12-08 [?] CRAN (R 4.3.2)
 P dials        * 1.2.0      2023-04-03 [?] CRAN (R 4.3.2)
   DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.3.2)
 P digest         0.6.33     2023-07-07 [?] CRAN (R 4.3.2)
 P dplyr        * 1.1.4      2023-11-17 [?] CRAN (R 4.3.2)
 P evaluate       0.23       2023-11-01 [?] CRAN (R 4.3.2)
 P fansi          1.0.6      2023-12-08 [?] CRAN (R 4.3.2)
 P fastmap        1.1.1      2023-02-24 [?] CRAN (R 4.3.2)
 P foreach        1.5.2      2022-02-02 [?] CRAN (R 4.3.2)
 P furrr          0.3.1      2022-08-15 [?] CRAN (R 4.3.2)
 P future         1.33.1     2023-12-22 [?] CRAN (R 4.3.2)
 P future.apply   1.11.1     2023-12-21 [?] CRAN (R 4.3.2)
 P generics       0.1.3      2022-07-05 [?] CRAN (R 4.3.2)
 P ggplot2      * 3.4.4      2023-10-12 [?] CRAN (R 4.3.2)
 P globals        0.16.2     2022-11-21 [?] CRAN (R 4.3.1)
   glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.2)
 P gower          1.0.1      2022-12-22 [?] CRAN (R 4.3.1)
 P GPfit          1.0-8      2019-02-08 [?] CRAN (R 4.3.2)
 P gtable         0.3.4      2023-08-21 [?] CRAN (R 4.3.2)
 P hardhat        1.3.0      2023-03-30 [?] CRAN (R 4.3.2)
 P htmltools      0.5.7      2023-11-03 [?] CRAN (R 4.3.2)
 P infer        * 1.0.5      2023-09-06 [?] CRAN (R 4.3.2)
 P ipred          0.9-14     2023-03-09 [?] CRAN (R 4.3.2)
 P iterators      1.0.14     2022-02-05 [?] CRAN (R 4.3.2)
 P jsonlite       1.8.8      2023-12-04 [?] CRAN (R 4.3.2)
 P knitr          1.45       2023-10-30 [?] CRAN (R 4.3.2)
 P lattice        0.21-9     2023-10-01 [?] CRAN (R 4.3.2)
 P lava           1.7.3      2023-11-04 [?] CRAN (R 4.3.2)
 P lhs            1.1.6      2022-12-17 [?] CRAN (R 4.3.2)
   lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.2)
 P listenv        0.9.0      2022-12-16 [?] CRAN (R 4.3.2)
 P lubridate      1.9.3      2023-09-27 [?] CRAN (R 4.3.2)
 P magrittr       2.0.3      2022-03-30 [?] CRAN (R 4.3.2)
 P MASS           7.3-60     2023-05-04 [?] CRAN (R 4.3.2)
 P Matrix         1.6-1.1    2023-09-18 [?] CRAN (R 4.3.2)
 P memoise        2.0.1      2021-11-26 [?] CRAN (R 4.3.2)
 P modeldata    * 1.2.0      2023-08-09 [?] CRAN (R 4.3.2)
 P munsell        0.5.0      2018-06-12 [?] CRAN (R 4.3.2)
 P nnet           7.3-19     2023-05-03 [?] CRAN (R 4.3.2)
 P parallelly     1.36.0     2023-05-26 [?] CRAN (R 4.3.1)
 P parsnip      * 1.1.1      2023-08-17 [?] CRAN (R 4.3.2)
 P pillar         1.9.0      2023-03-22 [?] CRAN (R 4.3.2)
 P pkgconfig      2.0.3      2019-09-22 [?] CRAN (R 4.3.2)
 P prodlim        2023.08.28 2023-08-28 [?] CRAN (R 4.3.2)
 P purrr        * 1.0.2      2023-08-10 [?] CRAN (R 4.3.2)
 P R6             2.5.1      2021-08-19 [?] CRAN (R 4.3.2)
 P Rcpp           1.0.11     2023-07-06 [?] CRAN (R 4.3.2)
 P recipes      * 1.0.9      2023-12-13 [?] CRAN (R 4.3.2)
   renv           1.0.3      2023-09-19 [1] CRAN (R 4.3.2)
   rlang          1.1.2      2023-11-04 [1] CRAN (R 4.3.2)
 P rmarkdown      2.25       2023-09-18 [?] CRAN (R 4.3.2)
 P rpart          4.1.21     2023-10-09 [?] CRAN (R 4.3.2)
 P rsample      * 1.2.0      2023-08-23 [?] CRAN (R 4.3.2)
 P rstudioapi     0.15.0     2023-07-07 [?] CRAN (R 4.3.2)
 P scales       * 1.3.0      2023-11-28 [?] CRAN (R 4.3.2)
 P sessioninfo    1.2.2      2021-12-06 [?] CRAN (R 4.3.2)
 P survival       3.5-7      2023-08-14 [?] CRAN (R 4.3.2)
 P tibble       * 3.2.1      2023-03-20 [?] CRAN (R 4.3.2)
 P tidymodels   * 1.1.1      2023-08-24 [?] CRAN (R 4.3.2)
 P tidyr        * 1.3.0      2023-01-24 [?] CRAN (R 4.3.2)
 P tidyselect     1.2.0      2022-10-10 [?] CRAN (R 4.3.2)
 P timechange     0.2.0      2023-01-11 [?] CRAN (R 4.3.2)
 P timeDate       4032.109   2023-12-14 [?] CRAN (R 4.3.2)
 P tune         * 1.1.2      2023-08-23 [?] CRAN (R 4.3.2)
 P utf8           1.2.4      2023-10-22 [?] CRAN (R 4.3.2)
   vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.3.2)
 P withr          2.5.2      2023-10-30 [?] CRAN (R 4.3.2)
 P workflows    * 1.1.3      2023-02-22 [?] CRAN (R 4.3.2)
 P workflowsets * 1.0.1      2023-04-06 [?] CRAN (R 4.3.2)
 P xfun           0.41       2023-11-01 [?] CRAN (R 4.3.2)
 P yaml           2.3.8      2023-12-11 [?] CRAN (R 4.3.2)
 P yardstick    * 1.2.0      2023-04-21 [?] CRAN (R 4.3.2)

 [1] C:/Users/bmrei/aml4td/computing-tidymodels/renv/library/R-4.3/x86_64-w64-mingw32
 [2] C:/Users/bmrei/AppData/Local/R/cache/R/renv/sandbox/R-4.3/x86_64-w64-mingw32/1e360f03

 P ── Loaded and on-disk path mismatch.


  1. The Comprehensive R Archive Network.

  2. The syntax foo::bar() means that the function bar() is inside of the package foo. Using the two together is often referred to as “calling the function by its namespace.” You can do this in your code, and developers often do. However, it’s fairly ugly.

  3. Unfortunately, this is not a guarantee, but it does work most of the time.