This module contains the final practical part for the Certificate of Advanced Studies in Advanced Statistical Data Science (CAS ASDS) at the University of Berne for the class of 2024.
Format
The practical module has been packaged as an R package, so everything (data, scripts, reports, …) should be contained in a single bundle, and analyses should be reproducible.
This setup follows suggestions from (Marwick, Boettiger, and Mullen 2018b, 2018a), (Flight 2014), and (Wickham and Bryan 2023). The last of these more or less provided the instructions and toolchain recommendations used to create this package.
Other opinions and tools exist for a lighter-weight approach to reproducible research (e.g. (Flight 2021) and (Landau 2024, 2021)), which I might explore in the future.
For more inspiration and available tools, see (Blischak et al. 2024).
Installation
You can install the development version of asds2024.nils.practical from GitHub with:
# install.packages("devtools") # <- uncomment and run this if devtools (which provides `install_github`) is not installed yet
# Notes:
# - you probably want to install the suggested dependencies as well, since this package only uses suggested dependencies
# - when `install_github`-ing, you need to explicitly specify that you want the vignettes built as well
devtools::install_github(
  "nils-s/cas-asds-practical",
  dependencies = c("Depends", "Imports", "LinkingTo", "Suggests"),
  build_vignettes = TRUE
)
Since not all documents are provided as vignettes, you probably want to clone the package sources into a local directory as well:
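A minimal sketch of such a clone, assuming the repository URL follows from the nils-s/cas-asds-practical slug passed to install_github and that git is installed. The target directory name is an arbitrary choice made here for illustration:

```shell
# Assumption: the URL is derived from the "nils-s/cas-asds-practical" slug
# used with install_github; adjust it if the repository lives elsewhere.
repo_url="https://github.com/nils-s/cas-asds-practical.git"

# Hypothetical local directory; any path you can write to will do.
target_dir="cas-asds-practical"

# Clone the sources, and report a failure instead of aborting silently.
git clone "$repo_url" "$target_dir" \
  || echo "clone failed; check the URL, the target directory, and your network connection" >&2
```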
From there, you can more directly explore the raw data, and read documents that are not packaged as vignettes.
Installation Troubleshooting
Assuming the devtools package is installed (and install_github is available), this package by itself should not cause problems (simply because it contains very little that could cause problems). However, it relies on a number of other packages, which are installed along with this package’s suggested dependencies, as shown in the code snippet above.
Problems when Installing sf
The sf package has a few dependencies of its own (not all of which are R packages). The first thing to try (after studying the error messages, of course) is to make sure all prerequisites for sf are fulfilled (e.g. the GEOS, GDAL, and PROJ libraries).
On a Fedora machine, the following should get you started:
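A sketch of the usual Fedora command, assuming the package names listed for Fedora in the sf README still apply to your release. The variable name sf_sysdeps is introduced here purely for illustration:

```shell
# Development packages for sf's system libraries (GDAL, PROJ, GEOS, plus
# SQLite and udunits2). Assumption: names as given for Fedora in the sf README.
sf_sysdeps="gdal-devel proj-devel geos-devel sqlite-devel udunits2-devel"

if command -v dnf >/dev/null 2>&1; then
  # Word splitting of $sf_sysdeps is intentional: one argument per package.
  sudo dnf install $sf_sysdeps
else
  echo "dnf not found; this does not look like a Fedora system" >&2
fi
```

On other distributions the package names and the package manager differ, so treat the list as a starting point only.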
See the sf documentation for more information.
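To check whether the GEOS, GDAL, and PROJ libraries mentioned above are visible to the build tools, something like the following may help. It is only a sketch: check_lib is a hypothetical helper name, and it assumes the development packages ship gdal-config, geos-config, and a pkg-config entry named proj:

```shell
# check_lib NAME COMMAND [ARGS...]
# Runs a version query and prints either the version or "not found".
check_lib() {
  name="$1"
  shift
  if version="$("$@" 2>/dev/null)"; then
    echo "$name: $version"
  else
    echo "$name: not found"
  fi
}

check_lib GDAL gdal-config --version
check_lib GEOS geos-config --version
check_lib PROJ pkg-config --modversion proj
```

A "not found" line points at a missing development package rather than at a problem with sf itself.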
Compilation Errors when Installing Source Packages
When installing packages from source (as is common on Linux), compilation errors may occur due to aggressive compiler flag settings used in conjunction with C or C++ sources and Rcpp. In case you see errors like
...
/usr/local/lib/R/site-library/Rcpp/include/Rcpp/iostream/Rstreambuf.h:53:20: warning: field precision specifier ‘.*’ expects argument of type ‘int’, but argument 2 has type ‘std::streamsize’ {aka ‘long int’} [-Wformat=]
   53 |     Rprintf("%.*s", num, s);
      |              ~~^~   ~~~
      |                |    |
      |                int  std::streamsize {aka long int}
...
.../include/Rcpp/print.h:30:19: error: format not a string literal and no format arguments [-Werror=format-security]
...
ERROR: compilation failed for package ...
...
you should probably open an issue in the GitHub/GitLab/whatever repo of the package that caused the error.
You should absolutely not go into $(R RHOME)/etc/Makeconf and change the compiler flags, for example by removing -Werror=format-security from the CXX14FLAGS or similar ;)
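If you merely want to see where that flag is set, without changing anything, a read-only sketch along these lines may help. The helper name show_flag_lines is hypothetical, and the check assumes R is on the PATH:

```shell
# The flag discussed above.
flag='-Werror=format-security'

# show_flag_lines FILE
# Prints every line of FILE that contains the flag, prefixed by its line number.
show_flag_lines() {
  grep -n -F -e "$flag" "$1"
}

if command -v R >/dev/null 2>&1; then
  makeconf="$(R RHOME)/etc/Makeconf"
  # grep exits non-zero when nothing matches, so report that case explicitly.
  show_flag_lines "$makeconf" || echo "$flag not found in $makeconf"
else
  echo "R is not on the PATH; nothing to inspect"
fi
```

The snippet only reads the file; it does not modify any compiler settings.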
Example
library(asds2024.nils.practical)
vignette("get-started", package = "asds2024.nils.practical")