Chapter 1 Introduction

1.1 Guide to tables

The tables in the following chapters provide detailed information about datasets in R packages. The complete list of output columns is as follows. Columns not initially visible can be viewed by clicking the Column Visibility button.

Output columns

package      name of package (optional)
name         name of dataset
nr_or_len    number of rows or length() (whichever is not NULL)
nc           number of columns
add_dim      additional dimensions (>= 3, such as for tables)
first_class  first class listed
n_cols       number of numeric columns
i_cols       number of integer columns
f_cols       number of factor columns
c_cols       number of character columns
d_cols       number of date columns
other_cols   number of other columns
missing      proportion of missing values overall
allclasses   full list of classes (optional)
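
To make the column definitions above concrete, here is a minimal sketch that computes most of them for a single data frame using base R only. The helper summarise_dataset() is hypothetical and is not part of datacat; it also assumes that n_cols counts double columns only, that d_cols counts Date and POSIXt columns, and that missing is the share of NA cells. The actual data_xray() implementation may differ on each of these points.

```r
# Hypothetical helper (not part of datacat): summarise one data frame
# using the output column names listed above.
summarise_dataset <- function(x) {
  stopifnot(is.data.frame(x))

  # Assign each column to exactly one bucket. Factors and dates are
  # checked first because they are stored as integers or doubles.
  classify <- function(col) {
    if (is.factor(col)) {
      "f"
    } else if (inherits(col, c("Date", "POSIXt"))) {
      "d"
    } else if (is.integer(col)) {
      "i"
    } else if (is.numeric(col)) {
      "n"
    } else if (is.character(col)) {
      "c"
    } else {
      "other"
    }
  }
  kinds <- vapply(x, classify, character(1))

  data.frame(
    nr_or_len   = nrow(x),
    nc          = ncol(x),
    first_class = class(x)[1],
    n_cols      = sum(kinds == "n"),
    i_cols      = sum(kinds == "i"),
    f_cols      = sum(kinds == "f"),
    c_cols      = sum(kinds == "c"),
    d_cols      = sum(kinds == "d"),
    other_cols  = sum(kinds == "other"),
    missing     = mean(is.na(x))  # assumed: proportion of NA cells
  )
}

iris_summary <- summarise_dataset(iris)
print(iris_summary)
```

Because every column lands in exactly one bucket, the six per-type counts always sum to nc in this sketch.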

Table columns can be sorted and filtered. Dataset names are linked to documentation if available. Datasets without links are usually included in documentation for a dataset in the same package with a similar name. For example, documentation for alr4::BGSboys and alr4::BGSgirls is included on the help page for alr4::BGSall.

1.2 Exploring data in packages locally

The main function used here is data_xray() from the datacat package. You can explore datasets in R packages locally as follows.

Install datacat:

remotes::install_github("jtr13/datacat")

View information with:

library(datacat)
View(data_xray("ggplot2"))
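
If you only want to see which datasets an installed package ships before running data_xray(), base R can list them without datacat. The sketch below uses utils::data(), which is part of base R; the wrapper list_datasets() is a hypothetical name introduced here for illustration. It uses the built-in datasets package so that it runs on any R installation.

```r
# Hypothetical wrapper (not part of datacat): list the datasets that an
# installed package provides, using base R only.
list_datasets <- function(pkg) {
  # data(package = ...) returns an object whose $results element is a
  # character matrix with columns Package, LibPath, Item and Title.
  res <- utils::data(package = pkg)$results
  data.frame(
    package = res[, "Package"],
    name    = res[, "Item"],
    title   = res[, "Title"],
    stringsAsFactors = FALSE
  )
}

ds <- list_datasets("datasets")
head(ds)
```

Some entries in the name column may take the form "name (file)" when several datasets are loaded from one data file.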

1.3 Contributing

I welcome your suggestions on packages to add to this resource. You can either open an issue or add the package name to this file and create a pull request.

I have intentionally left out packages with nondescriptive dataset names such as ch5ex11, for example AMCP and Devore7.