8 Data preprocessing cheat sheet in R

Ying Gao

This cheat sheet outlines the basic steps that students can follow to preprocess a new dataset, including importing the data, checking for missing values, and reshaping.
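As a rough illustration of those three steps, here is a minimal sketch in base R. The toy data and the column names (`id`, `score_2020`, `score_2021`) are made up for this example and are not taken from the cheat sheet, which may use different functions or packages.

```r
# Hypothetical toy data standing in for a freshly imported CSV file
csv_text <- "id,score_2020,score_2021
1,10,12
2,NA,15
3,8,NA"

# Import: read.csv() accepts a `text` argument, so no file on disk is needed
df <- read.csv(text = csv_text)

# Take a first look at column types and values
str(df)

# Check missing values: count the NAs in each column
na_counts <- colSums(is.na(df))
print(na_counts)

# Reshape from wide (one column per year) to long (one row per id and year)
long <- reshape(
  df,
  direction = "long",
  varying   = c("score_2020", "score_2021"),
  v.names   = "score",
  timevar   = "year",
  times     = c(2020, 2021),
  idvar     = "id"
)
print(long)
```

With real data, the `text` argument would be replaced by a file path, and the reshape step only matters if the data arrive in wide format.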

The right way to clean and treat data, however, depends on the case and the dataset. For example, when working with missing values, we might need to figure out whether the values are missing randomly, with no pattern, or whether there are specific reasons that explain the gaps. This cheat sheet therefore does not yet cover every situation that students may meet in practical work. Creating it made the process of cleaning and exploring data clearer to me, and I also found more useful methods for understanding data. It can be extended and made more comprehensive in the future.
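One simple way to start probing the missing-values question is to compare the missing rate of a variable across the levels of another variable. The sketch below uses base R and a made-up data frame (`survey`, with hypothetical columns `group` and `income`); it is only a first check, not a formal test of the missingness mechanism.

```r
# Hypothetical survey data: income is missing more often in group "b"
survey <- data.frame(
  group  = c("a", "a", "a", "b", "b", "b"),
  income = c(50, 60, 55, NA, NA, 70)
)

# Flag the rows where income is missing
survey$income_missing <- is.na(survey$income)

# Share of missing income values within each group
miss_rate <- tapply(survey$income_missing, survey$group, mean)
print(miss_rate)
```

A large gap between groups suggests that the missingness is related to `group` and is worth investigating before dropping or imputing values.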

I hope this saves some time and offers an easy starting point for problem sets or projects!

Click the link below for the cheat sheet:

https://github.com/yg2804/rep/blob/main/data_preprocessing_cheat_sheet_in_r.pdf