9 Cheatsheet for data science

Liri Chen and Shuo Liu

9.1 What We Learned

For our community contribution project, we made a data science cheatsheet that gives an overview of the structure, applications, and opportunities of data science knowledge. Our motivation was to help us prepare for data science job interviews and study for the finals of several courses. The cheatsheet covers probability and statistics, hypothesis testing, and machine learning models. In the statistics section, we included probability functions and distributions. In the hypothesis testing section, we included sample and population distributions, as well as ways of using sample statistics and confidence intervals to decide whether to reject a null hypothesis. In the machine learning section, we included linear regression, logistic regression, decision trees, random forests, and k-nearest neighbors (KNN). We also covered model evaluation, dimension reduction, and neural networks.

Through this community contribution project, we organized and reviewed data science concepts. Our future work includes expanding the existing topics, adding more detailed descriptions, and creating a front-end display page, for example with GitHub Pages.

9.3 Reference

This cheatsheet uses the template and some content from Aaron Wang's Data Science Cheatsheet: https://github.com/aaronwangy/Data-Science-Cheatsheet