50 Cheatsheet of RNA-seq workflow in Bioconductor

Yusong Zhao

50.0.1 Motivation

Bioconductor is an R-based project that specializes in processing biological data and includes many purpose-built packages. It clearly defines multiple workflows aimed at specific bioinformatics problems, such as RNA-seq analysis and gene annotation.

However, some of the workflows involve so many packages and cover so many questions that they are difficult for users to follow. Therefore, for RNA-seq data analysis, I made a cheatsheet that covers the core analysis steps of the whole workflow. It may help beginners who are unfamiliar with Bioconductor or with RNA-seq analysis.

50.0.2 What I have learned & Evaluation

I learned how to get biological data files from the relevant databases and how to perform basic analysis with Bioconductor packages and workflows. I also learned how to make a simple cheatsheet with PowerPoint.

The workload was large, but the range of application is relatively narrow, because RNA-seq analysis is not a common problem. In the future, I want to simplify more Bioconductor workflows. Explanations of some biological concepts could also be added to the cheatsheet so that people without prior knowledge of the field can use Bioconductor too.

File link: https://github.com/Zhao-YS/22fall_EDAV_CC/blob/main/cheetsheet_yz4406.pdf

Citations:

1. https://www.bioconductor.org/
2. https://bioconductor.org/packages/release/workflows/html/rnaseqGene.html
3. https://bioconductor.org/packages/3.16/bioc/html/DESeq2.html