
Community contributions for EDAV Fall 2019

Chapter 37 Interactive graph links

37.1 Bokeh Cheatsheet

Zihe Wang (zw2624) and Yaotian Dai (yd2512)

We created a cheat sheet for Bokeh, a Python package for interactive data visualization.

The cheat sheet is available at: https://github.com/zw2624/bokeh_cheatsheet

37.2 SandDance (video)

Mughilan Muthupari and Anjani Prasad Atluri

We have made a video tutorial on SandDance (a visualization tool by Microsoft), posted on YouTube here.

Note: The presentation used can be found in the description of the video.

37.3 OpenCPU (talk)

Matthew Mackenzie

37.3.1 What is OpenCPU?

OpenCPU is an “API for Embedded Scientific Computing.” OpenCPU consists of three main parts:

  • a server to host OpenCPU apps locally or on the cloud,
  • an HTTP API for data analysis using R, and
  • a JavaScript library to integrate everything together.

OpenCPU works as a platform to create web apps centered around using R for any needed data analysis and visualizations.
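As a rough illustration of the HTTP API part, the Python sketch below builds (without sending) the kind of POST request that would call an R function through OpenCPU. The server address `http://localhost:5656/ocpu` and the helper name `build_ocpu_call` are assumptions made for this example and do not come from the tutorial. The path layout `/library/<package>/R/<function>/json` follows the OpenCPU API documentation, so it is worth checking against the server version you actually run.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_ocpu_call(base_url, package, function, args, output="json"):
    """Build (but do not send) a POST request that calls an R function.

    base_url is the OpenCPU root, e.g. a local development server
    (assumed here to be http://localhost:5656/ocpu).
    """
    # Functions in installed packages live under /library/<pkg>/R/<fn>;
    # the trailing /json asks for the return value encoded as JSON.
    url = f"{base_url.rstrip('/')}/library/{package}/R/{function}/{output}"
    # Arguments are sent form-encoded in the request body.
    body = urlencode(args).encode("utf-8")
    return Request(url, data=body, method="POST")


# Hypothetical call: stats::rnorm(n = 5, mean = 10) on a local server.
req = build_ocpu_call(
    "http://localhost:5656/ocpu", "stats", "rnorm", {"n": 5, "mean": 10}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send the call. That step is left out so the sketch runs without a server.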

37.3.2 What is this Tutorial?

Little documentation covers how to actually build an OpenCPU app, so this tutorial pieces together the available information by working through an example project. There are four major steps involved:

  • Creating a Dysfunctional App: creating an R package to accomplish the data processing we need and the HTML for the user to interact with,
  • OpenCPU.js: connecting the HTML to the R package with the OpenCPU JavaScript library,
  • Local Development: testing the app locally, and
  • App Deployment: deploying the app to the OpenCPU Cloud.

As mentioned, this tutorial will be centered around an example… enter Distogram.

37.3.3 Distogram: A Working OpenCPU Example

I wanted to keep things relatively simple, but I think this example conveys the power of using R in the browser. Distogram is an app that prompts users to choose a sample size and a probability distribution to sample from, uses R to create a histogram based on these and other parameters, and then presents that plot in the browser. The full tutorial, as well as the running example, can be found at the links below.

  • Full Tutorial: mbmackenzie/distogram
  • Example Application: mbmackenzie.ocpu.io/distogram
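
To make the flow described above concrete, here is a minimal Python sketch of the core idea: pick a distribution and a sample size, draw the sample, and bin it into histogram counts. The real app does this in R and returns a plot. The function name `histogram_counts`, the `SAMPLERS` table, and the chosen distribution parameters are all hypothetical and are not taken from the Distogram source.

```python
import random

# Hypothetical distribution choices; parameters are fixed for simplicity.
SAMPLERS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "uniform": lambda rng: rng.uniform(0.0, 1.0),
    "exponential": lambda rng: rng.expovariate(1.0),
}


def histogram_counts(distribution, n, bins=10, seed=None):
    """Draw n values (n >= 1) and count how many fall into each bin."""
    rng = random.Random(seed)
    draw = SAMPLERS[distribution]
    sample = [draw(rng) for _ in range(n)]

    lo, hi = min(sample), max(sample)
    # Fall back to a width of 1.0 if every value is identical.
    width = (hi - lo) / bins or 1.0

    counts = [0] * bins
    for x in sample:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return lo, width, counts


# Print a text histogram for 200 draws from a standard normal.
lo, width, counts = histogram_counts("normal", 200, bins=8, seed=1)
for i, c in enumerate(counts):
    left = lo + i * width
    print(f"{left:6.2f} | {'#' * c}")
```

In the app itself, the equivalent R function would live in the package and be invoked from the browser through the OpenCPU JavaScript library, as the steps listed earlier describe.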