Chapter 30 Wordcloud

Chengyou Ju and Yujie Wang

This Rmd file is created by Chengyou Ju (UNI: cj2624) and Yujie Wang (UNI: yw3442) for STAT GR5702 Community Contribution Group 15.

In this file, we will provide a tutorial on how to draw Wordcloud graphs using the wordcloud2 package in R.

The dataset in this project is demoFreq, which ships with the wordcloud2 package.

30.1 1. Introduction

A Wordcloud is a visual representation of text data. Wordclouds are useful for quickly perceiving the most prominent terms, which makes them widely used in media and well understood by the public.

A Wordcloud is a collection of words drawn in different sizes. The bigger and bolder a word appears, the more frequently it occurs in the given text and the more important it is taken to be.

Two R packages can help us draw a wordcloud. wordcloud is the basic package for building the graph, while wordcloud2 allows more customization. In our demo, we will focus on the wordcloud2 package.

30.2 2. Demo of wordcloud2 Package

For our demo, we will use the built-in dataset demoFreq, which has 1011 observations of 2 variables, word and freq.
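The chunk that loads the data is not shown in the rendered page. A minimal sketch of what it might look like follows. The installation lines are assumptions and are commented out, so the chunk only loads the package and inspects the data frame.

```r
# Install once; either line should work (assumption: pick one).
# install.packages("wordcloud2")                   # CRAN release
# remotes::install_github("Lchiffon/wordcloud2")   # development version from GitHub

library(wordcloud2)

# demoFreq ships with the package: one row per word, with its frequency.
str(demoFreq)
head(demoFreq)
```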

##          word freq
## oil       oil   85
## said     said   73
## prices prices   48
## opec     opec   42
## mln       mln   31
## the       the   26

Parameters for wordcloud2, from Rdocumentation:
data - data frame with the word and the frequency of the word
size - Font size, default is 1. The larger size means the bigger word
fontFamily - font used in the word cloud
fontWeight - Font weight to use, e.g. normal, bold or 600
color - color of the text, keyword ’random-dark’ and ’random-light’ can be used.
backgroundColor - Color of the background
minRotation - If the word should rotate, the minimum rotation (in rad) the text should rotate.
maxRotation - If the word should rotate, the maximum rotation (in rad) the text should rotate.
shuffle - Shuffle the points to draw so the result will be different each time for the same list and settings.
rotateRatio - Probability for the word to rotate. Set the number to 1 to always rotate.
shape - The shape of the “cloud” to draw. Can be a preset keyword: ‘circle’ (default), ‘cardioid’, ‘diamond’, ‘triangle-forward’, ‘triangle’, ‘pentagon’, or ‘star’.
widgetsize - size of the widgets
figPath - The path to a figure used as a mask.
hoverFunction - Callback to call when the cursor enters or leaves a region occupied by a word.
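
To make the list above more concrete, here is a hedged sketch of a single call that sets a few of these parameters explicitly. The values are illustrative assumptions, not recommendations, and the variable name wc_params is hypothetical.

```r
library(wordcloud2)

# Illustrative values only; each argument is described in the list above.
wc_params <- wordcloud2(
  data = demoFreq,
  size = 1,              # the default scale
  fontFamily = "serif",  # assumption: a generic font family the browser can resolve
  fontWeight = "normal", # lighter than bold
  shuffle = FALSE,       # do not shuffle the points to draw
  rotateRatio = 0.4      # each word is rotated with probability 0.4
)
wc_params
```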

30.2.1 2.0 Basic Wordcloud Graph

Building a wordcloud graph is simple. Once the wordcloud2 package is installed and loaded, we can call the wordcloud2 function directly on the data frame.

As we can see, the word cloud is easy to build and to read. Words with a high frequency, such as ‘said’ and ‘oil’, are displayed in a large font.
The plot is also interactive: if we hover over a word, it shows the word together with its frequency.
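
The call that draws this basic plot is not shown in the rendered page; a minimal sketch, assuming only the defaults are used, could be:

```r
library(wordcloud2)

# Basic wordcloud: only the data frame is supplied, everything else is default.
wc_basic <- wordcloud2(data = demoFreq)
wc_basic  # printing the widget renders it in the viewer or the knitted page
```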

30.2.2 2.1 Font Size

We can also modify the font size of the graph with the “size” argument. The default is 1, and a larger value gives bigger words.
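
As a sketch of how the font size might be varied (the two values are arbitrary choices for illustration):

```r
library(wordcloud2)

# A smaller and a larger scale than the default size = 1.
wc_small <- wordcloud2(demoFreq, size = 0.5)
wc_large <- wordcloud2(demoFreq, size = 2)

wc_small
wc_large
```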

30.2.3 2.2 Color and Background Color

The word color can be changed using the “color” argument, while the background color can be changed with “backgroundColor”.
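
A hedged sketch of both arguments follows. The specific colors are assumptions chosen for illustration; “color” accepts either one of the keywords or a vector of colors.

```r
library(wordcloud2)

# Keyword palette on a dark background.
wc_light <- wordcloud2(demoFreq, color = "random-light", backgroundColor = "black")

# One explicit color per word: alternate green and blue down the data frame.
word_colors <- rep_len(c("green", "blue"), nrow(demoFreq))
wc_custom <- wordcloud2(demoFreq, color = word_colors)

wc_light
wc_custom
```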

30.2.4 2.3 Shape

We can also customize the shape of a wordcloud using the “shape” argument. Here are some examples.
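
The examples themselves are not shown in the rendered page; a minimal sketch using two of the preset shape keywords might look like this:

```r
library(wordcloud2)

# Preset shapes are selected by keyword.
wc_star <- wordcloud2(demoFreq, size = 0.7, shape = "star")
wc_pentagon <- wordcloud2(demoFreq, size = 0.7, shape = "pentagon")

wc_star
wc_pentagon
```

The smaller size here is an assumption; it tends to help the outline of the shape stay visible.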

30.2.5 2.4 Rotation

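We can also rotate the words in the wordcloud graph. The “minRotation” and “maxRotation” arguments bound the rotation angle (in radians), and “rotateRatio” sets the probability that a word is rotated.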
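
As an illustration, the sketch below tilts every word by the same angle. The angle of pi/6 is an arbitrary assumption.

```r
library(wordcloud2)

# Setting min and max to the same value fixes the angle;
# rotateRatio = 1 makes every word rotate.
wc_rotated <- wordcloud2(
  demoFreq,
  size = 1.5,
  minRotation = -pi / 6,
  maxRotation = -pi / 6,
  rotateRatio = 1
)
wc_rotated
```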

30.2.6 2.5 Language

We can draw a wordcloud graph of words in Chinese.
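
A minimal sketch, assuming the Chinese demo dataset demoFreqC that ships with wordcloud2 is used. The font name is an assumption: it has to be installed on the machine that displays the plot, otherwise the browser falls back to another font.

```r
library(wordcloud2)

# demoFreqC has the same layout as demoFreq, with Chinese words.
head(demoFreqC)

wc_chinese <- wordcloud2(
  demoFreqC,
  size = 2,
  fontFamily = "微软雅黑",  # Microsoft YaHei; assumed to be available
  color = "random-light",
  backgroundColor = "grey"
)
wc_chinese
```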

30.2.7 2.6 Customized shape

We can build a wordcloud in the shape of a word or letter using the letterCloud function.
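
A hedged sketch of such a call, assuming the letter “R” as the target shape:

```r
library(wordcloud2)

# letterCloud takes the same data frame plus the word to use as the shape;
# further arguments such as size are passed on to wordcloud2.
wc_letter <- letterCloud(demoFreq, word = "R", size = 2)
wc_letter
```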

[Figure: R]

We can also create a user-defined shape for the wordcloud by passing the path of an image we choose to the “figPath” argument.
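
The sketch below shows how a mask image might be supplied. The file name batman.png is a hypothetical path, so the call is guarded and only runs when the file exists. It is also an assumption that the mask should be a dark silhouette on a light background.

```r
library(wordcloud2)

# Hypothetical location of the mask image; replace with your own file.
mask_path <- "batman.png"

if (file.exists(mask_path)) {
  # Words are placed inside the figure given by the mask.
  wc_mask <- wordcloud2(demoFreq, figPath = mask_path, size = 1.5, color = "black")
  wc_mask
} else {
  message("Mask image not found: ", mask_path)
}
```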

[Figure: batman]