Data Visualization and Analysis, Part 3/3 – Binge Drinking

March 28, 2016

Author: Qi Chen

1. Introduction

This is the final part of the Data Visualization and Analysis with R and Tableau series. For the earlier parts, please refer to part 1/3 and part 2/3.

Roughly 2 billion people around the world consume alcoholic drinks. Many of them are considered binge drinkers, meaning they consume 5 or more drinks on a single occasion for men, or 4 or more for women. Alcohol causes many problems: in the United States, about 88,000 people die from alcohol-related causes each year, and alcohol misuse costs $223.5 billion, 75% of which is related to binge drinking. Studying binge drinking is therefore important for the well-being of society. In this article, we analyze data from the BRFSS (Behavioral Risk Factor Surveillance System), run by the Centers for Disease Control and Prevention, to explore the relationship between binge drinking and other factors, with the aim of helping to reduce the injuries, deaths, and costs related to binge drinking.
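The binge drinking thresholds and the cost share quoted above can be written out as a short R sketch. The function name is_binge and the four respondents are made up for illustration; they are not BRFSS records or BRFSS variable names.

```r
# Flag a drinking occasion as binge drinking using the thresholds stated above:
# 5 or more drinks for men, 4 or more drinks for women.
# is_binge is a hypothetical helper, not part of any package.
is_binge <- function(drinks, sex) {
  threshold <- ifelse(sex == "male", 5, 4)
  drinks >= threshold
}

# Made-up respondents, only to exercise the rule
respondents <- data.frame(
  sex    = c("male", "male", "female", "female"),
  drinks = c(5, 4, 4, 3)
)
respondents$binge <- is_binge(respondents$drinks, respondents$sex)
print(respondents)

# 75% of the $223.5 billion cost is attributed to binge drinking
binge_cost_billion <- 0.75 * 223.5
print(binge_cost_billion)  # 167.625
```

Under these assumptions, a man with 4 drinks is not flagged while a woman with 4 drinks is, and the binge-related share of the cost comes to about $167.6 billion.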

2. Analysis

2.1. Software used

In this article, we use Tableau for data visualization and R for data analysis.

2.2. Data Visualization with Tableau

Graph 1 shows the proportion of people who drink alcohol and the proportion considered binge drinkers between 2002 and 2012. Both proportions increase slightly over the 10 years, and roughly 50% of people are alcohol consumers.

Graph 1: Binge vs. Any Drinking

We are also interested in alcohol consumption across counties in the United States, which is shown in Graph 2. Red means high alcohol consumption and green means low consumption. From this figure, we see that areas in the northern Midwest (for example, Wisconsin) have relatively high alcohol consumption.

Graph 2: Alcohol Consumption in Counties in U.S.

Similarly, Graph 3 plots the rate of alcohol-related deaths for each state. Again, a color closer to red means a higher rate. Some states, such as Wisconsin, North Dakota and Montana, show relatively high rates. It might therefore be a good idea to adopt stricter laws against drunk driving and other dangerous behaviors related to binge drinking in those areas.

Graph 3: States of Alcohol Impairment

2.3. Data Analysis with R

In this part, we demonstrate data analysis techniques using R. First, we create a facet plot, shown in Graph 4, to show the correlations among several variables: sleptim1 (sleep time), marital (marital status), avedrnk2 (average drinks per day), x.age80 (age), x.rfsmok3 (smoker or not), and x.rfbing5 (binge drinking).

Graph 4: Facet Plot
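A plot of this kind can be sketched with ggpairs() from the GGally package, which appears in the sources listed at the end. This is only a sketch under assumptions: the article does not show its plotting code, and the data frame below is randomly generated stand-in data that borrows the BRFSS column names, so the resulting panels will not match Graph 4.

```r
library(ggplot2)
library(GGally)

# Randomly generated stand-in data (NOT BRFSS records); only the column
# names follow the variables discussed in the text.
set.seed(1)
n <- 200
survey <- data.frame(
  sex       = factor(sample(c("male", "female"), n, replace = TRUE),
                     levels = c("male", "female")),
  sleptim1  = pmin(pmax(round(rnorm(n, mean = 7, sd = 1.5)), 1), 24),
  marital   = factor(sample(c("married", "single"), n, replace = TRUE)),
  avedrnk2  = rpois(n, lambda = 2) + 1,
  x.age80   = sample(18:80, n, replace = TRUE),
  x.rfsmok3 = factor(sample(c("no", "yes"), n, replace = TRUE)),
  x.rfbing5 = factor(sample(c("no", "yes"), n, replace = TRUE))
)

# Pairwise plot matrix of columns 2 to 7, colored by sex
p <- ggpairs(
  survey,
  mapping = aes(colour = sex),
  columns = 2:7
)
print(p)
```

ggpairs() picks a panel type for each pair of variables depending on whether they are numeric or categorical, which is why a single call can cover both kinds of columns.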

In this graph, red is for male and blue is for female. Interestingly, smokers and binge drinkers make up a larger proportion of single people than of married people. We can also see that 50% of smokers binge drink, but less than 30% of nonsmokers do, which indicates a strong association between smoking and binge drinking.
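The comparison between smokers and nonsmokers is a row-wise proportion in a two-way table. The base R sketch below uses made-up counts chosen only to mirror the percentages quoted above (50% and a value under 30%); they are not the survey counts.

```r
# Made-up counts for illustration, NOT the BRFSS counts.
# Rows: smoker status; columns: binge drinking status.
counts <- matrix(
  c(50,  50,
    56, 144),
  nrow = 2, byrow = TRUE,
  dimnames = list(smoker = c("yes", "no"), binge = c("yes", "no"))
)

# Proportion of binge drinkers within each smoking group (rows sum to 1)
binge_rate <- prop.table(counts, margin = 1)[, "yes"]
print(binge_rate)
```

With a real data frame, the same table could be built with table() on the smoking and binge drinking columns before calling prop.table().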

With the R package “party”, we create a conditional inference tree that takes sex (gender), x.rfsmok3 (smoking status), x.rfbmi5 (obesity) and marital (marital status) into account to estimate the conditional probability of binge drinking in each category. Readers interested in conditional inference trees can refer to this link for details.

Graph 5: Conditional Inference Tree

For example, this figure shows that the probability that a single male smoker binge drinks is 52.9%.
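A tree of this kind can be fitted with ctree() from the party package. The sketch below is not the article's code: it runs on randomly generated stand-in data with an invented relationship between the predictors and binge drinking, so its splits and probabilities will differ from Graph 5 and from the 52.9% figure.

```r
library(party)

# Randomly generated stand-in data (NOT BRFSS records). The binge
# probabilities below are invented purely so the tree has something to split on.
set.seed(42)
n <- 1000
survey <- data.frame(
  sex       = factor(sample(c("male", "female"), n, replace = TRUE)),
  x.rfsmok3 = factor(sample(c("no", "yes"), n, replace = TRUE)),
  x.rfbmi5  = factor(sample(c("no", "yes"), n, replace = TRUE)),
  marital   = factor(sample(c("married", "single"), n, replace = TRUE))
)
p_binge <- 0.15 +
  0.15 * (survey$sex == "male") +
  0.20 * (survey$x.rfsmok3 == "yes") +
  0.10 * (survey$marital == "single")
survey$x.rfbing5 <- factor(ifelse(runif(n) < p_binge, "yes", "no"))

# Fit the conditional inference tree with binge drinking as the response
tree <- ctree(x.rfbing5 ~ sex + x.rfsmok3 + x.rfbmi5 + marital, data = survey)
plot(tree)

# Estimated class probabilities for one hypothetical person:
# a single male smoker who is not obese
new_person <- data.frame(
  sex       = factor("male",   levels = levels(survey$sex)),
  x.rfsmok3 = factor("yes",    levels = levels(survey$x.rfsmok3)),
  x.rfbmi5  = factor("no",     levels = levels(survey$x.rfbmi5)),
  marital   = factor("single", levels = levels(survey$marital))
)
print(predict(tree, newdata = new_person, type = "prob"))
```

Each terminal node of the plotted tree reports the share of binge drinkers among the respondents that fall into it, which is how a per-category probability can be read off the figure.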

2.4. Results and Summary

In this article, we demonstrated the use of Tableau and R for data visualization and analysis of binge drinking.

This article is based on a course project for Industrial Data Analytics, offered by Prof. Kaibo Liu at the University of Wisconsin-Madison in Spring 2015. We thank Prof. Liu for his instruction, and Criss Ross, Corey Lester and Wyatt Suprise for their initial work.

3. Source

http://www.cdc.gov/brfss/annual_data/annual_2014.html
http://www.cdc.gov/alcohol/data-stats.htm
https://tgmstat.wordpress.com/2013/11/13/plot-matrix-with-the-r-package-ggally/
https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf
http://ggobi.github.io/ggally/gh-pages/ggpairs.html
http://www.r-bloggers.com/regression-on-categorical-variables/
http://www.r-bloggers.com/classification-tree-models/
http://www.r-bloggers.com/example-9-17-much-better-pairs-plots/
