Recreating a graph from The Economist

New year, new me(ans, medians and modes).

Here is one of the visualisations I made while taking a course on R. It’s basic: a recreation of one of the graphs from The Economist, looking at the correlation between corruption perceptions and the Human Development Index (HDI), built with ggplot2.

The original graph is here: https://www.economist.com/graphic-detail/2011/12/02/corrosive-corruption

Set working directory

setwd("~/Data Visualization Project")

Load packages

library(tidyverse)  # attaches ggplot2, dplyr and readr, among others

Import Economist data

df <- read_csv("Economist_Assignment_Data.csv")

Preview data

head(df)

Create first layer i.e. data source

layer1 <- ggplot(df,aes(x=CPI,y=HDI,color=Region))

Create the second layer, i.e. the visuals: a scatterplot. Change the shape and size of the points (check the ggplot2 cheat sheet for possible shapes). Shape 1 is already a hollow circle, so no extra shape scale is needed.

pl <- layer1 + geom_point(shape=1,size=4)

pl

Create a trendline

pl.trend <- pl + geom_smooth(aes(group=1),method='lm',formula=y~log(x),se=FALSE,color='red')

pl.trend
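For anyone wondering what that trendline actually is, here is a minimal sketch in base R. The numbers are toy values made up purely for illustration (they are not the Economist data), and the names toy, fit and grid are hypothetical. With formula=y~log(x), geom_smooth fits an ordinary linear model of y on log(x), and aes(group=1) makes it fit one line across all points instead of one per Region.

```r
# Toy data, made up for illustration only (NOT the Economist dataset)
toy <- data.frame(
  CPI = c(1.5, 2, 3, 4, 5, 6, 7, 8, 9),
  HDI = c(0.35, 0.45, 0.58, 0.66, 0.73, 0.78, 0.83, 0.87, 0.90)
)

# Same model form that geom_smooth(method='lm', formula=y~log(x)) fits
fit <- lm(HDI ~ log(CPI), data = toy)
coef(fit)  # intercept and slope on log(CPI)

# Predicted HDI along a grid of CPI values: this is the curve that gets drawn
grid <- data.frame(CPI = seq(1.5, 9, by = 0.5))
grid$HDI_hat <- predict(fit, newdata = grid)
head(grid)
```

Because the model is linear in log(CPI), the fitted line looks curved on the plot's ordinary CPI axis, which is what gives the Economist chart its shape.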

Adding text labels to the points

pl.text <- pl.trend + geom_text(aes(label=Country))

pl.text

This labels ALL the countries, which makes the plot unreadable. I didn't write the next part of the code myself, but it labels only a selected subset of countries.

pointsToLabel <- c("Russia", "Venezuela", "Iraq", "Myanmar", "Sudan", "Afghanistan", "Congo", "Greece", "Argentina", "Brazil", "India", "Italy", "China", "South Africa", "Spain", "Botswana", "Cape Verde", "Bhutan", "Rwanda", "France", "United States", "Germany", "Britain", "Barbados", "Norway", "Japan", "New Zealand", "Singapore")

pl.country <- pl.trend + geom_text(aes(label = Country), color = "gray20", data = subset(df, Country %in% pointsToLabel),check_overlap = TRUE)

The label aesthetic and the color are clear. The data argument restricts the labels to the countries in the list, and check_overlap = TRUE skips any label that would overlap one already drawn in the same layer.
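The filtering step is easier to see on a tiny example. This is a sketch on a made-up data frame (the names toy, wanted and labelled, and the value column, are hypothetical), showing how subset() and %in% pick out rows:

```r
# Toy data frame, made up for illustration (not the real dataset)
toy <- data.frame(
  Country = c("Norway", "Chad", "Japan", "Fiji"),
  value   = 1:4,
  stringsAsFactors = FALSE
)
wanted <- c("Norway", "Japan", "Atlantis")  # "Atlantis" is not in the data

# %in% returns one TRUE/FALSE per row of toy$Country
toy$Country %in% wanted
# TRUE FALSE TRUE FALSE

# subset() keeps only the rows where that test is TRUE
labelled <- subset(toy, Country %in% wanted)
labelled$Country
# "Norway" "Japan"
```

A name in the list that doesn't appear in the data is silently ignored, so a misspelled country simply ends up without a label rather than raising an error.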

Changing the y- and x-axis labels and breaks to match The Economist's.

pl.final <- pl.country + scale_x_continuous(name='Corruption Perceptions Index, 2011 (10=least corrupt)',limits=c(1,10),breaks=1:10) + scale_y_continuous(name='Human Development Index, 2011 (1=Best)',limits=c(0.2,1),breaks=c(0,0.2,0.4,0.6,0.8,1)) + ggtitle('Corruption and Human Development') + theme_bw()

pl.final

This was fun. Many mistakes were made and the help() command is a lifesaver, but the end result was worth it. Still a few things missing, but nothing a little Photoshop can’t fix. :)
