-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathwealth-and-environmental-stewardship.Rmd
523 lines (435 loc) · 36.8 KB
/
wealth-and-environmental-stewardship.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
---
title: "Is it true that _as people get richer, they care and do more for the environment?_"
author: "Felipe Valencia"
date: "`r format(Sys.time(), '%Y-%m-%d')`"
output:
html_document:
keep_md: true
toc: true
toc_float: true
code_folding: hide
fig_height: 6
fig_width: 12
fig_align: 'center'
---
```{r, echo=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
```
```{r load_libraries, include=FALSE}
# Load Libraries
library(tidyverse)
library(downloader)
library(readxl)
library(foreign)
library(stringr)
library(vtable)
library(ggrepel)
library(scales)
#Sys.setlocale("LC_TIME", "C")
```
```{r load_data}
# Import data
############# Survey of owners of BEVs in California #############
# Read xlsx
# Temporary file
temp_excel <- tempfile()
# Download using downloader package
download("https://datadryad.org/stash/downloads/file_stream/153034",
dest = temp_excel, # Save temp file
mode = "wb")
# Read xlxs from temp file
BEV_FCV <- read_excel(temp_excel)
############# GDP per capita (current US$) #############
# Download zip using downloader package
# Temporary file
temp_zip <- tempfile()
# Temporary directory
temp_dir <- tempdir()
# Download using downloader library
download("https://api.worldbank.org/v2/en/indicator/NY.GDP.PCAP.CD?downloadformat=csv",
destfile = temp_zip, # Save temp file
mode = "wb")
# Unzip temporary file in temporary directory
unzip(temp_zip, exdir = temp_dir)
# Read csv file using foreign package
GDP <- read_csv(paste(temp_dir, "API_NY.GDP.PCAP.CD_DS2_en_csv_v2_5358417.csv", sep = "/"), skip = 3)
############# CO2 emissions (metric tons per capita) #############
# Download zip using downloader package
# Temporary file
temp_zip <- tempfile()
# Temporary directory
temp_dir <- tempdir()
# Download using downloader library
download("https://api.worldbank.org/v2/en/indicator/EN.ATM.CO2E.PC?downloadformat=csv",
destfile = temp_zip, # Save temp file
mode = "wb")
# Unzip temporary file in temporary directory
unzip(temp_zip, exdir = temp_dir)
# Read csv file using foreign package
CO2_emissions <- read_csv(paste(temp_dir, "API_EN.ATM.CO2E.PC_DS2_en_csv_v2_5358914.csv", sep = "/"), skip = 3)
############# ISO 3166 Countries with Regional Codes #############
# Read csv of the ISO 3166 Countries with Regional Codes (taken from Luke Duncalfe's GitHub reop)
countries_ISO3166 <- read.csv("https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv?raw=true")
```
## Executive Summary
In this project, we tried to discover if people _CARE_ more about the environment and _DO_ something for sustainability as they get richer. To gauge whether they _CARE_, we explored data from a survey of thousands of battery electric vehicle (BEV) owners in the US state of California. They are already helping the environment by owning an electric vehicle but we found that BEV owners with higher incomes _CARE_ about reducing greenhouse gas emissions, but they _CARE_ less than lower-income BEV owners. To assess whether people _DO_ something for environmental sustainability as they got richer, we explored the relationship between _GDP per capita_ and _CO2 emissions (metric tons per capita)_ over the years with data available from every country in the world. We found that when people get richer, their CO2 emissions decrease.
## CARE: Owners of BEVs in California
This data set contains [Sociodemographic data for battery electric vehicle owning households in California (From NCST Project "Understanding the Early Adopters of Fuel Cell Vehicles")](https://doi.org/10.25338/B8P313). We are only going to focus on the: _Information on vehicle owned_, _Household Income_, _Highest Level of Education_, _Age_, _Gender_, _Number of vehicles in the household_, and a scale of the _importance of reducing Greenhouse Gas Emissions (GGE)_. It contains around 906 Fuel-Cell Vehicles (FCV) respondents data, but we ignore them to focus on the results from Battery Electric Vehicle (BEV) respondents. We assumed that those that own a BEV are _"environmentally friendly"_, and we are going to see if their wealth affects their perception of care for the environment.
### Data Wrangling
Here's the data set we have cleaned and prepared for our analysis:
```{r tidy_data}
# Clean & wrangle data
BEV <- BEV_FCV %>%
filter(!is.na(`submitdate. Date submitted`)) %>%
rename(Household_Income = `Household Income`,
ID = `id. Response ID`,
importance_reduce_GGE = `Importance of reducing greenhouse gas emissions (-3 not important, 3 important)`,
edu_level = `Highest Level of Education`,
number_vehicles_H = `Number of vehicles in the household`,
gender = `Gender (Male 1)`) %>%
mutate(Household_Income_Thousands = Household_Income / 1000,
edu_level = str_replace_all(as.character(edu_level), c("1" = "Some High School", "2" = "High School Graduate", "3" = "College Graduate", "4" = "Masters, Doctorate, \nor Professional Degree")),
gender = str_replace_all(as.character(gender), c("1" = "Male", "0" = "Female"))) %>%
select(ID, Carmain, Household_Income_Thousands, importance_reduce_GGE, edu_level, number_vehicles_H, gender, Age) %>%
filter(grepl('PHASE', ID))
# Turn Household income in thousands, gender and edu_level to factors
BEV$Household_Income_Thousands <- as.factor(BEV$Household_Income_Thousands)
BEV$gender <- as.factor(BEV$gender)
BEV$edu_level <- as.factor(BEV$edu_level)
# Custom order for the edu_level factor
BEV$edu_level <- factor(BEV$edu_level, levels = c("Some High School", "High School Graduate", "College Graduate", "Masters, Doctorate, \nor Professional Degree"))
BEV
```
### Summary Statistics
```{r stats1}
# Get Summary Statistics
BEV %>% st(title = "Owners of BEVs - socioeconomic variables and data on attitudes towards sustainability")
```
### Data Visualization and Discussion
```{r plot_data1}
# Plot & visualize data
BEV %>% ggplot(aes(x = importance_reduce_GGE)) +
geom_histogram(binwidth = 0.35, color = "green4", fill = "gray80") +
theme_classic() +
scale_x_continuous(breaks = c(-3, -2, -1, 0, 1, 2, 3),
expand = expansion(mult = c(0, 0))) +
scale_y_continuous(expand = expansion(mult = c(0, 0))) +
theme(panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif"),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif")) +
labs(title = "Distribution of attitudes toward sustainability", subtitle = "Owners of BEVs in California were asked about how important is reducing greenhouse gas emissions. \nMeasured with a continuous scale from -3= “Not important” to 3= “Important”", y = "Count", x = "Not important Important", caption = "Source: National Center for Sustainable Transportation\nhttps://doi.org/10.25338/B8P313")
```
Going back to the summary statistics table, the mean of this scale is around `1.7`, which tells us there's an attitude of care for the sustainability of the environment from owners of BEVs in California. With the histogram, we can see the majority of the respondents consider _"Important"_ to a certain extent reducing greenhouse gas emissions.
According to the summary statistics, in terms of wealth, the majority of owners of BEVs in California have an income between _$75k_ and _$225k_, but we have enough data for incomes greater than _$225k_ to see if on average people with higher income care more about the environment than those with lower incomes among those that we consider _"environmentally friendly"_ due to owning a BEV.
```{r plot_data2}
# Plot & visualize data
# Turn Household income in thousands factor to a numeric data type
BEV$Household_Income_Thousands <- as.numeric(as.character(BEV$Household_Income_Thousands))
BEV %>%
filter(!is.na(Household_Income_Thousands)) %>%
filter(!is.na(edu_level)) %>%
ggplot(aes(y = importance_reduce_GGE, x = Household_Income_Thousands)) +
geom_smooth(color = "green4") +
theme_classic() +
#coord_cartesian(ylim = c(-3, 3), xlim = c(50, 500))
scale_x_continuous(breaks = c(50, 75, 125, 175, 225, 275, 325, 375, 425, 475, 500),
expand = expansion(mult = c(0, 0))) +
theme(panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif", hjust = c(0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1)),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif")) +
labs(title = "Correlation between household income and attitude toward environmental sustainability", subtitle = "Owners of BEVs in California were asked about how important is reducing greenhouse gas emissions. \nMeasured with a continuous scale from -3= “Not important” to 3= “Important”", y = "Mean importance of reducing GGE", x = "Income in thousands of dollars", caption = "Source: National Center for Sustainable Transportation\nhttps://doi.org/10.25338/B8P313")
```
Although the mean is between `1.5` and `1.8`, which leans towards the _“Important”_ attitude, we see that, on average, more affluent owners of BEVs in California consider it less important than those with lower incomes.
```{r plot_data3}
# Plot & visualize data
BEV %>%
filter(!is.na(Household_Income_Thousands)) %>%
filter(!is.na(edu_level)) %>%
# Remove "Some High School" observations because they were too small N=43 to be significant
filter(edu_level != "Some High School") %>%
ggplot(aes(y = importance_reduce_GGE, x = Household_Income_Thousands)) +
geom_smooth(color = "green4") +
facet_wrap(~edu_level, nrow = 1) +
theme_classic() +
scale_x_continuous(breaks = c(50, 75, 125, 175, 225, 275, 325, 375, 425, 475, 500)) +
theme(panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif", angle = 90, hjust = 1, vjust = 0.5),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "Correlation between household income and attitude toward environmental sustainability\nby the highest level of education", subtitle = "Owners of BEVs in California were asked about how important is reducing greenhouse gas emissions. \nMeasured with a continuous scale from -3= “Not important” to 3= “Important”", y = "Mean importance of reducing GGE", x = "Income in thousands of dollars", caption = "Source: National Center for Sustainable Transportation\nhttps://doi.org/10.25338/B8P313")
```
First, it's relevant to say that _"Some High School"_ observations were removed because they were too small `N = 43` to be significant for the analysis.
We recognized several patterns from the correlation between household income and the attitude towards environmental sustainability segmented by the highest educational level for the owners of BEVs in California. We can see that people with a higher level of education have, on average, a higher _"Important"_ score for the perception of reducing greenhouse gas emissions than those with a lower education level. Also, we discovered that more affluent people consider it less important than those with lower incomes when segmented by education level; However, we cannot be sure for the _High School Graduates_ group because the confidence interval is too large at the highest incomes. One of the reasons we might have this wide confidence interval for the perception of reducing GGE of the _High School Graduate_ group with the highest incomes is that we do not have enough observations for this group with these incomes.
An important question to move forward in the analysis is why the wealthier owners of BEVs in California seem to care less about GGE than those at the same educational level but with less affluence, is there something they know that less affluent people in their educational level group don't know about reducing GGE, or is it something that has to do with the psyche of the wealthiest?
Since this wasn't a longitudinal study, we cannot conclude that as people get richer they CARE more or less about the environment, but we do see that owners of BEVs in California with higher incomes, (that are doing something good for the environment by using battery electric vehicles instead of fuel-based ones) on average, CARE about reducing GGE, but they CARE less than less affluent owners.
to further the analysis it will be interesting to find data from owners of fuel-based vehicles, to compare if there are differences with BEV owners in their perspectives and their incomes.
## DO: GDP and CO2 Emissions
The _GDP per capita_ and _CO2 emissions (metric tons per capita)_ data sets allowed us to see the correlation between people's wealth and CO2 emissions. This helped us analyze whether as people get richer they do more for the environment.
### Data Wrangling
Here are the data sets we have cleaned and prepared for our analysis:
```{r tidy_data2}
# Clean & wrangle data
# GDP per Capita tidy
GDP_long <- GDP %>%
select(-`Indicator Name`, -`Indicator Code`) %>%
pivot_longer(!`Country Name`:`Country Code`, names_to = "year", values_to = "GDP_per_cap", values_drop_na = TRUE)
# CO2 Emissions tidy
CO2emissions_long <- CO2_emissions %>%
select(-`Indicator Name`, -`Indicator Code`) %>%
pivot_longer(!`Country Name`:`Country Code`, names_to = "year", values_to = "CO2_m_tons_per_cap", values_drop_na = TRUE)
# Join between GDP per Capita and CO2 Emissions tidy
GDP_CO2Emissions <- inner_join(GDP_long, CO2emissions_long, by = c("Country Name", "Country Code", "year")) %>%
mutate(year = as.double(year))
GDP_CO2Emissions
# Get country regions using ISO3166
code_region <- countries_ISO3166 %>%
select(alpha.3, region)
# Join GDP per Capita and CO2 Emissions with the country regions
GDP_CO2Emissions_Regions <- right_join(code_region, GDP_CO2Emissions, by = c("alpha.3" = "Country Code")) %>%
filter(region != is.na(region)) %>% tibble()
GDP_CO2Emissions_Regions
# Compute means for each region
Totals_GDP_CO2Emissions_Regions <- GDP_CO2Emissions_Regions %>%
group_by(region, year) %>%
summarise(mean_GDP_per_cap = mean(GDP_per_cap),
mean_CO2_m_tons_per_cap = mean(CO2_m_tons_per_cap))
Totals_GDP_CO2Emissions_Regions
```
### Summary Statistics
```{r stats2}
# Get Summary Statistics
GDP_CO2Emissions_Regions %>% st(title = "All years' data for each country with the region, GDP per capita, and CO2 emissions (metric tons per capita)")
```
```{r stats3}
# Get Summary Statistics
Totals_GDP_CO2Emissions_Regions %>% st(title = "All years' data for each region, mean GDP per capita and mean CO2 emissions (metric tons per capita)")
```
### Data Visualization and Discussion
```{r plot_data5.5}
# Plot & visualize data
Totals_GDP_CO2Emissions_Regions %>%
ggplot(aes(x = year)) +
geom_line(aes(y = mean_GDP_per_cap), color = "green4", size = 1) +
geom_line(aes(y = mean_CO2_m_tons_per_cap * 1500), color = "red4", size = 1) +
facet_wrap(~region, nrow = 1) +
scale_y_continuous(name = "Mean GDP per capita (current US$)",
breaks = c(0, 10000, 20000, 30000, 40000, 50000, 60000),
label = comma,
sec.axis = sec_axis(~.x /1500, name = "Mean CO2 emissions (metric tons per capita)")) +
theme_classic() +
scale_x_continuous(breaks = c(1990, 2000, 2010, 2019)) +
theme(axis.title.y = element_text(color = "green4"),
axis.title.y.right = element_text(color = "red4"),
panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif", angle = 90, hjust = 1, vjust = 0.5),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "Evolution of GDP per capita and CO2 emissions in metric tons per capita (5 regions)", subtitle = "Only Europe seems to follow the pattern that as people get richer their CO2 emissions decrease", x = "Year", caption = "Source: World Bank CC BY-4.0")
```
Looking at the evolution of both metrics, we can determine that Europe follows the trend and we could say that __as people in Europe get richer, their CO2 emissions decrease__; however, this does not explain why, but it is a phenomenon that we observe. This pattern in Europe is in line with the initial hypothesis of this project.
On the contrary, other large regions of the world do not follow this trend, but we cannot deny the hypothesis because none of the other regions has reached the GDP per capita growth of Europe, in the future when they reach a similar GDP per capita, we can compare them and determine whether or not this trend occurs when a region reaches a certain level of wealth.
To deepen the analysis and take into consideration the fact that the regions have a wide range of rich and poor countries that will move the averages, we use smaller regions from the World Bank which could helped us see if the phenomenon was also present in smaller groups.
```{r plot_data4}
# Plot & visualize data
GDP_CO2Emissions %>%
filter(`Country Name` == "North America" | `Country Name` == "Latin America & Caribbean" | `Country Name` == "Europe & Central Asia" | `Country Name` == "East Asia & Pacific" | `Country Name` == "South Asia" | `Country Name` == "Middle East & North Africa" | `Country Name` == "Sub-Saharan Africa") %>%
ggplot(aes(x = year)) +
geom_line(aes(y = GDP_per_cap), color = "green4", size = 1) +
geom_line(aes(y = CO2_m_tons_per_cap * 1500), color = "red4", size = 1) +
facet_wrap(~`Country Name`, nrow = 1, labeller = label_wrap_gen(width = 15, multi_line = TRUE)) +
scale_y_continuous(name = "GDP per capita (current US$)",
breaks = c(0, 10000, 20000, 30000, 40000, 50000, 60000),
label = comma,
sec.axis = sec_axis(~.x /1500, name = "CO2 emissions (metric tons per capita)")) +
theme_classic() +
scale_x_continuous(breaks = c(1990, 2000, 2010, 2019)) +
theme(axis.title.y = element_text(color = "green4"),
axis.title.y.right = element_text(color = "red4"),
panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif", angle = 90, hjust = 1, vjust = 0.5),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "Evolution of GDP per capita and CO2 emissions in metric tons per capita (7 regions)", subtitle = "The richest regions in terms of GDP per capita, while having higher CO2 emissions in metric tons per capita, have\nexperienced a decline in CO2 emissions per capita since 1990.", x = "Year", caption = "Source: World Bank CC BY-4.0")
```
As expected, by dividing the large regions into smaller regions, we found that North America was hit hard when grouped with _Latin America & Caribbean_. We saw that it has a higher GDP per capita than Europe and also supports the claim that _when people become richer their CO2 emissions are reduced_, however, the CO2 emissions in metric tons per capita are almost double that of Europe. Additionally, we can see that _Latin America & Caribbean_ have remained constant, but it tends to seem as if it's going down, we'll need to follow up on the evolution of _CO2 emissions in metric tons per capita_ in this region closer, this to be more conclusive with our analysis.
Fortunately, the World Bank classifies countries by income as well, so we explored how _GDP per capita_ and _CO2 emissions in metric tons per capita_ behave under such categories.
```{r plot_data5}
# Plot & visualize data
GDP_CO2Emissions$`Country Name` <- as.factor(GDP_CO2Emissions$`Country Name`)
GDP_CO2Emissions$`Country Name` <- factor(GDP_CO2Emissions$`Country Name`, levels = c("Low income", "Low & middle income", "Lower middle income", "Middle income", "Upper middle income", "High income"))
GDP_CO2Emissions %>%
filter(`Country Name` == "High income" | `Country Name` == "Upper middle income" | `Country Name` == "Middle income" | `Country Name` == "Lower middle income" | `Country Name` == "Low & middle income" | `Country Name` == "Low income") %>%
ggplot(aes(x = year)) +
geom_line(aes(y = GDP_per_cap), color = "green4", size = 1) +
geom_line(aes(y = CO2_m_tons_per_cap * 1500), color = "red4", size = 1) +
facet_wrap(~`Country Name`, nrow = 1, labeller = label_wrap_gen(width = 15, multi_line = TRUE)) +
scale_y_continuous(name = "GDP per capita (current US$)",
breaks = c(0, 10000, 20000, 30000, 40000),
label = comma,
sec.axis = sec_axis(~.x /1500, name = "CO2 emissions (metric tons per capita)")) +
theme_classic() +
scale_x_continuous(breaks = c(1990, 2000, 2010, 2019)) +
theme(axis.title.y = element_text(color = "green4"),
axis.title.y.right = element_text(color = "red4"),
panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif", angle = 90, hjust = 1, vjust = 0.5),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "Evolution of GDP per capita and CO2 emissions in metric tons per capita (Income level)", subtitle = "High-income countries as they get richer in terms of GDP per capita experienced a reduction in CO2 emissions in\nmetric tons per capita.", x = "Year", caption = "Source: World Bank CC BY-4.0")
```
We can see that high-income countries follow the hypothesis, so we will expect that when other countries move to the high-income group, they'll experience a similar reduction in CO2 emissions in metric tons per capita.
```{r plot_data5.2}
# Plot & visualize data
# labels
first_and_last <- GDP_CO2Emissions %>%
filter(`Country Name` == "High income", year == 1990 | year == 2019) %>%
mutate(GDP_per_cap = round(GDP_per_cap, 0),
CO2_m_tons_per_cap = round(CO2_m_tons_per_cap, 1))
GDP_CO2Emissions %>%
filter(`Country Name` == "High income") %>%
ggplot(aes(x = year)) +
geom_line(aes(y = GDP_per_cap), color = "green4", size = 1) +
geom_line(aes(y = CO2_m_tons_per_cap * 1500), color = "red4", size = 1) +
geom_text(data = first_and_last, aes(x = year, y = GDP_per_cap, label = comma(GDP_per_cap)), nudge_y = 2000, color = "green4", size = 5, family = "serif") +
geom_point(data = first_and_last, aes(x = year, y = GDP_per_cap), color = "green4") +
geom_text(data = first_and_last, aes(x = year, y = CO2_m_tons_per_cap * 1500, label = CO2_m_tons_per_cap), nudge_y = -2000, color = "red4", size = 5, family = "serif") +
geom_point(data = first_and_last, aes(x = year, y = CO2_m_tons_per_cap * 1500), color = "red4") +
scale_y_continuous(name = "GDP per capita (current US$)",
breaks = c(0, 10000, 20000, 30000, 40000),
label = comma,
sec.axis = sec_axis(~.x /1500, name = "CO2 emissions (metric tons per capita)")) +
coord_cartesian(ylim = c(0, 46500)) +
theme_classic() +
scale_x_continuous(breaks = c(1990, 2000, 2010, 2019)) +
theme(axis.title.y = element_text(color = "green4"),
axis.title.y.right = element_text(color = "red4"),
panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif"),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif")) +
labs(title = "When high-income countries get richer there is a reduction in CO2 emissions per capita", subtitle = "More than double GDP per capita was required over 30 years to achieve a reduction of just over half a metric ton per\ncapita of CO2", x = "Year", caption = "Source: World Bank CC BY-4.0")
```
By looking at the first and last measurements, we noticed more than double _GDP per capita_ was increased over 30 years while the reduction of just over half a _metric ton per capita of CO2_ took the same amount of time.
```{r plot_data5.9}
# Plot & visualize data
GDP_CO2Emissions_Regions %>%
filter(year == 1990 | year == 2000 | year == 2010 | year == 2019, region != is.na(GDP_CO2Emissions_Regions$region)) %>%
ggplot(aes(x = GDP_per_cap, y = CO2_m_tons_per_cap)) +
geom_smooth(color = "grey30") +
facet_wrap(~year, nrow = 1) +
geom_point(aes(color = region)) +
theme_classic() +
scale_x_continuous(breaks = c(0, 75000, 150000),
label = comma) +
theme(panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif"),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
legend.title = element_text(color = "gray10", size = 14, family = "serif", face = "bold"),
legend.text = element_text(color = "gray30", size = 14, family = "serif"),
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "When people get richer, their CO2 emissions in metric tons decrease", subtitle = "Europeans are leading the race to be rich and reduce CO2 emissions in metric tons per capita", y = "CO2 emissions (metric tons per capita)", x = "GDP per capita (current US$)", caption = "Source: World Bank CC BY-4.0") +
guides(col = guide_legend(title = "Region"))
```
Overall, the trend the world's individuals have had in the past 30 years is a decrease in _CO2 emissions (metric tons per capita)_ as they got richer.
```{r plot_data6.5, fig.height = 8}
# Plot & visualize data
GDP_CO2Emissions_Regions %>%
filter(year == 1990 | year == 2000 | year == 2010 | year == 2019, region != is.na(GDP_CO2Emissions_Regions$region)) %>%
ggplot(aes(x = GDP_per_cap, y = CO2_m_tons_per_cap)) +
geom_smooth(method = "loess", color = "grey30") +
geom_point(aes(color = region)) +
facet_grid(vars(region), vars(year), scales = "free_y") +
theme_classic() +
scale_x_continuous(breaks = c(0, 75000, 150000),
label = comma) +
theme(panel.grid.major.y = element_line(color = "gray", linetype = 1),
plot.title.position = "plot",
plot.title = element_text(size = 25, family = "serif", color = "black"),
plot.subtitle = element_text(size = 18, family = "serif", color = "gray30"),
plot.caption = element_text(hjust = 1, family = "serif", color = "gray30", size = 12),
axis.title = element_text(color = "gray30", size = 16, family = "serif"),
axis.text.x = element_text(color = "gray30", size = 14, family = "serif"),
axis.text.y = element_text(color = "gray30", size = 14, family = "serif"),
legend.title = element_text(color = "gray10", size = 14, family = "serif", face = "bold"),
legend.text = element_text(color = "gray30", size = 14, family = "serif"),
legend.position = "none",
strip.text.x = element_text(color = "gray10", size = 14, family = "serif"),
strip.text.y = element_text(color = "gray10", size = 14, family = "serif"),
strip.background = element_rect(color = "white", fill = "white")) +
labs(title = "When people get richer, their CO2 emissions in metric tons decrease", subtitle = "At their scale, many countries in Europe and Asia followed by the Americas are reducing CO2 emissions (metric tons\nper capita) as their GDP per capita increases", y = "CO2 emissions (metric tons per capita)", x = "GDP per capita (current US$)", caption = "Source: World Bank CC BY-4.0") +
guides(col = guide_legend(title = "Region"))
```
At their scale and on average, individuals in _Europe_ and _Asia_, followed by those from the _Americas_, are reducing _CO2 emissions (metric tons per capita)_ as their _GDP per capita_ increases. To further the analysis, each country could be reviewed over time to see how they are doing and if they are following the trend.
Something important to mention is that the _per capita decrease in CO2 emissions_ is not a representation of how a country is doing in terms of its environmental impact and sustainability. Each country varies in population size, so a country that has relatively small _CO2 emissions (metric tons per capita)_ might impact a lot due to a large population.
The analysis we have done had the purpose of allowing us to see a pattern in the behavior of individuals represented by the _per capita average metrics_ we selected instead of a measurement of the countries' impact and efforts.
## Conclusions
+ Among those in California that are doing something good for the environment by using battery electric vehicles instead of fuel-based ones, __on average CARE about reducing GGE__, but __the more affluent CARE less than less affluent owners__.
+ When distinguished by the highest level of education, we found __the most educated owners of BEVs in California consider more important reducing GGE than less educated owners__.
+ When looking at the correlation between the _household income in thousands_ and the _mean importance of reducing GGE_ of BEV owners in California grouped by educational level, __the wealthiest CARE a little bit less than the less affluent people in their educational level group__.
+ __Europe and North America show that as people get richer, they DO do something for the environment in terms of reducing CO2 emissions per capita__, but the amounts of _CO2 emissions (metric tons per capita)_ are still large compared to other countries. Despite that, __the pattern of the hypothesis is seen__.
+ __Individuals in high-income countries also follow the pattern of a reduction in CO2 emissions as they got richer__; However, the reduction is small and slow.
+ Overall, __the trend the world's individuals have had in the past 30 years is a decrease in CO2 emissions (metric tons per capita) as they got richer__.
+ At their scale and on average, __individuals in Europe and Asia, followed by those from the Americas, are reducing CO2 emissions (metric tons per capita) as their GDP per capita increases__.
## Impact on Actions and Decisions
I think that my visualizations impact actions and decisions. The data visualization on the _importance of reducing GGE_ and _income_ by educational level shows that as people receive higher education they _CARE_ more for the environment, so if we want to foster _CARE_ for the environment we should also support education. The data visualizations about _GDP per capita_ and _CO2 emission (metric tons per capita)_ show us that if we want people to _DO_ more for the environment in terms of reducing CO2 emissions, we might want to take them out of poverty fast so that they can _DO_ their part to reduce their emissions while still prospering.
## Techniques
+ __Load data:__ I used temporary files, `unzip()`, and read from the web using the `raw=true` approach.
+ __Clean & Wrangle:__ I used `filter()` and `select()` to get the data I needed. I used `str_replace_all()` to edit data and `mutate()` to compute data. I used `factor()` to custom order my factor data for better display of data in data visualizations. I used `pivot_longer()` to tidy data into a long form. I used `inner_join()` and `right_join()` to merge three data sets. I used `group_by()` and `summarise()` to aggregate data.
+ __Plot & Visualize:__ I used `ggplot2` to create all my data visualizations for this project. I used `aes()` to define the aesthetics of my data visualizations. I used multiple geometries like `geom_line()`, `geom_smooth()`, `geom_histogram()`, `geom_point()`, and `geom_text()`. I used facets with `facet_wrap()` and `facet_grid()`. I personalized my plots with `theme()` to my liking. I tried for the first time adding a secondary y-axis.
+ __Interpret:__ I used `sumtable()` to get summary statistics that helped me to understand and explain better the data sets and my data visualizations.
## Appendix
### Data Sources
Here are the data sets we used for this project:
+ [GDP per capita - current US dollars](https://data.worldbank.org/indicator/NY.GDP.PCAP.CD) from The World Bank.
+ This data set contains the GDP per capita in current US dollars for almost all countries with data from 1960 to 2021. According to Investopedia GDP per capita _"measures the economic output of a nation per person. It seeks to determine the prosperity of a nation by economic growth per person in that nation. Per capita income measures the amount of money earned per person in a nation."_
+ _GDP per capita_ is our socioeconomic metric.
+ [CO2 emissions (metric tons per capita)](https://data.worldbank.org/indicator/EN.ATM.CO2E.PC) from The World Bank.
+ This data set contains the CO2 emissions (metric tons per capita) for almost all countries with data from 1990 to 2019. The World Bank says: _"Carbon dioxide emissions are those stemming from the burning of fossil fuels and the manufacture of cement. They include carbon dioxide produced during consumption of solid, liquid, and gas fuels and gas flaring."_
+ _CO2 emissions (metric tons per capita)_ is our environmental metric.
+ [Sociodemographic data for battery electric vehicle owning households in California (From NCST Project "Understanding the Early Adopters of Fuel Cell Vehicles")](https://datadryad.org/stash/dataset/doi:10.25338/B8P313) from Scott Hardman of the University of California.
+ This data set contains _"Sociodemographic data for fuel cell and battery electric vehicle owning households in California."_
+ This data set is from owners of BEVs and FCVs in California, it has their estimated income and a score measuring their perception of the importance of reducing greenhouse gas emissions.