generated from emu-hacettepe-analytics/emu430-project-quarto-template
-
Notifications
You must be signed in to change notification settings - Fork 0
/
analysis.qmd
340 lines (282 loc) · 15.4 KB
/
analysis.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
---
styles:
styles.css
format:
html:
toc: true
toc-depth: 6
toc-title: Contents
toc-location: left
toc-expand: 5
smooth-scroll: true
anchor-sections: true
include-after-body: abbrv_toc.html
number-sections: true
---
::: {style="text-align: center"}
::: {style="font-family: fantasy; font-size: 50px; font-style: normal; font-variant: small-caps; font-weight: 700; line-height: 39.6px; color: #333333;"}
**Analysis**
:::
:::
![](giphy2.gif){fig-align="center" width="240"}
::: {style="text-align: center"}
::: {style="font-family: fantasy; font-size: 36px; font-style: normal; font-variant: small-caps; font-weight: 700; line-height: 39.6px; color: #333333;"}
**Key Takeaways**
:::
:::
::: {style="text-align: justify"}
- For our analysis, we have selected the Turkey Health Survey topic from the Turkey Statistical Institute. This survey is a research project with the aim of understanding the general state of health and the collection of data on key health indicators in order to be able to measure the development of the country. The Survey provides international comparisons and helps to identify national health needs.
- Our data set contains a wide range of health-related information in Turkey. This information includes health habits, health care use, prevalent diseases and demographic details. As a team, we analysed health patterns among different age and gender groups, identified variations in health behaviors and disease occurrence, and understood health service utilization. Providing valuable insights for health policies and activities is our ultimate goal.
- The data, especially when it is visualized such as below, provides us with a good understanding of the situation particularly with respect to age and sex. When we look through the various tables and plots we can see that more women seem to suffer from various health issues than men while men reportedly consume more alcohol than women. This kind of distinction not only lead us to think about sociological and cultural factors, and oppression; it also brings light to a much bigger issue in the medical field that not a lot of people consider: Women were excluded from being subjects for medical research until late 1980s. [This source](https://orwh.od.nih.gov/toolkit/recruitment/history#:~:text=In%201986%2C%20NIH%20established%20a,Grants%20and%20Contracts%20in%201987.) says that only in 1986, NIH established a policy that encouraged researchers to include women in their studies. In 1989, this inclusion encouragement was extended to minorities as well.
- Too much alcohol can raise blood pressure and weight, increasing risk of a heart attack, stroke and type 2 diabetes, says [British Heart Foundation](https://www.bhf.org.uk/informationsupport/heart-matters-magazine/medical/effects-of-alcohol-on-your-heart#:~:text=Too%20much%20alcohol%20can%20raise,at%20Royal%20Liverpool%20University%20Hospitals.) We can actually observe this by looking at Data Set 1's Plot 1 and Data Set 2's implications of higher alcohol usage by men. In Data Set 1 - Plot 1 the only categories where numbers for men surpasses those of women are in fact, heart attack and stroke.
:::
## Dataset 1 : The Percentage of Main Diseases/Health Problems Declared by Individuals in the Last 12 Months by Sex, 2016-2022 {toc-text="Dataset 1"}
```{r message=FALSE, warning=FALSE}
#| echo: false
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
library(plotly)
data_1 <- read_excel("tidydataset1.xls")
colnames(data_1) <- c("Diseases", "men_2016", "women_2016", "men_2019", "women_2019", "men_2022", "women_2022")
data_1_longer <- data_1 %>%
pivot_longer(cols = starts_with(c("men", "women")),
names_to = "Gender",
values_to = "Percentage") %>%
separate(Gender, into = c("Gender", "Year"), sep = "_") %>%
arrange(Diseases) %>%
mutate_at(vars(Year, Percentage), as.numeric) %>%
mutate(Diseases = gsub("^\\n", "", Diseases))
```
### Average Percentage by Disease with Gender Connection {toc-text="Plot 1"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
summary_data <- data_1_longer %>%
group_by(Diseases, Gender) %>%
summarise(mean_percentage = mean(Percentage))
abbreviate_disease_names <- abbreviate(summary_data$Diseases)
plot <- plot_ly(summary_data,
x = ~abbreviate_disease_names,
y = ~mean_percentage,
color = ~Gender,
colors = c("#1E7DFF", "#FC90FF"),
type = "bar") %>%
layout(title = list(text="Average Percentage by Disease with Gender Connection", x = 0, y = 1),
xaxis = list(title = "Diseases",
zeroline = FALSE,
tickangle = 45,
categoryorder = "trace"),
yaxis = list(title = "Percentage",
zeroline = FALSE))
plot
```
::: {style="text-align: justify"}
One look at this plot and we can tell that women are more commonly reported to have health problems than men. Two categories are exceptions: Myocardial infarction, also known as a *heart attack*, and stroke.
:::
### Percentage by Disease with Year Connection {toc-text="Plot 2"}
```{r}
#| code-fold: true
#| code-summary: "Show the code"
abbreviate_disease_names <- abbreviate(data_1_longer$Diseases)
chart_12 <- plot_ly(data_1_longer,
x = ~abbreviate_disease_names,
y = ~Percentage,
color = ~as.factor(Year),
colors = c("#49CE53", "#2B59C3","#FA7E61"),
type = "bar") %>%
layout(title = list(text="Percentage by Disease with Year Connection", x = 0, y = 1),
xaxis = list(title = "Diseases",
tickangle = 45),
yaxis = list(title = "Percentage"))
chart_12
```
![](Ekran görüntüsü 2024-01-04 094715.png){fig-align="center" width="350"}
::: {style="text-align: justify"}
This graph tells us a few key things:
- First is the fact that 2019 was apparently a terrible year.
- The second is that the most reported cases of health problems seem to be Low Back Disorders and Neck Problems- Which is not all that surprising when you consider the current state of world affairs, is it? People are working 9-5, hunched over desks and keyboards all day, and the situation is not very different for students, which make up about 15% of Turkey’s population according to a [recent research](https://data.tuik.gov.tr/Bulten/Index?p=Istatistiklerle-Genclik-2022-49670).
- The third one is that, interestingly enough, diabetes has seen a new high in 2022 and not 2019. There might be several reasons for that, and to make educated guesses we can put forward the lack of movement in our daily lives that got only worse after the Covid-19 pandemic. Or the fact that no one has time to consume ‘good’ food anymore. We don’t cook our own meals as often, instead opting for takeout after an exhausting shift that sucked the life out of us. Even if we do cook, we no longer can guarantee that our ingredients don’t contain a plethora of chemicals that we can’t even pronounce the names of.
:::
### Distribution of Percentage by Gender for Each Disease {toc-text="Plot 3"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
abbreviate_disease_names <- abbreviate(data_1_longer$Diseases)
box_13 <- plot_ly(data_1_longer,
x = ~reorder(abbreviate_disease_names, -Percentage),
y = ~Percentage,
color = ~Gender,
colors = c("#1E7DFF", "#FC90FF"),
type = "box") %>%
layout(boxmode = "group",
title = "Distribution of Percentage by Gender for Each Disease",
xaxis = list(title = "Diseases",
zeroline = FALSE),
yaxis = list(title = "Percentage",
zeroline = FALSE))
box_13
```
## Data Set 2 : The Percentage of Individuals' Status of Alcohol Use by Sex and Age Group, 2016-2022 {toc-text="Dataset 2"}
```{r message=FALSE, warning=FALSE}
#| echo: false
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
data_2 <- read_excel("tidydataset2.xls")
colnames(data_2) <- c("age", "men_2016", "women_2016", "men_2019", "women_2019", "men_2022", "women_2022", "usage")
data_2_longer <- data_2 %>%
pivot_longer(cols = starts_with(c("men", "women")),
names_to = "gender",
values_to = "rate") %>%
separate(gender, into = c("gender", "year"), sep = "_") %>%
arrange(age) %>%
mutate_at(vars(year, rate), as.numeric) %>%
na.omit()
```
### Distribution of Alcohol Usage Percentage by Consume Status and Gender {toc-text="Plot 1"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
chart_21 <- plot_ly(data_2_longer,
x = ~usage,
y = ~rate,
color = ~gender,
colors = c("#1E7DFF", "#FC90FF"),
type = "box") %>%
layout(boxmode = "group",
title = list(text="Distribution of Alcohol Usage Percentage by Consume Status and Gender", x = 0, y = 1),
xaxis = list(title = "Alcohol Usage",
zeroline = FALSE),
yaxis = list(title = "Percentage",
zeroline = FALSE))
chart_21
```
### Distribution of Alcohol Usage Percentage by Year {toc-text="Plot 2"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
chart_22 <- ggplot(data_2_longer, aes(x = year, y = rate, color = gender)) +
geom_point(alpha = 0.7, size = 1) +
stat_summary(fun = "mean", geom = "line", linewidth = 0.3, color = "black",
aes(group = interaction(usage, gender)), show.legend = TRUE) +
labs(title = "Distribution of Alcohol Usage Percentage by Year",
x = "Year",
y = "Percentage") +
scale_color_manual(values = c("#1E7DFF", "#FC90FF")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
theme_minimal() +
facet_grid(. ~ usage, scales = "free_x", space = "free_x", switch = "both") +
theme(panel.spacing = unit(2, "lines"))+
scale_x_continuous(breaks = c(2016, 2019, 2022))
ggplotly(chart_22)
```
::: {style="text-align: justify"}
We can see that the most recent year has seen a dramatic increase in men who have never consumed alcohol, which is supported by a decrease in regular consumers and people who have consumed alcohol in the past but don't consume anymore. The decrease in the “doesn’t consume” category seems to be more steep- Can it be the decline in the purchasing power with most recent economic disasters? Or is it the simple fact that people are choosing healthier lifestyles?
:::
### Percentage of Consumers of Alcohol Use by Sex and Age Group {toc-text="Plot 3"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
consumers_data <- data_2_longer %>%
filter(usage == "Consumers")
consumers_data$year <- factor(consumers_data$year, levels = c(2016, 2019, 2020, 2022))
chart_23 <- ggplotly(ggplot(consumers_data, aes(x = year, y = age, fill = rate)) +
geom_tile() +
facet_grid(cols = vars(gender), scales = "free") +
labs(title = "Percentage of Consumers of Alcohol Use by Sex and Age Group",
x = "Year",
y = "Age Group",
fill = "Percentage") +
scale_fill_gradient(low = "white", high = "#fc8d62") +
theme_minimal())
chart_23
```
::: {style="text-align: justify"}
The age range for alcohol consumers seem to overlap for both men and women- We can see from this visual representation that both men and women around the ages 25-34 consume more alcohol than any other age range. Although for men, we can see the intensity of consumption seems to carry on over to the 35-44 age range as well.
:::
## Data Set 3 : Body Mass Index Distribution of Individuals by Sex, 2008-2022 {toc-text="Dataset 3"}
```{r message=FALSE, warning=FALSE}
#| echo: false
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
library(streamgraph)
data_3 <- read_excel("tidydataset3.xls")
colnames(data_3) <- c("Year", "Sex", "Underweight", "Normal Weight", "Pre Obese", "Obese")
data_3_long <- data_3 %>%
pivot_longer(cols = c(Underweight, 'Normal Weight', 'Pre Obese', Obese),
names_to = "Category",
values_to = "Percentage")
years <- c(2008, 2010, 2012, 2014, 2016, 2019, 2022)
```
```{r eval=FALSE, message=FALSE, warning=FALSE, include=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
chart_31 <- plot_ly(data_3_long,
x = ~Year,
y = ~Percentage,
color = ~Category,
colors = c("PuBu"),
type = "bar") %>%
layout(title = list(text="Percentage of Categories (2008-2022)", x = 0, y = 1),
xaxis = list(title = "Year",
tickvals = c(2008, 2010, 2012, 2014, 2016, 2019, 2022)),
yaxis = list(title = "Percentage")
)
chart_31
```
### Trends Over Years {toc-text="Plot 1"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
filtered_data <- data_3_long[data_3_long$Sex != "Total", ]
chart_321 <- ggplotly(ggplot(filtered_data, aes(x = Year, y = Percentage, color = Category)) +
geom_line() +
facet_wrap(~Sex) +
labs(title = "Trends Over Years", x = "Year", y = "Percentage") +
theme_minimal() +
scale_x_continuous(breaks = years, labels = years) +
theme(
axis.text.x = element_text(size = 8, hjust = 0.5),
plot.title = element_text(hjust = 0.5)
))
chart_321
summarized_3_long <- data_3_long %>%
filter(Sex == "Total")
chart_322 <- streamgraph(summarized_3_long,
key = "Category",
value = "Percentage",
date = "Year") %>%
sg_fill_manual(c("#3A20B8", "#7D78EC", "#AE89ED", "#FDBDFF")) %>%
sg_legend(show = TRUE, label = "Category:") %>%
sg_axis_x(tick_interval = "year")
chart_322
```
::: {style="text-align: justify"}
Over the years, we have gotten unhealthier and unhealthier. The percentage of people who are a healthy weight are decreasing while people who are pre-obese or obese are on the rise.
:::
### Boxplot of Percentage by Category and Sex {toc-text="Plot 2"}
```{r message=FALSE, warning=FALSE}
#| code-fold: true
#| code-summary: "Show the code"
chart_33 <- plot_ly(data_3_long,
x = ~reorder(Category, -Percentage),
y = ~Percentage,
color = ~Sex,
colors = c("#1E7DFF", "#FC90FF","#8D87FF"),
type = "box") %>%
layout(boxmode = "group",
title = list(text="Boxplot of Percentage by Category and Sex", x = 0, y = 1),
xaxis = list(title = "Category",
zeroline = FALSE),
yaxis = list(title = "Percentage",
zeroline = FALSE))
chart_33
```
::: {style="text-align: justify"}
While the “normal weight” category dominates the others throughout the study period, the latter half of the graph shows the concerning approach of the pre-obese category, almost catching up in recent years. The lack of movement and healthy food intake that we talked about earlier are definitely big contributors. But let’s approach this one from a different perspective: The tabooing of fatness and eating disorders in our society. Beauty standards. Our inclination to not believe people even when they voice their struggles. One thing that’s very common in especially younger people is using food as a form of escapism. What we need is to restore our relationship with our food consumption, and to start treating our bodies with respect and love. Much of physical health is closely connected with that of the mind.
:::