Weekly Assignment #3
1: Go to Dryad to choose a published paper and data set and reconstruct your own figure. Code a ggplot graph that looks as close to the published figure as you can.
See Figure 1 from Race, AI, De Jesus, M, Beltran, RS, Zavaleta, ES. A comparative study between outcomes of an in-person versus online introductory field course. Ecol Evol. 2021; 11: 3625–3635. https://doi.org/10.1002/ece3.7209.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.5.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(ggthemes)
library(ggplot2) # load these libraries so I can access tools
classoutcomesData <- read.table(file = "data/classoutcomes.csv", header = TRUE, sep = ",") # read in the Race et al. data and assign it to this variable
glimpse(classoutcomesData)
## Rows: 19
## Columns: 18
## $ PRE <chr> "Question", "Flora/Fauna", "Experimental Design", "Oral Pre…
## $ X <chr> "In-Person", "3.301886792", "3.117424242", "3.464150943", "…
## $ X.1 <chr> "Online", "3.15625", "3.25", "3.46875", "3.0625", "4.59375"…
## $ X.2 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.3 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.4 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.5 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.6 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.7 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ POST <chr> "Question", "Flora/Fauna", "Experimental Design", "Oral Pre…
## $ X.8 <chr> "In-Person", "4.025", "3.8875", "3.9875", "4.0375", "4.55",…
## $ X.9 <chr> "Online", "4.142857143", "4.428571429", "3.714285714", "4.4…
## $ X.10 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.11 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ X.12 <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ Difference <chr> "Question", "Flora/Fauna", "Experimental Design", "Oral Pre…
## $ X.13 <chr> "In-Person", "0.723113208", "0.770075758", "0.523349057", "…
## $ X.14 <chr> "Online", "0.986607143", "1.178571429", "0.245535714", "1.3…
head(classoutcomesData)
## PRE X X.1 X.2 X.3 X.4 X.5 X.6 X.7
## 1 Question In-Person Online NA NA NA NA NA NA
## 2 Flora/Fauna 3.301886792 3.15625 NA NA NA NA NA NA
## 3 Experimental Design 3.117424242 3.25 NA NA NA NA NA NA
## 4 Oral Presentation 3.464150943 3.46875 NA NA NA NA NA NA
## 5 Field Research 2.490566038 3.0625 NA NA NA NA NA NA
## 6 Science Career 4.505660377 4.59375 NA NA NA NA NA NA
## POST X.8 X.9 X.10 X.11 X.12 Difference
## 1 Question In-Person Online NA NA NA Question
## 2 Flora/Fauna 4.025 4.142857143 NA NA NA Flora/Fauna
## 3 Experimental Design 3.8875 4.428571429 NA NA NA Experimental Design
## 4 Oral Presentation 3.9875 3.714285714 NA NA NA Oral Presentation
## 5 Field Research 4.0375 4.428571429 NA NA NA Field Research
## 6 Science Career 4.55 4.285714286 NA NA NA Science Career
## X.13 X.14
## 1 In-Person Online
## 2 0.723113208 0.986607143
## 3 0.770075758 1.178571429
## 4 0.523349057 0.245535714
## 5 1.546933962 1.366071429
## 6 0.044339623 -0.308035714
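The `head()` output shows that the real column labels (Question, In-Person, Online) sit in the first data row, which is why every score column was read in as character. One possible alternative, sketched here on a hypothetical excerpt retyped by hand from the values printed above rather than on the full Dryad file, is to skip the top banner line so that the second line becomes the header. The names `csv_text` and `demo` are made up for illustration.

```r
# Hypothetical excerpt that mimics the first three columns of the file:
# a banner line ("PRE"), then the real header, then two data rows.
csv_text <- paste(
  "PRE,,",
  "Question,In-Person,Online",
  "Flora/Fauna,3.301886792,3.15625",
  "Experimental Design,3.117424242,3.25",
  sep = "\n"
)

# skip = 1 drops the banner line, so "Question,In-Person,Online" is the header;
# check.names = FALSE keeps "In-Person" from being rewritten as "In.Person".
demo <- read.csv(text = csv_text, skip = 1, check.names = FALSE)

str(demo) # the two score columns should now be numeric rather than character
```

In the full file the same three labels repeat across the PRE, POST, and Difference blocks, so the duplicated column names would still need to be told apart before plotting.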
colnames(classoutcomesData)[16] <- "Metric" # rename the Difference-block columns to make them a bit easier to call
colnames(classoutcomesData)[17] <- "Class In-Person"
colnames(classoutcomesData)[18] <- "Class Online"
classoutcomesDF <- data.frame(classoutcomesData[2:9, c("Metric", "Class In-Person", "Class Online")]) # keep rows 2 to 9 (row 1 holds the sub-header labels); data.frame() converts the names to Class.In.Person and Class.Online
data_mod <- cbind(classoutcomesDF[1], stack(classoutcomesDF[2:3])) # stack in-person and online so data_mod can be placed into the ggplot code
## Warning in data.frame(..., check.names = FALSE): row names were found from a
## short variable and have been discarded
data_mod$values <- as.numeric(data_mod$values) # coerce the values column to be numeric because ggplot cannot plot y if it is in character format
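The `cbind()` and `stack()` step reshapes the data from wide to long. The same reshaping could be sketched with `tidyr::pivot_longer()`, which is attached with the tidyverse. The small `wide` data frame below is a hypothetical stand-in retyped from the first two rows of the Difference block, not the full data set, and the column names `ind` and `values` are chosen only to match what `stack()` produces.

```r
library(tidyr)

# Hypothetical two-row stand-in for classoutcomesDF, with numeric scores.
wide <- data.frame(
  Metric = c("Flora/Fauna", "Experimental Design"),
  Class.In.Person = c(0.723113208, 0.770075758),
  Class.Online = c(0.986607143, 1.178571429)
)

# One row per Metric and class format; the old column names move into `ind`.
long <- pivot_longer(
  wide,
  cols = c(Class.In.Person, Class.Online),
  names_to = "ind",
  values_to = "values"
)

print(long)
```

Unlike `stack()`, this leaves `ind` as a character column rather than a factor, and the rows come out grouped by metric, but neither difference should change a dodged bar chart.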
classoutcomes_plot <- ggplot(data_mod, aes(fill = ind, y = values, x = Metric)) +
  geom_bar(position = "dodge", stat = "identity", color = "black") +
  scale_x_discrete(limits = c("Flora/Fauna", "Experimental Design", "Oral Presentation", "Field Research", "Graduate Degree", "Science Career", "Research Opportunities", "UCSC Community"),
                   labels = c("Species Identification", "Experimental Design", "Oral Presentations", "Research Methods", "Grad School Interest", "Science Career Interest", "Research Opportunities", "Sense of Community")) +
  ylim(-1, 2) +
  theme_bw() + # white background
  theme(panel.border = element_blank(), panel.grid.major = element_blank(), # remove panel border and grid
        panel.grid.minor = element_blank(), axis.line.x = element_blank(), axis.line.y = element_line(colour = "black"), axis.ticks.x = element_blank()) + # remove background elements, remove x axis line, add black y axis line, remove x axis ticks
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab(NULL) +
  ylab("Improvement") +
  scale_fill_manual(labels = c("BIOE82 In-Person", "BIOE82 Online"), values = c("darkcyan", "gray60")) +
  theme(legend.title = element_blank(),
        legend.position = "top",
        legend.justification = "left",
        legend.direction = "vertical")
print(classoutcomes_plot)
# I noticed that some bar heights in my graph differ slightly from those in Fig. 1: the published bars do not all line up with the values in the Dryad data set (see Grad School Interest, Science Career Interest, Research Opportunities). I'm very curious as to what is going on here!
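One way to start investigating the mismatch is to confirm that the Difference block in the Dryad file equals POST minus PRE, which would rule out an arithmetic slip inside the data set itself. The sketch below does this for the five metrics visible in the `head()` output. The vectors are retyped by hand from that output and the object names are hypothetical. Of the three metrics flagged above, only Science Career appears in those printed rows, so the same check would need to be repeated on the full columns.

```r
# Values retyped from the head() output (rows 2 to 6); names are hypothetical.
metric  <- c("Flora/Fauna", "Experimental Design", "Oral Presentation",
             "Field Research", "Science Career")
pre_in  <- c(3.301886792, 3.117424242, 3.464150943, 2.490566038, 4.505660377)
post_in <- c(4.025, 3.8875, 3.9875, 4.0375, 4.55)
diff_in <- c(0.723113208, 0.770075758, 0.523349057, 1.546933962, 0.044339623)
pre_on  <- c(3.15625, 3.25, 3.46875, 3.0625, 4.59375)
post_on <- c(4.142857143, 4.428571429, 3.714285714, 4.428571429, 4.285714286)
diff_on <- c(0.986607143, 1.178571429, 0.245535714, 1.366071429, -0.308035714)

# Gap between the recomputed improvement (POST - PRE) and the stored Difference.
check <- data.frame(
  metric,
  in_person_gap = (post_in - pre_in) - diff_in,
  online_gap    = (post_on - pre_on) - diff_on
)

print(check)
```

Gaps that are zero up to rounding would suggest the Difference columns are internally consistent, in which case the mismatch with Fig. 1 would have to come from somewhere else, such as how the published figure was produced.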