- Simulate a brand new data frame (the type of data is up to you) and
conduct a statistical analysis in R using 2 variables. Create a basic
ggplot that goes with the analysis.
# glimpse(iris)
# we are making a box plot that compares sepal length across species
flwr <- data.frame(iris$Species, iris$Sepal.Length) # create a df with these two variables from the iris data set; the columns are named iris.Species and iris.Sepal.Length
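The column names used later (`iris.Species`, `iris.Sepal.Length`) come from how `data.frame()` names arguments that are passed without a name. A minimal sketch of that behavior, next to an alternative that names the columns explicitly; `auto_named` and `flwr_named` are illustrative names, not part of the assignment code:

```r
# data.frame() builds column names from the unnamed expressions,
# replacing "$" with "." so the names are syntactically valid
auto_named <- data.frame(iris$Species, iris$Sepal.Length)
names(auto_named)

# Hypothetical alternative: give the columns explicit names up front
flwr_named <- data.frame(Species = iris$Species,
                         Sepal.Length = iris$Sepal.Length)
names(flwr_named)
```

With explicit names, the model formula and the `aes()` mapping could refer to `Species` and `Sepal.Length` directly.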
- I’m going to run a one-way ANOVA to test whether mean sepal length
differs significantly among species:
- ANOVA and t-tests are used for statistical analysis when x is a
categorical variable and y is a continuous variable: a t-test compares
two groups, and ANOVA handles three or more. Box and bar plots
are the preferred plots for these analyses.
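To illustrate how the two tests relate, here is a minimal sketch on simulated data with a two-level categorical x and a continuous y. The data frame `sim`, its columns, and the group means are made up for illustration. With exactly two groups and equal variances assumed, the t-test and the one-way ANOVA should give the same p value, and the F statistic should equal the squared t statistic.

```r
set.seed(1) # make the simulated data reproducible

# Simulated data: categorical x (group) and continuous y (value)
sim <- data.frame(
  group = factor(rep(c("A", "B"), each = 20)),
  value = c(rnorm(20, mean = 5), rnorm(20, mean = 6))
)

# Two-sample t-test with equal variances assumed (the classic Student t-test)
t_res <- t.test(value ~ group, data = sim, var.equal = TRUE)

# One-way ANOVA on the same data
a_tab <- summary(aov(value ~ group, data = sim))[[1]]

p_t <- t_res$p.value
p_a <- a_tab[["Pr(>F)"]][1]
t_sq <- unname(t_res$statistic)^2
f_val <- a_tab[["F value"]][1]

all.equal(p_t, p_a)    # p values agree up to numerical tolerance
all.equal(t_sq, f_val) # F equals t squared
```

Note that `t.test()` defaults to the Welch test (`var.equal = FALSE`), which does not match ANOVA exactly.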
ANOV <- aov(iris.Sepal.Length ~ iris.Species, data = flwr) # fit the ANOVA on the flwr data; aov requires a formula of the form y ~ x
summary(ANOV) # summary table of aov
##               Df Sum Sq Mean Sq F value Pr(>F)
## iris.Species   2  63.21  31.606   119.3 <2e-16 ***
## Residuals    147  38.96   0.265
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pvalue_func <- function(data = ANOV) { # used this function I created in Weekly Assignment 2
  p <- summary(data)[[1]][["Pr(>F)"]][1] # from the first table in the summary, extract the first element of the "Pr(>F)" column (the p value)
  return(p)
}
pvalue_func(data=ANOV)
## [1] 1.669669e-31
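The extracted p value can also be turned into a short text label in code, so it does not have to be typed by hand. A minimal sketch, assuming the same model fitted directly on `iris`; the names `fit`, `p`, and `p_label` are illustrative only:

```r
# Refit the one-way ANOVA directly on iris so this sketch is self-contained
fit <- aov(Sepal.Length ~ Species, data = iris)

# Same extraction as pvalue_func(): first entry of the "Pr(>F)" column
p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Round to 3 significant digits and build a label string
p_label <- paste0("p = ", signif(p, 3))
p_label
```

A string like `p_label` can be passed as the `label` argument of `annotate("text", ...)`.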
- Now I’m going to create a box plot:
library(viridis) # provides viridis(), used below for the fill colors
## Loading required package: viridisLite
library(ggplot2)
cols <- viridis(3, option = "plasma") # take 3 hex color codes from the "plasma" palette (other options include "viridis" and "turbo"); the number of colors must match the number of groups
flwrbox <- ggplot(data = flwr, aes(x = iris.Species, y = iris.Sepal.Length, fill = iris.Species)) + # create a ggplot using flwr data where x = species, y = sepal length, and the boxes are filled based on species
  geom_boxplot() + # create a box plot
  scale_fill_manual(values = cols, name = "Species") + # use the viridis colors and name the legend Species
  labs(title = "Difference in Sepal Length Among Species",
       subtitle = "Abby Griffin BIOL 1007A",
       x = "Species",
       y = "Sepal Length") + # name the title/subtitle/x/y axes
  theme_classic() + # remove background and gridlines
  theme(plot.title = element_text(color = "black"),
        plot.subtitle = element_text(color = "grey60")
  ) + # change color of title/subtitle
  annotate("text", x = 2, y = 7.5, label = "p=1.669669e-31") # add the ANOVA p value as a text label
print(flwrbox)
