Step1: Open this .RMD file in R Studio and click the “Preview” button above. A formatted version should pop up in a browser window. IF not, troubleshoot that!
Everything up to the intermission(?) is primarily an exercise in data wrangling (tidy+transform) and markdown formatting using R Studio. The second portion contains an overview of data modeling and a preliminary attempt to model Hop Freshening Power (synonyms: ‘dryhop creep’, ‘ABV creep’, ‘dry-hop creep’) as a component of the warm-dryhopping process.
# __ .__ .___
# _/ |_|__| __| _/__.__.
# \ __\ |/ __ < | |
# | | | / /_/ |\___ |
# |__| |__\____ |/ ____|
# \/\/
# \/
# ||
# ||
# ||
# _ /\
# | |_ _ _ __ _ _ _ ___/ _|___ _ _ _ __
# | _| '_/ _` | ' \(_-< _/ _ \ '_| ' \
# \__|_| \__,_|_||_/__/_| \___/_| |_|_|_|
# // \\
# // \\
# // \\
# // \\
# _ // _ _ \\ _ _
# __ _(_)____ _ __ _| (_)______ _ __ ___ __| |___| |
# \ V / (_-< || / _` | | |_ / -_)------| ' \/ _ \/ _` / -_) |
# \_/|_/__/\_,_\__,_|_|_/__\___|------|_|_|_\___/\__,_\___|_|
#
#ascii font/art from <http://www.network-science.de/ascii/>
The warm-dryhopping process can profoundly impact the body, alcohol and vicinal diketone content of the beer. Common metrics and models for this phenomenon have not been established.
Through experience in brewing we know that endpoints of dryhopping include:
Experiments were carried out where
Focusing on ethanol increase as the endpoint, we know these to be relevant factors:
For more details on this dataset see the manuscript.
please send comments, complaints and corrections to luke@bellsbeer.com! * Data are from the file “ujbc_a_1469081_sm5496.txt” (supplementary data from Jacob A. Kirkendall, Carter A. Mitchell & Lucas R. Chadwick (2018): The Freshening Power of Centennial Hops, Journal of the American Society of Brewing Chemists Volume 76, Issue 3, Pages 178-184 (2018) DOI: 10.1080/03610470.2018.1469081* available from https://www.tandfonline.com/doi/full/10.1080/03610470.2018.1469081?scroll=top&needAccess=true
#.rs.restartR() # restart R
sessionInfo()
#data wrangling
#install.packages("dplyr") # dplyr "data plyer"
library(dplyr)
library(ggplot2)
library(tidyverse)
library(caret) # for dummyVars function
library(psych) # for pairs.panels function
library(memisc) # compare models
library(drc) # "drc" = dose response curves (contains a suite of flexible and versatile model fitting and after-fitting functions)
#library(plotly)
library(gridExtra) # for grid.arrange in R markdown
flash forward! scroll to the bottom and run bigchunk_tidytransform
rm(list = ls()) # clear workspace
mydata <-read.csv("ujbc_a_1469081_sm5496.txt",stringsAsFactors = FALSE) ## SPECIFY filename
dput(names(mydata))
c("sample_id", "hop", "BINhop", "expt", "sample_group", "special_group",
"brew_date", "sample_collection_date", "volume_mL", "dryhop_date",
"dhop_day", "Hop_type", "mg_hops", "contact_days", "special_conditions",
"temp.C", "Test_Date", "Test.time", "ABV", "ABW", "OE", "Er",
"Ea", "SG", "RDF", "ADF", "Calories", "REF_NH", "ABV_increase",
"vol_norm_ABV_increase", "ABV_increase_prototabs")
Anton Paar data were collected at time zero for kinetics experiment (triplicate samples run immediately after sample collection, while the remainder were being dry-hopped), but were inadvertently excluded from the original dataset. Here they are: Date time sample no. ABV ABW OE (P) Er Ea SG 20/20 RDF ADF sample_id 07/27/17 3:26 PM 03_ 6.68 5.2 15.95 6.08 3.74 1.01462 63.88 76.58 JK1-20-41 07/27/17 3:30 PM 04_ 6.68 5.2 15.95 6.08 3.73 1.01461 63.89 76.6 JK1-20-50 07/27/17 3:34 PM 05_ 6.68 5.2 15.95 6.08 3.74 1.01462 63.88 76.58 JK1-20-87
## fill in MISSING DATA :( add time zero data for time series analysis)
## nested ifelse to add time zero plato values for each expt:
mydata<- mydata %>% mutate(., initial_plato = ifelse(expt==" 2A", 3.74, ifelse(expt==" 1A", 3.69, ifelse(expt==" 1B", 3.53, ifelse(expt==" 3B", 3.54, ifelse(expt==" 4B", 4.66, NA))))))
## add complete time-zero anton paar data for time series (expt2A)
## quadruple-up each triplicate time-zero measurement (one copy for each NH/DH & rouse/still combo)
## note: this practice is of questionable statistical rigor. In retrospect it would have been better (and a lot easier) to just run 12 samples through Anton on day0!
timezero <- mydata %>% filter(expt==" 2A") %>% arrange(special_group) %>% slice(1:12)
#data.entry(timezero)
timezero$Test_Date <- rep(c("7/27/2017","7/27/2017","7/27/2017"),4)
timezero$Test.time <- rep(c("3:26 PM", "3:30 PM", "3:34 PM"),4)
timezero$ABV <- 6.68
timezero$ABW <- 5.2
timezero$OE <- 15.95
timezero$Er <- 6.08
timezero$Ea <- rep(c(3.74, 3.73, 3.74),4)
timezero$SG <- rep(c(1.01462, 1.01461, 1.01462),4)
timezero$RDF <- rep(c(63.88, 63.89, 63.88),4)
timezero$ADF <- rep(c(76.58, 76.60, 76.58),4)
timezero$Calories <- rep(c(215.37,215.34,215.35),4)
timezero[timezero$hop=="DH",]$contact_days <-0
timezero[,c(28:31)] <- 0
mydata <- rbind(timezero,mydata)
write.csv(mydata, "quickie.csv", row.names = FALSE)
mydata<-read.csv("quickie.csv", stringsAsFactors = FALSE)
### this is a routine for inspecting data ###
mydatatypes<- as.data.frame(cbind(sapply(mydata, class))) ## table of datatypes
mydatatypes$headers <- rownames(mydatatypes) ## convert rownames to values
na_count <-as.data.frame(sapply(mydata, function(y) sum(length(which(is.na(y)))))) ## count NAs by column
mydataOverview<-as.data.frame(cbind(na_count,mydatatypes)) ## table of NA counts and datatypes
names(mydataOverview) <- c("NA_count","Data_Class", "Header")
#sum(is.na(mydata)) ## total count of NA values in entire sheet (231 in this case)
#mydataOverview %>% filter(NA_count>0) ## breakdown of NA values
note: this is a base R +dplyr data wrangling exercise! in general it’s best to use lubridate package for date/timestamps!
test_df<- mydata # (original dataframe to be used for testing/illustration)
x_brew_date.POSIXct<- as.POSIXct(mydata$brew_date, format = "%m/%d/%Y") # as a standalone vector (USELESS here), or:
test_df$brew_date.POSIXct <-as.POSIXct(mydata$brew_date, format = "%m/%d/%Y") # as a "new column" in our dataframe
datecols<- dput(names(dplyr::select(mydata, matches("date")))) ## headers containing string "date" => c("brew_date", "sample_collection_date", "dryhop_date", "Test_Date")
c("brew_date", "sample_collection_date", "dryhop_date", "Test_Date"
)
## FORLOOP
for (icol in datecols) {
newcol = paste0(icol,".POSIXct")
# print(newcol)
mydata[, newcol] = as.POSIXct(mydata[, icol],format = "%m/%d/%Y") ### CREATE NEW columns with POSIXct
# mydata[, newcol] = as.POSIXct(as.numeric(mydata[, icol]) * (60*60*24), origin="1899-12-30") ### microsoft times
}
mydata<- mydata %>%
mutate(dayofaddition = as.integer(as.numeric(difftime(dryhop_date.POSIXct,brew_date.POSIXct))),
daysonhops = as.integer(as.numeric(difftime(Test_Date.POSIXct,dryhop_date.POSIXct))/(60*60*24)),
hops_g_100mL = (mg_hops/volume_mL)/10,
pounds_bbl = (mg_hops/volume_mL)*117/454
)
mydata$OX<-grepl("OX", mydata$Hop_type) ## create logical "OX" column
mydata$rouse<-grepl("rouse", mydata$special_group)## create logical column
mydata$Grind<-grepl("Grind", mydata$Hop_type) ## create logical column
mydata$Cone<-grepl("Cone", mydata$Hop_type) ## create logical column
mydata$harvest2014<-grepl("14", mydata$Hop_type) ## create logical column
mydata$harvest2015<-grepl("15", mydata$Hop_type) ## create logical column
mydata$harvest2017<-grepl("17", mydata$Hop_type) ## create logical column
## ifelse statement for harvestyear (if neither 2014 nor 2015 nor 2017, then 2016)
mydata$harvestYear <- ifelse(
mydata$harvest2014==TRUE, 2014,
ifelse(mydata$harvest2015==TRUE, 2015,
ifelse(mydata$harvest2017==TRUE, 2017, 2016)))
dput(levels(as.factor(mydata$harvestYear)))
c("2014", "2015", "2016", "2017")
## ifelse statement for form of hops (if neither cone nor ground nor NH, then pellet)
mydata$form_of_hops <- ifelse(
mydata$Cone==TRUE, "cone",
ifelse(mydata$Grind==TRUE, "ground",
ifelse(mydata$Hop_type=="NH", "NH", "pellet")))
dput(levels(as.factor(mydata$form_of_hops)))
c("cone", "ground", "NH", "pellet")
## ifelse statement for temperature greater or less than 10
mydata$DH_temp <- ifelse(
mydata$temp.C<10, "cold",
ifelse(mydata$temp.C>10, "warm", "something else"))
dput(levels(as.factor(mydata$DH_temp)))
c("cold", "warm")
mydata$variety<-mydata$Hop_type
mydata$variety<- gsub("[0-9]+","", mydata$variety) ## remove all numbers
mydata$variety<- gsub("OX","", mydata$variety) ## remove specific text
mydata$variety<- gsub("Grind","", mydata$variety)
mydata$variety<- gsub("Cone","", mydata$variety)
mydata$variety<- gsub(" ","", mydata$variety) ## remove spaces
dput(levels(as.factor(mydata$variety)))
c("AMAR", "CASC", "CENT", "CIT", "NH", "SIM")
mydata<- mydata %>% mutate(EXPTnew=paste0("group",expt, substr(special_group, 1,2),as.character(rouse)))
# clean it up by removing "NA" and any spaces due to canarycode bug
mydata$EXPTnew <- gsub(" ","", mydata$EXPTnew) ## remove any spaces
mydata$EXPTnew <- gsub("NA","", mydata$EXPTnew) ## remove "NA"
dput(levels(as.factor(mydata$expt)))
c(" 1A", " 1B", " 2A", " 3A", " 3B", " 4B")
dput(levels(as.factor(mydata$EXPTnew)))
c("group1AFALSE", "group1BFALSE", "group2A01FALSE", "group2A01TRUE",
"group2A05FALSE", "group2A05TRUE", "group2A12FALSE", "group2A12TRUE",
"group2A19FALSE", "group2A19TRUE", "group2A25FALSE", "group2A25TRUE",
"group2A33FALSE", "group2A33TRUE", "group2A42FALSE", "group2A42TRUE",
"group3AFALSE", "group3BFALSE", "group4BFALSE")
(experimental factors on the left, then measurements, followed by calculations and then all date columns on the right):
mydata <- mydata %>%
dplyr::select(sample_id,expt, EXPTnew, hop, BINhop, variety, OX, harvestYear, form_of_hops,special_conditions, rouse, daysonhops, dayofaddition, DH_temp, temp.C, hops_g_100mL, pounds_bbl,
ABV, ABW, OE, Er, Ea, SG, RDF, ADF, Calories,
dhop_day, contact_days, REF_NH, ABV_increase,
brew_date.POSIXct, sample_collection_date.POSIXct, dryhop_date.POSIXct,Test_Date.POSIXct, initial_plato)
## remove ".POSIXct" suffix. Leaving it as-is will only add to confusion if/when these data are saved and re-imported (and become 'character' format again!)
colnames(mydata) = gsub(".POSIXct", "", colnames(mydata))
(NH = not dry-hopped)
## mean NH (control) values for each EXPTnew group
mean.control_NH<- mydata %>%
group_by(EXPTnew) %>%
filter(hop=="NH") %>%
summarise_at(vars(ABV, ABW, Ea, SG),funs(mean, n()))
## replace "_mean" with ".control_NH" in column names
colnames(mean.control_NH) = gsub("_mean", ".control_NH", colnames(mean.control_NH))
## first join our data with mean ABV for unhopped samples in given experiment (mean.control_NH; calculated above)
FPHcalc<- left_join(mydata, mean.control_NH, by="EXPTnew")
## calculate ABW_increase by subtracting each individual ABW measurement from respective mean.control_NH:
FPHcalc$delta.ABV <- FPHcalc$ABV - FPHcalc$ABV.control_NH
FPHcalc$delta.ABW <- FPHcalc$ABW - FPHcalc$ABW.control_NH
FPHcalc$delta.plato <- FPHcalc$Ea - FPHcalc$Ea.control_NH
FPHcalc$delta.SG <- FPHcalc$SG - FPHcalc$SG.control_NH
## the control samples have served their purpose, now remove them from dataset. The following calculations are only meaningful for dry-hopped samples.
FPHcalc<- FPHcalc %>% filter(hop=="DH")
FPHcalc$calcCO2_increase <- FPHcalc$delta.ABW*(0.44/0.46)
##convert calcCO2_increase (in g/100mL) to calculated CO2 volumes added
## g/L = 10* g/100mL
## The conversion factor from volumes of CO2 to CO2 by weight (g/L) is 1.96. For example: 2.5 volumes x 1.96 = 4.9 g/l.
FPHcalc$calcCO2vols_increase <- FPHcalc$calcCO2_increase*10/1.96
## FPH = Fold Production due to Hops (fold-increase by mass: amount of given endpoint relative to amount of hops added)
##
FPHcalc$FPH_EtOH = FPHcalc$delta.ABW/FPHcalc$hops_g_100mL
FPHcalc$FPH_CO2 = FPHcalc$calcCO2_increase/FPHcalc$hops_g_100mL
FPHcalc$FPH_plato = FPHcalc$delta.plato/FPHcalc$hops_g_100mL
## replacing time-zero time values with a very small number (rather than exactly zero) will prevent issues with analysis of nonlinear models
FPHcalc[FPHcalc$daysonhops==0,]$daysonhops <- 0.01
## and save the transformed data to csv:
write.csv(FPHcalc,"FPHcalc.csv", row.names = FALSE)
FPHcalc <- read.csv("FPHcalc.csv", stringsAsFactors = TRUE)
df<- FPHcalc %>% group_by(EXPTnew) %>%
summarise_at(vars(hops_g_100mL,pounds_bbl,delta.ABV, delta.ABW, delta.plato, FPH_plato, FPH_EtOH, FPH_CO2, calcCO2vols_increase),funs(round(mean(.), 2))) %>%
arrange(desc(FPH_EtOH))
df
df <-FPHcalc
p1<- ggplot(df, aes(y=FPH_EtOH,x=OE, color=contact_days)) + geom_point(size=2)
p2<- ggplot(df, aes(y=FPH_EtOH,x=ADF, color=contact_days)) + geom_point(size=2)
p3<- ggplot(df, aes(y=FPH_EtOH,x=ABW, color=contact_days)) + geom_point(size=2)
p4<- ggplot(df, aes(y=FPH_EtOH,x=Ea, color=contact_days)) + geom_point(size=2)
grid.arrange(p1, p2, p3, p4, ncol = 2)
df <- FPHcalc
x<- df$Ea
y<- df$ABW #FPH_EtOH
p1<- ggplot(df, aes(x,y, color=form_of_hops)) + geom_point(size=2)
p2<- ggplot(df, aes(x,y, color=brew_date)) + geom_point(size=2)
p3<- ggplot(df, aes(x,y, color=rouse)) + geom_point(size=2)
p4<- ggplot(df, aes(x,y, color=pounds_bbl)) + geom_point(size=2)
grid.arrange(p1, p2, p3, p4, ncol = 2)
see https://plotly-r.com/the-plotly-cookbook.html for overview and some visualization inspiration.
# interactive 3D plots with library(plotly)
df <- FPHcalc # %>% filter(expt==" 2A")
# makes html very slow to render...comment out
#plot_ly(df, x = ~daysonhops, y = ~ABW, z = ~Ea) %>% add_markers(color = ~expt)
mydata<-FPHcalc %>% dplyr::select(expt,variety,form_of_hops,pounds_bbl,special_conditions,rouse,daysonhops,DH_temp,brew_date,ABW,SG,OE,ADF,RDF,calcCO2vols_increase,FPH_CO2, FPH_EtOH) ## last one is 100% in j
## note use of "dplyr::select" because one of these packages is conflicting with dplyr commands :()
## following Manuel Amunategui https://www.youtube.com/watch?v=igPQ-pI8Bjo
## Using Correlations To Understand Your Data: Machine Learning With R
##functions for flattenSquareMatrix
cor.prob <- function (X, dfr=nrow(X) -2) {
R<- cor(X, use="pairwise.complete.obs")
above<- row(R) < col(R)
r2 <- R[above]^2
Fstat<- r2 * dfr/(1-r2)
R[above] <- 1- pf(Fstat, 1, dfr)
R[row(R) == col(R)] <- NA
R
}
flattenSquareMatrix <- function(m) {
if( (class(m) != "matrix") | (nrow(m)!=ncol(m))) stop("Must be a square matrix.")
if(!identical(rownames(m), colnames(m))) stop("Row and column names must be equal.")
ut <- upper.tri(m)
data.frame(i = rownames(m)[row(m)[ut]],
j = rownames(m)[col(m)[ut]],
cor=t(m)[ut],
p=m[ut])
}
## library(caret) to dummify everything (turn all characters&factors into columns; ignores numbers and integers)
dmy<- dummyVars(" ~ .",data = mydata)
mydummifieddata<- data.frame(predict(dmy, newdata = mydata))
corMat = cor(mydummifieddata)
corMasterList<- flattenSquareMatrix(cor.prob(mydummifieddata)) ## list of all correlations
## order by strength of correlation
corlist<- corMasterList %>% arrange(-abs(corMasterList$cor))
write.csv(corlist,paste0("FLAT correlation matrix_.csv"))
corlist <- corlist %>% dplyr::filter(j=="FPH_EtOH") ## filter specific endpoint
head(corlist,50)
## specify interesting variables:
interestingvariables<-c("ABW", "pounds_bbl", "daysonhops", "calcCO2vols_increase")
pairs.panels(mydummifieddata[c(interestingvariables, "FPH_EtOH")])
Observing the impacts of dryhopping in the presence of live yeast has led many brewing professionals to understand that FPH is a function of many of the variables above including hop variety,form_of_hops,harvestYear,OX,DH_temp,daysonhops,rouse,pounds_bbl…. Many have intuitively created a model in their heads (without necessarily thinking of it as such) and skillfully adjust process when necessary to account for this phenomenon. In linear modeling, our function will take on the form: \(FPH = intercept + \beta_{1}X_{1} + \beta_{2}X_{2} + ... + \beta_{n}X_{n}\) where \(\beta\) values are what we’re attempting to derive in this modeling exercise, and X values are (collectively) a particular set of conditions.
df<-FPHcalc
lm1<-lm(df$SG~df$OE)
lm2<-lm(df$SG~df$ADF)
lm3<-lm(df$SG~df$ABW)
lm4<-lm(df$SG~df$Ea)
par(mfrow = c(2, 2), oma = c(0, 0, 0, 0))
plot(df$SG~df$OE)
abline(lm1)
plot(df$SG~df$ADF)
abline(lm2)
plot(df$SG~df$ABW)
abline(lm3)
plot(df$SG~df$Ea)
abline(lm4)
For each of the four cases above, we can easily see with our eyeballs the degree to which a linear model fits the data. Residual plots and R-squared values are commonly used to express this goodness-of-fit (that we can easily see with our eyeballs!).
df<-FPHcalc
lm1<-lm(df$SG~df$OE)
lm2<-lm(df$SG~df$ADF)
lm3<-lm(df$SG~df$ABW)
lm4<-lm(df$SG~df$Ea)
par(mfrow = c(2, 2), oma = c(0, 0, 0, 0))
plot(lm1$residuals)
plot(lm2$residuals)
plot(lm3$residuals)
plot(lm4$residuals)
#summary(lm4)
df<-FPHcalc
lm1<-lm(df$SG~df$OE)
lm2<-lm(df$SG~df$ADF)
lm3<-lm(df$SG~df$ABW)
lm4<-lm(df$SG~df$Ea)
mtable1234 <- mtable("Model 1"=lm1,"Model 2"=lm2,"Model 3"=lm3, "Model 4"=lm4,
summary.stats=c("sigma","R-squared","F","p","N"),show.eqnames=T)
mtable1234b <- relabel(mtable1234,
"(Intercept)" = "Constant",
x1 = "OE = Original Extract (g/100mL)",
x2 = "ADF = Apparent Degree of Fermentation (%)",
x3 = "ABW = Ethanol (w/w)",
x4 = "Er = Residual Extract (g/100mL)"
)
mtable1234
Calls:
Model 1: lm(formula = df$SG ~ df$OE)
Model 2: lm(formula = df$SG ~ df$ADF)
Model 3: lm(formula = df$SG ~ df$ABW)
Model 4: lm(formula = df$SG ~ df$Ea)
======================================================================
Model 1 Model 2 Model 3 Model 4
----------- ------------- ------------ ---------------
df$SG df$SG df$SG df$SG
----------------------------------------------------------------------
(Intercept) 1.071*** 1.063*** 1.056*** 1.000***
(0.020) (0.000) (0.001) (0.000)
df$OE -0.004**
(0.001)
df$ADF -0.001***
(0.000)
df$ABW -0.008***
(0.000)
df$Ea 0.004***
(0.000)
----------------------------------------------------------------------
sigma 0.002 0.000 0.001 0.000
R-squared 0.066 0.997 0.934 1.000
F 8.553 48038.147 1700.301 3532408.297
p 0.004 0.000 0.000 0.000
N 123 123 123 123
======================================================================
#show_html(mtable1234b)
df <- FPHcalc
lm1<-lm(FPH_EtOH~daysonhops, data=df)
lm2<-lm(FPH_EtOH~daysonhops*form_of_hops, data=df)
lm3<-lm(FPH_EtOH~daysonhops*rouse, data=df)
lm4<-lm(FPH_EtOH~daysonhops*form_of_hops*rouse, data=df)
mtable1234 <- mtable("Model 1"=lm1,"Model 2"=lm2,"Model 3"=lm3, "Model 4"=lm4,
summary.stats=c("sigma","R-squared","F","p","N"),show.eqnames=T)
mtable1234b <- relabel(mtable1234,
"(Intercept)" = "Constant",
SG = "Specific Gravity",
ABW = "ABW = Ethanol (w/w)",
Er = "Er = Residual Extract (g/100mL)"
)
mtable1234
Calls:
Model 1: lm(formula = FPH_EtOH ~ daysonhops, data = df)
Model 2: lm(formula = FPH_EtOH ~ daysonhops * form_of_hops, data = df)
Model 3: lm(formula = FPH_EtOH ~ daysonhops * rouse, data = df)
Model 4: lm(formula = FPH_EtOH ~ daysonhops * form_of_hops * rouse, data = df)
=============================================================================
Model 1 Model 2 Model 3 Model 4
----------- ----------- ----------- -----------
FPH_EtOH FPH_EtOH FPH_EtOH FPH_EtOH
-----------------------------------------------------------------------------
(Intercept) 0.498*** 0.306 0.537*** 0.319
(0.044) (0.197) (0.049) (0.197)
daysonhops 0.044*** 0.044*** 0.041*** 0.041***
(0.003) (0.003) (0.004) (0.004)
form_of_hops: ground/cone 0.849** 0.849**
(0.278) (0.277)
form_of_hops: pellet/cone 0.174 0.199
(0.200) (0.199)
rouse -0.227 -0.208
(0.122) (0.118)
daysonhops x rouseTRUE 0.010 0.009
(0.006) (0.006)
-----------------------------------------------------------------------------
sigma 0.355 0.340 0.352 0.339
R-squared 0.634 0.669 0.645 0.678
F 210.023 80.145 72.038 49.193
p 0.000 0.000 0.000 0.000
N 123 123 123 123
=============================================================================
#show_html(mtable1234b)
Table 20.1. Useful non-linear functions EXPANDED:
Function Class | name | equation | example code | example applications |
---|---|---|---|---|
Asymptotic functions | Michaelisâ“Menten | \(y =\frac{ax}{1+bx}\) | nls(bone~aage/(1+bage),start=list(a=8,b=0.08))) | enzyme reactions |
nls(rate~SSmicmen(conc,a,b)) | tbd | |||
2-parameter asymptotic exponential | \(y = a(1 â e^{âbx} )\) | nls(bone~a(1-exp(-cage)),start=list(a=120,c=0.064)) | tbd | |
3-parameter asymptotic exponential | \(y = a â be^{âcx}\) | nls(bone~a-bexp(-cage),start=list(a=120,b=110,c=0.064)) | tbd | |
nls(bone~SSasymp(age,a,b,c)) | tbd | |||
nls(density ~ SSlogis(log(concentration), a, b, c)) | tbd | |||
S-shaped functions | 2-parameter logistic | \(y = \frac{e^{a+bx}}{1 + e^{a+bx}}\) | tbd | |
3-parameter logistic | \(y = \frac{a}{1 + be^{âcx}}\) | tbd | ||
4-parameter logistic | \(y = a + \frac{b-a}{1 + e^{(câx)/d}}\) | nls(weight~SSfpl(Time, a, b, c, d)) | tbd | |
Weibull | \(y = a â be^{â(cx^d)}\) | nls(weight ~ SSweibull(time, Asym, Drop, lrc, pwr)) | tbd | |
Gompertz | \(y = ae^{âbe^{âcx}}\) | tbd | ||
Humped curves | Ricker curve | \(y = axe^{âbx}\) | tbd | |
First-order compartment | \(y = k exp(âexp(a)x) â exp(âexp(b)x)\) | nls(conc~SSfol(Dose, Time, a, b, c)) | tbd | |
Bell-shaped | \(y = a exp(âABS(bx)^2)\) | tbd | ||
Biexponential | \(y = ae^{bx} â ce^{âdx}\) | tbd |
\(y =\frac{ax}{1+bx}\)
expt2 <- FPHcalc %>% dplyr::filter(expt==" 2A") %>% dplyr::select(FPH_EtOH, rouse,special_conditions, daysonhops,brew_date, pounds_bbl,form_of_hops,ABW.control_NH,dayofaddition)
myfactor<- as.factor(expt2$special_conditions) #rouse
cols <- as.numeric(myfactor)
legend.cols <- as.numeric(as.factor(levels(myfactor)))
my.MM.model <- nls(FPH_EtOH~a*daysonhops/(1+b*daysonhops),myfactor, data=expt2,start=list(a=8, b=0.08))
summary(my.MM.model)
Formula: FPH_EtOH ~ a * daysonhops/(1 + b * daysonhops)
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 0.186159 0.010109 18.41 < 2e-16 ***
b 0.067820 0.005728 11.84 1.45e-15 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error: 0.09307 on 46 degrees of freedom
Number of iterations to convergence: 6
Achieved convergence tolerance: 1.686e-06
So using the Michaelis-Menten equation as our model, the relationship between hop-induced ethanol production (FPH_EtOH) and hop contact time (daysonhops) would be expressed as: \(FPH_EtOH = \frac{0.183572*daysonhops}{1+0.066388*daysonhops}\) with a Residual standard error of 0.09307. As detailed below, logistic models do a better job than this, and using library(drc) makes it easier to introduce a multilevel factor.
x <- seq(0,50,0.1)
yv <- predict(my.MM.model, list(daysonhops=x))
{plot(expt2$daysonhops, expt2$FPH_EtOH, pch=21, col="purple", bg="green", main = "Michaelis-Menten model of ethanol production induced by dry-hopping")
lines(x,yv,col="blue")}
R. Alex Speers, Peter Rogers, Bruce Smith J. Inst. Brew. 109(3), 229â“235, 2003 https://doi.org/10.1002/j.2050-0416.2003.tb00163.x Free Access from the Institute of Brewing! https://onlinelibrary.wiley.com/doi/10.1002/j.2050-0416.2003.tb00163.x
Following Speers2003, the relationship between plato and time in primary fermentations is well modeled by a four parameter logistic model:
PLATO = PINF + (P_D / (1+ (EXP(-B*(TIME-M)))))
or
\(PLATO = P_{inf}+ \frac{P_D}{1+e^{-B(t-M)}}\)
where
so then for our case
\(PLATO = P_{inf}+ \frac{OE-P_{inf}}{1+e^{-F_{max}(daysonhops-M)}}\)
PLATO = PINF + (P0 / (1+ (EXP(B*(HOURS-M)))))
Andrew J. MacIntosh, Maria Josey, R. Alex Speers J. Am. Soc. Brew. Chem. 74(4), 250-257, 2016
“is only statistically advantageous to apply the five-parameter model when many data points are available… (72 from this experiment as opposed to 10 when using Yeast-14).”
MacIntosh five-parameter logistic: \(P_{(t)} =\frac{P_i - P_e}{(1+s * e^{-B(t-M)})^{1/s}}\)
library(drc) getMeanFunctions()
drc LL.4: \(f(x) = c + \frac{d-c}{1+\exp(b(\log(x)-\tilde{e}))}\) #4param logistic drc LL.5: \(f(x) = c + \frac{d-c}{(1+\exp(b(\log(x)-e)))^f}\) #5param logistic drc G.4 \(f(x) = c + (d-c)(\exp(-\exp(b(x-e))))\) #4param Gompertz
drc W1.4 \(f(x) = c + (d-c) \exp(-\exp(b(\log(x)-\log(e))))\) #4param Weibull1 drc W2.4 \(f(x) = c + (d-c) (1 - \exp(-\exp(b(\log(x)-\log(e)))))\) #4param Weibull2 drc BC.5: \(f(x, b,c,d,e,f) = c + \frac{d-c+fx}{1+\exp(b(\log(x)-\log(e)))}\) # Brain-Cousens
df <- FPHcalc %>% filter(expt==" 2A")
df$PLATO.hopdrop <- df$initial_plato + df$delta.plato
x <- df$daysonhops
y <- df$PLATO.hopdrop
group<- as.factor(df$special_conditions) #rouse
cols <- as.numeric(group)
legend.cols <- as.numeric(as.factor(levels(group)))
## Fitting models using function drm from library(drc). see <http://rstats4ag.org/dose-response-curves.html> for overview and <https://www.rdocumentation.org/packages/drc/> for list of models (fct values) and other functions available in drc
## create models
#m.LL.3<-drm(y ~x,group, fct = LL.3()) #3-parameter logistic (lower limit at 0)
#m.LL.3u<-drm(y ~x, fct = LL.3u()) #3-parameter logistic (upper limit at 1)
m.LL.4<-drm(y ~x,group, fct = LL.4()) #4-parameter log-logistic
# from ??LL.4: f(x) = c + \frac{d-c}{1+\exp(b(\log(x)-\log(e)))} or in another parameterisation (converting the term \log(e) into a parameter) f(x) = c + \frac{d-c}{1+\exp(b(\log(x)-\tilde{e}))}
m.L.4 <-drm(y ~x,group, fct = L.4()) #changing the fct = LL.4() to fct = L.4() allows plotting on a log10 scale
m.LL2.4<-drm(y ~x,group, fct = LL2.4()) #4-parameter log-logistic with log(e) rather than e as a parameter
m.LL2.5<-drm(y ~x,group, fct = LL2.5()) #Generalised log-logistic
# from ??LL.2.5: f(x) = c + \frac{d-c}{(1+\exp(b(\log(x)-e)))^f}
m.LL.5<-drm(y ~x,group, fct = LL.5()) #5-parameter logistic
m.W1.4<-drm(y ~x,group, fct = W1.4()) #4-parameter Weibull1
NaNs produced
m.W2.4<-drm(y ~x,group, fct = W2.4()) #4-parameter Weibull2
m.BC.5<-drm(y ~x,group, fct = BC.5()) #5-parameter Brain-Cousens (hormesis)
m.AR.3<-drm(y ~x,group, fct = AR.3()) #3-parameter Shifted asymptotic regression
NaNs producedNaNs produced
#m.MM.2<-drm(y ~x,group, fct = MM.2()) #2-parameter Michaelis-Menten
m.MM.3<-drm(y ~x,group, fct = MM.3()) #3-parameter Michaelis-Menten
NaNs producedNaNs producedNaNs producedNaNs producedNaNs producedNaNs producedNaNs produced
##plot
#par(mfrow = c(1,2), oma = c(0, 0, 0, 0))
{plot(x , y, col=cols, main="raw data")
legend("topright", legend=levels(group), pch=16, col=legend.cols)}
#plot(m.LL.3, type='all',col=cols, main="LL.3 (lower limit at 0)")
plot(m.LL.4, type='all',col=cols, main="LL.4 four-parameter log-logistic")
plot(m.L.4, type='all',col=cols, main="L.4")
plot(m.LL.5, type='all',col=cols, main="LL.5")
plot(m.LL2.4, type='all',col=cols, main="LL2.4 four-parameter log-logistic")
plot(m.LL2.5, type='all',col=cols, main="Generalised log-logistic")
plot(m.W1.4, type='all',col=cols, main="Weibull1")
plot(m.W2.4, type='all',col=cols, main="Weibull2")
plot(m.BC.5, type='all',col=cols, main="Brain-Cousens (hormesis)")
plot(m.AR.3, type='all',col=cols, main="Shifted asymptotic regression")
#plot(m.MM.2, type='all',col=cols, main="2-parameter Michaelis-Menten")
plot(m.MM.3, type='all',col=cols, main="3-parameter Michaelis-Menten")
df <- FPHcalc %>% filter(expt==" 2A")
df$PLATO.hopdrop <- df$initial_plato + df$delta.plato
x <- df$daysonhops
y <- df$PLATO.hopdrop
myfactor<- as.factor(df$special_conditions) #rouse
cols <- as.numeric(myfactor)
legend.cols <- as.numeric(as.factor(levels(myfactor)))
## create models
#m.LL.4<-drm(y ~x,myfactor, fct = LL.4(names = c("Slope", "Lower", "Upper", "Midpoint or ED50"))) #4-parameter log-logistic (general parameters)
#m.LL.4<-drm(y ~x,myfactor, fct = LL.4(names = c("Slope", "Lower", "Upper", "ED50"))) #4-parameter log-logistic (Dose-Response parameters)
m.LL.4<-drm(y ~x,myfactor, fct = LL.4(names = c("F_max", "P_inf", "OE", "M"))) #4-parameter log-logistic
m.L.4 <-drm(y ~x,myfactor, fct = L.4()) #changing the fct = LL.4() to fct = L.4() allows plotting on a log10 scale
m.LL2.4<-drm(y ~x,myfactor, fct = LL2.4()) #4-parameter log-logistic with log(e) rather than e as a parameter
m.LL2.5<-drm(y ~x,myfactor, fct = LL2.5()) #Generalised log-logistic
m.LL.5<-drm(y ~x,myfactor, fct = LL.5()) #5-parameter logistic
#plot a few side by side
par(mfrow = c(1, 2), oma = c(0, 0, 0, 0))
plot(m.LL.4,log="", broken = TRUE, bcontrol = list(style = "slash"), col = c(2,6,3,23,56), main = "4-parameter logistic")
plot(m.LL2.5,log="", broken = TRUE, bcontrol = list(style = "slash"), col = c(2,6,3,23,56), main = "5-parameter logistic")
par(mfrow = c(3, 2), oma = c(0, 0, 0, 0))
{plot(m.MM.3$predres, main = "Michaelis-Menten")
abline(0,0)}
{plot(m.W2.4$predres, main = "Weibull2")
abline(0,0)}
{plot(m.LL.4$predres, main = "4-parameter logistic")
abline(0,0)}
{plot(m.LL.5$predres, main = "5-parameter logistic")
abline(0,0)}
{plot(m.BC.5$predres, main = "Brain-Cousens (hormesis)")
abline(0,0)}
{plot(m.LL2.5$predres, main = "Generalised log-logistic")
abline(0,0)}
#pick one model to be mymodel
mymodel <- m.LL.4
mymodel
A 'drc' model.
Call:
drm(formula = y ~ x, curveid = myfactor, fct = LL.4(names = c("F_max", "P_inf", "OE", "M")))
Coefficients:
F_max:rouse F_max:still P_inf:rouse P_inf:still OE:rouse
1.392 1.800 1.809 2.149 3.749
OE:still M:rouse M:still
3.751 12.274 7.958
confint(mymodel)
2.5 % 97.5 %
F_max:rouse 1.186734 1.597836
F_max:still 1.537194 2.063586
P_inf:rouse 1.653949 1.963275
P_inf:still 2.079132 2.219179
OE:rouse 3.710521 3.787753
OE:still 3.714969 3.786819
M:rouse 10.616221 13.931883
M:still 7.316189 8.600129
summary(mymodel)
Model fitted: Log-logistic (ED50 as parameter) (4 parms)
Parameter estimates:
Estimate Std. Error t-value p-value
F_max:rouse 1.392285 0.101704 13.690 < 2.2e-16 ***
F_max:still 1.800390 0.130226 13.825 < 2.2e-16 ***
P_inf:rouse 1.808612 0.076525 23.634 < 2.2e-16 ***
P_inf:still 2.149156 0.034647 62.030 < 2.2e-16 ***
OE:rouse 3.749137 0.019107 196.219 < 2.2e-16 ***
OE:still 3.750894 0.017775 211.018 < 2.2e-16 ***
M:rouse 12.274052 0.820272 14.963 < 2.2e-16 ***
M:still 7.958159 0.317638 25.054 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
Residual standard error:
0.03875737 (40 degrees of freedom)
Each level (rouse and still) has a unique curve shape thereby resulting in different parameters for a best-fit log-logistic model. Now that we know this, it is practical to address each set of conditions separately. Based on the 4-parameter logistic model LL.4 we have for roused samples:
\(PLATO_{roused} = P_{inf}+ \frac{OE-P_{inf}}{1+e^{-F_{max}(daysonhops-M)}}\)
or
\(PLATO_{roused} = 1.809 + \frac{1.94}{1+e^{-1.3923(daysonhops-12.274)}}\)
and for untouched (still) samples:
\(PLATO_{still} = 2.149 + \frac{1.60}{1+e^{-1.8004(daysonhops-7.958)}}\)
Plotting and analysis options for multilevel models are limited. By filtering the data to focus on one set of conditions at a time we create ‘simple’ y~x logistic models. Many wonderful visualization tools are available for such x~y models. Here’s just a few:
(specify eval=false in R notebook codechunk header {r eval=FALSE} to prevent multiple messages regarding “Recycling array of length 1 in array-vector arithmetic is deprecated” from appearing in html)
df <- FPHcalc %>% filter(expt==" 2A" & rouse==TRUE)
df$PLATO.hopdrop <- df$initial_plato + df$delta.plato
x <- as.numeric(df$daysonhops)
y <- df$PLATO.hopdrop
## create model
mymodel<-drm(y ~x, fct = LL.4()) #4-parameter logistic with library(drc)
plot(mymodel, main = "default (logarithmic) x-axis")
plot(mymodel, log="", main = "non-logarithmic x-axis")
# create prediction intervals
newdata = data.frame(y = y, x = x)
conf95 <- as.data.frame(predict(mymodel, newdata,interval = "confidence", level = 0.95))
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
conf95$x <- x
pred95 <- as.data.frame(predict(mymodel, newdata, interval = "prediction", level = 0.95))
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
pred95$x <- x
pred999 <- as.data.frame(predict(mymodel, newdata, interval = "prediction", level = 0.999))
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
Recycling array of length 1 in array-vector arithmetic is deprecated.
Use c() or as.vector() instead.
pred999$x <- x
mymodel
A 'drc' model.
Call:
drm(formula = y ~ x, fct = LL.4())
Coefficients:
b:(Intercept) c:(Intercept) d:(Intercept) e:(Intercept)
1.392 1.809 3.749 12.272
Note: If you compare the multi-level model parameter estimates for rouse=TRUE with those for unilevel model based on rouse=TRUE subset, the results are similar, but not identical!
# Asymmetric ribbons based on prediction intervals
#newdata <- data.frame(y = c(3.5, 3, 2.9, 2.5,1.8,1.7), x = c(1,5,10,20,30,40))
#newdata <- tidyfermi %>% filter(batch = SELECTBATCH) %>% select(days,plato)
qplot(data=pred95, x=x, y=Prediction, ymin=Lower, ymax=Prediction, geom="ribbon", fill=I("red"), alpha=I(0.2)) +
geom_ribbon(data=pred95, aes(x=x, ymin=Prediction, ymax=Upper), fill=I("blue"), alpha=I(0.2)) +
geom_line(data=pred95, aes(x=x, y=Prediction), color=I("green"), lwd=1)+
geom_point(data=newdata, aes(x=x, y=y, ymin=NULL, ymax=NULL), size=1, col="blue")+
ylab("y")
# Visualise intervals
# based on Maurits Evers (https://stackoverflow.com/questions/49444489/nonlinear-regression-prediction-in-r?rq=1)
data.frame(x=x, y=y) %>% ggplot(aes(x, y)) +
geom_point() +
geom_line(data = pred95, aes(x = x, y = Prediction)) +
geom_ribbon(data = pred999, aes(x = x, ymin = Lower, ymax = Upper),fill=I("pink"),alpha = 0.4)+
geom_ribbon(data = pred95, aes(x = x, ymin = Lower, ymax = Upper),fill=I("green"),alpha = 0.4)+
geom_ribbon(data = conf95, aes(x = x, ymin = Lower, ymax = Upper),fill=I("purple"),alpha = 0.4);
For cases where 2016 Centennial hops are added to this particular base beer (sampled into 12oz bottles) near the end of primary fermentation at room temperature and bottles remain untouched (still), the 4-parameter logistic equation with the following sets of parameters is a reasonable model of the subsequent hop-induced plato drop: \(PLATO_{still} = f(daysonhops) = 2.149 + \frac{1.60}{1+e^{-1.8004(daysonhops-7.958)}}\)
The following 4-parameter logistic equation is a reasonable model for cases identical to the above, but where bottles are roused once daily: \(PLATO_{roused} = f(daysonhops) = 1.809 + \frac{1.94}{1+e^{-1.3925(daysonhops-12.272)}}\)
The residuals for the 5-parameter logistic models are similar to those obtained for the 4-parameter equation, but in the case of unroused samples a visual inspection reveals very different curves. The 4-parameter logistic predicts a much shorter “lag time” than does the 5-parameter logistic. More data points in the days after dryhopping would shed light on the question as to which model is superior.
Only a single variable (rousing vs. still) among several hundred variables encompassing raw materials, machines, people, process, and gages (measurements) resulted in quite different estimated parameters for a 4-parameter logistic model. These results highlight the fact that all process variables known to significantly impact the endpoints we wish to control must themselves be defined, controlled and/or documented if models such as these are to be useful to characterize the process in question.
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.2.0 data.table_1.11.8 chron_2.3-53
[4] lubridate_1.7.4 bindrcpp_0.2.2 gridExtra_2.3
[7] drc_3.0-1 memisc_0.99.14.12 MASS_7.3-50
[10] psych_1.8.10 caret_6.0-81 lattice_0.20-35
[13] forcats_0.3.0 stringr_1.3.1 purrr_0.2.5
[16] readr_1.2.1 tidyr_0.8.2 tibble_1.4.2
[19] tidyverse_1.2.1 ggplot2_3.1.0 dplyr_0.7.8
loaded via a namespace (and not attached):
[1] nlme_3.1-137 httr_1.3.1 rprojroot_1.3-2
[4] repr_0.19.1 tools_3.5.1 backports_1.1.2
[7] R6_2.3.0 rpart_4.1-13 lazyeval_0.2.1
[10] colorspace_1.3-2 nnet_7.3-12 withr_2.1.2
[13] tidyselect_0.2.5 mnormt_1.5-5 curl_3.2
[16] compiler_3.5.1 cli_1.0.1 rvest_0.3.2
[19] xml2_1.2.0 sandwich_2.5-0 labeling_0.3
[22] scales_1.0.0 mvtnorm_1.0-8 digest_0.6.18
[25] foreign_0.8-70 rmarkdown_1.10 rio_0.5.16
[28] base64enc_0.1-3 pkgconfig_2.0.2 htmltools_0.3.6
[31] plotrix_3.7-4 rlang_0.3.0.1 rstudioapi_0.8
[34] bindr_0.1.1 generics_0.0.1 zoo_1.8-4
[37] jsonlite_1.5 gtools_3.8.1 ModelMetrics_1.2.2
[40] zip_1.0.0 car_3.0-2 magrittr_1.5
[43] Matrix_1.2-14 Rcpp_1.0.0 munsell_0.5.0
[46] abind_1.4-5 stringi_1.2.4 multcomp_1.4-8
[49] yaml_2.2.0 carData_3.0-2 plyr_1.8.4
[52] recipes_0.1.4 grid_3.5.1 parallel_3.5.1
[55] crayon_1.3.4 haven_2.0.0 splines_3.5.1
[58] hms_0.4.2 knitr_1.20 pillar_1.3.0
[61] reshape2_1.4.3 codetools_0.2-15 stats4_3.5.1
[64] glue_1.3.0 evaluate_0.12 modelr_0.1.2
[67] foreach_1.4.4 cellranger_1.1.0 gtable_0.2.0
[70] assertthat_0.2.0 gower_0.1.2 openxlsx_4.1.0
[73] prodlim_2018.04.18 broom_0.5.0 class_7.3-14
[76] survival_2.42-3 timeDate_3043.102 iterators_1.0.10
[79] lava_1.6.4 TH.data_1.0-9 ipred_0.9-8