Part 1: The Cheddar Cheese Study
In a study of cheddar cheese from the LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests. Overall taste scores were obtained by combining the scores from several tasters.
The cheddar dataset has 30 observations on the following four variables:
taste: a subjective taste score
Acetic: concentration of acetic acid (log scale)
H2S: concentration of hydrogen sulfide (log scale)
Lactic: concentration of lactic acid
Use the following statement to access the data: data("cheddar", package = "faraway")
Question 1.1:
Show the first six observations of the dataset.
Hint: Use the head() function. For example: head(cars)
Question 1.2:
Show descriptive statistics for each of the variables.
What is the mean subjective taste score?
Hint: Use the summary function. For example: summary(cars)
Question 1.3:
Show a histogram for each of the variables. Make sure to label the x-axis of each histogram.
Hint: Use the hist function. For example: hist(cars$speed, xlab = "Speed (mph)", main = "")
Question 1.4:
Fit a regression model with taste as the response and no predictors.
What is the value of the intercept? What does it represent?
For example:
lmodNoX <- lm(speed ~ 1, data = cars)
summary(lmodNoX)
Question 1.5:
Fit a regression model with taste as the response and Acetic as the only predictor.
Is the model statistically significant at the 5% level?
Hint: See Lesson 2, Slide 51.
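As an illustration only, the same kind of fit can be sketched on the built-in cars dataset rather than the cheddar data. The object names lmodOneX and pSlope are hypothetical. With a single predictor, the overall F-test of the model and the t-test of the slope give the same p-value, so either can be compared against 0.05.

```r
# Sketch on the built-in cars data (not the cheddar data)
lmodOneX <- lm(dist ~ speed, data = cars)
summary(lmodOneX)

# p-value of the slope, taken from the coefficient table
pSlope <- summary(lmodOneX)$coefficients["speed", "Pr(>|t|)"]

# TRUE means the predictor is significant at the 5% level
pSlope < 0.05
```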
Question 1.6:
Calculate the p-value of the entire model you created in Question 1.5 using the anova() function.
Hint: See Lesson 3, Slide 17.
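One common way to do this, shown here on the built-in cars dataset, is to compare the fitted model with the intercept-only model. This is a sketch and may differ from the approach on the slide. The names lmodNull, lmodOneX, anovaTab and pModel are hypothetical.

```r
# Sketch on the built-in cars data (not the cheddar data)
lmodNull <- lm(dist ~ 1, data = cars)      # no predictors
lmodOneX <- lm(dist ~ speed, data = cars)  # one predictor

# F-test comparing the two nested models
anovaTab <- anova(lmodNull, lmodOneX)
anovaTab

# p-value of the whole model: second row of the Pr(>F) column
pModel <- anovaTab[["Pr(>F)"]][2]
pModel
```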
Question 1.7:
Fit a regression model with taste as the response and the three chemical contents as predictors (Acetic, H2S, and Lactic).
Which predictors are statistically significant at the 5% level?
Hint: See Lesson 3, Slide 16.
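A minimal sketch of a three-predictor fit, using the built-in mtcars dataset as a stand-in for the cheddar data. The choice of predictors (wt, hp, qsec) and the names lmod3, coefTab and sigTerms are assumptions made for illustration.

```r
# Sketch on the built-in mtcars data (not the cheddar data)
lmod3 <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Coefficient table: estimate, standard error, t value, p-value
coefTab <- summary(lmod3)$coefficients
coefTab

# Names of the terms whose p-value is below 0.05
sigTerms <- rownames(coefTab)[coefTab[, "Pr(>|t|)"] < 0.05]
sigTerms
```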
Question 1.8:
Use the anova() function to recalculate the significance of the Acetic variable as shown in the output of Question 1.7.
Hint: Use the anova function to compare the three-predictor model with a model that does not include Acetic. See Lesson 3, Slide 19.
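The comparison described in the hint can be sketched on the built-in mtcars dataset, dropping one predictor (hp, chosen arbitrarily) from a three-predictor model. The names lmodFull, lmodNoHp, pF and pT are hypothetical. Because only one coefficient is removed, the F-test from anova and the t-test from summary should return the same p-value.

```r
# Sketch on the built-in mtcars data (not the cheddar data)
lmodFull <- lm(mpg ~ wt + hp + qsec, data = mtcars)
lmodNoHp <- lm(mpg ~ wt + qsec, data = mtcars)  # same model without hp

# F-test: smaller model first, larger model second
cmp <- anova(lmodNoHp, lmodFull)
cmp

pF <- cmp[["Pr(>F)"]][2]                                   # from anova
pT <- summary(lmodFull)$coefficients["hp", "Pr(>|t|)"]     # from summary

# The two p-values agree up to rounding error
all.equal(pF, pT)
```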
Question 1.9:
Test the hypothesis that the coefficients of Acetic and H2S both equal 0 when Lactic is included in the model.
Should we reject this hypothesis?
Hint: Lesson 3, Slide 22.
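A joint test of two coefficients can be sketched in the same way, again on the built-in mtcars dataset with arbitrarily chosen predictors. The reduced model keeps only the predictor that stays in under the null hypothesis. The names lmodFull, lmodWtOnly, jointTest and pJoint are hypothetical.

```r
# Sketch on the built-in mtcars data (not the cheddar data)
lmodFull   <- lm(mpg ~ wt + hp + qsec, data = mtcars)
lmodWtOnly <- lm(mpg ~ wt, data = mtcars)  # hp and qsec coefficients set to 0

# F-test with 2 numerator degrees of freedom (two coefficients tested)
jointTest <- anova(lmodWtOnly, lmodFull)
jointTest

# Reject the null hypothesis at the 5% level if this is below 0.05
pJoint <- jointTest[["Pr(>F)"]][2]
pJoint
```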
Part 2: Study of Teenage Gambling in Britain
The teengamb dataset contains a survey conducted to study teenage gambling in Britain. The dataset has 47 observations and five variables:
sex: 0 = male, 1 = female
status: Socioeconomic status score based on parents’ occupation
income: income in pounds per week
verbal: verbal score in words out of 12 correctly defined
gamble: expenditure on gambling in pounds per year
Use the following statement to access the data: data("teengamb", package = "faraway")
Question 2.1:
Convert the sex variable into a factor and label the levels (male and female).
Hint: See Lesson 3, Slide 48.
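As a sketch of the general pattern, the 0/1 variable am in the built-in mtcars dataset (0 = automatic, 1 = manual) can be converted to a labelled factor. The copy carsDat is a hypothetical name used so that the original data frame is left unchanged.

```r
# Sketch on the built-in mtcars data (not the teengamb data)
carsDat <- mtcars

# levels gives the codes in the data; labels gives the names, in the same order
carsDat$am <- factor(carsDat$am, levels = c(0, 1),
                     labels = c("automatic", "manual"))

levels(carsDat$am)
```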
Question 2.2:
Show the number of males and the number of females.
Hint: Use the summary function. See Lesson 1, Slide 52.
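Applied to a factor, summary returns the number of observations at each level. A short sketch on the built-in mtcars dataset, where transmission and counts are hypothetical names:

```r
# Sketch on the built-in mtcars data (not the teengamb data)
transmission <- factor(mtcars$am, levels = c(0, 1),
                       labels = c("automatic", "manual"))

# For a factor, summary() gives a named vector of counts per level
counts <- summary(transmission)
counts
```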
Question 2.3:
Fit a model with gamble as the response and income, verbal and sex as predictors.
Which variables are statistically significant at the 5% level?
Interpret the coefficients of the significant variables.
Hints: See Lesson 3, Slide 49 and Slide 58.
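A minimal sketch of a model that mixes numeric predictors with a two-level factor, using the built-in mtcars dataset. The names carsDat and lmodFac are hypothetical. R names the factor coefficient by pasting the variable name and the non-reference level together (here ammanual), and that coefficient estimates the difference in mean response between that level and the reference level with the other predictors held fixed.

```r
# Sketch on the built-in mtcars data (not the teengamb data)
carsDat <- mtcars
carsDat$am <- factor(carsDat$am, levels = c(0, 1),
                     labels = c("automatic", "manual"))

lmodFac <- lm(mpg ~ wt + hp + am, data = carsDat)
summary(lmodFac)

# Estimated difference in mpg, manual minus automatic, at fixed wt and hp
coef(lmodFac)["ammanual"]
```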
Question 2.4:
Use the confint function to produce 95% confidence intervals for the coefficients based on the same model.
Can you deduce which coefficients are significant at the 5% level based on the intervals?
Hint: See Lesson 3, Slide 26, the last code for the whole model.
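The call pattern for confint can be sketched on the built-in mtcars dataset. The predictors and the names lmod3, ci, excludesZero and pBelow05 are assumptions made for illustration. The last lines place the intervals next to the p-values from summary so the two can be compared.

```r
# Sketch on the built-in mtcars data (not the teengamb data)
lmod3 <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# 95% confidence intervals: one row per coefficient, lower and upper bound
ci <- confint(lmod3, level = 0.95)
ci

# Does each interval exclude zero, and is each p-value below 0.05?
excludesZero <- ci[, 1] > 0 | ci[, 2] < 0
pBelow05 <- summary(lmod3)$coefficients[, "Pr(>|t|)"] < 0.05
cbind(excludesZero, pBelow05)
```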