The Fat data

The Fat data contains the age, weight, height, and ten body circumference measurements for 252 men. Each man’s percentage of body fat was accurately estimated by an underwater weighing technique.

The data frame contains the following variables:

brozek: Percent of body fat using Brozek’s equation, 457/Density – 414.2

siri: Percent body fat using Siri’s equation, 495/Density – 450

density: Density (g/cm³)

age: Age (yrs)

weight: Weight (lbs)

height: Height (inches)

adipos: Adiposity index = Weight/Height² (kg/m²)

free: Fat Free Weight = (1 – fraction of body fat) * Weight, using Brozek’s formula (lbs)

neck: Neck circumference (cm)

chest: Chest circumference (cm)

abdom: Abdomen circumference (cm) at the umbilicus and level with the iliac crest

hip: Hip circumference (cm)

thigh: Thigh circumference (cm)

knee: Knee circumference (cm)

ankle: Ankle circumference (cm)

biceps: Extended biceps circumference (cm)

forearm: Forearm circumference (cm)

wrist: Wrist circumference (cm) distal to the styloid processes

You can access the data using the following statement: data(fat, package = "faraway")

Question 1

Fit a regression model with the brozek variable (percent of body fat) as a response and the following six predictors: age, neck, abdom, thigh, forearm and wrist.

Show the summary. Which predictors are significant at the 0.05 level?

Question 2

Interpret the coefficient of each significant predictor.

Hint: See Lesson 3, Slide 49 and Slide 58.
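
As a rough sketch of the workflow in Questions 1 and 2 (fit, summarize, pick out the significant terms, read a coefficient), the code below uses the built-in mtcars data rather than the fat data. The response, the predictors, and the object names lmod, coefs, and sig are placeholders chosen for illustration only.

```r
# Illustrative only: built-in mtcars data, not the fat data from the assignment.
lmod <- lm(mpg ~ wt + hp + qsec, data = mtcars)
summary(lmod)

# Coefficient table: estimate, standard error, t value, p-value
coefs <- summary(lmod)$coefficients

# Names of the terms whose p-value is below 0.05
sig <- rownames(coefs)[coefs[, "Pr(>|t|)"] < 0.05]
sig

# Reading a slope: holding the other predictors fixed, a one-unit increase
# in wt is associated with a change of this size in the expected mpg.
coef(lmod)["wt"]
```

The same pattern should carry over to the assignment by swapping in the fat data, brozek as the response, and the six listed predictors.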

Question 3

Compute the median value of each of the six predictors. Store the medians in a variable named x0 and show the values.

Hint: See Lesson 4, Slide 18.

Question 4

Construct a confidence interval of the mean response based on the median values that you stored in x0.

Hint: See Lesson 4, Slide 20.

Question 5

Construct a prediction interval of the next response value based on the median values that you stored in x0.

Hint: See Lesson 4, Slide 20.
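
One possible way to carry out the steps in Questions 3 to 5 is sketched below on the built-in mtcars data, which stands in for the fat data. The model, the two predictors, and the names conf_int, pred_int, and the width variables are assumptions made for illustration. The sketch stores the medians as a one-row data frame, because predict() expects its newdata argument in that form.

```r
# Illustrative only: built-in mtcars data, not the fat data from the assignment.
lmod <- lm(mpg ~ wt + hp, data = mtcars)

# Median of each predictor, stored as a one-row data frame
x0 <- data.frame(lapply(mtcars[c("wt", "hp")], median))
x0

# 95% confidence interval for the mean response at x0
conf_int <- predict(lmod, newdata = x0, interval = "confidence", level = 0.95)

# 95% prediction interval for a new observation at x0
pred_int <- predict(lmod, newdata = x0, interval = "prediction", level = 0.95)

# Interval widths, for comparing the two
conf_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pred_width <- pred_int[, "upr"] - pred_int[, "lwr"]
c(confidence = conf_width, prediction = pred_width)
```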

Question 6

Which of the two intervals is wider?

Question 7

Construct a confidence interval of the outcome variable for a person with the following characteristics:

Age: 49 years

Neck circumference: 40 cm

Abdomen circumference: 95 cm

Thigh circumference: 60 cm

Forearm circumference: 31 cm

Wrist circumference: 19.5 cm

Hints:

You can store the predictor values in a new variable named x1. Here is an example of such a variable:

x1 <- c("(Intercept)" = 1, age = 25, neck = 34, abdom = 84, forearm = 25, wrist = 25)

Note that the intercept entry should be 1. You will need to update the values of the predictors and add an entry for thigh, which this example leaves out.
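
To show how a named vector with an intercept entry of 1 relates to the fitted model, here is a minimal sketch on the built-in mtcars data. The model and the predictor values in x1 are hypothetical and are not the ones from the question. The sketch computes the point estimate by hand and then obtains the same value, with a confidence interval, from predict().

```r
# Illustrative only: built-in mtcars data and made-up predictor values.
lmod <- lm(mpg ~ wt + hp, data = mtcars)

# Entries must line up with names(coef(lmod)), intercept first
x1 <- c("(Intercept)" = 1, wt = 3, hp = 120)

# Point estimate as the sum of coefficient-by-value products
fit_manual <- sum(x1 * coef(lmod))

# predict() wants a data frame of predictors only, so drop the intercept entry
new_obs <- as.data.frame(t(x1[-1]))
conf_int <- predict(lmod, newdata = new_obs, interval = "confidence")
conf_int
```

The fit column of conf_int should agree with fit_manual, which is a quick check that the names and the order of x1 match the model.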

# Category: R

## Please help me solve the problems in the following document, and provide the ans

Please help me solve the problems in the following document, and provide the answers in PDF form. Thank you!

## The data set is below this link, and the description is below, and you need to u

The data set is at the link below, along with its description. You need to use kNN (regression or classification) on the data, and you need to know how to use R and R Markdown to work on this report.

https://drive.google.com/drive/folders/1e17NNZPVSc…

## R questions will provide more data once accepted. R workshop – Data Visualization

R questions will provide more data once accepted. R workshop – Data Visualization Exercise #1-#8 (2 points each)

Workshop #4 Exercise- #1

Exercise- #2.1

## Part 1: The Cheddar Cheese Study In a study of cheddar cheese from the LaTrobe V

Part 1: The Cheddar Cheese Study

In a study of cheddar cheese from the LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests. Overall taste scores were obtained by combining the scores from several tasters.

The cheddar dataset has 30 observations on the following four variables:

taste: a subjective taste score

Acetic: concentration of acetic acid (log scale)

H2S: concentration of hydrogen sulfide (log scale)

Lactic: concentration of lactic acid

Use the following statement to access the data: data("cheddar", package = "faraway")

Question 1.1:

Show the first six observations of the dataset.

Hint: Use the head() function. For example: head(cars)

Question 1.2:

Show descriptive statistics for each of the variables.

What is the mean subjective taste score?

Hint: Use the summary function. For example: summary(cars)

Question 1.3:

Show a histogram for each of the variables. Make sure to label the x-axis of each histogram.

Hint: Use the hist function. For example: hist(cars$speed, xlab = "Speed (mph)", main = "")

Question 1.4:

Fit a regression model with taste as the response and no predictors.

What is the value of the intercept? What does it represent?

For example:

lmodNoX <- lm(speed ~ 1, data = cars)
summary(lmodNoX)
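
As a small check on what an intercept-only fit estimates, the sketch below uses the built-in faithful data (an arbitrary choice, not part of the assignment) and compares the fitted intercept with the sample mean of the response. The object names are placeholders.

```r
# Illustrative only: built-in faithful data.
lmod_no_x <- lm(eruptions ~ 1, data = faithful)

# The single coefficient of the intercept-only model
intercept <- unname(coef(lmod_no_x))

# Sample mean of the response, for comparison
response_mean <- mean(faithful$eruptions)

c(intercept = intercept, mean = response_mean)
```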

Question 1.5:

Fit a regression model with taste as the response and Acetic as the only predictor.

Is the model statistically significant at the 5% level?

Hint: See Lesson 2, Slide 51.

Question 1.6:

Calculate the p-value of the entire model you created in Question 1.5 using the anova() function.

Hint: See Lesson 3, Slide 17.
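
One way to obtain a whole-model p-value with anova() is to compare the fitted model against the intercept-only model. The sketch below does this on the built-in cars data, which is an assumption for illustration and not the cheddar data, and recomputes the same p-value from the F statistic stored by summary().

```r
# Illustrative only: built-in cars data.
lmod <- lm(dist ~ speed, data = cars)
null_mod <- lm(dist ~ 1, data = cars)

# F test of the null model against the one-predictor model
tab <- anova(null_mod, lmod)
p_anova <- tab[2, "Pr(>F)"]

# Same test from the F statistic reported by summary()
f <- summary(lmod)$fstatistic
p_summary <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

c(anova = p_anova, summary = p_summary)
```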

Question 1.7:

Fit a regression model with taste as the response and the three chemical contents as predictors (Acetic, H2S, and Lactic).

Which predictors are statistically significant at the 5% level?

Hint: See Lesson 3, Slide 16.

Question 1.8:

Use the anova() function to recalculate the significance of the Acetic variable as shown in the output of Question 1.7.

Hint: Use the anova function to compare the three-predictor model with a model that does not include Acetic. See Lesson 3, Slide 19.
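
The comparison described in the hint can be sketched as follows on the built-in mtcars data, with qsec playing the role of the dropped predictor. The data set, the predictors, and the object names are assumptions for illustration. When the two models differ by a single term, the F test from anova() and the t test in the summary table should give the same p-value.

```r
# Illustrative only: built-in mtcars data.
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)  # same model without qsec

# F test for dropping one predictor
tab <- anova(reduced, full)
p_f <- tab[2, "Pr(>F)"]

# t test for the same predictor in the full model
coefs <- summary(full)$coefficients
p_t <- coefs["qsec", "Pr(>|t|)"]

c(anova = p_f, summary = p_t)
```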

Question 1.9:

Test the hypothesis that the coefficients of Acetic and H2S both equal 0 when Lactic is included in the model.

Should we reject this hypothesis?

Hint: See Lesson 3, Slide 22.
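
A joint test that two coefficients are both zero can be set up as a comparison of nested models. The sketch below, on the built-in mtcars data with placeholder predictors, drops two terms at once and reads the F test from anova(). It is only an illustration of the mechanics, not the cheddar analysis.

```r
# Illustrative only: built-in mtcars data.
full <- lm(mpg ~ wt + hp + qsec, data = mtcars)
reduced <- lm(mpg ~ wt, data = mtcars)  # hp and qsec both removed

# F test of H0: the coefficients of hp and qsec are both 0
tab <- anova(reduced, full)
tab

# Degrees of freedom of the test and its p-value
df_tested <- tab[2, "Df"]
p_joint <- tab[2, "Pr(>F)"]
```

The null hypothesis would be rejected at the 5% level when p_joint is below 0.05.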

Part 2: Study of Teenage Gambling in Britain

The teengamb dataset contains data from a survey conducted to study teenage gambling in Britain. The dataset has 47 observations and five variables:

sex: 0 = male, 1 = female

status: Socioeconomic status score based on parents’ occupation

income: income in pounds per week

verbal: verbal score in words out of 12 correctly defined

gamble: expenditure on gambling in pounds per year

Use the following statement to access the data: data("teengamb", package = "faraway")

Question 2.1:

Convert the sex variable into a factor and label the levels (male and female).

Hint: See Lesson 3, Slide 48.

Question 2.2:

Show the number of males and the number of females.

Hint: Use the summary function. See Lesson 1, Slide 52.
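
As a minimal sketch of the conversion and the count, the code below turns a made-up 0/1 vector into a labelled factor and summarizes it. The vector grp and its values are hypothetical and only stand in for a coded column such as sex.

```r
# Hypothetical 0/1 codes standing in for a coded column
grp <- c(0, 1, 1, 0, 0, 1, 0)

# Map code 0 to "male" and code 1 to "female"
grp <- factor(grp, levels = c(0, 1), labels = c("male", "female"))

# summary() on a factor returns the count in each level
counts <- summary(grp)
counts
```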

Question 2.3:

Fit a model with gamble as the response and income, verbal and sex as predictors.

Which variables are statistically significant at the 5% level?

Interpret the coefficients of the significant variables.

Hint: See Lesson 3, Slide 49 and Slide 58.

Question 2.4:

Use the confint function to produce 95% confidence intervals for the coefficients based on the same model.

Can you deduce which coefficients are significant at the 5% level based on the intervals?

Hint: See Lesson 3, Slide 26, the last code for the whole model.
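
The link between the intervals and the 5% tests can be checked with a short sketch. It uses the built-in mtcars data and placeholder predictors, not the teengamb model, and compares "the 95% interval excludes 0" with "the p-value is below 0.05" for every coefficient.

```r
# Illustrative only: built-in mtcars data.
lmod <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# 95% confidence intervals, one row per coefficient
ci <- confint(lmod, level = 0.95)
ci

# TRUE where the interval lies entirely above or entirely below 0
excludes_zero <- ci[, 1] > 0 | ci[, 2] < 0

# TRUE where the t test is significant at the 5% level
p_below_05 <- summary(lmod)$coefficients[, "Pr(>|t|)"] < 0.05

cbind(excludes_zero, p_below_05)
```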

## a. How many people in this sample have breast cancer? What proportion? b. What p

a. How many people in this sample have breast cancer? What proportion?

b. What proportion of respondents are missing information for household income?

c. Why do the proportions change when we use the option “, missing” in the tab command?

d. What is the mean BMI in the sample? What is the standard deviation? What are the minimum and maximum values?

e. Are there any participants missing information for BMI? How can you tell?

Please answer the questions according to the document provided.

## The magrittr package adds a set of tools called pipes to R. We need it for this

The magrittr package adds a set of tools called pipes to R. We need it for this exam: install.packages("magrittr").

Rewrite the following function calls using pipes, with x <- 1:8 (and submit the solutions in plain text):
1. Add a new variable rev_per_minute using a pipe (below is how it is done without a pipe, so it has to be replaced)
age <- c(28, 48, 47, 71, 22, 80, 48, 30, 31)
purchase <- c(20, 59, 2, 12, 22, 160, 34, 34, 29)
visit_length <- c(5, 2, 20, 22, 12, 31, 9, 10, 11)
bookstore <- data.frame(age, purchase, visit_length)
2. sqrt(mean(x))
3. assign("x", 25)
4. sort(x^2+5)[1:2]
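
For readers new to the pipe, here is a minimal sketch of how a nested call maps onto %>%, assuming the magrittr package is installed. The vector y, the data frame df, and the column names are made up for illustration and differ from the exam items above.

```r
library(magrittr)  # provides the %>% operator

# Hypothetical vector
y <- c(4, 9, 16)

# Nested form: read from the inside out
nested <- round(log(sum(y)), 2)

# Piped form: read left to right; each result becomes the first argument
piped <- y %>% sum() %>% log() %>% round(2)

# Adding a column inside a pipe with base R's transform()
df <- data.frame(a = 1:3, b = c(2, 4, 6))
df2 <- df %>% transform(ratio = b / a)
df2
```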
Attached grading rubric