1. Provide the code that parallelizes the following:

library(MKinfer) # Load package used for permutation t-test

# Create a function for running the simulation:
simulate_type_I <- function(n1, n2, distr, level = 0.05, B = 999,
                            alternative = "two.sided", ...)
{
  # Create a data frame to store the results in:
  p_values <- data.frame(p_t_test = rep(NA, B),
                         p_perm_t_test = rep(NA, B),
                         p_wilcoxon = rep(NA, B))

  for(i in 1:B)
  {
    # Generate data:
    x <- distr(n1, ...)
    y <- distr(n2, ...)

    # Compute p-values:
    p_values[i, 1] <- t.test(x, y, alternative = alternative)$p.value
    p_values[i, 2] <- perm.t.test(x, y, alternative = alternative,
                                  R = 999)$perm.p.value
    p_values[i, 3] <- wilcox.test(x, y, alternative = alternative)$p.value
  }

  # Return the type I error rates:
  return(colMeans(p_values < level))
}

2. Provide the code that runs the following code in parallel with 4 workers (with mclapply):

lapply(airquality, function(x) { (x - mean(x)) / sd(x) })
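One possible approach to both tasks is sketched below. It is a minimal sketch rather than a reference solution. It assumes the base parallel package, and it falls back to a single core on Windows, where forking is unavailable. The helper names one_replicate, simulate_type_I_par and n_cores are hypothetical, and the permutation t-test from MKinfer is left out so that the block runs with base R only; it could be added to one_replicate in the same way as the other two tests.

```r
library(parallel)

# mclapply() forks the R process, which is not available on Windows,
# so use a single core there (an assumption made for portability).
n_cores <- if (.Platform$OS.type == "windows") 1L else 4L

# Task 2: standardize each column of airquality with 4 workers.
# Columns containing NA values yield NA, exactly as with lapply().
scaled <- mclapply(airquality, function(x) {
  (x - mean(x)) / sd(x)
}, mc.cores = n_cores)

# Task 1 (simplified): each replicate returns a named vector of p-values.
one_replicate <- function(i, n1, n2, distr, alternative = "two.sided", ...) {
  x <- distr(n1, ...)
  y <- distr(n2, ...)
  c(p_t_test   = t.test(x, y, alternative = alternative)$p.value,
    p_wilcoxon = wilcox.test(x, y, alternative = alternative)$p.value)
}

# The for loop over 1:B becomes an mclapply() call over the replicates.
simulate_type_I_par <- function(n1, n2, distr, level = 0.05, B = 999,
                                alternative = "two.sided", ...) {
  p_list <- mclapply(seq_len(B), one_replicate,
                     n1 = n1, n2 = n2, distr = distr,
                     alternative = alternative, ...,
                     mc.cores = n_cores)
  p_values <- do.call(rbind, p_list)   # B x 2 matrix of p-values
  colMeans(p_values < level)           # type I error rate per test
}
```

A call such as simulate_type_I_par(20, 20, rnorm, B = 999) would then return one estimated type I error rate per test. Because each replicate is independent, the loop body needs no changes beyond returning its p-values instead of writing them into a shared data frame.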

Posted in R

Part 1: Understanding Data and Measurement (15 points)
Data and Information Hierarchy (5 points): Describe the difference between data, information, knowledge, and wisdom, explaining the hierarchical relationship among them. Provide a specific, real-world example to illustrate each level of the hierarchy.
Variables and Measurement Scales (10 points): Explain the different types of variables (nominal, ordinal, interval, and ratio), and describe the associated scales of measurement. Provide a specific example of each type of variable and explain why it is classified as such.
Part 2: Descriptive Statistics and Bivariate Analysis (30 points)
Frequency Distribution and Summary Measures (15 points): Select a dataset (this could be publicly available data or a dataset from your workplace). Create a frequency distribution for a chosen variable, calculate common summary measures (mean, median, mode, range, variance, and standard deviation), and provide a short interpretation of these measures.
Bivariate Analysis (15 points): With the same dataset or a different one, conduct a bivariate analysis that includes both an association between two qualitative variables and a correlation between two quantitative variables. Interpret your findings.
Part 3: Probability and Distributions (30 points)
Probability (10 points): Discuss the basic rules of probability, conditional probability, and Bayes’ theorem. Illustrate your discussion with unique examples.
Random Variables and Probability Distributions (20 points): Define discrete and continuous random variables. Give a real-world example of each and describe the associated probability distribution for each variable.
Part 4: Sampling Techniques (20 points)
Sampling (20 points): Define and differentiate random and non-random sampling. Discuss how to determine an appropriate sample size for a given study. Include an illustrative example from a real or hypothetical research study.
Submission Format: The assignment should be approximately 2,000 words. The focus should be on the quality and depth of your responses, rather than meeting a strict word count. Cite your sources in-text and on the reference page in APA format. Write in a clear, concise, and organized manner; demonstrate ethical scholarship in accurate representation and attribution of sources.
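For the frequency distribution and summary measures requested in Part 2, a minimal sketch in R could look like the following. The dataset (the built-in mtcars) and the variable (cyl, the number of cylinders) are arbitrary choices made only for illustration. Base R has no function for the statistical mode, so it is derived from the frequency table here.

```r
# Illustrative variable: number of cylinders in the built-in mtcars data.
x <- mtcars$cyl

# Frequency distribution of the chosen variable.
freq <- table(x)

# The mode is the most frequent value in the table.
mode_value <- as.numeric(names(freq)[which.max(freq)])

# Common summary measures collected in one named vector.
summary_measures <- c(
  mean     = mean(x),
  median   = median(x),
  mode     = mode_value,
  range    = diff(range(x)),   # maximum minus minimum
  variance = var(x),           # sample variance (n - 1 denominator)
  sd       = sd(x)
)

print(freq)
print(round(summary_measures, 3))
```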

Posted in R

The Fat data
The Fat data contains the age, weight, height, and ten body circumference measurements for 252 men. Each man’s percentage of body fat was accurately estimated by an underwater weighing technique.
The data frame contains the following variables:
brozek: Percent of body fat using Brozek’s equation, 457/Density – 414.2
siri: Percent body fat using Siri’s equation, 495/Density – 450
density: Density (g/cm^3)
age: Age (yrs)
weight: Weight (lbs)
height: Height (inches)
adipos: Adiposity index = Weight/Height^2 (kg/m^2)
free: Fat Free Weight = (1 – fraction of body fat) * Weight, using Brozek’s formula (lbs)
neck: Neck circumference (cm)
chest: Chest circumference (cm)
abdom: Abdomen circumference (cm) at the umbilicus and level with the iliac crest
hip: Hip circumference (cm)
thigh: Thigh circumference (cm)
knee: Knee circumference (cm)
ankle: Ankle circumference (cm)
biceps: Extended biceps circumference (cm)
forearm: Forearm circumference (cm)
wrist: Wrist circumference (cm) distal to the styloid processes
You can access the data using the following statement: data(fat, package = "faraway")
Question 1
Fit a regression model with the brozek variable (percent of body fat) as a response and the following six predictors: age, neck, abdom, thigh, forearm and wrist.
Show the summary. Which predictors are significant at the 0.05 level?
Question 2
Interpret the coefficient of each significant predictor.
Hints: See Lesson 3, Slide 49 and Slide 58.
Question 3
Compute the median value of each of the six predictors. Store the medians in a variable named x0 and show the values.
Hint: See Lesson 4, Slide 18.
Question 4
Construct a confidence interval of the mean response based on the median values that you stored in x0.
Hint: See Lesson 4, Slide 20.
Question 5
Construct a prediction interval of the next response value based on the median values that you stored in x0.
Hint: See Lesson 4, Slide 20.
Question 6
Which of the two intervals is wider?
Question 7
Construct a confidence interval of the outcome variable for a person with the following characteristics:
Age: 49 years
Neck circumference: 40 cm
Abdomen circumference: 95 cm
Thigh circumference: 60 cm
Forearm circumference: 31 cm
Wrist circumference: 19.5 cm
Hints:
You can store the predictor values in a new variable named x1. Here is an example of such a variable:
x1 <- c("(Intercept)" = 1, age = 25, neck = 34, abdom = 84, forearm = 25, wrist = 25)
Note that the intercept should be 1, but you will need to update the values of the predictors.
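The interval questions above share a common pattern in R: fit the model, build a one-row data frame of predictor values, and call predict() with interval = "confidence" or interval = "prediction". The sketch below illustrates that pattern on the built-in mtcars data with two arbitrary predictors (wt and hp) as stand-ins, so that it runs without the faraway package. It is not the fat data, and the course slides may take a different route, for example a matrix computation with a vector such as x1.

```r
# Stand-in example on mtcars (an assumption; not the fat data).
lmod <- lm(mpg ~ wt + hp, data = mtcars)

# Median of each predictor, stored as a named vector.
x0 <- apply(mtcars[, c("wt", "hp")], 2, median)

# predict() expects a data frame with one column per predictor.
new_obs <- data.frame(t(x0))

# 95% confidence interval for the mean response at x0.
conf_int <- predict(lmod, newdata = new_obs,
                    interval = "confidence", level = 0.95)

# 95% prediction interval for a new response at x0.
pred_int <- predict(lmod, newdata = new_obs,
                    interval = "prediction", level = 0.95)

# Compare the widths of the two intervals.
conf_width <- conf_int[, "upr"] - conf_int[, "lwr"]
pred_width <- pred_int[, "upr"] - pred_int[, "lwr"]
```

Both intervals are centered on the same fitted value. The prediction interval also accounts for the error variance of a single new observation, so it is wider than the confidence interval at the same point.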

Posted in R

Part 1: The Cheddar Cheese Study
In a study of cheddar cheese from the LaTrobe Valley of Victoria, Australia, samples of cheese were analyzed for their chemical composition and were subjected to taste tests. Overall taste scores were obtained by combining the scores from several tasters.
The cheddar dataset has 30 observations on the following four variables:
taste: a subjective taste score
Acetic: concentration of acetic acid (log scale)
H2S: concentration of hydrogen sulfide (log scale)
Lactic: concentration of lactic acid
Use the following statement to access the data: data("cheddar", package = "faraway")
Question 1.1:
Show the first six observations of the dataset.
Hint: Use the head() function. For example: head(cars)
Question 1.2:
Show descriptive statistics for each of the variables.
What is the mean subjective taste score?
Hint: Use the summary function. For example: summary(cars)
Question 1.3:
Show a histogram for each of the variables. Make sure to label the x-axis of each histogram.
Hint: Use the hist function. For example: hist(cars$speed, xlab = "Speed (mph)", main = "")
Question 1.4:
Fit a regression model with taste as the response and no predictors.
What is the value of the intercept? What does it represent?
For example:
lmodNoX <- lm(speed ~ 1, data = cars)
summary(lmodNoX)
Question 1.5:
Fit a regression model with taste as the response and Acetic as the only predictor.
Is the model statistically significant at the 5% level?
Hint: See Lesson 2, Slide 51.
Question 1.6:
Calculate the p-value of the entire model you created in Question 1.5 using the anova() function.
Hint: See Lesson 3, Slide 17.
Question 1.7:
Fit a regression model with taste as the response and the three chemical contents as predictors (Acetic, H2S, and Lactic).
Which predictors are statistically significant at the 5% level?
Hint: See Lesson 3, Slide 16.
Question 1.8:
Use the anova() function to recalculate the significance of the Acetic variable as shown in the output of Question 1.7.
Hints: Use the anova function to compare the three-predictor model with a model that does not include Acetic. See Lesson 3, Slide 19.
Question 1.9:
Test the hypothesis that the coefficients of Acetic and H2S both equal 0 when Lactic is included in the model. Should we reject this hypothesis?
Hint: See Lesson 3, Slide 22.
Part 2: Study of Teenage Gambling in Britain
The teengamb dataset contains a survey conducted to study teenage gambling in Britain. The dataset has 47 observations and five variables:
sex: 0 = male, 1 = female
status: Socioeconomic status score based on parents’ occupation
income: income in pounds per week
verbal: verbal score in words out of 12 correctly defined
gamble: expenditure on gambling in pounds per year
Use the following statement to access the data: data("teengamb", package = "faraway")
Question 2.1:
Convert the sex variable into a factor and label the levels (male and female).
Hint: See Lesson 3, Slide 48.
Question 2.2:
Show the number of males and the number of females.
Hint: Use the summary function. See Lesson 1, Slide 52.
Question 2.3:
Fit a model with gamble as the response and income, verbal and sex as predictors.
Which variables are statistically significant at the 5% level?
Interpret the coefficients of the significant variables.
Hints: See Lesson 3, Slide 49 and Slide 58.
Question 2.4:
Use the confint function to produce 95% confidence intervals for the coefficients based on the same model.
Can you deduce which coefficients are significant at the 5% level based on the intervals?
Hint: See Lesson 3, Slide 26, the last code for the whole model.
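Several of the questions above rely on the same few tools: factor(), anova() for comparing nested models, and confint(). The sketch below shows how these fit together, using the built-in mtcars data as a stand-in so that it runs without the faraway package. The variables (mpg, wt, hp, qsec, am) and the object names are illustrative assumptions and are not taken from the cheddar or teengamb data.

```r
# Stand-in example on mtcars (an assumption; not cheddar or teengamb).
cars_df <- mtcars

# Convert a 0/1 variable into a labeled factor and count each level.
cars_df$am <- factor(cars_df$am, levels = c(0, 1),
                     labels = c("automatic", "manual"))
level_counts <- summary(cars_df$am)

# Full model and a reduced model that drops one predictor (wt).
full    <- lm(mpg ~ wt + hp + qsec, data = cars_df)
reduced <- lm(mpg ~ hp + qsec, data = cars_df)

# F-test for the dropped predictor; its p-value matches the t-test
# p-value reported for wt in summary(full).
p_anova <- anova(reduced, full)[["Pr(>F)"]][2]
p_ttest <- coef(summary(full))["wt", "Pr(>|t|)"]

# Joint test that the wt and hp coefficients are both 0, keeping qsec.
joint_test <- anova(lm(mpg ~ qsec, data = cars_df), full)

# 95% confidence intervals for all coefficients of the full model.
coef_ci <- confint(full, level = 0.95)
```

A coefficient is significant at the 5% level exactly when its 95% confidence interval excludes 0, which is why the intervals from confint() and the p-values in the model summary lead to the same conclusions.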

Posted in R
