Data scientists conduct continual experiments. The process starts with a hypothesis, and an experiment is designed to test it in a way that will, ideally, deliver conclusive results. Data from a population is collected and analyzed, and then a conclusion is drawn. From your own experiences and reading:
What are the two major problems with collecting samples? Is it possible to fix the problems you mentioned? If not, explain why. If so, explain how you would do it. To participate in the discussion, respond to the discussion prompt by Thursday at 11:59 PM EST. Then, read a selection of your colleagues’ postings. Finally, respond to at least two classmates by Sunday at 11:59 PM EST. I will post two classmates’ work later, and you will respond to both of them.

Posted in R

Continuing with the theme of hypothesis testing, this week we turn our attention to conducting tests for one sample, two paired samples, and two independent samples. To further develop our understanding of these tests, this assignment will focus on the application of these statistical techniques. You will select a dataset, conduct the appropriate tests, and share your findings.
Assignment Requirements:
Dataset Selection: Choose a dataset that allows for one-sample, two-paired-sample, and two-independent-sample tests. Briefly explain why you have chosen this dataset.
Hypothesis Formulation: Formulate hypotheses appropriate for one sample, two paired samples, and two independent sample tests. Describe the hypotheses for each test clearly.
Execution of Tests: Perform the tests using Python or R, and document the steps you have taken. Be sure to include your code in your initial post.
Results Interpretation: Interpret the results of your tests. What do the results tell you about your dataset and the hypotheses you formulated?
Conclusions and Applications: Summarize your findings and discuss potential real-world applications of your conclusions.
Submission Format: Your submission should be a maximum of 500-600 words (excluding Python/R code). Submit your assignment in APA format as a Word document or a PDF file. Include your written analysis and any tables or visualizations that support your findings. If you used any software for your calculations (like R, Python, Excel), please include your code or formulas as well. Include an APA-formatted reference list for any external resources used.
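As a rough illustration of the "Execution of Tests" step, the sketch below runs all three kinds of t-test in base R. The choice of the built-in iris data, the reference value of 5 cm, and the specific variables compared are assumptions made only for this example; they are not part of the assignment.

```r
# Illustrative dataset (an assumption): the built-in iris measurements
data(iris)
setosa     <- subset(iris, Species == "setosa")
versicolor <- subset(iris, Species == "versicolor")

# One-sample test. H0: mean setosa sepal length equals 5 cm
# (5 is a hypothetical reference value chosen for the example)
one_sample <- t.test(setosa$Sepal.Length, mu = 5)

# Paired test. H0: within each setosa flower, sepal length and
# petal length have a mean difference of zero
paired <- t.test(setosa$Sepal.Length, setosa$Petal.Length, paired = TRUE)

# Two independent samples (Welch test by default).
# H0: setosa and versicolor have equal mean sepal length
two_sample <- t.test(setosa$Sepal.Length, versicolor$Sepal.Length)

# Collect the key numbers in one table for the write-up
results <- data.frame(
  test      = c("one sample", "paired", "two independent samples"),
  statistic = unname(c(one_sample$statistic, paired$statistic,
                       two_sample$statistic)),
  p_value   = c(one_sample$p.value, paired$p.value, two_sample$p.value)
)
print(results)
```

Each p-value is then compared with the chosen significance level to decide whether to reject the corresponding null hypothesis.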

Posted in R

Introduction: Provide a concise overview of the concepts of parametric tests, univariate tests for normality, and hypothesis testing.
Dataset Selection: Identify and describe a dataset suitable for applying these tests. Explain your reasons for choosing it.
Parametric Test Application: Conduct a parametric test on your selected dataset. Include all steps and any Python or R code you used.
Univariate Test for Normality Application: Perform a univariate test for normality on your dataset. Again, include all steps and any Python or R code used.
Results and Conclusion: Summarize your test results. Were your hypotheses confirmed or rejected? What conclusions can you draw about the population from your sample?
Submission Format: Your submission should be a maximum of 500-600 words (excluding Python/R code). Submit your assignment in APA format as a Word document or a PDF file. Include your written analysis and any tables or visualizations that support your findings. If you used any software for your calculations (like R, Python, Excel), please include your code or formulas as well. Include an APA-formatted reference list for any external resources used.
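One possible shape for the two application steps is sketched below in base R: a Shapiro-Wilk test as the univariate normality check, followed by a Welch two-sample t-test as the parametric test. The built-in mtcars data and the comparison of fuel economy by transmission type are assumptions chosen only to keep the example self-contained.

```r
# Illustrative dataset (an assumption): built-in mtcars,
# comparing miles per gallon by transmission type
data(mtcars)
automatic <- mtcars$mpg[mtcars$am == 0]
manual    <- mtcars$mpg[mtcars$am == 1]

# Univariate normality check: Shapiro-Wilk test on each group.
# H0: the group's values come from a normal distribution
normality   <- lapply(list(automatic = automatic, manual = manual),
                      shapiro.test)
normality_p <- sapply(normality, function(res) res$p.value)
print(normality_p)

# Parametric test: Welch two-sample t-test.
# H0: mean mpg is the same for manual and automatic cars
param_test <- t.test(manual, automatic)
print(param_test)
```

If the normality test rejects for either group, a nonparametric alternative such as `wilcox.test` may be the safer choice for the comparison.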

Posted in R

Submission: executive report (2 pages) with appendix (no page limit), slides (10 mins)
Tentative Grading Rules:
+ Nice coding
+ Nice EDA analysis
+ Well-written executive report
+ Nice presentation
+ Tried ARIMA, regression, and smoothing methods
+ Tried advanced models (for example, combined methods)
+ Performed model selection
+ Good prediction
+ Good recommendation
+ Class discussion
- Insufficient EDA
- The prediction part is not consistent with your conclusion
- Report writing can be improved
- Presentation can be improved
- R coding can be improved
- Need to try advanced models
- Did not consider multi-seasonality
A public transportation company is expecting increasing demand for its services and is planning to acquire new buses and to extend its terminals. These investments require a reliable forecast of future demand, which should be based on historic demand stored in the company's data warehouse. For each 15-minute interval between 6:30 and 22:00, the number of passengers arriving at the terminal has been recorded and stored. As a forecasting consultant, you have been asked to forecast the number of passengers arriving at the terminal.
Available Data
Part of the historic information is available in the file bicup2006.xls. The file contains the worksheet “Historic Information” with known demand for a 3-week period, separated into 15-minute intervals. The second worksheet (“Future”) contains dates and times for a future 3-day period, for which forecasts should be generated (as part of the 2006 competition).
Assignment Goal
Your goal is to create a model/method that produces accurate forecasts. To evaluate your accuracy, partition the given historic data into two periods: a training period (the first two weeks) and a validation period (the last week). Models should be fitted only to the training data and evaluated on the validation data. Although the competition winning criterion was the lowest Mean Absolute Error (MAE) on the future 3-day data, this is not the goal for this assignment. Instead, if we consider a more realistic business context, our goal is to create a model that generates reasonably good forecasts on any time/day of the week. Consider not only predictive metrics such as MAE, MAPE, and RMSE, but also look at actual and forecasted values, overlaid on a time plot.
Assignment
For your final model, present the following summary:
1. Name of the method/combination of methods
2. A brief description of the method/combination
3. All estimated equations associated with constructing forecasts from this method
4. The MAPE and MAE for the training period and the validation period
5. Forecasts for the future period (March 22-24), in 15-min intervals
6. A single chart showing the fit of the final version of the model to the entire period (including training, validation, and future). Note that this model should be fitted using the combined training + validation data
Tips and Suggested Steps
1. Use exploratory analysis to identify the components of this time series. Is there a trend? Is there seasonality? If so, how many “seasons” are there? Are there any other visible patterns? Are the patterns global (the same throughout the series) or local?
2. Consider the frequency of the data from a practical and technical point of view. What are some options?
3. Compare the weekdays and weekends. How do they differ? Consider how these differences can be captured by different methods.
4. Examine the series for missing values or unusual values. Suggest solutions.
5. Based on the patterns that you found in the data, which models or methods should be considered?
6. Consider how to handle actual counts of zero within the computation of MAPE.
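The partitioning and error-metric steps can be sketched in base R as follows. Because bicup2006.xls is not reproduced here, the series is simulated; the value of `per_day`, the shape of the simulated demand, the weekly seasonal-naive baseline, and the choice to skip zero actuals in MAPE are all assumptions for illustration, not the required approach.

```r
# Error metrics. mape_nonzero() skips intervals whose actual count is zero,
# which is one possible way (an assumption) to keep MAPE defined.
mae <- function(actual, forecast) mean(abs(actual - forecast))
mape_nonzero <- function(actual, forecast) {
  keep <- actual != 0
  100 * mean(abs((actual[keep] - forecast[keep]) / actual[keep]))
}

# Simulated stand-in for the historic worksheet (hypothetical data):
# 21 days of 15-minute counts, with an assumed 63 intervals per day
set.seed(42)
per_day     <- 63
n_days      <- 21
daily_shape <- 1 + 15 * sin(seq(0, pi, length.out = per_day))
demand      <- rpois(per_day * n_days, lambda = rep(daily_shape, n_days))

# Partition: first two weeks for training, last week for validation
train <- demand[1:(14 * per_day)]
valid <- demand[(14 * per_day + 1):(21 * per_day)]

# Weekly seasonal-naive baseline: repeat the last training week
forecast_valid <- tail(train, 7 * per_day)

cat("Validation MAE: ", mae(valid, forecast_valid), "\n")
cat("Validation MAPE:", mape_nonzero(valid, forecast_valid), "\n")
```

A baseline like this gives a reference point: any ARIMA, regression, or smoothing model considered for the final answer should beat it on the validation week.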

Posted in R

A store sells two types of toys, A and B. The store owner pays $8 for each unit of toy A and $14 for each unit of toy B. One unit of toy A yields a profit of $2, while one unit of toy B yields a profit of $3. The store owner estimates that no more than 2000 toys will be sold every month, and he does not plan to invest more than $20,000 in inventory of these toys. How many units of each type of toy should be stocked in order to maximize his profit?
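The toy problem can be written as a linear program: maximize 2A + 3B subject to A + B <= 2000, 8A + 14B <= 20000, and A, B >= 0. The sketch below solves it in base R by checking every corner of the feasible region, then repeats the search over whole units only. Enumerating corners instead of calling a dedicated solver is a choice made here to avoid extra packages; it is not necessarily the method the exercise expects.

```r
# Objective: maximize 2*A + 3*B
profit <- c(2, 3)

# Constraints written as M %*% c(A, B) <= rhs.
# Rows: total units, inventory budget, A >= 0, B >= 0
M   <- rbind(c(1, 1), c(8, 14), c(-1, 0), c(0, -1))
rhs <- c(2000, 20000, 0, 0)

# Candidate corners: intersections of each pair of constraint boundaries,
# kept only if they satisfy all constraints
pairs <- combn(nrow(M), 2)
vertices <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
  idx <- pairs[, k]
  if (abs(det(M[idx, ])) < 1e-9) return(NULL)   # parallel boundaries
  v <- solve(M[idx, ], rhs[idx])
  if (all(M %*% v <= rhs + 1e-9)) v else NULL
}))
best <- vertices[which.max(vertices %*% profit), ]
cat("Continuous optimum: A =", best[1], " B =", best[2],
    " profit =", sum(best * profit), "\n")

# Whole units only: for each B, stock as many A as both limits allow
B_units    <- 0:floor(20000 / 14)
A_units    <- pmin(2000 - B_units, floor((20000 - 14 * B_units) / 8))
int_profit <- 2 * A_units + 3 * B_units
i <- which.max(int_profit)
cat("Integer optimum: A =", A_units[i], " B =", B_units[i],
    " profit =", int_profit[i], "\n")
```

The continuous optimum lies where both constraints bind (A = 4000/3, B = 2000/3), so it is fractional; the integer search shows what that means for an actual stocking decision.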
ex2 transportation problem

Posted in R

1. Provide the code that parallelizes the following:

    library(MKinfer) # Load package used for permutation t-test

    # Create a function for running the simulation:
    simulate_type_I <- function(n1, n2, distr, level = 0.05, B = 999,
                                alternative = "two.sided", ...)
    {
      # Create a data frame to store the results in:
      p_values <- data.frame(p_t_test = rep(NA, B),
                             p_perm_t_test = rep(NA, B),
                             p_wilcoxon = rep(NA, B))

      for(i in 1:B)
      {
        # Generate data:
        x <- distr(n1, ...)
        y <- distr(n2, ...)

        # Compute p-values:
        p_values[i, 1] <- t.test(x, y, alternative = alternative)$p.value
        p_values[i, 2] <- perm.t.test(x, y, alternative = alternative,
                                      R = 999)$perm.p.value
        p_values[i, 3] <- wilcox.test(x, y, alternative = alternative)$p.value
      }

      # Return the type I error rates:
      return(colMeans(p_values < level))
    }

2. Provide the code that runs the following code in parallel with 4 workers (with mclapply):

    lapply(airquality, function(x) { (x-mean(x))/sd(x) })
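One way to approach both parts with the base `parallel` package is sketched below. The function name `simulate_type_I_parallel`, its `workers` argument, and the decision to parallelize over the B simulation runs are assumptions of this sketch; other designs (for example `foreach` or a socket cluster) would be equally valid. Note that `mclapply` relies on forking, so `mc.cores` greater than 1 is not supported on Windows.

```r
library(parallel)

# Part 1: each of the B simulation runs is independent, so the for loop
# can become a parallel map over run indices.
# Assumes the MKinfer package is installed; it is called via MKinfer::
# so that merely defining this function does not require it.
simulate_type_I_parallel <- function(n1, n2, distr, level = 0.05, B = 999,
                                     alternative = "two.sided",
                                     workers = 4, ...) {
  one_run <- function(i) {
    # Generate data:
    x <- distr(n1, ...)
    y <- distr(n2, ...)
    # Compute the three p-values for this run:
    c(p_t_test      = t.test(x, y, alternative = alternative)$p.value,
      p_perm_t_test = MKinfer::perm.t.test(x, y, alternative = alternative,
                                           R = 999)$perm.p.value,
      p_wilcoxon    = wilcox.test(x, y, alternative = alternative)$p.value)
  }
  runs     <- mclapply(seq_len(B), one_run, mc.cores = workers)
  p_values <- do.call(rbind, runs)
  # Return the type I error rates:
  colMeans(p_values < level)
}

# Example call (not run here because it needs MKinfer):
# RNGkind("L'Ecuyer-CMRG"); set.seed(1)   # reproducible parallel streams
# simulate_type_I_parallel(20, 20, rnorm, B = 999)

# Part 2: the lapply call with 4 workers
standardized <- mclapply(airquality,
                         function(x) { (x - mean(x)) / sd(x) },
                         mc.cores = 4)
```

As in the original `lapply` version, columns of `airquality` that contain missing values (such as `Ozone`) come back as all `NA`, because `mean` and `sd` are called without `na.rm = TRUE`.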

Posted in R

Data
Obtain time series data on the price of a commodity. ** I will send the data
Analysis
(1) Transform the data if necessary.
(2) Plot the data and tabulate the descriptive statistics. Comment on your observations.
(3) Conduct ARIMA modelling.
(4) Appraise the adequacy of the results and apply appropriate diagnostic procedures. Which model gives the best ‘fit’?
(5) Forecast the commodity price using appropriate forecasting techniques and evaluate your forecast. Select the best model for forecasting.
(6) Compare your methods and findings with those of other empirical studies on this topic.
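Steps (1) to (5) might look roughly like the base R sketch below. Since the actual commodity data is to be supplied separately, a simulated monthly price series stands in for it; the series itself, the log transform, the three candidate orders, and the 12-month hold-out are all assumptions made for illustration.

```r
# Hypothetical stand-in for the commodity price series: 120 monthly
# prices simulated as a random walk with drift on the log scale
set.seed(123)
n         <- 120
log_price <- cumsum(c(log(50), rnorm(n - 1, mean = 0.002, sd = 0.04)))
price     <- ts(exp(log_price), start = c(2014, 1), frequency = 12)

# (1) Transform: work with log prices
y <- log(price)

# (2) Plot and descriptive statistics
plot(price, main = "Simulated commodity price")
print(summary(as.numeric(price)))

# Hold out the last 12 months to evaluate forecasts
train <- window(y, end = c(2022, 12))
test  <- window(y, start = c(2023, 1))

# (3) Fit a few candidate ARIMA models and compare them by AIC
candidates <- list(c(1, 1, 0), c(0, 1, 1), c(2, 1, 0))
fits <- lapply(candidates, function(ord) arima(train, order = ord))
aic  <- sapply(fits, AIC)
best <- fits[[which.min(aic)]]

# (4) Diagnostics: Ljung-Box test on the residuals of the chosen model.
# H0: residuals show no autocorrelation up to lag 12
lb <- Box.test(residuals(best), lag = 12, type = "Ljung-Box",
               fitdf = sum(best$arma[1:2]))
print(lb)

# (5) Forecast the hold-out period and measure error on the price scale
fc   <- predict(best, n.ahead = length(test))
rmse <- sqrt(mean((exp(fc$pred) - exp(test))^2))
cat("Hold-out RMSE:", rmse, "\n")
```

With real data, the candidate orders would normally be guided by ACF and PACF plots of the differenced series, not fixed in advance as they are here.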

Posted in R
