STAT 1301/2300: Statistical Packages
Problem 1. Probability distributions and graphical displays (10 points)
Part 1.a. Generate 1000 observations from a normal distribution with mean 100 and standard deviation 15. Call the random number vector norm_vec. (2 points)
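A minimal sketch of Part 1.a, assuming base R only. The seed value is a hypothetical choice made here so that the draw is reproducible; the assignment does not ask for one.

```r
# Hypothetical seed, used only so the example is reproducible.
set.seed(1301)

# Draw 1000 observations from N(mean = 100, sd = 15).
norm_vec <- rnorm(1000, mean = 100, sd = 15)

# Quick checks: the sample mean and sd should be close to 100 and 15.
length(norm_vec)
mean(norm_vec)
sd(norm_vec)
```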
Part 1.b. Create a histogram to show the sample data. Overlay a red theoretical normal density curve on the histogram. (4 points)
Make sure the normal curve is completely visible. Adjust ylim if necessary.
Make sure the normal curve is a bell curve. If you see a straight line, that’s wrong. Read the slides carefully about how to cope with
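One possible way to approach Part 1.b is sketched here, assuming base R graphics. A common cause of the "straight line" problem is drawing the histogram on the frequency (count) scale, where density values are tiny by comparison; freq = FALSE puts the histogram on the density scale. The seed, the ylim padding factor and the axis labels are choices made for this sketch, not requirements from the assignment.

```r
set.seed(1301)  # hypothetical seed for reproducibility
norm_vec <- rnorm(1000, mean = 100, sd = 15)

# Compute the histogram bars without drawing, to find the tallest bar.
h <- hist(norm_vec, plot = FALSE)

# Height of the theoretical density at its peak (x = mean).
peak <- dnorm(100, mean = 100, sd = 15)

# Density-scale histogram; ylim leaves room for both the bars and the curve.
hist(norm_vec, freq = FALSE,
     ylim = c(0, max(h$density, peak) * 1.05),
     main = "Histogram of norm_vec", xlab = "norm_vec")

# Overlay the theoretical N(100, 15) density in red.
curve(dnorm(x, mean = 100, sd = 15), add = TRUE, col = "red", lwd = 2)
```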
Part 1.c. Calculate the following (4 points)
Find the 90th percentile of an F distribution with 5 and 10 degrees of freedom.
Calculate the probability P (t19 > 3).
Calculate the probability P (−1.98 < Z < 2.98) where Z ∼ N (0, 1).
Find the 99th percentile of χ2
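The four calculations in Part 1.c map onto the base R quantile (q*) and probability (p*) functions, as sketched here. The degrees of freedom for the χ2 percentile are not given in the text above, so df_chisq is a hypothetical placeholder to be replaced with the intended value.

```r
# 90th percentile of F with 5 and 10 degrees of freedom.
q_f <- qf(0.90, df1 = 5, df2 = 10)

# P(t19 > 3): upper-tail probability of a t distribution with 19 df.
p_t <- pt(3, df = 19, lower.tail = FALSE)

# P(-1.98 < Z < 2.98) for a standard normal Z.
p_z <- pnorm(2.98) - pnorm(-1.98)

# 99th percentile of a chi-square distribution.
# ASSUMPTION: the degrees of freedom are missing from the text; 10 is a placeholder.
df_chisq <- 10
q_chisq <- qchisq(0.99, df = df_chisq)

c(q_f = q_f, p_t = p_t, p_z = p_z, q_chisq = q_chisq)
```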
Problem 2. Sampling distributions (15 points)
Part 2.a. Generate 10000 observations from a χ2 distribution. Save the random numbers in a vector called chisq_vec. Create a histogram for it. Observe its shape, especially its skewness. (3 points)
Part 2.b. Sample 50 observations from chisq_vec without replacement. Call it chisq_samp. Calculate the mean of the sample data. Compare it with the mean of chisq_vec. (3 points)
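Parts 2.a and 2.b could be sketched as follows. The degrees of freedom of the χ2 distribution are not stated in the text, so k is a hypothetical placeholder; the seed is likewise only there to make the example reproducible.

```r
set.seed(2300)  # hypothetical seed for reproducibility

# ASSUMPTION: degrees of freedom are not given in the text; 5 is a placeholder.
k <- 5

# Part 2.a: 10000 chi-square draws and their histogram (expect right skew).
chisq_vec <- rchisq(10000, df = k)
hist(chisq_vec, main = "Histogram of chisq_vec", xlab = "chisq_vec")

# Part 2.b: 50 observations sampled without replacement.
chisq_samp <- sample(chisq_vec, size = 50, replace = FALSE)

# Compare the sample mean with the mean of the full vector.
c(sample_mean = mean(chisq_samp), full_mean = mean(chisq_vec))
```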
Part 2.c. Put chisq_vec in a matrix with 50 rows and 200 columns. Call it chisq_mat. (1 point)
Part 2.d. Consider each column of chisq_mat as a random sample of size 50 from χ2 distribution. Now we have 200 samples (columns)! Calculate the mean of each column and save the means in a vector called mean_vec. You may use the apply() function. Type ?apply in R console for details about apply. (2 points)
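A short sketch of Parts 2.c and 2.d, again with a hypothetical seed and a placeholder k for the unstated degrees of freedom. matrix() fills column by column, so each column receives 50 consecutive values of chisq_vec.

```r
set.seed(2300)  # hypothetical seed for reproducibility
k <- 5          # ASSUMPTION: placeholder degrees of freedom
chisq_vec <- rchisq(10000, df = k)

# Part 2.c: 50 rows x 200 columns, filled column by column.
chisq_mat <- matrix(chisq_vec, nrow = 50, ncol = 200)

# Part 2.d: MARGIN = 2 applies mean() to each column.
mean_vec <- apply(chisq_mat, 2, mean)

dim(chisq_mat)
length(mean_vec)
```

colMeans(chisq_mat) gives the same result and is a common shortcut, though the assignment suggests apply().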
Part 2.e. mean_vec actually contains 200 sample means! Now we are able to verify the properties of the sampling distribution of the sample mean. (6 points)
Calculate the mean and standard deviation of mean_vec. (2 points)
Compare them with their theoretical values. Note that for a χ2 distribution with k degrees of freedom, the mean is k and the variance is 2k. Based on the properties of the sampling distribution of the sample mean, we have the following:
μ_X̄ = μ = k and σ_X̄ = σ/√n = √(2k/n),
where n is the sample size.
Create a histogram for mean_vec. What shape do you observe? Is it roughly symmetric? Compare it with the histogram of chisq_vec. (1 point)
Overlay an empirical normal curve on the histogram of mean_vec. To do so, create a sequence of numbers using seq(). Check the range of mean_vec to choose the from and to values of seq(). Use the dnorm() function to find the density values. The mean and standard deviation passed to dnorm() should be the same as the mean and standard deviation of mean_vec. (3 points)
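The steps of Part 2.e can be sketched as below. As before, k is a hypothetical placeholder for the unstated degrees of freedom and the seed is only for reproducibility. The theoretical standard deviation used here, sqrt(2k/n), follows from the stated variance 2k and the usual result that the sample mean has standard deviation σ/√n.

```r
set.seed(2300)  # hypothetical seed for reproducibility
k <- 5          # ASSUMPTION: placeholder degrees of freedom
n <- 50         # sample size (rows per column)

chisq_mat <- matrix(rchisq(10000, df = k), nrow = n, ncol = 200)
mean_vec <- apply(chisq_mat, 2, mean)

# Empirical mean and sd of the 200 sample means vs. theoretical values.
m <- mean(mean_vec)
s <- sd(mean_vec)
c(empirical_mean = m, theoretical_mean = k)
c(empirical_sd = s, theoretical_sd = sqrt(2 * k / n))

# Density-scale histogram so the normal curve is on a comparable scale.
hist(mean_vec, freq = FALSE,
     main = "Histogram of mean_vec", xlab = "mean_vec")

# Grid covering the observed range of mean_vec.
x_seq <- seq(from = min(mean_vec), to = max(mean_vec), length.out = 200)

# Normal curve using the empirical mean and sd of mean_vec.
lines(x_seq, dnorm(x_seq, mean = m, sd = s), col = "red", lwd = 2)
```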
Problem 3. Plot grouped data (5 points)
The mtcars data set is a built-in data set in base R. It comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models). Type ?mtcars in R console for more details about it.
We want to examine the relationship between type of transmission (am) and fuel efficiency (mpg). Use a side-by-side boxplot to compare mpg between the two groups. (5 points)
It should be
The title should be “Fuel Efficiency vs. Transmission”
The x label should be “miles per gallon”
Give meaningful names for the transmission levels and show them on the plot. Instead of changing the labels directly, you may consider transforming am to a factor vector and giving character labels to the levels.
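A possible sketch of the plot for Problem 3, using base R graphics. According to ?mtcars, am is coded 0 = automatic and 1 = manual. Two things here are assumptions of this sketch: horizontal = TRUE is inferred from the requirement that the x label read “miles per gallon” (the orientation requirement above is incomplete), and the names cars and trans are made up for illustration.

```r
# Copy mtcars and add a labelled factor version of am (0 = automatic, 1 = manual).
cars <- transform(mtcars,
                  trans = factor(am, levels = c(0, 1),
                                 labels = c("Automatic", "Manual")))

# Side-by-side boxplot of mpg by transmission type.
# ASSUMPTION: horizontal orientation, so that mpg runs along the x axis.
boxplot(mpg ~ trans, data = cars,
        horizontal = TRUE,
        main = "Fuel Efficiency vs. Transmission",
        xlab = "miles per gallon",
        ylab = "Transmission")
```

Converting am to a factor keeps the original column untouched and lets boxplot() pick up the level labels automatically.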