1. (Chapter 3, pp. 65-67) Consider the multiple linear regression model

y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + ε.

For the general linear hypothesis approach, find the appropriate D and d. (15 points)

(a) H0 : β1 + 2β2 = 3

(b) H0 : (β1 + β2)/2 = β3

(c) H0 : β1 = β2, β3 = β4

(d) H0 : β1 − 2β2 = 4β3, β1 + 2β2 = 0

(e) H0 : β1 = β2 = β3 = β4
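
As a reminder of the setup, and assuming the notes follow the usual convention, the general linear hypothesis is written H0 : Dβ = d with β = (β0, β1, β2, β3, β4)ᵀ, where each row of D encodes one linear constraint. The illustration below uses a hypothesis that is not one of parts (a)-(e). The ordering of β and the inclusion of the intercept column are assumptions that should be checked against pp. 65-67.

```latex
H_0:\ \beta_1 - \beta_3 = 1,\quad \beta_4 = 0
\quad\Longleftrightarrow\quad
\underbrace{\begin{pmatrix} 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}}_{D\ (2 \times 5)}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{pmatrix}
=
\underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{d\ (2 \times 1)}
```

Each constraint is first rearranged so that all parameters sit on the left and the constant on the right. The number of rows of D then equals the number of independent constraints.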

2. (R programming) The National Football League data are in the attached CSV file. (65 points)

y Games won (per 14-game season)

x1 Rushing yards (season)

x2 Passing yards (season)

x3 Punting average (yards/punt)

x4 Field goal percentage (FGs made/FGs attempted, season)

x5 Turnover differential (turnovers acquired–turnovers lost)

x6 Penalty yards (season)

x7 Percent rushing (rushing plays/total plays)

x8 Opponents’ rushing yards (season)

x9 Opponents’ passing yards (season)

Table 1: Variable description

(a) (Lecture Notes Chapter 3, pp. 16-18) Draw nine scatterplots for the number of games won (y) against the nine variables x1, ..., x9. (5 points)

From the plots, the number of games won (y) seems to be associated with the team's rushing yardage (x1), passing yardage (x2), turnover differential (x5), percentage of rushing plays (x7), and opponents' rushing yardage (x8). We fit a multiple linear regression model relating y to x1, x2, x5, x7, and x8. The corresponding parameters are denoted by β1, β2, β5, β7, β8.

(b) Write down the fitted linear model, and report (i) the variance estimate (σ̂²), (ii) R², and (iii) adjusted R². (5 points)

(c) Interpret the six parameter estimates: β̂1, β̂2, β̂5, β̂7, β̂8, and σ̂². (5 points)

(d) (Lecture Notes Chapter 3, pp. 45-46) Obtain the anova output using the code on p. 46, and construct the ANOVA table as on p. 45 based on the output. (10 points)

(e) (Lecture Notes Chapter 3, pp. 45-47) Obtain the anova output using the code on p. 47, and construct the ANOVA table as on p. 45 based on the output. (10 points)

(f) Confirm that the two ANOVA tables from (d)-(e) are the same, and perform the test for significance of regression at the 0.05 significance level. (5 points)

(g) Test the contribution of passing yardage (x2) given the other four predictors at the 0.05 significance level. (5 points)

i. Write down the null and alternative hypotheses.

ii. Specify the distribution of the test statistic under the null hypothesis.

iii. Find the observed test statistic and p-value, and draw a conclusion.

In the model with the five predictors, the associations between the number of games won (y) and the team's rushing yardage (x1), turnover differential (x5), and percentage of rushing plays (x7) are not statistically significant. Therefore, we investigate the contribution of these three predictors (x1, x5, x7) to the model.

(h) Test an appropriate subset of coefficients at the 0.05 significance level. (10 points)

i. Write down the null and alternative hypotheses.

ii. Specify the distribution of the test statistic under the null hypothesis.

iii. Find the observed test statistic and p-value, and draw a conclusion.

(i) Which model do you prefer to consider based on the test performed in (h)? (5 points)

(j) Attach the R code and console output you used for (a)-(h). (5 points)
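
The workflow behind parts (b) and (f)-(h) can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame is simulated (it is not the NFL data), the column names y, x1, x2, x5, x7, x8 are assumed to match the attached CSV, the file name in the comment is hypothetical, and the code on pp. 46-47 of the lecture notes may differ from what is shown.

```r
# Simulated stand-in for the NFL data (NOT the real data set).
# Sample size and coefficients below are arbitrary choices for illustration.
set.seed(1)
n <- 30
nfl <- data.frame(
  x1 = rnorm(n, 2100, 300),
  x2 = rnorm(n, 2000, 350),
  x5 = rnorm(n, 0, 10),
  x7 = rnorm(n, 55, 5),
  x8 = rnorm(n, 2100, 300)
)
nfl$y <- 5 + 0.003 * nfl$x2 - 0.002 * nfl$x8 + rnorm(n, sd = 1.5)
# With the real file one would instead read it in, e.g.
# nfl <- read.csv("nfl.csv")   # hypothetical file name

# Full model with the five predictors.
full <- lm(y ~ x1 + x2 + x5 + x7 + x8, data = nfl)
s <- summary(full)

sigma2_hat <- s$sigma^2        # residual mean square, the estimate of sigma^2
r2         <- s$r.squared      # R^2
adj_r2     <- s$adj.r.squared  # adjusted R^2

# Significance of regression: intercept-only model versus full model.
null_fit <- lm(y ~ 1, data = nfl)
overall  <- anova(null_fit, full)

# Marginal t test for x2 given the other four predictors:
# estimate, standard error, t value, and p-value.
t_x2 <- coef(s)["x2", ]

# Partial F test for H0: beta1 = beta5 = beta7 = 0 (nested models).
reduced <- lm(y ~ x2 + x8, data = nfl)
partial <- anova(reduced, full)
f_obs   <- partial$F[2]
p_val   <- partial$"Pr(>F)"[2]

# The same F statistic computed by hand from the residual sums of squares.
rss_full <- deviance(full)
rss_red  <- deviance(reduced)
f_manual <- ((rss_red - rss_full) / 3) / (rss_full / df.residual(full))
```

Under H0 the partial F statistic is compared with an F distribution with 3 and n − 6 degrees of freedom, where 3 is the number of coefficients set to zero and 6 is the number of parameters in the full model including the intercept.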
