**Covariance, Correlation and Linear Regression**

1. For each of the datasets below, perform the following steps. Use the functions provided by the instructor for COV, COR, and OLS.

- Calculate the covariance and correlation
- Estimate the regression line by calculating the slope and intercept using the OLS function.
- Draw a scatter plot and the best-fitted line in one plot.
- Calculate R².
- Interpret the fitted slope. Is the intercept meaningful? Explain.
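
For reference, the quantities in the steps above are usually defined as follows. This is a sketch that assumes the sample (n - 1) convention; the instructor's COV, COR, and OLS functions may use a different convention (for example, dividing by n), so their output takes precedence.

$$
\begin{aligned}
\operatorname{cov}(x, y) &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}), &
r &= \frac{\operatorname{cov}(x, y)}{s_x\, s_y},\\
b_1 &= \frac{\operatorname{cov}(x, y)}{s_x^2}, &
b_0 &= \bar{y} - b_1\bar{x},\\
R^2 &= 1 - \frac{\sum_{i}(y_i-\hat{y}_i)^2}{\sum_{i}(y_i-\bar{y})^2}, &
\hat{y}_i &= b_0 + b_1 x_i .
\end{aligned}
$$

Here $s_x$ and $s_y$ are the sample standard deviations. For a simple regression with one predictor, $R^2 = r^2$, which gives a quick consistency check between the correlation and regression steps.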

Dataset 1: Study Time and Exam Scores (n = 10 students) [ScoreData.txt]

| Student | Study Hours | Exam Score |
| --- | --- | --- |
| Tom | 1 | 53 |
| Mary | 5 | 74 |
| Sarah | 7 | 59 |
| Oscar | 8 | 43 |
| Cullyn | 10 | 56 |
| Jaime | 11 | 84 |
| Theresa | 14 | 96 |
| Knut | 15 | 69 |
| Jin-Mae | 15 | 84 |
| Courtney | 19 | 83 |
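
The covariance, correlation, OLS, and R² steps for Dataset 1 could be sketched as below. The functions `cov`, `cor`, `ols`, and `r_squared` are hypothetical stand-ins written for illustration, not the instructor's functions, and they assume the sample (n - 1) convention for covariance.

```python
from math import sqrt

# Dataset 1: study hours and exam scores for the 10 students
hours = [1, 5, 7, 8, 10, 11, 14, 15, 15, 19]
scores = [53, 74, 59, 43, 56, 84, 96, 69, 84, 83]


def mean(values):
    return sum(values) / len(values)


def cov(x, y):
    """Sample covariance (divides by n - 1); an assumed convention."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def cor(x, y):
    """Pearson correlation coefficient."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    slope = cov(x, y) / cov(x, x)
    intercept = mean(y) - slope * mean(x)
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """R^2 = 1 - SSE / SST."""
    my = mean(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - my) ** 2 for b in y)
    return 1 - sse / sst


b0, b1 = ols(hours, scores)
r = cor(hours, scores)
r2 = r_squared(hours, scores, b0, b1)

print(f"cov = {cov(hours, scores):.3f}, r = {r:.3f}")
print(f"score = {b0:.2f} + {b1:.3f} * hours, R^2 = {r2:.3f}")
```

When interpreting the result, the slope is the estimated change in exam score per additional study hour, and the intercept is the predicted score at zero study hours. Whether that prediction is meaningful depends on whether zero hours lies near the observed range (the smallest value here is 1 hour).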

Dataset 2: Portfolio Returns on Selected Mutual Funds (n = 17 funds) [LastThisYear.txt]

| Last Year (X) | This Year (Y) |
| --- | --- |
| 11.9 | 15.4 |
| 19.5 | 26.7 |
| 11.2 | 18.2 |
| 14.1 | 16.7 |
| 14.2 | 13.2 |
| 5.2 | 16.4 |
| 20.7 | 21.1 |
| 11.3 | 12.0 |
| -1.1 | 12.1 |
| 3.9 | 7.4 |
| 12.9 | 11.5 |
| 12.4 | 23.0 |
| 12.5 | 12.7 |
| 2.7 | 15.1 |
| 8.8 | 18.7 |
| 7.2 | 9.9 |
| 5.9 | 18.9 |

Dataset 3: Number of Orders and Shipping Cost (n = 12 months) [OrderShipCost.txt]

| Orders (X) | Ship Cost (Y) |
| --- | --- |
| 1,068 | 4,489 |
| 1,026 | 5,611 |
| 767 | 3,290 |
| 885 | 4,113 |
| 1,156 | 4,883 |
| 1,146 | 5,425 |
| 892 | 4,414 |
| 938 | 5,506 |
| 769 | 3,346 |
| 677 | 3,673 |
| 1,174 | 6,542 |
| 1,009 | 5,088 |
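
The "scatter plot and best-fitted line in one plot" step could be sketched for Dataset 3 as below. This assumes NumPy and Matplotlib are installed; `np.polyfit` is used here only as a stand-in for the instructor's OLS function, and the output file name is an arbitrary choice.

```python
import numpy as np
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Dataset 3, with the thousands separators removed
orders = np.array([1068, 1026, 767, 885, 1156, 1146,
                   892, 938, 769, 677, 1174, 1009], dtype=float)
ship_cost = np.array([4489, 5611, 3290, 4113, 4883, 5425,
                      4414, 5506, 3346, 3673, 6542, 5088], dtype=float)

# For deg=1, np.polyfit returns the coefficients as [slope, intercept]
slope, intercept = np.polyfit(orders, ship_cost, deg=1)

fig, ax = plt.subplots()
ax.scatter(orders, ship_cost, label="Observed months")

# Draw the fitted line across the observed range of X only
x_line = np.linspace(orders.min(), orders.max(), 100)
ax.plot(x_line, intercept + slope * x_line, color="red", label="Fitted line")

ax.set_xlabel("Orders (X)")
ax.set_ylabel("Ship Cost (Y)")
ax.legend()
fig.savefig("orders_vs_shipcost.png")  # hypothetical output file name
```

The line is drawn only over the observed range of orders because extending it toward zero orders would be an extrapolation, which is also relevant when judging whether the intercept is meaningful.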

2. Perform multiple regression analysis on the SalesAdvertising dataset.

The SalesAdvertising dataset contains statistics about the sales of a product in 200 different markets, together with advertising budgets in each of these markets for different media channels: TV, radio and newspaper. The sales are in thousands of units and the budget is in thousands of dollars.

Your task is to explore the dataset and build a multiple regression model that takes TV, radio and newspaper as predictors and Sales as the target variable.

- Load the data into NumPy
- Summarize the statistics of the dataset (mean, standard deviation, max, and min for each column).
- Visualize the data using a correlation matrix and scatter plots.
- Split the data into training and test datasets.
- Fit the multiple linear regression to the training dataset
- Estimate the regression model and explain the regression equation.
- Predict Sales for the test dataset
- Visualize the residuals
- Calculate Regression Error Metrics (Accuracy, MSE and RMSE).
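
The workflow above could be sketched with NumPy alone, as below. The real SalesAdvertising file and its layout are not shown in the assignment, so this sketch generates placeholder data with made-up coefficients. The column order, the 80/20 split, and reading "Accuracy" as test-set R² are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data standing in for the real file (load the real data with
# e.g. np.loadtxt or np.genfromtxt once its layout is known).
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)
# Made-up relationship, used only so the sketch has something to fit
sales = 3.0 + 0.045 * tv + 0.19 * radio + rng.normal(0, 1.0, n)
data = np.column_stack([tv, radio, newspaper, sales])
names = ["TV", "radio", "newspaper", "Sales"]

# Summary statistics for each column
for name, col in zip(names, data.T):
    print(f"{name:>9}: mean={col.mean():.2f} std={col.std(ddof=1):.2f} "
          f"max={col.max():.2f} min={col.min():.2f}")

# Correlation matrix (rowvar=False: each column is a variable)
corr = np.corrcoef(data, rowvar=False)

# Shuffle, then split 80/20 into training and test sets
idx = rng.permutation(n)
n_train = int(0.8 * n)
train, test = data[idx[:n_train]], data[idx[n_train:]]


def design(block):
    """Prepend a column of ones so the first coefficient is the intercept."""
    return np.column_stack([np.ones(len(block)), block[:, :3]])


# Least-squares fit: beta = [intercept, b_TV, b_radio, b_newspaper]
beta, *_ = np.linalg.lstsq(design(train), train[:, 3], rcond=None)

# Predict the test set and compute residuals and error metrics
pred = design(test) @ beta
residuals = test[:, 3] - pred
mse = np.mean(residuals ** 2)
rmse = np.sqrt(mse)
sst = np.sum((test[:, 3] - test[:, 3].mean()) ** 2)
r2_test = 1 - np.sum(residuals ** 2) / sst

print("Sales = {:.3f} + {:.4f}*TV + {:.4f}*radio + {:.4f}*newspaper".format(*beta))
print(f"MSE = {mse:.3f}, RMSE = {rmse:.3f}, test R^2 = {r2_test:.3f}")
```

Each slope in the printed equation is the estimated change in Sales (thousands of units) per additional thousand dollars of budget in that channel, holding the other two budgets fixed. The `residuals` array can be plotted against `pred` to visualize the residuals.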
