
You work for an auction house that specializes in buying and selling classic cars. Few cars are as iconic as the Corvette, one of the most celebrated American sports cars, with one of the longest production runs (almost 70 years!). The attached dataset includes information about 100 Corvettes sold at four different auctions in 2019 (Dallas, Louisville, Kansas City, and Monterey). The first sheet contains the data and the second sheet contains the data dictionary (a description of each variable).

Your task: prepare the Corvette dataset for data analysis and visualization. Use Excel or R (both work equally well for this assignment) to work through the 10 questions/prompts below, and address them in a brief memo (2.5 pts per item).

Hint: It may be helpful to address these items in the order listed.

1. Which variables have one or more missing values?
2. For each variable with missing data, explain whether you would apply simple mean imputation or omit the variable from the dataset.
3. Simplify the color categories for both the Color and Interior variables individually so there are no more than seven categories in each of these two variables. The seven categories can be different for each of the variables.
4. The Engine variable has a lot going on! It may refer to engine size (in cubic inches or liters), horsepower, or engine size and horsepower. Should the variable be divided into two variables, one for engine size and one for horsepower? This variable should be interpreted categorically (each model year has a set number of engine/horsepower configurations), but are there too many categories to be meaningful? Should this variable remain in the dataset?
5. If the Transmission variable is missing data and imputation is used, what is the imputed value replacing the missing values?
6. If the Convertible variable is missing data and imputation is used, what is the imputed value replacing the missing values?
7. If the Miles variable is missing data and imputation is used, what is the imputed value replacing the missing values?
8. Create a set of k-1 dummy variables based on the categorical Auction variable. Which is the reference category and why?
9. For each row that still has missing data, explain whether you would prefer to omit the observation or impute the value for the missing data, and then take this action. If there is no remaining missing information, note that in the assignment.
10. Convert the Transmission and Convertible variables to numerical dummy variables (expressed as 0 and 1 instead of A/S or Y/N). How many transmissions are automatic?

Each student will bring a different perspective and approach to working with this dataset. I expect to see lots of differently configured datasets with different numbers of observations. I am looking for a careful, reasonable, and methodical thought process that produces a dataset that can be used to build visualizations later on in this course.

Deliverables:
- Excel spreadsheet of updated dataset
- Written memo addressing the 10 items (Word or PDF format)
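The imputation and dummy-coding steps named in the prompts can be sketched in a few lines. The assignment calls for Excel or R; the Python below is only an illustration of the logic, which carries over to either tool. The five rows are made up (the real dataset is not shown here), filling categorical variables with their most frequent value is an assumption rather than something the assignment prescribes, and choosing Dallas as the reference category is arbitrary.

```python
from statistics import mean, mode

# Hypothetical rows for illustration only; None marks a missing value.
rows = [
    {"Auction": "Dallas", "Transmission": "A", "Convertible": "Y", "Miles": 42000},
    {"Auction": "Louisville", "Transmission": "S", "Convertible": "N", "Miles": None},
    {"Auction": "Kansas City", "Transmission": None, "Convertible": "N", "Miles": 58000},
    {"Auction": "Monterey", "Transmission": "S", "Convertible": None, "Miles": 20000},
    {"Auction": "Dallas", "Transmission": "S", "Convertible": "N", "Miles": 80000},
]

def impute(rows, column, statistic):
    """Fill missing entries in `column` with `statistic` of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = statistic(observed)
    for r in rows:
        if r[column] is None:
            r[column] = fill
    return fill

# Simple mean imputation for the numeric variable.
miles_fill = impute(rows, "Miles", mean)
# Assumption: categorical variables are filled with the most frequent category.
trans_fill = impute(rows, "Transmission", mode)
conv_fill = impute(rows, "Convertible", mode)

# Recode A/S and Y/N as 1/0 (1 = automatic, 1 = convertible).
for r in rows:
    r["Transmission"] = 1 if r["Transmission"] == "A" else 0
    r["Convertible"] = 1 if r["Convertible"] == "Y" else 0

# k-1 dummy variables for Auction: four levels give three dummies.
# Assumption: Dallas is the reference category (all three dummies are 0).
reference = "Dallas"
levels = sorted({r["Auction"] for r in rows})
for r in rows:
    for level in levels:
        if level != reference:
            name = "Auction_" + level.replace(" ", "_")
            r[name] = 1 if r["Auction"] == level else 0

automatic_count = sum(r["Transmission"] for r in rows)
```

With a 0/1 coding, counting automatic transmissions reduces to summing the Transmission column. The reference category gets no column of its own, so a row with all three Auction dummies equal to 0 is a Dallas sale.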
