Homework 4

The SAS dataset HeinzHunts has data on grocery store purchases of Hunts and Heinz ketchup. Each observation corresponds to one purchase occasion (of one of these brands) and consists of the following variables:

- Heinz: = 1 if Heinz was purchased, = 0 if Hunts was purchased
- PriceHeinz: Price of Heinz
- PriceHunts: Price of Hunts
- DisplHeinz: = 1 if Heinz had a store display, = 0 if Heinz did not have a store display
- DisplHunts: = 1 if Hunts had a store display, = 0 if Hunts did not have a store display
- FeatureHeinz: = 1 if Heinz had a store feature, = 0 if Heinz did not have a store feature
- FeatureHunts: = 1 if Hunts had a store feature, = 0 if Hunts did not have a store feature

- Create a variable LogPriceRatio = log (PriceHeinz/PriceHunts).

- Randomly select 80% of the data set as the training sample, and use the remaining 20% as the test sample.

- Estimate a logit model for the probability that Heinz is purchased, using LogPriceRatio, DisplHeinz, FeatureHeinz, DisplHunts, and FeatureHunts as the explanatory variables. Include an interaction term between display and feature within each brand (e.g., DisplHeinz * FeatureHeinz).

- Interpret the results. What promotional methods (feature / display) are effective for Hunts? For Heinz? How would you interpret the results for the interaction effects?

- Based on the estimated model, and using the logit probability formula, calculate the change in predicted probability that Heinz is purchased if LogPriceRatio changes from 0.5 to 0.6 and Heinz does not use a feature or display, while Hunts uses a feature and a display.

Recall that in the logit model: $P(Y = 1 \mid X) = \dfrac{\exp(X\hat{\beta})}{1 + \exp(X\hat{\beta})}$, where Y is the outcome variable, X are the predictor variables, and $\hat{\beta}$ are the estimated model coefficients.
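The probability calculation in this step can be sketched in Python as follows. This is only an illustration of the arithmetic, not the SAS solution: the coefficient values in `coef` are hypothetical placeholders (not estimates from HeinzHunts), and the function names are invented for this example. The estimates from the fitted model would be substituted in their place.

```python
import math


def logit_probability(linear_predictor: float) -> float:
    """Logit formula: exp(xb) / (1 + exp(xb)), written in the equivalent 1 / (1 + exp(-xb)) form."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# HYPOTHETICAL coefficients for illustration only; replace with the fitted estimates.
coef = {
    "Intercept": 3.0,
    "LogPriceRatio": -6.0,
    "DisplHeinz": 0.5,
    "FeatureHeinz": 0.5,
    "DisplHeinz*FeatureHeinz": -0.2,
    "DisplHunts": -0.6,
    "FeatureHunts": -1.0,
    "DisplHunts*FeatureHunts": 1.0,
}


def heinz_probability(log_price_ratio, displ_heinz, feature_heinz,
                      displ_hunts, feature_hunts, coef):
    """Predicted probability that Heinz is purchased for one scenario."""
    xb = (
        coef["Intercept"]
        + coef["LogPriceRatio"] * log_price_ratio
        + coef["DisplHeinz"] * displ_heinz
        + coef["FeatureHeinz"] * feature_heinz
        + coef["DisplHeinz*FeatureHeinz"] * displ_heinz * feature_heinz
        + coef["DisplHunts"] * displ_hunts
        + coef["FeatureHunts"] * feature_hunts
        + coef["DisplHunts*FeatureHunts"] * displ_hunts * feature_hunts
    )
    return logit_probability(xb)


# Scenario from the question: Heinz has no display or feature, Hunts has both.
p_before = heinz_probability(0.5, 0, 0, 1, 1, coef)
p_after = heinz_probability(0.6, 0, 0, 1, 1, coef)
change = p_after - p_before

print(f"P(Heinz) at LogPriceRatio = 0.5: {p_before:.4f}")
print(f"P(Heinz) at LogPriceRatio = 0.6: {p_after:.4f}")
print(f"Change in predicted probability: {change:.4f}")
```

Because the logit function is nonlinear, the change in probability depends on the values of all the other variables and not only on the LogPriceRatio coefficient, which is why the full scenario has to be plugged in at both price levels.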

- The estimated model is to be used to target customers for Hunts coupons in order to build loyalty for the brand. Coupons are to be sent to customers who are likely to buy Hunts, and not to customers who are likely to buy Heinz. Therefore, the coupons should be sent to customers whose predicted probability of buying Heinz is below a threshold level, which needs to be determined based on the costs of misclassification (incorrectly sending or not sending a coupon).

The following information about the costs of incorrect classification is available: The cost of incorrectly sending a coupon to a customer who would have bought Heinz is $1 per customer, and the cost of incorrectly failing to send a coupon to a customer who would have bought Hunts is $0.25 per customer.

Based on these costs, what is the optimal threshold probability level that should be used with the estimated model to decide which consumers should receive coupons?

(HINT: Step 1: Using the appropriate SAS command, create an ROC table for the test data from the estimated model. The ROC table provides the number of false positive and false negative classifications for each possible probability threshold.

Step 2: Using the cost information, calculate the total cost of misclassification for each probability threshold.

Total Cost = # of False Positives * False Positive Cost + # of False Negatives * False Negative Cost

Think carefully about what counts as a false positive and what counts as a false negative in this context.

Step 3: Choose the probability threshold that leads to the lowest total cost.)
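The three steps in the hint can be sketched in Python as below. This is an illustration of the cost calculation only, not the SAS ROC procedure: the predicted probabilities and outcomes are made-up toy values, and the function names are invented for this example. To avoid committing to a labeling of "positive" and "negative", the sketch counts the two kinds of error directly by the decision taken: a coupon sent to a Heinz buyer ($1) and a coupon withheld from a Hunts buyer ($0.25). Which of these a ROC table reports as a false positive depends on which outcome the model treats as the event.

```python
def misclassification_cost(p_heinz, bought_heinz, threshold,
                           cost_wrong_send=1.00, cost_missed_send=0.25):
    """Total cost when a coupon is sent to every customer with P(Heinz) < threshold."""
    wrong_sends = 0   # coupon sent, but the customer bought Heinz
    missed_sends = 0  # no coupon sent, but the customer bought Hunts
    for p, heinz in zip(p_heinz, bought_heinz):
        send_coupon = p < threshold
        if send_coupon and heinz == 1:
            wrong_sends += 1
        elif not send_coupon and heinz == 0:
            missed_sends += 1
    return wrong_sends * cost_wrong_send + missed_sends * cost_missed_send


def best_threshold(p_heinz, bought_heinz, thresholds):
    """Threshold with the lowest total cost (the first one, if several tie)."""
    return min(thresholds,
               key=lambda t: misclassification_cost(p_heinz, bought_heinz, t))


# MADE-UP test-sample values for illustration; use the model's test-set predictions instead.
p_heinz = [0.04, 0.12, 0.18, 0.31, 0.44, 0.62, 0.81, 0.93]
bought_heinz = [0, 0, 1, 0, 1, 1, 1, 1]

thresholds = [i / 20 for i in range(21)]  # 0.00, 0.05, ..., 1.00
best = best_threshold(p_heinz, bought_heinz, thresholds)
best_cost = misclassification_cost(p_heinz, bought_heinz, best)

for t in thresholds:
    print(f"threshold {t:.2f}: cost {misclassification_cost(p_heinz, bought_heinz, t):.2f}")
print(f"lowest-cost threshold: {best:.2f} (cost {best_cost:.2f})")
```

As a sanity check on the empirical result, if the predicted probabilities were perfectly calibrated, sending a coupon would be worthwhile whenever the expected cost of sending, p × $1, is below the expected cost of not sending, (1 − p) × $0.25, that is, whenever p < 0.25 / 1.25 = 0.2. The threshold found from the test data may differ from this benchmark because of sampling variation and imperfect calibration.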
