INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS

MLE, MAP, Concentration (Pengtao)

1. MLE of Uniform Distributions [5 pts]

Given a set of i.i.d. samples X1, ..., Xn ~ Uniform(0, θ), find the maximum likelihood estimator of θ.

(a) Write down the likelihood function (3 pts)

(b) Find the maximum likelihood estimator (2 pts)
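As a numerical sanity check for this problem (not a derivation), the sketch below evaluates the Uniform(0, θ) likelihood on a grid of candidate θ values and compares the grid maximizer with the largest sample. The sample size, the true θ, the seed, and the grid are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def uniform_likelihood(theta, x):
    """Likelihood of i.i.d. Uniform(0, theta) samples x at a candidate theta."""
    x = np.asarray(x, dtype=float)
    # The density is 1/theta on [0, theta] and 0 elsewhere, so the likelihood
    # is theta**(-n) when every sample lies in [0, theta] and 0 otherwise.
    if theta <= 0 or np.any(x > theta) or np.any(x < 0):
        return 0.0
    return theta ** (-len(x))

rng = np.random.default_rng(0)      # seed chosen only for reproducibility
true_theta = 3.0                    # hypothetical value used for the simulation
x = rng.uniform(0.0, true_theta, size=20)

grid = np.linspace(0.01, 6.0, 600)  # candidate theta values, spacing 0.01
lik = np.array([uniform_likelihood(t, x) for t in grid])
best = grid[np.argmax(lik)]

print("largest sample:", x.max())
print("grid maximizer:", best)
```

The likelihood is zero for every θ below the largest sample and decreases above it, so the grid maximizer lands on the first grid point at or above the largest sample.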

2. Concentration [5 pts]

The instructors would like to know what fraction of the students like the Introduction to Machine Learning course. Let this unknown quantity (hopefully very close to 1) be denoted by µ. To estimate µ, the instructors created an anonymous survey which contains this question:

"Do you like the Intro to ML course? Yes or No"

Each student can only answer this question once, and we assume that the answers are i.i.d.

(a) What is the MLE of µ? (1 pt)

(b) Let the above estimator be denoted by µ̂. How many students should the instructors ask if they want the estimate µ̂ to be close enough to the unknown µ that

P(|µ̂ − µ| > 0.1) < 0.05? (4 pts)
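One way to get a concrete number is Hoeffding's inequality, which for i.i.d. answers in [0, 1] gives P(|µ̂ − µ| > ε) ≤ 2 exp(−2nε²). The sketch below solves that bound for n. Choosing Hoeffding here is an assumption, since the problem does not name a bound; a different inequality (for example Chebyshev) would give a different n. The function name is hypothetical.

```python
import math

def hoeffding_sample_size(eps, delta):
    """Smallest integer n with 2 * exp(-2 * n * eps**2) < delta.

    Assumes i.i.d. answers bounded in [0, 1] and uses Hoeffding's inequality.
    """
    bound = math.log(2.0 / delta) / (2.0 * eps ** 2)
    # floor + 1 keeps the inequality strict even if bound is an integer.
    return math.floor(bound) + 1

n = hoeffding_sample_size(eps=0.1, delta=0.05)
print(n)  # 185
```

This is a sufficient sample size under the bound, not necessarily the smallest n for which the probability statement holds.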

3. MAP of Multinomial Distribution [10 pts]

You have just got a loaded 6-sided die from your statistician friend. Unfortunately, he does not remember its exact probability distribution p1, p2, ..., p6. He remembers, however, that he generated the vector (p1, p2, ..., p6) from the following Dirichlet distribution:

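$$P(p_1, p_2, \ldots, p_6) = \frac{\Gamma\left(\sum_{i=1}^{6} u_i\right)}{\prod_{i=1}^{6} \Gamma(u_i)} \prod_{i=1}^{6} p_i^{u_i - 1} \, \delta\left(\sum_{i=1}^{6} p_i - 1\right),$$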

where he chose ui = i for all i = 1, ..., 6. Here Γ denotes the gamma function, and δ is the Dirac delta. To estimate the probabilities p1, p2, ..., p6, you roll the die 1000 times and observe that side i occurred ni times (n1 + n2 + ... + n6 = 1000).

(a) Prove that the Dirichlet distribution is a conjugate prior for the multinomial distribution.

(b) What is the posterior distribution of the side probabilities, P(p1, p2, . . . , p6|n1, n2, . . . , n6)?
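The setup can be simulated to see how the prior and the observed counts combine. The sketch below relies on the standard Dirichlet-multinomial update, in which each prior parameter ui is increased by the count ni; it illustrates the result and is not the proof asked for in (a). The simulated die, the counts, and the seed are assumptions made for illustration, since the problem gives no actual counts.

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen only for reproducibility
u = np.arange(1, 7, dtype=float)          # prior parameters u_i = i from the problem

p_true = rng.dirichlet(u)                 # simulated stand-in for the friend's die
counts = rng.multinomial(1000, p_true)    # simulated n_1, ..., n_6 from 1000 rolls

# Standard conjugate update: posterior is Dirichlet with parameters u_i + n_i.
posterior_params = u + counts
posterior_mean = posterior_params / posterior_params.sum()

print("counts:          ", counts)
print("posterior params:", posterior_params)
print("posterior mean:  ", np.round(posterior_mean, 3))
```

With 1000 rolls against a prior whose parameters sum to 21, the posterior mean stays close to the empirical frequencies ni / 1000.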

Linear Regression (Dani)

1. Optimal MSE rule [10 pts]

Suppose we knew the joint distribution P_XY. The optimal rule f* : X → Y which minimizes the MSE (Mean Squared Error) is given by:

$$f^* = \arg\min_{f} \, E\left[(f(X) - Y)^2\right]$$

Show that f*(X) = E[Y|X].
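The claim can be checked empirically on a small discrete example before proving it. In the sketch below, X takes three values, each prediction rule is a lookup table over those values, and the table of empirical conditional means is compared against a perturbed table. The joint distribution, the perturbation, and the seed are hypothetical choices; a numerical check like this does not replace the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(0, 3, size=n)               # X takes the values 0, 1, 2
y = x ** 2 + rng.normal(0.0, 1.0, size=n)    # hypothetical joint distribution

# Empirical conditional mean E[Y | X = k] for each value k.
cond_mean = np.array([y[x == k].mean() for k in range(3)])

def mse(table):
    """Empirical MSE of the rule f(X) = table[X]."""
    return np.mean((table[x] - y) ** 2)

best = mse(cond_mean)
shifted = mse(cond_mean + np.array([0.3, -0.2, 0.1]))  # any other rule

print("MSE of conditional mean:", best)
print("MSE of perturbed rule:  ", shifted)
```

Any perturbation of the conditional-mean table increases the empirical MSE, which mirrors the population statement f*(X) = E[Y|X].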

2. Ridge Regression [10 pts]

In class, we discussed l2 penalized linear regression:

$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} (Y_i - X_i \beta)^2 + \lambda \|\beta\|_2^2$$

where Xi = [Xi(1) ... Xi(p)].

a) Show that a closed form expression for the ridge estimator is β̂ = (AᵀA + λI)⁻¹AᵀY, where A = [X1; ...; Xn] and Y = [Y1; ...; Yn].

b) An advantage of ridge regression is that a unique solution always exists since (AᵀA + λI) is invertible. To be invertible, a matrix needs to be full rank. Argue that (AᵀA + λI) is full rank by characterizing its p eigenvalues in terms of the singular values of A and λ.
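Both parts can be sanity-checked numerically. The sketch below assumes the objective is the sum of squared residuals plus λ times the squared l2 norm of β, builds a design matrix with a duplicated column (so that AᵀA alone is singular), computes the closed-form estimator, and compares the eigenvalues of AᵀA + λI with the squared singular values of A shifted by λ. The sizes, λ, and the random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 5, 0.7              # hypothetical sizes and penalty
A = rng.normal(size=(n, p))
A[:, 4] = A[:, 3]                   # duplicate a column: A^T A is now singular
Y = rng.normal(size=n)

G = A.T @ A + lam * np.eye(p)
beta = np.linalg.solve(G, A.T @ Y)  # ridge estimator (A^T A + lam I)^{-1} A^T Y

# Gradient of sum_i (Y_i - X_i beta)^2 + lam * ||beta||^2; should vanish at beta.
grad = -2 * A.T @ (Y - A @ beta) + 2 * lam * beta

# Eigenvalues of G versus squared singular values of A shifted by lam.
eig = np.sort(np.linalg.eigvalsh(G))
sv = np.linalg.svd(A, compute_uv=False)
shifted = np.sort(sv ** 2 + lam)

print("max |gradient|:", np.abs(grad).max())
print("eigenvalues:   ", np.round(eig, 4))
print("sigma^2 + lam: ", np.round(shifted, 4))
```

Even though one singular value of A is numerically zero here, the smallest eigenvalue of AᵀA + λI stays at λ, so the linear solve remains well defined.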

Logistic Regression (Prashant)

1. Overfitting and Regularized Logistic Regression [10 pts]

a) Plot the sigmoid function 1/(1 + e^(−wX)) vs. X ∈ R for increasing weight w ∈ {1, 5, 100}. A qualitative sketch is enough. Use these plots to argue why a solution with large weights can cause logistic regression to overfit.

b) To prevent overfitting, we want the weights to be small. To achieve this, instead of maximum conditional likelihood estimation M(C)LE for logistic regression:

$$\max_{w_0, \ldots, w_d} \prod_{i=1}^{n} P(Y_i \mid X_i, w_0, \ldots, w_d),$$

we can consider maximum conditional a posteriori M(C)AP estimation:

$$\max_{w_0, \ldots, w_d} \prod_{i=1}^{n} P(Y_i \mid X_i, w_0, \ldots, w_d) \, P(w_0, \ldots, w_d)$$

where P(w0, ..., wd) is a prior on the weights.

Assuming a standard Gaussian prior N (0, I) for the weight vector, derive the gradient ascent update rules for the weights.
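Both parts of this question can be explored numerically. The sketch below first evaluates the sigmoid near X = 0 for w ∈ {1, 5, 100}, then runs gradient ascent on the log-likelihood plus the log of a N(0, I) prior. It assumes labels Y ∈ {0, 1}, a leading column of ones so that w0 is treated like the other weights (and is covered by the prior), and a fixed step size; the synthetic data, `w_true`, `step`, and `iters` are hypothetical choices, and the function name is illustrative.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Part (a): a larger w makes the curve steeper around X = 0.
xs = np.array([-0.1, 0.1])
for w in (1, 5, 100):
    print(w, sigmoid(w * xs))

def map_gradient_ascent(X, y, step=0.01, iters=2000):
    """Gradient ascent on log-likelihood + log N(0, I) prior over the weights.

    X has a leading column of ones, so w[0] plays the role of w0; y is 0/1.
    """
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        # Likelihood gradient sum_i x_i (y_i - sigmoid(w . x_i)),
        # minus w, which is the gradient of -||w||^2 / 2 from the prior.
        grad = X.T @ (y - sigmoid(X @ w)) - w
        w = w + step * grad
    return w

rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
w_true = np.array([0.5, 2.0, -1.0])            # hypothetical generating weights
y = (rng.uniform(size=n) < sigmoid(X @ w_true)).astype(float)

w_map = map_gradient_ascent(X, y)
final_grad = X.T @ (y - sigmoid(X @ w_map)) - w_map
print("MAP weights:", np.round(w_map, 3))
```

The extra −w term pulls every weight toward zero at each step, which is the same effect as an l2 penalty on the weights.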
