Visualizing the Multivariate Normal
Spectral Decomposition
P is orthogonal if P^{T}P = I and PP^{T} = I.
Theorem: Let A be a symmetric n × n matrix. Then we can write
A = PDP^{T},
where D = diag(λ_{1}, ..., λ_{n}) and P is orthogonal. The λ_{i} are the eigenvalues of A, and the ith column of P is an eigenvector corresponding to λ_{i}.
Orthogonal matrices represent rotations of the coordinates. Diagonal matrices represent stretchings/shrinkings of coordinates.
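The theorem can be checked numerically in base R. The sketch below uses a small symmetric matrix whose values are chosen only for illustration, and the variable names (A, P, D, orth_err, recon_err) are ours rather than part of the notes.

```r
# Symmetric example matrix (hypothetical values chosen for illustration)
A <- matrix(c(2, 1,
              1, 1), nrow = 2, byrow = TRUE)

e <- eigen(A, symmetric = TRUE)   # eigenvalues are returned in decreasing order
P <- e$vectors                    # columns are orthonormal eigenvectors
D <- diag(e$values)               # diagonal matrix of eigenvalues

# P is orthogonal: both products should equal the identity (up to rounding)
orth_err <- max(abs(t(P) %*% P - diag(2)), abs(P %*% t(P) - diag(2)))

# Rebuild A from its spectral decomposition A = P D P^T
A_rebuilt <- P %*% D %*% t(P)
recon_err <- max(abs(A - A_rebuilt))

print(e$values)
print(c(orth_err = orth_err, recon_err = recon_err))
```

Both error terms should be at the level of floating-point rounding, which is what the theorem predicts for any symmetric matrix.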
Properties
- The covariance matrix Σ is symmetric and positive definite, so we know from the spectral decomposition theorem that it can be written as
Σ = PΛP^{T}.
- Λ is the diagonal matrix of the eigenvalues of Σ.
- P is the matrix whose columns are the orthonormal eigenvectors of Σ (hence P is an orthogonal matrix).
  - Geometrically, orthogonal matrices represent rotations.
  - Multiplying by P rotates the coordinate axes so that they are parallel to the eigenvectors of Σ.
  - Probabilistically, this tells us that the axes of the probability-contour ellipse are parallel to those eigenvectors.
  - The radii along those axes are proportional to the square roots of the eigenvalues.
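Can we view det(Σ) as a “variance”?
- The variance of a one-dimensional distribution is a single number that measures spread.
- From the spectral decomposition theorem (SDT): det(Σ) = ∏_{i} λ_{i}.
- The eigenvalues (λ_{i}) tell us how stretched or compressed the distribution is along each eigenvector.
- View det(Σ) as a stretching/compressing factor for the MVN.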
- We will see this from the contour plots
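A quick numerical check of the determinant identity, as a sketch in base R. The covariance values are illustrative, and the area formula in the last step is a standard fact about ellipses that we add here: the contour at radius c has semi-axes c√λ_{1} and c√λ_{2}, so it encloses area π c² √det(Σ).

```r
# Illustrative covariance matrix (an assumption for this sketch)
Sigma  <- matrix(c(2, 1,
                   1, 1), nrow = 2)
lambda <- eigen(Sigma, symmetric = TRUE)$values

det_direct <- det(Sigma)      # determinant computed directly
det_eigen  <- prod(lambda)    # product of the eigenvalues

# Area enclosed by the contour with semi-axes c * sqrt(lambda_i)
c_radius <- 1                 # hypothetical contour constant
area <- pi * c_radius^2 * sqrt(det_direct)

print(c(det_direct = det_direct, det_eigen = det_eigen, area = area))
```

A larger determinant therefore means a larger ellipse at the same contour constant, which is the sense in which det(Σ) acts as an overall stretching factor.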
Our focus is visualizing MVN distributions in R.
What is a Contour Plot?
- A contour plot is a graphical technique for representing a 3-dimensional surface.
- We plot constant z slices (contours) on a 2-D plane.
- The contour plot is an alternative to a 3-D surface plot.
The contour plot is formed by:
- Vertical axis: second independent variable
- Horizontal axis: first independent variable
- Lines: iso-response values
Contour Plot
The lines of the contour plot mark points of equal probability density for the MVN distribution.
- The lines represent pairs of values of the two variables that lead to the same height on the z-axis (the height of the density surface).
- These contours can be constructed from the eigenvalues and eigenvectors of the covariance matrix.
- The ellipse axes point in the directions of the eigenvectors.
- The lengths of the ellipse axes are proportional to the constant times the square roots of the eigenvalues.
- More specifically, the set
||Σ^{-1/2}(X − µ)||^{2} = c^{2}
is an ellipsoid centered at µ with axes ±c√λ_{i} v_{i}, where λ_{i} and v_{i} are the eigenvalues and unit eigenvectors of Σ.
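The construction can be sketched in base R: map the unit circle through P and the square roots of the eigenvalues, then confirm that every resulting point has the same squared Mahalanobis distance from µ. The values of mu, Sigma and c_radius below are assumptions made for illustration.

```r
mu    <- c(1, 1)
Sigma <- matrix(c(2, 1,
                  1, 1), nrow = 2)
c_radius <- 1.5                              # hypothetical contour constant

e <- eigen(Sigma, symmetric = TRUE)
P <- e$vectors                               # ellipse axis directions
half_lengths <- c_radius * sqrt(e$values)    # semi-axis lengths c * sqrt(lambda_i)

# Points on the ellipse: mu + P %*% (half_lengths * unit circle)
theta   <- seq(0, 2 * pi, length.out = 200)
circle  <- rbind(cos(theta), sin(theta))             # 2 x 200
ellipse <- t(mu + P %*% (half_lengths * circle))     # 200 x 2, one point per row

# mahalanobis() returns the squared distance (x - mu)^T Sigma^{-1} (x - mu)
d2 <- mahalanobis(ellipse, center = mu, cov = Sigma)
print(range(d2))                             # both ends should equal c_radius^2

# Draw the ellipse, its center, and its two axes
plot(ellipse, type = "l", asp = 1, xlab = "x1", ylab = "x2")
points(mu[1], mu[2], pch = 19)
for (i in 1:2) {
  v <- half_lengths[i] * P[, i]
  segments(mu[1] - v[1], mu[2] - v[2], mu[1] + v[1], mu[2] + v[2], lty = 2)
}
```

Setting asp = 1 keeps the two axes on the same scale, so the dashed eigenvector axes appear perpendicular as they should.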
Visualizing the MVN Distribution Using Contour Plots
The figure below shows a contour plot of the joint pdf of a bivariate normal distribution. Note: we are plotting the theoretical contour plot. This particular distribution has mean
µ = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
(solid dot), and variance matrix
Σ = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}
Code to construct plot
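library(mvtnorm)   # provides dmvnorm()
# Grid of evaluation points on each axis
x.points <- seq(-3, 3, length.out = 100)
y.points <- x.points
z <- matrix(0, nrow = 100, ncol = 100)
mu <- c(1, 1)
sigma <- matrix(c(2, 1, 1, 1), nrow = 2)
# Evaluate the bivariate normal density at every grid point
for (i in 1:100) {
  for (j in 1:100) {
    z[i, j] <- dmvnorm(c(x.points[i], y.points[j]),
                       mean = mu, sigma = sigma)
  }
}
contour(x.points, y.points, z)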
Our findings
- Probability contours are ellipses.
- Density changes comparatively slowly along the major axis, and quickly along the minor axis.
- The two points marked + in the figure have equal geometric distance from µ.
- But the one to the right of µ lies on a higher probability contour than the one above it, because of the directions of their displacements from the mean.
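The last two findings can be illustrated numerically. The sketch below writes the bivariate normal density in base R (the helper dmvn2 is ours, not a library function) and evaluates it at two hypothetical points one unit to the right of and one unit above µ. The exact positions of the + marks in the figure are not given in the notes, so these coordinates are an assumption.

```r
# Bivariate normal density written out in base R (hypothetical helper)
dmvn2 <- function(x, mu, Sigma) {
  d <- x - mu
  q <- drop(t(d) %*% solve(Sigma) %*% d)     # squared Mahalanobis distance
  exp(-q / 2) / (2 * pi * sqrt(det(Sigma)))
}

mu    <- c(1, 1)
Sigma <- matrix(c(2, 1,
                  1, 1), nrow = 2)

right <- mu + c(1, 0)    # one unit to the right of mu (assumed position)
above <- mu + c(0, 1)    # one unit above mu (assumed position)

# Same Euclidean distance from mu ...
dist_right <- sqrt(sum((right - mu)^2))
dist_above <- sqrt(sum((above - mu)^2))

# ... but different density, because Sigma weights the directions differently
p_right <- dmvn2(right, mu, Sigma)
p_above <- dmvn2(above, mu, Sigma)

print(c(dist_right = dist_right, dist_above = dist_above))
print(c(p_right = p_right, p_above = p_above))
```

With this Σ the horizontal variance is 2 and the vertical variance is 1, so a unit step to the right is a smaller move in Mahalanobis terms than a unit step upward, and the point on the right has the higher density.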
Kernel density estimation (KDE)
- KDE allows us to estimate the density from which each sample was drawn.
- This method (which you will learn about in other classes) allows us to approximate the density using a sample.
- R has functions that compute KDEs, such as density().
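As a minimal sketch of KDE in R, the block below simulates bivariate normal data in base R and estimates the density from the sample. The choice of kde2d() from the MASS package for the two-dimensional estimate is our assumption (the notes only name density()), as are the sample size, seed and grid size.

```r
set.seed(1)                                  # arbitrary seed for reproducibility
n     <- 2000                                # hypothetical sample size
mu    <- c(1, 1)
Sigma <- matrix(c(2, 1,
                  1, 1), nrow = 2)

# Simulate MVN data in base R: chol(Sigma) returns R with t(R) %*% R = Sigma,
# so rows of Z %*% R have covariance Sigma when rows of Z are standard normal
Z <- matrix(rnorm(2 * n), ncol = 2)
X <- sweep(Z %*% chol(Sigma), 2, mu, "+")

# One-dimensional KDE of the first coordinate with base R's density()
kde_x1 <- density(X[, 1])

# Two-dimensional KDE on a 50 x 50 grid with MASS::kde2d()
kde_xy <- MASS::kde2d(X[, 1], X[, 2], n = 50)

# Estimated contours; they should look roughly elliptical around mu
contour(kde_xy$x, kde_xy$y, kde_xy$z, xlab = "x1", ylab = "x2")
points(mu[1], mu[2], pch = 19)
```

Unlike the theoretical contour plot, these contours are estimated from data, so they are only approximately elliptical and change with the sample and the bandwidth.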
What did we learn?
- The contour plot of X (bivariate density): Color is the probability density at each point (red is low density and white is high density).
- Contour lines define regions of probability density (from high to low).
- There is a single point where the density is highest (in the white region), and the contours are approximately ellipses (which is what you expect from a Gaussian).
What can we say in general about the MVN density?
- The spectral decomposition theorem tells us that the contours of the multivariate normal distribution are ellipsoids.
- The axes of the ellipsoids correspond to eigenvectors of the covariance matrix.
- The radii of the ellipsoids are proportional to square roots of the eigenvalues of the covariance matrix.