Probability distributions and stochastic simulation


Transcript:

Probabilità e Statistica per l'analisi di dati sperimentali
Probability distributions and stochastic simulation
Sviluppo e gestione di Data Center per il calcolo scientifico ad alte prestazioni, Master Progetto PRISMA, UniBA/INFN
Alessio Pollice, Dipartimento di Scienze Economiche e Metodi Matematici, Università degli Studi di Bari Aldo Moro
[credits: G. Jona Lasinio, S. Arima @ Sapienza Università di Roma]
12/02/14

Introduction: Statistical models

Statistics concerns what can be learned from data, using statistical models to study the variability of the data. The key feature of a statistical model is that variability is represented using probability distributions, which form the building blocks from which the model is constructed.

Statistical models must accommodate both systematic variation and random variation. The key idea in statistical modelling is to treat the data as the outcome of a random experiment.

Random sample

The fundamental idea of statistical modelling is to treat data as observed values of random variables. The data available $y_1, y_2, \ldots, y_n$ are the observed values of a random sample of size $n$, defined to be a collection of $n$ independent and identically distributed random variables $Y_1, Y_2, \ldots, Y_n$. We suppose that each of the $Y_j$ has the same cumulative distribution function $F$, which represents the population from which the sample has been taken.

Random variables

Statistical models build on random variables; we recall the mean, the variance and the moments of a random variable.

Mean and variance of a random variable

Let $Y$ be a random variable with cumulative distribution function $F$ and density function $f$.

Expected value of $Y$: $E[Y] = \mu = \int y\,dF(y) = \int y f(y)\,dy$

Variance of $Y$: $V[Y] = \mu_2 - \mu^2 = \int y^2\,dF(y) - \left(\int y\,dF(y)\right)^2$
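These definitions can be checked numerically in R with integrate(); a minimal sketch, assuming the illustrative choice $Y \sim \text{Exp}(\text{rate}=2)$:

```r
# Minimal sketch: E[Y] and V[Y] by numerical integration,
# for the illustrative density f(y) = 2 exp(-2y), y >= 0
f <- function(y) 2 * exp(-2 * y)

m1 <- integrate(function(y) y * f(y), lower = 0, upper = Inf)$value    # E[Y]
m2 <- integrate(function(y) y^2 * f(y), lower = 0, upper = Inf)$value  # E[Y^2]

m1          # 0.5  = 1/rate
m2 - m1^2   # 0.25 = 1/rate^2, i.e. V[Y] = E[Y^2] - E[Y]^2
```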

Moment generating function

Moments of a random variable $Y$ can be obtained by the use of the moment generating function (Laplace transform) $M(t) = E(e^{tY})$, provided that $M(t) < \infty$.

Let the derivatives of $M$ be $M'(t) = \frac{dM(t)}{dt}$, $M''(t) = \frac{d^2 M(t)}{dt^2}$, \ldots, $M^{(r)}(t) = \frac{d^r M(t)}{dt^r}$.

If finite, the $r$-th moment of $Y$ is $\mu_r = M^{(r)}(0) = E[Y^r]$.
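As an illustration of $\mu_r = M^{(r)}(0)$, one can differentiate a known m.g.f. numerically at $t = 0$; a sketch assuming $X \sim N(0,1)$, whose m.g.f. $e^{t^2/2}$ is derived later in these slides:

```r
# Sketch: moments from the m.g.f. by central finite differences at t = 0
M <- function(t) exp(t^2 / 2)   # m.g.f. of N(0, 1)

h <- 1e-4
M1 <- (M(h) - M(-h)) / (2 * h)          # ~ M'(0)  = E[X]   = 0
M2 <- (M(h) - 2 * M(0) + M(-h)) / h^2   # ~ M''(0) = E[X^2] = 1
c(M1, M2)                               # approximately 0 and 1
```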

Moment generating function: some properties

$Y_1, \ldots, Y_n$ are independent if and only if their joint moment generating function factorizes as $E[\exp(Y_1 t_1 + \ldots + Y_n t_n)] = E[\exp(Y_1 t_1)] \cdots E[\exp(Y_n t_n)]$

Let $Y = a + bX$. The moment generating function of $Y$ is $M_Y(t) = e^{at} M_X(bt)$.

Any moment generating function corresponds to a unique probability distribution.

Probability distributions

Important random variables include the following, which will only be quickly reviewed in order to concentrate on the Normal distribution model: the Poisson, Binomial, Uniform and Exponential distributions.

[Figure: probability functions/densities of the Binomial (p = 0.5; n = 5, 20, 50), Poisson (lambda = 1, 4, 10), Uniform (U(2,5), U(0.5,6), U(1,9)) and Exponential (lambda = 0.5, 1, 1.5) distributions]

Normal distribution

We say that $X \sim N(\mu, \sigma^2)$ when
$f_X(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
where $\mu$ is the mean, median and mode (location parameter) and $\sigma^2$ is the variance (scale parameter).
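In R the Normal distribution is available through the usual d/p/q/r family of functions; a brief sketch:

```r
dnorm(0)                            # density of N(0, 1) at 0: 1/sqrt(2*pi) ~ 0.3989
pnorm(1.96)                         # cdf: P(Z <= 1.96) ~ 0.975
qnorm(0.975)                        # quantile: ~ 1.96
x <- rnorm(1000, mean = 3, sd = 2)  # 1000 draws from N(3, 4); note sd, not variance
c(mean(x), sd(x))                   # close to 3 and 2
```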

[Figures: Normal densities with varying location parameter (mu = 0, 1, 3, -1, -3), varying scale parameter (sigma^2 = 0.1, 0.25, 1, 4, 9), varying both, and the standard Normal with mean 0 and variance 1]

Why is this distribution so important?

Many natural phenomena show this behaviour: symmetric about a central value, with a bell-shaped frequency distribution. It can often be used even when the data do not look normal, provided they fall under the assumptions of the central limit theorem (when the observations are independent and have finite variance, the sample mean tends to be normally distributed as the sample size grows).

Why is this distribution so important? The empirical rule

Take a set of observations that follows the normal distribution and compute their mean and standard deviation:
1. about 68% of the data fall within 1 s.d. of the mean
2. about 95% of the data fall within 2 s.d. of the mean
3. almost all the data (99.7%) fall within 3 s.d. of the mean
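These three percentages follow directly from the standard normal cdf, as this one-line R check shows:

```r
# P(mu - k*sd <= Y <= mu + k*sd) for k = 1, 2, 3, by standardization
sapply(1:3, function(k) pnorm(k) - pnorm(-k))
# 0.6826895 0.9544997 0.9973002
```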

Example: weights of 100 women, mean = 60 kg, s.d. = 2.5 kg.

[Figures: histogram of the weights with the 1 s.d. and 2 s.d. intervals around the mean marked]

How can we tell whether the data follow a normal distribution?

1. QQ-plot: a scatterplot of the empirical quantiles vs the theoretical ones. If the points lie along the bisector, the data follow the theoretical distribution
2. histogram
3. goodness-of-fit tests
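The three checks in R, sketched on simulated data (shapiro.test() is used here as one possible goodness-of-fit test):

```r
y <- rnorm(100, mean = 60, sd = 2.5)  # simulated data, as in the weights example

qqnorm(y); qqline(y)   # 1. QQ-plot: points should lie close to the line
hist(y, freq = FALSE)  # 2. histogram: roughly bell-shaped
shapiro.test(y)        # 3. goodness of fit: a large p-value gives no evidence against normality
```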

What if the data are not normal?

If the data show behaviour very far from the normal we can always transform them; the most used transformations are the square root and the natural logarithm.

[Figures: Normal QQ-plots of raw data vs the square-root-transformed data, and of raw data vs the log-transformed data]
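A sketch of both transformations on artificially skewed data (the data-generating choice below is illustrative only):

```r
y <- rexp(200, rate = 0.1)^2   # strongly right-skewed data, for illustration only

op <- par(mfrow = c(1, 3))
qqnorm(y, main = "raw"); qqline(y)
qqnorm(sqrt(y), main = "square root"); qqline(sqrt(y))
qqnorm(log(y), main = "natural log"); qqline(log(y))
par(op)
```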

Standardization

The random variable $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$ has a standardized Normal distribution, with density
$f_Z(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right)$
Clearly, $X = \sigma Z + \mu$.

Standardization

The cumulative distribution function is obtained by integrating the density:
$F_Z(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} \exp\left(-\frac{1}{2}z^2\right) dz = \Phi(u)$

For $X \sim N(\mu = 3, \sigma^2 = 4)$,
$P(X \le 2) = F_X(2) = \int_{-\infty}^{2} \frac{1}{\sqrt{2\pi \cdot 4}} \exp\left(-\frac{1}{2}\frac{(z-3)^2}{4}\right) dz$
which cannot be computed analytically. Use R!
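In R the probability in the example is one call to pnorm(); the standardized and numerically integrated versions agree:

```r
pnorm(2, mean = 3, sd = 2)                   # P(X <= 2) for X ~ N(3, 4): ~0.3085
pnorm((2 - 3) / 2)                           # same, after standardization
integrate(dnorm, -Inf, 2, mean = 3, sd = 2)  # same, by numerical integration
```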

M.G.F. of the Normal distribution

The moment generating function of $X \sim N(0, 1)$ is $M_X(t) = E[e^{tX}] = e^{\frac{1}{2}t^2}$.
Let $Y = \mu + \sigma X$; then $Y \sim N(\mu, \sigma^2)$, and the m.g.f. of this linear transformation of $X$ is
$M_Y(t) = E[e^{tY}] = \exp(\mu t)\exp\left(\tfrac{1}{2}\sigma^2 t^2\right)$

Chi-squared distribution

Let $X \sim N(0, 1)$; then the random variable $Y = X^2 \sim \chi^2(1)$, with
$F_Y(u) = 2\Phi(\sqrt{u}) - 1$, $\qquad f_Y(u) = \frac{1}{\sqrt{2\pi u}}\, e^{-\frac{1}{2}u}$

[Figure: chi-squared densities for nu = 1, 3, 5, 10, 20]

Chi-squared distribution

More generally, for $X \sim \chi^2(n)$:
$f_X(x; n) = \frac{1}{2^{n/2}\Gamma(n/2)}\, e^{-x/2}\, x^{\frac{n}{2}-1}$

Note that $X \sim \chi^2(n) \iff X \sim \text{Gamma}(\theta = 1/2;\ \nu = n/2)$.

Chi-squared distribution

The m.g.f. of a Gamma distribution $X \sim \text{Gamma}(\theta, \nu)$ is
$M_X(t) = E[e^{tX}] = \left(\frac{\theta}{\theta - t}\right)^{\nu}$
and hence the m.g.f. of $X \sim \chi^2(1)$ ($\theta = \tfrac{1}{2}$, $\nu = \tfrac{1}{2}$) is
$M_X(t) = E[e^{tX}] = \frac{1}{\sqrt{1 - 2t}}$

Theorem. Let $X_1, X_2, \ldots, X_n$ be iid $N(0, 1)$; then $Z = X_1^2 + X_2^2 + \ldots + X_n^2 \sim \chi^2(n)$.
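The theorem is easy to check by simulation; a sketch with n = 5:

```r
# Sum of n squared standard normals vs the chi-squared(n) density
n <- 5
z <- replicate(10000, sum(rnorm(n)^2))

hist(z, freq = FALSE, breaks = 50)
curve(dchisq(x, df = n), add = TRUE, col = "red")  # theoretical density
c(mean(z), var(z))                                 # theory: E = n, Var = 2n
```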

Student T distribution

Let $X \sim N(0, 1)$ and $W \sim \chi^2(\nu)$ be independent; then the random variable
$Y = \frac{X}{\sqrt{W/\nu}} \sim T_\nu$

[Figure: Student T densities for nu = 1, 3, 5, 10, 20, compared with the N(0,1) density]

Fisher F distribution

Let $X_1 \sim \chi^2(\nu_1)$ and $X_2 \sim \chi^2(\nu_2)$ be independent; then the random variable
$Z = \frac{X_1/\nu_1}{X_2/\nu_2} \sim F(\nu_1, \nu_2)$

[Figure: F densities with nu1 = 5 and nu2 = 1, 5, 10, 20]
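Both constructions can be reproduced by simulation and compared with R's built-in pt() and pf() cdfs; a sketch:

```r
nu <- 5; nu1 <- 5; nu2 <- 10

t_draws <- rnorm(10000) / sqrt(rchisq(10000, df = nu) / nu)
f_draws <- (rchisq(10000, df = nu1) / nu1) / (rchisq(10000, df = nu2) / nu2)

ks.test(t_draws, "pt", df = nu)               # compare with T(nu)
ks.test(f_draws, "pf", df1 = nu1, df2 = nu2)  # compare with F(nu1, nu2)
```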

Association measures

Let $Y_1$ and $Y_2$ be any two random variables. The covariance between $Y_1$ and $Y_2$ is
$\mathrm{Cov}(Y_1, Y_2) = E[(Y_1 - E[Y_1])(Y_2 - E[Y_2])] = E[Y_1 Y_2] - E[Y_1]E[Y_2]$

Association measures

Let $Y$ be a $p$-dimensional random vector. The variance-covariance matrix is the $p \times p$ matrix
$\Sigma = \mathrm{Cov}(Y, Y) = E[(Y - E[Y])(Y - E[Y])']$
with generic element $\sigma_{rs} = \mathrm{Cov}(Y_r, Y_s) = E[(Y_r - E[Y_r])(Y_s - E[Y_s])]$.

Theorem. $\Sigma$ is a positive semidefinite matrix.

Association measures

Let $Y$ be a $p$-dimensional random vector, $a$ a $q$-dimensional vector and $B$ a $p \times q$ matrix. For the vector $a + B'Y$ the $q \times q$ variance-covariance matrix is given by
$\mathrm{Var}(a + B'Y) = B'\Sigma B$

Association measures

The correlation between two variables $Y_1$ and $Y_2$ is defined as
$\rho(Y_1, Y_2) = \frac{\mathrm{Cov}(Y_1, Y_2)}{\sqrt{\mathrm{Var}(Y_1)\mathrm{Var}(Y_2)}}$

Since $|\mathrm{Cov}(Y_1, Y_2)| \le \sqrt{\mathrm{Var}(Y_1)\mathrm{Var}(Y_2)}$, it follows that $-1 \le \rho(Y_1, Y_2) \le 1$.

The correlation matrix is given by
$\Omega = \Sigma_\Delta^{-1/2}\,\Sigma\,\Sigma_\Delta^{-1/2}$
where $\Sigma_\Delta$ is a diagonal matrix with the variances of $Y$ as its non-zero elements.
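The sample analogues of $\Sigma$ and $\Omega$ in R; cov2cor() performs exactly the diagonal rescaling above:

```r
Y <- cbind(y1 = rnorm(500), y2 = rnorm(500))
Y[, 2] <- Y[, 1] + Y[, 2]  # induce positive correlation (illustrative)

S <- cov(Y)   # sample variance-covariance matrix (Sigma)
cor(Y)        # sample correlation matrix (Omega)
cov2cor(S)    # same result: rescale S by its diagonal standard deviations
```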

The multivariate normal distribution

$Y = (Y_1, \ldots, Y_p)' \sim N_p(\mu, \Sigma)$ has density function
$f(y; \mu, \Sigma) = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(y - \mu)'\Sigma^{-1}(y - \mu)\right\}$
where $|\Sigma| = \det(\Sigma)$.

The multivariate normal distribution. Example: $p = 2$

$f(y_1, y_2; \mu_1, \mu_2, \Sigma) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{-\frac{1}{2}Q(y_1, y_2)\right\}$
where
$Q(y_1, y_2) = \frac{1}{1-\rho^2}\left[\left(\frac{y_1-\mu_1}{\sigma_1}\right)^2 - 2\rho\left(\frac{y_1-\mu_1}{\sigma_1}\right)\left(\frac{y_2-\mu_2}{\sigma_2}\right) + \left(\frac{y_2-\mu_2}{\sigma_2}\right)^2\right]$

[Figures: perspective and contour plots of the bivariate normal density for rho = 0.8, 0, 0.4, 0.7]

The multivariate normal distribution

The moment generating function of $Y \sim N_p(\mu, \Sigma)$ is
$M_Y(t) = E[e^{t'Y}] = \exp\left(t'\mu + \tfrac{1}{2}t'\Sigma t\right)$

Theorem. Let $Y \sim N_p(\mu, \Sigma)$ and $B$ a $k \times p$ matrix; then the variable $W = BY$ is $W \sim N_k(B\mu, B\Sigma B')$.
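Both the simulation of $N_p(\mu, \Sigma)$ and the linear-transformation theorem can be checked with MASS::mvrnorm(); a sketch with illustrative parameter values:

```r
library(MASS)  # for mvrnorm()

mu <- c(0, 2)
Sigma <- matrix(c(1, 0.7, 0.7, 2), 2, 2)
Y <- mvrnorm(n = 10000, mu = mu, Sigma = Sigma)

B <- matrix(c(1, 1, 1, -1), nrow = 2, byrow = TRUE)  # a 2 x 2 choice of B
W <- Y %*% t(B)  # rows of W are B y_i

colMeans(W)      # close to B %*% mu
cov(W)           # close to the theoretical value below
B %*% Sigma %*% t(B)
```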

Marginal and conditional distributions

Let $Y \sim N_p(\mu, \Sigma)$ and $Y = (Y_1', Y_2')'$, where $Y_1$ is $q \times 1$ and $Y_2$ is $(p-q) \times 1$. Let $\mu = (\mu_1', \mu_2')'$ and
$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$
where $\Sigma_{11}$ is $q \times q$, $\Sigma_{12} = \Sigma_{21}'$ is $q \times (p-q)$ and $\Sigma_{22}$ is $(p-q) \times (p-q)$.

It can be shown that:
1. $Y_1 \sim N_q(\mu_1, \Sigma_{11})$
2. $Y_2 \sim N_{p-q}(\mu_2, \Sigma_{22})$
3. $Y_1 \mid Y_2 = y \sim N_q\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(y - \mu_2),\ \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)$
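For $p = 2$, statement 3 says the conditional mean of $Y_1$ given $Y_2 = y$ is linear in $y$ with slope $\Sigma_{12}/\Sigma_{22}$, which the regression coefficient on simulated data recovers; a sketch with illustrative parameters:

```r
library(MASS)
mu <- c(1, 2)
Sigma <- matrix(c(2, 1.2, 1.2, 1.5), 2, 2)
Y <- mvrnorm(100000, mu, Sigma)

fit <- lm(Y[, 1] ~ Y[, 2])
coef(fit)                                  # slope ~ Sigma12 / Sigma22 = 0.8
var(residuals(fit))                        # ~ conditional variance
Sigma[1, 1] - Sigma[1, 2]^2 / Sigma[2, 2]  # = 2 - 1.44/1.5 = 1.04
```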

Modes of convergence

IDEA: the bigger our sample, the more faith we can have in our inferences, because the sample is more representative of the distribution $F$ from which it comes. We will mention two modes of convergence: convergence in probability and convergence in distribution.

Convergence in probability

The sequence of random variables $X_1, X_2, \ldots$ converges in probability to $X$, written $X_n \xrightarrow{p} X$, if for any $\epsilon > 0$
$\Pr(|X_n - X| > \epsilon) \to 0$ as $n \to \infty$

A special case of this is the weak law of large numbers: let $Y_1, Y_2, \ldots$ be a sequence of iid random variables, each with finite mean $\mu$; then $\bar{Y}_n \xrightarrow{p} \mu$.
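A sketch of the weak law with exponential variables ($\mu = 1$): the running sample mean settles around $\mu$ as $n$ grows:

```r
set.seed(1)
y <- rexp(10000)                         # iid draws with mean mu = 1
running_mean <- cumsum(y) / seq_along(y)

plot(running_mean, type = "l", xlab = "n", ylab = "sample mean")
abline(h = 1, col = "red")               # the running mean converges to mu
```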

[Figure: weak law of large numbers; sampling distributions of the sample mean for n = 1, 5, 10, 50, concentrating around the mean]

Convergence in distribution

The sequence of random variables $X_1, X_2, \ldots$ converges in distribution to $X$, written $X_n \xrightarrow{D} X$, if
$\Pr(X_n < x) \to \Pr(X < x)$ as $n \to \infty$

A special case of this is the Central Limit Theorem: let $Y_1, Y_2, \ldots$ be a sequence of iid random variables with finite mean $\mu$ and finite variance $\sigma^2 > 0$; then
$Z_n = \frac{\bar{Y}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{D} Z \sim N(0, 1)$
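A sketch of the CLT for a skewed parent distribution, again Exp(1), so that $\mu = \sigma = 1$:

```r
n <- 50
z <- replicate(10000, (mean(rexp(n)) - 1) / (1 / sqrt(n)))  # standardized means

hist(z, freq = FALSE, breaks = 50)
curve(dnorm(x), add = TRUE, col = "red")  # already close to N(0, 1) at n = 50
```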

[Figure: Central Limit Theorem; distributions of the standardized sample mean for n = 1, 5, 10, 50, approaching the N(0,1) density]

Some consequences

Consider the average $\bar{Y}$ of a random sample of iid random variables with mean $\mu$ and variance $\sigma^2 > 0$. The weak law of large numbers implies that $\bar{Y}$ is a consistent estimator of $\mu$. It is also an unbiased estimator of $\mu$: $E[\bar{Y}] = \mu$. The central limit theorem implies that $\bar{Y} = \mu + n^{-1/2}\sigma Z_n$, where $Z_n \xrightarrow{D} N(0, 1)$. Hence in large samples $\bar{Y}$ is essentially a normal variable with mean $\mu$ and variance $\sigma^2/n$.