Covariance
In probability and statistics, the covariance is a value that indicates the degree of joint variation of two random variables with respect to their means. It is the basic quantity used to determine whether there is a dependency between the two variables, and it is also needed to estimate other basic parameters, such as the linear correlation coefficient or the regression line.
Interpretation
When high values of one variable mainly correspond to high values of the other, and the same holds for the low values of each, the variables tend to show similar behavior, which is reflected in a positive value of the covariance.
Conversely, when high values of one variable mainly correspond to low values of the other, expressing opposite behavior, the covariance is negative.
The sign of the covariance therefore expresses the tendency in the linear relationship between the variables.
Its magnitude requires additional effort to interpret:
The normalized version of the covariance, the correlation coefficient, indicates the strength of the linear relationship.
A distinction must be made between:
(1) the covariance of two random variables, a statistical parameter of a population regarded as a property of the joint distribution, and
(2) the sample covariance, which is used as an estimate of that population parameter.
Definition
The covariance between two random variables $X$ and $Y$ is defined as
- $\operatorname{Cov}(X,Y)=\operatorname{E}\bigl[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\bigr],$
provided $\operatorname{E}[X]$, $\operatorname{E}[Y]$ and $\operatorname{E}[XY]$ are finite, where $\operatorname{E}[X]$, $\operatorname{E}[Y]$ and $\operatorname{E}[XY]$ denote the expected values of the random variables $X$, $Y$ and $XY$ respectively. Since expectation is a linear operator, the previous expression can be rewritten as
- $\operatorname{Cov}(X,Y)=\operatorname{E}\bigl[(X-\operatorname{E}[X])(Y-\operatorname{E}[Y])\bigr]=\operatorname{E}\bigl[XY-X\operatorname{E}[Y]-\operatorname{E}[X]Y+\operatorname{E}[X]\operatorname{E}[Y]\bigr]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]-\operatorname{E}[X]\operatorname{E}[Y]+\operatorname{E}[X]\operatorname{E}[Y]=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y].$
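As an illustration of the definition, the following sketch computes the covariance of a small, entirely hypothetical discrete joint distribution both directly and via $\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$ (the values and probabilities are made up for the example):

```python
import numpy as np

# Hypothetical joint distribution of (X, Y): the values and probabilities are
# illustrative only; any finite joint distribution works the same way.
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([-1.0, 0.0, 1.0])
p_xy = np.array([[0.10, 0.05, 0.05],   # rows index x, columns index y
                 [0.05, 0.30, 0.15],
                 [0.05, 0.10, 0.15]])  # probabilities sum to 1

# Marginal expectations E[X], E[Y] and the cross moment E[XY]
e_x = np.sum(p_xy.sum(axis=1) * x_vals)
e_y = np.sum(p_xy.sum(axis=0) * y_vals)
e_xy = np.sum(p_xy * np.outer(x_vals, y_vals))

# Both forms of the definition give the same number
cov_definition = np.sum(p_xy * np.outer(x_vals - e_x, y_vals - e_y))
cov_shortcut = e_xy - e_x * e_y
print(cov_definition, cov_shortcut)  # identical up to rounding
```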
Discrete Random Variables
If the random variables $X$ and $Y$ take the values $x_i$ and $y_i$ for $i=1,2,\dots,n$, each with probability $\operatorname{P}[X=x_i]=1/n$ and $\operatorname{P}[Y=y_i]=1/n$ respectively, then the covariance can be expressed in terms of $\operatorname{E}[X]$ and $\operatorname{E}[Y]$ as
- $\operatorname{Cov}(X,Y)=\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\operatorname{E}[X]\right)\left(y_i-\operatorname{E}[Y]\right)$
or expressed as
- $\operatorname{Cov}(X,Y)=\frac{1}{2n^{2}}\sum_{i=1}^{n}\sum_{j=1}^{n}\left(x_i-x_j\right)\left(y_i-y_j\right)$
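A quick numerical check, under the assumption of arbitrary paired data with each pair taken with probability $1/n$, that the two expressions above give the same value:

```python
import numpy as np

# Arbitrary paired observations, each pair assumed to occur with probability 1/n
x = np.array([2.1, 3.4, 0.5, 4.0, 1.2])
y = np.array([1.0, 2.2, 0.3, 3.5, 0.9])
n = len(x)

# First form: mean of the products of deviations from the means
cov_1 = np.sum((x - x.mean()) * (y - y.mean())) / n

# Second form: average over all ordered pairs of differences
cov_2 = np.sum(np.subtract.outer(x, x) * np.subtract.outer(y, y)) / (2 * n**2)

print(cov_1, cov_2)  # the two values agree
```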
Multivariate Case
If $\mathbf{X}$ is a random vector of dimension $n$, that is, $\mathbf{X}=(X_1,\dots,X_n)^{t}$ where $X_i$, for $i=1,2,\dots,n$, are random variables, then the covariance matrix, denoted by $\Sigma$, is given by
- $\Sigma=\begin{pmatrix}\operatorname{Cov}(X_1,X_1)&\operatorname{Cov}(X_1,X_2)&\cdots&\operatorname{Cov}(X_1,X_n)\\\operatorname{Cov}(X_2,X_1)&\operatorname{Cov}(X_2,X_2)&\cdots&\operatorname{Cov}(X_2,X_n)\\\vdots&\vdots&\ddots&\vdots\\\operatorname{Cov}(X_n,X_1)&\operatorname{Cov}(X_n,X_2)&\cdots&\operatorname{Cov}(X_n,X_n)\end{pmatrix}$
That is, the $(i,j)$-th entry of $\Sigma$ is the covariance between $X_i$ and $X_j$, which can be written as
- $\Sigma_{ij}=\operatorname{Cov}(X_i,X_j)$
in particular, when $i=j$,
- $\Sigma_{ii}=\operatorname{Cov}(X_i,X_i)=\operatorname{Var}(X_i)$
so the matrix $\Sigma$ can be written as
- $\Sigma=\begin{pmatrix}\operatorname{Var}(X_1)&\operatorname{Cov}(X_1,X_2)&\cdots&\operatorname{Cov}(X_1,X_n)\\\operatorname{Cov}(X_2,X_1)&\operatorname{Var}(X_2)&\cdots&\operatorname{Cov}(X_2,X_n)\\\vdots&\vdots&\ddots&\vdots\\\operatorname{Cov}(X_n,X_1)&\operatorname{Cov}(X_n,X_2)&\cdots&\operatorname{Var}(X_n)\end{pmatrix}$
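A short sketch of the covariance matrix on simulated data (the dimension, mean and covariance used for the simulation are arbitrary); NumPy's np.cov with rowvar=False treats each column as one component $X_i$:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated sample from a hypothetical 3-dimensional random vector X = (X1, X2, X3)
samples = rng.multivariate_normal(mean=[0.0, 1.0, -1.0],
                                  cov=[[2.0, 0.5, 0.0],
                                       [0.5, 1.0, 0.3],
                                       [0.0, 0.3, 1.5]],
                                  size=10_000)

# Empirical covariance matrix: entry (i, j) estimates Cov(X_i, X_j),
# and the diagonal estimates Var(X_i)
sigma_hat = np.cov(samples, rowvar=False)
print(np.round(sigma_hat, 2))  # close to the matrix used to simulate
```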
Properties
Covariance with itself
The variance is a particular case of the covariance, obtained when the two random variables are identical:
- $\operatorname{Cov}(X,X)=\operatorname{Var}(X)\equiv\sigma_X^{2}$
Covariance of linear combinations
Let $X$, $Y$, $W$ and $V$ be random variables and $a,b,c,d\in\mathbb{R}$; then:
- $\operatorname{Cov}(X,a)=0$
- $\operatorname{Cov}(X,X)=\operatorname{Var}(X)$, where $\operatorname{Var}(X)$ denotes the variance of $X$.
- $\operatorname{Cov}(X,Y)=\operatorname{Cov}(Y,X)$ (symmetry property).
- $\operatorname{Cov}(aX,bY)=ab\operatorname{Cov}(X,Y)$
- $\operatorname{Cov}(X+a,Y+b)=\operatorname{Cov}(X,Y)$
- $\operatorname{Cov}(aX+bY,cW+dV)=ac\operatorname{Cov}(X,W)+ad\operatorname{Cov}(X,V)+bc\operatorname{Cov}(Y,W)+bd\operatorname{Cov}(Y,V)$
- $\operatorname{Cov}(X,Y)=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]$, the formula used in practice to calculate the covariance.
These properties follow almost directly from the definition of covariance.
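As an example, the bilinear expansion of $\operatorname{Cov}(aX+bY,cW+dV)$ can be checked numerically; the sketch below uses sample covariances of simulated variables as stand-ins for the theoretical ones, and all names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated samples standing in for the random variables X, Y, W, V
X, Y, W, V = rng.normal(size=(4, 100_000))
a, b, c, d = 2.0, -1.0, 0.5, 3.0  # arbitrary real constants

def cov(u, v):
    """Sample covariance used as a stand-in for Cov(., .)."""
    return np.mean((u - u.mean()) * (v - v.mean()))

lhs = cov(a * X + b * Y, c * W + d * V)
rhs = (a * c * cov(X, W) + a * d * cov(X, V)
       + b * c * cov(Y, W) + b * d * cov(Y, V))
print(lhs, rhs)  # equal up to floating-point rounding
```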
For a sequence $X_1,X_2,\dots,X_n$ of random variables and constants $a_1,a_2,\dots,a_n\in\mathbb{R}$ we have
- $\operatorname{Var}\left(\sum_{i=1}^{n}a_iX_i\right)=\sum_{i=1}^{n}a_i^{2}\sigma_{X_i}^{2}+2\sum_{i,j\,:\,i<j}a_ia_j\operatorname{Cov}(X_i,X_j)=\sum_{i,j}a_ia_j\operatorname{Cov}(X_i,X_j)$
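The same identity can be checked with sample analogues, writing $\operatorname{Var}\left(\sum_i a_iX_i\right)=a^{t}\Sigma a$ with $\Sigma$ the covariance matrix (simulated data and arbitrary weights):

```python
import numpy as np

rng = np.random.default_rng(2)
# Simulated correlated variables X_1, ..., X_n (columns) and arbitrary weights a_i
data = rng.multivariate_normal([0, 0, 0],
                               [[1.0, 0.4, 0.2],
                                [0.4, 2.0, -0.3],
                                [0.2, -0.3, 1.5]], size=50_000)
a = np.array([1.0, -2.0, 0.5])

# Left-hand side: sample variance of the linear combination sum_i a_i X_i
lhs = np.var(data @ a, ddof=0)

# Right-hand side: a^T Sigma a, with Sigma the sample covariance matrix
sigma_hat = np.cov(data, rowvar=False, ddof=0)
rhs = a @ sigma_hat @ a
print(lhs, rhs)  # identical up to floating-point rounding
```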
Uncorrelatedness and Independence
Random variables whose covariance is zero are said to be uncorrelated.
If $X$ and $Y$ are independent random variables, then their covariance is zero, that is,
- $\operatorname{Cov}(X,Y)=0$
This follows from the property of independence
- $\operatorname{E}[XY]=\operatorname{E}[X]\operatorname{E}[Y]$
then substituting in the covariance formula we get
- $\operatorname{Cov}(X,Y)=\operatorname{E}[XY]-\operatorname{E}[X]\operatorname{E}[Y]=\operatorname{E}[X]\operatorname{E}[Y]-\operatorname{E}[X]\operatorname{E}[Y]=0$
The converse, however, is generally not true: some pairs of random variables have zero covariance even though they are not independent. Under some additional hypotheses, zero covariance does imply independence, as in the case of the multivariate normal distribution.
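A standard counterexample, not specific to this article: if $X$ is symmetric about zero and $Y=X^{2}$, then $Y$ is completely determined by $X$, yet $\operatorname{Cov}(X,Y)=\operatorname{E}[X^{3}]-\operatorname{E}[X]\operatorname{E}[X^{2}]=0$. A small simulation of this case:

```python
import numpy as np

rng = np.random.default_rng(3)
# X symmetric about 0 and Y = X^2: dependent (Y is a function of X),
# yet Cov(X, Y) = 0 because E[X^3] = 0 and E[X] = 0 by symmetry.
x = rng.uniform(-1.0, 1.0, size=200_000)
y = x ** 2

sample_cov = np.mean((x - x.mean()) * (y - y.mean()))
print(sample_cov)  # close to 0, even though Y is completely determined by X
```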
Relation to the dot product
Most of the properties of covariance follow from those of the dot product:
- Bilinearity: for $a,b\in\mathbb{R}$ and random variables $X$, $Y$ and $U$, it holds that $\operatorname{Cov}(aX+bY,U)=a\operatorname{Cov}(X,U)+b\operatorname{Cov}(Y,U)$
- Symmetry: $\operatorname{Cov}(X,Y)=\operatorname{Cov}(Y,X)$
- It is a positive operator: $\operatorname{Var}(X)=\operatorname{Cov}(X,X)\geq 0$; moreover, if $\operatorname{Cov}(X,X)=0$ then $X$ is a constant random variable.
In fact, the covariance is an inner product on the quotient space of random variables with finite second moments, identifying variables that differ by a constant.
Sample covariance
If $X$ and $Y$ are random variables taking the values $x_i$ and $y_i$ for $i=1,2,\dots,n$, then the covariance between $X$ and $Y$ can be estimated. This estimator, denoted by $S_{xy}$, is defined as
- $S_{xy}=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\overline{x})(y_i-\overline{y})=\frac{1}{n-1}\left[\sum_{i=1}^{n}x_iy_i-n\,\overline{x}\,\overline{y}\right]$
where
- $\overline{x}=\sum_{i=1}^{n}\frac{x_i}{n}\qquad\text{and}\qquad\overline{y}=\sum_{i=1}^{n}\frac{y_i}{n}$
denote the sample means.
The estimator $S_{xy}$ has the property of being an unbiased estimator of the covariance.
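A minimal sketch of the estimator on made-up paired data; NumPy's np.cov uses the same $1/(n-1)$ normalization by default, so its off-diagonal entry matches $S_{xy}$:

```python
import numpy as np

# Paired observations (x_i, y_i); the numbers are illustrative only
x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7])
n = len(x)

# Unbiased sample covariance with the 1/(n-1) factor, in both forms of the formula
s_xy_deviations = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)
s_xy_shortcut = (np.sum(x * y) - n * x.mean() * y.mean()) / (n - 1)

# np.cov returns the 2x2 sample covariance matrix; its off-diagonal entry is S_xy
print(s_xy_deviations, s_xy_shortcut, np.cov(x, y)[0, 1])
```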
Covariance Interpretation
- If $S_{xy}>0$ there is direct (positive) dependence, i.e., large values of $X$ correspond to large values of $Y$.
- If $S_{xy}=0$ the covariance is interpreted as the absence of a linear relationship between the two variables.
- If $S_{xy}<0$ there is inverse (negative) dependence, i.e., large values of $X$ correspond to small values of $Y$.