Analysis of Variance (ANOVA)

Note on Notation

Once again, we will use different notations for the various things $F$ can denote:

  • The test and distribution's name: $F$

  • The distribution: $\mathcal{F}$

  • The test statistic: $\phi$

  • The pdf: $f_F$

  • The cdf: $F_F$

Cheat Sheet

  • Description: Tests if the variances of two normally-distributed samples, $\vec{x}_1$ and $\vec{x}_2$, are equal

  • Statistic: $\sigma_1^2/\sigma_2^2$ (ratio of variances)

  • Distribution: $\mathcal{F}(n_1-1,\,n_2-1)$ ($F$)

  • Sidedness: Two-sided

  • Null Hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

  • Alternative Hypothesis: $H_a: \sigma_1^2 \neq \sigma_2^2$

  • Test Statistic: $\phi=\frac{s_1^2}{s_2^2}$

Description

In the last section, we went over how the $\chi^2$ distribution is derived from summing squared standard normal variables. Well, the $F$ distribution is derived from dividing two $\chi^2$-distributed variables, each scaled by its degrees of freedom. Specifically, if $S_1\sim\mathcal{K}(k_1)$ and $S_2\sim\mathcal{K}(k_2)$, then

$$\frac{S_1/k_1}{S_2/k_2}\sim\mathcal{F}(k_1,k_2)$$

These $\chi^2$ distributions might seem like they come from nowhere in this test, but recall the definition of $s^2$, the sample variance:

$$s^2 =\frac{1}{n-1}\sum_{i=1}^n(x_i - \hat{\mu})^2$$

Since $x_i\sim\mathcal{N}(\mu,\sigma^2)$ under the assumptions of the $F$ test, we have $\hat{\mu}\sim\mathcal{N}(\mu,\sigma^2/n)$, and each standardized deviation $(x_i-\hat{\mu})/\sigma$ is approximately standard normal, so each $\left((x_i-\hat{\mu})/\sigma\right)^2$ behaves like a $\mathcal{K}(1)$ variable; in fact $(n-1)s^2/\sigma^2\sim\mathcal{K}(n-1)$, which is where the $\chi^2$ distributions come from. A final thing to note is that, unlike other tests that assume samples are normally distributed, the $F$ test is extremely sensitive to departures from normality. Thus, it would take an even larger sample size than for other tests for an $F$ test to stay valid for non-normal samples.
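To make this connection concrete, here is a minimal simulation sketch in Python (the sample sizes, means, and shared variance below are made up for illustration and are not from the lesson) checking that, for normal samples with equal variances, the ratio of sample variances $s_1^2/s_2^2$ follows $\mathcal{F}(n_1-1,\,n_2-1)$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n1, n2 = 8, 12          # hypothetical sample sizes
n_reps = 100_000        # number of simulated pairs of samples

# Both groups share sigma = 2, so the null hypothesis of equal variances holds.
x1 = rng.normal(0.0, 2.0, size=(n_reps, n1))
x2 = rng.normal(5.0, 2.0, size=(n_reps, n2))   # means may differ; only variances matter here

# Ratio of unbiased sample variances for each simulated pair.
ratios = x1.var(axis=1, ddof=1) / x2.var(axis=1, ddof=1)

# Empirical quantiles of the ratio vs. the theoretical F(n1 - 1, n2 - 1) quantiles.
for q in (0.25, 0.5, 0.75, 0.95):
    empirical = float(np.quantile(ratios, q))
    theoretical = float(stats.f.ppf(q, n1 - 1, n2 - 1))
    print(f"q={q}: empirical={empirical:.3f}, F quantile={theoretical:.3f}")
```

The empirical and theoretical quantiles should agree closely, which is exactly the distributional claim the test relies on.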

Another thing to note is that since $F$ tests are only two-sided, we can't determine the "direction" of the difference. In this test, for example, the result doesn't tell you whether $\sigma_1^2$ or $\sigma_2^2$ is the larger of the two.
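Putting the cheat sheet together, the test itself is short. Below is a minimal sketch with assumed, synthetic data; the statistic and two-sided p-value are computed directly from the $F$ cdf:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(loc=0.0, scale=1.0, size=30)   # hypothetical sample 1
x2 = rng.normal(loc=0.0, scale=1.5, size=40)   # hypothetical sample 2

s1_sq = np.var(x1, ddof=1)           # sample variances with the 1/(n-1) factor
s2_sq = np.var(x2, ddof=1)
phi = s1_sq / s2_sq                  # test statistic phi = s_1^2 / s_2^2

dfn, dfd = len(x1) - 1, len(x2) - 1  # compare against F(n_1 - 1, n_2 - 1)
cdf = stats.f.cdf(phi, dfn, dfd)
p_two_sided = 2 * min(cdf, 1 - cdf)  # two-sided p-value

print(f"phi = {phi:.3f}, p = {p_two_sided:.4f}")
```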

$F$ Test for Comparison of Multiple Means (Omnibus Test of Means)

Cheat Sheet

  • Description: Tests if the means of $k$ normally-distributed samples, $\vec{x}_1,\vec{x}_2,\dots,\vec{x}_k$, with $n_1,n_2,\dots,n_k$ observations each ($N$ total), differ in at least one pair-wise comparison

  • Statistic: $\mu_i-\mu_j$ (difference of means between any two groups)

  • Distribution: $\mathcal{F}(k-1,\,N-k)$ ($F$)

  • Sidedness: Two-sided

  • Null Hypothesis: $H_0: \mu_1=\mu_2=\cdots=\mu_k$

  • Alternative Hypothesis: $H_a:\mu_i\neq\mu_j$ for at least one pair $i,j\leq k$ where $i\neq j$

  • Test Statistic: $\phi=\frac{V_e}{V_{\bar{e}}}$

Description

The $V_e$ and $V_{\bar{e}}$ in $\phi$ denote the "explained variance" and the "unexplained variance", respectively. They are also called the "between-group variability" and the "within-group variability". The idea is that we can take the ratio of the variability between the group means and the variability within the groups as a proxy to determine whether there is a difference of means between groups. This is because we would expect these two "variances" to be roughly equal (a ratio near one) if all groups share the same mean, and the ratio to be large if they do not.

As for definitions, $V_e$ is defined as:

$$V_e=\sum_{i=1}^k \frac{n_i(\hat{\mu}_i-\hat{\mu}_A)^2}{k-1}$$

where $\hat{\mu}_A$ is the mean of all samples when combined. $V_{\bar{e}}$ is defined as:

$$V_{\bar{e}}=\sum_{i=1}^k\sum_{j=1}^{n_i}\frac{(x_{i,j}-\hat{\mu}_i)^2}{N-k}$$

where $x_{i,j}$ is the $j$th observation in the $i$th sample.
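Here is a minimal sketch (with made-up groups) of how $V_e$, $V_{\bar{e}}$, and $\phi$ can be computed directly from the formulas above, with `scipy.stats.f_oneway` used only as a cross-check of the manual computation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Three hypothetical groups (k = 3) with slightly different true means.
groups = [rng.normal(mu, 1.0, size=25) for mu in (0.0, 0.2, 0.5)]

k = len(groups)
N = sum(len(g) for g in groups)
mu_A = np.concatenate(groups).mean()            # grand mean of all observations

# Between-group ("explained") and within-group ("unexplained") mean squares.
V_e = sum(len(g) * (g.mean() - mu_A) ** 2 for g in groups) / (k - 1)
V_ebar = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)

phi = V_e / V_ebar
p = stats.f.sf(phi, k - 1, N - k)               # upper-tail p-value of F(k - 1, N - k)

print(f"phi = {phi:.3f}, p = {p:.4f}")
print(stats.f_oneway(*groups))                  # cross-check: should match phi and p
```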

This test is useful because running multiple $t$ tests exponentially increases the probability of getting a false positive (also called a "type I error"). "Exponentially" here is not a placeholder for "a lot." If each test has false-positive probability $\psi$, the probability of never getting a false positive in $n$ independent tests is $(1-\psi)^n$, which clearly tends to zero as $n\rightarrow\infty$; equivalently, the probability of at least one false positive, $1-(1-\psi)^n$, tends to one. The $F$ test doesn't have this issue since it's an "omnibus test", meaning it tests all of these hypotheses "all at once."
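For a concrete sense of scale (numbers chosen purely for illustration): with $\psi = 0.05$ per test and $n = 20$ tests, $(1-0.05)^{20}\approx 0.36$, so the probability of at least one false positive is already about $64\%$.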
