Statistics & AB Testing

The Z and t test

The $Z$ and $t$ tests are basic tests that are most often used to check whether a mean is equal to, greater than, or less than some hypothesized value. As a rule of thumb, $Z$-tests are used when the sample size is greater than 30, whereas $t$-tests are used with sample sizes less than 30.

$Z$ tests

Cheat Sheet

Description: Tests if the mean $\mu$ is equal/less than/ greater than $\mu_0$
Statistic: $\mu$ (mean)
Distribution: $\mathcal{N}(\mu,\sigma^2)$ (normal)
Sidedness: Either
Null Hypothesis: $H_0: \mu = \mu_0$ (two-sided), $\mu \geq\mu_0,\mu\leq\mu_0$ (one-sided)
Alternative Hypothesis: $H_a: \mu \neq \mu_0$ (two-sided), $\mu\lt\mu_0,\mu\gt\mu_0$ (one-sided)
Test Statistic: $Z=\frac{\hat{\mu}-\mu_0}{s/\sqrt{n}}\quad (s=\sigma \ \text{if known})$

Description

The $Z$ -test compares the mean of a sample ( $\hat{\mu}$ , also denoted as $\bar{x}$ ) to some value, $\mu_0$ . It’s the go-to test for questions about the mean of some population. For example, if you had a sample of the GPA of students, you could run a $Z$ -test to see if the students are doing well on average.

It assumes that the sample follows a normal distribution with some mean $\mu$ and variance $\sigma^2$ .

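The hypotheses are statements about the population mean $\mu$, which we assess using the sample mean $\hat{\mu}$. In a two-sided $Z$-test, the alternative hypothesis is $\mu\neq\mu_0$. In a one-sided $Z$-test, the alternative is either $\mu\lt\mu_0$ or $\mu\gt\mu_0$.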

The mean of the sample $\vec{x}=\{x_1,x_2,\dots,x_n\}$ is defined by the familiar notion of the arithmetic mean of a set of numbers:

$\hat{\mu}_{\vec{x}}=\frac{1}{n}\sum_{i=1}^nx_i$

The test statistic of the $Z$-test, called the $Z$-value or $Z$-score, is defined as

$Z=\frac{\hat{\mu}-\mu_0}{\sigma/\sqrt{n}}$

where $\sigma$ is the standard deviation of the assumed sample distribution $\mathcal{N}(\mu,\sigma^2)$. Most of the time, $\sigma$ is not known in advance, so it is estimated through the sample standard deviation, $s$:

$s=\sqrt{\frac{1}{n-1}\sum_{i=1}^n(x_i-\hat{\mu})^2}$

The $1/(n-1)$ term is included because it makes $s^2$ an unbiased estimator of $\sigma^2$, meaning that $\mathbb{E}[s^2]=\sigma^2$. Using the more intuitive constant of $1/n$ results in a biased estimator, whose expected value is $\frac{n-1}{n}\sigma^2\neq\sigma^2$.
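As a quick check of the formulas for $\hat{\mu}$ and $s$, here is a minimal Python sketch. It uses the first sixteen GPA values from this lesson's example; the variable names are just illustrative. It computes $s$ with both the $1/(n-1)$ and the $1/n$ constants so you can see that the two differ.

```python
import math
import statistics

# First row of GPAs from this lesson's example (n = 16).
x = [3.2, 2.9, 3.7, 2.5, 3.1, 3.8, 2.7, 3.0,
     3.3, 2.8, 3.6, 2.6, 3.5, 2.4, 3.4, 2.3]
n = len(x)

# Sample mean: (1/n) * sum of the observations.
mu_hat = sum(x) / n

# Sum of squared deviations from the sample mean.
ss = sum((xi - mu_hat) ** 2 for xi in x)

s = math.sqrt(ss / (n - 1))   # the 1/(n-1) version used in the test statistic
s_biased = math.sqrt(ss / n)  # the 1/n version, shown only for comparison

# The standard library implements both conventions.
assert math.isclose(s, statistics.stdev(x))
assert math.isclose(s_biased, statistics.pstdev(x))

print(f"mean = {mu_hat:.4f}, s (n-1) = {s:.7f}, s (n) = {s_biased:.7f}")
```

The $1/n$ version is always a little smaller, and the gap shrinks as $n$ grows.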

Note how $Z$ fits the general form of a test statistic we talked about in the previous section:

$T=\frac{\hat{\theta}-\theta_0}{\sqrt{\frac{\mathsf{Var}\left[\hat{\theta}|\theta_0\right]}{n}}},\quad \begin{matrix} \hat{\theta}:=\hat{\mu}\\\\ \theta_0:=\mu_0\\\\ \mathsf{Var}\left[\hat{\theta}|\theta_0\right]:=\sigma^2 \end{matrix}$

With these substitutions the denominator becomes $\sqrt{\sigma^2/n}=\sigma/\sqrt{n}$, which is exactly the denominator of $Z$.

What is the decision function for the $Z$ -test? Well, recall the general form of a decision function.

$\mathcal{D}(T,F,\alpha,s)=\begin{cases} \text{reject} \ H_0 \ \text{if} \ 1-F(T)\lt\alpha/s\\\\ \text{accept} \ H_0 \ \text{if} \ 1-F(T)\geq\alpha/s \end{cases}$

Remember, $s$ here refers to the sidedness of the test, which is 1 or 2, not the sample standard deviation.

Our test statistic is $Z$, and our assumed distribution is normal, so our cdf is $F=\Phi$. So our decision function is:

$\mathcal{D}(Z,\Phi,\alpha,s)=\begin{cases} \text{reject} \ H_0 \ \text{if} \ 1-\Phi(Z)\lt\alpha/s\\\\ \text{accept} \ H_0 \ \text{if} \ 1-\Phi(Z)\geq\alpha/s \end{cases}$

Note: Throughout this course and in (almost all) real interviews, you will not be expected to calculate cdfs by hand. You can just describe what conclusions you would draw if you were given the value of $F(T)$.
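The decision function above translates almost line for line into code. The sketch below is one possible transcription, not a reference implementation: the function name `decide` and the parameter name `sides` (used in place of $s$ to avoid clashing with the sample standard deviation) are assumptions made for illustration. The standard normal cdf $\Phi$ comes from Python's built-in `statistics.NormalDist`.

```python
from statistics import NormalDist


def decide(T, cdf, alpha=0.05, sides=1):
    """Apply the rule: reject H0 if 1 - F(T) < alpha / sides, else accept H0.

    T     -- value of the test statistic
    cdf   -- the cdf F of the assumed distribution (a callable)
    alpha -- significance level
    sides -- sidedness of the test, 1 or 2
    """
    if sides not in (1, 2):
        raise ValueError("sides must be 1 or 2")
    upper_tail = 1.0 - cdf(T)
    return "reject H0" if upper_tail < alpha / sides else "accept H0"


# For the Z-test, F is the standard normal cdf.
phi = NormalDist().cdf

print(decide(0.5, phi))           # small Z, large upper-tail probability
print(decide(2.0, phi))           # large Z, small upper-tail probability
print(decide(1.7, phi, sides=2))  # same rule with the threshold alpha / 2
```

Because the rule is written in terms of $1-F(T)$, it looks at the upper tail of the distribution. For a lower-tailed test you would pass $-T$, and for a two-sided test $|T|$.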

Example

Let’s go back to the question of student GPAs. Let’s say our sample is

$3.2, 2.9, 3.7, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8, 3.6, 2.6, 3.5, 2.4, 3.4, 2.3\\\\ 2.2, 4.0, 2.1, 3.8, 2.9, 3.7, 2.5, 3.1, 3.6, 2.7, 3.0, 3.3, 2.8, 3.5, 2.6, 3.4\\\\ 3.2, 2.3, 3.9, 2.2, 4.0, 2.1, 3.7, 2.9, 3.6, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8$

and we want to know if students have a GPA above 3.0 on average. Our hypotheses would be:

$H_0: \mu \leq 3.0$

$H_a: \mu\gt3.0$

Since there are $n=48$ observations here, we can use a one-sided $Z$-test to test these hypotheses. Here $\hat{\mu}=3.064583$ and $s=0.5459676$, so our $Z$-statistic is

$Z=\frac{\hat{\mu}-\mu_0}{s/\sqrt{n}}=\frac{3.064583-3}{0.5459676/\sqrt{48}}=0.819543$

If we set $\alpha=0.05$, then since $1-\Phi(0.819543)\approx 0.2$, which is not less than $0.05$, we fail to reject $H_0$. Thus, we do not have statistically significant evidence to say that the average student has a GPA higher than 3.0.
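If you want to reproduce these numbers yourself, the following sketch runs the same one-sided $Z$-test using only Python's standard library. The variable names are illustrative, and $\sigma$ is estimated by the sample standard deviation $s$, as in the text.

```python
import math
import statistics
from statistics import NormalDist

# The 48 GPAs from the example, one row per line as in the text.
gpas = [
    3.2, 2.9, 3.7, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8, 3.6, 2.6, 3.5, 2.4, 3.4, 2.3,
    2.2, 4.0, 2.1, 3.8, 2.9, 3.7, 2.5, 3.1, 3.6, 2.7, 3.0, 3.3, 2.8, 3.5, 2.6, 3.4,
    3.2, 2.3, 3.9, 2.2, 4.0, 2.1, 3.7, 2.9, 3.6, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8,
]
mu_0 = 3.0    # hypothesized mean
alpha = 0.05  # significance level

n = len(gpas)
mu_hat = statistics.mean(gpas)  # sample mean
s = statistics.stdev(gpas)      # sample standard deviation (1/(n-1) version)

# Z = (mu_hat - mu_0) / (s / sqrt(n))
z = (mu_hat - mu_0) / (s / math.sqrt(n))

# One-sided (upper-tail) p-value: 1 - Phi(Z)
p_value = 1.0 - NormalDist().cdf(z)

print(f"n = {n}, mean = {mu_hat:.6f}, s = {s:.7f}")
print(f"Z = {z:.4f}, p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The computed $Z$ can differ from the hand calculation in the last digit or two, because the hand calculation uses the rounded value of $\hat{\mu}$.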

$t$ tests

Note on notation

The $t$-test is a bit of a notational nightmare to describe because $t$ is used to denote

- The name of the test and distribution (we will use $t$)
- The test statistic of the test (we will use $\tau$)
- The notation of the $t$-distribution (we will use $\mathcal{T}$)
- The pdf of the $t$-distribution (we will use $\varphi_t$)
- The cdf of the $t$-distribution (we will use $\Phi_t$)

To avoid confusion, we’ll use the notation given in parentheses above to refer to these elements in an unambiguous way.

Cheat sheet

Description: Tests if the mean $\mu$ is equal/less than/ greater than $\mu_0$ for small samples ( $n\lesssim30$ )
Statistic: $\mu$ (mean)
Distribution: $\mathcal{T}(n-1)$ ( $t$ )
Sidedness: Either
Null Hypothesis: $H_0: \mu = \mu_0$ (two-sided), $\mu \geq\mu_0,\mu\leq\mu_0$ (one-sided)
Alternative Hypothesis: $H_a: \mu \neq \mu_0$ (two-sided), $\mu\lt\mu_0,\mu\gt\mu_0$ (one-sided)
Test Statistic: $\tau=\frac{\hat{\mu}-\mu_0}{s/\sqrt{n}}\quad (s=\sigma \ \text{if known})$

Description

The $t$-test is a modified version of the $Z$-test that allows for more conservative inferences when the sample size is small. Generally, a sample size of $n=30$ is considered the cut-off between when to use a $Z$-test and when to use a $t$-test, with the $t$-test used for samples smaller than 30. Thus the $t$-test is useful in situations where data collection is costly, like if we are collecting data via manual observation in a scientific experiment.

The only change to the decision function of the $t$-test compared to the $Z$-test is the assumed distribution of the test statistic. Instead of a normal distribution, it is assumed to follow a $t$-distribution, which itself is a modification of the normal distribution.

The $t$ distribution does have a closed-form formula, but it is so complicated that it’s not very useful. It’s more important to focus on its shape and single parameter, $\nu$ , called the degrees of freedom. Because of this, the distribution $\mathcal{T}(\nu)$ can be called a $t$ -distribution with $\nu$ degrees of freedom. Importantly as $\nu$ gets large, the $t$ -distribution resembles a standard normal distribution more and more. That is

$\lim_{\nu\rightarrow\infty}\varphi_t(x|\nu)=\varphi(x)\quad\text{and}\quad \lim_{\nu\rightarrow\infty}\Phi_t(x|\nu)=\Phi(x)$

[Animation showing the $t$-distribution converging to the standard normal distribution]

Image Credit to T.J. Kyner

As you can see in the above animation, when $\nu \lt 30$ the $t$-distribution has very “long tails” compared to the normal distribution. This represents the fact that in small samples, we have a much higher probability of getting “extreme results” that deviate greatly from the mean. These long tails are how the $t$-test compensates for small sample sizes; they make it harder to reject $H_0$ compared to the $Z$-test. The same effect shows up in the variance of the $t$-distribution, which is $\nu/(\nu-2)$ for $\nu\gt2$: it is greater than 1 (the variance of the standard normal) but converges to 1 as $\nu$ gets larger.

Because of this convergence, some professionals think it’s better to just always use a $t$ -test, since it will be equivalent to the $Z$ -test for large sample sizes.

In the $t$ -test, $\nu$ is set to $n-1$ .
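The convergence described above can be checked numerically. The sketch below is an illustration rather than a library-grade implementation; the helper names `t_pdf` and `t_variance` are made up for this example. It evaluates the $t$ density from its standard closed form and compares the tail density at $x=3$, along with the variance $\nu/(\nu-2)$, against the standard normal for increasing $\nu$.

```python
import math
from statistics import NormalDist


def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom.

    Uses Gamma((nu+1)/2) / (sqrt(nu*pi) * Gamma(nu/2)) * (1 + x^2/nu)^(-(nu+1)/2),
    computed on the log scale so that large nu does not overflow.
    """
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_variance(nu):
    """Variance nu / (nu - 2) of the t-distribution; only defined for nu > 2."""
    if nu <= 2:
        raise ValueError("the variance formula requires nu > 2")
    return nu / (nu - 2)


normal_pdf = NormalDist().pdf  # standard normal density

print(f"{'nu':>6} {'variance':>10} {'t pdf at 3':>12} {'normal pdf at 3':>16}")
for nu in (3, 5, 15, 30, 100, 1000):
    print(f"{nu:>6} {t_variance(nu):>10.4f} {t_pdf(3.0, nu):>12.5f} "
          f"{normal_pdf(3.0):>16.5f}")
```

As $\nu$ grows, the variance column approaches 1 and the tail density at $x=3$ approaches the normal value.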

Example

Let’s only consider the first row of student GPAs from the $Z$-test example:

$3.2, 2.9, 3.7, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8, 3.6, 2.6, 3.5, 2.4, 3.4, 2.3$

Here $n = 16,\nu=15,\hat{\mu}=3.05,s=0.4760952$. So

$\tau=\frac{3.05-3}{0.4760952/\sqrt{16}}=0.42008\rightarrow 1-\Phi_t(\tau)\approx0.34$

Since $0.34\not\lt0.05$, we still fail to reject $H_0$ and make the same conclusion as before.

In case you’re curious, if we used the entire sample, our $p$ -value from this $t$ -test would be about 0.2, just like the $Z$ -test!
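Both $t$-test $p$-values quoted above can be reproduced with a short script. Python's standard library has no $t$ cdf, so this sketch approximates $\Phi_t$ by integrating the density with Simpson's rule. That numerical shortcut, and the helper names `t_pdf`, `t_cdf` and `one_sided_t_test`, are assumptions made for illustration; in practice you would normally reach for a statistics package.

```python
import math
import statistics


def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_cdf(x, nu, steps=2000):
    """Approximate cdf: Simpson's rule on [0, |x|] plus symmetry about 0.

    steps must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def one_sided_t_test(sample, mu_0):
    """Return (tau, p-value) for H_a: mu > mu_0, with nu = n - 1."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    tau = (statistics.mean(sample) - mu_0) / se
    return tau, 1.0 - t_cdf(tau, n - 1)


first_row = [3.2, 2.9, 3.7, 2.5, 3.1, 3.8, 2.7, 3.0,
             3.3, 2.8, 3.6, 2.6, 3.5, 2.4, 3.4, 2.3]
other_rows = [
    2.2, 4.0, 2.1, 3.8, 2.9, 3.7, 2.5, 3.1, 3.6, 2.7, 3.0, 3.3, 2.8, 3.5, 2.6, 3.4,
    3.2, 2.3, 3.9, 2.2, 4.0, 2.1, 3.7, 2.9, 3.6, 2.5, 3.1, 3.8, 2.7, 3.0, 3.3, 2.8,
]

tau_16, p_16 = one_sided_t_test(first_row, 3.0)
tau_48, p_48 = one_sided_t_test(first_row + other_rows, 3.0)

print(f"n = 16: tau = {tau_16:.5f}, p-value = {p_16:.2f}")
print(f"n = 48: tau = {tau_48:.5f}, p-value = {p_48:.2f}")
```

With the full sample, $\nu=47$ is large enough that the $t$ and normal tails nearly coincide, which is why the two tests agree.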
