Probability Axioms

What is Probability?

You likely encounter probability in your everyday life, several times a day. Perhaps it is a weatherman telling you there is a 20% chance it will rain today, or a boss telling you that profits are likely to increase by 5% this year. Indeed, humans have a basic understanding that all things have a chance to happen (but that chance may be zero).

However, numerous psychological studies have shown human beings are terrible at probability. For example, when asked to pick a random number between 1 and 10, people are statistically more likely to pick seven than any other number. For this reason, we need a precise way to talk about probability that is independent of human psychology. This language is called probability theory.

Probability Axioms

An axiom is, informally, any statement that is taken as truth without any justification. Axioms help give us a baseline of things we can all agree on so we can make further mathematical arguments. For example, we can’t start talking about addition if we don’t define certain properties of addition, like that $1+2=2+1$ (commutativity) or that the sum of two numbers is also a number (closure).

The probability axioms are a list of “laws” that all mathematicians, statisticians, data scientists, physicists, economists, etc. agree are true about probability and take for granted. We will describe these axioms both formally and informally. To be able to talk about the axioms formally, we must first define some notation.

Notation

A sample space, denoted by the capital Greek letter omega ($\Omega$), is the set of possible outcomes of some experiment, as described in the introduction to this section. Events are usually denoted by uppercase Latin letters, like $A$ and $B$. Strictly speaking, an event is a set of outcomes (a subset of $\Omega$); in this section we mostly work with events that consist of a single outcome and, for simplicity, write them as elements of $\Omega$. For example, when we flip a coin, we can either get heads or tails, which we will call the events $H$ and $T$ respectively. So the sample space for flipping a coin is $\Omega=\{H,T\}$.

A probability function, denoted $\mathbb{P}$, is a function that maps each event in the sample space $\Omega$ to a (real) number between zero and one. Intuitively, $\mathbb{P}$ gives us a way to describe the chance that an event happens. For example, if we assume our coin is fair, $\mathbb{P}(H)=0.5$.
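To make the notation concrete, here is a minimal Python sketch of the fair-coin example. The names `omega`, `P`, and `prob` are illustrative choices, not standard notation, and storing the probability function as a lookup table is just one convenient way to represent it for a small sample space.

```python
from fractions import Fraction

# Sample space for one coin flip: the two possible outcomes.
omega = {"H", "T"}

# Hypothetical probability function for a fair coin, stored as a lookup
# table that gives every outcome the same probability.
P = {outcome: Fraction(1, len(omega)) for outcome in omega}

def prob(outcome):
    """Return the probability assigned to a single outcome."""
    return P[outcome]

print(prob("H"))  # 1/2
```

Using `Fraction` instead of floating-point numbers keeps the arithmetic exact, which is handy when checking that probabilities add up as expected.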

With this notation, we will define what it means to be a “proper” probability.

Axiom 1

Informally, the first probability axiom says that the probability of anything happening is non-negative. Note that non-negative does not mean non-zero: an event can have zero probability of happening. Formally:

$$\mathbb{P}(E)\geq 0,\quad\forall\ E\in\Omega$$

The $\forall$ symbol means “for all”, as in, “for all events in the sample space.”
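As a rough illustration, the first axiom can be checked mechanically when a probability function is stored as a table. The helper name `satisfies_axiom_1` and the three example tables below are made up for this sketch.

```python
def satisfies_axiom_1(P):
    """Return True if every probability in the table is non-negative."""
    return all(p >= 0 for p in P.values())

fair_coin = {"H": 0.5, "T": 0.5}
two_headed_coin = {"H": 1.0, "T": 0.0}  # zero probability is allowed
not_a_probability = {"H": 1.2, "T": -0.2}  # a negative value breaks the axiom

print(satisfies_axiom_1(fair_coin))          # True
print(satisfies_axiom_1(two_headed_coin))    # True
print(satisfies_axiom_1(not_a_probability))  # False
```

Note that the last table still sums to one, so it is the first axiom alone that rules it out.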

Axiom 2

Informally, the second probability axiom says that, in any experiment, something must happen. That is to say, the probability that something happens is $1$. Formally:

$$\mathbb{P}(\Omega)=\sum_{E\in\Omega}\mathbb{P}(E)=1$$
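A similar sketch works for the second axiom: add up the probabilities of all outcomes and compare the total with one. The helper name `satisfies_axiom_2` and the tolerance value are assumptions of this example; the tolerance is there only because floating-point sums such as six copies of `1/6` may not equal `1.0` exactly.

```python
import math

def satisfies_axiom_2(P, tol=1e-9):
    """Return True if the probabilities of all outcomes sum to one."""
    return math.isclose(sum(P.values()), 1.0, abs_tol=tol)

fair_die = {face: 1 / 6 for face in range(1, 7)}
incomplete = {"H": 0.5, "T": 0.4}  # 0.1 of the probability is unaccounted for

print(satisfies_axiom_2(fair_die))    # True
print(satisfies_axiom_2(incomplete))  # False
```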

Axiom 3

Informally, the third probability axiom says that if two events *cannot both occur at the same time*, then the probability that either event happens is the sum of their probabilities. This one is a bit more technical. One way to think of this is that a coin cannot show heads and tails at the same time, so:

$$\mathbb{P}(H\cup T)=\mathbb{P}(H)+\mathbb{P}(T)=0.5+0.5=1$$

This matches our intuition that a coin will certainly show either heads or tails. The technical term is to say $H$ and $T$ are mutually exclusive events, as in, they cannot both occur at the same time.

Formally, if $\{E_1,E_2,\dots\}$ are mutually exclusive events in the sample space $\Omega$ then:

$$\mathbb{P}\left(\bigcup_{i=1}^{\infty}E_i\right)=\sum_{i=1}^{\infty}\mathbb{P}(E_i)$$
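The third axiom is easier to see with a sample space larger than a coin. The sketch below assumes a fair six-sided die and represents each event as a Python set of faces; the names `P_outcome` and `prob_event` are illustrative. Because `prob_event` is defined by summing outcome probabilities, additivity for disjoint events holds by construction here, so the example illustrates the axiom rather than proving it.

```python
from fractions import Fraction

# Assumed model: a fair six-sided die, each face with probability 1/6.
P_outcome = {face: Fraction(1, 6) for face in range(1, 7)}

def prob_event(event):
    """Probability of an event, given as a set of die faces."""
    return sum((P_outcome[face] for face in event), Fraction(0))

low = {1, 2}
high = {5, 6}
even = {2, 4, 6}

# low and high are mutually exclusive, so their probabilities add.
print(low.isdisjoint(high))                                       # True
print(prob_event(low | high) == prob_event(low) + prob_event(high))  # True

# low and even share the face 2, so simply adding would count it twice.
print(low.isdisjoint(even))                                       # False
print(prob_event(low | even) == prob_event(low) + prob_event(even))  # False
```

The second pair shows why the axiom requires mutually exclusive events: when events overlap, the sum of their probabilities overstates the probability of their union.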
