Subscription Retention
Start Timer
0:00:00
Given a table of subscriptions, write a query to get the retention rate of each monthly cohort for each plan_id for the three months after sign-up.
Order your output by start_month, plan_id, then num_month.
If an end_date is in the same month as start_date we say the subscription was not retained in the first month.
If the end_date occurs in the month after the month of start_date, the subscription was not retained in the second month. And so on for the third.
The end_date field is NULL if the user has not canceled.
Schema:
Input:
subscriptions table
| Column | Type |
|---|---|
user_id |
INTEGER |
start_date |
DATETIME |
end_date |
DATETIME |
plan_id |
VARCHAR |
Output:
| Column | Type |
|---|---|
start_month |
DATETIME |
num_month |
INTEGER |
plan_id |
VARCHAR |
retained |
FLOAT |
Example
Input:
subscriptions table
| user_id | start_date | end_date | plan_id |
|---|---|---|---|
| 1 | 2020-01-10 00:00:00 | 2020-03-31 23:59:59 | plan1 |
| 2 | 2020-01-10 00:00:00 | 2020-02-01 23:59:59 | plan1 |
| 3 | 2020-01-10 00:00:00 | 2020-07-31 23:59:59 | plan1 |
Output:
| start_month | num_month | plan_id | retained |
|---|---|---|---|
| 2020-01-01 | 1 | plan1 | 1.0 |
| 2020-01-01 | 2 | plan1 | 0.67 |
| 2020-01-01 | 3 | plan1 | 0.33 |
Explanation:
- Cohort: All subscriptions starting in January 2020.
- Month 1 retention: All 3 users were active in the first month → 3⁄3 = 1.0
- Month 2 retention: Only users 1 and 3 were active in Feb → 2⁄3 ≈ 0.67
- Month 3 retention: Only user 3 was active in Mar → 1⁄3 ≈ 0.33
.
.
.
.
.
.
.
.
.
Comments