Hypothesis testing
2022-07-19 07:23:00 【Alex Tech Bolg】

1 Proportion
Experiment: test the color of a button
- Click through probability: N(users who clicked) / N(total users)
- 1000 users in both control and treatment groups
Results:
- Control group: 1.1% CTP
- Treatment group: 2.3% CTP
Significance:
- Practical significance boundary: 0.01
- Significance level $\alpha$: 0.05
Make a decision:
- Significant difference? Launch the “feature”?
Questions
1. Which hypothesis test to use?
2. What is the null hypothesis?
3. Is the result statistically significant?
4. Is the result practically significant?

- Bernoulli population: either clicks or doesn’t click
- Control group: n*p = 1000 * 1.1% = 11
- Treatment group: n * p = 1000 * 2.3% = 23
- Both np and n(1-p) are larger than 10 in each group, so the samples can be treated as large and the test statistic approximately follows a Z-distribution (standard normal).
What is the difference between a t-test and a z-test?
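The large-sample check above can be sketched in a few lines of Python. The function name `is_large_sample` and the `threshold` parameter are made up for illustration; the rule itself (np and n(1-p) both larger than 10) is the one stated in the notes.

```python
def is_large_sample(successes, n, threshold=10):
    """Return True when both n*p and n*(1-p) exceed the threshold.

    `successes` is n*p (number of clicks), `n - successes` is n*(1-p).
    """
    failures = n - successes
    return successes > threshold and failures > threshold


# Control group: 11 clicks out of 1000; treatment group: 23 out of 1000.
print(is_large_sample(11, 1000), is_large_sample(23, 1000))  # True True
```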
Measurements
- Users who clicked: $X_{ct}$, $X_{tr}$
- Total number of users: $n_{ct}$, $n_{tr}$
$P_{ct} = X_{ct} / n_{ct} = 11 / 1000$
$P_{tr} = X_{tr} / n_{tr} = 23 / 1000$
What is the null hypothesis?
We want to measure the difference between $P_{tr}$ and $P_{ct}$:
$d = P_{tr} - P_{ct}$
Null hypothesis:
$H_0$: $P_{tr} = P_{ct}$, i.e. $d = 0$
Under $H_0$: $d \sim N(0, SE^2)$
We don’t know the standard deviation of d, so we need to estimate it.
Test statistic:
$TS = (P_{tr} - P_{ct}) / SE$
Estimate a standard error:
- Choose an SE that can represent both groups
- “Pooled” standard error
Compute “pooled” SE
- “Pooled” probability of a click, $P'$
- Total probability across the 2 groups:
$P' = (X_{ct} + X_{tr}) / (n_{ct} + n_{tr}) = (11 + 23) / (1000 + 1000) = 0.017$
- Pooled standard error:
$SE = \sqrt{P'(1 - P')(1/n_{ct} + 1/n_{tr})} = \sqrt{0.017 \times 0.983 \times 0.002} \approx 0.00578$
Test statistic
$TS = (P_{tr} - P_{ct}) / SE = 0.012 / 0.00578 = 2.076$
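As a quick check of the numbers, the pooled SE and the test statistic can be computed as below. This is a minimal sketch using only the standard library; the function name `pooled_z_test` is an assumption for illustration, and the pooled SE formula used is the standard one for a two-proportion z-test, $\sqrt{P'(1-P')(1/n_{ct} + 1/n_{tr})}$.

```python
import math


def pooled_z_test(x_ct, n_ct, x_tr, n_tr):
    """Two-proportion z-test with a pooled standard error.

    Returns (d, se, ts): the observed difference, the pooled SE,
    and the test statistic d / se.
    """
    p_ct = x_ct / n_ct                       # click-through probability, control
    p_tr = x_tr / n_tr                       # click-through probability, treatment
    p_pool = (x_ct + x_tr) / (n_ct + n_tr)   # pooled probability P'
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_ct + 1 / n_tr))
    d = p_tr - p_ct
    return d, se, d / se


d, se, ts = pooled_z_test(11, 1000, 23, 1000)
print(f"d = {d:.3f}, SE = {se:.5f}, TS = {ts:.3f}")
# d = 0.012, SE = 0.00578, TS = 2.076
```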
Is the result statistically significant?
- Critical z-score ($\alpha$ = 0.05, two-sided) = 1.96
- If TS > 1.96 or TS < -1.96, reject the null hypothesis
- In this example, TS = 2.076 > 1.96, so the result is statistically significant.
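The comparison with the critical value can also be done in code. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+); the function name `two_sided_z_test` is made up for illustration. It also returns a two-sided p-value, which the notes do not compute but which leads to the same decision.

```python
from statistics import NormalDist


def two_sided_z_test(ts, alpha=0.05):
    """Compare a z test statistic with the two-sided critical value.

    Returns (z_crit, p_value, reject), where reject is True when
    |ts| exceeds the critical z-score for the given alpha.
    """
    std_normal = NormalDist()                    # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    p_value = 2 * (1 - std_normal.cdf(abs(ts)))  # two-sided tail probability
    return z_crit, p_value, abs(ts) > z_crit


z_crit, p_value, reject = two_sided_z_test(2.076)
print(f"critical z = {z_crit:.2f}, reject H0: {reject}")
# critical z = 1.96, reject H0: True
```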
Is the result practically significant?
Confidence interval of d
Center of C.I. = 0.012 (this is $P_{tr} - P_{ct}$)
Half-width of C.I. (margin of error), where $S_{pool}$ is the pooled SE:
$m = Z \times S_{pool} = 1.96 \times 0.00578 = 0.0113$
CI of d: 0.012 ± 0.0113 = [0.0007, 0.0233]
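The interval can be reproduced with a short helper. This is only a sketch: the function name `diff_confidence_interval` is made up, and it follows the notes in using the pooled SE (0.00578) for the margin of error.

```python
def diff_confidence_interval(d, se, z=1.96):
    """Return (lower, upper, margin) for the interval d +/- z * se."""
    margin = z * se
    return d - margin, d + margin, margin


lower, upper, margin = diff_confidence_interval(0.012, 0.00578)
print(f"margin = {margin:.4f}, CI = [{lower:.4f}, {upper:.4f}]")
# margin = 0.0113, CI = [0.0007, 0.0233]
```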

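Best guess: the change is practically significant, since the center of the C.I. (0.012) is above the practical significance boundary (0.01).
However, the change may not be practically significant, since the lower bound of the C.I. (0.0007) is below the boundary.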
Make launch decision:
- We are not confident that the change is practically significant.
- Recommendation: do not launch the feature.
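The decision logic can be written down explicitly. The sketch below is one possible reading of the rule used in these notes, assuming a positive change is the desired direction: launch only when the whole confidence interval lies at or above the practical significance boundary. The function name and the dictionary keys are made up for illustration.

```python
def launch_assessment(ci_lower, ci_upper, practical_boundary):
    """Classify a confidence interval of d against 0 and a practical boundary.

    Assumes a larger d is better (e.g. a higher click-through probability).
    """
    center = (ci_lower + ci_upper) / 2
    return {
        # The interval excludes 0, so H0 (d = 0) is rejected.
        "statistically_significant": ci_lower > 0 or ci_upper < 0,
        # The point estimate alone clears the practical boundary.
        "best_guess_practical": center >= practical_boundary,
        # The whole interval clears the practical boundary.
        "confident_practical": ci_lower >= practical_boundary,
    }


result = launch_assessment(0.0007, 0.0233, practical_boundary=0.01)
launch = result["statistically_significant"] and result["confident_practical"]
print(result, "launch:", launch)
```

For the button experiment this reports a statistically significant result whose best guess is practically significant, but without enough confidence to launch.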
Checking statistical significance:
- Check whether the CI contains 0: if it does, the result is not statistically significant.
- Equivalent to comparing TS with critical value.
2 Mean
Experiment: test whether a new feature changes the average number of posts


Correction: Mean of treatment is 1.7
What conclusion can you draw?
- Assume variances are similar.
Significance:
- Practical significance boundary: 0.05
- Significance level $\alpha$: 0.05





Correction: $S_{pool}$ = 1.06
SS: sum of squares

The margin of error is $t \times S_{pool} \times \sqrt{1/n_c + 1/n_t}$, which comes to about 0.51. The confidence interval $\hat{d} \pm 0.51$ with $\hat{d} = 0.6$ is roughly [0.09, 1.11], so its lower bound is above the practical significance boundary of 0.05.
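The pooled standard deviation and the margin of error for the difference of two means can be sketched as follows. The sample sizes and the t critical value in the usage example are hypothetical (the original figures are not available in the text), and the function names are made up. The pooled standard deviation uses the usual formula $\sqrt{(SS_c + SS_t) / (n_c + n_t - 2)}$, with SS the sum of squares of each group.

```python
import math


def pooled_sd(ss_c, n_c, ss_t, n_t):
    """Pooled standard deviation from the two groups' sums of squares."""
    return math.sqrt((ss_c + ss_t) / (n_c + n_t - 2))


def mean_diff_margin(t_crit, s_pool, n_c, n_t):
    """Margin of error for the difference of two means (equal variances)."""
    return t_crit * s_pool * math.sqrt(1 / n_c + 1 / n_t)


# Hypothetical inputs: 35 users per group and a t critical value of 2.0.
# Only S_pool = 1.06 and d-hat = 0.6 come from the notes.
margin = mean_diff_margin(t_crit=2.0, s_pool=1.06, n_c=35, n_t=35)
d_hat = 0.6
print(f"CI of d: [{d_hat - margin:.2f}, {d_hat + margin:.2f}]")
```

The t critical value is passed in by the caller because the Python standard library has no t-distribution; it would normally come from a t-table or a statistics package, with $n_c + n_t - 2$ degrees of freedom.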
3 Welch’s T-test
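The content of this section is not available in the text, so the following is only a generic sketch of Welch's t-test, which drops the equal-variance assumption of section 2. It computes the standard Welch statistic $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$ and the Welch–Satterthwaite degrees of freedom. The function name and the example numbers are made up for illustration.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    var1 and var2 are the sample variances of the two groups.
    """
    a = var1 / n1                     # squared standard error of mean 1
    b = var2 / n2                     # squared standard error of mean 2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two groups of 10 users each.
t, df = welch_t(mean1=5.0, var1=4.0, n1=10, mean2=3.0, var2=4.0, n2=10)
print(f"t = {t:.3f}, df = {df:.1f}")
```

With equal variances and equal sample sizes, the degrees of freedom reduce to $n_1 + n_2 - 2$, the same as in the pooled test.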



Reference: https://www.youtube.com/watch?v=6uw0A3aKwMc