Summary of Statistics for Interview
2022-07-19 07:23:00 【Alex Tech Bolg】
P-value
- The p-value is the probability of obtaining results at least as extreme as the ones observed in a test, assuming our current hypothesis is true.
- The smaller the p-value, the less likely we are to get results like the observed ones if our current hypothesis holds.
- In statistics, our current hypothesis is the null hypothesis. A p-value smaller than 0.05 (the conventional significance level) means that we reject the null hypothesis in favor of the alternative hypothesis.
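As a rough illustration of the definition above, the sketch below computes an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips, null hypothesis "the coin is fair") and the function name `binomial_two_sided_p_value` are assumptions made for this example only.

```python
from math import comb


def binomial_two_sided_p_value(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Exact two-sided p-value for a binomial test.

    Sums the probability, under the null hypothesis, of every outcome
    that is no more likely than the observed one.
    """
    def pmf(k: int) -> float:
        return comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)

    observed = pmf(heads)
    # Small tolerance so that ties are not lost to floating-point rounding.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))


# Hypothetical experiment: 60 heads in 100 flips of a supposedly fair coin.
p_value = binomial_two_sided_p_value(60, 100)
print(f"p-value = {p_value:.4f}")
print("reject H0 at 0.05" if p_value < 0.05 else "fail to reject H0 at 0.05")
```

In this example the p-value lands slightly above 0.05, so the null hypothesis would not be rejected at the 0.05 threshold even though 60 heads may look unusual.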
Explain p-value to non-tech people
Let’s say your p-value < 0.05, how would you explain p-value to someone who doesn’t understand statistics?
https://quantifyinghealth.com/p-value-explanation/
- The p-value tells us how likely it is that results as unusual as ours would appear just by chance, if nothing were really going on.
- The smaller the p-value, the less likely it is that results this extreme appeared just by chance.
(The p-value is the probability of obtaining results at least as extreme as the observed ones, assuming our current hypothesis is true.
The smaller the p-value, the less likely we are to get such results under our current hypothesis.)
- We typically set 0.05 as a threshold to decide whether the results are unusual or not. If the p-value is smaller than 0.05, we consider it very unlikely that the results appeared by chance alone.
(In statistics, our current hypothesis is the null hypothesis, and a p-value smaller than 0.05 means that we reject the null hypothesis in favor of the alternative hypothesis.)
Power of a test / statistical power
The statistical power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis when a specific alternative hypothesis is true. It is commonly denoted by $1-\beta$, where $\beta$ is the probability of a type II error.
Statistical power ranges from 0 to 1, and as the power of a test increases, the probability $\beta$ of making a type II error (wrongly failing to reject the null hypothesis) decreases.
In other words, power is $P(\text{reject } H_0 \mid H_1 \text{ is true})$.
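To make the definition concrete, here is a minimal sketch of the power of a one-sided z-test with known standard deviation. The choice of test, the function name `one_sided_z_test_power`, and the numbers in the usage lines are assumptions for illustration, not something the notes above prescribe.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test of H0: mu = mu0 against H1: mu = mu0 + effect.

    Assumes a known population standard deviation `sigma` and sample size `n`.
    """
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha)  # rejection threshold under H0
    shift = effect * sqrt(n) / sigma            # mean of the z statistic under H1
    beta = std_normal.cdf(z_critical - shift)   # probability of a type II error
    return 1 - beta


# Hypothetical effect of half a standard deviation at several sample sizes.
for n in (10, 25, 50):
    print(n, round(one_sided_z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

With a zero effect the function returns `alpha`, which matches the intuition that a test rejects a true null hypothesis at exactly the type I error rate.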
Standard Error
- https://en.wikipedia.org/wiki/Standard_error
- https://stats.stackexchange.com/questions/29641/standard-error-for-the-mean-of-a-sample-of-binomial-random-variables
The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution or an estimate of that standard deviation. If the statistic is the sample mean, it is called the standard error of the mean (SEM).
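As a small sketch of the two cases the links above discuss, the code below estimates the standard error of the mean as the sample standard deviation divided by $\sqrt{n}$, and the standard error of a sample proportion as $\sqrt{\hat{p}(1-\hat{p})/n}$. The function names and the sample data are made up for this example.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(sample: list[float]) -> float:
    """Estimated SEM: sample standard deviation divided by sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))


def standard_error_of_proportion(p_hat: float, n: int) -> float:
    """Estimated SE of a sample proportion from n Bernoulli trials."""
    return sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical measurements and a hypothetical conversion rate.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_error_of_mean(measurements))
print(standard_error_of_proportion(p_hat=0.5, n=100))
```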

What are covariance and correlation? How are they related?
- Covariance is a quantitative measure of the extent to which the deviation of one variable from its mean matches the deviation of the other from its mean.
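- Correlation is a measurement of the strength of the linear relationship between two variables. It is the covariance of the two variables, normalized by the product of their standard deviations, so it always lies between -1 and 1 and does not depend on the units of measurement.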
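The relationship between the two quantities can be sketched in a few lines. This is a plain-Python illustration using sample (n - 1) estimates; the helper names `covariance` and `correlation` and the data are assumptions for the example.

```python
from math import sqrt


def covariance(x: list[float], y: list[float]) -> float:
    """Sample covariance: average product of deviations from the means (n - 1 denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x: list[float], y: list[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical data with an exact linear relationship y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [2 * v + 1 for v in x]
y_rescaled = [10 * v for v in y]  # same relationship, different units

print(covariance(x, y), correlation(x, y))
print(covariance(x, y_rescaled), correlation(x, y_rescaled))
```

Rescaling `y` multiplies the covariance by the same factor but leaves the correlation unchanged, which is the practical reason for the normalization.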
What is the law of large numbers?
- The Law of Large Numbers is a theorem that states that as the number of independent trials increases, the average of the results gets closer to the expected value.
- E.g. the proportion of heads when flipping a fair coin 100,000 times is expected to be closer to 0.5 than when flipping it only 100 times.
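The coin example can be checked with a quick simulation. This is only a sketch: the function name `heads_fraction` and the fixed seed are choices made here so that the run is repeatable, and a single simulated run is not a proof of the theorem.

```python
import random


def heads_fraction(n_flips: int, seed: int = 0) -> float:
    """Simulate n_flips fair coin flips and return the fraction that came up heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# The fraction of heads tends to settle near 0.5 as the number of flips grows.
for n in (100, 10_000, 100_000):
    fraction = heads_fraction(n)
    print(f"{n:>7} flips: fraction of heads = {fraction:.4f}, distance from 0.5 = {abs(fraction - 0.5):.4f}")
```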
Q: What is the Central Limit Theorem? Explain it. Why is it important?
- The central limit theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size gets larger, no matter what the shape of the population distribution is, provided the observations are independent, identically distributed, and have finite variance.
- The central limit theorem is important because it is used in hypothesis testing and also to calculate confidence intervals.
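A simulation makes the statement above easier to see. The sketch below draws many samples from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here as an assumption) and looks at the distribution of the sample means; `sample_means` is a hypothetical helper name.

```python
import random
from math import sqrt
from statistics import mean, stdev


def sample_means(sample_size: int, n_samples: int, seed: int = 0) -> list[float]:
    """Draw n_samples samples from Exp(1) and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]


sample_size = 50
means = sample_means(sample_size, n_samples=2000)

# The CLT predicts the means cluster around 1 with spread close to 1 / sqrt(sample_size).
print("mean of sample means:", round(mean(means), 3))
print("std of sample means: ", round(stdev(means), 3))
print("predicted std:       ", round(1 / sqrt(sample_size), 3))

# Under approximate normality, about 95% of means fall within 1.96 predicted standard errors.
within = sum(abs(m - 1.0) <= 1.96 / sqrt(sample_size) for m in means) / len(means)
print("share within 1.96 SE:", round(within, 3))
```

This approximate normality of the sample mean is what justifies the usual z-based confidence intervals and hypothesis tests mentioned above.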
CTR / CTP (click-through rate / click-through probability)

- https://regularization.medium.com/udacity-a-b-testing-notes-lession-1-1e8ca8f8a704
- The difference is that CTR cares about clicks and CTP cares about visitors. A visitor may click and view the page multiple times. In general, a rate is used to measure usability and a probability is used to measure impact. For example, use a rate to answer how often a user finds a specific button on a web page with many buttons; use a probability to answer how many users progress to the next page.
- For CTR, engineers modify the website to capture a page view event and a click event.
- For CTP, you further need to match each page view with all of its child clicks, so that you count at most one child click per page view.
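The two bullet points on instrumentation can be sketched as follows. The event format (a list of page-view ids plus, for each click, the id of its parent page view) and the function name `ctr_and_ctp` are hypothetical; a real logging pipeline will look different.

```python
from collections import Counter


def ctr_and_ctp(page_views: list[str], click_parents: list[str]) -> tuple[float, float]:
    """Compute click-through rate and click-through probability.

    page_views: one id per page-view event.
    click_parents: for each click event, the id of the page view it belongs to.
    """
    clicks_per_view = Counter(click_parents)

    # CTR: every click counts, so repeated clicks on one page view inflate it.
    ctr = len(click_parents) / len(page_views)

    # CTP: at most one click is counted per page view.
    views_with_click = sum(1 for view in page_views if clicks_per_view[view] > 0)
    ctp = views_with_click / len(page_views)
    return ctr, ctp


# Hypothetical log: three page views; the first is clicked three times, the third once.
views = ["pv1", "pv2", "pv3"]
clicks = ["pv1", "pv1", "pv1", "pv3"]
ctr, ctp = ctr_and_ctp(views, clicks)
print(f"CTR = {ctr:.2f}, CTP = {ctp:.2f}")
```

The repeated clicks on `pv1` push the rate above 1, while the probability stays between 0 and 1 because each page view contributes at most once.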
What’s the major drawback of A/B testing?
- A/B test results do not tell you in absolute terms which version is better. They tell you which version is better given your current user base, which is the data you use to test.
- A/B testing can take lots of time and resources
A/B testing can take a lot longer to set up than other forms of testing. Setting up the A/B system can be a resource and time hog, although third-party services can help. Depending on the company size, there may be endless meetings about which variables to include in the tests. Once a set of variables has been agreed on, designers and coders will effectively need to work on double the amount of material. In addition, to get conclusive results, tests can take weeks or months on low-traffic sites.
https://www.experienceux.co.uk/ux-blog/the-pros-and-cons-of-ab-testing/
- A/B testing can make you forget about the big picture
https://medium.com/@madsbuchstage/the-limits-of-a-b-testing-9f96691c9a0c
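To give a feel for why tests on low-traffic sites can run for weeks or months, here is a back-of-the-envelope sketch using the standard normal-approximation sample size formula for comparing two proportions. The baseline rate, target rate, traffic figures, and function names are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(p_control: float, p_variant: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate visitors needed per variant for a two-sided two-proportion test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = std_normal.inv_cdf(power)           # quantile matching the desired power
    variance_sum = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_variant) ** 2)


def days_needed(n_per_variant: int, daily_visitors: int) -> int:
    """Days to collect both variants when traffic is split evenly between them."""
    return ceil(2 * n_per_variant / daily_visitors)


# Hypothetical goal: detect a lift from a 10% to a 12% conversion rate.
n = sample_size_per_variant(0.10, 0.12)
print("visitors per variant:", n)
for traffic in (200, 20_000):
    print(f"{traffic} visitors/day -> about {days_needed(n, traffic)} day(s)")
```

Under these assumptions a site with a few hundred visitors per day needs more than a month, while a high-traffic site collects the same sample within a day.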