On the χ² Distribution and Statistic
Published: 2019-06-13


Recently I was helping my wife review some research papers in her field, physiotherapy. Some of them involve a certain amount of statistical analysis, in particular the χ² statistic, which I later found is not quite a trivial topic. I realized I had forgotten most of what I learned about probability and statistics at university, so now I have to pick some of it up again. Fortunately, Wikipedia is always very handy for such needs.

First of all, what is the χ² distribution? (It is what the χ² statistic is based on.)

In short, the χ² distribution of order k, or the χ² distribution with k degrees of freedom (k is a positive integer), is the distribution of the sum of the squares of k independent standard normal random variables (random variables with the standard normal distribution). When k is 1 it reduces to the distribution of the square of a single standard normal random variable.
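The definition can be checked with a small simulation. The sketch below is only an illustration: the function names, the number of draws and the seed are assumptions of mine, not something from the Wikipedia articles. It sums the squares of k standard normal draws and compares the sample mean and variance with the known theoretical values k and 2k.

```python
import random


def chi_square_sample(k, rng):
    """Draw one value: the sum of squares of k independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate(k, n_draws=20000, seed=0):
    """Return the sample mean and sample variance of n_draws simulated values."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    draws = [chi_square_sample(k, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((x - mean) ** 2 for x in draws) / (n_draws - 1)
    return mean, var


if __name__ == "__main__":
    for k in (1, 3, 5):
        mean, var = simulate(k)
        print(f"k={k}: mean={mean:.2f} (theory {k}), "
              f"variance={var:.2f} (theory {2 * k})")
```

The simulated values should land close to the theoretical ones, with the gap shrinking as the number of draws grows.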

Its probability density function and cumulative distribution function are both given in the Wikipedia article.

However, what is interesting is its use as a mathematical tool in statistical tests.

Think about the following scenario.

Suppose the hypothesis is that, in a certain area, the ratio of the number of men to that of women is 1.1:1. We can use the test built on the χ² statistic to check whether a sample gives us grounds to reject this 'theory' or statement.

To answer this question, the key is to construct a formula of a similar form to χ², in which the errors or differences correspond to the individual random variables in χ².

At the same time, we draw a sample of people from that area, with a size that makes the expected frequencies easy to work with, such as 105: the ideal match to the theoretical ratio would then be 55 men and 50 women.
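The expected counts follow directly from the ratio and the sample size. A minimal sketch, where the helper name `expected_counts` is hypothetical and exact fractions are used to avoid floating-point rounding:

```python
from fractions import Fraction


def expected_counts(ratio_men, ratio_women, sample_size):
    """Split sample_size according to the hypothesized ratio_men : ratio_women."""
    total = ratio_men + ratio_women
    return (sample_size * ratio_men / total,
            sample_size * ratio_women / total)


men, women = expected_counts(Fraction(11, 10), Fraction(1), 105)
print(men, women)  # prints: 55 50
```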

The formula mentioned above is defined as follows (note that this test statistic is also called χ², since this is a χ² test):

χ² = (Number of Men from the Sample - 55)^2 / 55 + (Number of Women from the Sample - 50)^2 / 50, provided the size of the sample is 105.

If the statement is true, each of the two terms in the sum above behaves like a scaled square of an approximately normal random variable. The two terms are not independent, however, but completely dependent: because the sample size is fixed, once the number of men is known the number of women is determined.

So if we end up with 59 men and 46 women in the sample, we get χ² = 0.61. Looking up the CDF of the χ² distribution with 1 degree of freedom, we find that the probability of χ² exceeding 0.61 is around 0.43, which is well above the conventional thresholds for statistical significance (typically 0.05, and sometimes as strict as 0.001). This probability is denoted by p in much of the literature. So normally we would not reject the null hypothesis.
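The arithmetic can be reproduced with the standard library alone. This is a sketch, and the function names are mine. For 1 degree of freedom the tail probability has a closed form: the variable is the square of a standard normal Z, so P(χ² > x) = P(|Z| > √x) = erfc(√(x/2)). For other degrees of freedom a statistics library would be the usual route.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    Valid for 1 degree of freedom only: P(Z^2 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


stat = chi_square_statistic([59, 46], [55, 50])
p = p_value_df1(stat)
print(f"chi2 = {stat:.2f}, p = {p:.2f}")
```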

Hmm, this interpretation does not seem to make much sense at first (especially the fact that we use 1 degree of freedom when there are actually two terms involved), but that is what I understand from the Wikipedia articles. I will review and correct it after further study of the subject.
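One possible way to make sense of the single degree of freedom, offered as a sketch rather than a full proof: write n for the sample size, p and q = 1 - p for the hypothesized proportions of men and women, and O_m and O_w for the observed counts (this notation is mine, not Wikipedia's). Because O_m + O_w = n, the two deviations are equal in size and opposite in sign, so the two terms collapse into one square:

```latex
\begin{aligned}
\chi^2 &= \frac{(O_m - np)^2}{np} + \frac{(O_w - nq)^2}{nq},
          \qquad O_m + O_w = n,\quad p + q = 1 \\
       &= \frac{d^2}{np} + \frac{d^2}{nq},
          \qquad d = O_m - np = -(O_w - nq) \\
       &= \frac{d^2\,(p + q)}{npq}
        = \left( \frac{O_m - np}{\sqrt{npq}} \right)^2
\end{aligned}
```

Under the hypothesis, O_m follows a binomial distribution with mean np and variance npq, so the quantity inside the last square is approximately standard normal for a reasonably large sample. The statistic therefore behaves like the square of a single standard normal variable, which is the χ² distribution with 1 degree of freedom. With n = 105, p = 55/105 and d = 4 this gives 16 / (55 × 50 / 105) ≈ 0.61, the same value as before.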

Reposted from: https://www.cnblogs.com/quanben/archive/2013/02/08/3128843.html
