9.8 The Binomial Distribution

The binomial random variable is defined as the sum of repeated Bernoulli trials, so it counts the successes (outcome = 1) in a sample of these trials. The size argument of the binom functions tells R how many Bernoulli trials each sample contains.

Random Samples: rbinom

Notice we used the binom functions with size = 1 to explore the Bernoulli distribution above, so we just need to change this argument to sample from a binomial distribution:

rbinom(n = 15, size = 20, prob = 0.7)
##  [1] 14 13 17 12 16 14 18 11 11 14 13 14 17 14 15

This represents the process of taking 15 samples, each with 20 trials, where the probability of success in each trial is 0.7, and the outcomes are the number of successes in each sample.
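The definition of a binomial outcome as a sum of Bernoulli trials can be reproduced by hand. The following is a minimal sketch, not the method rbinom uses internally; the seed and the variable names (trials, one_outcome, manual_draws) are arbitrary choices made for illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# One binomial outcome built by hand: 20 Bernoulli trials (size = 1),
# then count the successes
trials <- rbinom(n = 20, size = 1, prob = 0.7)
one_outcome <- sum(trials)

# Repeat 15 times to mimic the shape of rbinom(n = 15, size = 20, prob = 0.7)
manual_draws <- replicate(15, sum(rbinom(n = 20, size = 1, prob = 0.7)))
manual_draws
```

The individual numbers will differ from a direct rbinom(n = 15, size = 20, prob = 0.7) call, but both follow the same distribution.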


Note: In the traditional notation for the binomial distribution B(n, p), we write $P(X = k; n, p) = \binom{n}{k} p^{k} (1 - p)^{n - k}$. In this context, n refers to the number of (Bernoulli) trials in the sample.
But in R, size refers to the number of trials in the sample, and n instead refers to the number of outcomes you want to randomly draw from the binomial distribution.
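The different roles of n and size can be seen directly in a small example. This is only an illustrative sketch; the seed and the name draws are arbitrary.

```r
set.seed(1)  # arbitrary seed for reproducibility

# n = 5 outcomes, each counting successes in size = 20 Bernoulli trials
draws <- rbinom(n = 5, size = 20, prob = 0.7)

length(draws)  # n controls how many outcomes are returned
range(draws)   # every outcome lies between 0 and size
```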


We can plot these outcomes if we simulate too many to examine directly,

dat <- rbinom(n = 1000, size = 20, prob = 0.7)
barplot(table(dat), ylab = "counts")

Density Functions: dbinom

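The probability mass function of the binomial distribution (referred to as the PDF in this section, following R's "density" naming) is given by: $f(x \mid n, p) = \Pr(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$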

The function that computes this automatically is dbinom(). The d stands for “density” and the binom stands for “binomial”. Suppose we want to know the probability of getting exactly 12 successes in 20 trials. We can calculate this easily with,

dbinom(x = 12, size = 20, prob = 0.7)
## [1] 0.1143967
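This value can be checked against the formula by evaluating the binomial coefficient and the powers directly. A minimal sketch, assuming only base R's choose(); the variable names are hypothetical.

```r
n_trials  <- 20   # number of Bernoulli trials (size in R)
k         <- 12   # number of successes
p_success <- 0.7  # probability of success in each trial

# Binomial formula evaluated by hand
manual <- choose(n_trials, k) * p_success^k * (1 - p_success)^(n_trials - k)

# Same probability from the built-in function
builtin <- dbinom(x = k, size = n_trials, prob = p_success)

all.equal(manual, builtin)  # compares up to floating-point tolerance
```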

In fact, we can easily obtain and graph the probability of every possible outcome in this binomial distribution,

barplot(height = dbinom(0:20, size = 20, prob = 0.7), 
        names.arg = 0:20,
        main = "Binomial PDF", xlab = 'X', ylab = 'Probability')
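As a rough check, the relative frequencies of simulated outcomes from rbinom can be set against these exact probabilities. This is a sketch under assumptions: the seed, the number of draws, and the names sim, empirical, and theoretical are arbitrary choices, not part of the original example.

```r
set.seed(123)      # arbitrary seed for reproducibility
n_draws <- 10000   # arbitrary number of simulated outcomes

sim <- rbinom(n = n_draws, size = 20, prob = 0.7)

# Relative frequency of each value 0..20; factor() keeps values never drawn
empirical   <- as.numeric(table(factor(sim, levels = 0:20))) / n_draws
theoretical <- dbinom(0:20, size = 20, prob = 0.7)

# Side-by-side bars: simulated frequency vs. exact probability
barplot(rbind(empirical, theoretical), beside = TRUE, names.arg = 0:20,
        legend.text = c("simulated", "dbinom"),
        xlab = "X", ylab = "Probability")

max(abs(empirical - theoretical))  # largest gap between the two
```

With more draws the gap should shrink, since the simulated frequencies converge to the exact probabilities.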

Cumulative Distribution Functions: pbinom

The cumulative distribution function (CDF) of the binomial distribution is given by: $F(q \mid n, p) = \Pr(X \le q) = \sum_{k=0}^{q} f(k \mid n, p)$

Suppose we want to know the probability of getting at most 12 successes in 20 trials. We can obtain this easily with,

pbinom(q = 12, size = 20, prob = 0.7)
## [1] 0.2277282
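The same function also gives related probabilities, such as "more than 12" or "between 10 and 12" successes. A short sketch, using pbinom's lower.tail argument; the variable names are hypothetical.

```r
# P(X <= 12)
at_most_12 <- pbinom(q = 12, size = 20, prob = 0.7)

# P(X > 12): the upper tail, the complement of the value above
more_than_12 <- pbinom(q = 12, size = 20, prob = 0.7, lower.tail = FALSE)

# P(10 <= X <= 12) = F(12) - F(9)
between_10_12 <- pbinom(12, size = 20, prob = 0.7) -
  pbinom(9, size = 20, prob = 0.7)

at_most_12 + more_than_12  # the two tails together cover all outcomes
```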

In fact, we can easily obtain and graph the entire CDF,

barplot(height = pbinom(0:20, size = 20, prob = 0.7), 
        names.arg = 0:20,
        main = "Binomial CDF", xlab = 'X', ylab = 'Probability')

We can illustrate the relationship between the PDF and the CDF in the following plot,

par(mfrow = c(1, 2))  # two plots side by side
barplot(height = dbinom(0:20, size = 20, prob = 0.7), 
        names.arg = 0:20, 
        ylim = c(0, 1),
        main = "Binomial PDF", xlab = 'X', ylab = 'Probability',
        col = c(rep("blue", 15), rep("gray", 6)))  # X = 0..14 in blue
barplot(height = pbinom(0:20, size = 20, prob = 0.7), 
        names.arg = 0:20, 
        ylim = c(0, 1),
        main = "Binomial CDF", xlab = 'X', ylab = 'Probability',
        col = c(rep("gray", 14), "blue", rep("gray", 6)))  # X = 14 in blue
par(mfrow = c(1, 1))  # restore the default single-plot layout

Notice that the value of the CDF at X = 14 (the blue bar on the right) equals the sum of the PDF from X = 0 to X = 14 (the blue bars on the left).
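This relationship can also be confirmed numerically: the running total of the dbinom values should reproduce the pbinom values. A minimal sketch; the names pmf and cdf are chosen here for illustration.

```r
pmf <- dbinom(0:20, size = 20, prob = 0.7)  # P(X = x) for x = 0..20
cdf <- pbinom(0:20, size = 20, prob = 0.7)  # P(X <= x) for x = 0..20

# The cumulative sum of the probabilities matches the CDF
all.equal(cumsum(pmf), cdf)

# Element 15 corresponds to X = 14, because the vectors start at X = 0
c(sum(pmf[1:15]), cdf[15])
```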

Properties of Distributions

Note that the probabilities of all possible outcomes sum to 1,

sum(dbinom(0:20, size = 20, prob = 0.7))
## [1] 1

And we can obtain the expectation of this binomial distribution, using the general definition: $E(X) = \sum_{i} x_i P(X = x_i)$

sum(0:20 * dbinom(0:20, size = 20, prob = 0.7))
## [1] 14

Note that this is also what we get using the specific formula for the expectation of a binomial: $E(X) = np = 20 \times 0.7 = 14$.

The variance can be calculated using the general form: $Var(X) = \sum_{i} (x_i - E(X))^2 P(X = x_i)$

sum((0:20 - 20 * 0.7)^2 * dbinom(0:20, size = 20, prob = 0.7))
## [1] 4.2

This equals the specific formula for the variance of a binomial: $Var(X) = np(1 - p) = 20 \times 0.7 \times 0.3 = 4.2$.
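These exact values can be compared with estimates from a large random sample. The following is only a sketch: the seed and the sample size are arbitrary assumptions, and the estimates are approximate, so they will be close to 14 and 4.2 but not identical to them.

```r
set.seed(2024)  # arbitrary seed for reproducibility

# A large simulated sample from the same binomial distribution
sim <- rbinom(n = 100000, size = 20, prob = 0.7)

mean(sim)  # sample mean, should be close to n * p = 14
var(sim)   # sample variance, should be close to n * p * (1 - p) = 4.2
```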