Mean is probably the most widely used point estimate, but it alone only conveys parts of the story. Another thing we would like to know is how good/accurate this estimate is. Standard error and confidence interval (CI) could be used to answer this question.

Multiple sampling

Given a population, one would create M samples of size N, and calculate the mean for each sample (over N sample points). With this M means, one can calculate its mean, and standard deviation; this mean is the mean estimate of the population, and the standard deviation is the standard error. (Yes, standard error is a form of standard deviation.) With this M means, one can calculate the confidence interval using percentile method, identifying the interval containing 95% data points in the middle basically.

Bootstrapping single sample

Creating M samples could be too expensive to be practical, so one alternative is to use bootstrapping to simulate sampling M times from a single sample. Once we have M samples, we could follow the same procedure above.

Matlab/Octave Code

Code for two approaches is shown below. The population follows normal distribution (0, 1), so std_1, the standard deviation of the first sample is ~1. For samples of size 100, standard error is ~0.1; for samples of size 400, standard error is ~0.05. The relation is not accidental; standard error decreases by 1/sqrt(N). Since both standard error and confidence interval reflects the accuracy, they share the same relation; CI length decreases by 1/sqrt(N).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
% just to convince octave that this is a local function
1;

function ret = ci(x)
bootstrap_x = mean(empirical_rnd(x, max(size(x)), 10000))';
ret = prctile(bootstrap_x, [2.5, 97.5]);
end

disp("multiple sampling")
for sample_size = [100, 400]
sample_means = [];
for i = 1:10000
x = randn(sample_size, 1);
if i == 1
std_1 = std(x)
mean_1 = mean(x)
end
sample_means(i) = mean(x);
end
mean_estimate = mean(sample_means)
standard_error = std(sample_means)
my_ci = ci(sample_means)
ci_len = my_ci(2) - my_ci(1)
end

disp("bootstrapping from a single sample")
for sample_size = [100, 400]
x = randn(sample_size, 1);
std_1 = std(x)
mean_1 = mean(x)
bootstrap_x = mean(empirical_rnd(x, sample_size, 10000))';
mean_estimate = mean(bootstrap_x)
standard_error = std(bootstrap_x)
my_ci = ci(bootstrap_x)
ci_len = my_ci(2) - my_ci(1)
end

Output is:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
multiple sampling
std_1 = 0.94194
mean_1 = -0.026550
mean_estimate = 0.0019642
standard_error = 0.10011
my_ci =

-0.000054469 0.003909210

ci_len = 0.0039637
std_1 = 0.96719
mean_1 = 0.070110
mean_estimate = -0.00021029
standard_error = 0.049619
my_ci =

-0.00118529 0.00073309

ci_len = 0.0019184
bootstrapping from a single sample
std_1 = 0.98712
mean_1 = -0.16471
mean_estimate = -0.16467
standard_error = 0.097553
my_ci =

-0.16661 -0.16278

ci_len = 0.0038315
std_1 = 0.93220
mean_1 = 0.050283
mean_estimate = 0.050874
standard_error = 0.046385
my_ci =

0.049971 0.051774

ci_len = 0.0018032

Reference