In a local teaching district,

a technology grant is available to teachers in order

to install a cluster of four computers in their classroom. From the 6,250 teachers in the

district, 250 were randomly selected and asked if they felt

that computers were an essential teaching tool

for their classroom. Of those selected, 142 teachers

felt that the computers were an essential

teaching tool. And then they ask us, calculate

a 99% confidence interval for the proportion of

teachers who felt that the computers are an essential

teaching tool. So let’s just think about

the entire population. We weren’t able to survey all

of them, but the entire population, some of them fall

in the bucket, and we’ll define that as 1, they thought

it was a good tool. They thought that the computers

were a good tool. And we’ll just define a

0 value as a teacher that says not good. And some proportion of the total

teachers think that it is a good learning tool. So that proportion is p. And then the rest of them think

it’s a bad learning tool, 1 minus p. We have a Bernoulli Distribution

right over here, and we know that the mean of

this distribution or the expected value of this

distribution is actually going to be p. So it’s actually going to be a

value that is neither 0 nor 1, so not an actual value that you

could actually get out of a teacher if you were

to ask them. They cannot say something in

between good and not good. The actual expected value

is something in between. It is p. Now what we do is we’re taking

a sample of those 250 teachers, and we got that 142

felt that the computers were an essential teaching tool. So in our survey, so we had 250

sampled, and we got 142 said that it is good, and we’ll

say that this is a 1. So we got 142 1’s, or we sampled

1, 142 times from this distribution. And then the rest of the time,

so what’s left over? There’s another 108 who said

that it’s not good. So 108 said not good, or you

could view them as you were sampling a 0, right? 108 plus 142 is 250. So what is our sample

mean here? We have 1 times 142, plus 0

times 108 divided by our total number of samples,

divided by 250. It is equal to 142 over 250. You could even view this as

the sample proportion of teachers who thought that

the computers were a good teaching tool. Now let me get a calculator

out to calculate this. So we have 142 divided by

250 is equal to 0.568. So our sample proportion

is 0.568, or 56.8%, either one. So 0.568. Now let’s also figure out our

sample variance because we can use it later for building

our confidence interval. Our sample variance here–

so let me draw a sample variance– we’re going to take

the weighted sum of the squared differences from the mean and

divide by this minus 1. So we can get the best estimator of the true variance. So it’s 1 times– no, it’s the

other way actually around– we have 142 samples that were 1

minus 0.568 away from our sample mean, or we’re this far

from the sample mean 142 times, and we’re going to

square those distances. Plus the other 108 times we got

a 0, so we were 0 minus 0.568 away from the

sample mean. And then we are going to divide

that by the total number of samples minus 1. That minus 1 is our adjuster

so that we don’t underestimate. So 250 minus 1. Let’s get our calculator

out again. And so we have– we put

parentheses around everything– I have 142 times

1 minus 0.568 squared, plus 108 times 0 minus– and you

could obviously do parts of this in your head, but I’m just

going to write the whole thing out– minus 0.568 squared,

and then all of that divided by 250 minus 1 is 249. So our sample variance is–

well, I’ll just say 0.246. It is equal to– it is our

sample variance– I’ll write it over here– our sample

variance is equal to 0.246. If you were to take the square

root of that our actual sample standard deviation is going to

be, let’s take the square root of that answer right over there,

and we get 0.496. I’ll just round that

up to 0.50. So that is our sample

standard deviation. Now this interval, let’s think

of it this way, we are sampling from some sampling

distribution of the sample mean. So it looks like this

over here, it looks like that over there. And it has some mean, and so

the mean of the sampling distribution of the sample mean

is actually going to be the same thing as this mean over

here– it’s going to be the same mean value– which

is the same thing as our population proportion. We’ve seen this multiple

times. And the sampling distribution’s

standard deviation, so the standard

deviation of the sampling distribution, so we could view

that as one standard deviation right over there. So the standard deviation of

the sampling distribution, we’ve seen multiple times,

is equal to the standard deviation– let me do this in

a different color– is equal to the standard deviation of

our original population divided by the square root

of the number of samples. So it’s divided by the square root of 250. Now we do not know this

right over here. We do not know the actual

standard deviation in our population. But our best estimate of that,

and that’s why we call it confident, we’re confident that

the real mean or the real population proportion, is going

to be in this interval. We’re confident, but we’re not

100% sure because we’re going to estimate this over here, and

if we’re estimating this we’re really estimating

that over there. So if this can be estimated it’s

going to be estimated by the sample standard deviation. So then we can say this is going

to be approximately, or if we didn’t get a weird,

completely skewed sample, it actually might not even be

approximately if we just had a really strange sample. But maybe we should write

confident that– we are confident that the standard

deviation of our sampling distribution is going to be

around, instead of using this we can use our standard

deviation of our sample, our sample standard deviation. So 0.50 divided by the square

root of 250, and what’s that going to be? That is going to be– so we

have this value right over here, and actually I don’t have

to round it, divided by the square root of 250. We get 0.031. So this is equal to

0.031 over here. So that’s one standard

deviation. Now they want a 99% confidence

interval. So the way I think about it is

if I randomly pick a sample from the sampling distribution,

what’s the 99% chance, or how many– let

me think of it this way. How many standard deviations

away from the mean do we have to be that we can be 99%

confident that any sample from the sampling distribution will

be in that interval? So another way to think about

it: think about how many standard deviations we need to

be away from the mean, so we’re going to be a certain

number of standard deviations away from the mean such that any

sample, any mean that we sample from here, any sample

from this distribution has a 99% chance of being plus or

minus that many standard deviations. So it might be from

there to there. So that’s what we want. We want a 99% chance that if

we pick a sample from the sampling distribution of the

sample mean, it will be within this many standard deviations

of the actual mean. And to figure that out let’s

look at an actual Z-table. So we want 99% confidence. So another way to think about it

if we want 99% confidence, if we just look at the upper

half right over here, that orange area should be 0.475,

because if this is 0.475 then this other part’s going to be

0.475, and we will get to our– oh sorry, we want

to get to 99%, so it’s not going to be 0.475. We’re going to have to go

to 0.495 if we want 99% confidence. So this area has to be 0.495

over here, because if that is, that over here will also be. So that their sum will

be 99% of the area. Now if this is 0.495, the value we look up

on the Z-table has to include the lower half too,

because the table gives cumulative area, and all of the area

below the mean is 0.5. So it’s going to be

0.5 plus 0.495. It’s going to be 0.995. Let me make sure I

got that right. 0.995. So let’s look at our Z-table. So where do we get 0.995 on our Z-table? The closest entry, with just

a little error, is right over here–

this is 0.9951. So another way to think about it

is 99– so this value right here gives us the whole

cumulative area up to that point, including everything below the mean. So if you look at the entire

distribution like this, this is the mean right over here. This tells us that at 2.5

standard deviations above the mean, so this is 2.5 standard

deviations above the mean. So this is 2.5 times the

standard deviation of the sampling distribution. If you look at this whole area,

this whole area over here, if you look at the

Z-table, is going to be 0.9951, which tells us that just

this area right over here is going to be 0.4951, which

tells us that this area plus the symmetric area of that many

standard deviations below the mean, if you combine

them, 0.4951 times 2 gets us to 0.9902. So this whole area right

here is 99.02%. So if we look at the area 2.5

standard deviations above and below the mean– oh,

let me be careful. This isn’t just 2.5,

we have to add another digit of precision. This is 2.5, and the next digit

of precision is given by this column over here. So we have to look all the way

up into the second to the last column, and we have to add

a digit of 8 here. So this is 2.58 standard

deviations. We have 2.5 over here, and then

we get the next digit 8 from the column. 2.58 standard deviations above

and below the mean encompasses a little

over 99% of the total probability. So there’s a little over a 99%

chance that any sample mean that I select from the sampling

distribution of the sample mean will fall

within this many standard deviations of the mean. So let me put it this way. There is a 99– it’s actually,

what, a 99.02% chance, right? If you multiply this times 2

you get 0.9902. So we’ll say roughly a 99% chance

that a random sample mean is within

2.58 standard deviations of the mean

of the sampling distribution of the sample mean, which is

the same thing as our actual population mean, which is the

same thing as our population proportion. So of p. And we know what this

value is right here. At least we have a decent

estimate for this value. We don’t know exactly what this

is, but our best estimate for this value is

this over here. So we could re-write this, so

we could say that we are confident because we are really

using an estimator to get this value here. We are confident that there is

a 99% chance that a random x, a random sample mean, is

within– and let’s figure out this value right here

using a calculator. So it is 2.58 times our best

estimate of the standard deviation of the sampling

distribution, so times 0.031 is equal to 0.0– well let’s

just round this up because it’s so close to 0.08– is

within 0.08 of the population proportion. Or you could say that you’re

confident that the population proportion is within 0.08

of your sample mean. That’s the exact

same statement. So if we want our confidence

interval, our actual number that we got for there,

our actual sample mean we got was 0.568. So we could replace this, and

actually let me do it. I can delete this right here. Let me clear it. I can replace this, because we

actually did take a sample. So I can replace this

with 0.568. So we could be confident that

there’s a 99% chance that 0.568 is within 0.08 of the

population proportion, which is the same thing as the

population mean, which is the same thing as the mean of the

sampling distribution of the sample mean, so forth

and so on. And just to make it clear we can

actually swap these two. It wouldn’t change

the meaning. If this is within 0.08

of that, then that is within 0.08 of this. So let me switch this

up a little bit. So we could say that p is within

0.08 of– let me switch this up– of 0.568. And now linguistically it sounds

a little bit more like a confidence interval. We are confident that there’s a

99% chance that p is within 0.08 of the sample

mean of 0.568. So what would be our confidence

interval? It will be 0.568 plus

or minus 0.08. And what would that be? If you add 0.08 to this right

over here, at the upper end you’re going to have 0.648. And at the lower end of our

range, so this is the upper end, the lower end. If we subtract 0.08 from

this we get 0.488. So we are 99% confident that the

true population proportion is between these two numbers. Or another way, that the true

percentage of teachers who think those computers are good

ideas is between– we’re 99% confident– we’re confident that

there’s a 99% chance that the true percentage of teachers

that like the computers is between

48.8% and 64.8%. Now we answered the first

part of the question. The second part, how could the

survey be changed to narrow the confidence interval,

but maintain the 99% confidence level? Well, you could just

take more samples. If you take more samples then

our estimate of the standard deviation of this distribution

will go down because this denominator will be higher. If the denominator is higher

then this whole thing will go down. So if the standard deviations

go down here, then when we count the standard deviations,

when we do the plus or minus on the range, this value

will go down and will narrow our range. So you just take more samples.