A sample survey of patient records is being planned in a city that has 25 local mental health centers. The objective of the survey is to estimate the total number of patients who were given diazepam as part of their therapeutic regimen. The number of patients treated in each of the mental health centers is listed in the accompanying table. A sample of patients is to be selected by choosing a simple random sample of mental health centers and, within each sample mental health center, by choosing a subsample of patients. (Sampling of Populations p. 327)

The data for the 25 local mental health centers is below.

Throughout the following we’re going to provide estimates for the total number of patients in the 25 hospitals who were given diazepam, average number of patients per hospital who were given diazepam and population proportion of patients who were given diazepam.

```
number_patients <- c(491, 866, 188, 994, 209, 961, 834, 9820, 348,
246, 399, 175, 166, 672, 475, 439, 392, 584,
882, 424, 775, 262, 968, 586, 809)
health_center <- c(1:25)
cbind(health_center, number_patients)
```

A survey of 10% of all patients conducted the previous year in health centers 1, 3, 8, 12, and 15 yielded the following data:

```
# Survey data
health_center_surv <- c(1, 3, 8, 12, 15)
number_patients_surv <- c(46, 13, 942, 15, 42)
number_given_diazepan <- c(14, 5, 340, 1, 20)
cbind(health_center_surv, number_patients_surv, number_given_diazepan)
```

Here, we define the given variables.

```
N <- sum(number_patients) # population size
n <- sum(number_patients_surv) # sample population size
M <- length(number_patients) # number of hospitals
m <- length(number_patients_surv) # number of hospitals in survey
f <- m/M
# create quick function for zscore for future use
zscore <- function(x=.95){
if ( x==.99) {
return(2.576)
} else if ( x==.95) {
return(1.96)
} else if ( x==.9) {
return(1.645)
} else
return(1.96)
}
z = zscore(.99)
```

## a. Compute 99% CI for the population total

i.e., the total number of patients in the 25 hospitals who were given diazepam.

```
x_prime <- (M/m) * sum((number_patients[health_center_surv]/number_patients_surv)*number_given_diazepan)
margin_of_error <- z * (M/(sqrt(m)*f)) * sqrt((sum((number_given_diazepan - mean(number_given_diazepan))^2))/(m-1)) * sqrt((N-n)/N)
print(paste("The 99% CI for the population total is [",
x_prime - margin_of_error,",",
x_prime + margin_of_error,"]" ))
# "The 99% CI for the population total is [ -21546.3662822233 , 61586.0991903195 ]"
```

## b. Compute a 99% CI for X_bar.

i.e., average number of patients per hospital who were given diazepam

```
X_bar <- x_prime / M
margin_of_error_for_x_bar <- z * (1/(sqrt(m)*f)) * sqrt((sum((number_given_diazepan - mean(number_given_diazepan))^2))/(m-1)) * sqrt((N-n)/N)
print(paste("The 99% CI for the X_bar is [",
X_bar - margin_of_error_for_x_bar,",",
X_bar + margin_of_error_for_x_bar,"]" ))
```

## c. compute a 99% CI for X_2bar.

i.e., population proportion of patients who were given diazepam

```
X_2bar <- x_prime / N
margin_of_error_for_x_2bar <- z * (M/(N*sqrt(m)*f)) * sqrt((sum((number_given_diazepan - mean(number_given_diazepan))^2))/(m-1)) * sqrt((N-n)/N)
print(paste("The 99% CI for the X_bar is [",
X_2bar - margin_of_error_for_x_2bar,",",
X_2bar + margin_of_error_for_x_2bar,"]" ))
```

Overall, the interval estimates are quite large, and include 0. More data would be necessary to derive accurate estimates.