In Global Context: Mental Health Study
Indiana University . Bloomington . Indiana
Bulgaria Sample Design
SAMPLE SIZE AND SAMPLING PROCEDURE
The sample we propose is a two-stage cluster sample. The first stage is the
random selection of about 150 clusters. This sample is drawn from the list
of electoral sections (the primary units in the October 2003
local elections). Electoral sections of different sizes (measured by the
number of people in each section) will be proportionally represented in
the sample. The second stage is the random selection of eight (8)
respondents in each cluster. The planned sample size is thus 1200. The expected
non-response is 15-20%, which makes the expected size of the real
sample about 1000. (For more details, please refer to the enclosed
The planned sample size is N = 1200. The sampling model is a two-stage random
cluster sample. The sampling universe is the population of
1: Population of
The first stage of the sample is based on the list of electoral sections
as of the last local elections (October 2003). The total number of
electoral sections is 12313.
The average size of one electoral section is about 500 people. Electoral
sections cover the whole territory of the country and thus
provide access to the whole population. We have at our disposal the
complete list of electoral sections, which includes: number, territorial
location, and number of registered voters. The electoral sections to be
included in the sample are selected using the following procedure
(systematic random selection):
For the list of electoral sections, a cumulative measure-of-size column
based on the number of people in each electoral section is computed.
The total number of people in all sections is divided by the number of
sections to be included in the sample (the proposed number for the present
survey is 150). The result of the division is the so-called “selection
interval” (SI).
A random start (RS) within the range between 1 and SI is chosen.
In the cumulative column in the table, the first electoral section to be
included is the one that contains the RS. The second section is the one
that contains RS+SI, the third the one that contains RS+2SI, etc.
The proposed number of clusters (electoral sections) for the sample of the
present survey is N = 150. Following the above procedure ensures:
That clusters are chosen with probability proportional to the size of the
sections.
That the sample is proportionally distributed over the territory of the
country and includes all types of locations (cities, towns, and villages).
That it is
representative of the whole population of the country.
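The first-stage procedure described above (cumulative size column, selection interval, random start) can be sketched roughly as follows. This is a minimal illustration, not the survey team's actual software: the function name `pps_systematic` and its parameters are hypothetical, and the selection interval is treated as a real number rather than rounded.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def pps_systematic(sizes, n_clusters, random_start=None, seed=None):
    """Return the indices of sections picked by systematic selection
    with probability proportional to size (hypothetical helper)."""
    # Cumulative measure-of-size column: running total of section sizes.
    cumulative = list(accumulate(sizes))
    total = cumulative[-1]

    # Selection interval (SI): total size divided by the number of clusters.
    si = total / n_clusters

    # Random start (RS) in the range between 1 and SI, unless one is supplied.
    if random_start is None:
        random_start = 1 + random.Random(seed).random() * (si - 1)

    chosen = []
    for k in range(n_clusters):
        point = random_start + k * si  # RS, RS+SI, RS+2SI, ...
        # First section whose cumulative total reaches the selection point.
        index = bisect_left(cumulative, point)
        chosen.append(min(index, len(sizes) - 1))  # guard against rounding
    return chosen


# Tiny illustration: four sections, two clusters, RS fixed at 1.
print(pps_systematic([10, 20, 30, 40], 2, random_start=1))
```

With 12313 sections of about 500 people each and 150 clusters, the selection interval would be on the order of 41,000 people, so a single section is very unlikely to be hit twice.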
In the second stage of the sampling procedure a fixed number of persons in
each cluster (electoral section) is selected at random. This number is
obtained by dividing the size of the planned sample by the number of
clusters (1200/150). Thus each cluster is to include 8 persons.
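The second stage can be illustrated with a short sketch, assuming each selected section comes with a list of person identifiers. The `register` structure and the function name `select_respondents` are hypothetical and do not describe any real register interface.

```python
import random


def select_respondents(register, planned_sample=1200, n_clusters=150, seed=None):
    """Draw a fixed number of persons at random from each selected section.

    `register` maps a section identifier to a list of person identifiers
    (a hypothetical stand-in for the population register extract).
    """
    per_cluster = planned_sample // n_clusters  # 1200 / 150 = 8
    rng = random.Random(seed)
    # Simple random sampling without replacement inside each section.
    return {
        section: rng.sample(persons, per_cluster)
        for section, persons in register.items()
    }


# Illustration with two made-up sections.
demo_register = {"A": list(range(500)), "B": list(range(500, 900))}
demo_sample = select_respondents(demo_register, seed=7)
print({section: len(persons) for section, persons in demo_sample.items()})
```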
The respondents within the clusters are chosen at random from the Central
register (the computer center of the ESGRAON system). The ESGRAON system
covers the whole
The result of the selection at the second stage is a list of respondents
including personal ID, name, community, and address. Thus each interviewer
will be supplied with the names and the addresses of the respondents to be
interviewed. The interviewers will record the information for all
The sample composed in the above-described way will have the following
characteristics:
It is representative of the Bulgarian population aged 18 and over and will
cover the whole territory of the country.
It is designed to reproduce the basic socio-demographic parameters of the
population aged 18+ according to the data from the last Census.
The available information from the latest Census of the population (March
2001) will be used.
The precision of the parameter estimates (distributions for each variable
in the survey) will depend on the size of the sample and the level of
intra-class correlation (the level of similarity of respondent answers to
the different questions within a given cluster).
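To show how sample size and intra-class correlation combine, the sketch below uses the common design-effect approximation, deff = 1 + (m - 1) * rho, where m is the number of respondents per cluster. This formula is an assumption made for illustration; the source does not state which formula was used for its own error estimates. The function names and the 95% confidence multiplier z = 1.96 are likewise assumptions.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters (assumed formula)."""
    return 1 + (cluster_size - 1) * icc


def max_margin_of_error(n, cluster_size, icc, p=0.5, z=1.96):
    """Largest expected sampling error for a proportion.

    p = 0.5 gives the widest interval; z = 1.96 corresponds to a 95%
    confidence level. Both defaults are illustrative assumptions.
    """
    deff = design_effect(cluster_size, icc)
    return z * math.sqrt(deff * p * (1 - p) / n)


# Planned design: 1200 respondents, 8 per cluster, intra-class correlation 0.05.
print(design_effect(8, 0.05))
print(max_margin_of_error(1200, 8, 0.05))
```

Under these assumptions the design effect is about 1.35 and the maximal error is roughly 3.3 percentage points; the achieved sample of about 1000 would give somewhat different figures.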
With the planned sample size (N = 1200) and an average estimate for the
intra-class correlation of B = 0.05, the expected maximal stochastic errors
for the different estimates of variable distributions are as follows:
The weighting of cases goes
through several steps which could be summarized as follows:
After dividing column “A” by
column “B” we arrive at the required weighting coefficients in column
“C”. Thus, after applying the weighting variable in the data set, one
male will no longer be a single unit, but 1.116279 units and one female
will be 0.912281 of a unit.
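The division described above can be sketched as follows. The table itself is not available here, so two things are assumptions: that column “A” holds the population percentage and column “B” the sample percentage, and that the shares were 48/52 in the population and 43/57 in the sample. These shares are chosen only because they reproduce the quoted coefficients.

```python
def weighting_coefficients(population_share, sample_share):
    """Divide the population share by the sample share for each category."""
    return {
        category: population_share[category] / sample_share[category]
        for category in population_share
    }


# Assumed percentages, picked to match the coefficients quoted in the text.
population = {"male": 48, "female": 52}  # column "A" (assumption)
sample = {"male": 43, "female": 57}      # column "B" (assumption)

coefficients = weighting_coefficients(population, sample)  # column "C"
print({k: round(v, 6) for k, v in coefficients.items()})
```

A category that is under-represented in the sample receives a coefficient above 1, and an over-represented category a coefficient below 1.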
The example given here
applies only when one demographic variable in the sample is biased. When
several variables are “off the limits”, the combination of
each category of one variable with each category of the other variables is
computed. The percentage distribution is then compared with the one derived
from National Statistics. The subsequent calculation of coefficients
follows the same logic as the one described above, except that in
this case there are as many coefficients as there are combinations
of categories of the variables. If two variables, sex (2 categories) and
marital status (5 categories), have distributions different from those of
the general population, a total of 10 coefficients (2 x 5) are possible.
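When several variables are weighted jointly, one coefficient is computed per combination of categories. A minimal sketch, with hypothetical category labels for marital status and a hypothetical helper `cell_coefficients`:

```python
from itertools import product

SEX = ["male", "female"]
# Hypothetical labels; the source only says there are 5 categories.
MARITAL_STATUS = ["single", "married", "cohabiting", "divorced", "widowed"]

# One weighting cell per combination of categories.
cells = list(product(SEX, MARITAL_STATUS))
print(len(cells))


def cell_coefficients(population_share, sample_share):
    """Coefficient per cell: population share divided by sample share.

    Both arguments map a (sex, marital status) tuple to a percentage.
    A cell with no sample cases cannot be weighted and raises an error.
    """
    coefficients = {}
    for cell, pop in population_share.items():
        if sample_share.get(cell, 0) == 0:
            raise ValueError(f"no sample cases in cell {cell}")
        coefficients[cell] = pop / sample_share[cell]
    return coefficients
```

In practice sparse cells are often merged before weighting, since a very small sample share produces an unstable coefficient.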
1022 E. Third Street, Bloomington, IN 47405 (812) 855-3841