
### Sample Text - SAGE - the natural home for authors, publishers, etc.

Evaluate generalizability

Assess the diversity of the population

Consider a census

Sampling methods

Lessons about sample quality

Generalizability in qualitative research

Conclusions

A common technique in journalism is to put a "human face" on a story. For example, a Boston Globe reporter (Abel 2008) interviewed a participant for a story about a residential program for the chronically homeless. "Burt" had worked as a welder, but alcoholism and physical and mental health problems threw a monkey wrench into his working life. By the time he turned 60, Burt had spent many years on the streets. Fortunately, a new Massachusetts program provided him with independent housing, but even then "the lure of alcohol and street friends was strong" (Abel 2008:A14). It is a sad story with an unusually happy, if uncertain, ending.

Together with another story and comments from various service workers, the article provides a compelling justification for the new housing program. However, we don't know if the two participants interviewed for the story are like most of the program's participants, most of the homeless in Boston, or most of the homeless in the United States, or if they

CHAPTER 5

Sampling

05-Schutt 6e-45771:FM-Schutt5e(4853) (for student CD).qxd 9/29/200811:23 PM Page 148

Unreviewed pages. It may not be sold, copied or redistributed. owned by SAGE

are just two people who caught the attention of this reporter. In other words, we don't know how generalizable their stories are, and if we don't have confidence in their generalizability, then the validity of this account of how the program participants became homeless is questionable. Because we don't know whether their situation is widespread or unique, we cannot really judge what the report tells us about the social world.

In this chapter you will learn about sampling methods, the procedures that primarily determine the generalizability of research results. I first review the rationale for using sampling in social research and consider two circumstances in which sampling is not necessary. The chapter then turns to specific sampling methods and when they are most appropriate, using examples from research on homelessness. This section is followed by a section on sampling distributions, which introduces you to the logic of statistical inference, that is, how to determine the likelihood that our sample statistics represent the population from which the sample was drawn. By the end of the chapter, you should understand the questions to ask when assessing the generalizability of a study and the decisions to make when designing a sampling strategy. You should also keep in mind that choosing the "right" people or objects to study is just as important as asking participants the right questions.

SAMPLE PLANNING

You have encountered the problem of generalizability in every study you have read about in this book. For example, Keith Hampton and Barry Wellman (1999) discussed their findings in Netville as if they could be generalized to residents of other communities; Norman Nie and Lutz Erbring (2000) generalized the results of their Internet survey to the entire US adult population; and the results of the National Geographic Society (2000) Web survey were generalized to the entire world. Whether we are designing a sampling strategy or evaluating someone else's results, we must understand how and why researchers decide to sample and what the consequences of those decisions are for the generalizability of study results.

Define the components of the sample and the population

Let's say we are designing a survey of homeless adults in a city. We do not have the time or resources to study the entire adult homeless population of the city, although it is the set of individuals or other entities to which we wish to generalize our results. Even the city of Boston, which conducts an annual census of its homeless, does not have the resources to interview the homeless people it counts. Instead, we decide to study a sample, a subset of that population. The individual members of this sample are called elements or elementary units.

In many studies we sample directly from the elements in the population of interest. We may survey a sample of the


Population: The entire set of individuals or other entities to which study findings are to be generalized.

Sample: A subset of a population that is used to study the population as a whole.

Elements: The individual members of the population whose characteristics are to be measured.


INVESTIGATING THE SOCIAL WORLD

entire population of students in a school, based on a list obtained from the registrar's office. This list, from which the elements of the population are selected, is called the sampling frame. The students who are selected and interviewed from that list are the elements.

In some studies, the entities that can be reached easily are not the same as the elements from which we want information, but they include those elements. For example, we may have a list of households but not a list of the entire population of a city, even though the adults are the elements we actually want to sample. In this situation, we could draw a sample of households so that we can identify the adult individuals in those households. The households are called enumeration units, and the adults in the households are the elements (Levy & Lemeshow 1999:13–14).
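The household example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame, the household IDs, the names, and the helper `sample_adults_via_households` are hypothetical and do not come from the chapter. The sketch draws a simple random sample of households (the enumeration units) and then lists the adults living in them (the elements).

```python
import random


def sample_adults_via_households(frame, n_households, seed=0):
    """Randomly select households, then list the adults who live in them.

    frame: dict mapping a household ID (enumeration unit) to its adults (elements).
    """
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    chosen = rng.sample(sorted(frame), n_households)
    adults = [adult for household in chosen for adult in frame[household]]
    return chosen, adults


# Hypothetical sampling frame: a list of households, not a list of adults.
frame = {
    "H1": ["Ana", "Ben"],
    "H2": ["Cruz"],
    "H3": ["Dee", "Eli", "Fay"],
    "H4": ["Gus"],
}

chosen, adults = sample_adults_via_households(frame, n_households=2)
print("households sampled:", chosen)
print("adults identified:", adults)
```

Note that the list we sample from contains households, while the information we want concerns the adults, which is why the two kinds of unit get different names.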

Sometimes the individuals or other entities from which we collect information are not actually the elements of our study. For example, a researcher might sample schools for a survey about educational practices and then interview a sample of teachers in each selected school to obtain information on those practices. Both the schools and the teachers are called sampling units because we sample from both (Levy & Lemeshow 1999:22). The schools are selected in the first stage of the sample, so they are the primary sampling units (in this case they are also the elements of the study). The teachers are secondary sampling units (but they are not elements, because they are used to provide information about the whole school) (see Figure 5.1).

It is important to know exactly what population a sample can represent when you select or evaluate sample components. In a survey of "adult Americans," the general population can reasonably be construed as all residents of the United States who are at least 21 years of age. But always be alert to ways in which the population may have been narrowed by the sampling procedures. For example, perhaps only English-speaking residents of the United States were interviewed. The population for a study is the aggregation of elements that we actually focus on and sample from, not some larger aggregation that we really wish we could have studied.

Some populations, such as the homeless, are not identified by a simple criterion such as a geographic boundary or organizational membership. A clear definition of such a population is difficult but quite necessary. Anyone should be able to determine just what population was actually studied. However, studies of homelessness in the early 1980s "did not propose definitions, did not use screening questions to be sure that the people they interviewed were truly homeless, and made little effort to cover the universe of homeless people" (Burt 1996:15). (Perhaps only homeless people in one emergency shelter were studied.) The result was a "collection of studies that could not be compared" (Burt 1996:15). Several studies of homelessness in urban areas addressed the problem by using a more explicit definition of the population: "persons who did not have their own home or a permanent home (i.e., they rented or owned it themselves) and did not have a regular arrangement to stay at someone else's place" (Burt 1996:18).

Even this more explicit definition still leaves some questions unanswered: What is a "regular arrangement"? How permanent does a "permanent home" have to be? In a study of

Sampling frame: A list of all elements or other units containing the elements in a population.

Enumeration units: Units that contain one or more elements and that are listed in a sampling frame.

Sampling units: Units listed at each stage of a multistage sampling design.


homeless people in Chicago, Michael Sosin, Paul Colson, and Susan Grossman (1988) answered these questions in their definition of the population of interest:

We define the homeless as: those who are currently staying with a friend or family member for at least one day but less than fifteen days, are paying no rent, and are unsure whether the length of stay will exceed fourteen days; those currently residing in a shelter, whether overnight or transitional; those who currently lack normal, acceptable housing arrangements and therefore sleep on the street, in doorways, in abandoned buildings, in cars, in subway or bus stations, in alleys, and so on; those residing in a treatment center for the indigent who have lived in the facility for less than 90 days and who say they have nowhere to go after they are discharged. (p. 22)

Figure 5.1. Sample components in a two-stage study (sample of schools). Schools are the elements and the primary sampling units. Teachers are the secondary sampling units; they provide information about the schools.


This definition accurately reflects Sosin et al.'s concept of homelessness and allows researchers in other places or at other times to devise procedures for studying a comparable population. The more complete and explicit the definition of the population from which a sample is drawn, the more precise our generalizations can be.

Evaluate generalizability

Once we have clearly defined the population from which we are sampling, we need to determine the scope of the generalizations we will make from our sample. Do you recall from Chapter 2 the two different meanings of generalizability?

Can the findings from a sample of the population be generalized to the population from which the sample was drawn? Do the results of Nie and Erbring (2000) apply to the United States, those of National Geographic (2000) to the world, or those of the study by Wechsler et al. (2000) on binge drinking to all US college students? This type of generalizability was defined in Chapter 2 as sample generalizability.

Can the findings from a study of one population be generalized to another, somewhat different population? Are email users in Netville similar to those in other Ontario suburbs? In other provinces? In the United States? Are college students similar in their drinking habits to full-time workers, stay-at-home mothers, or other groups? Do the results of a laboratory study on the effects of alcohol at a small Northeastern college differ from those that would be obtained at a Midwestern college? How generalizable are the results of a survey of homeless people in one city? This type of generalizability question was defined in Chapter 2 as generalizability across populations.

This chapter focuses primarily on the problem of sample generalizability: Can the findings from a sample be generalized to the population from which the sample was drawn? This is really the most basic question to ask about a sample, and social research methods provide many tools with which to address it.

Sample generalizability depends on sample quality, which is determined by the amount of sampling error: the difference between the characteristics of a sample and the characteristics of the population from which it was selected. The greater the sampling error, the less representative the sample will be and the less generalizable the results will be. To assess sample quality when designing or evaluating a study, ask yourself the following questions:

• From what population were the cases selected?
• What method was used to select cases from this population?
• Do the cases that were studied represent, in the aggregate, the population from which they were selected?

But researchers often project their theories onto groups or populations much larger than, or simply different from, those they actually studied. The population to which generalizations are made in this way can be termed the target population: a set of elements larger than or different from the population that was sampled and to which the researcher would like to generalize any study findings. When we generalize findings to target populations, we must be somewhat speculative. We must carefully consider the validity of claims that the findings can be applied to other groups, geographic areas, cultures, or times.
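To make the idea of sampling error concrete, here is a minimal Python sketch. It assumes a tiny, entirely hypothetical population (the ages of 12 shelter residents) and a hypothetical helper, `sampling_error`, that measures the gap between a sample mean and the population mean; neither comes from the chapter.

```python
import random
from statistics import mean


def sampling_error(population, sample):
    """Sample mean minus population mean for one characteristic."""
    return mean(sample) - mean(population)


# Hypothetical population: ages of 12 shelter residents (mean age is 41).
ages = [19, 23, 27, 31, 34, 38, 42, 45, 51, 56, 60, 66]

rng = random.Random(42)       # fixed seed so the illustration is repeatable
sample = rng.sample(ages, 4)  # simple random sample of 4 residents

print("sample:", sample)
print("sampling error in mean age:", round(sampling_error(ages, sample), 2))

# A census studies every element, so it has no sampling error.
print("census sampling error:", sampling_error(ages, ages))
```

Rerunning the sketch with different seeds gives different errors, which is the chance variation that the questions above are meant to help you assess.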


Since the validity of generalizations across populations cannot be empirically tested except by further investigation in other settings, I will not devote much attention to this issue here. But I will return to the problem of generalizability across populations in Chapter 7, which deals with experimental research, and in Chapter 12, which discusses methods for studying different societies.

Assess the diversity of the population

Sampling is unnecessary if all the units in the population are identical. Physicists do not need to select a representative sample of atomic particles to learn about basic physical processes. They can study a single atomic particle because it is identical to every other particle of its type. Similarly, biologists do not need to sample a particular type of plant to determine whether a given chemical has toxic effects on that type of plant. The idea is, "if you've seen one, you've seen them all."

What about people? Certainly all people are not identical (nor are other animals, in many respects). Nonetheless, if we are studying physical or psychological processes that are the same among all people, sampling is not needed to achieve generalizable findings. Psychologists and social psychologists often conduct experiments with college students to learn about processes that they think are identical across individuals. They believe that most people would have the same reactions as the college students if they experienced the same experimental conditions. Field researchers who observe group processes in a small community sometimes make the same assumption.

There is a potential problem with this assumption, however: there is no way to know for sure whether the processes being studied are identical across all people. In fact, experiments can yield different results depending on the type of people who are studied or the conditions of the experiment. Stanley Milgram's (1965) classic experiments on obedience to authority, which you studied in Chapter 3, illustrate this point very well. You may recall that Milgram's original experiments tested the willingness of male volunteers in New Haven, Connecticut, to comply with the instructions of an authority figure to administer "electric shocks" to another person, even when these shocks seemed to harm the person receiving them. In most cases, the volunteers complied. Milgram concluded that people are very obedient to authority.

Were these results generalizable to all men, to men in the United States, or to men in New Haven? The initial experiment was repeated many times to assess the generalizability of the findings. Similar results were obtained in many replications of the Milgram experiments, that is, when the experimental conditions and subjects were similar to those Milgram studied. Other studies showed that some groups were less likely to react so obediently. Under certain conditions, such as when another "subject" in the room refused to administer the shocks, subjects were likely to resist authority.

So what do the initial experimental results tell us about how people will react to an authoritarian movement in the real world, when conditions are not so carefully controlled? In the real social world, people may be less likely to react obediently as well. Other individuals may argue against obeying a particular leader's commands, or people may see on television the consequences of their actions. But alternatively, people in the real world may be even


Sampling error: Any difference between the characteristics of a sample and the characteristics of a population. The greater the sampling error, the less representative the sample.

Target population: A set of elements larger than or different from the population sampled and to which the researcher would like to generalize study findings.


more obedient to authority than the experimental subjects were when, for example, they are swept up in a mob or captured by ideological fervor. Milgram's initial research and the many replications of it give us great insight into human behavior, in part because they help to identify the types of people and conditions to which the initial findings (lack of resistance to authority) can be generalized. But generalizing the results of single experiments is always risky, because such research often studies a small number of people who were not selected to represent any particular population.

The main point is that social scientists can rarely sidestep the problem of demonstrating the generalizability of their findings. If a small sample has been studied in an experiment or a field research project, the study should be replicated in different settings or, preferably, with a representative sample of the population to which generalizations are sought (see Figure 5.2).

The social world and the people in it are just too diverse to be considered "identical units." Social psychological experiments and small field studies have produced good social science, but they need to be replicated in other settings, with other subjects, to claim any generalizability. Even when we believe that we have uncovered basic social processes in a laboratory experiment or field observation, we should be very concerned with seeking confirmation in other samples and in other research.

Consider a census

In some circumstances, it may be feasible to skirt the issue of generalizability by conducting a census, studying the entire population of interest, rather than drawing a sample. The federal government tries to do this every 10 years with the US Census. Censuses also include studies of all the employees (or students) in small organizations, studies comparing all 50 states, and studies of the entire population of a particular type of organization in some area. However, compared with the US Census and similar efforts in other countries, states, and cities, the population that is studied in these other censuses is relatively small.

Social scientists do not usually attempt to collect data from all the members of some large population simply because doing so would be too expensive and time-consuming, and they can do almost as well with a sample. Some social scientists conduct research with data from the US Census, but it is the government that collects the data and your tax dollars that pay for the effort. Congress and the President provided nearly $4.5 billion to conduct the 2000 Census (Prewitt 2000), and the US Census Bureau spent 12 years planning it (U.S. Bureau of the Census 2000a). The Census Bureau is already testing new approaches for the 2010 census, including an Internet-based response option (U.S. Bureau of the Census 2003).

Representative sample: A sample that "looks like" the population from which it was selected in all respects that are potentially relevant to the study. The distribution of characteristics among the elements of a representative sample is the same as the distribution of those characteristics among the total population. In an unrepresentative sample, some characteristics are overrepresented or underrepresented.

Census: Research in which information is obtained through the responses that all available members of an entire population give to questions.
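The definition of a representative sample can be checked mechanically by comparing how a characteristic is distributed in the sample and in the population. The Python sketch below is illustrative only: the population of 15 people (5 of them satisfied), the two samples, and the helper names `proportions` and `representation_gaps` are assumptions made for the example.

```python
from collections import Counter


def proportions(values):
    """Share of each category among the given values."""
    counts = Counter(values)
    return {category: n / len(values) for category, n in counts.items()}


def representation_gaps(population, sample):
    """Sample share minus population share for each population category.

    Positive values mean the category is overrepresented in the sample,
    negative values mean it is underrepresented.
    """
    pop = proportions(population)
    samp = proportions(sample)
    return {category: samp.get(category, 0.0) - pop[category] for category in pop}


# Hypothetical population: 5 of 15 people (33%) are satisfied.
population = ["satisfied"] * 5 + ["dissatisfied"] * 10

representative = ["satisfied"] * 2 + ["dissatisfied"] * 4    # 33% satisfied
unrepresentative = ["satisfied"] * 4 + ["dissatisfied"] * 2  # 67% satisfied

print(representation_gaps(population, representative))
print(representation_gaps(population, unrepresentative))
```

In practice a researcher would run this kind of comparison on every characteristic that is potentially relevant to the study, not on a single one.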


Even if the population of interest for a survey is a small town of 20,000 or the students at a university of 10,000, researchers will have to sample. The costs of surveying "just" thousands of individuals exceed by far the budgets of most research projects. In fact, not even the US Census Bureau can afford to have everyone answer all the questions that should be covered in the census, so it draws a sample. Every household must complete a short version of the census (it had seven basic questions in 2000), and a sample of one in six households must complete a long form (with 53 additional questions) (Rosenbaum 2000).

The fact that it is hard to get people to complete a survey is another reason why survey research can be costly. Even the U.S. Bureau of the Census (1999) must make multiple efforts to

Figure 5.2. Representative and unrepresentative samples. Population: 33% (5 of 15) satisfied.


increase the response rate, even though federal law requires all citizens to complete their census questionnaire. These efforts did reverse a decline in response that had lasted for decades (U.S. Bureau of the Census 2000e). However, half a million temporary workers and up to six follow-ups were still required to contact the remaining households that had not responded by mail (U.S. Bureau of the Census 2000b, 2000c). As the 2000 US Census progressed, the Bureau became concerned about the underrepresentation of minority groups (Kershaw 2000), impoverished cities (Zielbauer 2000), wealthy individuals in communities and luxury buildings (Langford 2000), and even college students (Abel 2000), so it conducted an even more intensive sample survey to learn about the characteristics of those who still had not responded (Anderson & Fienberg 1999; U.S. Bureau of the Census 2000d). The number of people missed by the census was still estimated to be between 3.2 and 6.4 million (U.S. Bureau of the Census 2001), and controversy over the underrepresentation of some groups continued (Armas 2002; Holmes 2001a).

The average survey project has far less legal and financial backing, and so an adequate census is not likely to be possible. Even in Russia, which spent almost $200 million to survey its approximately 145 million people, resource shortages after the collapse of the Soviet Union prevented an adequate census (Myers 2002). The census had to be postponed from 1999 to 2002 because of insufficient funds, and it relied on voluntary participation. Despite an $8 million publicity campaign, many residents of impoverished areas refused to take part (Tavernise 2002). In Vladivostok, many residents, angry over a recent increase in electricity prices, refused to participate, and others boycotted in protest against ruined streets (Tavernise 2002:A13). In most survey situations, it is much better to survey only a limited number from the total population so that more resources are available for follow-up procedures that can overcome reluctance or indifference about participation. (I will give more attention to the problem of nonresponse in Chapter 8.)

SAMPLING METHODS

We can now study more systematically the features of samples that make them more or less likely to represent the population from which they are selected. The most important distinction to make about samples is whether they are based on a probability or a nonprobability sampling method. Sampling methods that do not let us know in advance the likelihood of selecting each element are termed nonprobability sampling methods.

Probability sampling methods rely on a random, or chance, selection procedure, which is in principle the same as flipping a coin to decide which of two people "wins" and which one "loses." Heads and tails are the same

Probability sampling method: A sampling method that relies on a random, or chance, selection method so that the probability of selection of population elements is known.

Nonprobability sampling method: A sampling method in which the probability of selection of population elements is unknown.


in how likely they are to turn up in a coin toss, so both people have an equal chance of winning. That chance, their probability of selection, is 1 out of 2, or 0.5.

Flipping a coin is a fair way to select one of two people because the selection process harbors no systematic bias. You might win or lose the coin toss, but you know that the outcome was due simply to chance, not to bias. For the same reason, a roll of a six-sided die is a fair way to choose one of six possible outcomes (the probability of selection is 1 out of 6, or 0.17). Dealing out a hand after shuffling a deck of cards is a fair way to allocate sets of cards in a poker game (the odds of each person getting a particular outcome, such as a full house or a flush, are the same). Similarly, state lotteries use a random process to select winning numbers. Thus, the odds of winning a lottery, the probability of selection, are known, even though they are very much smaller (perhaps 1 out of 1 million) than the odds of winning a coin toss.

There is a natural tendency to confuse the concept of random sampling, in which cases are selected only on the basis of chance, with a haphazard method of sampling. On first impression, "leaving things up to chance" seems to imply not exerting any control over the sampling method. But to ensure that nothing but chance influences the selection of cases, the researcher must proceed very methodically, leaving nothing to chance except the selection of the cases themselves. The researcher must follow carefully controlled procedures if a purely random process is to occur.

When reading about sampling, do not assume that a random sample was obtained just because the researcher used a random selection method at some point in the sampling process. Look for these two particular problems: selecting elements from an incomplete list of the total population and failing to obtain an adequate response rate.

If the sampling frame is incomplete, a sample selected randomly from that list will not really be a random sample of the population. You should always consider the adequacy of the sampling frame. Even for a simple population such as a university's student body, the registrar's list is likely to be at least somewhat out of date at any given time. For example, some students will have dropped out, but their status will not yet be officially recorded. Although you may judge the amount of error introduced in this particular situation to be negligible, the problems are greatly compounded for a larger population. The sampling frame for a city, state, or nation is always likely to be incomplete because of constant migration into and out of the area. Even unavoidable omissions from the sampling frame can bias a sample against particular groups within the population.

A very inclusive sampling frame may still yield systematic bias if many sample members cannot be contacted or refuse to participate. Nonresponse is a major hazard in survey research because nonrespondents are likely to differ systematically from those who take the time to participate. You should not assume that the results from a random sample are


Probability of selection: The likelihood that an element will be selected from the population for inclusion in the sample. In a census of all the elements of a population, the probability that any particular element will be selected is 1.0. If half the elements in the population are sampled on the basis of chance (say, by tossing a coin), the probability of selection for each element is one half, or 0.5. As the size of the sample as a proportion of the population decreases, so does the probability of selection.

Random sampling: A method of sampling that relies on a random, or chance, selection method so that every element of the sampling frame has a known probability of being selected.
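The probabilities of selection quoted in this section (0.5 for a coin toss, 0.17 for a die, 1.0 for a census) all follow from one ratio. The sketch below assumes the simplest case, in which every element has an equal chance of selection; the function name `selection_probability` is hypothetical.

```python
from fractions import Fraction


def selection_probability(sample_size, population_size):
    """Chance that any one element is chosen when all elements are equally likely."""
    if population_size <= 0 or not 0 <= sample_size <= population_size:
        raise ValueError("need 0 <= sample_size <= population_size and population_size > 0")
    return Fraction(sample_size, population_size)


print(float(selection_probability(1, 2)))            # coin toss: 0.5
print(round(float(selection_probability(1, 6)), 2))  # six-sided die: 0.17
print(float(selection_probability(50, 100)))         # half the population: 0.5
print(float(selection_probability(100, 100)))        # census: 1.0
```

Designs in which elements have unequal but known probabilities of selection would need a different calculation; this sketch does not cover them.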


generalizable to the population from which the sample was drawn when the non-response rate is significant (certainly not when it is well above 30%).
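A small calculation shows why nonresponse threatens generalizability even when the initial sample was drawn at random. Everything in this Python sketch is assumed for illustration: two hypothetical groups of 500 people with different average scores on some survey item and different response rates, and the helper names `population_mean` and `respondent_mean`.

```python
def population_mean(groups):
    """True mean across all groups; each group is (size, mean_value, response_rate)."""
    total_people = sum(size for size, _, _ in groups)
    return sum(size * value for size, value, _ in groups) / total_people


def respondent_mean(groups):
    """Mean among those who actually respond, weighting groups by their respondents."""
    respondents = sum(size * rate for size, _, rate in groups)
    weighted_sum = sum(size * rate * value for size, value, rate in groups)
    return weighted_sum / respondents


# Hypothetical groups: (size, mean score, response rate).
groups = [
    (500, 8.0, 0.8),  # easier to reach, higher scores
    (500, 4.0, 0.4),  # harder to reach, lower scores
]

true_mean = population_mean(groups)
observed_mean = respondent_mean(groups)
print("population mean:", true_mean)
print("mean among respondents:", round(observed_mean, 2))
print("nonresponse bias:", round(observed_mean - true_mean, 2))
```

Because the harder-to-reach group both responds less often and differs on the measure, the respondents overstate the population mean, and drawing a larger sample would not remove that bias.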

Probability sampling

Probability sampling methods are those in which the probability of selection is known and is not zero (so there is some chance of selecting each element). These methods select elements at random and therefore have no systematic bias; nothing but chance determines which elements are included in the sample. This feature of probability samples makes them much more desirable than nonprobability samples when the goal is to generalize to a larger population.

Although a random sample has no systematic bias, it will certainly have some sampling error due to chance. The probability of getting heads is 0.5 in a single coin toss and in 20, 30, or however many coin tosses you like. But it is perfectly possible to toss a coin twice and get heads both times. The random "sample" of the two sides of the coin is unbiased, but it is still unrepresentative. Imagine selecting at random a sample of 10 people from a population of 50 men and 50 women. Just by chance, can't you imagine finding that these 10 people include 7 women and only 3 men? Fortunately, we can determine mathematically the likely degree of sampling error in an estimate based on a random sample (as we will see later in this chapter), provided that the randomness of the sample has not been destroyed by a high rate of nonresponse or by poor control over the selection process.

In general, both the size of the sample and the homogeneity (sameness) of the population affect the degree of error due to chance; the proportion of the population that the sample represents does not. To elaborate:

• The larger the sample, the more confidence we can have in the sample's representativeness. If we randomly pick 5 people to represent the entire population of our city, our sample is unlikely to be very representative of the entire population in terms of age, gender, race, attitudes, and so on. But if we randomly pick 100 people, the odds of having a representative sample are much better; with a random sample of 1,000, the odds become very good indeed.

• The more homogeneous the population, the more confidence we can have in the representativeness of a sample of any particular size. Suppose we plan to draw samples of 50 from each of two communities to estimate median household income. One community is very diverse, with family incomes ranging from $12,000 to $85,000. In the other, more homogeneous community, family incomes are concentrated in a narrow range, from $41,000 to $64,000. The estimated median household income based on the sample from the homogeneous community is more likely to be representative than the estimate based on the sample from the more heterogeneous community. With less variation to represent, fewer cases are needed to represent the homogeneous community.

• The fraction of the total population that a sample contains does not affect the sample's representativeness unless that fraction is large. We can regard any sampling fraction

Non-responders or other entities not participating in a study, although selected for the sample.

Systematic bias Over or under representation of some population characteristics in a sample due to the method used to select the sample. A sample that is formed by a sampling bias is a biased sample.

05-Schutt 6e-45771:FM-Schutt5e(4853) (for student CD).qxd 09/29/200811:26 page 158

Unreviewed pages. It may not be sold, copied or redistributed. owned by SAGE

Chapter 5 Sampling—159

less than 2% with about the same level of confidence (Sudman 1976:184). In fact, the representativeness of the sample will probably not increase much until the sample proportion is slightly larger. Other things being equal, a sample of 1,000 from a population of 1 million (with a sampling fraction of 0.001, or 0.1%) is much better than a sample of 100 from a population of 10,000 (although the sampling fraction for this smaller sample is 0, 01, or 1%, which is ten times as large.) The sample size makes representativeness more likely, not the proportion of the total that the sample represents.
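The comparison between sample size and sampling fraction can be checked with a little arithmetic. The sketch below is not from the chapter: it assumes simple random sampling without replacement, uses the standard formula for the standard error of a proportion with a finite population correction, and takes p = 0.5 as an assumed worst case. Only the two sample designs (1,000 from 1 million and 100 from 10,000) come from the text.

```python
import math

def standard_error_of_proportion(p, n, population_size):
    """Standard error of a sample proportion under simple random sampling
    without replacement, including the finite population correction."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return math.sqrt(p * (1 - p) / n) * fpc

# p = 0.5 is an assumption (it gives the largest possible standard error).
se_large = standard_error_of_proportion(0.5, n=1_000, population_size=1_000_000)
se_small = standard_error_of_proportion(0.5, n=100, population_size=10_000)

print(f"n = 1,000 of 1,000,000 (fraction 0.1%): SE = {se_large:.4f}")
print(f"n = 100 of 10,000 (fraction 1%):        SE = {se_small:.4f}")
```

Under these assumptions the larger sample has roughly one third the standard error of the smaller one, even though its sampling fraction is ten times smaller. The correction term barely changes either figure, which is consistent with the point that the sampling fraction matters little while it remains small.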

Polls that predict presidential election outcomes illustrate both the value of random sampling and the problems that it cannot overcome. In most presidential elections, pollsters have predicted the actual vote accurately, using random sampling and, these days, telephone interviewing to learn which candidate likely voters intend to choose. Figure 5.3 shows how close these sample-based predictions have been in the last 13 contests. The exceptions were the 1980 and 1992 elections, when third-party candidates had an unpredicted effect. Otherwise, the small discrepancies between the votes predicted from random samples and the actual votes can be attributed to random error.

The Gallup poll did quite well in predicting the result of the hotly contested 2000 presidential election. Gallup's final prediction was that George W. Bush would win with 48% (Al Gore was predicted to receive only 46%, and Green Party candidate Ralph Nader 4%). Although the race turned out to be much closer, with Gore winning the popular vote (before losing in the electoral college), Gallup accurately noted that there appeared to have been a late-breaking trend in favor of Gore (Newport 2000). In 2004, Gallup's final prediction of 49% for Bush came within 2 percentage points of his winning total of 51% (actually 50.77%); the "error" is partly due to the 1% of the votes cast for third-party candidate Ralph Nader.

Figure 5.3. Presidential Election Outcomes: Predicted and Actual (chart not reproduced; the vertical axis runs from 0 to 80 and the horizontal axis is labeled Year)


Still, election polls have produced some major errors in prediction. The reasons for these errors illustrate some of the ways in which unintentional systematic bias can influence sample results. In 1936, a Literary Digest poll predicted that Alfred M. Landon would defeat President Franklin Delano Roosevelt in a landslide, but instead Roosevelt took 63% of the popular vote. The problem? The Digest mailed out 10 million mock ballots to people listed in telephone directories, automobile registration records, voter lists, and so on. But in 1936, in the middle of the Great Depression, only relatively wealthy people had telephones and cars, and they were more likely to be Republican. Furthermore, only 2,376,523 completed ballots were returned, and a response rate of only 24% leaves a lot of room for error. Of course, this poll was not designed as a random sample, so it is not surprising that systematic bias appeared. Gallup was able to predict the 1936 election results accurately with a random sample of just 3,000 (Bainbridge 1989:43–44).

In 1948, pollsters mistakenly predicted that Thomas E. Dewey would defeat Harry S. Truman, based on the sampling method that George Gallup had used successfully since 1934. The problem? Pollsters stopped collecting data several weeks before the election, and in those weeks many people changed their minds (Kenney 1987). The sample was systematically biased by underrepresenting shifts in voter sentiment just before the election.

The fast-paced 2008 presidential election was also a challenge for pollsters, especially among Democratic Party voters. In the New Hampshire primary, the first of the season, polls successfully predicted Republican John McCain's winning margin of 5.5% (the polls were off by an average of only 0.2%). However, all the polls predicted that Barack Obama would win the Democratic primary by about 8 percentage points, but he lost to Hillary Clinton by 12 points (47% to 35%). The president of the Pew Research Center, Andrew Kohut (2008:A27), concluded that the problem was that poorer, less educated white voters, who tend to refuse to take part in polls, also tend to be less favorable toward Blacks than other voters are. These voters, who were underrepresented in the polls, preferred Clinton over Obama.

Because they do not disproportionately exclude or include particular groups within the population, random samples that are successfully implemented avoid systematic bias in the selection process. However, when some types of people are more likely to refuse to participate in surveys or are less available for interviews, systematic bias can still creep into the sampling process. In addition, random error will still influence the specific results obtained from any random sample. The different types of random sampling vary in their ability to minimize random error. The four most common methods for drawing random samples are simple random sampling, systematic random sampling, stratified random sampling, and cluster sampling.

Simple random sampling

Simple random sampling requires a procedure that generates numbers or otherwise identifies cases strictly on the basis of chance. As you know, flipping a coin or rolling a die can be used to identify cases strictly by chance, but these procedures are not very efficient tools for drawing a sample. A random number table, like the one in Appendix E, simplifies the process considerably. The researcher numbers all the elements in the sampling frame and then uses a systematic procedure for picking the corresponding numbers from the random number table. (Exercise 1 at the end of this chapter explains the process step by step.) Alternatively, a researcher may use a lottery procedure: each case number is written on a small card, the cards are mixed up, and the sample is selected from the cards.

Simple random sampling: A method of sampling in which every sample element is selected only on the basis of chance, through a random process.

Random number table: A table containing lists of numbers that are ordered solely on the basis of chance; it is used for drawing a random sample.

When a large sample must be generated, these procedures are very cumbersome. Fortunately, a computer program can easily generate a random sample of any size. The researcher must first number all the elements to be sampled (the sampling frame) and then run the computer program to generate a random selection of numbers within the desired range. The elements represented by these numbers are the sample.

Organizations that conduct telephone surveys often draw random samples using another automated procedure, called random digit dialing. A machine dials random numbers within the telephone prefixes that correspond to the area in which the survey is to be conducted. Random digit dialing is particularly useful when a sampling frame is not available. The researcher simply replaces any inappropriate number (for example, one that is no longer in service or that belongs to a business) with the next randomly generated phone number.

The probability of selection in a true simple random sample is equal for each element. If a sample of 500 is selected from a population of 17,000 (that is, a sampling frame of 17,000), then the probability of selection for each element is 500/17,000, or about 0.03. Every element has an equal chance of being selected, just like the odds in a toss of a coin (1/2) or a roll of a die (1/6). Thus, simple random sampling is an "equal probability of selection method," or EPSEM.

Simple random sampling can be done either with or without replacement sampling. In replacement sampling, each element is returned to the sampling frame after it is selected so that it may be sampled again. In sampling without replacement, each element selected for the sample is then excluded from the sampling frame. In practice, it makes no difference whether sampled elements are replaced after selection, as long as the population is large and the sample is to contain only a small fraction of the population. In fact, replacement sampling is rarely used.

In a study involving simple random sampling, Bruce Link et al. (1996) used random digit dialing to contact adult household members in the continental United States for a study of public attitudes and beliefs about homelessness. Sixty-three percent of the potential interviewees responded. The sample actually obtained was not exactly comparable to the population sampled: compared with U.S. Census figures, the sample overrepresented women, people ages 25 to 54, married people, and those with more than one college degree; it underrepresented Latinos.

How does this sample strike you? Let's assess the quality of the sample using the questions posed earlier in the chapter:

• From what population were the cases selected? There is a clearly defined population: the adult residents of the continental United States (who live in households with telephones).

• What method was used to select cases from this population? The case selection method is a random selection procedure, and there are no systematic biases in the sampling.

• In general, do the cases studied represent the population from which they were selected? The findings will very likely represent the population sampled, because there was no bias in the sampling and a large number of cases were selected. However, 37% of those selected for interviews could not be contacted or chose not to respond. This rate of nonresponse seems to create a small bias in the sample for several characteristics.

Random digit dialing: The random dialing by a machine of numbers within designated telephone prefixes, which creates a random sample for telephone surveys.

Replacement sampling: A method of sampling in which sample elements are returned to the sampling frame after being selected, so that they may be sampled again. Random samples may be selected with or without replacement.

We must also consider the question of generalizability across populations: do the findings from this sample have implications for a larger group beyond the population from which the sample was drawn? Because a representative sample of the entire adult US population was drawn, this question involves generalizations across countries. Link and his colleagues do not make such generalizations. There is no telling what might happen in other countries with different histories of homelessness and social policies.
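The computer-generated selection described in the simple random sampling discussion can be sketched in a few lines. This is only an illustration using Python's standard `random` module, not the software any of the cited studies used; the frame size (17,000) and sample size (500) come from the text, while the seed and variable names are arbitrary choices.

```python
import random

frame_size = 17_000   # elements in the sampling frame (from the text)
sample_size = 500     # elements to select (from the text)

# Equal probability of selection (EPSEM): the same chance for every element.
selection_probability = sample_size / frame_size
print(f"Probability of selection: {selection_probability:.4f}")

rng = random.Random(42)            # fixed, arbitrary seed so the draw is repeatable
frame = range(1, frame_size + 1)   # elements numbered 1 through 17,000

# Sampling without replacement: a selected element cannot be drawn again.
without_replacement = rng.sample(frame, k=sample_size)

# Replacement sampling: each element goes back into the frame after selection.
with_replacement = rng.choices(frame, k=sample_size)

print(len(set(without_replacement)))  # number of distinct elements drawn
print(len(set(with_replacement)))     # may be smaller if an element repeats
```

Because the sample is a small fraction of the frame, repeats under replacement sampling are uncommon, which is why the choice between the two procedures rarely matters in practice.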

Systematic random sampling

Systematic random sampling is a variant of simple random sampling. The first element is selected randomly from a list or from sequential files, and then every nth element is selected. This is a convenient method for drawing a random sample when the population elements are arranged sequentially. It is particularly efficient when the elements are not actually printed (i.e., there is no sampling frame) but instead are represented by folders in filing cabinets. Systematic random sampling requires the following three steps:

1. The total number of cases in the population is divided by the number of cases required for the sample. This division yields the sampling interval, the number of cases from one sampled case to another. If 50 cases are to be selected out of 1,000, the sampling interval is 20; every 20th case is selected.

2. A number from 1 to 20 (or whatever the sampling interval is) is selected randomly. This number identifies the first case to be sampled, counting from the first case on the list or in the files.

3. After the first case is selected, every nth case is selected for the sample, where n is the sampling interval.

If the sampling interval is not a whole number, the size of the sampling interval is varied systematically to yield the proper number of cases for the sample. For example, if the sampling interval is 30.5, the sampling interval alternates between 30 and 31.

Systematic random sampling: A method of sampling in which sample elements are selected from a list or from sequential files.

Sampling interval: The number of cases from one sampled case to another in a systematic random sample.

Periodicity: A sequence of elements (in a list to be sampled) that varies in some regular, periodic pattern.

In almost all sampling situations, systematic random sampling yields what is essentially a simple random sample. The exception is a situation in which the sequence of elements is affected by periodicity, that is, the sequence varies in some regular, periodic pattern. For example, the houses in a new development with the same number of houses on each block (eight, for example) may be listed by block, starting with the house in the northwest corner of each block and continuing clockwise. If the sampling interval is 8, the same as the periodic pattern, all the cases selected will be in the same position (see Figure 5.4). But in reality, periodicity and the sampling interval are rarely the same.
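The three steps of systematic random sampling, the handling of a fractional interval, and the periodicity problem can all be seen in a short sketch. The function name and the fractional example sizes (305 cases, 10 selected) are hypothetical; the 1,000-case example and the 8-house blocks come from the text. Flooring a running fractional total is one simple way to make the gaps alternate, and it is an assumption of this sketch rather than a procedure the chapter prescribes.

```python
import math
import random

def systematic_sample(population_size, sample_size, rng):
    """Return 1-based list positions chosen by systematic random sampling."""
    interval = population_size / sample_size   # step 1: sampling interval
    start = rng.random() * interval            # step 2: random start in the first interval
    # Step 3: move ahead one interval at a time. Flooring the running total
    # makes the gaps alternate (30 and 31 when the interval is 30.5).
    positions = [math.floor(start + i * interval) + 1 for i in range(sample_size)]
    return [min(p, population_size) for p in positions]

rng = random.Random(7)  # arbitrary seed, only so the sketch is repeatable

whole = systematic_sample(1_000, 50, rng)      # interval of 20, as in the text
whole_gaps = {b - a for a, b in zip(whole, whole[1:])}

fractional = systematic_sample(305, 10, rng)   # hypothetical sizes, interval 30.5
fractional_gaps = {b - a for a, b in zip(fractional, fractional[1:])}

# Periodicity: 32 houses listed block by block, 8 houses per block.
houses = systematic_sample(32, 4, rng)         # interval 8 equals the block size
positions_in_block = {(h - 1) % 8 for h in houses}

print(whole_gaps, fractional_gaps, positions_in_block)
```

When the interval equals the block size, `positions_in_block` contains a single value: every sampled house sits in the same position on its block, which is exactly the bias that periodicity introduces.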

Stratified random sampling

Although all probability sampling methods use random sampling, some add steps to the sampling process to make sampling more efficient or easier. Stratified random sampling uses information known about the total population prior to sampling to make the sampling process more efficient. First, all elements in the population (i.e., in the sampling frame) are distinguished according to their value on some relevant characteristic. That characteristic forms the sampling strata. Next, elements are sampled randomly from within these strata. For example, race may be the basis for distinguishing individuals in some population of interest.

Figure 5.4. The Effect of Periodicity on Systematic Random Sampling (diagram not reproduced; it shows houses numbered 1 through 32). If the sampling interval for a study in this neighborhood is 8, then every element in the sample will be a house on the northwest corner, and thus the sample will be biased.


Individuals within each racial category are then sampled randomly. Of course, using this method requires having more information before sampling than is the case with simple random sampling. It must be possible to categorize each element in one and only one stratum, and the size of each stratum in the population must be known.

This method is more efficient than drawing a simple random sample because it ensures appropriate representation of elements across strata. Imagine that you plan to draw a sample of 500 from an ethnically diverse neighborhood. The neighborhood population is 15% Black, 10% Hispanic, 5% Asian, and 70% White. If you drew a simple random sample, you might end up with somewhat disproportionate numbers of each group. But if you created sampling strata based on race and ethnicity, you could randomly select cases from each stratum: 75 Blacks (15% of the sample), 50 Hispanics (10%), 25 Asians (5%), and 350 Whites (70%). By using proportionate stratified sampling, you would eliminate any possibility of sampling error in the sample's distribution of ethnicity. Each stratum would be represented exactly in proportion to its size in the population from which the sample was drawn (see Figure 5.5).

This is the strategy that Brenda Booth et al. (2002) used in a study of homeless adults in two Los Angeles County sites with large numbers of homeless people. Booth et al. (2002:432) randomly selected subjects at homeless shelters, at meal facilities, and from the literally homeless populations on the streets. Respondents were sampled in proportion to their numbers in the Midtown and West Side areas, as determined by a one-night count. They were also sampled in proportion to their distribution across three nested sampling strata: the population using shelter beds, the population using meal facilities, and the unsheltered population using neither.

In disproportionate stratified sampling, the proportion of each stratum that is included in the sample is intentionally varied from what it is in the population. In the case of the sample stratified by ethnicity, you might select equal numbers of cases from each racial or ethnic group: 125 Blacks (25% of the sample), 125 Hispanics (25%), 125 Asians (25%), and 125 Whites (25%). In this type of sample, the probability of selection of every case is known but unequal between strata. You know what the proportions are in the population, so you can easily adjust your combined sample statistics to reflect these true proportions. For example, if you want to combine the ethnic groups and estimate the average income of the total population, you would have to "weight" each case in the sample. The weight is a number you multiply by the value of each case, based on the stratum it is in. For example, you would multiply the incomes of all Blacks in the sample by 0.6 (75/125), the incomes of all Hispanics by 0.4 (50/125), and so on. Weighting in this way reduces the influence of the oversampled strata and increases the influence of the undersampled strata to what they would have been if pure probability sampling had been used.
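The allocation and weighting arithmetic in this example can be laid out explicitly. The sketch below uses the population percentages and sample size from the text; the stratum mean incomes are hypothetical values invented only to show how weighting changes a combined estimate.

```python
# Population composition (percent) and total sample size, from the example in the text.
population_percent = {"Black": 15, "Hispanic": 10, "Asian": 5, "White": 70}
total_sample = 500

# Proportionate stratified sample: each stratum's share matches the population.
proportionate = {g: total_sample * pct // 100 for g, pct in population_percent.items()}

# Disproportionate stratified sample: the same number of cases from every stratum.
equal_n = total_sample // len(population_percent)
disproportionate = {g: equal_n for g in population_percent}

# Weight = cases a stratum would have had proportionately / cases it actually has.
weights = {g: proportionate[g] / disproportionate[g] for g in population_percent}

# Hypothetical stratum mean incomes, invented only to illustrate the arithmetic.
mean_income = {"Black": 40_000, "Hispanic": 42_000, "Asian": 60_000, "White": 55_000}

unweighted_mean = (
    sum(disproportionate[g] * mean_income[g] for g in mean_income) / total_sample
)
weighted_cases = {g: weights[g] * disproportionate[g] for g in mean_income}
weighted_mean = (
    sum(weighted_cases[g] * mean_income[g] for g in mean_income)
    / sum(weighted_cases.values())
)

print(proportionate)
print(weights)
print(round(unweighted_mean), round(weighted_mean))
```

The unweighted mean treats the four groups as equally large, whereas the weighted mean restores each group to its population share, so the two estimates differ whenever the strata have different average incomes.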

Stratified random sampling: A method of sampling in which sample elements are selected separately from population strata that are identified in advance by the researcher.

Proportionate stratified sampling: A method of sampling in which elements are selected from strata in exact proportion to their representation in the population.

Disproportionate stratified sampling: A method of sampling in which elements are selected from strata in proportions different from those that appear in the population.


Booth et al. (2002:432) included one element of disproportionate random sampling in their otherwise proportionate random sampling strategy for the homeless in Los Angeles: they oversampled women so that they made up 26% of the sample, compared with their actual percentage of 16% in the homeless population. Why would anyone select a sample that is so unrepresentative in the first place? The most common reason is to ensure that cases from smaller strata are included in the sample in sufficient numbers to allow separate statistical estimates and to facilitate comparisons between strata. Remember that one of the determinants of sample quality is sample size. The same is true for subgroups within samples. If a key concern in a research project is to describe and compare the incomes of people from different racial and ethnic groups, then it is important that the researchers base the mean income of each group on enough cases to be a valid representation. If few members of a particular minority group are in the population, they need to be oversampled. Such disproportionate sampling may also result in a more efficient sampling design if the costs of data collection differ markedly between strata or if the variability (heterogeneity) of the strata differs.

Figure 5.5. Stratified Random Sampling (diagram not reproduced). Population: all residents of Community X, N = 10,000, of whom 7,000 are White. Proportionate stratified sample: random selection of 1 in 20 from each stratum. Disproportionate stratified sample: random selection of 1 in 56 from the White stratum, 1 in 8 from the Hispanic stratum, 1 in 12 from the Black stratum, and 1 in 4 from the Asian stratum, which yields 125 White cases.

Weighting is also sometimes used to reduce the lack of representativeness of a sample that occurs because of nonresponse. If the researcher finds that the obtained sample does not represent the population with respect to some known characteristic, such as gender or education, the researcher may weight the cases in the sample so that the sample has the same proportions of men and women, or of high school graduates and college graduates, as the complete population (see Figure 5.6). Note, however, that this procedure does not solve the problems caused by an unrepresentative sample, because you still do not know what the sample composition should have been with respect to the other variables in your study; all you have done is reduce the sample's unrepresentativeness in terms of the variables used in weighting. This, in turn, may make it more likely that the sample is representative of the population with respect to other characteristics, but you cannot be sure.
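Weighting for nonresponse follows the same logic as weighting a disproportionate stratified sample: each case is multiplied by the ratio of its group's population share to its share of the obtained sample. The sketch below uses entirely hypothetical numbers (an obtained sample of 60 women and 40 men from a population assumed to be half women and half men), and the function name is illustrative only.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = population share / obtained sample share."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}

# Hypothetical obtained sample in which women responded more often than men.
sample_groups = ["woman"] * 60 + ["man"] * 40
population_shares = {"woman": 0.5, "man": 0.5}   # assumed known, e.g., from a census

weights = poststratification_weights(sample_groups, population_shares)

# After weighting, each group's share of the weighted sample matches the population.
counts = Counter(sample_groups)
weighted_total = sum(weights[g] * counts[g] for g in counts)
weighted_shares = {g: weights[g] * counts[g] / weighted_total for g in counts}

print(weights)
print(weighted_shares)
```

As the text cautions, this adjustment only aligns the sample with the population on the weighting variable itself. It cannot show whether respondents and nonrespondents differ on the other variables in the study.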

Figure 5.6. Weighting an Obtained Sample to Match a Population Proportion (chart not reproduced; the surviving labels are the percentages 21%, 17%, and 62%, shown twice, and the panel title "The population")


Cluster sampling

Cluster sampling is useful when a sampling frame of elements is not available, as is often the case for large populations spread out across a wide geographic area or among many different organizations. A cluster is a naturally occurring, mixed aggregate of elements of the population, with each element appearing in one and only one cluster. Schools could serve as clusters for sampling students, blocks could serve as clusters for sampling city residents, counties could serve as clusters for sampling the general population, and businesses could serve as clusters for sampling employees.

Drawing a cluster sample is at least a two-stage procedure. First, the researcher draws a random sample of clusters. A list of clusters should be much easier to obtain than a list of all the individuals in each cluster in the population. Next, the researcher draws a random sample of elements within each selected cluster. Because only a fraction of the total clusters are involved, obtaining the sampling frame at this stage should be much easier.

For example, in a cluster sample of city residents, blocks could be the first-stage clusters. A research assistant could walk around each selected block and record the addresses of all occupied dwelling units. Or, in a cluster sample of students, a researcher could contact the schools selected in the first stage and arrange with the registrar to obtain lists of students at each school. Cluster samples often involve multiple stages (see Figure 5.7), with clusters within clusters, as when a national sample of individuals might involve first sampling states, then geographic units within those states, then dwellings within those units, and finally individuals within the dwellings. In multistage cluster sampling, the clusters at the first stage of sampling are termed the primary sampling units (Levy & Lemeshow 1999:228).

Cluster sampling: Sampling in which elements are selected in two or more stages, with the first stage being the random selection of naturally occurring clusters and the last stage being the random selection of elements within clusters.

Cluster: A naturally occurring, mixed aggregate of elements of the population.

Figure 5.7. Multistage Cluster Sampling (diagram not reproduced; the surviving labels are "Stage 1: Random...", "cities and counties within those states", "Stage 3: Randomly select...", and "each school")

How many clusters should be selected, and how many individuals within each cluster should be selected? As a general rule, the sample will be more similar to the entire population if the researcher selects as many clusters as possible, even though this will mean selecting fewer individuals within each cluster. Unfortunately, this strategy also maximizes the cost of the sample for studies using in-person interviews. The more clusters a researcher selects, the more time and money will have to be spent traveling to the different clusters to reach the individuals for interviews.

The calculation of how many clusters to sample and how many individuals to sample within the clusters is also affected by the degree of similarity of individuals within clusters: the more similar the individuals are within the clusters, the fewer individuals are needed to represent each cluster. So when you set out to draw a cluster sample, be sure to consider how similar individuals are within the clusters as well as how many clusters you can afford to include in your sample.

Cluster sampling is a very popular method among survey researchers, but it has one general drawback: sampling error is greater in a cluster sample than in a simple random sample, because there are two steps involving random selection rather than just one. This sampling error increases as the number of clusters decreases, and it decreases as the homogeneity of cases per cluster increases. In sum, it is better to include as many clusters as possible in a sample, and a cluster sample is more likely to be representative of the population when the cases within the clusters are relatively similar.
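The two-stage procedure described above can be sketched as follows. The school and student identifiers, the cluster counts, and the function name are all hypothetical; the sketch only illustrates the trade-off between sampling many clusters with few members each and sampling few clusters with many members each, for the same total sample size.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Stage 1: randomly select clusters. Stage 2: randomly select members
    within each selected cluster. `clusters` maps a cluster name to its members."""
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))
        sample[name] = rng.sample(members, k=k)
    return sample

rng = random.Random(2024)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical population: 30 schools with 200 students each.
schools = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(200)] for i in range(30)
}

# Two designs with the same total sample size of 100 students.
many_clusters = two_stage_cluster_sample(schools, n_clusters=20, n_per_cluster=5, rng=rng)
few_clusters = two_stage_cluster_sample(schools, n_clusters=5, n_per_cluster=20, rng=rng)

print(len(many_clusters), sum(len(v) for v in many_clusters.values()))
print(len(few_clusters), sum(len(v) for v in few_clusters.values()))
```

Note that only a list of schools is needed at the first stage, and student lists are needed only for the schools actually selected. The first design spreads the 100 interviews over 20 schools, which the text suggests is more representative but more costly to field than concentrating them in 5 schools.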

Probability sampling methods compared

Can you now see why researchers often prefer to draw a stratified random sample or a cluster sample rather than a simple random sample? Figure 5.8 should help you remember the key features of these different types of samples and determine when each is most appropriate.

Figure 5.8. Features of Probability Sampling Methods

Feature                                  Simple   Systematic   Stratified   Cluster
Unbiased selection of cases              Yes      Yes          Yes          Yes
Requires a sampling frame                Yes      No           Yes          No
Ensures representation of key strata     No       No           Yes          No
Uses natural grouping of cases           No       No           No           Yes
Reduces sampling costs                   No       No           No           Yes

Many professionally designed surveys use combinations of cluster and stratified probability sampling methods. For example, Peter Rossi (1989) drew a disproportionate stratified cluster sample of shelter users for a study of homelessness in Chicago (see Figure 5.9). The shelter sample was stratified by size, with smaller shelters having a smaller likelihood of selection than larger shelters. In fact, the largest shelters were all selected; they had a probability of selection of 1.0. Within the selected shelters, shelter users were then sampled using a systematic random selection procedure (except in the small shelters, in which all persons were interviewed). Homeless persons living on the streets were also sampled randomly. In the first stage, city blocks were classified into strata based on the likely concentration of homeless persons (estimated by several informed groups). Blocks were then selected randomly within these strata, and on the night of the survey, between 1 a.m. and 6 a.m., teams of interviewers screened each person found outside on that block for homelessness. Persons identified as homeless were then interviewed (and given $5 for their time). The response rate for two different samples (fall and winter) in the shelters and on the streets was between 73% and 83%.

How would we evaluate the Chicago homeless sample, using the sample evaluation questions?

• From what population were the cases selected? The population was clearly defined for each cluster.

• What method was used to select cases from this population? The random selection process has been carefully described.

• In general, do the cases studied represent the population from which they were selected? The unbiased selection procedures give us reasonable confidence in the representativeness of the sample, although we know little about the nonrespondents and therefore may justifiably worry that some types of homeless persons were missed.

A generalization across populations seems reasonable for this sample, since the results seem likely to reflect general processes involving homelessness. Rossi (1989) clearly has this in mind, since the title of his book refers to the homeless in the United States, not just Chicago.

Nonprobability sampling methods

Nonprobability sampling methods are often used in qualitative research; they are also used in quantitative studies when researchers are unable to use probability selection methods. Qualitative research that focuses on one setting or on a very small sample allows a more intensive portrait of activities and actors, but it also limits field researchers' ability to generalize and reduces the confidence that others can place in these generalizations. The use of nonprobability sampling methods in quantitative research too often reflects a lack of concern with generalizability or a lack of understanding of the importance of probability-based sampling.

Figure 5.9. Chicago Shelter Universe and Shelter Samples, Fall and Winter Surveys

A. Shelter universe and samples

                                   Fall     Winter
Eligible shelters in universe        28         45
Universe bed capacities           1,573      2,001
Shelters drawn in sample             22         27

B. Details of winter shelter sample

Shelter size classification   Number in universe   Number in sample   Resident sampling ratio
Large (37 beds or more)                       17                 17                      0.25
Medium (18–33 beds)                           12                  6                      0.50
Small (less than 18 beds)                     16                  4                      1.00

Note: Shelters were drawn with probabilities proportionate to size, and residents within shelters were sampled disproportionately to form a self-weighting sample. The sampling ratios for the Phase 2 sample are given in panel B.

There are four common nonprobability sampling methods: availability sampling, quota sampling, purposive sampling, and snowball sampling. Because these methods do not use a random selection procedure, we cannot expect a sample selected with any of them to yield a representative sample. They should not be used in quantitative studies if a probability-based method is feasible. Nonetheless, these methods are useful when random sampling is not possible, when a research question calls for an intensive investigation of a small population, or when a researcher is performing a preliminary, exploratory study.

Availability sampling

Items are selected for availability sampling because they are available again or easy to find. Therefore, this sampling method is also known as arbitrary, random, or convenience sampling.

Availability sampling: Sampling in which elements are selected on the basis of convenience.

There are many ways to select elements for an availability sample: standing on street corners and talking to whoever walks by, asking questions of employees who have time to talk when they pick up their paychecks at a personnel office, or approaching particular individuals at opportune times while observing activities in a social setting. You may find yourself interviewing available students at campus hangouts as part of a course assignment. To study sexual risk-taking among homeless youth in Minneapolis, Linda Halcon and Alan Lifson (2004:73) hired very experienced street outreach workers who approached youth known or suspected to be homeless and asked whether they would be willing to take part in a 20- to 30-minute interview. The interviewers then conducted the 44-question interview, after which they gave respondents risk reduction and referral information and a $20 voucher.

A participant observation study of a group may require no more sophisticated approach. When Philippe Bourgois, Mark Lettiere, and James Quesada (1997) studied homeless heroin addicts in San Francisco, they immersed themselves in a community of addicts living in a public park. These addicts became the availability sample.

An availability sample is often appropriate in social research, for example, when a field researcher is exploring a new setting and trying to get a sense of prevailing attitudes, or when a survey researcher is pretesting a new set of questions.

Now I would like you to use the sample evaluation questions to evaluate person-on-the-street interviews with homeless people. If your answers are something like "The population was unknown," "The method used to select the cases was haphazard," and "The cases studied do not represent the population," you're right! There is no clearly definable population from which the respondents were drawn, and no systematic technique was used to select them. It is certainly not very likely that the respondents represent the distribution of sentiment among homeless people in the Boston area, or among welfare mothers, or among impoverished rural migrants, or whatever we imagine the relevant population to be.

Similarly, person-on-the-street comments to news reporters may suggest something about what homeless people think, or perhaps not; we really can't be sure. But let's give reporters their due: if they just want a few quotes to make their story more appealing, there is nothing wrong with their sampling method. However, their approach gives us no basis for thinking that we have an overview of community sentiment. The people who happen to be available in any situation are unlikely to be just like those who are unavailable. We cannot be at all certain that what we learn can be generalized with any confidence to a larger population.

Availability sampling often masquerades as a more rigorous form of research. Popular magazines regularly survey their readers by printing a questionnaire for readers to fill out and mail in. A follow-up article then appears in the magazine under a headline such as "What You Think About Intimacy in Marriage." If the magazine's circulation is large, a large sample can be obtained this way. The problem is that usually only a tiny fraction of readers return the questionnaire, and these respondents are probably unlike other readers who did not have the interest or time to participate. So the survey is based on an availability sample. Although the follow-up article may be interesting, we have no basis for thinking that the results describe the readership as a whole, let alone the population at large.

Do you see now why availability sampling differs so much from random sampling methods, which require that "nothing but chance" affect the actual selection of cases? What makes availability sampling "haphazard" is precisely that many things other than chance can affect the selection of cases, from the biases of the research staff to the work schedules of potential respondents. To truly leave the selection of cases to chance, we have to design the selection process very carefully so that other factors play no role. There is nothing "haphazard" about selecting cases randomly.
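
The magazine survey problem can be made concrete with a small simulation. The sketch below is illustrative only: the opinion scores, the return probabilities, and the function name `simulate_reader_survey` are hypothetical assumptions, not figures from any actual survey. It assumes that readers with stronger opinions are more likely to mail the questionnaire back, and it compares the resulting self-selected sample with a simple random sample drawn from the same readership.

```python
import random


def simulate_reader_survey(n_readers=100_000, random_sample_size=1_000, seed=0):
    """Compare a self-selected (availability) sample with a random sample."""
    rng = random.Random(seed)

    # Hypothetical opinion score for every reader, from 0 (weak) to 10 (strong).
    readers = [rng.uniform(0, 10) for _ in range(n_readers)]

    # Self-selection: the chance of returning the questionnaire rises with the
    # score, from 0% to 10% (an assumed pattern, chosen only for illustration).
    returned = [score for score in readers if rng.random() < 0.01 * score]

    # Random selection: nothing but chance decides who is included.
    drawn = rng.sample(readers, random_sample_size)

    return {
        "readership_mean": sum(readers) / len(readers),
        "self_selected_mean": sum(returned) / len(returned),
        "random_sample_mean": sum(drawn) / len(drawn),
        "questionnaires_returned": len(returned),
    }


if __name__ == "__main__":
    for name, value in simulate_reader_survey().items():
        print(f"{name}: {value:.2f}")
```

Under these assumed return rates, the self-selected mean lands well above the readership mean, while the random sample's mean stays close to it, even though the self-selected sample contains more cases. Sample size alone does not compensate for a selection process that something other than chance has shaped.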

Quota Sampling

Quota sampling is intended to overcome the most obvious flaw of availability sampling: that the sample consists only of whoever or whatever is available, without regard to its similarity to the population of interest. The distinguishing feature of a quota sample is that quotas are set to ensure that the sample represents certain characteristics in proportion to their prevalence in the population.

Quota sampling: A non-probability sampling method in which elements are selected to ensure that the sample represents certain characteristics in proportion to their prevalence in the population.

Suppose that you want to sample adult residents of a city for a study of support for a tax increase to improve the city's schools. From the city's annual report, you learn the proportions of city residents by gender, race, age, and number of children. You think that each of these characteristics could affect support for new school taxes, so you want to make sure that your sample includes men, women, whites, blacks, Hispanics, Asians, older people, younger people, large families, small families, and childless families in proportion to their numbers in the city's population.

This is where quotas come in. Suppose that 48% of the city's adult residents are men and 52% are women, and that 60% are employed, 5% are unemployed, and 35% are out of the labor force. These percentages, and those corresponding to the other characteristics, become the quotas for the sample. If you plan to include a total of 500 residents in your sample, 240 must be men (48% of 500), 260 must be women, 300 must be employed, and so on. You can even set more refined quotas, such as a certain number of employed women, employed men, unemployed men, and so on. With the quota list in hand, you (or your research staff) can now go out into the community and look for the right number of people in each quota category. You can go door to door, go bar to bar, or just stand on a street corner until you have interviewed 240 men, 260 women, and so on.

The problem is that even when we know that a quota sample is representative of the particular characteristics for which quotas have been set, we have no way of knowing whether the sample is representative in terms of any other characteristics. In Exhibit 5.10, for example, quotas have been set for gender only. Under these circumstances, it is no surprise that the sample is representative of the population only in terms of gender, not in terms of race. Interviewers are only human; they may avoid potential respondents with menacing dogs in the front yard, or they may seek out respondents who are physically attractive or who look easy to interview. Realistically, researchers can set quotas for only a small fraction of the characteristics relevant to a study, so a quota sample is really not much better than an availability sample (although following careful, consistent procedures for selecting cases within the quota limits always helps).

This last point leads me to another limitation of quota sampling: you must know the characteristics of the entire population to set the right quotas. In most cases, researchers know what the population looks like in terms of only a few of the characteristics relevant to their concerns, and in some cases they have no such information about the entire population. If you are now skeptical of quota sampling, you have understood my comments.
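
The arithmetic of the 500-resident example, and the way a quota can be met while another characteristic goes unchecked, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `quota_targets` and `fill_quotas`, the simulated street corner, and its 90% employment rate are all hypothetical and are not taken from any study.

```python
import random


def quota_targets(sample_size, shares):
    """Convert population shares (fractions) into quota counts.

    Rounding can leave the counts a case or two off the intended total,
    so the total should be checked when shares do not divide evenly.
    """
    return {category: round(sample_size * share) for category, share in shares.items()}


def fill_quotas(passersby, targets, characteristic):
    """Interview people in the order encountered until every quota is full."""
    counts = {category: 0 for category in targets}
    sample = []
    for person in passersby:
        category = person[characteristic]
        # Accept the person only if their quota category still has room.
        if counts.get(category, 0) < targets.get(category, 0):
            counts[category] += 1
            sample.append(person)
        if counts == targets:
            break
    return sample


if __name__ == "__main__":
    gender_quotas = quota_targets(500, {"men": 0.48, "women": 0.52})
    print(gender_quotas)  # {'men': 240, 'women': 260}

    rng = random.Random(0)
    # Hypothetical street corner near offices: 90% of passersby are employed,
    # although only 60% of the city's adults are.
    passersby = [
        {"gender": rng.choice(["men", "women"]), "employed": rng.random() < 0.90}
        for _ in range(5_000)
    ]
    sample = fill_quotas(passersby, gender_quotas, "gender")
    employed_share = sum(person["employed"] for person in sample) / len(sample)
    print(len(sample), round(employed_share, 2))
```

The gender quotas are met exactly, yet the share of employed respondents simply mirrors whoever happened to walk past that corner. Only the characteristics with quotas are controlled; everything else is left to availability.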

Nevertheless, in some situations, setting quotas can add rigor to sampling procedures. It is almost always better to maximize the possibilities for comparison in research, and quota sampling methods can help qualitative researchers do this. For example, Doug Timmer, Stanley Eitzen, and Kathryn Talley (1993:7) interviewed homeless people in several cities and other locations for their book on the root causes of homelessness. People who were available were interviewed, but the researchers took care to generate a diverse sample. They interviewed 20 homeless men who lived on the streets without shelter and 20 mothers who were living in family shelters. About half of the people the researchers selected for the street sample were black, and about half were white. Although the researchers did not use quotas to try to match the distribution of characteristics in the total homeless population, their informal quotas helped ensure some diversity in key characteristics.

Exhibit 5.10 Quota Sampling

Quota sample: 50% male, 50% female; 50% white, 50% black. Representative of the gender distribution in the population, not representative of the racial distribution.

Does quota sampling remind you of stratified sampling? It is easy to see why: both select sample members based, in part, on one or more key characteristics. Exhibit 5.11 summarizes the differences between quota sampling and stratified random sampling. The key difference, of course, is quota sampling's lack of random selection.

Purposive Sampling

In purposive sampling, each sample element is selected for a purpose, usually because of the unique position of the sample elements. Purposive sampling may involve studying the entire population of some limited group (directors of shelters for homeless adults) or a subset of a population (mid-level managers with a reputation for efficiency). Or a purposive sample may be a "key informant survey," which targets individuals who are particularly knowledgeable about the issues under investigation.

Herbert Rubin and Irene Rubin (1995) suggest three guidelines for selecting informants when designing a purposive sampling strategy. Informants should be

• "knowledgeable about the cultural field, situation or experience under study,"
• "open to discussion," and
• "representing a variety of points of view." (p. 66)

In addition, Rubin and Rubin (1995) suggest continuing to select interviewees until you can pass two tests:

• Completeness: "What you hear gives a general idea of the importance of a concept, theme or process" (p. 72).

• Saturation: "Confidence is gained that you will learn little new from subsequent interviews" (p. 73).
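
The saturation test describes a judgment, not a formula, but its logic resembles a stopping rule. The sketch below shows one hypothetical way to operationalize it: stop once a set number of consecutive interviews adds no new themes. The function name `interviews_until_saturation`, the `patience` threshold, and the coded themes are assumptions introduced here for illustration; they are not part of Rubin and Rubin's guidelines.

```python
def interviews_until_saturation(interviews, patience=3):
    """Return (number of interviews conducted, themes heard) at saturation.

    `interviews` is a sequence of theme collections, one per interview, in the
    order conducted. Saturation is declared after `patience` consecutive
    interviews that add nothing new. Returns (None, themes) if the interviews
    run out before that happens.
    """
    themes_heard = set()
    quiet_run = 0  # consecutive interviews with no new themes
    for count, themes in enumerate(interviews, start=1):
        new_themes = set(themes) - themes_heard
        themes_heard |= new_themes
        quiet_run = 0 if new_themes else quiet_run + 1
        if quiet_run >= patience:
            return count, themes_heard
    return None, themes_heard


if __name__ == "__main__":
    # Hypothetical coded themes from six interviews.
    coded = [
        {"cost", "safety"},
        {"safety", "family"},
        {"cost"},
        {"family"},
        {"safety"},
        {"jobs"},
    ]
    print(interviews_until_saturation(coded, patience=3))
```

With these made-up interviews, the rule stops after the fifth interview, and the sixth would have introduced a new theme. A mechanical rule can therefore stop too early, which is why the test is stated in terms of the researcher's confidence.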

Exhibit 5.11 Comparison of Stratified and Quota Sampling Methods

Feature                                   Stratified   Quota
Unbiased (random) selection of cases      Yes          No
Sampling frame required                   Yes          No
Ensures representation of key strata      Yes          Yes

Purposive sampling: A non-probability sampling method in which elements are selected for a purpose, usually because of their unique position.

Adhering to these guidelines helps ensure that a purposive sample is adequately representative of the setting or issues being studied. Of course, purposive sampling does not produce a sample that represents some larger population, but it can be exactly what is needed in a case study of an organization, community, or other clearly defined and relatively limited group. In an intensive organizational case study, a purposive sample of organizational leaders could be supplemented with a probability sample of members of the organizations that had contact with homeless people in each of the counties studied.

Snowball Sampling

Snowball sampling is useful for hard-to-reach or hard-to-identify populations for which there is no sampling frame, but whose members are somewhat interconnected (at least some members of the population know each other). It can be used to sample members of groups such as drug dealers, prostitutes, practicing criminals, participants in Alcoholics Anonymous groups, gang leaders, informal organizational leaders, and homeless people. It can also be used to chart the relationships among members of a group (a sociometric study), to explore the population of interest before developing a formal sampling plan, and to develop what becomes a census of informal leaders of small organizations or communities. However, researchers using snowball sampling normally cannot be confident that their sample represents the total population of interest, so generalizations must be tentative.

Rob Rosenthal (1994) used snowball sampling to study homeless people living in Santa Barbara, California:

I began this process by attending a meeting of homeless people that I had heard about through my contacts with housing advocates. . . . One homeless woman . . . invited me to . . . where she promised to introduce me around. Thus began a process of snowballing. I gained entry to a group through people I knew, came to know others, and through them gained entry to new circles. (pp. 178, 180)

A problem with this technique is that the initial contacts can shape the entire sample and prevent access to some members of the population of interest:

I sat around the Tree with [my contact]. Other people come by and are friendly, but some regulars, especially the tougher men, don't sit with her. Am I making a mistake by tying myself too closely to her? She lectures them a lot. (Rosenthal 1994:181)
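
The way initial contacts shape a snowball sample can be sketched as a walk through a network of acquaintances. The code below is a minimal illustration, assuming every sampled person refers everyone they know; the function name `snowball_sample`, the names, and the two social circles are hypothetical.

```python
from collections import deque


def snowball_sample(contacts, seeds, max_size):
    """Recruit wave by wave: each sampled person refers the people they know.

    `contacts` maps each person to the list of people they can introduce.
    """
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for acquaintance in contacts.get(person, []):
            if acquaintance not in seen:  # never recruit the same person twice
                seen.add(acquaintance)
                queue.append(acquaintance)
    return sampled


# Hypothetical network with two circles that share no acquaintances.
EXAMPLE_CONTACTS = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana", "Dee"],
    "Cy": ["Ana"],
    "Dee": ["Ben"],
    "Eli": ["Flo"],
    "Flo": ["Eli"],
}

if __name__ == "__main__":
    print(snowball_sample(EXAMPLE_CONTACTS, seeds=["Ana"], max_size=10))
    # ['Ana', 'Ben', 'Cy', 'Dee']
```

Starting from Ana, the sample never reaches Eli or Flo, no matter how many waves are run. Whoever provides the first introductions determines which circles the researcher can enter.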

Snowball sampling: A method of sampling in which sample elements are selected as they are identified by successive informants or interviewees.

More systematic versions of snowball sampling can reduce the potential for bias. For example, respondent-driven sampling gives respondents financial incentives to recruit peers (Heckathorn 1997). Limits on the number of incentives that any one respondent can receive increase the diversity of the sample. Targeted incentives can steer the sample to include specific subgroups. When the sampling is repeated through several waves, with new respondents bringing in more peers, the composition of the sample converges on a more representative mix of characteristics than would occur with uncontrolled snowball sampling. Exhibit 5.12 shows how the sample spreads out through successive recruitment waves to an increasingly diverse pool (Heckathorn 1997:178). Exhibit 5.13 shows that even if the starting point were all white people, respondent-driven sampling would result in an appropriate ethnic mix from an ethnically diverse population (Heckathorn 2002:17).
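
The convergence across recruitment waves can be illustrated by treating recruitment as a simple chain in which the makeup of each wave depends only on the makeup of the wave before it. This is a sketch under stated assumptions: the two groups, the recruitment probabilities, and the function names are hypothetical, and the model is far simpler than an actual respondent-driven sampling design.

```python
def next_wave(composition, recruit_probs):
    """Expected makeup of the next wave, given who recruits whom.

    `composition[g]` is the share of the current wave in group g.
    `recruit_probs[r][g]` is the chance that a recruiter from group r
    brings in someone from group g (each row sums to 1).
    """
    groups = list(composition)
    return {
        g: sum(composition[r] * recruit_probs[r][g] for r in groups)
        for g in groups
    }


def wave_compositions(seed_composition, recruit_probs, n_waves):
    """Makeup of the seeds followed by each successive recruitment wave."""
    waves = [seed_composition]
    for _ in range(n_waves):
        waves.append(next_wave(waves[-1], recruit_probs))
    return waves


if __name__ == "__main__":
    # Hypothetical recruitment pattern between two groups.
    probs = {
        "white": {"white": 0.7, "nonwhite": 0.3},
        "nonwhite": {"white": 0.2, "nonwhite": 0.8},
    }
    all_white_seeds = {"white": 1.0, "nonwhite": 0.0}
    for wave, mix in enumerate(wave_compositions(all_white_seeds, probs, 6)):
        print(wave, {group: round(share, 3) for group, share in mix.items()})
```

With these assumed probabilities, the share of white respondents falls from 100% toward 40% within a few waves, and the same mix is reached if the seeds are all nonwhite. Note that the equilibrium reflects the assumed recruitment pattern; whether it matches the true population proportions depends on further conditions that this sketch does not model.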

Lessons About Sample Quality

Some lessons are implicit in my evaluations of the samples in this chapter:

• We cannot evaluate the quality of a sample if we do not know what population it was supposed to represent. If the population is unspecified because the researchers were never clear about what population they were trying to sample, then we can safely conclude that the sample itself is not a good one.

Exhibit 5.12 Respondent-Driven Sampling

Successive waves of sampling gradually produce a more representative sample than is typical of snowball sampling.

Instructions to respondents: "We will pay you $5 each for up to three names, but only one of those names can be someone from your own city. The others have to come from elsewhere."

• We cannot evaluate the quality of a sample if we do not know how the cases in the sample were selected from the population. If the method was specified, we then need to know whether the cases were selected systematically and on the basis of chance. In any case, we know that a haphazard method of sampling (as in person-on-the-street interviews) undermines generalizability.

• Sample quality is determined by the sample actually obtained, not just by the sampling method itself. If many of the people selected for our sample are nonrespondents, that is, people (or other entities) who do not participate in the study even though they were selected for the sample, the quality of our sample is undermined, even if we chose the sample in the best possible way.

• We need to be aware that even researchers who obtain very good samples may talk about the implications of their findings for a group that is larger than, or simply different from, the population they actually sampled. For example, findings from a representative sample of students at one university are often discussed as if they tell us about university students in general. And maybe they do; we just don't know for sure.

• A sample that allows comparisons involving theoretically important variables is better than one that does not allow such comparisons. Even when we study people or social processes in depth, it is best to select individuals or settings with an eye to how useful they will be for examining relationships. If we limit an investigation to just one setting or just one type of person, we will inevitably wonder what it is that makes a difference.

Exhibit 5.13 Convergence of the interview sample on the true ethnic proportions in the population, after starting with whites only. (Horizontal axis: recruitment wave; vertical axis: scale from 0 to 1.)

Generalizability in Qualitative Research

Qualitative research often focuses on populations that are hard to locate or very limited in size. As a result, non-probability sampling methods such as availability sampling and snowball sampling are often used. Janet Ward Schofield (2002) suggests ways of increasing the generalizability of the samples obtained in such situations:

Studying the typical. Choosing sites on the basis of their fit with a typical situation is far preferable to choosing on the basis of convenience. (p. 181)

Performing multisite studies. A finding that emerges repeatedly in the study of numerous sites would appear more likely to be a good working hypothesis about some as yet unstudied site than a finding that emerges from just one or two sites. . . . Generally speaking, a finding emerging from the study of several very heterogeneous sites would be more . . . likely to be useful in understanding various other sites than one emerging from the study of several very similar sites. (p. 184)

Some qualitative researchers are so concerned with understanding the particulars of a situation as an important subject of study in its own right that they question the value of generalizability as most researchers understand it. In the words of sociologist Norman Denzin:

The interpretivist rejects generalization as a goal and never aims to draw randomly selected samples of human experience. . . . Every instance of social interaction . . . represents a slice from the life world that is the proper subject matter for interpretive inquiry. (Denzin cited in Schofield 2002:173)

SAMPLING DISTRIBUTIONS

A well-designed probability sample is likely to be representative of the population from which it was drawn. But as you have seen, random samples are still subject to sampling error due to chance. To deal with this problem, social researchers take into account the properties of a sampling distribution, a hypothetical distribution of a statistic across all the random samples that could be drawn from a population. Any single random sample can be thought of as just one of an infinite number of random samples that, in theory, could have been selected from the population. If we had the finances of Gatsby and the patience of Job and could draw an infinite number of samples, and if we computed the same type of statistic for each of those samples, we would then have a sampling distribution. Understanding sampling distributions is the foundation for understanding how statisticians can estimate sampling error.
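
Without Gatsby's finances, a computer can approximate the idea by drawing a few thousand random samples instead of an infinite number. The sketch below is an illustration only: the population of the numbers 1 through 1,000, the sample size of 100, and the function name `sampling_distribution_of_mean` are assumptions chosen for convenience.

```python
import random
import statistics


def sampling_distribution_of_mean(population, sample_size, n_samples, seed=0):
    """Draw many simple random samples and record the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    # Hypothetical population: the values 1 through 1,000 (mean 500.5).
    population = list(range(1, 1001))
    means = sampling_distribution_of_mean(population, sample_size=100, n_samples=2000)

    print("population mean:", statistics.mean(population))
    print("mean of the sample means:", round(statistics.mean(means), 1))
    print("spread (standard deviation) of the sample means:",
          round(statistics.stdev(means), 1))
```

Each individual sample mean misses the population mean by some amount, but the sample means cluster around it, and their spread is far smaller than the spread of the population itself. That spread is what an estimate of sampling error describes.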