POPULATIONS AND SAMPLING 
Populations 

Definition: a complete set of elements (persons or objects) that possess some common characteristic defined by the sampling criteria established by the researcher

Composed of two groups: target population & accessible population


Target population (universe) 

The entire group of people or objects to which the researcher wishes to generalize the study findings 

Meets a set of criteria of interest to the researcher

Examples 

All institutionalized elderly with Alzheimer's 

All people with AIDS 

All low birth weight infants 

All school-age children with asthma

All pregnant teens 

Accessible population 

the portion of the population to which the researcher has reasonable access; may be a subset of the target population 

May be limited to region, state, city, county, or institution 

Examples 

All institutionalized elderly with Alzheimer's in St. Louis county nursing homes 

All people with AIDS in the metropolitan St. Louis area 

All low birth weight infants admitted to the neonatal ICUs in St. Louis city & county 

All school-age children with asthma treated in pediatric asthma clinics in university-affiliated medical centers in the Midwest

All pregnant teens in the state of Missouri 
Samples 

Terminology used to describe samples and sampling methods 

Sample = the selected elements (people or objects) chosen for participation in a study; people are referred to as subjects or participants 

Sampling = the process of selecting a group of people, events, behaviors, or other elements with which to conduct a study 

Sampling frame = a list of all the elements in the population from which the sample is drawn 

Could be extremely large if population is national or international in nature 

A frame is needed so that every element in the population is identified and has an equal opportunity for selection as a subject

Examples 

A list of all institutionalized elderly with Alzheimer's in St. Louis county nursing homes affiliated with BJC 

A list of all people with AIDS in the metropolitan St. Louis area who are members of the St. Louis Effort for AIDS 

A list of all low birth weight infants admitted to the neonatal ICUs in St. Louis city & county in 1998 

A list of all school-age children with asthma treated in pediatric asthma clinics in university-affiliated medical centers in the Midwest

A list of all pregnant teens in the Henderson school district 

Randomization = each individual in the population has an equal opportunity to be selected for the sample 

Representativeness = sample must be as much like the population in as many ways as possible 

Sample reflects the characteristics of the population, so those sample findings can be generalized to the population 

Most effective way to achieve representativeness is through randomization; random selection or random assignment 

Parameter = a numerical value or measure of a characteristic of the population; remember P for parameter & population 

Statistic = numerical value or measure of a characteristic of the sample; remember S for sample & statistic 

Precision = the accuracy with which the population parameters have been estimated; remember that population parameters often are based on the sample statistics 
Types of Sampling Methods: probability & nonprobability
Probability Sampling Methods 

Also called random sampling 









Types of probability sampling: see table in course materials for details


Simple random 










Table of random numbers = a table displaying hundreds of digits from 0 to 9 set up in such a way that each digit is equally likely to follow any other

See text for random sampling details & table of random numbers 
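
In practice, a software random number generator commonly stands in for a printed table of random numbers. The following is a minimal sketch of simple random sampling, assuming a hypothetical sampling frame of 500 numbered subject IDs; the function name, frame, and seed are illustrative and are not taken from the course text.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements without replacement; each element has an equal chance."""
    rng = random.Random(seed)  # seeded generator so the draw can be reproduced
    return rng.sample(frame, n)

# Hypothetical sampling frame: subject IDs 1..500
frame = list(range(1, 501))
sample = simple_random_sample(frame, 25, seed=42)
```

Recording the seed lets another researcher reproduce exactly the same draw from the same frame.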







Stratified random 

Population is divided into subgroups, called strata, according to some variable or variables of importance to the study

Variables often used include: age, gender, ethnic origin, SES, diagnosis, geographic region, institution, or type of care 

Two approaches to stratification: proportional & disproportional


Proportional 

Subgroup sample sizes are proportional to each subgroup's share of the population

Example: A high school population has 

15% seniors 

25% juniors 

25% sophomores 

35% freshmen 

With a proportional sample, the sample has the same proportions as the population
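
The proportional approach can be sketched in code using the class proportions from the example above. This is a minimal illustration, assuming a hypothetical school of 1,000 students and a desired sample of 200; the student labels and function name are invented for the example.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Randomly draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)   # stratum's share of the sample
        sample[name] = rng.sample(members, k)  # random selection within the stratum
    return sample

# Hypothetical school of 1,000 students: 15% / 25% / 25% / 35%
strata = {
    "seniors": [f"senior-{i}" for i in range(150)],
    "juniors": [f"junior-{i}" for i in range(250)],
    "sophomores": [f"sophomore-{i}" for i in range(250)],
    "freshmen": [f"freshman-{i}" for i in range(350)],
}
sample = proportional_stratified_sample(strata, 200, seed=1)
sizes = {name: len(members) for name, members in sample.items()}
```

Here the sample of 200 contains 30 seniors, 50 juniors, 50 sophomores, and 70 freshmen. With other sample sizes, rounding may make the stratum counts sum to slightly more or less than n.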

Disproportional 

Subgroup sample sizes are not proportional to each subgroup's share of the population

Example

Class         Population    Sample
Seniors       15%           25%
Juniors       25%           25%
Sophomores    25%           25%
Freshmen      35%           25%




With a disproportional sample, the sample does not have the same proportions as the population

Cluster random sampling 

A random sampling process that involves stages of sampling 

The population is first listed by clusters or categories 

Procedure 

Randomly select 1 or more clusters and take all of their elements (single-stage cluster sampling); e.g. Midwest region of the US

Or, in a second stage, randomly select clusters from within the first-stage clusters; e.g. 3 states within the Midwest region

In a third stage, randomly select elements from the second-stage clusters; e.g. 30 county health dept. nursing administrators from each state
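
The three stages can be sketched as nested random draws. This is a minimal illustration, assuming a hypothetical population organized as regions containing states containing administrators; all names and counts are invented for the example.

```python
import random

def multistage_cluster_sample(regions, n_regions, n_states, n_elements, seed=None):
    """Stage 1: pick regions; stage 2: pick states; stage 3: pick elements."""
    rng = random.Random(seed)
    chosen = []
    for region in rng.sample(sorted(regions), n_regions):         # stage 1
        states = regions[region]
        for state in rng.sample(sorted(states), n_states):        # stage 2
            chosen.extend(rng.sample(states[state], n_elements))  # stage 3
    return chosen

# Hypothetical population: 4 regions x 5 states x 40 administrators per state
regions = {
    f"region{r}": {
        f"state{r}.{s}": [f"admin-{r}-{s}-{a}" for a in range(40)]
        for s in range(5)
    }
    for r in range(4)
}
sample = multistage_cluster_sample(regions, n_regions=1, n_states=3,
                                   n_elements=30, seed=7)
```

With one region, three states, and 30 administrators per state, the sample contains 90 subjects, all from the same region.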


Systematic 

A random sampling process in which every kth element or member of the population (e.g. every 5th) is selected for the sample after a random start is determined

Example 

Population (N) = 2000, sample size (n) = 50, k = N/n, so k = 2000 / 50 = 40

Use a table of random numbers to determine the starting point for selecting every 40th subject 

With the list of the 2000 subjects in the sampling frame, go to the starting point and select every 40th name on the list until the sample size is reached. Unless the starting point falls within the first 40 names, it will be necessary to return to the beginning of the list to complete the selection of the sample.
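
The procedure above can be sketched as follows. This is a minimal illustration, assuming a hypothetical frame of 2000 numbered subjects and assuming N is an exact multiple of n; a software random number generator replaces the table of random numbers, and the wrap-around to the beginning of the list is handled with modular arithmetic.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth element after a random start, wrapping around the list."""
    rng = random.Random(seed)
    k = len(frame) // n                # sampling interval, k = N / n
    start = rng.randrange(len(frame))  # random starting position in the frame
    return [frame[(start + i * k) % len(frame)] for i in range(n)]

# Hypothetical sampling frame: subjects numbered 1..2000
frame = list(range(1, 2001))
sample = systematic_sample(frame, 50, seed=3)
```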
Nonprobability sampling methods 

Characteristics 

Not every element of the population has the opportunity for selection in the sample 

No sampling frame 

Population parameters may be unknown 

Nonrandom selection 

More likely to produce a biased sample 

Restricts generalization 

Historically, used in most nursing studies 

Types of nonprobability sampling methods 


Convenience: aka chunk, accidental & incidental sampling

Selection of the most readily available people or objects for a study 

No way to determine representativeness 

Saves time and money 


Quota 

Selection of sample to reflect certain characteristics of the population 

Similar to stratified but does not involve random selection 

Quotas for subgroups (proportions) are established 

E.g. 50 males & 50 females; recruit the first 50 men and first 50 women that meet inclusion criteria 
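
The quota example can be sketched as a simple first-come, first-served filter. This is a minimal illustration, assuming a hypothetical stream of eligible volunteers in order of arrival; note that nothing in it is random, which is what distinguishes quota sampling from stratified random sampling.

```python
def quota_sample(volunteers, quotas):
    """Accept volunteers in arrival order until each subgroup quota is filled."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person_id, group in volunteers:
        if group in counts and counts[group] < quotas[group]:
            selected.append((person_id, group))
            counts[group] += 1
    return selected

# Hypothetical arrivals who meet the inclusion criteria: (ID, sex)
volunteers = [(i, "male" if i % 3 == 0 else "female") for i in range(300)]
selected = quota_sample(volunteers, {"male": 50, "female": 50})
```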


Purposive: aka judgmental or expert's choice sampling

Researcher uses personal judgment to select subjects who are considered to be representative of the population

Handpicked subjects 

Typical subjects experiencing problem being studied 


Snowball 

Also known as network sampling 

Subjects refer the researcher to others who might be recruited as subjects 
Time Frame for Studying the Sample 
See design notes on longitudinal & cross-sectional studies
Longitudinal
Cross-sectional
Sample Size 

General rule: as large as possible to increase the representativeness of the sample

Increased size decreases sampling error 

Relatively small samples in qualitative, exploratory, case studies, experimental and quasi-experimental studies

Descriptive studies need large samples; e.g. 10 subjects for each item on the questionnaire or interview guide 

As the number of variables studied increases, the sample size also needs to increase in order to detect significant relationships or differences 

As a rule of thumb, a minimum of 30 subjects is needed to rely on the central limit theorem (statistics based on the mean)

Large samples are needed if: 

There are many uncontrolled variables 

Small differences are expected in the sample/population on variables of interest 

The sample is divided into subgroups 

Dropout rate (mortality) is expected to be high 

Statistical tests used require minimum sample or subgroup size 
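
The point that increased sample size decreases sampling error can be illustrated with the standard error of the mean, sigma / sqrt(n). The sketch below assumes a hypothetical population standard deviation of 15; the value is illustrative only.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean for a population SD of sigma."""
    return sigma / math.sqrt(n)

sigma = 15.0  # hypothetical population standard deviation
errors_by_n = {n: standard_error(sigma, n) for n in (25, 100, 400)}
```

Quadrupling the sample size halves the standard error: 3.0 at n = 25, 1.5 at n = 100, and 0.75 at n = 400.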
Power Analysis 
Power analysis = a procedure for estimating either the likelihood of committing a Type II error or the sample size requirements
Background Information for Understanding Power Analysis: Type I and Type II errors 

Type I error 

Based on the statistical analysis of data, the researcher wrongly rejects a true null hypothesis and, therefore, accepts a false alternative hypothesis

Probability of committing a Type I error is controlled by the researcher with the level of significance, alpha.

Alpha (α) is the probability that a Type I error will occur

Alpha (α) is established by the researcher; usually α = .05 or .01

α = .05 means there is a 5% chance of rejecting a true null hypothesis; i.e., out of 100 samples, a true null hypothesis would be expected to be rejected 5 times and accepted 95 times

α = .01 means there is a 1% chance of rejecting a true null hypothesis; i.e., out of 100 samples, a true null hypothesis would be expected to be rejected 1 time and accepted 99 times
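
The meaning of alpha can be checked by simulation. The sketch below is a hypothetical illustration, not a procedure from the text: it repeatedly draws samples from a population in which the null hypothesis is true (mean 0, known SD 1), applies a two-sided z-test, and counts how often the true null is wrongly rejected.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Proportion of samples in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # null is true: mean = 0
        z = fmean(sample) * n ** 0.5                      # z statistic, sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
```

The observed rejection rate should fall close to the chosen alpha of .05, varying slightly from run to run with the seed.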

Type II error 

Based on the statistical analysis of data, the researcher wrongly accepts a false null hypothesis and, therefore, rejects a true alternate hypothesis

Probability of committing a Type II error can be estimated, and reduced by planning an adequate sample size, through a power analysis

Probability of a Type II error is called beta (β)

Power, or 1 - β, is the probability of rejecting a false null hypothesis and obtaining a statistically significant result
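
Power and beta can be approximated numerically for a simple case. The sketch below is an illustration under stated assumptions: a two-sided comparison of two group means, a normal approximation rather than the exact t distribution, and a standardized effect size (discussed later in these notes) supplied by the user. The function name and the example values are hypothetical.

```python
from statistics import NormalDist

def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) for comparing two group means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)                 # two-sided critical value
    shift = effect_size * (n_per_group / 2) ** 0.5    # expected z under the alternative
    return z.cdf(shift - z_crit)                      # ignores the negligible far tail

power = approximate_power(effect_size=0.50, n_per_group=64)
beta = 1 - power  # probability of a Type II error
```

For a medium effect with 64 subjects per group, the approximation gives a power of roughly .80, so beta is roughly .20.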
Type I & Type II Errors

Based on statistical analysis,      In the real world, the null hypothesis is actually:
the researcher concludes that:      True                              False

Null true: null hypothesis          Correct decision: the actual      Type II error: the actual
is accepted                         true null is accepted             false null is accepted

Null false: null hypothesis is      Type I error: the actual true     Correct decision: the actual false
rejected & alternate is accepted    null hypothesis is rejected       null is rejected & alternate is accepted
Background Information for Understanding Power Analysis: Population Effect Size, Gamma (γ)

Gamma (γ) measures how wrong the null hypothesis is; it measures how strong the effect of the IV is on the DV; and it is used in performing a power analysis

Gamma (γ) is calculated based on population data from prior research studies, or determined several different ways depending on the nature of the data and the statistical tests to be performed

The textbook discusses 4 ways to estimate gamma (population effect size) based upon: 

Testing the difference between 2 means (t-test)

Testing the difference between 3 or more means (ANOVA)

Testing bivariate correlation (relationship) between 2 variables (Pearson's r)

Testing the difference in proportions between 2 groups (chi-square)

If there is no relevant research on the topic to estimate the population effect size (gamma), then use guidelines for gamma (γ) or its equivalent

Testing the difference between 2 means (t-test): gamma (γ) for small effects γ = .20; medium effects γ = .50; large effects γ = .80

Testing the difference between 3 or more means (ANOVA): eta squared (η²) for small effects η² = .01; medium effects η² = .06; large effects η² = .14

Testing bivariate correlation (relationship) between 2 variables (Pearson's r): gamma (γ) for small effects γ = .10; medium effects γ = .30; large effects γ = .50

Testing the difference in proportions between 2 groups (chi-square): no conventions for unknown populations
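
For the two-means case, the effect size is the difference between the group means divided by the standard deviation. The sketch below is a minimal illustration, assuming the pooled sample standard deviation from a prior or pilot study is used to estimate the population value; the scores are made up for the example.

```python
from statistics import mean, stdev

def effect_size_two_means(group1, group2):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical pilot scores for two groups
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
gamma = effect_size_two_means(treatment, control)
```

With these scores the means differ by 2 and the pooled SD is about 3.16, giving an estimated effect size of about .63, which lies between the medium (.50) and large (.80) guideline values.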
Determining Sample Size through Power Analysis 

Need to have the following data: 

Level of significance criterion = alpha (α); use .05 for most nursing studies and your calculations

Power = 1 - β (beta); if beta is not known, standard power is .80, so use this when you are determining sample size

Population effect size = gamma (γ) or its equivalent, e.g. eta squared (η²); use recommended values for small, medium, or large effect for the statistical test you plan to use to answer research questions or test hypotheses

Use tables on pages 455-459 of Polit & Hungler or other reference

Mathematical formulas and computer programs can also be used for calculation of sample size 
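
One such formula, for comparing two group means, is n per group = 2 * ((z_alpha + z_power) / gamma)^2. The sketch below applies it with the three inputs listed above. It is a normal-approximation illustration only, with a hypothetical function name; published tables based on the t distribution may give slightly different values.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to compare two means (two-sided)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # value corresponding to the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Guideline effect sizes for the difference between 2 means
sizes = {g: n_per_group(g) for g in (0.20, 0.50, 0.80)}
```

Under this approximation, a small effect (.20) requires about 393 subjects per group, a medium effect (.50) about 63, and a large effect (.80) about 25, which shows how sharply the required sample grows as the expected effect shrinks.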
Sampling Error and Sampling Bias 

Sampling error = The difference between the sample statistic (e.g. sample mean) and the population parameter (e.g. population mean) that is due to the random fluctuations in data that occur when the sample is selected 
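
Sampling error can be observed directly when the population is known. The sketch below is a hypothetical illustration: it builds an artificial population of 10,000 scores, computes the population mean (the parameter), and compares it with the mean of random samples of different sizes (the statistic).

```python
import random
from statistics import fmean

rng = random.Random(11)

# Hypothetical population of 10,000 scores (mean near 100, SD near 15)
population = [rng.gauss(100, 15) for _ in range(10_000)]
parameter = fmean(population)  # population mean

def sampling_error(n):
    """Sample statistic minus population parameter for one random sample of size n."""
    sample = rng.sample(population, n)
    return fmean(sample) - parameter

errors = {n: sampling_error(n) for n in (10, 100, 1000)}
```

Each value in errors is the sampling error for one random draw; repeating the draws would show the errors scattering around zero, more tightly for the larger samples.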

Sampling bias 

Also called systematic bias or systematic variance 

The difference between sample data and population data that can be attributed to faulty sampling of the population 

Consequence of selecting subjects whose characteristics (scores) are different in some way from the population they are supposed to represent

This usually occurs when randomization is not used 
Randomization Procedures in Research 

Randomization = each individual in the population has an equal opportunity to be selected for the sample 

Random selection = from all people who meet the inclusion criteria, a sample is randomly chosen 

Random assignment 

The assignment of subjects to treatment conditions in a random manner. 

It has no bearing on how the subjects participating in an experiment are initially selected. 

See Polit & Hungler, pp. 160-162, for random assignment to groups and group random assignment to treatment using a random numbers table
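
Random assignment can also be carried out with a software random number generator instead of a printed table. The following is a minimal sketch, assuming 20 hypothetical subjects who have already been recruited and two treatment conditions; the subject IDs and group names are invented for the example.

```python
import random

def randomly_assign(subjects, groups=("treatment", "control"), seed=None):
    """Shuffle the subjects, then deal them into groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is left unchanged
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

# Hypothetical subjects already selected for the study
subjects = [f"S{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(subjects, seed=5)
```

The sketch says nothing about how the 20 subjects were obtained, which mirrors the point that random assignment has no bearing on how subjects are initially selected.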