Sampling Surveys: Methods and Applications

Statistics for Business and Financial Economics

Abstract

In statistics, we are interested in information about a population. For example, we might be interested in how the residents of a community feel about the construction of a new high school. There are two ways to obtain information about how the residents feel about this issue. We could take a census and simply ask each and every resident about his or her attitude toward such a project. Or we could take a smaller sample of the residents and try to draw inferences about the community’s feelings from the feelings that members of this sample express.

Notes

  1.

    We will return to this example in Example 20.3 in the next section of this chapter.

  2.

    J. Keon and H. Assael (1982), “Nonsampling vs. Sampling Errors in Survey Research,” Journal of Marketing, Spring 1982, 114–123.

  3.

    This result is obtained under the assumption that \( \sigma^2=\sum_{i=1}^{N}(x_i-\mu)^2/N \). If \( \sigma^2=\sum_{i=1}^{N}(x_i-\mu)^2/(N-1) \), then the finite sample adjustment factor will be \( (N-n)/N \). This kind of finite sampling adjustment factor will be used in both stratified random sampling and two-stage cluster sampling.

  4.

    \( \bar{x}_{st}=\sum\nolimits_{j=1}^{H} W_j\bar{x}_j \), where \( W_j=N_j/N \). Because the samples \( n_j \) are selected by random sampling and are independent of each other, Eq. 20.8 holds.

  5.

    The derivation of the sample size for proportional allocation defined in Eq. 20.13 and that of the sample size for optimal allocation with similar variable cost in each stratum, defined in Eq. 20.15, can be found in T. Yamane (1967), Elementary Sampling Theory (Englewood Cliffs, NJ: Prentice Hall), Chapter 6.

  6.

    See T. Yamane (1967), Elementary Sampling Theory, Chap. 8.

  7.

    This application is drawn from G. A. Churchill, Jr. (1983), Marketing Research: Methodological Foundations, 3d ed. (Chicago: Dryden), pp. 441–442. Copyright © 1983 by The Dryden Press, reprinted by permission of the publisher.

Appendix 1: The Jackknife Method for Removing Bias from a Sample Estimate

In this appendix, we discuss the jackknife method, which can be used in conjunction with sampling to remove the bias of an estimator and to produce confidence intervals.

The jackknife is a general technique that can be applied to any linear estimator. It works by using the original sample to create a new set of “pseudovalues.” The jackknife procedure involves the following steps:

  1.

    The n sample values are divided into m subsets, and m is set equal to n in many applications. For example, removing one piece of data at a time leaves m = n subsets of data with (n − 1) observations in each set.

  2.

    An estimate based on all the data is calculated. Call this value \( x_{\mathrm{All}} \).

  3.

    An estimate based on all the data except the data from the first of the m subsets is calculated; call it \( x_{-1} \). The estimates \( x_{-2}, x_{-3}, \ldots, x_{-m} \) are calculated in the same way, each omitting one subset.

  4.

    The “pseudovalue” \( x_1 \) is calculated as

    $$ {x_1}={x_{\mathrm{All}}}+\left( {m-1} \right)\left( {{x_{\mathrm{All}}}-{x_{-1}}} \right) $$
    (20.23)

    Likewise, we can calculate \( x_2, x_3, \ldots, x_m \). These pseudovalues will constitute a “pseudosample” that acts like a random sample. Alternatively, Eq. 20.23 can be rewritten as

    $$ {x_1}=m{x_{\mathrm{All}}}-\left( {m-1} \right){x_{-1}} $$
    (20.24)
  5.

    The mean \( \bar{x} \) and the standard deviation s of the pseudosample can now be calculated and used to produce confidence intervals. For example, a 90 % confidence interval for the population mean μ can be defined as

    $$ \mu =\bar{x} \pm {t_{.05}}\frac{s}{\sqrt{m}} $$
    (20.25)

    where \( t_{.05} \) is the value of the t distribution with m − 1 degrees of freedom that leaves an area of .05 in the upper tail (α/2 = .05 for a 90 % interval).
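The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `jackknife_pseudovalues` and `jackknife_interval` are hypothetical, m is set equal to n (one observation is deleted at a time), and the critical value `t_crit` is assumed to be read from a t table by the caller.

```python
import math
import statistics


def jackknife_pseudovalues(data, estimator):
    """Return the m = n pseudovalues of Eq. 20.24 (delete-one jackknife)."""
    data = list(data)
    m = len(data)
    x_all = estimator(data)                      # step 2: estimate from all the data
    pseudovalues = []
    for i in range(m):
        reduced = data[:i] + data[i + 1:]        # step 3: drop the i-th observation
        x_minus_i = estimator(reduced)
        pseudovalues.append(m * x_all - (m - 1) * x_minus_i)  # step 4, Eq. 20.24
    return pseudovalues


def jackknife_interval(data, estimator, t_crit):
    """Step 5: confidence interval of Eq. 20.25 built from the pseudosample.

    t_crit is the upper-tail t value with m - 1 degrees of freedom,
    supplied by the caller (an assumption of this sketch).
    """
    pseudovalues = jackknife_pseudovalues(data, estimator)
    m = len(pseudovalues)
    centre = statistics.mean(pseudovalues)
    s = statistics.stdev(pseudovalues)           # sample standard deviation (divisor m - 1)
    half_width = t_crit * s / math.sqrt(m)
    return centre - half_width, centre + half_width
```

When the estimator is the sample mean, each pseudovalue equals the corresponding observation, so the interval reduces to the usual t interval for a mean. The procedure only changes the result for estimators, such as a ratio of totals, that are not simple averages.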

It may not be clear from this introduction what the jackknife adds to an estimate computed directly from a simple or a stratified random sample. It has been shown that when the original estimate is biased but asymptotically unbiased (that is, unbiased in large samples), jackknifing often eliminates the bias. The jackknife procedure also makes it possible to compute confidence intervals for the population parameters when the samples taken are small and the population standard deviation is unknown.
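A standard illustration of the bias-removal claim, not taken from this chapter, is the variance estimator that divides by n. It is biased in small samples, and averaging its jackknife pseudovalues gives exactly the familiar estimator that divides by n − 1. The sketch below checks this numerically; the helper name `jackknife_estimate` and the sample values are made up for the illustration.

```python
import statistics


def jackknife_estimate(data, estimator):
    """Mean of the delete-one pseudovalues n*theta_all - (n-1)*theta_minus_i."""
    n = len(data)
    theta_all = estimator(data)
    pseudovalues = [
        n * theta_all - (n - 1) * estimator(data[:i] + data[i + 1:])
        for i in range(n)
    ]
    return statistics.mean(pseudovalues)


sample = [1.0, 2.0, 3.0, 4.0, 10.0]                 # made-up illustrative numbers

biased = statistics.pvariance(sample)               # divides by n
corrected = jackknife_estimate(sample, statistics.pvariance)
unbiased = statistics.variance(sample)              # divides by n - 1

print(biased, corrected, unbiased)
```

For this sample the biased estimate is 10.0, while the jackknifed value and the n − 1 estimator are both 12.5.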

Example 20.13 Removing the Bias of Accounts Receivable Estimates.

Suppose an auditor is interested in determining the mean growth rate of uncollectible accounts receivable. This figure will help the auditor find out whether a store has an abnormally high increase in uncollectibles.

To obtain these estimates, the auditor randomly samples the uncollectibles of six department stores in 1989 and 1990. The results of this sample are given in Table 20.3. The auditor is interested in constructing a 95 % confidence interval for the ratio of uncollectibles. In Table 20.4, we present the ratio of uncollectibles, u, for each store, where

$$ u=\frac{{1990\mathrm{ uncollectibles}}}{{1989\mathrm{ uncollectibles}}} $$

The issue now before us is how we compute a confidence interval for u by using the jackknife method.

The first step in the jackknife procedure is to calculate \( x_{\mathrm{All}} \), the estimate based on all the data. In our example, this is the ratio of total uncollectibles in 1990 to total uncollectibles in 1989.

$$ {x_{\mathrm{ All}}}=\frac{914,500 }{825,000 }=1.108 $$
Table 20.3 Random sample of uncollectibles for six stores
Table 20.4 U Ratios for six different stores

Next we compute \( x_{-1} \), using all the data except the data of AAA Company.

$$ {x_{-1 }}=\frac{914,500-225,000 }{825,000-200,000 }=\frac{689,500 }{625,000 }=1.103 $$

Similarly, we compute \( x_{-2} \) by deleting the data of BBB Company, \( x_{-3} \) by deleting the data of CCC Company, and so on.

$$ \begin{aligned} x_{-2} &=\frac{914,500-92,000}{825,000-84,000}=1.110 \\ x_{-3} &=\frac{914,500-152,000}{825,000-127,000}=1.092 \\ x_{-4} &=\frac{914,500-13,500}{825,000-12,000}=1.108 \\ x_{-5} &=\frac{914,500-390,000}{825,000-375,000}=1.166 \\ x_{-6} &=\frac{914,500-42,000}{825,000-27,000}=1.093 \end{aligned} $$

Using this information, we calculate our pseudovalues \( x_1, x_2, \ldots, x_6 \) in accordance with Eq. 20.23.

$$ \begin{aligned} x_1 &= x_{\mathrm{All}}+\left( m-1 \right)\left( x_{\mathrm{All}}-x_{-1} \right) \\ &= 1.108+5\left( 1.108-1.103 \right) = 1.133 \\ x_2 &= 1.108+5\left( 1.108-1.110 \right) = 1.098 \\ x_3 &= 1.108+5\left( 1.108-1.092 \right) = 1.188 \\ x_4 &= 1.108+5\left( 1.108-1.108 \right) = 1.108 \\ x_5 &= 1.108+5\left( 1.108-1.166 \right) = .818 \\ x_6 &= 1.108+5\left( 1.108-1.093 \right) = 1.183 \end{aligned} $$

We can now use these pseudovalues, \( x_1, \ldots, x_6 \), as our pseudosample to construct our confidence interval. First, we compute the sample mean \( \bar{x} \) of the pseudosample. Then we compute the sample standard deviation s of the pseudosample. Finally, we construct a confidence interval, using Student’s t distribution (it is given in Table A4 of Appendix A).

$$ \begin{aligned} \bar{x} &= \frac{x_1+x_2+x_3+x_4+x_5+x_6}{6} = \frac{6.528}{6} = 1.088 \\ s &= \Bigl\{ \frac{1}{n-1}\bigl[ (x_1-\bar{x})^2+(x_2-\bar{x})^2+\cdots +(x_6-\bar{x})^2 \bigr] \Bigr\}^{1/2} \\ &= \Bigl\{ \frac{1}{5}\bigl[ (1.133-1.088)^2+(1.098-1.088)^2+(1.188-1.088)^2 \\ &\qquad +(1.108-1.088)^2+(.818-1.088)^2+(1.183-1.088)^2 \bigr] \Bigr\}^{1/2} \\ &= .137 \end{aligned} $$

For a 95 % confidence interval, we use \( t_{.025} \) with n − 1 = 5 degrees of freedom (here m = n = 6).

$$ t_{.025,\,5}=2.57 $$

Then the 95 % confidence interval in terms of Eq. 20.25 can be computed as

$$ \begin{aligned} \bar{x}-t_{.025}\frac{s}{\sqrt{n}} &< \mu < \bar{x}+t_{.025}\frac{s}{\sqrt{n}} \\ 1.088-2.57\left( .137/\sqrt{6} \right) &< \mu < 1.088+2.57\left( .137/\sqrt{6} \right) \\ .944 &< \mu < 1.232 \end{aligned} $$

The 95 % confidence interval for the ratio of 1990 to 1989 uncollectibles is the interval between 0.944 and 1.232.
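The arithmetic of Example 20.13 can be checked with a short Python sketch. The store figures below are not copied from Table 20.3, which is not reproduced here; they are read off the numerators and denominators of the worked equations, and only the first three store names appear in the text. The critical value 2.57 is the one quoted above. Intermediate results are rounded to three decimals, as in the example, which is an assumption made to match the printed figures.

```python
import math
import statistics

# (1989 uncollectibles, 1990 uncollectibles), recovered from the worked equations
uncollectibles = [
    (200_000, 225_000),   # AAA Company
    (84_000, 92_000),     # BBB Company
    (127_000, 152_000),   # CCC Company
    (12_000, 13_500),     # fourth store (name not given in the text)
    (375_000, 390_000),   # fifth store
    (27_000, 42_000),     # sixth store
]


def ratio(rows):
    """Ratio of total 1990 uncollectibles to total 1989 uncollectibles."""
    total_1989 = sum(r[0] for r in rows)
    total_1990 = sum(r[1] for r in rows)
    return total_1990 / total_1989


m = len(uncollectibles)
x_all = round(ratio(uncollectibles), 3)

# Delete one store at a time and recompute the ratio
leave_one_out = [
    round(ratio(uncollectibles[:i] + uncollectibles[i + 1:]), 3)
    for i in range(m)
]

# Pseudovalues, Eq. 20.23
pseudovalues = [x_all + (m - 1) * (x_all - x) for x in leave_one_out]

x_bar = statistics.mean(pseudovalues)
s = statistics.stdev(pseudovalues)

T_025_5 = 2.57            # t value with 5 degrees of freedom, as quoted in the text
half_width = T_025_5 * s / math.sqrt(m)
lower, upper = x_bar - half_width, x_bar + half_width

print([round(p, 3) for p in pseudovalues])
print(round(x_bar, 3), round(s, 3), round(lower, 3), round(upper, 3))
```

Rounded to three decimals, the sketch gives the pseudovalues, the mean 1.088, the standard deviation .137, and the interval .944 to 1.232 shown in the example. Without the intermediate rounding, the pseudovalues shift slightly in the third decimal place.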

Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Lee, CF., Lee, J.C., Lee, A.C. (2013). Sampling Surveys: Methods and Applications. In: Statistics for Business and Financial Economics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-5897-5_20
