The Business Case for Bootstrapping: When You’re Stuck with Incomplete Data, Here’s How You Make it Work!
Methods Track
Abstract:
The purpose of inferential statistics is to reliably extend sample characteristics to a population. Under mild conditions, namely independent and identically distributed (IID) draws and the large-sample behavior described by the Central Limit Theorem, a sample will be representative, to a given level of precision, of the population from which it was drawn. However, when samples are small, most standard test statistics will fail to reach significance, because sample characteristics may not adequately capture variation in the population. Parametric Bootstrapping is an alternative approach that repeatedly resamples from the distribution estimated from the sample in order to estimate population characteristics. The objective of this presentation is to illustrate how to use Parametric Bootstrapping for Cost Estimating with small samples to enhance the defensibility of cost estimates.
This presentation is organized as follows. First, we identify the client’s dilemma. Our clients often have limited access to large amounts of primary data due to institutional constraints. For example, the Paperwork Reduction Act, administered by the US Office of Management and Budget (OMB), makes it extremely difficult, expensive, and time consuming for federal agencies to collect large amounts of primary data. As such, federal agencies are frequently limited to collecting secondary data from miscellaneous sources for Cost Benefit Analysis, Cost Estimating, and budgeting. Small samples are often unreliable because their characteristics frequently cannot be extended to the population of interest. As a result, small samples undermine the credibility, validity, and scientific merit of cost estimates performed for federal clients.
Second, we identify Parametric Bootstrapping as a potential solution for extrapolating from small samples. Efron (1979) first showed that, given certain mild assumptions, Resampling with Replacement from a small sample will approximate the characteristics of a population. This method has been validated by Efron (loc. cit.), Bickel and Freedman (1981), Singh (1981), Beran (1982, 1987), Bretagnolle (1983), Gaenssler (1986) and others.
Next, the practical value of using Parametric Bootstrapping is identified. Bootstrapping adds real value through actionable study plans, increased client confidence, and more compelling research results. Specifically, the method avoids additional investment and time spent collecting more data when it isn’t necessary, leverages studies that had failed to deliver actionable results, increases the efficacy of study estimates, provides more rigorous results, and supports the management and understanding of risks to the study plan and study variables.
A description of how Bootstrapping works is then presented, including the fundamentals of the supporting theorems in Inferential Statistics. A spreadsheet model is also introduced to illustrate the computational ease and transparency of Bootstrapping. Successful applications to client cost estimating work are discussed, as well as the limitations of the technique.
This presentation shows one solution for estimating population characteristics from very small samples using a simple spreadsheet tool. Extensions of the method are also suggested, such as Non-Parametric Bootstrapping, which can be used when there is little or no evidence that the distribution of the underlying true population is normal.
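The two variants described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' spreadsheet model: the function names, the percentile-interval construction, the number of resamples, and the six cost figures are all hypothetical. The parametric version assumes a normal population and draws from a normal distribution fitted to the sample; the non-parametric version resamples the observed values with replacement.

```python
import random
import statistics


def _percentile_interval(values, alpha):
    """Return the (alpha/2, 1 - alpha/2) percentile bounds of a list of values."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[int((alpha / 2) * n)]
    upper = ordered[int((1 - alpha / 2) * n) - 1]
    return lower, upper


def parametric_bootstrap_mean(sample, n_resamples=5000, alpha=0.10, seed=0):
    """Percentile interval for the mean, ASSUMING a normal population."""
    rng = random.Random(seed)
    n = len(sample)
    mu_hat = statistics.mean(sample)      # fitted mean
    sigma_hat = statistics.stdev(sample)  # fitted standard deviation
    means = []
    for _ in range(n_resamples):
        # Draw a synthetic sample of the same size from the fitted normal.
        draw = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        means.append(statistics.mean(draw))
    return (mu_hat, *_percentile_interval(means, alpha))


def nonparametric_bootstrap_mean(sample, n_resamples=5000, alpha=0.10, seed=0):
    """Percentile interval for the mean, resampling the data with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Resample the observed values themselves; no distribution is assumed.
        draw = rng.choices(sample, k=n)
        means.append(statistics.mean(draw))
    return (statistics.mean(sample), *_percentile_interval(means, alpha))


# Hypothetical small sample of project costs (illustrative values only).
costs = [12.1, 9.8, 14.3, 11.0, 10.5, 13.2]

print("parametric:    ", parametric_bootstrap_mean(costs))
print("non-parametric:", nonparametric_bootstrap_mean(costs))
```

Each call returns the sample mean followed by the lower and upper bounds of an approximate 90% interval for the population mean. With a sample this small the two intervals can differ noticeably, which is one reason the normality assumption behind the parametric version should be checked before relying on it.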
Author(s):
Brett Gelso
Booz Allen Hamilton
Brett R. Gelso, Ph.D. currently applies environmental and energy economics, and related quantitative methods, to evaluate and estimate the costs and benefits of proposed regulations in the environmental, energy, and infrastructure domains. As a primary contributor to Booz Allen’s Economics and Business Analysis (EBA) team’s intellectual capital, he is developing several advanced methodologies in the areas of ecological asset valuation, risk management, and carbon trading.
Prior to joining BAH, Dr. Gelso worked at several U.S. government agencies including the Environmental Protection Agency, Department of the Interior, and Department of Homeland Security. During his tenure in the federal government, Dr. Gelso focused on estimating the benefits and costs of potential regulation and on improving models that incorporated nonmarket, intrinsic, or ecosystem benefits of regulation. Dr. Gelso also maintains his connection with academia by presenting at conferences, publishing, and teaching Statistics and Econometrics at American University.
Glenn Grossman
Booz Allen Hamilton
Glenn Grossman, Research Design Manager, has developed and managed advanced methodological approaches and analytical strategies for healthcare research and project monitoring and evaluation. Mr. Grossman has utilized numerous quantitative management design techniques for project planning, performance optimization, and research evaluation. Extending back over 15 years, he has experience leading teams in collecting data from diverse populations and institutional environments and is familiar with their corresponding analytical requirements. He received doctoral training in Epidemiology and Statistics from the University of North Carolina at Chapel Hill and graduate training in Environmental Health and Urban and Environmental Policy from Tufts University.
Eric Druker
Booz Allen Hamilton
Eric R. Druker, CCEA, graduated from the College of William and Mary with a B.S. in Applied Mathematics in 2005, concentrating in both Operations Research and Probability & Statistics with a minor in Economics. He is employed by Booz Allen Hamilton as an Associate and currently serves as President of the St. Louis Society of Cost Estimating & Analysis (SCEA) Chapter. In 2009, he was named SCEA’s National Estimator/Analyst of the Year for Technical Achievement. Mr. Druker currently supports NASA’s Cost Analysis Division, performing joint cost & schedule risk analysis across a variety of projects. In his previous position as a Technical/Research Lead at Northrop Grumman, Mr. Druker developed the risk analysis process/methodology used to assess financial risk to the company during Independent Cost Evaluations (ICEs). As a part of this, he participated in or led over 30 ICEs on Northrop Grumman proposals, briefing results to executive management including the President and CEO of Northrop Grumman. In addition to multiple SCEA conferences, Eric has been an invited presenter at The Naval Postgraduate School’s Acquisition Research Symposium, DoDCAS, and NASA’s Project Management Challenge. Prior to coming to Booz Allen, he helped to develop Northrop Grumman’s Independent Cost Evaluation (ICE) risk analysis practices and served as lead author of the Regression and Cost/Schedule Risk modules for the 2008 Cost Estimating Body of Knowledge (CEBoK) update.