Leveraging the Wisdom of Crowds with Modern Regression, Machine Learning, and Ensembles with Application to Army Software Sustainment
From the Journal of Cost Analysis and Parametrics: Volume 10 | Issue 2 | April 2022
Downloadable File: JCAPv10i2-LeveragingtheWisdomofCrowds-Smart
Abstract: Cost estimating often relies on the log-transformed ordinary least squares method for the development of cost estimating relationships. This method has weaknesses, the most significant of which is that it produces estimates that are biased low. These deficiencies can be corrected, and predictive accuracy can be improved, by using modern regression methods and applying machine learning techniques. Statisticians have found that predictive accuracy can be improved even further by combining multiple models in an ensemble, or crowd, approach.
The article discusses these methods in detail and applies them to an extensive dataset of 192 Army systems. Data analysis reveals several types of cost estimating relationships based on release type, release rhythm, and categories of data. This article discusses significance testing and goodness-of-fit metrics for all models developed.
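The low bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is a minimal illustration and is not taken from the article: the data are synthetic, the function names `fit_log_ols` and `predict` are hypothetical, and the correction shown (Duan's smearing factor) is one well-known remedy that is not necessarily the one the authors use. It fits a log-log ordinary least squares model, back-transforms the predictions with a plain exponential, and compares the predicted total with the actual total.

```python
import numpy as np

def fit_log_ols(x, y):
    """Fit ln(y) = a + b*ln(x) by ordinary least squares."""
    X = np.column_stack([np.ones_like(x), np.log(x)])
    coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    residuals = np.log(y) - X @ coef
    return coef, residuals

def predict(coef, x, smearing=1.0):
    """Back-transform a log-space prediction to unit space."""
    return smearing * np.exp(coef[0] + coef[1] * np.log(x))

# Synthetic data (an assumption for illustration): a power-law
# relationship with multiplicative lognormal error.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(10.0, 1000.0, size=n)
y = 2.0 * x**0.9 * np.exp(rng.normal(0.0, 0.8, size=n))

coef, residuals = fit_log_ols(x, y)

# Duan's smearing factor: the average of the exponentiated residuals.
smearing = np.mean(np.exp(residuals))

naive_total = predict(coef, x).sum()
corrected_total = predict(coef, x, smearing).sum()
actual_total = y.sum()

print(f"actual total:    {actual_total:,.0f}")
print(f"naive total:     {naive_total:,.0f}")
print(f"corrected total: {corrected_total:,.0f}")
print(f"smearing factor: {smearing:.3f}")
```

The naive back-transform targets the median of the cost distribution rather than its mean, which is why its total falls short of the actual total when the errors are right-skewed. The smearing factor rescales the predictions toward the mean.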
Authors:
Dr. Christian B. Smart is the Chief Scientist for Galorath Federal. He has a Ph.D. in Applied Mathematics, with more than 20 years of experience with the application of predictive analytics, machine learning, and risk management to defense and aerospace programs. In 2020, Dr. Smart published Solving for Project Risk Management: Understanding the Critical Role of Uncertainty in Project Management with McGraw-Hill. He is the 2021 recipient of the Frank Freiman Lifetime Achievement Award from ICEAA.
Ms. Kimberly Roye is a Senior Data Scientist with Bank of America. Starting her career as a Mathematical Statistician for the US Census Bureau, Kimberly transitioned to a career in Cost Analysis over 10 years ago. She has supported several Department of Defense hardware, software, and vehicle programs, as well as NASA and the Department of Homeland Security (DHS). She is currently a lead developer of Machine Learning training for the Army and DHS. Kimberly earned an MS in Applied Statistics from Rochester Institute of Technology and a dual BS in Mathematics/Statistics from the University of Georgia.
Ms. Cheryl Jones is a senior systems engineer at the US Army Futures Command and the technical lead for the Army Software Sustainment Cost Estimation initiative. The objective of this project is to provide estimation approaches to accurately estimate and justify software resources. Ms. Jones is the project manager of Practical Software and Systems Measurement and the DoD representative to ISO SC7, System and Software Engineering. She is co-editor of ISO/IEC/IEEE 15288, Systems Life Cycle Processes.
Dr. Brad Clark is Vice-President of Software Metrics Inc. His area of expertise is software cost and schedule data collection, analysis, and parametric modeling. He co-authored a book with Barry Boehm titled Software Cost Estimation with COCOMO II and another with Ray Madachy titled Software Cost Estimation Metrics Manual for Defense Systems. Dr. Clark received a Ph.D. in Computer Science from the University of Southern California.
Mr. Paul Janusz is a senior software quality engineer at the US Army Futures Command at Picatinny Arsenal. He is responsible for implementing software measurement and independent verification and validation for a wide variety of software-intensive armament systems. Currently, Mr. Janusz is supporting the Army Software Sustainment Cost Estimation initiative, assessing how software sustainment is being performed on Army programs, analyzing their estimated and actual measurement data, and evaluating the resulting impacts on the software that is delivered to the warfighter. He is a member of the Practical Software and Systems Measurement (PSM) project, adapting the measurement framework to account for the issues faced by sustainment projects and the unique characteristics of continuous iterative development projects.
Mr. James Doswell is a senior software cost analyst at DASA-CE, responsible for independent government cost estimates for Army systems. He serves as the division’s Software Team Lead, responsible for overseeing software analysis, developing predictive models, data collection, and organization-wide software cost estimates.