Thomas Huckle and Tobias Neckel
Jan 17, 2022
Bits and Bugs
A Scientific and Historical Review of Software Failures in Computational Science
A true understanding of the pervasive role of software in the world demands an awareness of the volume and variety of real-world software failures and their consequences. No more thorough survey of these events may be available than Thomas Huckle and Tobias Neckel's Bits and Bugs: A Scientific and Historical Review of Software Failures in Computational Science (SIAM, 2019). Their book organizes an extensive collection of episodes into eight chapters that expand on an array of flavors of failure, increasing in intricacy from precision and rounding errors to the software–hardware interface and problems that emerge from complexity.
As I see it, this book serves three audiences: Instructors of computer engineering or numerical methods will find an educational text uniquely suited to a focus on software failures; software engineers will find an equally unique reference text; and students of the practice or the history of computational science will find a fully blazed trail through these complicated stories. Dr. Huckle joined me to discuss his and his coauthor's motivations for assembling the book, a sampler of the chapter headliners, and some of his thoughts on new and evolving computational tools with their own attendant opportunities for failure.
Technical readers will appreciate the mathematical excursions that rigorously introduce topics essential to understanding each chapter's headlining episodes, the exercises and MATLAB code provided at the book's website, and links to sources at Dr. Huckle's website. I found value in the recurring lesson that real-world failures arise from the coincidence of multiple, often multitudinous errors, as well as in the authors' consistent emphasis on the real human toll that the study of these errors is driven to prevent. That said, all readers may appreciate the fanciful taxonomy given in the introduction and the amusing (though sometimes apocryphal) idiosyncratic failures surveyed in the appendix.
Suggested companion works:
- Peter G. Neumann, Illustrative Risks to the Public in the Use of Computer Systems and Related Technology
- Nancy G. Leveson, Safeware: System Safety and Computers
- Glenford J. Myers, Software Reliability: Principles and Practices
- Lauren Ruth Wiener, Digital Woes: Why We Should Not Depend On Software
- Ivars Peterson, Fatal Defect: Chasing Killer Computer Bugs
Thomas Huckle completed a degree program in mathematics and physics education and in pure mathematics, received a doctorate in 1985, and acquired his postdoctoral teaching qualification (habilitation) in 1991 at the University of Würzburg. A German Research Foundation (DFG) grant enabled him to spend time performing research at Stanford University (1993–1994). In 1995 Professor Huckle joined TUM as professor of scientific computing. He has also been a member of the Mathematics Faculty since 1997. His primary research area is numerical linear algebra and its application in fields such as informatics and physics. His work focuses on solving linear problems on parallel computers, image processing and reconstruction, partial differential equations, and structured matrices.
Tobias Neckel studied applied mathematics at the Technical University of Munich (TUM) and received a doctorate in Computer Science at TUM in 2009. He is currently a senior researcher in scientific computing at TUM and has conducted research at the École Polytechnique, France (2003), the Tokyo Institute of Technology (2008), and the Australian National University (2017). His research interests include the numerical solution of differential equations, hierarchic and adaptive methods, uncertainty quantification, and various aspects of high-performance computing.
Cory Brunson is an Assistant Professor at the Laboratory for Systems Medicine at the University of Florida. His research focuses on geometric and topological approaches to the analysis of medical and healthcare data.