Apr 3, 2013

Thoughts on Testing Software

As my co-founder Nat and I are building out our backend systems using Python, Mongo, etc., we're writing suites of test code that are starting to rival or even surpass the production code base in both size and complexity since there's so much to consider: unit tests, functional tests, and end-to-end tests, as well as tests of (time-dependent) data quality, each of which could be a full-time job in itself. Thought I'd take a breather from the marathon known as founding a company to consider some ideas about software testing in general.

From my knowledge of and experience with large-scale software engineering, testing code is not just added value to existing code, but is fundamentally just as important as the production code itself. "How Google Tests Software" by James Whittaker, Jason Arbon, and Jeff Carollo gave a fascinating look into how Google structures its development. It was interesting that the status and monetary rewards given to testers at Google are on par with those of pure software engineers. There currently exist three software engineer titles at Google with varying degrees of testing duties:
  • Software Engineer (SWE) is the traditional developer role concerned with writing code that gets used by consumers.
  • Software Engineer in Test (SET) is also a developer role, but with a focus on testability and general test infrastructure.
  • Test Engineer (TE) has responsibilities that overlap with the SET's, but with more focus on users.
Fortunately or unfortunately, startup founders need to do all three, while under many different pressures.

Production Code :: Testing Code
I've been considering the complementary roles of production code and testing code. Production code and the test code that exercises it should not be viewed as two separate aspects of software development, but rather as intimately related parts that form an indistinguishable whole. The list below organizes related concepts, pairing each left-hand term with a related right-hand term. These distinctions are not perfectly independent, but they provide a first approximation for thinking about such related ideas.

Schrodinger Picture :: Heisenberg Picture
I view the complementary relationship between production and testing code as akin to the two ways that quantum mechanics (the field in which I did my PhD dissertation research) may be formulated: the Schrodinger Picture and the Heisenberg Picture. (Please feel free to move on to the next sections if you haven't had exposure to quantum mechanics, as those other sections consider the same ideas in a different context.) In the Schrodinger Picture, the state vectors describing a quantum system evolve in time, but the operators that act on them remain time-independent. In the Heisenberg Picture, in contrast, the state vectors themselves remain time-independent while the operators evolve in time. One may view focusing on production code as the system of interest as more akin to the Schrodinger Picture, while focusing on the changing test environment in which the code executes is more akin to the Heisenberg Picture.
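For readers who want to see the two pictures side by side, here is a short sketch of the standard textbook relations. The notation is my own choice for illustration and assumes a time-independent Hamiltonian H: U(t) is the time-evolution operator, A is an observable in the Schrodinger Picture, and A_H(t) is the same observable in the Heisenberg Picture.

```latex
% Assumes a time-independent Hamiltonian H, so that U(t) = e^{-iHt/\hbar}.
\begin{align}
  \text{Schrodinger Picture:}\quad & |\psi(t)\rangle = U(t)\,|\psi(0)\rangle,
      \qquad A \text{ fixed} \\
  \text{Heisenberg Picture:}\quad & A_H(t) = U^\dagger(t)\,A\,U(t),
      \qquad |\psi(0)\rangle \text{ fixed} \\
  \text{Same prediction:}\quad & \langle\psi(t)|A|\psi(t)\rangle
      = \langle\psi(0)|\,U^\dagger(t)\,A\,U(t)\,|\psi(0)\rangle
      = \langle\psi(0)|A_H(t)|\psi(0)\rangle
\end{align}
```

Both pictures give the same expectation values; they differ only in whether the time dependence is attributed to the state or to the operators, which is the sense of "complementary" used here.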

System of Interest :: Environment
One way to understand how quantum systems lose their quantum properties is through the process of decoherence, in which there is no collapse of the wavefunction. Instead, the environment acts continuously to weakly measure the system of interest, so that eventually a purely quantum system with such quantum properties as superposition, entanglement, and non-locality transitions into a classical system that no longer exhibits those qualities. Taking this view further, not only can the environment be viewed as a large factor in understanding your quantum state of interest, but it is fundamentally involved in defining the very nature of what can be observed in the quantum system itself (in particular, the basis in which the environment measures the quantum system of interest). To connect this with production and test code, you can consider the production code as the system of interest, with the testing code acting to place it in different environments with different parameters.
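As a minimal sketch of that last point, the test below places one small piece of "production" code into several environments. The function connection_timeout, the TIMEOUT_SECONDS key, and the default of 30 seconds are all hypothetical names and values invented for illustration, not code from our backend.

```python
import unittest


def connection_timeout(env):
    """Hypothetical production code: read a timeout (in seconds) from an
    environment-like mapping, falling back to an assumed default of 30."""
    timeout = float(env.get("TIMEOUT_SECONDS", "30"))  # ValueError if not numeric
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    return timeout


class ConnectionTimeoutTest(unittest.TestCase):
    """The test code plays the role of the environment."""

    def test_same_code_in_different_environments(self):
        environments = [
            ({}, 30.0),                            # nothing configured
            ({"TIMEOUT_SECONDS": "5"}, 5.0),       # typical setting
            ({"TIMEOUT_SECONDS": "0.25"}, 0.25),   # aggressive setting
        ]
        for env, expected in environments:
            with self.subTest(env=env):
                self.assertEqual(connection_timeout(env), expected)

    def test_hostile_environments(self):
        for bad_value in ("0", "-1", "soon"):
            with self.subTest(value=bad_value):
                with self.assertRaises(ValueError):
                    connection_timeout({"TIMEOUT_SECONDS": bad_value})
```

Saved as a module, this can be run with python -m unittest. The production function never changes; only the environment handed to it does.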

Observed :: Observer (but not strictly)
The production code is the system that is observed, whereas the various external factors that affect it serve to "observe" it under different circumstances. Production code may be correct only when observed in a specific set of ways.

Low Degrees of Freedom :: High Degrees of Freedom
We can view production code as having a lower number of degrees of freedom of interest compared to the higher degrees of freedom in testing. When writing the code, we assume that we understand the failure points and write exception handling to deal with known unknowns accordingly. Testing allows you to expand these known unknowns artificially to encompass a larger space of possibilities or degrees of freedom.

Current Outcome :: Counterfactual
Production code concerns what is tangible, concrete, and exists right now, and its error handling accounts for what the programmer understands: a small set of cases that intuitively make sense. On the other hand, the space of branching points and data flows through your code is essentially infinite. For example, an unsigned long int in C++ goes from 0 to 4294967295 on platforms where it is 32 bits wide (and much further where it is 64 bits). A complete and perfect testing program would need to account for all the other "counterfactual branchings" in addition to the one value the program is currently processing or holding in memory.
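Since exhaustively visiting every counterfactual value is impractical, a common compromise is to test the boundaries plus a random sample of the interior. Here is a rough sketch under that assumption; add_u32 and check_add_u32 are hypothetical names made up for this post, and the sample size and seed are arbitrary.

```python
import random

U32_MAX = 2**32 - 1  # 4294967295, the largest 32-bit unsigned value


def add_u32(a, b):
    """Hypothetical production code: add with 32-bit unsigned wraparound."""
    for value in (a, b):
        if not 0 <= value <= U32_MAX:
            raise ValueError("value out of range for a 32-bit unsigned integer")
    return (a + b) & U32_MAX


def check_add_u32(samples=1000, seed=0):
    """Check boundary values plus a seeded random sample of the input space.

    Returns the number of input pairs visited, a tiny fraction of the
    2**64 possible (a, b) pairs.
    """
    boundaries = [0, 1, U32_MAX - 1, U32_MAX]
    for a in boundaries:
        for b in boundaries:
            assert add_u32(a, b) == (a + b) % 2**32

    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(samples):
        a = rng.randint(0, U32_MAX)
        b = rng.randint(0, U32_MAX)
        assert add_u32(a, b) == add_u32(b, a)  # addition commutes
        assert add_u32(a, 0) == a              # zero is the identity
    return len(boundaries) ** 2 + samples
```

Calling check_add_u32() visits 16 boundary pairs and 1000 random pairs. Everything else remains a counterfactual that the test never observes.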

Subject :: White Space
Just as the text on a page needs white space and the subject of a photograph needs empty space to define or enhance it (it's hard to read text that's bunched up together, and a self-portrait taken in the middle of a busy nightclub probably isn't very flattering, haha), production code needs testing code to better define it.

Truth :: Context of Truth
In general, a bug-free program processing data from some initial set of conditions works to produce a "truth"; however, that truth is only truthful within the specific context in which it exists. Testing defines a context in which code is true (as an aside, the testing context is only a small subset of all possible contexts).


After writing this long post about some ideas I've had about testing, I can't help thinking about testing the testing code itself, testing that code, and so on. However, that's an asymptotic and inductive process that takes a very long time. How long? Much longer than it takes for our startup's funding to run out! :)

So with that, I'll have to sign off and get back to writing test code. I hope you enjoyed this post as much as I enjoyed writing it; it was a nice, refreshing change of pace from the thousands upon thousands of lines of Python code (and test code) I've been writing.

- Kevin
