Chapter 6. Lesson 6: Measurements in the Personal Software Process

Table of Contents
Readings
Program 6A: Predictions
Summary

Understand the principles of process measurement, including the Goal-Question-Metric paradigm

Read chapter 7 of the textbook

Write program 6A using PSP1.1

Readings

Chapter 7 of the text is devoted to measurements: what they are, how they are classified, how they are defined, and how they are collected. Given how much of the PSP is devoted to measurement, it's interesting that we wait this long to explore the subject, but the chapter is still a nice overview.

Humphrey divides measurements into the following categories:

 

Table 6-1. Categories of measurement

  objective/subjective: Objective measures count things; subjective measures involve human judgment.
  absolute/relative: Absolute measures are not affected by the addition of new items; relative measures are sensitive to new data. Objective measures are often absolute, while subjective measures tend to be relative.
  explicit/derived: Explicit measures are taken directly, while derived measures are computed from other explicit or derived measures.
  dynamic/static: Dynamic measures have a time dimension; static measures do not.
  predictive/explanatory: Predictive measures can be obtained or generated in advance, while explanatory measures are produced after the fact.
 
--[Humphrey95] 

It's interesting to view metrics through these categories for the purpose of possible automation (I repeat my mantra that the process must be as automated as possible or no one will use it). With that in mind, objective measures are to be valued over subjective ones, and generally absolute over relative (although a relative measure calculated on a curve, for example, would be fine).

The computer is an outstanding tool for computing derived measures, so much so that automated metric design should focus on a small set of explicit measures from which the derived measures can be calculated at any time, rather than stored. Much as software projects should store the minimum set of files necessary to reconstruct the entire project at a moment's notice, I view automated metrics design and collection as a kind of Makefile generation: an elaborate network of dependencies that can generate the desired information from a small set of data. Consider the logfiles I'm generating for these projects; a great deal of information can be garnered from them -- time in phase, interruption data, defects by injection/removal phase, and so on -- yet all of the data is stored in a single, human-readable text file.
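As a sketch of that Makefile-style derivation, here is how one derived measure -- time in phase -- might be computed from a small set of explicit start/stop records (Python; the log format is invented for illustration and is not the format of my actual logfiles):

    #!/usr/bin/env python3
    """Derive time-in-phase from explicit start/stop records (hypothetical format)."""
    from collections import defaultdict
    from datetime import datetime

    def time_in_phase(lines):
        totals = defaultdict(float)   # phase -> minutes (derived measure)
        started = {}                  # phase -> start timestamp (explicit measure)
        for line in lines:
            stamp, event, phase = line.split()
            t = datetime.fromisoformat(stamp)
            if event == "start":
                started[phase] = t
            elif event == "stop" and phase in started:
                totals[phase] += (t - started.pop(phase)).total_seconds() / 60
        return dict(totals)

    if __name__ == "__main__":
        log = [
            "2024-03-01T09:00:00 start design",
            "2024-03-01T09:40:00 stop  design",
            "2024-03-01T09:45:00 start code",
            "2024-03-01T10:30:00 stop  code",
        ]
        print(time_in_phase(log))     # {'design': 40.0, 'code': 45.0}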

The computer also excels at dynamic measures, because of its ability to deal with time, both in timestamp generation (such as my logfiles), and in time calculation. It would be an interesting exercise, for example, to generate a graph of project size over time from a source code repository tree, and use that data as a predictive tool for project completion (time-related data from a bug-tracking system could be similarly useful).
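As a rough sketch of that exercise (assuming git is installed and the script is run inside a working tree), the following accumulates net lines added per commit from git's --numstat output; it counts every versioned file as "code", so the resulting size-over-time curve is only approximate:

    #!/usr/bin/env python3
    """Approximate project size over time, reconstructed from git history."""
    import subprocess

    def size_over_time():
        # --numstat emits "added<TAB>deleted<TAB>path" per changed file;
        # the pretty format tags each commit with its ISO committer date.
        out = subprocess.run(
            ["git", "log", "--reverse", "--numstat", "--pretty=format:@%cI"],
            capture_output=True, text=True, check=True,
        ).stdout
        points, loc, date = [], 0, None
        for line in out.splitlines():
            if line.startswith("@"):
                if date is not None:
                    points.append((date, loc))   # close out the previous commit
                date = line[1:]
            elif "\t" in line:
                added, deleted, _path = line.split("\t", 2)
                if added != "-":                 # binary files report "-"
                    loc += int(added) - int(deleted)
        if date is not None:
            points.append((date, loc))
        return points

    if __name__ == "__main__":
        for date, loc in size_over_time():
            print(date, loc)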

Myself, I have a complaint about the term "predictive measure" -- if it's predictive, it has not in fact been measured yet. But I certainly see Humphrey's point that, for example, LOC is in general an explanatory (after-the-fact) measure, yet one can, with some effort, generate a prediction of project LOC before the fact; that is, after all, the point of much of the PSP.
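For example, a simple least-squares regression over historical (estimated, actual) LOC pairs turns yesterday's explanatory data into a predictive measure for the next project. The sketch below uses invented numbers and is not a full implementation of program 6A's statistics:

    #!/usr/bin/env python3
    """Predict project LOC from an estimate, using prior estimated/actual pairs."""

    def linear_regression(xs, ys):
        # Ordinary least squares: y = b0 + b1 * x
        n = len(xs)
        x_mean, y_mean = sum(xs) / n, sum(ys) / n
        b1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean) / \
             (sum(x * x for x in xs) - n * x_mean * x_mean)
        b0 = y_mean - b1 * x_mean
        return b0, b1

    if __name__ == "__main__":
        estimated = [130, 650, 99, 150, 128, 302]   # estimated LOC, prior projects (invented)
        actual    = [186, 699, 132, 272, 291, 331]  # actual LOC, same projects (invented)
        b0, b1 = linear_regression(estimated, actual)
        print(f"predicted LOC for a 200-LOC estimate: {b0 + b1 * 200:.0f}")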

Humphrey then goes on to outline the "goal-question-metric" (GQM) paradigm (a short sketch follows the list), in which the practitioner is told to [Humphrey95]:

  1. Define the principal goals for your activity.

  2. Construct a comprehensive set of questions to help you achieve those goals, and

  3. Define and gather the data required to answer these questions.
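To make the paradigm concrete, here is a hypothetical GQM breakdown for one of my own goals; the wording is mine, not Humphrey's, and the metrics map onto the kind of logfile data described earlier:

    # Hypothetical goal-question-metric breakdown for a single personal goal.
    gqm = {
        "goal": "Reduce the number of defects that survive into compile/test",
        "questions": {
            "Which phases inject the most defects?":
                ["defects by injection phase"],
            "Which defect types cost the most to fix?":
                ["fix time by defect type"],
            "Is added design time actually reducing test defects?":
                ["time in design phase vs. defects found in test"],
        },
    }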

I spent a few years as the "quality management representative" in my unit in Grand Forks, ND. My supervisor, as it were, was very concerned about the implementation of quality management; in particular, he seemed fascinated with metrics and their representation and display -- not necessarily their practical use. Very soon, our unit sported a mighty-looking wall of metrics, almost none of which were used in any real way. If we'd paid more attention to the GQM concept, things might have gone somewhat better. As it was, the collection of measurements became a laughing matter.

Humphrey spends the rest of the chapter on examples, as well as a good section on "the impact of data gathering", which should be considered by anyone implementing a metrics-gathering process. Even in this small study, I can see positive impacts of metrics-gathering on my own software development process: I'm paying more attention to the types of errors I generate and the time it takes to fix them, and I'm trying to spend more time in the planning and design phases to reduce the number of errors in the compile/test phases -- and this is happening without any formal study of the numbers. The very act of having to record each defect draws attention to the problems at hand.

However, I'm aware of the harmful effects of gathering metrics, and having seen them implemented poorly in the past, I rather strongly believe that metrics on individuals should only be used by those individuals for any sort of process analysis. Failure to observe this will almost certainly result in people "working to fit the metrics" rather than gathering useful data. Metrics on group properties, however (such as project size on a team project), could well be useful and valuable.