The Personal Software Process: an Independent Study
Chapter 6. Lesson 6: Measurements in the Personal Software Process
Enhance program 4A to calculate the linear regression parameters and the prediction interval
Requirements: Write a program to calculate an LOC estimate and the 90 percent and 70 percent prediction intervals for this estimate... Use program 5A to calculate the value of the t distribution and use a linked list for the data. You may enhance program 4A to develop this program. Note that to calculate the value of t, you integrate from 0 to a trial value of t. You find the correct value by successively adjusting the trial value of t up or down until the p value is within an acceptable error of 0.85 (for a 70 percent prediction interval) or 0.95 (for a 90 percent prediction interval).
Testing: Thoroughly test the program. As one test, use the data for estimated object LOC and actual new and changed LOC in table D8 and the beta-0 and beta-1 values found from testing program 4A. Also assume an estimated object LOC value of 386. Under these conditions, the estimated LOC values and the parameter values obtained should be as follows:
Table 6-2. Test Results Format -- Program 6a:
| Parameter | Expected Value |
| Beta0 | -22.55 |
| Beta1 | 1.7279 |
| UPI (70 percent) | 874 |
| LPI (70 percent) | 414 |
| UPI (90 percent) | 1030 |
| LPI (90 percent) | 258 |
--[Humphrey95]
Program 6a leaves a bit more room for requirements interpretation than some of the others; I'll clarify my interpretation as much as possible.
First, input. I'll keep my current trend of reading from standard input; it will allow me to reuse a great deal of code (particularly my hard-working, if extremely simplistic, simple_input_parser). In this case, though, the program will read pairs of numbers from standard input, and a line containing only the string "stop" will mark the end of the historical data. Because it would be convenient to include comments in the text files (thus allowing me to preserve a sort of "database" of historical data, annotated with meaning), two dashes (--) will mark the beginning of an inline comment. Blank lines will be ignored. These pairs represent the historical data, and program 6A is free to calculate several parameters from them. Once the historical data is read, the numbers that follow will be the new "x" values, from which we are to determine the "y" values; reading "x" values will continue until an EOF is encountered. The data that can be computed from the historical data alone (beta values, etc.) will be output at the end of the historical data; the rest (prediction intervals, etc.) will be output after each "x" value is read. An example of the output after reading the data from table D8 in [Humphrey95] would be:
Historical data read.
Beta-0: -22.55
Beta-1: 1.7279
Standard deviation: 197.8956
Estimate at x= 386
Projected y: 644.429
t( 70 percent ): 1.108
t( 90 percent ): 1.860
Range( 70 percent ): 229.9715; UPI: 874.401; LPI: 414.4579
Range( 90 percent ): 386.0533; UPI: 1030.483; LPI: 258.3761
By showing these intermediate values for t, the ranges, and so on, I hope to simplify testing while keeping the program versatile and easy to use.
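For illustration, an input file in this format might look something like the following (the numbers are made-up placeholders rather than the table D8 data, and the comma-separated pair format is the one the parser expects):

-- historical data: estimated object LOC, actual new and changed LOC
120, 186
96, 151
250, 433
181, 247
stop -- end of historical data; the regression parameters print here
386 -- new estimates ("x" values) follow, one per line, until EOF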
Once again, my historical estimated-LOC-to-actual-LOC data is so far out of whack that, according to the PROBE script, it was unusable (even after throwing out programs 2a and 3a). While this is somewhat discouraging, it's a powerful motivator to pay more attention to my development practices and try to improve the quality of my work.
Using the historical data as an average (as per the PROBE script), I've come up with an estimate of 234 new/changed LOC for program 6a. This seems a bit high to me, but we'll see how it goes.
My time data is getting a little more stable, and I was able to use linear regression to predict a total project time of about 256 minutes based on PROBE's estimate for size.
Our single_variable_function and simpson_integrator objects do some more work this round, as does our old standby, the simple_input_parser. We'll create several new single_variable_function subclasses: gamma_function, t_distribution_base, and of course t_distribution. The last of these is a bit peculiar, because the t-distribution depends not only on x but also on the "degrees of freedom" involved; I'll get around this by adding a "degrees of freedom" feature to the t_distribution class, effectively currying the original two-variable function into a single-variable function.
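For reference, the mathematics these classes implement (the "integrate from 0 to a trial value of t" of the requirements) is the standard t-distribution with n degrees of freedom:

\[
f_n(u) = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\!\left(\frac{n}{2}\right)}
         \left(1 + \frac{u^2}{n}\right)^{-\frac{n+1}{2}},
\qquad
p(t) = \frac{1}{2} + \int_0^{t} f_n(u)\,du
\]

gamma_function only needs the values reachable from \(\Gamma(1) = 1\), \(\Gamma(\tfrac{1}{2}) = \sqrt{\pi}\), and \(\Gamma(x) = (x-1)\,\Gamma(x-1)\); t_distribution_base supplies the \((1 + u^2/n)\) term; t_integral adds the normalizing multiplier and evaluates p(t) with the Simpson integrator; and t_distribution inverts the relation, adjusting a trial t until p(t) reaches 0.85 (for a 70 percent interval) or 0.95 (for a 90 percent interval). As a sanity check, the t values in the sample output above (1.108 and 1.860) are the standard two-sided 70 and 90 percent points for 8 degrees of freedom.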
The only other new class with substance is the predictor_parser class, which will transform input strings by deleting everything from the inline-comment identifier ("--") to the end of the line and then stripping whitespace. It will then handle strings based on its state: before the end of the historical data, pairs will be added to a paired_number_list_predictor, a child of the venerable paired_number_list from lesson 4a. At the end of the historical data (signified by the word "stop" on a line by itself), the class will print statistics on the data (regression parameters, etc.). Further numbers will be used as "x" values for prediction, each resulting in a prediction based on the regression parameters and the t-distribution, as summarized below. Blank lines will be ignored.
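The quantities printed for each new estimate x_k are the standard regression prediction-interval formulas, which map directly onto projected_y, variance, standard_deviation, and prediction_range in the listings:

\[
y_k = \beta_0 + \beta_1 x_k,
\qquad
\sigma = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2}
\]
\[
\mathrm{Range}(\alpha) = t(\alpha,\,n-2)\,\sigma\,
  \sqrt{1 + \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},
\qquad
\mathrm{UPI} = y_k + \mathrm{Range}(\alpha),
\quad
\mathrm{LPI} = y_k - \mathrm{Range}(\alpha)
\]

Here t(α, n−2) is the two-sided t value for coverage α with n−2 degrees of freedom, which is exactly what the t_distribution class computes.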
No large surprises here, although along the way I was prompted to spin off some elements into separate reusable files/classes (like the extremely lightweight but somewhat useful error_log class).
The simple_input_parser, simpson_integrator, number_list, paired_number_list, and single_variable_function classes were reused in full.
/* error_log.h -- very rudimentary error reporting and error-flag tracking */
#ifndef ERROR_LOG_H
#define ERROR_LOG_H
#include <string>
class error_log
{
public:void check_error (bool should_be_false, const std::string & message);
void log_error (const std::string & message);
void clear_error_flag (void);
void set_error_flag (void);
bool error_flag (void) const;
error_log (void);
protected:bool m_error_flag;
};
#endif
/* error_log implementation */
#include "error_log.h"
#include <iostream>
#include <string>
void
error_log::check_error (bool should_be_false, const std::string & message)
{
if (should_be_false)
{
m_error_flag = true;
log_error (message);
}
}
void
error_log::log_error (const std::string & message)
{
std::cerr << "Error: " << message << "\n";
}
void
error_log::clear_error_flag (void)
{
m_error_flag = false;
}
void
error_log::set_error_flag (void)
{
m_error_flag = true;
}
bool error_log::error_flag (void) const
{
return m_error_flag;
}
error_log::error_log (void)
{
clear_error_flag ();
}
#ifndef GAMMA_FUNCTION_H
#define GAMMA_FUNCTION_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//gamma function, one of the bases of the t-distribution; only exact for
//the integer and half-integer arguments the t-distribution needs
class gamma_function:public single_variable_function
{
public:virtual double at (double x) const;
};
#endif
/* gamma_function implementation */
#include "gamma_function.h"
#ifndef IS_DOUBLE_EQUAL_H
#include "is_double_equal.h"
#endif
#include <math.h>
double
gamma_function::at (double x) const
{
double
Result = 0;
if (is_double_equal (x, 1.0))
{
Result = 1;
}
else if (is_double_equal (x, 0.5))
{
Result = sqrt (M_PI);
}
else
{
Result = (x - 1.0) * at (x - 1.0);
}
return Result;
}
#ifndef IS_DOUBLE_EQUAL_H
#define IS_DOUBLE_EQUAL_H
bool is_double_equal (const double lhs, const double rhs);
void set_is_double_equal_margin (const double new_margin);
#endif
/* is_double_equal implementation */
#include "is_double_equal.h"
#include <math.h>
static double double_equal_margin = 0.00001;
bool is_double_equal (const double lhs, const double rhs)
{
if (fabs (lhs - rhs) < double_equal_margin)
{
return true;
}
else
{
return false;
}
}
void
set_is_double_equal_margin (const double new_margin)
{
double_equal_margin = new_margin;
}
#ifndef PAIRED_NUMBER_LIST_PREDICTOR_H
#define PAIRED_NUMBER_LIST_PREDICTOR_H
#ifndef PAIRED_NUMBER_LIST_H
#include "paired_number_list.h"
#endif
#ifndef T_DISTRIBUTION_H
#include "t_distribution.h"
#endif
//reads a set of paired numbers, does linear regression, predicts results
class paired_number_list_predictor:public paired_number_list
{
public:double variance (void) const;
double standard_deviation (void) const;
double projected_y (double x) const;
double prediction_range (double x, double range) const;
double lower_prediction_interval (double x, double range) const;
double upper_prediction_interval (double x, double range) const;
double t (double range) const;
protected:t_distribution m_t_distribution;
double prediction_range_base (void) const;
};
#endif
/* paired_number_list_predictor implementation */
#include "paired_number_list_predictor.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
paired_number_list_predictor::variance (void) const
{
REQUIRE (entry_count () > 2);
double
Result = 0;
std::list < double >::const_iterator x_iter;
std::list < double >::const_iterator y_iter;
for (x_iter = m_xs.begin (), y_iter = m_ys.begin ();
(x_iter != m_xs.end ()) && (y_iter != m_ys.end ()); ++x_iter, ++y_iter)
{
Result += pow (*y_iter - beta_0 () - beta_1 () * (*x_iter), 2);
}
Result *= 1.0 / (entry_count () - 2.0);
return Result;
}
double
paired_number_list_predictor::standard_deviation (void) const
{
return sqrt (variance ());
}
double
paired_number_list_predictor::projected_y (double x) const
{
return beta_0 () + beta_1 () * x;
}
double
paired_number_list_predictor::t (double range) const
{
//set_n() is not const, so cast away constness to set the degrees of
//freedom (entry_count - 2) before evaluating the distribution
const_cast <
paired_number_list_predictor *
>(this)->m_t_distribution.set_n (entry_count () - 2);
return m_t_distribution.at (range);
}
double
paired_number_list_predictor::prediction_range (double x, double range) const
{
REQUIRE (entry_count () > 0);
//range = t * sigma * sqrt(1 + 1/n + (x - x_mean)^2 / sum((x_i - x_mean)^2))
return t (range) * standard_deviation ()
* sqrt (1.0 + 1.0 / static_cast < double >(entry_count ())
+ pow (x - x_mean (), 2) / prediction_range_base ());
}
double
paired_number_list_predictor::lower_prediction_interval (double x,
double range) const
{
return projected_y (x) - prediction_range (x, range);
}
double
paired_number_list_predictor::upper_prediction_interval (double x,
double range) const
{
return projected_y (x) + prediction_range (x, range);
}
double
paired_number_list_predictor::prediction_range_base (void) const
{
double
Result = 0;
for (std::list < double >::const_iterator x_iter = m_xs.begin ();
x_iter != m_xs.end (); ++x_iter)
{
Result += pow ((*x_iter) - x_mean (), 2);
}
return Result;
}
#ifndef T_DISTRIBUTION_H
#define T_DISTRIBUTION_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//the t-distribution, used to find prediction ranges; at(x) returns the t
//value whose two-sided coverage is x, given n degrees of freedom
class t_distribution:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_distribution (void);
protected:double n;
double next_guess (double arg, double last_result, double target) const;
};
#endif
#include "t_distribution.h"
#ifndef T_INTEGRAL_H
#include "t_integral.h"
#endif
void
t_distribution::set_n (double new_n)
{
n = new_n;
}
double
t_distribution::at (double x) const
{
//x is a two-sided coverage (0.7 or 0.9); convert it to the one-sided
//cumulative probability we must reach (0.85 or 0.95)
x = 0.5 + x / 2;
t_integral
integral;
integral.set_n (n);
double
last_error = 0;
const double
error_margin = 0.0000000001;
double
integral_arg = 0;
double
this_result = 0.5;
double
last_result = 0;
bool
has_tried_once = false;
while (!has_tried_once || (last_error > error_margin))
{
last_result = this_result;
this_result = integral.at (integral_arg);
last_error = fabs (this_result - x);
//adjust the trial argument up or down using the previous trial's result;
//the loop exits once the integral at the current trial is within
//error_margin of the target
integral_arg = next_guess (integral_arg, last_result, x);
has_tried_once = true;
}
return integral_arg;
}
t_distribution::t_distribution (void)
{
set_n (0);
}
double
t_distribution::next_guess (double arg, double last_result, double target) const
{
double
Result = 0;
Result = arg + (target - last_result);
return Result;
}
#ifndef T_DISTRIBUTION_BASE_H
#define T_DISTRIBUTION_BASE_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//simple function serving as the base (integrand) of the t-distribution integral
class t_distribution_base:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_distribution_base (void);
protected:double n;
};
#endif
/* t_distribution_base implementation */
#include "t_distribution_base.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
t_distribution_base::at (double x) const
{
REQUIRE (n > 0);
return pow (1 + x * x / n, -(n + 1) / 2);
}
void
t_distribution_base::set_n (double new_n)
{
n = new_n;
}
t_distribution_base::t_distribution_base (void):
n (0)
{
}
#ifndef T_INTEGRAL_H
#define T_INTEGRAL_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
#ifndef GAMMA_FUNCTION_H
#include "gamma_function.h"
#endif
#ifndef SIMPSON_INTEGRATOR_H
#include "simpson_integrator.h"
#endif
#ifndef T_DISTRIBUTION_BASE_H
#include "t_distribution_base.h"
#endif
//integral of the t-distribution up to a given point; NOT what you use to
//make predictions -- use t_distribution instead
class t_integral:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_integral (void);
protected:simpson_integrator homer;
gamma_function gamma;
t_distribution_base base;
double n;
double multiplier (void) const;
};
#endif
/* t_integral implementation */
#include "t_integral.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
t_integral::multiplier (void) const
{
return gamma.at ((n + 1) / 2) / (sqrt (n * M_PI) * gamma.at (n / 2));
}
void
t_integral::set_n (double new_n)
{
n = new_n;
base.set_n (new_n);
}
t_integral::t_integral (void)
{
set_n (0);
}
double
t_integral::at (double x) const
{
REQUIRE (n > 0);
double
Result = 0.5;
if (x < 0)
{
Result = 0.5 - multiplier () * homer.integral (base, 0, -x);
}
else if (x > 0)
{
Result = 0.5 + multiplier () * homer.integral (base, 0, x);
}
return Result;
}
/* program 6A driver: reads historical data and new estimates from standard
   input and prints predictions */
#include <fstream>
#include <iostream>
#include <string.h>
#ifndef PREDICTOR_PARSER_H
#include "predictor_parser.h"
#endif
std::istream *
input_stream_from_args (int arg_count, const char **arg_vector)
{
std::istream *Result = NULL;
if (arg_count == 1)
{
Result = &std::cin;
}
else
{
const char *help_text =
"PSP exercise 6A: Calculate a prediction and interval given historical data.\nUsage:\n\tpsp_6a\n\n";
std::cout << help_text;
}
return Result;
}
int
main (int arg_count, const char **arg_vector)
{
//get the input stream, or print the help text as appropriate
std::istream *input_stream = input_stream_from_args (arg_count, arg_vector);
if (input_stream != NULL)
{
predictor_parser parser;
parser.set_input_stream (input_stream);
parser.parse_until_eof ();
}
return 0;
}
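As a quick standalone check of the t-distribution inversion, something like the following could be compiled against the classes above (the expected values come from the sample output earlier; treating table D8 as having n - 2 = 8 degrees of freedom is my assumption):

/* hypothetical check driver -- not part of program 6A proper */
#include <iostream>
#include "t_distribution.h"
int
main (void)
{
  t_distribution dist;
  dist.set_n (8);               //degrees of freedom = n - 2
  //expect roughly 1.108 and 1.860, as in the sample output above
  std::cout << "t( 70 percent ): " << dist.at (0.7) << "\n";
  std::cout << "t( 90 percent ): " << dist.at (0.9) << "\n";
  return 0;
}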
class ERROR_LOG
--very rudimentary error reporting and error flag tracking.
creation {ANY}
make
feature {ANY}
check_for_error(should_be_false: BOOLEAN; message: STRING) is
do
if should_be_false then
log_error(message);
end;
end -- check_for_error
log_error(message: STRING) is
do
std_error.put_string(message + "%N");
set_error_flag;
end -- log_error
clear_error_flag is
do
error_flag := false;
end -- clear_error_flag
set_error_flag is
do
error_flag := true;
end -- set_error_flag
error_flag: BOOLEAN;
make is
do
clear_error_flag;
end -- make
end -- class ERROR_LOG
class GAMMA_FUNCTION
--gamma function, one of the bases of the t-distribution
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
feature {ANY}
error_margin: DOUBLE is 0.000001;
acceptably_equal(lhs, rhs: DOUBLE): BOOLEAN is
--is lhs within error_margin of rhs?
do
Result := false;
if lhs.in_range(rhs - error_margin,rhs + error_margin) then
Result := true;
end;
end -- acceptably_equal
at(x: DOUBLE): DOUBLE is
--gamma function
do
if acceptably_equal(x,1.0) then
Result := 1;
elseif acceptably_equal(x,0.5) then
Result := Pi.sqrt;
else
Result := (x - 1.0) * at(x - 1.0);
end;
end -- at
end -- class GAMMA_FUNCTION
class MAIN
creation {ANY}
make
feature {ANY}
make is
local
parser: PREDICTOR_PARSER;
do
!!parser.make;
parser.set_input(io);
parser.parse_until_eof;
end -- make
end -- class MAIN
class PAIRED_NUMBER_LIST_PREDICTOR
--reads a set of paired numbers, does linear regression, predicts results
inherit
PAIRED_NUMBER_LIST
redefine make
end;
creation {ANY}
make
feature {ANY}
variance: DOUBLE is
local
i: INTEGER;
do
Result := 0;
from
i := xs.lower;
until
not (xs.valid_index(i) and ys.valid_index(i))
loop
Result := Result + (ys.item(i) - beta_0 - beta_1 * xs.item(i)) ^ 2;
i := i + 1;
end;
Result := Result / (entry_count - 2);
end -- variance
standard_deviation: DOUBLE is
do
Result := variance.sqrt;
end -- standard_deviation
projected_y(x: DOUBLE): DOUBLE is
--projected value of given x, using linear regression
--parameters from xs and ys
do
Result := beta_0 + beta_1 * x;
end -- projected_y
prediction_range_base: DOUBLE is
--base of the prediction range, used in prediction_range
local
i: INTEGER;
do
Result := 0;
from
i := xs.lower;
until
not (xs.valid_index(i) and ys.valid_index(i))
loop
Result := Result + (xs.item(i) - xs.mean) ^ 2;
i := i + 1;
end;
end -- prediction_range_base
prediction_range(x, range: DOUBLE): DOUBLE is
--prediction range, based on given estimate and % range
require
entry_count > 0;
do
Result := (1.0 + (1.0 / entry_count.to_double) + (((x - xs.mean) ^ 2) / prediction_range_base)).sqrt;
Result := t(range) * standard_deviation * Result;
end -- prediction_range
lower_prediction_interval(x, range: DOUBLE): DOUBLE is
--LPI, from [Humphrey95]
do
Result := projected_y(x) - prediction_range(x,range);
end -- lower_prediction_interval
upper_prediction_interval(x, range: DOUBLE): DOUBLE is
--UPI, from [Humphrey95]
do
Result := projected_y(x) + prediction_range(x,range);
end -- upper_prediction_interval
t_distribution: T_DISTRIBUTION;
make is
do
Precursor;
!!t_distribution.make;
end -- make
t(range: DOUBLE): DOUBLE is
--gets the size of the t-distribution at the given alpha range
do
t_distribution.set_n(entry_count - 2);
Result := t_distribution.at(range);
end -- t
end -- class PAIRED_NUMBER_LIST_PREDICTOR
class PREDICTOR_PARSER
--reads a list of number pairs, and performs linear regression analysis
inherit
SIMPLE_INPUT_PARSER
redefine parse_last_line, transformed_line
end;
creation {ANY}
make
feature {ANY}
inline_comment_begin: STRING is "--";
string_stripped_of_comment(string: STRING): STRING is
--strip the string of any comment
local
comment_index: INTEGER;
do
if string.has_string(inline_comment_begin) then
comment_index := string.index_of_string(inline_comment_begin);
if comment_index = 1 then
Result := "";
else
Result := string.substring(1,comment_index - 1);
end;
else
Result := string;
end;
end -- string_stripped_of_comment
string_stripped_of_whitespace(string: STRING): STRING is
--strip string of whitespace
do
Result := string;
Result.left_adjust;
Result.right_adjust;
end -- string_stripped_of_whitespace
transformed_line(string: STRING): STRING is
--strip comments and whitespace from parseable line
do
Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
end -- transformed_line
number_list: PAIRED_NUMBER_LIST_PREDICTOR;
feature {ANY} --parsing
found_end_of_historical_data: BOOLEAN;
reset is
--resets the parser and makes it ready to go again
do
found_end_of_historical_data := false;
number_list.reset;
end -- reset
make is
do
!!number_list.make;
reset;
end -- make
parse_last_line_as_historical_data is
--interpret last_line as a pair of comma-separated values
local
error_log: ERROR_LOG;
comma_index: INTEGER;
x_string: STRING;
y_string: STRING;
new_x: DOUBLE;
new_y: DOUBLE;
do
!!error_log.make;
comma_index := last_line.index_of(',');
error_log.check_for_error(comma_index = last_line.count + 1,"No comma:" + last_line);
x_string := last_line.substring(1,comma_index - 1);
y_string := last_line.substring(comma_index + 1,last_line.count);
error_log.check_for_error(not (x_string.is_double or x_string.is_integer),"invalid X:" + last_line);
error_log.check_for_error(not (y_string.is_double or y_string.is_integer),"invalid Y:" + last_line);
if not error_log.error_flag then
new_x := double_from_string(x_string);
new_y := double_from_string(y_string);
number_list.add_entry(new_x,new_y);
std_output.put_string("added: ");
std_output.put_double(new_x);
std_output.put_string(", ");
std_output.put_double(new_y);
std_output.put_new_line;
end;
end -- parse_last_line_as_historical_data
double_from_string(string: STRING): DOUBLE is
require
string.is_double or string.is_integer;
do
if string.is_double then
Result := string.to_double;
elseif string.is_integer then
Result := string.to_integer.to_double;
end;
end -- double_from_string
historical_data_terminator: STRING is "stop";
parse_last_line_as_end_of_historical_data is
--interpret last line as the end of historical data
require
last_line.compare(historical_data_terminator) = 0;
do
found_end_of_historical_data := true;
std_output.put_string("Historical data read.%NBeta-0: ");
std_output.put_double(number_list.beta_0);
std_output.put_string("%NBeta-1: ");
std_output.put_double(number_list.beta_1);
std_output.put_string("%NStandard Deviation: ");
std_output.put_double(number_list.standard_deviation);
std_output.put_string("%N%N");
end -- parse_last_line_as_end_of_historical_data
parse_last_line_as_prediction is
--interpret last line as a single x, for a predictive y
local
error_log: ERROR_LOG;
x: DOUBLE;
do
!!error_log.make;
error_log.check_for_error(not (last_line.is_double or last_line.is_integer),"Not a double : " + last_line);
if not error_log.error_flag then
x := double_from_string(last_line);
std_output.put_string("Estimate at x=");
std_output.put_double(x);
std_output.put_string("%NProjected y: ");
std_output.put_double(number_list.projected_y(x));
std_output.put_string("%Nt (70 percent): ");
std_output.put_double(number_list.t(0.7));
std_output.put_string("%Nt (90 percent): ");
std_output.put_double(number_list.t(0.9));
std_output.put_string("%NRange (70 percent): ");
std_output.put_double(number_list.prediction_range(x,0.7));
std_output.put_string("; UPI: ");
std_output.put_double(number_list.upper_prediction_interval(x,0.7));
std_output.put_string("; LPI: ");
std_output.put_double(number_list.lower_prediction_interval(x,0.7));
std_output.put_string("%NRange (90 percent): ");
std_output.put_double(number_list.prediction_range(x,0.9));
std_output.put_string("; UPI: ");
std_output.put_double(number_list.upper_prediction_interval(x,0.9));
std_output.put_string("; LPI: ");
std_output.put_double(number_list.lower_prediction_interval(x,0.9));
std_output.put_new_line;
end;
end -- parse_last_line_as_prediction
parse_last_line is
--parse the last line according to state
do
if not last_line.empty then
if last_line.compare(historical_data_terminator) = 0 then
parse_last_line_as_end_of_historical_data;
else
if found_end_of_historical_data then
parse_last_line_as_prediction;
else
parse_last_line_as_historical_data;
end;
end;
end;
end -- parse_last_line
end -- class PREDICTOR_PARSER
class T_DISTRIBUTION
--the t-distribution, used to find prediction ranges, etc.
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
set_n(new_n: INTEGER) is
do
n := new_n;
end -- set_n
make is
do
set_n(0);
end -- make
next_guess(arg, last_result, target: DOUBLE): DOUBLE is
do
Result := arg + (target - last_result);
end -- next_guess
error_margin: DOUBLE is 0.0000000001;
at(x: DOUBLE): DOUBLE is
local
t_integral: T_INTEGRAL;
last_error: DOUBLE;
this_result: DOUBLE;
last_result: DOUBLE;
has_tried_once: BOOLEAN;
target: DOUBLE;
do
--convert the range to a two-sided distribution so we get
--what we're expecting
target := 0.5 + x / 2;
!!t_integral.make;
t_integral.set_n(n);
last_error := 0;
from
Result := 0;
has_tried_once := false;
until
has_tried_once and error_margin > last_error
loop
last_result := this_result;
this_result := t_integral.at(Result);
last_error := (this_result - target).abs;
Result := next_guess(Result,last_result,target);
has_tried_once := true;
end;
end -- at
end -- class T_DISTRIBUTION
class T_DISTRIBUTION_BASE
--simple function as the base of the t-distribution integral
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
--degrees of freedom
set_n(new_n: INTEGER) is
do
n := new_n;
end -- set_n
make is
do
set_n(0);
end -- make
at(x: DOUBLE): DOUBLE is
require
n > 0;
do
Result := ((1.0 + x * x / n.to_double) ^ - (n + 1)).sqrt;
end -- at
end -- class T_DISTRIBUTION_BASE
class T_INTEGRAL
--integral of the t-distribution up to a given point; NOT what you
--use to make predictions; use T_DISTRIBUTION instead
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
set_n(new_n: INTEGER) is
do
n := new_n;
t_distribution_base.set_n(new_n);
end -- set_n
make is
do
!!gamma;
!!homer.make;
!!t_distribution_base.make;
set_n(0);
end -- make
multiplier: DOUBLE is
do
Result := gamma.at((n + 1) / 2) / ((n * Pi).sqrt * gamma.at(n / 2));
end -- multiplier
gamma: GAMMA_FUNCTION;
homer: SIMPSON_INTEGRATOR;
t_distribution_base: T_DISTRIBUTION_BASE;
at(x: DOUBLE): DOUBLE is
do
Result := 0.5;
if x < 0 then
Result := 0.5 - multiplier * homer.integral(t_distribution_base,0,- x);
else
Result := 0.5 + multiplier * homer.integral(t_distribution_base,0,x);
end;
end -- at
end -- class T_INTEGRAL
My continual problems abound: using the wrong language's assignment style, loop increments and instantiation in Eiffel, and header/source parity in C++. The C++ standard library continues to surprise me, which is not a good thing: during the course of this program, previously working programs "stopped working", no longer recognizing EOF in data files. The "solution" was to read and put back a character after each call to iostream::getline(), which changed nothing except that EOF was now detected. Strange.
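A sketch of that workaround, reconstructed from the description above (the actual change lives in simple_input_parser, which is not listed here):

/* hypothetical illustration of the EOF workaround */
#include <iostream>
#include <string>
void
read_line (std::istream & in, std::string & line)
{
  std::getline (in, line);
  //probe one character past the line just read and put it back; this
  //leaves the data untouched but forces the stream's EOF state to be
  //set as soon as the last line has been consumed
  char probe;
  if (in.get (probe))
    {
      in.putback (probe);
    }
}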
Despite the explicit note in the problem statement ("Note that to calculate the value of t, you integrate from 0 to a trial value of t. You find the correct value by successively adjusting the trial value of t up or down until the p value is within an acceptable error..." [Humphrey95]), I didn't understand what it meant until the program appeared horribly broken. Figuring this out took a great deal of time and resulted in a huge "design defect" in the test phase.
Table 6-3. Test Results Format -- Program 6a:
| Test | Parameter | Expected Value | Actual Value (C++) | Actual Value (Eiffel) |
| Table D8 | Beta0 | -22.55 | -22.5525 | -22.552533 |
| | Beta1 | 1.7279 | 1.72793 | 1.727932 |
| | UPI (70%) | 874 | 874.431 | 874.430261 |
| | LPI (70%) | 414 | 414.428 | 414.428507 |
| | UPI (90%) | 1030 | 1030.39 | 1030.387465 |
| | LPI (90%) | 258 | 258.47 | 258.471303 |
| Program 6a, estimated new and changed LOC (using historical data for programs 2a through 6a) | UPI (70%) | N/A | 400.809 | 400.809121 |
| | LPI (70%) | N/A | 166.415 | 166.414689 |
| | UPI (90%) | N/A | 504.297 | 504.297205 |
| | LPI (90%) | N/A | 62.9266 | 62.926605 |
| | Actual new/changed LOC | N/A | 299 | 299 |
Table 6-4. Project Plan Summary
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | Statistic Predictor | Program# | 6A |
| Instructor: | Wells | Language: | C++ |
| Summary | Plan | Actual | To date |
| LOC/Hour | 48 | 56 | 49 |
| Planned time | 256 | 256 | |
| Actual time | 315 | 315 | |
| CPI (cost/performance index) | 0.813 | ||
| %reused | 44 | 44 | 23 |
| Program Size | Plan | Actual | To date |
| Base | 17 | 17 | |
| Deleted | 2 | 2 | |
| Modified | 3 | 2 | |
| Added | 231 | 297 | |
| Reused | 233 | 254 | 396 |
| Total New and Changed | 234 | 299 | 1089 |
| Total LOC | 442 | 568 | 1712 |
| Total new/reused | 0 | 0 | 0 |
| Time in Phase (min): | Plan | Actual | To Date | To Date% |
| Planning | 46 | 58 | 241 | 18 |
| Design | 26 | 27 | 130 | 10 |
| Code | 70 | 83 | 351 | 27 |
| Compile | 15 | 24 | 89 | 7 |
| Test | 79 | 108 | 420 | 31 |
| Postmortem | 20 | 15 | 92 | 7 |
| Total | 256 | 315 | 1323 | 100 |
| Defects Injected | Actual | To Date | To Date % | |
| Plan | 0 | 0 | 0 | |
| Design | 17 | 42 | 30 | |
| Code | 23 | 93 | 66 | |
| Compile | 0 | 3 | 2.5 | |
| Test | 0 | 3 | 2.5 | |
| Total development | 40 | 141 | 100 | |
| Defects Removed | Actual | To Date | To Date % | |
| Planning | 0 | 0 | 0 | |
| Design | 0 | 0 | 0 | |
| Code | 9 | 31 | 22 | |
| Compile | 22 | 71 | 50 | |
| Test | 9 | 39 | 28 | |
| Total development | 40 | 141 | 100 | |
| After Development | 0 | 0 |
| Eiffel code/compile/test |
| Time in Phase (min) | Actual | To Date | To Date % |
| Code | 71 | 217 | 49 |
| Compile | 30 | 106 | 24 |
| Test | 22 | 117 | 27 |
| Total | 123 | 440 | 100 |
| Defects Injected | Actual | To Date | To Date % |
| Design | 0 | 4 | 4 |
| Code | 28 | 86 | 95 |
| Compile | 0 | 0 | 0 |
| Test | 0 | 1 | 1 |
| Total | 28 | 91 | 100 |
| Defects Removed | Actual | To Date | To Date % |
| Code | 0 | 1 | 1 |
| Compile | 22 | 61 | 67 |
| Test | 6 | 29 | 32 |
| Total | 28 | 91 | 100 |
Table 6-5. Time Recording Log
| Student: | Victor B. Putz | Date: | 000109 |
| Program: | 6A |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000109 15:09:59 | 000109 16:36:01 | 27 | 59 | plan | |
| 000109 16:36:04 | 000109 17:07:13 | 4 | 27 | design | |
| 000110 08:50:12 | 000110 10:13:35 | 0 | 83 | code | |
| 000110 10:18:57 | 000110 10:42:55 | 0 | 23 | compile | |
| 000110 10:47:38 | 000110 12:36:42 | 1 | 108 | test | |
| 000111 11:55:19 | 000111 12:10:28 | 0 | 15 | postmortem | |
Table 6-6. Time Recording Log
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | 6a |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000111 08:53:31 | 000111 10:04:22 | 0 | 70 | code | |
| 000111 10:04:25 | 000111 10:34:02 | 0 | 29 | compile | |
| 000111 10:34:27 | 000111 10:56:03 | 0 | 21 | test | |
Table 6-7. Defect Recording Log
| Student: | Victor B. Putz | Date: | 000109 |
| Program: | 6A |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000110 08:55:43 | md | ig | design | code | 2 | created is_double_equal to handle margins for double comparison (since pure equality is problematical) |
| 000110 09:02:48 | md | ig | design | code | 2 | forgot to add n to the class and equation |
| 000110 09:18:45 | md | ig | design | code | 3 | added variance feature to simplify standard deviation |
| 000110 09:33:09 | md | ig | design | code | 1 | added double_from_string member |
| 000110 09:36:23 | md | cm | design | code | 3 | broke string_stripped_of_comments into its own method; broke string_stripped_of_whitespace into its own class |
| 000110 09:47:51 | md | cm | design | code | 1 | broke error_log into its own class |
| 000110 09:52:50 | md | ig | design | code | 2 | beefed up error_log to include flag member, etc |
| 000110 10:00:18 | md | ig | design | code | 1 | added last_line_is_blank member |
| 000110 10:05:40 | ic | ig | design | code | 1 | moved t() into its own feature instead of mucking about with t_integral, etc. |
| 000110 10:21:19 | sy | ty | code | compile | 0 | erroneously typed compiler directive |
| 000110 10:22:32 | sy | ty | code | compile | 0 | forgot to #include necessary header |
| 000110 10:23:12 | sy | ty | code | compile | 0 | forgot to make inherited feature "at" const |
| 000110 10:23:59 | sy | ty | code | compile | 0 | forgot to #include string in header |
| 000110 10:24:44 | wn | cm | code | compile | 1 | misnamed a feature |
| 000110 10:26:18 | sy | cm | code | compile | 0 | used const double&; meant to use const double |
| 000110 10:29:12 | sy | om | code | compile | 0 | forgot-- must declare pairs of iterators prior to for-loop |
| 000110 10:30:17 | wn | cm | code | compile | 0 | used t_integral instead of m_t_integral as feature name |
| 000110 10:31:51 | sy | ig | code | compile | 0 | const cast games... |
| 000110 10:33:16 | wn | cm | code | compile | 0 | wrong name; typed "n" instead of entry_count |
| 000110 10:33:56 | sy | om | code | compile | 0 | forgot to add declaration to for-loop |
| 000110 10:34:44 | sy | ty | code | compile | 0 | forgot to add parentheses to zero-argument method |
| 000110 10:35:21 | sy | ty | code | compile | 0 | accidentally added semicolon after parentheses in implementation |
| 000110 10:36:07 | sy | om | code | compile | 0 | forgot to #include "contract.h" |
| 000110 10:36:38 | sy | om | code | compile | 0 | forgot to qualify methods with object (beta-0, beta-1, etc) |
| 000110 10:37:40 | sy | ty | code | compile | 0 | forgot to add error message to check_error call |
| 000110 10:38:25 | wn | cm | code | compile | 0 | misnamed argument in is_double |
| 000110 10:38:58 | sy | ty | code | compile | 0 | accidentally added semicolon after initializer in constructor |
| 000110 10:39:38 | sy | om | code | compile | 0 | forgot to #include math.h |
| 000110 10:40:31 | sy | om | code | compile | 0 | const games (base...) |
| 000110 10:41:20 | sy | ty | code | compile | 0 | used eiffel-style assignment; typo in constructor as well |
| 000110 10:42:06 | sy | ty | code | compile | 0 | const games... |
| 000110 10:47:41 | md | ig | design | test | 1 | forgot to add constructor |
| 000110 10:49:25 | wc | cm | design | test | 3 | some confusion with whether or not the condition in "check_error" should be true or false to trigger the log |
| 000110 10:54:14 | we | ig | design | test | 1 | forgot that stating "1/(entry_count() - 2 )" used ints instead of doubles. |
| 000110 10:57:48 | wc | ig | design | test | 2 | another true/false issue with check_error |
| 000110 11:00:22 | wt | cm | code | test | 9 | Aw, geez... accidentally declared return type "result" as bool, with disastrous results |
| 000110 11:11:39 | wa | ig | design | test | 18 | missed where the multiplier went in the t integral-- was applying it to the whole integral (after the 0.5 business) |
| 000110 11:32:23 | wa | ig | design | test | 47 | holy cow-- missed entirely how to do the t-distribution; now have the correct one in place. |
| 000110 12:22:54 | wa | ig | design | test | 2 | expedited the "next guess" algorithm |
| 000110 12:26:46 | we | ig | design | test | 8 | bloody c++... was silently doing int arithmetic in the range calculation instead of converting to double |
Table 6-8. Defect Recording Log
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | 6a |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000111 10:05:05 | wn | ig | code | compile | 0 | used "equals" instead of "compare" to compare strings |
| 000111 10:06:03 | sy | cm | code | compile | 0 | used () at end of no-argument feature call |
| 000111 10:06:41 | sy | ty | code | compile | 0 | typed beta-1 instead of beta_1 |
| 000111 10:07:25 | sy | ty | code | compile | 0 | used comma instead of semicolon as argument separator |
| 000111 10:07:47 | sy | ty | code | compile | 0 | compiler didn't like a semicolon; odd |
| 000111 10:08:33 | sy | ty | code | compile | 0 | used c_style = for assignment |
| 000111 10:09:05 | sy | ig | code | compile | 0 | can't declare constants in local header |
| 000111 10:10:43 | sy | ig | code | compile | 0 | tried to change a formal argument; not allowed in Eiffel |
| 000111 10:11:10 | sy | ty | code | compile | 0 | forgot "then" after if |
| 000111 10:11:33 | sy | ty | code | compile | 0 | typed "homer_integral" instead of homer.integral |
| 000111 10:12:00 | sy | ty | code | compile | 0 | typed x.mean instead of xs.mean |
| 000111 10:12:36 | wn | cm | code | compile | 0 | used range instead of prediction_range |
| 000111 10:13:55 | sy | ty | code | compile | 0 | used _ instead of . for feature call |
| 000111 10:14:31 | sy | ty | code | compile | 0 | forgot to qualify number_list feature call with an object |
| 000111 10:14:55 | sy | ty | code | compile | 0 | forgot to qualify error_log feature call with object |
| 000111 10:15:22 | ic | om | code | compile | 0 | forgot to add double_from_string feature (reused from readable_paired_number_list) |
| 000111 10:16:56 | wt | om | code | compile | 0 | mistakenly declared some variables as integer, not double |
| 000111 10:17:38 | wn | cm | code | compile | 0 | used check_error instead of check_for_error |
| 000111 10:19:48 | mc | om | code | compile | 0 | forgot to add call to predictor_parser.make |
| 000111 10:20:38 | mc | om | code | compile | 2 | was not instantiating some members |
| 000111 10:23:19 | mc | om | code | compile | 1 | forgot several instantiations |
| 000111 10:24:48 | wa | ig | code | compile | 8 | Eiffel does not support arbitrary exponents, only integers-- so I had to change the alg to use a sqrt |
| 000111 10:35:10 | wa | ig | code | test | 3 | forgot to check for cases in which the line begins with the comment...? |
| 000111 10:39:39 | wa | cm | code | test | 1 | forgot to increment loop index |
| 000111 10:41:53 | ma | cm | code | test | 1 | forgot to update found_end_of_historical_data in parse_last_line_as_end_of_historical_data |
| 000111 10:43:25 | mc | cm | code | test | 0 | forgot to instantiate error_log |
| 000111 10:44:14 | wn | cm | code | test | 0 | tried to use to_double instead of double_from_string |
| 000111 10:45:48 | wa | ty | code | test | 9 | missed a minus sign in a calculation. Dang. |