The Personal Software Process: an Independent Study
Chapter 6. Lesson 6: Measurements in the Personal Software Process
Enhance program 4A to calculate the linear regression parameters and the prediction interval
Requirements: Write a program to calculate an LOC estimate and the 90 percent and 70 percent prediction intervals for this estimate... Use program 5A to calculate the value of the t distribution and use a linked list for the data. You may enhance program 4A to develop this program. Note that to calculate the value of t, you integrate from 0 to a trial value of t. You find the correct value by successively adjusting the trial value of t up or down until the p value is within an acceptable error of 0.85 (for a 70 percent prediction interval) or 0.95 (for a 90 percent prediction interval).
Testing: Thoroughly test the program. As one test, use the data for estimated object LOC and actual new and changed LOC in table D8 and the beta-0 and beta-1 values found from testing program 4A. Also assume an estimated object LOC value of 386. Under these conditions, the estimated LOC values and the parameter values obtained should be as follows:
Table 6-2. Test Results Format -- Program 6a:
| Parameter | Expected Value |
| Beta0 | -22.55 |
| Beta1 | 1.7279 |
| UPI (70 percent) | 874 |
| LPI (70 percent) | 414 |
| UPI (90 percent) | 1030 |
| LPI (90 percent) | 258 |
--[Humphrey95]
Program 6a leaves a bit more room for requirements interpretation than some of the others; I'll clarify my interpretation as much as possible.
First, input. I'll keep my current trend of reading from standard input; it will allow me to reuse a great deal of code (particularly my hard-working, if extremely simplistic, simple_input_parser). In this case, though, the program will read pairs of numbers from standard input, and a line containing only the string "stop" will mark the end of the historical data. Because it would be convenient to include comments in the text files (thus allowing me to preserve a sort of "database" of historical data, annotated with meaning), two dashes (--) will mark the beginning of an inline comment. Blank lines will be ignored. These pairs represent the historical data, and program 6A is free to calculate several parameters from them. Once the historical data is read, the numbers that follow will be the new "x" values, from which we are to determine the "y" values; reading "x" values will continue until an EOF is encountered. The data that can be computed from the historical data alone (beta values, etc.) will be output at the end of the historical data; the rest (prediction intervals, etc.) will be output after each "x" value is read. An example of the output after reading the data from table D8 in [Humphrey95] would be:
Historical data read.
Beta-0: -22.55
Beta-1: 1.7279
Standard deviation: 197.8956
Estimate at x= 386
Projected y: 644.429
t( 70 percent ): 1.108
t( 90 percent ): 1.860
Range( 70 percent ): 229.9715; UPI: 874.401; LPI: 414.4579
Range( 90 percent ): 386.0533; UPI: 1030.483; LPI: 258.3761
By showing these intermediate values for t, the ranges, and so on, I hope to simplify testing while keeping the program versatile and easy to use.
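For illustration, an input file in this format might look something like the following (the numbers are made-up placeholders rather than the table D8 data, and the comma-separated pair format is the one the parser expects):

-- historical data: estimated object LOC, actual new and changed LOC
120, 186
96, 151
250, 433
181, 247
stop -- end of historical data; the regression parameters print here
386 -- new estimates ("x" values) follow, one per line, until EOF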
Once again, my historical estimated-LOC-to-actual-LOC data is so far out of whack that, according to the PROBE script, it was unusable (even after throwing out programs 2a and 3a). While this is somewhat discouraging, it's a powerful motivator to pay more attention to my development practices and try to improve the quality of my work.
Using the historical data as an average (as per the PROBE script), I've come up with an estimate of 234 new/changed LOC for program 6a. This seems a bit high to me, but we'll see how it goes.
My time data is getting a little more stable, and I was able to use linear regression to predict a total project time of about 256 minutes based on PROBE's estimate for size.
Our single_variable_function and simpson_integrator objects do some more work this round, as does our old standby, the simple_input_parser. We'll create several new single_variable_function subclasses: gamma_function, t_distribution_base, and of course t_distribution. The last of these is a bit peculiar, because the t-distribution depends not only on x but also on the "degrees of freedom" involved; I'll get around this by adding a "degrees of freedom" feature to the t_distribution class, effectively currying the original two-variable function into a single-variable function.
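For reference, the mathematics these classes implement (the "integrate from 0 to a trial value of t" of the requirements) is the standard t-distribution with n degrees of freedom:

\[
f_n(u) = \frac{\Gamma\!\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\!\left(\frac{n}{2}\right)}
         \left(1 + \frac{u^2}{n}\right)^{-\frac{n+1}{2}},
\qquad
p(t) = \frac{1}{2} + \int_0^{t} f_n(u)\,du
\]

gamma_function only needs the values reachable from \(\Gamma(1) = 1\), \(\Gamma(\tfrac{1}{2}) = \sqrt{\pi}\), and \(\Gamma(x) = (x-1)\,\Gamma(x-1)\); t_distribution_base supplies the \((1 + u^2/n)\) term; t_integral adds the normalizing multiplier and evaluates p(t) with the Simpson integrator; and t_distribution inverts the relation, adjusting a trial t until p(t) reaches 0.85 (for a 70 percent interval) or 0.95 (for a 90 percent interval). As a sanity check, the t values in the sample output above (1.108 and 1.860) are the standard two-sided 70 and 90 percent points for 8 degrees of freedom.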
The only other new class with substance is the predictor_parser class, which will transform input strings by deleting everything from the inline-comment identifier ("--") to the end of the line and then stripping whitespace. It will then handle strings based on its state: before the end of the historical data, pairs will be added to a paired_number_list_predictor, a child of the venerable paired_number_list from lesson 4a. At the end of the historical data (signified by the word "stop" on a line by itself), the class will print statistics on the data (regression parameters, etc.). Further numbers will be used as "x" values for prediction, each resulting in a prediction based on the regression parameters and the t-distribution, as summarized below. Blank lines will be ignored.
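The quantities printed for each new estimate x_k are the standard regression prediction-interval formulas, which map directly onto projected_y, variance, standard_deviation, and prediction_range in the listings:

\[
y_k = \beta_0 + \beta_1 x_k,
\qquad
\sigma = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - \beta_1 x_i\bigr)^2}
\]
\[
\mathrm{Range}(\alpha) = t(\alpha,\,n-2)\,\sigma\,
  \sqrt{1 + \frac{1}{n} + \frac{(x_k - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},
\qquad
\mathrm{UPI} = y_k + \mathrm{Range}(\alpha),
\quad
\mathrm{LPI} = y_k - \mathrm{Range}(\alpha)
\]

Here t(α, n−2) is the two-sided t value for coverage α with n−2 degrees of freedom, which is exactly what the t_distribution class computes.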
No large surprises here, although along the way I was prompted to spin off some elements into separate reusable files/classes (like the extremely lightweight but somewhat useful error_log class).
The simple_input_parser, simpson_integrator, number_list, paired_number_list, and single_variable_function classes were reused in full.
/* error_log.h -- very rudimentary error reporting and error-flag tracking */
#ifndef ERROR_LOG_H
#define ERROR_LOG_H
#include <string>
class error_log
{
public:void check_error (bool should_be_false, const std::string & message);
void log_error (const std::string & message);
void clear_error_flag (void);
void set_error_flag (void);
bool error_flag (void) const;
error_log (void);
protected:bool m_error_flag;
};
#endif
/* error_log implementation */
#include "error_log.h"
#include <iostream>
#include <string>
void
error_log::check_error (bool should_be_false, const std::string & message)
{
if (should_be_false)
{
m_error_flag = true;
log_error (message);
}
}
void
error_log::log_error (const std::string & message)
{
std::cerr << "Error: " << message << "\n";
}
void
error_log::clear_error_flag (void)
{
m_error_flag = false;
}
void
error_log::set_error_flag (void)
{
m_error_flag = true;
}
bool error_log::error_flag (void) const
{
return m_error_flag;
}
error_log::error_log (void)
{
clear_error_flag ();
}
#ifndef GAMMA_FUNCTION_H
#define GAMMA_FUNCTION_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//gamma function, one of the bases of the t-distribution; only exact for
//the integer and half-integer arguments the t-distribution needs
class gamma_function:public single_variable_function
{
public:virtual double at (double x) const;
};
#endif
/* gamma_function implementation */
#include "gamma_function.h"
#ifndef IS_DOUBLE_EQUAL_H
#include "is_double_equal.h"
#endif
#include <math.h>
double
gamma_function::at (double x) const
{
double
Result = 0;
if (is_double_equal (x, 1.0))
{
Result = 1;
}
else if (is_double_equal (x, 0.5))
{
Result = sqrt (M_PI);
}
else
{
Result = (x - 1.0) * at (x - 1.0);
}
return Result;
}
#ifndef IS_DOUBLE_EQUAL_H
#define IS_DOUBLE_EQUAL_H
bool is_double_equal (const double lhs, const double rhs);
void set_is_double_equal_margin (const double new_margin);
#endif
/* is_double_equal implementation */
#include "is_double_equal.h"
#include <math.h>
static double double_equal_margin = 0.00001;
bool is_double_equal (const double lhs, const double rhs)
{
if (fabs (lhs - rhs) < double_equal_margin)
{
return true;
}
else
{
return false;
}
}
void
set_is_double_equal_margin (const double new_margin)
{
double_equal_margin = new_margin;
}
#ifndef PAIRED_NUMBER_LIST_PREDICTOR_H
#define PAIRED_NUMBER_LIST_PREDICTOR_H
#ifndef PAIRED_NUMBER_LIST_H
#include "paired_number_list.h"
#endif
#ifndef T_DISTRIBUTION_H
#include "t_distribution.h"
#endif
//reads a set of paired numbers, does linear regression, predicts results
class paired_number_list_predictor:public paired_number_list
{
public:double variance (void) const;
double standard_deviation (void) const;
double projected_y (double x) const;
double prediction_range (double x, double range) const;
double lower_prediction_interval (double x, double range) const;
double upper_prediction_interval (double x, double range) const;
double t (double range) const;
protected:t_distribution m_t_distribution;
double prediction_range_base (void) const;
};
#endif
/* paired_number_list_predictor implementation */
#include "paired_number_list_predictor.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
paired_number_list_predictor::variance (void) const
{
REQUIRE (entry_count () > 2);
double
Result = 0;
std::list < double >::const_iterator x_iter;
std::list < double >::const_iterator y_iter;
for (x_iter = m_xs.begin (), y_iter = m_ys.begin ();
(x_iter != m_xs.end ()) && (y_iter != m_ys.end ()); ++x_iter, ++y_iter)
{
Result += pow (*y_iter - beta_0 () - beta_1 () * (*x_iter), 2);
}
Result *= 1.0 / (entry_count () - 2.0);
return Result;
}
double
paired_number_list_predictor::standard_deviation (void) const
{
return sqrt (variance ());
}
double
paired_number_list_predictor::projected_y (double x) const
{
return beta_0 () + beta_1 () * x;
}
double
paired_number_list_predictor::t (double range) const
{
//set_n() is not const, so cast away constness to set the degrees of
//freedom (entry_count - 2) before evaluating the distribution
const_cast <
paired_number_list_predictor *
>(this)->m_t_distribution.set_n (entry_count () - 2);
return m_t_distribution.at (range);
}
double
paired_number_list_predictor::prediction_range (double x, double range) const
{
REQUIRE (entry_count () > 0);
//range = t * sigma * sqrt(1 + 1/n + (x - x_mean)^2 / sum((x_i - x_mean)^2))
return t (range) * standard_deviation ()
* sqrt (1.0 + 1.0 / static_cast < double >(entry_count ())
+ pow (x - x_mean (), 2) / prediction_range_base ());
}
double
paired_number_list_predictor::lower_prediction_interval (double x,
double range) const
{
return projected_y (x) - prediction_range (x, range);
}
double
paired_number_list_predictor::upper_prediction_interval (double x,
double range) const
{
return projected_y (x) + prediction_range (x, range);
}
double
paired_number_list_predictor::prediction_range_base (void) const
{
double
Result = 0;
for (std::list < double >::const_iterator x_iter = m_xs.begin ();
x_iter != m_xs.end (); ++x_iter)
{
Result += pow ((*x_iter) - x_mean (), 2);
}
return Result;
}
#ifndef T_DISTRIBUTION_H
#define T_DISTRIBUTION_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//the t-distribution, used to find prediction ranges; at(x) returns the t
//value whose two-sided coverage is x, given n degrees of freedom
class t_distribution:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_distribution (void);
protected:double n;
double next_guess (double arg, double last_result, double target) const;
};
#endif
#include "t_distribution.h"
#ifndef T_INTEGRAL_H
#include "t_integral.h"
#endif
void
t_distribution::set_n (double new_n)
{
n = new_n;
}
double
t_distribution::at (double x) const
{
//x is a two-sided coverage (0.7 or 0.9); convert it to the one-sided
//cumulative probability we must reach (0.85 or 0.95)
x = 0.5 + x / 2;
t_integral
integral;
integral.set_n (n);
double
last_error = 0;
const double
error_margin = 0.0000000001;
double
integral_arg = 0;
double
this_result = 0.5;
double
last_result = 0;
bool
has_tried_once = false;
while (!has_tried_once || (last_error > error_margin))
{
last_result = this_result;
this_result = integral.at (integral_arg);
last_error = fabs (this_result - x);
//adjust the trial argument up or down using the previous trial's result;
//the loop exits once the integral at the current trial is within
//error_margin of the target
integral_arg = next_guess (integral_arg, last_result, x);
has_tried_once = true;
}
return integral_arg;
}
t_distribution::t_distribution (void)
{
set_n (0);
}
double
t_distribution::next_guess (double arg, double last_result, double target) const
{
double
Result = 0;
Result = arg + (target - last_result);
return Result;
}
#ifndef T_DISTRIBUTION_BASE_H
#define T_DISTRIBUTION_BASE_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
//simple function serving as the base (integrand) of the t-distribution integral
class t_distribution_base:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_distribution_base (void);
protected:double n;
};
#endif
/* t_distribution_base implementation */
#include "t_distribution_base.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
t_distribution_base::at (double x) const
{
REQUIRE (n > 0);
return pow (1 + x * x / n, -(n + 1) / 2);
}
void
t_distribution_base::set_n (double new_n)
{
n = new_n;
}
t_distribution_base::t_distribution_base (void):
n (0)
{
}
#ifndef T_INTEGRAL_H
#define T_INTEGRAL_H
#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
#ifndef GAMMA_FUNCTION_H
#include "gamma_function.h"
#endif
#ifndef SIMPSON_INTEGRATOR_H
#include "simpson_integrator.h"
#endif
#ifndef T_DISTRIBUTION_BASE_H
#include "t_distribution_base.h"
#endif
//integral of the t-distribution up to a given point; NOT what you use to
//make predictions -- use t_distribution instead
class t_integral:public single_variable_function
{
public:virtual double at (double x) const;
void set_n (double new_n);
t_integral (void);
protected:simpson_integrator homer;
gamma_function gamma;
t_distribution_base base;
double n;
double multiplier (void) const;
};
#endif
/* t_integral implementation */
#include "t_integral.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <math.h>
double
t_integral::multiplier (void) const
{
return gamma.at ((n + 1) / 2) / (sqrt (n * M_PI) * gamma.at (n / 2));
}
void
t_integral::set_n (double new_n)
{
n = new_n;
base.set_n (new_n);
}
t_integral::t_integral (void)
{
set_n (0);
}
double
t_integral::at (double x) const
{
REQUIRE (n > 0);
double
Result = 0.5;
if (x < 0)
{
Result = 0.5 - multiplier () * homer.integral (base, 0, -x);
}
else if (x > 0)
{
Result = 0.5 + multiplier () * homer.integral (base, 0, x);
}
return Result;
}
/* program 6A driver: reads historical data and new estimates from standard
   input and prints predictions */
#include <fstream>
#include <iostream>
#include <string.h>
#ifndef PREDICTOR_PARSER_H
#include "predictor_parser.h"
#endif
std::istream *
input_stream_from_args (int arg_count, const char **arg_vector)
{
std::istream *Result = NULL;
if (arg_count == 1)
{
Result = &std::cin;
}
else
{
const char *help_text =
"PSP exercise 6A: Calculate a prediction and interval given historical data.\nUsage:\n\tpsp_6a\n\n";
std::cout << help_text;
}
return Result;
}
int
main (int arg_count, const char **arg_vector)
{
//get the input stream, or print the help text as appropriate
std::istream *input_stream = input_stream_from_args (arg_count, arg_vector);
if (input_stream != NULL)
{
predictor_parser parser;
parser.set_input_stream (input_stream);
parser.parse_until_eof ();
}
return 0;
}
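As a quick standalone check of the t-distribution inversion, something like the following could be compiled against the classes above (the expected values come from the sample output earlier; treating table D8 as having n - 2 = 8 degrees of freedom is my assumption):

/* hypothetical check driver -- not part of program 6A proper */
#include <iostream>
#include "t_distribution.h"
int
main (void)
{
  t_distribution dist;
  dist.set_n (8);               //degrees of freedom = n - 2
  //expect roughly 1.108 and 1.860, as in the sample output above
  std::cout << "t( 70 percent ): " << dist.at (0.7) << "\n";
  std::cout << "t( 90 percent ): " << dist.at (0.9) << "\n";
  return 0;
}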
class ERROR_LOG
--very rudimentary error reporting and error flag tracking.
creation {ANY}
make
feature {ANY}
check_for_error(should_be_false: BOOLEAN; message: STRING) is
do
if should_be_false then
log_error(message);
end;
end -- check_for_error
log_error(message: STRING) is
do
std_error.put_string(message + "%N");
set_error_flag;
end -- log_error
clear_error_flag is
do
error_flag := false;
end -- clear_error_flag
set_error_flag is
do
error_flag := true;
end -- set_error_flag
error_flag: BOOLEAN;
make is
do
clear_error_flag;
end -- make
end -- class ERROR_LOG
class GAMMA_FUNCTION
--gamma function, one of the bases of the t-distribution
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
feature {ANY}
error_margin: DOUBLE is 0.000001;
acceptably_equal(lhs, rhs: DOUBLE): BOOLEAN is
--is lhs within error_margin of rhs?
do
Result := false;
if lhs.in_range(rhs - error_margin,rhs + error_margin) then
Result := true;
end;
end -- acceptably_equal
at(x: DOUBLE): DOUBLE is
--gamma function
do
if acceptably_equal(x,1.0) then
Result := 1;
elseif acceptably_equal(x,0.5) then
Result := Pi.sqrt;
else
Result := (x - 1.0) * at(x - 1.0);
end;
end -- at
end -- class GAMMA_FUNCTION
class MAIN
creation {ANY}
make
feature {ANY}
make is
local
parser: PREDICTOR_PARSER;
do
!!parser.make;
parser.set_input(io);
parser.parse_until_eof;
end -- make
end -- class MAIN
class PAIRED_NUMBER_LIST_PREDICTOR
--reads a set of paired numbers, does linear regression, predicts results
inherit
PAIRED_NUMBER_LIST
redefine make
end;
creation {ANY}
make
feature {ANY}
variance: DOUBLE is
local
i: INTEGER;
do
Result := 0;
from
i := xs.lower;
until
not (xs.valid_index(i) and ys.valid_index(i))
loop
Result := Result + (ys.item(i) - beta_0 - beta_1 * xs.item(i)) ^ 2;
i := i + 1;
end;
Result := Result / (entry_count - 2);
end -- variance
standard_deviation: DOUBLE is
do
Result := variance.sqrt;
end -- standard_deviation
projected_y(x: DOUBLE): DOUBLE is
--projected value of given x, using linear regression
--parameters from xs and ys
do
Result := beta_0 + beta_1 * x;
end -- projected_y
prediction_range_base: DOUBLE is
--base of the prediction range, used in prediction_range
local
i: INTEGER;
do
Result := 0;
from
i := xs.lower;
until
not (xs.valid_index(i) and ys.valid_index(i))
loop
Result := Result + (xs.item(i) - xs.mean) ^ 2;
i := i + 1;
end;
end -- prediction_range_base
prediction_range(x, range: DOUBLE): DOUBLE is
--prediction range, based on given estimate and % range
require
entry_count > 0;
do
Result := (1.0 + (1.0 / entry_count.to_double) + (((x - xs.mean) ^ 2) / prediction_range_base)).sqrt;
Result := t(range) * standard_deviation * Result;
end -- prediction_range
lower_prediction_interval(x, range: DOUBLE): DOUBLE is
--LPI, from [Humphrey95]
do
Result := projected_y(x) - prediction_range(x,range);
end -- lower_prediction_interval
upper_prediction_interval(x, range: DOUBLE): DOUBLE is
--UPI, from [Humphrey95]
do
Result := projected_y(x) + prediction_range(x,range);
end -- upper_prediction_interval
t_distribution: T_DISTRIBUTION;
make is
do
Precursor;
!!t_distribution.make;
end -- make
t(range: DOUBLE): DOUBLE is
--gets the size of the t-distribution at the given alpha range
do
t_distribution.set_n(entry_count - 2);
Result := t_distribution.at(range);
end -- t
end -- class PAIRED_NUMBER_LIST_PREDICTOR
class PREDICTOR_PARSER
--reads a list of number pairs, and performs linear regression analysis
inherit
SIMPLE_INPUT_PARSER
redefine parse_last_line, transformed_line
end;
creation {ANY}
make
feature {ANY}
inline_comment_begin: STRING is "--";
string_stripped_of_comment(string: STRING): STRING is
--strip the string of any comment
local
comment_index: INTEGER;
do
if string.has_string(inline_comment_begin) then
comment_index := string.index_of_string(inline_comment_begin);
if comment_index = 1 then
Result := "";
else
Result := string.substring(1,comment_index - 1);
end;
else
Result := string;
end;
end -- string_stripped_of_comment
string_stripped_of_whitespace(string: STRING): STRING is
--strip string of whitespace
do
Result := string;
Result.left_adjust;
Result.right_adjust;
end -- string_stripped_of_whitespace
transformed_line(string: STRING): STRING is
--strip comments and whitespace from parseable line
do
Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
end -- transformed_line
number_list: PAIRED_NUMBER_LIST_PREDICTOR;
feature {ANY} --parsing
found_end_of_historical_data: BOOLEAN;
reset is
--resets the parser and makes it ready to go again
do
found_end_of_historical_data := false;
number_list.reset;
end -- reset
make is
do
!!number_list.make;
reset;
end -- make
parse_last_line_as_historical_data is
--interpret last_line as a pair of comma-separated values
local
error_log: ERROR_LOG;
comma_index: INTEGER;
x_string: STRING;
y_string: STRING;
new_x: DOUBLE;
new_y: DOUBLE;
do
!!error_log.make;
comma_index := last_line.index_of(',');
error_log.check_for_error(comma_index = last_line.count + 1,"No comma:" + last_line);
x_string := last_line.substring(1,comma_index - 1);
y_string := last_line.substring(comma_index + 1,last_line.count);
error_log.check_for_error(not (x_string.is_double or x_string.is_integer),"invalid X:" + last_line);
error_log.check_for_error(not (y_string.is_double or y_string.is_integer),"invalid Y:" + last_line);
if not error_log.error_flag then
new_x := double_from_string(x_string);
new_y := double_from_string(y_string);
number_list.add_entry(new_x,new_y);
std_output.put_string("added: ");
std_output.put_double(new_x);
std_output.put_string(", ");
std_output.put_double(new_y);
std_output.put_new_line;
end;
end -- parse_last_line_as_historical_data
double_from_string(string: STRING): DOUBLE is
require
string.is_double or string.is_integer;
do
if string.is_double then
Result := string.to_double;
elseif string.is_integer then
Result := string.to_integer.to_double;
end;
end -- double_from_string
historical_data_terminator: STRING is "stop";
parse_last_line_as_end_of_historical_data is
--interpret last line as the end of historical data
require
last_line.compare(historical_data_terminator) = 0;
do
found_end_of_historical_data := true;
std_output.put_string("Historical data read.%NBeta-0: ");
std_output.put_double(number_list.beta_0);
std_output.put_string("%NBeta-1: ");
std_output.put_double(number_list.beta_1);
std_output.put_string("%NStandard Deviation: ");
std_output.put_double(number_list.standard_deviation);
std_output.put_string("%N%N");
end -- parse_last_line_as_end_of_historical_data
parse_last_line_as_prediction is
--interpret last line as a single x, for a predictive y
local
error_log: ERROR_LOG;
x: DOUBLE;
do
!!error_log.make;
error_log.check_for_error(not (last_line.is_double or last_line.is_integer),"Not a double : " + last_line);
if not error_log.error_flag then
x := double_from_string(last_line);
std_output.put_string("Estimate at x=");
std_output.put_double(x);
std_output.put_string("%NProjected y: ");
std_output.put_double(number_list.projected_y(x));
std_output.put_string("%Nt (70 percent): ");
std_output.put_double(number_list.t(0.7));
std_output.put_string("%Nt (90 percent): ");
std_output.put_double(number_list.t(0.9));
std_output.put_string("%NRange (70 percent): ");
std_output.put_double(number_list.prediction_range(x,0.7));
std_output.put_string("; UPI: ");
std_output.put_double(number_list.upper_prediction_interval(x,0.7));
std_output.put_string("; LPI: ");
std_output.put_double(number_list.lower_prediction_interval(x,0.7));
std_output.put_string("%NRange (90 percent): ");
std_output.put_double(number_list.prediction_range(x,0.9));
std_output.put_string("; UPI: ");
std_output.put_double(number_list.upper_prediction_interval(x,0.9));
std_output.put_string("; LPI: ");
std_output.put_double(number_list.lower_prediction_interval(x,0.9));
std_output.put_new_line;
end;
end -- parse_last_line_as_prediction
parse_last_line is
--parse the last line according to state
do
if not last_line.empty then
if last_line.compare(historical_data_terminator) = 0 then
parse_last_line_as_end_of_historical_data;
else
if found_end_of_historical_data then
parse_last_line_as_prediction;
else
parse_last_line_as_historical_data;
end;
end;
end;
end -- parse_last_line
end -- class PREDICTOR_PARSER
class T_DISTRIBUTION
--the t-distribution, used to find prediction ranges, etc.
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
set_n(new_n: INTEGER) is
do
n := new_n;
end -- set_n
make is
do
set_n(0);
end -- make
next_guess(arg, last_result, target: DOUBLE): DOUBLE is
do
Result := arg + (target - last_result);
end -- next_guess
error_margin: DOUBLE is 0.0000000001;
at(x: DOUBLE): DOUBLE is
local
t_integral: T_INTEGRAL;
last_error: DOUBLE;
this_result: DOUBLE;
last_result: DOUBLE;
has_tried_once: BOOLEAN;
target: DOUBLE;
do
--convert the range to a two-sided distribution so we get
--what we're expecting
target := 0.5 + x / 2;
!!t_integral.make;
t_integral.set_n(n);
last_error := 0;
from
Result := 0;
has_tried_once := false;
until
has_tried_once and error_margin > last_error
loop
last_result := this_result;
this_result := t_integral.at(Result);
last_error := (this_result - target).abs;
Result := next_guess(Result,last_result,target);
has_tried_once := true;
end;
end -- at
end -- class T_DISTRIBUTION
class T_DISTRIBUTION_BASE
--simple function as the base of the t-distribution integral
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
--degrees of freedom
set_n(new_n: INTEGER) is
do
n := new_n;
end -- set_n
make is
do
set_n(0);
end -- make
at(x: DOUBLE): DOUBLE is
require
n > 0;
do
Result := ((1.0 + x * x / n.to_double) ^ - (n + 1)).sqrt;
end -- at
end -- class T_DISTRIBUTION_BASE
class T_INTEGRAL
--integral of the t-distribution up to a given point; NOT what you
--use to make predictions; use T_DISTRIBUTION instead
inherit
SINGLE_VARIABLE_FUNCTION
redefine at
end;
creation {ANY}
make
feature {ANY}
n: INTEGER;
set_n(new_n: INTEGER) is
do
n := new_n;
t_distribution_base.set_n(new_n);
end -- set_n
make is
do
!!gamma;
!!homer.make;
!!t_distribution_base.make;
set_n(0);
end -- make
multiplier: DOUBLE is
do
Result := gamma.at((n + 1) / 2) / ((n * Pi).sqrt * gamma.at(n / 2));
end -- multiplier
gamma: GAMMA_FUNCTION;
homer: SIMPSON_INTEGRATOR;
t_distribution_base: T_DISTRIBUTION_BASE;
at(x: DOUBLE): DOUBLE is
do
Result := 0.5;
if x < 0 then
Result := 0.5 - multiplier * homer.integral(t_distribution_base,0,- x);
else
Result := 0.5 + multiplier * homer.integral(t_distribution_base,0,x);
end;
end -- at
end -- class T_INTEGRAL
My continual problems abound: using the wrong language's assignment style, loop increments and instantiation in Eiffel, and header/source parity in C++. The C++ standard library continues to surprise me, which is not a good thing: during the course of this program, previously working programs "stopped working", no longer recognizing EOF in data files. The "solution" was to read and put back a character after each call to iostream::getline(), which changed nothing except that EOF was now detected. Strange.
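A sketch of that workaround, reconstructed from the description above (the actual change lives in simple_input_parser, which is not listed here):

/* hypothetical illustration of the EOF workaround */
#include <iostream>
#include <string>
void
read_line (std::istream & in, std::string & line)
{
  std::getline (in, line);
  //probe one character past the line just read and put it back; this
  //leaves the data untouched but forces the stream's EOF state to be
  //set as soon as the last line has been consumed
  char probe;
  if (in.get (probe))
    {
      in.putback (probe);
    }
}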
Despite the explicit note in the problem statement ("Note that to calculate the value of t, you integrate from 0 to a trial value of t. You find the correct value by successively adjusting the trial value of t up or down until the p value is within an acceptable error..." [Humphrey95]), I didn't understand what it meant until the program appeared horribly broken. Figuring this out took a great deal of time and resulted in a huge "design defect" in the test phase.
Table 6-3. Test Results Format -- Program 6a:
| Test | Parameter | Expected Value | Actual Value (C++) | Actual Value (Eiffel) |
| Table D8 | Beta0 | -22.55 | -22.5525 | -22.552533 |
| | Beta1 | 1.7279 | 1.72793 | 1.727932 |
| | UPI (70%) | 874 | 874.431 | 874.430261 |
| | LPI (70%) | 414 | 414.428 | 414.428507 |
| | UPI (90%) | 1030 | 1030.39 | 1030.387465 |
| | LPI (90%) | 258 | 258.47 | 258.471303 |
| Program 6a, estimated new and changed LOC (using historical data for programs 2a through 6a) | UPI (70%) | N/A | 400.809 | 400.809121 |
| | LPI (70%) | N/A | 166.415 | 166.414689 |
| | UPI (90%) | N/A | 504.297 | 504.297205 |
| | LPI (90%) | N/A | 62.9266 | 62.926605 |
| | Actual new/changed LOC | N/A | 299 | 299 |
Table 6-4. Project Plan Summary
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | Statistic Predictor | Program# | 6A |
| Instructor: | Wells | Language: | C++ |
| Summary | Plan | Actual | To date |
| LOC/Hour | 48 | 56 | 49 |
| Planned time | 256 | 256 | |
| Actual time | 315 | 315 | |
| CPI (cost/performance index) | 0.813 | ||
| %reused | 44 | 44 | 23 |
| Program Size | Plan | Actual | To date |
| Base | 17 | 17 | |
| Deleted | 2 | 2 | |
| Modified | 3 | 2 | |
| Added | 231 | 297 | |
| Reused | 233 | 254 | 396 |
| Total New and Changed | 234 | 299 | 1089 |
| Total LOC | 442 | 568 | 1712 |
| Total new/reused | 0 | 0 | 0 |
| Time in Phase (min): | Plan | Actual | To Date | To Date% |
| Planning | 46 | 58 | 241 | 18 |
| Design | 26 | 27 | 130 | 10 |
| Code | 70 | 83 | 351 | 27 |
| Compile | 15 | 24 | 89 | 7 |
| Test | 79 | 108 | 420 | 31 |
| Postmortem | 20 | 15 | 92 | 7 |
| Total | 256 | 315 | 1323 | 100 |
| Defects Injected | Actual | To Date | To Date % | |
| Plan | 0 | 0 | 0 | |
| Design | 17 | 42 | 30 | |
| Code | 23 | 93 | 66 | |
| Compile | 0 | 3 | 2.5 | |
| Test | 0 | 3 | 2.5 | |
| Total development | 40 | 141 | 100 | |
| Defects Removed | Actual | To Date | To Date % | |
| Planning | 0 | 0 | 0 | |
| Design | 0 | 0 | 0 | |
| Code | 9 | 31 | 22 | |
| Compile | 22 | 71 | 50 | |
| Test | 9 | 39 | 28 | |
| Total development | 40 | 141 | 100 | |
| After Development | 0 | 0 |
| Eiffel code/compile/test |
| Time in Phase (min) | Actual | To Date | To Date % |
| Code | 71 | 217 | 49 |
| Compile | 30 | 106 | 24 |
| Test | 22 | 117 | 27 |
| Total | 123 | 440 | 100 |
| Defects Injected | Actual | To Date | To Date % |
| Design | 0 | 4 | 4 |
| Code | 28 | 86 | 95 |
| Compile | 0 | 0 | 0 |
| Test | 0 | 1 | 1 |
| Total | 28 | 91 | 100 |
| Defects Removed | Actual | To Date | To Date % |
| Code | 0 | 1 | 1 |
| Compile | 22 | 61 | 67 |
| Test | 6 | 29 | 32 |
| Total | 28 | 91 | 100 |
Table 6-5. Time Recording Log
| Student: | Victor B. Putz | Date: | 000109 |
| Program: | 6A |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000109 15:09:59 | 000109 16:36:01 | 27 | 59 | plan | |
| 000109 16:36:04 | 000109 17:07:13 | 4 | 27 | design | |
| 000110 08:50:12 | 000110 10:13:35 | 0 | 83 | code | |
| 000110 10:18:57 | 000110 10:42:55 | 0 | 23 | compile | |
| 000110 10:47:38 | 000110 12:36:42 | 1 | 108 | test | |
| 000111 11:55:19 | 000111 12:10:28 | 0 | 15 | postmortem | |
Table 6-6. Time Recording Log
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | 6a |
| Start | Stop | Interruption Time | Delta time | Phase | Comments |
| 000111 08:53:31 | 000111 10:04:22 | 0 | 70 | code | |
| 000111 10:04:25 | 000111 10:34:02 | 0 | 29 | compile | |
| 000111 10:34:27 | 000111 10:56:03 | 0 | 21 | test | |
Table 6-7. Defect Recording Log
| Student: | Victor B. Putz | Date: | 000109 |
| Program: | 6A |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000110 08:55:43 | md | ig | design | code | 2 | created is_double_equal to handle margins for double comparison (since pure equality is problematical) |
| 000110 09:02:48 | md | ig | design | code | 2 | forgot to add n to the class and equation |
| 000110 09:18:45 | md | ig | design | code | 3 | added variance feature to simplify standard deviation |
| 000110 09:33:09 | md | ig | design | code | 1 | added double_from_string member |
| 000110 09:36:23 | md | cm | design | code | 3 | broke string_stripped_of_comments into its own method; broke string_stripped_of_whitespace into its own class |
| 000110 09:47:51 | md | cm | design | code | 1 | broke error_log into its own class |
| 000110 09:52:50 | md | ig | design | code | 2 | beefed up error_log to include flag member, etc |
| 000110 10:00:18 | md | ig | design | code | 1 | added last_line_is_blank member |
| 000110 10:05:40 | ic | ig | design | code | 1 | moved t() into its own feature instead of mucking about with t_integral, etc. |
| 000110 10:21:19 | sy | ty | code | compile | 0 | erroneously typed compiler directive |
| 000110 10:22:32 | sy | ty | code | compile | 0 | forgot to #include necessary header |
| 000110 10:23:12 | sy | ty | code | compile | 0 | forgot to make inherited feature "at" const |
| 000110 10:23:59 | sy | ty | code | compile | 0 | forgot to #include string in header |
| 000110 10:24:44 | wn | cm | code | compile | 1 | misnamed a feature |
| 000110 10:26:18 | sy | cm | code | compile | 0 | used const double&; meant to use const double |
| 000110 10:29:12 | sy | om | code | compile | 0 | forgot-- must declare pairs of iterators prior to for-loop |
| 000110 10:30:17 | wn | cm | code | compile | 0 | used t_integral instead of m_t_integral as feature name |
| 000110 10:31:51 | sy | ig | code | compile | 0 | const cast games... |
| 000110 10:33:16 | wn | cm | code | compile | 0 | wrong name; typed "n" instead of entry_count |
| 000110 10:33:56 | sy | om | code | compile | 0 | forgot to add declaration to for-loop |
| 000110 10:34:44 | sy | ty | code | compile | 0 | forgot to add parentheses to zero-argument method |
| 000110 10:35:21 | sy | ty | code | compile | 0 | accidentally added semicolon after parentheses in implementation |
| 000110 10:36:07 | sy | om | code | compile | 0 | forgot to #include "contract.h" |
| 000110 10:36:38 | sy | om | code | compile | 0 | forgot to qualify methods with object (beta-0, beta-1, etc) |
| 000110 10:37:40 | sy | ty | code | compile | 0 | forgot to add error message to check_error call |
| 000110 10:38:25 | wn | cm | code | compile | 0 | misnamed argument in is_double |
| 000110 10:38:58 | sy | ty | code | compile | 0 | accidentally added semicolon after initializer in constructor |
| 000110 10:39:38 | sy | om | code | compile | 0 | forgot to #include math.h |
| 000110 10:40:31 | sy | om | code | compile | 0 | const games (base...) |
| 000110 10:41:20 | sy | ty | code | compile | 0 | used eiffel-style assignment; typo in constructor as well |
| 000110 10:42:06 | sy | ty | code | compile | 0 | const games... |
| 000110 10:47:41 | md | ig | design | test | 1 | forgot to add constructor |
| 000110 10:49:25 | wc | cm | design | test | 3 | some confusion with whether or not the condition in "check_error" should be true or false to trigger the log |
| 000110 10:54:14 | we | ig | design | test | 1 | forgot that stating "1/(entry_count() - 2 )" used ints instead of doubles. |
| 000110 10:57:48 | wc | ig | design | test | 2 | another true/false issue with check_error |
| 000110 11:00:22 | wt | cm | code | test | 9 | Aw, geez... accidentally declared return type "result" as bool, with disastrous results |
| 000110 11:11:39 | wa | ig | design | test | 18 | missed where the multiplier went in the t integral-- was applying it to the whole integral (after the 0.5 business) |
| 000110 11:32:23 | wa | ig | design | test | 47 | holy cow-- missed entirely how to do the t-distribution; now have the correct one in place. |
| 000110 12:22:54 | wa | ig | design | test | 2 | expedited the "next guess" algorithm |
| 000110 12:26:46 | we | ig | design | test | 8 | bloody c++... was silently doing int arithmetic in the range calculation instead of converting to double |
Table 6-8. Defect Recording Log
| Student: | Victor B. Putz | Date: | 000111 |
| Program: | 6a |
| Defect found | Type | Reason | Phase Injected | Phase Removed | Fix time | Comments |
| 000111 10:05:05 | wn | ig | code | compile | 0 | used "equals" instead of "compare" to compare strings |
| 000111 10:06:03 | sy | cm | code | compile | 0 | used () at end of no-argument feature call |
| 000111 10:06:41 | sy | ty | code | compile | 0 | typed beta-1 instead of beta_1 |
| 000111 10:07:25 | sy | ty | code | compile | 0 | used comma instead of semicolon as argument separator |
| 000111 10:07:47 | sy | ty | code | compile | 0 | compiler didn't like a semicolon; odd |
| 000111 10:08:33 | sy | ty | code | compile | 0 | used c_style = for assignment |
| 000111 10:09:05 | sy | ig | code | compile | 0 | can't declare constants in local header |
| 000111 10:10:43 | sy | ig | code | compile | 0 | tried to change a formal argument; not allowed in Eiffel |
| 000111 10:11:10 | sy | ty | code | compile | 0 | forgot "then" after if |
| 000111 10:11:33 | sy | ty | code | compile | 0 | typed "homer_integral" instead of homer.integral |
| 000111 10:12:00 | sy | ty | code | compile | 0 | typed x.mean instead of xs.mean |
| 000111 10:12:36 | wn | cm | code | compile | 0 | used range instead of prediction_range |
| 000111 10:13:55 | sy | ty | code | compile | 0 | used _ instead of . for feature call |
| 000111 10:14:31 | sy | ty | code | compile | 0 | forgot to qualify number_list feature call with an object |
| 000111 10:14:55 | sy | ty | code | compile | 0 | forgot to qualify error_log feature call with object |
| 000111 10:15:22 | ic | om | code | compile | 0 | forgot to add double_from_string feature (reused from readable_paired_number_list) |
| 000111 10:16:56 | wt | om | code | compile | 0 | mistakenly declared some variables as integer, not double |
| 000111 10:17:38 | wn | cm | code | compile | 0 | used check_error instead of check_for_error |
| 000111 10:19:48 | mc | om | code | compile | 0 | forgot to add call to predictor_parser.make |
| 000111 10:20:38 | mc | om | code | compile | 2 | was not instantiating some members |
| 000111 10:23:19 | mc | om | code | compile | 1 | forgot several instantiations |
| 000111 10:24:48 | wa | ig | code | compile | 8 | Eiffel does not support arbitrary exponents, only integers-- so I had to change the alg to use a sqrt |
| 000111 10:35:10 | wa | ig | code | test | 3 | forgot to check for cases in which the line begins with the comment...? |
| 000111 10:39:39 | wa | cm | code | test | 1 | forgot to increment loop index |
| 000111 10:41:53 | ma | cm | code | test | 1 | forgot to update found_end_of_historical_data in parse_last_line_as_end_of_historical_data |
| 000111 10:43:25 | mc | cm | code | test | 0 | forgot to instantiate error_log |
| 000111 10:44:14 | wn | cm | code | test | 0 | tried to use to_double instead of double_from_string |
| 000111 10:45:48 | wa | ty | code | test | 9 | missed a minus sign in a calculation. Dang. |