Program 10a: Multiple Regression

Using a linked list, write a program to calculate the 3-parameter multiple regression parameters and the prediction interval.

Given Requirements

 

Requirements: Write a program to calculate the three-variable multiple regression estimation parameters, make an estimate from user-supplied inputs, and determine the 70 percent and 90 percent prediction intervals. The formulas and methods for doing this calculation are described in sections A9, A10, and A11 [of the text]... Further enhance the linked list from program 4a.

Testing: Thoroughly test the program. As one test, use the data in Table D16. For the estimated values, use 185 LOC of new code, 150 LOC of reused code, and 45 LOC of modified code. The projected hours should be 20.76 hours, the 90 percent UPI is 33.67 hours, and the 90 percent LPI is 7.84 hours. The 70 percent prediction interval is from 14.63 to 26.89 hours. The values of the beta parameters are Beta0 = 0.56645, Beta1 = 0.06533, Beta2 = 0.008719, and Beta3 = 0.15105. These are equivalent to productivities of 15.3 new and changed LOC per hour, reuse of 114.7 LOC per hour, and modification of 6.6 LOC per hour. As a further test case you should also verify that your program produces the same results given in the examples in sections A9 and A10 (pages 554 and 558). Submit a test report giving your results in the format in Table D17.

Table 11-1. D16: Multiple Regression Exercise Data

Program NumberNew LOC wReused LOC xModified LOC yHours z
1345652331.4
2168181814.6
394006.4
41871859828.3
5621871042.1
62550015.3
sum1670355149138.1
Average278.3359.1724.8323.02
Estimate18515045?

Table 11-2. D17: Test Results Format -- Program 10A

TestParameterExpected ValueActual Value - C++Actual Value - Eiffel
Table D16    
 Beta00.56645  
 Beta10.06533  
 Beta20.008719  
 Beta30.15105  
 Projected Hours20.76  
 UPI(70%)26.89  
 LPI(70%)14.63  
 UPI(90%)33.67  
 LPI(90%)7.84  
Appendix A    
 Beta06.7013  
 Beta10.0784  
 Beta20.0150  
 Beta30.2461  
 Projected Hours140.9  
 UPI(70%)179.7  
 LPI(70%)102.1  
 UPI(90%)222.74  
 LPI(90%)59.06  
 
--[Humphrey95] 

Planning

Requirements

The requirements are pretty straightforward. I'll be enhancing program 9a for this one, and in program 9a we took the point of view that after the historical data we should have a list of commands with arguments. The new command to be used will be multiple-regression-prediction x-value, y-value, z-value, .... That way, we can extend the wee little command set of this program and make the most of code reuse.

In keeping with program 9a, program 10a should be able to handle input arrays of any reasonable size (ie should fit in memory)-- any field width, etc. That may seem to make the design more complicated, but if I'm right it will actually be easier to code a general solution than a specific one in this case.

Output will be in much the same format as program 9a. Output from the multiple-regression-prediction command will be as follows:

Multiple Regression Prediction:
Solution: ( 0.56645, 0.06544, ... 0.15105 )
variance : ...
standard deviation : ...

Prediction from values ( w, x, y...):
Projected Result: 20.76
UPI( 70% ): 26.89
LPI( 70% ): 14.63
UPI( 90% ): 33.67
LPI( 90% ): 7.84

Errors in the program's execution (invalid input, too few parameters for calculations, etc) should be recognized and displayed using stderr.

Size Estimate

I'm a little confused, I confess, about the difference between the "conceptual design" which I've generated in the past to form size estimates, and the "high-level design" described in the PSP3 process, which comes after the size and time estimates. With that in mind, I'm still going to make a fairly high-level design for the size estimate.

With that brief design, PROBE gets me an initial estimate of only 239 new/changed LOC. Considering that the book repeatedly notes that program 10 is large enough that multiple sessions are usually necessary, this seems very optimistic (it's the same amount that PROBE predicted for program 9a). We'll see what comes of it. I have divided the size estimate into the first cycle (building the gaussian_solver class, about 104 LOC) and a second cycle (fleshing out the number_array_list_parser_2 structure and modifying the parser, about 135 LOC). These estimates do not include test scaffolding for the first cycle; I'm not sure how to handle that, but it will only be a few LOC, which won't be included in the final.

Resource/Time Estimate

With the PROBE-generated estimate of 239 new/changed LOC, I get an estimate of 285 minutes for program 10a, with about 89 minutes going to the first cycle and 119 in the second cycle.

High-level Design and Design Review

I actually find myself wishing that Humphrey had decided on a completely different project for program 10a; I want to start using high-level design and design reviews, but the bulk of the high-level design has already been done in the form of previous projects-- and as much as I'd like to further my knowledge, I can't bring myself to reinvent the wheel. The basic structure of parsing textual input using a derivative of the simple_input_parser has worked well, and I'm simply expanding on that, so much of the high-level design has been encapsulated in earlier work. User operation scenarios are therefore sparse. Since few of my objects have any explicit state (except for values), it's tough to call parts of this a "state machine" except in the sense that number_array_lists have contents, the gaussian_solver has an "is_solvable" state, etc. But when requirements for certain "states" (is_solvable, for example) are encoded into contracts for other routines, it seems odd to specify a "state transition" rather than simply specifying the "is_solvable" evaluation. And most of the function-level specification will be done at the cycle level rather than the high-level design level.

In addition, the high-level design was supposed to be used to break down the development into cycles of around 100 LOC-- but I had already done that in the planning phase! So I'm afraid the high-level design and design reviews, while interesting, were not particularly useful for program 10a.

Development

Cycle 1: Gaussian Solver

Design

Basic design for the gaussian solver chunk fleshed out the gaussian_solver class and added a test stub to the main program; the brief diagram is below. Because much code for adding and summing arrays of numbers was coming up in gaussian_solver, I eventually broke that out into its own class, numeric_array.

Cycle 1 design diagram, using a form of UML

Design Review

The PSP3 process adds something to the design review, a bit about using "...defined methods such as execution tables, trace tables, or mathematical verification" [Humphrey95]. For program 10a-- and I'm not sure why-- I did not follow this advice on either cycle; in fact, it's safe to say that the design reviews for program 10a were done more sloppily than for any program to date (where design reviews were done).

I'm not sure why this is-- perhaps I was tired and fell into old habits, perhaps I was overconfident about my initial design-- but I paid the price. Program 10a took significantly longer than expected, mostly for silly errors that I could have caught with a design review.

Code

Reused Code

Much code was reused for this program, including the classes adder, chi_squared_base, chi_squared_integral, error_log, gamma_function, is_double_equal, normal_distribution_integral, normal_distribution_inverse, normal_function_base, normalizer, simple_input_parser, simpson_integrator, single_variable_function, square, t_distribution, t_distribution_base, t_integral, and whitespace_stripper classes. Much of the code from the number_array_list and number_array_list_parser classes from previous programs were used. The main.cpp and main.e files were used intact with the minor change of the parser type; they are not reproduced here.

gaussian_solver
#ifndef GAUSSIAN_SOLVER_H
#define GAUSSIAN_SOLVER_H
#ifndef NUMERIC_ARRAY_H
#include "numeric_array.h"
#endif
class number_array_list_3;
class gaussian_solver_imp;

class gaussian_solver
{
 protected:
  gaussian_solver_imp* m_pimp;
  const number_array_list_3& coefficients( void ) const;
 public:
  void add_coefficients( const numeric_array& new_coefficients );
  numeric_array solution( void ) const;

 public:
  gaussian_solver( void );
  gaussian_solver( const gaussian_solver& rhs );
  ~gaussian_solver( void );

 public:
  bool is_solvable( void ) const;
  gaussian_solver reduced_solver( void ) const;
  numeric_array reduction_coefficients( const numeric_array& first, const numeric_array& subject ) const;
  bool is_completely_reduced( void ) const;
  double first_solution_coefficient( const numeric_array& partial_solution ) const;
  bool is_valid_solution( const numeric_array& test_solution ) const;
  numeric_array first_row( void ) const;
};

#endif
#include "gaussian_solver.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef IS_DOUBLE_EQUAL_H
#include "is_double_equal.h"
#endif
#ifndef NUMBER_ARRAY_LIST_3_H
#include "number_array_list_3.h"
#endif

class gaussian_solver_imp
{
public:
  number_array_list_3 coefficients;
};

gaussian_solver::gaussian_solver( void )
{
  m_pimp = new gaussian_solver_imp;
}

gaussian_solver::gaussian_solver( const gaussian_solver& rhs )
{
  m_pimp = new gaussian_solver_imp;
  m_pimp->coefficients = rhs.coefficients();
}

gaussian_solver::~gaussian_solver( void )
{
  REQUIRE( m_pimp != NULL );
  delete m_pimp;
}

const number_array_list_3&
gaussian_solver::coefficients( void ) const
{
  REQUIRE( m_pimp != NULL );
  return m_pimp->coefficients;
}

void
gaussian_solver::add_coefficients( const numeric_array& new_coefficients )
{
  REQUIRE( coefficients().is_valid_entry( new_coefficients ) );
  if ( coefficients().entry_count() > 0 )
    {
      REQUIRE( coefficients().entry_count() < coefficients().head().size() );
    }
  m_pimp->coefficients.add_entry( new_coefficients );
}

numeric_array
gaussian_solver::solution( void ) const
{
  REQUIRE( is_solvable() );
  numeric_array Result;
  if ( is_completely_reduced() )
    {
      Result.push_back( coefficients().head()[1] / coefficients().head()[0] );
    }
  else
    {
      Result = reduced_solver().solution();
      Result.insert( Result.begin(), first_solution_coefficient( Result ) );
    }
  ENSURE( is_valid_solution( Result ) );
  return Result;
}

bool
gaussian_solver::is_solvable( void ) const
{
  bool Result = false;
  if ( coefficients().field_count() == coefficients().entry_count() + 1 )
    {
      Result = true;
    }
  return Result;
}

gaussian_solver
gaussian_solver::reduced_solver( void ) const
{
  REQUIRE( !is_completely_reduced() );
  gaussian_solver Result;
  number_array_list_3::const_iterator iter = coefficients().begin();
  ++iter;
  while( iter != coefficients().end() )
    {
      Result.add_coefficients( reduction_coefficients( coefficients().head(), *iter ) );
      ++iter;
    }
  ENSURE( Result.is_solvable() );
  return Result;
}
  
numeric_array
gaussian_solver::reduction_coefficients( const numeric_array& first, const numeric_array& subject ) const
{
  REQUIRE( first.size() == subject.size() );
  REQUIRE( first.size() > 1 );
  const double multiplicand = ( - subject.head() / first.head() );
  const numeric_array partial_result = subject.added_to_array( first.mult_by_scalar( multiplicand ) );
  CHECK( is_double_equal( partial_result.head(), 0 ) );
  numeric_array Result = partial_result.tail();
  ENSURE( Result.size() == first.size() - 1 );
  return Result;
}

bool
gaussian_solver::is_completely_reduced( void ) const
{
  bool Result = false;
  if ( ( coefficients().entry_count() == 1 ) && is_solvable() ) 
    {
      Result = true;
    }
  return Result;
}

double
gaussian_solver::first_solution_coefficient( const numeric_array& partial_solution ) const
{
  REQUIRE( !is_completely_reduced() );
  REQUIRE( partial_solution.size() == coefficients().head().size() - 2 );
  numeric_array middle_section;
  numeric_array::const_iterator middle_begin = first_row().begin();
  ++middle_begin;
  numeric_array::const_iterator middle_end = first_row().end();
  --middle_end;
  middle_section.insert( middle_section.begin(), middle_begin, middle_end );
  const double x = first_row()[0];
  const double y = partial_solution.mult_by_array( middle_section ).sum();
  const double z = first_row()[ first_row().size() - 1 ]; 
  double Result = ( z - y ) / x;
  return Result;
}

bool
gaussian_solver::is_valid_solution( const numeric_array& test_solution ) const
{
  bool Result = false;
  numeric_array first_row_arguments = first_row();
  first_row_arguments.pop_back();
  if ( ( test_solution.size() == first_row().size() - 1 )
       && ( test_solution.mult_by_array( first_row_arguments ).sum() == first_row()[ first_row().size() - 1 ] ) )
    {
      Result = true;
    }
  return Result;
}

numeric_array
gaussian_solver::first_row( void ) const
{
  return coefficients().head();
}
class GAUSSIAN_SOLVER

creation {ANY} 
   make

feature {ANY} 
   
   coefficients: NUMBER_ARRAY_LIST_3;

feature {ANY} 
   
   make is 
      --creation
      do  
         !!coefficients.make;
      end -- make
   
   add_coefficients(new_coefficients: NUMERIC_ARRAY) is 
      --adds a set of coefficients to the list
      require 
         coefficients.is_valid_entry(new_coefficients); 
         coefficients.entry_count > 0 implies coefficients.entry_count + 1 < coefficients.head.count; 
      do  
         coefficients.add_entry(new_coefficients);
      ensure 
         coefficients.entry_count = old coefficients.entry_count + 1; 
      end -- add_coefficients
   
   is_solvable: BOOLEAN is 
      --is the current set of coefficients solvable?
      do  
         Result := false;
         if coefficients.head.count = coefficients.entry_count + 1 then 
            Result := true;
         end; 
      end -- is_solvable
   
   reduced_solver: like Current is 
      --solver, "reduced" by one equation and one parameter
      require 
         not completely_reduced; 
      local 
         i: INTEGER;
      do  
         !!Result.make;
         from 
            i := coefficients.lower + 1;
         until 
            not coefficients.valid_index(i)
         loop 
            Result.add_coefficients(reduction_coefficients(coefficients.first,coefficients.item(i)));
            i := i + 1;
         end; 
      ensure 
         Result.is_solvable; 
      end -- reduced_solver
   
   reduction_coefficients(first, subject: NUMERIC_ARRAY): NUMERIC_ARRAY is 
      --the coefficients used to reduce the solver; mostly used internally
      require 
         first.count = subject.count; 
         first.lower = subject.lower; 
         first.count > 1; 
      local 
         multiplicand: DOUBLE;
         partial_result: NUMERIC_ARRAY;
      do  
         multiplicand := - (subject.head / first.head);
         partial_result := subject.added_to_array(first.mult_by_scalar(multiplicand));
         check 
            partial_result.head.in_range(- 0.0000001,0.0000001); 
         end;
         Result := partial_result.tail;
      ensure 
         Result.count = first_row.count - 1; 
      end -- reduction_coefficients
   
   solution: NUMERIC_ARRAY is 
      --beta parameters for the multiple regression solution
      require 
         is_solvable; 
      do  
         if completely_reduced then 
            !!Result.make_empty;
            Result.add_first(coefficients.head.last / coefficients.head.first);
         else 
            Result := reduced_solver.solution;
            Result.add_first(first_solution_coefficient(reduced_solver.solution));
         end; 
      ensure 
         is_valid_solution(Result); 
      end -- solution
   
   completely_reduced: BOOLEAN is 
      --is reduced to one equation with two unknowns?
      do  
         Result := false;
         if coefficients.entry_count = 1 and is_solvable then 
            Result := true;
         end; 
      end -- completely_reduced
   
   first_solution_coefficient(partial_solution: NUMERIC_ARRAY): DOUBLE is 
      --first coefficient to the solution with the given 
      --parameters for the other coefficients
      require 
         not completely_reduced; 
         valid_partial_solution: partial_solution.count = coefficients.head.count - 2; 
      do  
         Result := (first_row.last - partial_solution.mult_by_array(first_row.slice(first_row.lower + 1,first_row.upper - 1)).sum) / first_row.first;
      end -- first_solution_coefficient
   
   is_valid_solution(possible_solution: NUMERIC_ARRAY): BOOLEAN is 
      --is test_solution a valid solution for the given matrix?
      do  
         Result := false;
         if possible_solution.count = first_row.count - 1 and possible_solution.mult_by_array(first_row.slice(first_row.lower,first_row.upper - 1)).sum.in_range(first_row.last - 0.0000001,first_row.last + 0.0000001) then 
            Result := true;
         end; 
      end -- is_valid_solution
   
   first_row: NUMERIC_ARRAY is 
      --first row of the coefficients
      require 
         coefficients.entry_count > 0; 
      do  
         Result := coefficients.first;
      end -- first_row

end -- class GAUSSIAN_SOLVER
numeric_array
/*
*/

#ifndef NUMERIC_ARRAY_H
#define NUMERIC_ARRAY_H

#include <vector>
#include <string>

class numeric_array : public std::vector< double >
{
 public:
  double head( void ) const;
  numeric_array tail( void ) const;
  double sum( void ) const;
  numeric_array mult_by_array( const numeric_array& rhs ) const;
  numeric_array mult_by_scalar( double scalar ) const;
  numeric_array added_to_array( const numeric_array& rhs ) const;
  std::string to_string( void ) const;
};



#endif


/*
*/
/*
*/

#include "numeric_array.h"

#ifndef CONTRACT_H
#include "contract.h"
#endif
#include <stdio.h>
#include <string>
double
numeric_array::head( void ) const 
{
  REQUIRE( size() > 0 );
  return *begin();
}

numeric_array
numeric_array::tail( void ) const
{
  numeric_array Result;
  if ( size() > 1 )
    {
      numeric_array::const_iterator iter = begin();
      ++iter;
      Result.insert( Result.begin(), iter, end() );
    }
  return Result;
}

double
numeric_array::sum( void ) const
{
  double Result = 0;
  if ( size() > 0 )
    {
      Result = head() + tail().sum();
    }
  return Result;
}

numeric_array
numeric_array::mult_by_array( const numeric_array& rhs ) const
{
  REQUIRE( size() == rhs.size() );
  numeric_array Result;
  if ( size() > 0 )
    {
      Result = tail().mult_by_array( rhs.tail() );
      Result.insert( Result.begin(), head() * rhs.head() );
    }
  ENSURE( Result.size() == size() );
  return Result;
}

numeric_array
numeric_array::mult_by_scalar( double scalar ) const
{
  numeric_array Result;
  if ( size() > 0 )
    {
      Result = tail().mult_by_scalar( scalar );
      Result.insert( Result.begin(), head() * scalar );
    }
  ENSURE( Result.size() == size() );
  return Result;
}

numeric_array
numeric_array::added_to_array( const numeric_array& rhs ) const
{
  numeric_array Result;
  if ( size() > 0 )
    {
      Result = tail().added_to_array( rhs.tail() );
      Result.insert( Result.begin(), head() + rhs.head() );
    }
  ENSURE( Result.size() == size() );
  return Result;
}
  
static std::string double_as_string( double x ) 
{
  char buffer[ 1000 ];
  sprintf( buffer, "%f", x );
  return std::string( buffer );
}


std::string
numeric_array::to_string( void ) const
{
  std::string Result = "( ";
  for( numeric_array::const_iterator iter = begin(); iter != end(); ++iter )
    {
      Result.append( double_as_string( *iter ) );
      if ( iter + 1 != end() )
	{
	  Result.append( ", " );
	}
    }
  Result.append( " )" );
  return Result;
}






/*
*/
class NUMERIC_ARRAY
--simple array of numbers; handles some mappings and functionality 
--like NUMBER_ARRAY_LIST and its descendants

inherit 
   ARRAY[DOUBLE];
   
creation {ANY} 
   make, make_empty, with_capacity, from_collection

feature {ANY} 
   
   make_empty is 
      --creates with empty array and indices (1,0) for lower, upper
      do  
         make(1,0);
      end -- make_empty
   
   head: like item is 
      --first item in the array
      require 
         count > 0; 
      do  
         Result := first;
      end -- head
   
   tail: like Current is 
      --items in the array after the first
      do  
         if count > 1 then 
            Result := slice(lower + 1,upper);
         else 
            !!Result.make_empty;
         end; 
      ensure 
         count > 0 implies Result.count = count - 1; 
      end -- tail
   
   sum: like item is 
      --sum of the items in the array
      do  
         Result := 0;
         if count > 0 then 
            Result := head + tail.sum;
         end; 
      end -- sum
   
   mult_by_array(rhs: like Current): like Current is 
      --each item in the array multiplied by a corresponding element 
      --in the second array
      require 
         count = rhs.count; 
      do  
         if count = 0 then 
            !!Result.make_empty;
         else 
            Result := tail.mult_by_array(rhs.tail);
            Result.add_first(head * rhs.head);
         end; 
      end -- mult_by_array
   
   mult_by_scalar(scalar: like item): like Current is 
      --each item in the array multiplied by a scalar
      do  
         if count = 0 then 
            !!Result.make_empty;
         else 
            Result := tail.mult_by_scalar(scalar);
            Result.add_first(head * scalar);
         end; 
      end -- mult_by_scalar
   
   added_to_array(rhs: like Current): like Current is 
      --each item in the array added to the corresponding element in rhs
      require 
         count = rhs.count; 
      do  
         if count = 0 then 
            !!Result.make_empty;
         else 
            Result := tail.added_to_array(rhs.tail);
            Result.add_first(head + rhs.head);
         end; 
      end -- added_to_array
   
   to_string: STRING is 
      --string representation in the form "( x0, x1, ... xn )"
      local 
         i: INTEGER;
      do  
         !!Result.make_from_string("( ");
         from 
            i := lower;
         until 
            not valid_index(i)
         loop 
            item(i).append_in(Result);
            if valid_index(i + 1) then 
               Result.append_string(", ");
            end; 
            i := i + 1;
         end; 
         Result.append_string(" )");
      end -- to_string

end -- class NUMERIC_ARRAY

Code Review

Sigh... much as the design reviews suffered, so did the code reviews, to the point that I had double the number of defects reaching compile as I had expected. Disappointing, but I realize it was my fault for lacking some necessary discipline.

Compile

No surprises except for double the number of defects injected in code as I'd expected. Ouch.

Test

Actually, test went well, finding one fewer defect than expected and taking as much time as predicted. Around here, I felt I was doing well with program 10a.

Cycle 2: Multiple Regression

Design

Only a little more thought went into design of cycle 2, because it was mostly enhancements for the number_array_list and number_array_list_parser classes.

Cycle 2 design diagram, using a form of UML

Design Review

If I was lax on cycle 1's design review, I was doubly so on cycle 2. Encouraged by the good test results from cycle 1, I had apparently forgotten that the code phase had taken twice as long as expected and generated twice as many errors! As a result, cycle 2 suffered from poor design-- most of which was fine, but cost me in test.

Code

Reused Code

All code from cycle 1 was reused, except for 4 lines of stub test code.

number_array_list_3
/*
*/

#ifndef NUMBER_ARRAY_LIST_3_H
#define NUMBER_ARRAY_LIST_3_H

#include <list>
#include <vector>
#include <string>

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif
#ifndef NUMERIC_ARRAY_H
#include "numeric_array.h"
#endif
#ifndef GAUSSIAN_SOLVER_H
#include "gaussian_solver.h"
#endif

class number_array_list_3 : public std::list< numeric_array >
{
 protected:
  int m_field_count;

 public:
  int field_count( void ) const;
  numeric_array head( void ) const;
  number_array_list_3 tail( void ) const;
  number_array_list_3 suffixed_by_item( const numeric_array& entry ) const;
  number_array_list_3 suffixed_by_list( const number_array_list_3& rhs ) const;
  number_array_list_3 mapped_to( int field_index, const single_variable_function& f ) const;
  number_array_list_3 multiplied_by_list( int field_index, const number_array_list_3& rhs ) const;
  number_array_list_3 sorted_by( int field_index ) const;
  number_array_list_3 items_less_than( int field_index, double value ) const;
  number_array_list_3 items_greater_than_or_equal_to( int field_index, double value ) const;

  bool is_valid_entry( const numeric_array& entry ) const;
  bool is_valid_field_index( int field_index ) const;
  bool is_sorted_by( int field_index ) const;
  double sum_by_field( int field_index ) const;
  double mean_by_field( int field_index ) const;
  int entry_count( void ) const;
  
  void set_field_count( int new_field_count );
  void add_entry( const numeric_array& entry );

  void make_from_entry( const numeric_array& entry );
  void make( void );
  void make_from_list( const number_array_list_3& rhs );
  number_array_list_3( void );

  double standard_deviation( int field_index, bool is_full_population ) const;
  double variance( int field_index, bool is_full_population ) const;
  number_array_list_3 normalized_series( int field_index ) const;
  int normalization_s_value( void ) const;
  number_array_list_3 normalized_series_table( int field_index ) const;
  number_array_list_3 normal_distribution_segments( int s ) const;
  number_array_list_3 chi_squared_table( int field_index ) const;
  double chi_squared_q( int field_index ) const;
  double chi_squared_1_minus_p( int field_index ) const;
  number_array_list_3 select_series( int field_index ) const;
  int items_in_range( int field_index, double lower_limit, double upper_limit ) const;
  std::string entry_as_string( const numeric_array& entry ) const;
  std::string table_as_string( void ) const;

  numeric_array gaussian_solver_term( int term_number ) const;
  gaussian_solver to_gaussian_solver( void ) const;
  double full_variance( void ) const;
  number_array_list_3 full_variance_series( const numeric_array& gauss_solution ) const;
  double full_standard_deviation( void ) const;
  double prediction_range( double percent_range, const numeric_array& prediction_parameters ) const;
  double lower_prediction_interval( double percent_range, const numeric_array& prediction_parameters ) const;
  double upper_prediction_interval( double percent_range, const numeric_array& prediction_parameters ) const;
  double prediction( const numeric_array& prediction_parameters ) const;
};

#endif

/*
*/
/*
*/

#include "number_array_list_3.h"

#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef ADDER_H
#include "adder.h"
#endif
#ifndef SQUARE_H
#include "square.h"
#endif
#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#include "normal_distribution_inverse.h"
#endif
#ifndef CHI_SQUARED_INTEGRAL_H
#include "chi_squared_integral.h"
#endif
#ifndef NORMALIZER_H
#include "normalizer.h"
#endif
#ifndef T_DISTRIBUTION_H
#include "t_distribution.h"
#endif

#include <math.h>
#include <stdio.h>

int
number_array_list_3::field_count( void ) const
{
  return m_field_count;
}

numeric_array
number_array_list_3::head( void ) const
{
  REQUIRE( entry_count() >= 1 );
  return *(begin());
}

number_array_list_3
number_array_list_3::tail( void ) const
{
  number_array_list_3 Result;
  if ( entry_count() > 1 )
    {
      number_array_list_3::const_iterator tail_head = begin();
      ++tail_head;
      Result.insert( Result.begin(), tail_head, end() );
      Result.m_field_count = field_count();
      ENSURE( Result.entry_count() == entry_count() - 1 );
    }
  return Result;
}

number_array_list_3
number_array_list_3::suffixed_by_item( const numeric_array& entry ) const
{
  number_array_list_3 Result;
  Result.make_from_list( *this );
  Result.add_entry( entry );
  ENSURE( Result.entry_count() == entry_count() + 1 );
  return Result;
}

number_array_list_3
number_array_list_3::suffixed_by_list( const number_array_list_3& rhs ) const
{
  number_array_list_3 Result;
  Result.make_from_list( *this );
  for ( number_array_list_3::const_iterator iter = rhs.begin(); iter != rhs.end(); ++iter )
    {
      Result.add_entry( *iter );
    }
  ENSURE( Result.entry_count() == entry_count() + rhs.entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::mapped_to( int field_index, const single_variable_function& f ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  if ( entry_count() > 0 )
    {
      numeric_array new_entry = head();
      new_entry[ field_index ] = f.at( head()[ field_index ] );
      Result.make_from_list( Result.suffixed_by_item( new_entry ).suffixed_by_list( tail().mapped_to( field_index, f )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::multiplied_by_list( int field_index, const number_array_list_3& rhs ) const
{
  REQUIRE( entry_count() == rhs.entry_count() );
  REQUIRE( field_count() == rhs.field_count() );
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  if ( entry_count() > 0 )
    {
      numeric_array new_entry = head();
      new_entry[ field_index ] = head()[ field_index ] * rhs.head()[field_index];
      Result.make_from_list( Result.suffixed_by_item( new_entry ).
			      suffixed_by_list( tail().multiplied_by_list( field_index, rhs.tail() ) ) );
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  if ( is_sorted_by( field_index ) )
    {
      Result.make_from_list( *this );
    }
  else
    {
      Result.make_from_list( tail().items_less_than( field_index, head()[field_index] ).sorted_by( field_index ).
			     suffixed_by_item( head() ).
			     suffixed_by_list( tail().items_greater_than_or_equal_to( field_index, head()[field_index] ).sorted_by( field_index )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::items_less_than( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  for ( number_array_list_3::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] < value )
	{
	  Result.add_entry( *iter );
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::items_greater_than_or_equal_to( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  for ( number_array_list_3::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] >= value )
	{
	  Result.add_entry(*iter);
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

bool
number_array_list_3::is_valid_entry( const numeric_array& entry ) const
{
  bool Result = false;
  if ( ( entry_count() == 0 ) 
       || ( ( entry_count() > 0 ) && ( entry.size() == field_count() ) ) )
    {
      Result = true;
    }
  return Result;
}

bool
number_array_list_3::is_valid_field_index( int field_index ) const
{
  bool Result = true;
  if ( entry_count() > 0 )
    {
      Result = ( 0 <= field_index ) && ( field_index < m_field_count );
    }
  return Result;
}

bool
number_array_list_3::is_sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  bool Result = true;
  if ( entry_count() > 1 )
    {
      Result = ( head()[field_index] < tail().head()[field_index] ) && tail().is_sorted_by( field_index );
    }
  return Result;
}

double
number_array_list_3::sum_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) )
  double Result = 0;
  if ( entry_count() > 0 )
    {
      Result = head()[field_index] + tail().sum_by_field( field_index );
    }
  return Result;
}

double
number_array_list_3::mean_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  REQUIRE( entry_count() > 0 );
  double Result = sum_by_field( field_index ) / static_cast<double>( entry_count() );
  return Result;
}

int
number_array_list_3::entry_count( void ) const
{
  return size();
}

void
number_array_list_3::set_field_count( int new_field_count )
{
  REQUIRE( entry_count() == 0 );
  m_field_count = new_field_count;
  ENSURE( field_count() == new_field_count );
}

void
number_array_list_3::add_entry( const numeric_array& entry )
{
  if ( entry_count() == 0 )
    {
      set_field_count( entry.size() );
    }
  REQUIRE( is_valid_entry( entry ) );
  const int old_entry_count = entry_count();
  push_back( entry );
  ENSURE( entry_count() == old_entry_count + 1 );
}

void
number_array_list_3::make_from_entry( const numeric_array& entry )
{
  make();
  add_entry( entry );
  ENSURE( entry_count() == 1 );
  ENSURE( head() == entry );
}

void
number_array_list_3::make( void )
{
  clear();
  m_field_count = -1;
  ENSURE( m_field_count == -1 );
  ENSURE( entry_count() == 0 );
}

void
number_array_list_3::make_from_list( const number_array_list_3& rhs )
{
  make();
  insert( begin(), rhs.begin(), rhs.end() );
  m_field_count = rhs.field_count();
  ENSURE( entry_count() == rhs.entry_count() );
}

number_array_list_3::number_array_list_3( void )
{
  make();
}

double
number_array_list_3::standard_deviation( int field_index, bool is_full_population ) const
{
  double Result = sqrt( variance( field_index, is_full_population ) );
  return Result;
}

double
number_array_list_3::variance( int field_index, bool is_full_population ) const
{
  adder an_adder( -mean_by_field( field_index ) );
  square a_square;
  if ( is_full_population )
    {
      CHECK( entry_count() > 0 );
    }
  else
    {
      CHECK( entry_count() > 1 );
    }
  const double divisor = static_cast< double >( is_full_population ? entry_count() : entry_count() - 1 );
  const double Result = mapped_to( field_index, an_adder ).
    mapped_to( field_index, a_square ).sum_by_field( field_index ) / divisor;
  return Result;
}

number_array_list_3
number_array_list_3::normalized_series( int field_index ) const
{
  //cout << "Standard dev: " << standard_deviation( field_index, false ) << "\n";
  normalizer a_normalizer( mean_by_field( field_index ), standard_deviation( field_index, false ) );
  number_array_list_3 Result = mapped_to( field_index, a_normalizer ).select_series( field_index );
  //cout << "**in normalized_series**\n" << mapped_to( field_index, a_normalizer ).table_as_string() << "\n";
  ENSURE( Result.entry_count() == entry_count() );
  //cout << "**normalized_series result**\n" << Result.table_as_string() << "\n";
  return Result;
}

int
number_array_list_3::normalization_s_value( void ) const
{
  REQUIRE( entry_count() % 5 == 0 );
  int Result = entry_count() / 5;
  return Result;
}

number_array_list_3
number_array_list_3::normalized_series_table( int field_index ) const
{
  REQUIRE( entry_count() > 0 );
  number_array_list_3 sorted_list;
  sorted_list.make_from_list( sorted_by( field_index ) );
  number_array_list_3 sorted_normalized_list = sorted_list.normalized_series( field_index );
  //cout << "Sorted list: \n" << sorted_list.table_as_string() << "\n";
  //cout << "sorted normalized_list: \n" << sorted_normalized_list.table_as_string() << "\n";
  CHECK( sorted_list.entry_count() == entry_count() );
  CHECK( sorted_normalized_list.entry_count() == entry_count() );
  number_array_list_3::const_iterator sorted_list_iter;
  number_array_list_3::const_iterator sorted_normalized_list_iter;
  int i = 0;
  number_array_list_3 Result;
  for( i = 0, sorted_list_iter = sorted_list.begin(), sorted_normalized_list_iter = sorted_normalized_list.begin();
       sorted_list_iter != sorted_list.end() && sorted_normalized_list_iter != sorted_normalized_list.end();
       )//++i, ++sorted_list_iter, ++sorted_normalized_list_iter )
    {
      numeric_array new_entry;
      new_entry.push_back( static_cast<double>(i + 1) );
      new_entry.push_back( static_cast<double>(i + 1) / static_cast<double>(entry_count()));
      new_entry.push_back( (*sorted_list_iter)[ field_index ] );
      new_entry.push_back( (*sorted_normalized_list_iter)[ 0 ] );
      Result.add_entry( new_entry );

      ++i;
      ++sorted_list_iter;
      ++sorted_normalized_list_iter;
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list_3
number_array_list_3::normal_distribution_segments( int s ) const
{
  REQUIRE( s > 0 );
  const double norm_floor = -10000;
  const double norm_ceiling = 10000;
  normal_distribution_inverse norm_inverse;
  number_array_list_3 Result;
  for ( int i = 1; i <= s; ++i )
    {
      numeric_array new_entry;
      if ( i == 1 ) 
	{
	  new_entry.push_back( norm_floor );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast<double>(i - 1) ) 
						* ( 1.0 / static_cast< double >( s ) )));
	}
      if ( i == s )
	{
	  new_entry.push_back( norm_ceiling );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast< double >( i ) )
						* ( 1.0 / static_cast< double >( s ) )));
	}
      Result.add_entry( new_entry );
    }
  return Result;
}

number_array_list_3
number_array_list_3::chi_squared_table( int field_index ) const
{
  number_array_list_3 normal_segments = normal_distribution_segments( normalization_s_value() );
  number_array_list_3 norm_table = normalized_series_table( field_index );
  number_array_list_3 Result;
  square a_square;
  //cout << "norm table\n" << norm_table.table_as_string() << "\n\n";
  int i = 0;
  number_array_list_3::const_iterator iter;
  for ( i = 0, iter = normal_segments.begin(); iter != normal_segments.end(); ++iter, ++i )
    {
      numeric_array new_entry;
      new_entry.push_back( i );
      new_entry.push_back( (*iter)[0] );
      new_entry.push_back( (*iter)[1] );
      new_entry.push_back( static_cast< double >(norm_table.entry_count()) / norm_table.normalization_s_value() );
      new_entry.push_back( static_cast< double >(norm_table.items_in_range( 3, new_entry[1], new_entry[2] ) ) );
      new_entry.push_back( a_square.at( new_entry[3] - new_entry[4] ) );
      new_entry.push_back( new_entry[5] / new_entry[ 3 ] );
      Result.add_entry( new_entry );
    }
  return Result;
}

double
number_array_list_3::chi_squared_q( int field_index ) const
{
  //cout << chi_squared_table( field_index ).table_as_string();  
  double Result = chi_squared_table( field_index ).sum_by_field( 6 );
  return Result;
}

#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif

double
number_array_list_3::chi_squared_1_minus_p( int field_index ) const
{
  chi_squared_integral chi( normalization_s_value() - 1 );
  double Result = 1 - chi.at( chi_squared_q( field_index ) );
  return Result;
}

number_array_list_3
number_array_list_3::select_series( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list_3 Result;
  if ( entry_count() > 0 )
    {
      numeric_array new_entry;
      new_entry.push_back( head()[field_index] );
      Result.add_entry( new_entry );
    }
  if ( entry_count() > 1 )
    {
      Result.make_from_list( Result.suffixed_by_list( tail().select_series( field_index ) ));
    }
  ENSURE( Result.entry_count() == entry_count() );
  ENSURE( Result.field_count() == 1 );
  return Result;
}

int
number_array_list_3::items_in_range( int field_index, double lower_limit, double upper_limit ) const
{
  int Result = 0;
  if (( entry_count() > 0 )
      && ( head()[field_index] > lower_limit )
      && ( head()[field_index] <= upper_limit ) )
    {
      Result = 1;
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().items_in_range( field_index, lower_limit, upper_limit );
    }
  return Result;
}


std::string
number_array_list_3::entry_as_string( const numeric_array& entry ) const
{
  return entry.to_string();
}  

std::string
number_array_list_3::table_as_string( void ) const
{
  std::string Result = "";
  if ( entry_count() > 0 ) 
    {
      Result = Result + entry_as_string( head() ) + "\n";
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().table_as_string();
    }
  return Result;
}


numeric_array 
number_array_list_3::gaussian_solver_term( int term_number ) const
{
  REQUIRE( 0 <= term_number < field_count() );
  numeric_array Result;
  if ( term_number == 0 )
    {
      Result.push_back( static_cast<double>(entry_count()) );
      for ( int i = 0; i < field_count(); ++i )
	{
	  Result.push_back( sum_by_field( i ) );
	}
    }
  else
    {
      Result.push_back( sum_by_field( term_number - 1 ) );
      number_array_list_3 term_series = select_series( term_number - 1 );
      for ( int i = 0; i < field_count(); ++i ) 
	{
	  Result.push_back( term_series.multiplied_by_list( 0, select_series( i ) ).sum_by_field( 0 ) );
	}
    }
  ENSURE( Result.size() == field_count() + 1 );
  return Result;
}


gaussian_solver
number_array_list_3::to_gaussian_solver( void ) const
{
  gaussian_solver Result;
  for ( int i = 0; i < field_count(); ++i ) 
    {
      Result.add_coefficients( gaussian_solver_term( i ) );
    }
  ENSURE( Result.is_solvable() );
  return Result;
}

double
number_array_list_3::full_variance( void ) const
{
  const double Result = ( 1.0 / ( static_cast<double>(entry_count()) - static_cast<double>( field_count() ) ) )
    * full_variance_series( to_gaussian_solver().solution() ).sum_by_field( 0 );
  return Result;
}

number_array_list_3
number_array_list_3::full_variance_series( const numeric_array& gauss_solution ) const
{
  if ( entry_count() > 0 )
    {
      REQUIRE( gauss_solution.size() == field_count() );
    }
  number_array_list_3 Result;
  if ( entry_count() > 0 )
    {
      double new_entry = head()[ head().size() - 1 ] - gauss_solution[ 0 ];
      for ( int i = 0; i < field_count() - 1; ++i )
	{
	  new_entry = new_entry - gauss_solution[ i + 1 ] * head()[ i ];
	}
      new_entry = new_entry*new_entry;
      numeric_array new_entry_list;
      new_entry_list.push_back( new_entry );
      Result.make_from_entry( new_entry_list );
      Result = Result.suffixed_by_list( tail().full_variance_series( gauss_solution ) );
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

double
number_array_list_3::full_standard_deviation( void ) const
{
  const double Result = sqrt( full_variance() );
  return Result;
}

double
number_array_list_3::prediction_range( double percent_range, const numeric_array& prediction_parameters ) const
{
  REQUIRE( entry_count() > field_count() );
  double subtotal = 1.0 + 1.0 / static_cast<double>(entry_count() );
  for ( int i = 0; i < field_count() - 1; ++i )
    {
      adder an_adder( -mean_by_field( i ) );
      square a_square;
      const double numerator = a_square.at( prediction_parameters[ i ] - mean_by_field( i ) );
      const double denominator = mapped_to( i, an_adder ).mapped_to( i, a_square ).sum_by_field(i);
      subtotal = subtotal + numerator / denominator;
    }
  t_distribution t;
  t.set_n( entry_count() - field_count() );
  const double Result = t.at( percent_range )*full_standard_deviation()*sqrt( subtotal );
  return Result;
}

double 
number_array_list_3::lower_prediction_interval( double percent_range, const numeric_array& prediction_parameters ) const
{
  const double Result = prediction( prediction_parameters ) - prediction_range( percent_range, prediction_parameters );
  return Result;
}

double 
number_array_list_3::upper_prediction_interval( double percent_range, const numeric_array& prediction_parameters ) const
{
  const double Result = prediction( prediction_parameters ) + prediction_range( percent_range, prediction_parameters );
  return Result;
}

double 
number_array_list_3::prediction( const numeric_array& prediction_parameters ) const
{
  const double Result = to_gaussian_solver().solution().head() +
    to_gaussian_solver().solution().tail().mult_by_array( prediction_parameters ).sum();
  return Result;
}

/*
*/
class NUMBER_ARRAY_LIST_3

inherit 
   LINKED_LIST[NUMERIC_ARRAY]
      redefine make
      end; 
   
creation {ANY} 
   make, make_from_entry, make_from_list

feature {ANY} 
   
   field_count: INTEGER;
      --number of fields allowed in an entry
   
   head: like item is 
      --first item
      require 
         entry_count >= 1; 
      do  
         Result := first;
      end -- head
   
   tail: like Current is 
      --all items after the first
      do  
         !!Result.make;
         if entry_count > 1 then 
            Result := slice(lower + 1,upper);
            Result.set_field_count(field_count);
         end; 
      ensure 
         entry_count > 1 implies Result.entry_count = entry_count - 1 and Result.field_count = field_count; 
      end -- tail
   
   suffixed_by_list(rhs: like Current): like Current is 
      --the list, suffixed by another list
      local 
         i: INTEGER;
      do  
         !!Result.make_from_list(Current);
         from 
            i := rhs.lower;
         until 
            not rhs.valid_index(i)
         loop 
            Result.add_entry(rhs.item(i));
            i := i + 1;
         end; 
      ensure 
         Result.entry_count = entry_count + rhs.entry_count; 
      end -- suffixed_by_list
   
   make_from_entry(entry: like item) is 
      --clear the list, then add the entry
      do  
         make;
         add_entry(entry);
      ensure 
         head.is_equal(entry); 
         field_count = entry.count; 
      end -- make_from_entry
   
   make is 
      --clear the list
      do  
         Precursor;
         field_count := - 1;
      ensure 
         entry_count = 0; 
         field_count = - 1; 
      end -- make
   
   make_from_list(rhs: like Current) is 
      --clear the list, setting it equal to another list
      do  
         from_collection(rhs);
         field_count := rhs.field_count;
      end -- make_from_list
   
   sum_by_field(field_index: INTEGER): DOUBLE is 
      --the sum of a given field over all entries
      require 
         is_valid_field_index(field_index); 
      do  
         if entry_count = 0 then 
            Result := 0;
         else 
            Result := head.item(field_index) + tail.sum_by_field(field_index);
         end; 
      end -- sum_by_field
   
   mean_by_field(field_index: INTEGER): DOUBLE is 
      --the mean of a given field over all entries
      require 
         is_valid_field_index(field_index); 
         entry_count >= 1; 
      do  
         Result := sum_by_field(field_index) / entry_count.to_double;
      end -- mean_by_field
   
   entry_count: INTEGER is 
      --the number of entries
      do  
         Result := count;
      end -- entry_count
   
   add_entry(new_entry: like item) is 
      -- adds an entry to the end of the list
      require 
         is_valid_entry(new_entry); 
      do  
         if entry_count = 0 then 
            set_field_count(new_entry.count);
         end; 
         add_last(new_entry);
      ensure 
         entry_count = old entry_count + 1; 
      end -- add_entry
   
   mapped_to(field_index: INTEGER; f: SINGLE_VARIABLE_FUNCTION): like Current is 
      --the elements, with the given field mapped to f
      require 
         is_valid_field_index(field_index); 
      local 
         new_entry: like item;
      do  
         !!Result.make;
         if entry_count > 0 then 
            new_entry := head.twin;
            new_entry.put(f.at(head.item(field_index)),field_index);
            Result.add_entry(new_entry);
            Result := Result.suffixed_by_list(tail.mapped_to(field_index,f));
         end; 
      ensure 
         Result.entry_count = entry_count; 
      end -- mapped_to
   
   multiplied_by_list(field_index: INTEGER; rhs: like Current): like Current is 
      --the elements, with the given field multiplied by the 
      --corresponding field in rhs
      require 
         entry_count = rhs.entry_count; 
         field_count = rhs.field_count; 
         is_valid_field_index(field_index); 
      local 
         new_entry: like item;
      do  
         !!Result.make;
         if entry_count > 0 then 
            new_entry := head;
            new_entry.put(head.item(field_index) * rhs.head.item(field_index),field_index);
            Result.add_entry(new_entry);
            Result := Result.suffixed_by_list(tail.multiplied_by_list(field_index,rhs.tail));
         end; 
      ensure 
         Result.entry_count = entry_count; 
      end -- multiplied_by_list
   
   sorted_by(field_index: INTEGER): like Current is 
      --the list, sorted by the given field
      require 
         is_valid_field_index(field_index); 
      do  
         if is_sorted_by(field_index) then 
            !!Result.make_from_list(Current);
         else 
            !!Result.make_from_list(tail.items_less_than(field_index,head.item(field_index)).sorted_by(field_index).suffixed_by_item(head).suffixed_by_list(tail.items_greater_than_or_equal_to(field_index,head.item(field_index)).sorted_by(field_index)));
         end; 
      ensure 
         Result.entry_count = entry_count; 
      end -- sorted_by
   
   is_valid_entry(entry: like item): BOOLEAN is 
      --whether entry is a valid entry
      do  
         if entry_count = 0 and entry.count > 0 or entry.count = field_count and entry.lower = 1 then 
            Result := true;
         else 
            Result := false;
         end; 
      end -- is_valid_entry
   
   items_less_than(field_index: INTEGER; value: DOUBLE): like Current is 
      --list of items less than the given value in the given field
      require 
         is_valid_field_index(field_index); 
      local 
         i: INTEGER;
      do  
         !!Result.make;
         from 
            i := lower;
         until 
            not valid_index(i)
         loop 
            if not (item(i).item(field_index) >= value) then 
               Result.add_entry(item(i));
            end; 
            i := i + 1;
         end; 
      ensure 
         Result.entry_count <= entry_count; 
      end -- items_less_than
   
   items_greater_than_or_equal_to(field_index: INTEGER; value: DOUBLE): like Current is 
      --list of items greater than or equal to the given value in the 
      --given field
      require 
         is_valid_field_index(field_index); 
      local 
         i: INTEGER;
      do  
         !!Result.make;
         from 
            i := lower;
         until 
            not valid_index(i)
         loop 
            if item(i).item(field_index) >= value then 
               Result.add_entry(item(i));
            end; 
            i := i + 1;
         end; 
      ensure 
         Result.entry_count <= entry_count; 
      end -- items_greater_than_or_equal_to
   
   suffixed_by_item(entry: like item): like Current is 
      --the list, suffixed by a single item
      require 
         is_valid_entry(entry); 
      do  
         !!Result.make_from_list(Current);
         Result.add_entry(entry);
      ensure 
         Result.entry_count = entry_count + 1; 
      end -- suffixed_by_item
   
   set_field_count(new_field_count: INTEGER) is 
      --sets the field count
      require 
         entry_count = 0 or (entry_count > 0 implies head.count = new_field_count); 
      do  
         field_count := new_field_count;
      ensure 
         field_count = new_field_count; 
      end -- set_field_count
   
   is_valid_field_index(field_index: INTEGER): BOOLEAN is 
      --whether the given field index is valid
      do  
         if entry_count = 0 then 
            Result := true;
         else 
            Result := 1 <= field_index and field_index <= field_count;
         end; 
      end -- is_valid_field_index
   
   is_sorted_by(field_index: INTEGER): BOOLEAN is 
      --whether the list is sorted by the given field index
      require 
         is_valid_field_index(field_index); 
      do  
         Result := true;
         if entry_count > 1 then 
            Result := head.item(field_index) < tail.head.item(field_index) and tail.is_sorted_by(field_index);
         end; 
      end -- is_sorted_by

feature {ANY} --chi-squared distribution calcs

   variance(field_index: INTEGER; is_full_population: BOOLEAN): DOUBLE is 
      require 
         is_valid_field_index(field_index); 
         is_full_population implies entry_count > 0; 
         not is_full_population implies entry_count > 1; 
      local 
         divisor: DOUBLE;
         adder: ADDER;
         square: SQUARE;
      do  
         if is_full_population then 
            divisor := entry_count.to_double;
         else 
            divisor := (entry_count - 1).to_double;
         end; 
         !!adder.make(- mean_by_field(field_index));
         !!square;
         Result := mapped_to(field_index,adder).mapped_to(field_index,square).sum_by_field(field_index) / divisor;
      end -- variance
   
   standard_deviation(field_index: INTEGER; is_full_population: BOOLEAN): DOUBLE is 
      require 
         is_valid_field_index(field_index); 
      do  
         Result := variance(field_index,is_full_population).sqrt;
      end -- standard_deviation
   
   normalized_series(field_index: INTEGER): like Current is 
      --one series of the table, "normalized" into standard deviations
      require 
         is_valid_field_index(field_index); 
      local 
         normalizer: NORMALIZER;
      do  
         !!normalizer.make(mean_by_field(field_index),standard_deviation(field_index,false));
         Result := mapped_to(field_index,normalizer).select_series(field_index);
      ensure 
         Result.entry_count = entry_count; 
      end -- normalized_series
   
   normalization_s_value: INTEGER is 
      --number of segments in a normalization table
      require 
         entry_count.divisible(5); 
      do  
         Result := entry_count // 5;
      end -- normalization_s_value
   
   normalized_series_table(field_index: INTEGER): like Current is 
      --a normalized series table, used for error-checking and used 
      --in the chi-squared normalization test
      require 
         entry_count > 0; 
         is_valid_field_index(field_index); 
      local 
         sorted_list: like Current;
         sorted_normalized_list: like Current;
         i: INTEGER;
         new_entry: NUMERIC_ARRAY;
      do  
         sorted_list := sorted_by(field_index);
         sorted_normalized_list := sorted_list.normalized_series(field_index);
         check 
            sorted_list.entry_count = entry_count; 
            sorted_normalized_list.entry_count = entry_count; 
         end;
         from 
            i := sorted_list.lower;
            !!Result.make;
         until 
            not sorted_list.valid_index(i)
         loop 
            !!new_entry.make_empty;
            new_entry.add_last(i.to_double);
            new_entry.add_last(i.to_double / entry_count.to_double);
            new_entry.add_last(sorted_list.item(i).item(field_index));
            new_entry.add_last(sorted_normalized_list.item(i).first);
            Result.add_entry(new_entry);
            i := i + 1;
         end; 
      end -- normalized_series_table
   
   normal_distribution_segments(s: INTEGER): like Current is 
      --segments of the normal distribution; see [Humphrey95] for use
      require 
         s > 0; 
      local 
         i: INTEGER;
         new_entry: NUMERIC_ARRAY;
         norm_inverse: NORMAL_DISTRIBUTION_INVERSE;
      do  
         from 
            i := 1;
            !!Result.make;
            !!norm_inverse;
         until 
            i > s
         loop 
            !!new_entry.make_empty;
            if i = 1 then 
               new_entry.add_last(- 1000.0);
            else 
               new_entry.add_last(norm_inverse.at((i - 1).to_double * (1.0 / s.to_double)));
            end; 
            if i = s then 
               new_entry.add_last(1000.0);
            else 
               new_entry.add_last(norm_inverse.at(i.to_double * 1.0 / s.to_double));
            end; 
            Result.add_entry(new_entry);
            i := i + 1;
         end; 
      end -- normal_distribution_segments
   
   chi_squared_table(field_index: INTEGER): like Current is 
      -- chi-squared table, used to calculate the chi-squared distribution
      require 
         is_valid_field_index(field_index); 
      local 
         norm_segments: like Current;
         norm_table: like Current;
         i: INTEGER;
         new_entry: NUMERIC_ARRAY;
      do  
         from 
            norm_segments := normal_distribution_segments(normalization_s_value);
            norm_table := normalized_series_table(field_index);
            !!Result.make;
            i := norm_segments.lower;
         until 
            not norm_segments.valid_index(i)
         loop 
            !!new_entry.make_empty;
            new_entry.add_last(i.to_double);
            new_entry.add_last(norm_segments.item(i).item(1));
            new_entry.add_last(norm_segments.item(i).item(2));
            new_entry.add_last(norm_table.entry_count.to_double / norm_table.normalization_s_value.to_double);
            new_entry.add_last(norm_table.items_in_range(4,new_entry.item(2),new_entry.item(3)));
            new_entry.add_last((new_entry.item(4) - new_entry.item(5)) ^ 2);
            new_entry.add_last(new_entry.item(6) / new_entry.item(4));
            Result.add_entry(new_entry);
            i := i + 1;
         end; 
      end -- chi_squared_table
   
   chi_squared_q(field_index: INTEGER): DOUBLE is 
      --"Q" value of chi-squared normalization test
      do  
         Result := chi_squared_table(field_index).sum_by_field(7);
      end -- chi_squared_q
   
   chi_squared_1_minus_p(field_index: INTEGER): DOUBLE is 
      --1-p value of chi-squared normalization test
      local 
         chi_squared_integral: CHI_SQUARED_INTEGRAL;
      do  
         !!chi_squared_integral.make(normalization_s_value - 1);
         Result := 1 - chi_squared_integral.at(chi_squared_q(field_index));
      end -- chi_squared_1_minus_p
   
   select_series(field_index: INTEGER): like Current is 
      --the given series, extracted as a separate list
      require 
         is_valid_field_index(field_index); 
      local 
         new_entry: NUMERIC_ARRAY;
      do  
         if entry_count = 0 then 
            !!Result.make;
         else 
            !!new_entry.make_empty;
            new_entry.add_last(head.item(field_index));
            !!Result.make_from_entry(new_entry);
            if entry_count > 1 then 
               Result := Result.suffixed_by_list(tail.select_series(field_index));
            end; 
         end; 
      end -- select_series
   
   items_in_range(field_index: INTEGER; lower_limit, upper_limit: DOUBLE): INTEGER is 
      --the items from the given field which fit in (lower limit < item <= upper_limit)
      require 
         is_valid_field_index(field_index); 
      do  
         if entry_count > 0 and head.item(field_index) > lower_limit and upper_limit >= head.item(field_index) then 
            Result := 1;
         end; 
         if entry_count > 1 then 
            Result := Result + tail.items_in_range(field_index,lower_limit,upper_limit);
         end; 
      end -- items_in_range
   
   entry_as_string(entry: like item): STRING is 
      --entry as a string, ie "( 1, 2, 3 )"
      do  
         Result := entry.to_string;
      end -- entry_as_string
   
   table_as_string: STRING is 
      --table, as a set of entry_as_string lines
      do  
         if entry_count > 0 then 
            Result := entry_as_string(head) + "%N";
            if entry_count > 1 then 
               Result.append_string(tail.table_as_string);
            end; 
         end; 
      end -- table_as_string

feature {ANY} -- multiple regression features

   gaussian_solver_term(term_number: INTEGER): NUMERIC_ARRAY is 
      --term for gaussian solver to solve for multiple regression parameters
      require 
         1 <= term_number and term_number < field_count + 1; 
      local 
         i: INTEGER;
      do  
         !!Result.make_empty;
         if term_number = 1 then 
            Result.add_last(entry_count.to_double);
            from 
               i := head.lower;
            until 
               not head.valid_index(i)
            loop 
               Result.add_last(sum_by_field(i));
               i := i + 1;
            end; 
         else 
            Result.add_last(sum_by_field(term_number - 1));
            from 
               i := head.lower;
            until 
               not head.valid_index(i)
            loop 
               Result.add_last(select_series(term_number - 1).multiplied_by_list(1,select_series(i)).sum_by_field(1));
               i := i + 1;
            end; 
         end; 
      end -- gaussian_solver_term
   
   to_gaussian_solver: GAUSSIAN_SOLVER is 
      --creates a gaussian solver for the multiple regression parameters
      local 
         i: INTEGER;
      do  
         !!Result.make;
         from 
            i := head.lower;
         until 
            not head.valid_index(i)
         loop 
            Result.add_coefficients(gaussian_solver_term(i));
            i := i + 1;
         end; 
      ensure 
         Result.is_solvable; 
      end -- to_gaussian_solver
   
   full_variance: DOUBLE is 
      --total variance of the data set
      require 
         entry_count > field_count; 
      do  
         Result := ( 1.0 / (entry_count.to_double - field_count.to_double) ) * full_variance_series(to_gaussian_solver.solution).sum_by_field(1);
      end -- full_variance
   
   full_variance_series(gauss_solution: NUMERIC_ARRAY): like Current is 
      --variance series, used to calculate the variance
      require 
         entry_count > 0 implies gauss_solution.count = field_count; 
      local 
         i: INTEGER;
         new_entry: DOUBLE;
         new_entry_list: NUMERIC_ARRAY;
      do  
	 if entry_count = 0 then
	    !!Result.make
	 else
	    new_entry := head.last - gauss_solution.first;
	    from 
	       i := head.lower;
	    until 
	       not head.valid_index(i + 1)
	    loop 
	       new_entry := new_entry - gauss_solution.item(i + 1) * head.item(i);
	       i := i + 1;
	    end; 
	    new_entry := new_entry ^ 2;
	    !!new_entry_list.make_empty;
	    new_entry_list.add_last(new_entry);
	    !!Result.make_from_entry(new_entry_list);
	    Result := Result.suffixed_by_list(tail.full_variance_series(gauss_solution));
	 end
      end -- full_variance_series
   
   full_standard_deviation: DOUBLE is 
      --standard deviation of the data set
      do  
         Result := full_variance.sqrt;
      end -- full_standard_deviation
   
   prediction_range(percent_range: DOUBLE; prediction_parameters: NUMERIC_ARRAY): DOUBLE is 
      --prediction range for the given prediction and percent
      require 
         entry_count > field_count; 
      local 
         i: INTEGER;
         numerator: DOUBLE;
         denominator: DOUBLE;
         subtotal: DOUBLE;
         adder: ADDER;
         square: SQUARE;
         t: T_DISTRIBUTION;
      do  
         subtotal := 1.0 + 1.0 / entry_count.to_double;
         from 
            i := head.lower;
         until 
            not head.valid_index(i + 1)
         loop 
            !!adder.make(- mean_by_field(i));
	    !!square
            numerator := adder.at(prediction_parameters.item(i)) ^ 2;
            denominator := mapped_to(i,adder).select_series(i).mapped_to(1,square).sum_by_field(1);
            subtotal := subtotal + numerator / denominator;
            i := i + 1;
         end; 
         !!t.make;
         t.set_n(entry_count - field_count);
         Result := t.at(percent_range) * full_standard_deviation * subtotal.sqrt;
      end -- prediction_range
   
   lower_prediction_interval(percent_range: DOUBLE; parameters: NUMERIC_ARRAY): DOUBLE is 
      --LPI for percent range and prediction
      do  
         Result := prediction(parameters) - prediction_range(percent_range,parameters);
      end -- lower_prediction_interval
   
   upper_prediction_interval(percent_range: DOUBLE; parameters: NUMERIC_ARRAY): DOUBLE is 
      --UPI for percent range and distribution
      do  
         Result := prediction(parameters) + prediction_range(percent_range,parameters);
      end -- upper_prediction_interval
   
   prediction(parameters: NUMERIC_ARRAY): DOUBLE is 
      --prediction using multiple regression given the parameters by series
      do  
         Result := to_gaussian_solver.solution.head + to_gaussian_solver.solution.tail.mult_by_array(parameters).sum;
      end -- prediction

end -- class NUMBER_ARRAY_LIST_3
number_array_list_parser_3
#ifndef NUMBER_ARRAY_LIST_PARSER_3_H
#define NUMBER_ARRAY_LIST_PARSER_3_H

#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif
#ifndef NUMBER_ARRAY_LIST_3_H
#include "number_array_list_3.h"
#endif

class number_array_list_parser_3: public simple_input_parser
{
 protected:
  number_array_list_3 number_list;
  enum state_t { Parsing_historical_data, Parsing_commands };
  state_t state;

 public:
  virtual std::string transformed_line (const std::string & line) const;
  std::string string_stripped_of_comments (const std::string & str) const;
  static bool is_double (const std::string & str);
  static double double_from_string (const std::string & str);
  static const std::string & historical_data_terminator;
  static const std::string & inline_comment_begin;
  bool last_line_is_blank (void);

  virtual void parse_last_line (void);
  void parse_last_line_as_historical_data (void);
  void parse_last_line_as_end_of_historical_data (void);
  void parse_last_line_as_command( void );
  void print_normalization_check( int field_index ) const;
  void print_multiple_regression_prediction( const numeric_array& parameters ) const;
  void make( void );
  number_array_list_parser_3( void );

  bool last_line_is_valid_historical_data( void ) const;
  static std::vector< std::string > split_string( const std::string& string_to_split, const std::string& separator );
  static std::string string_before_separator( const std::string& string_to_split, const std::string& separator );
  static std::string string_after_separator( const std::string& string_to_split, const std::string& separator );
  static numeric_array numbers_from_string( const std::string& string_to_split );

};

#endif
/*
*/

#include "number_array_list_parser_3.h"
#ifndef WHITESPACE_STRIPPER_H
#include "whitespace_stripper.h"
#endif
#ifndef ERROR_LOG_H
#include "error_log.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif

void
number_array_list_parser_3::make (void)
{
  simple_input_parser::reset ();
  state = Parsing_historical_data;
  number_list.make ();
}

std::string number_array_list_parser_3::transformed_line (const std::string & str) const
{
  return whitespace_stripper::string_stripped_of_whitespace (string_stripped_of_comments (str));
}

std::string
  number_array_list_parser_3::string_stripped_of_comments (const std::string & str) const
{
  const std::string::size_type comment_index = str.find (inline_comment_begin);
  return str.substr (0, comment_index);
}

bool 
number_array_list_parser_3::is_double (const std::string & str)
{
  bool
    Result = true;
  char *
    conversion_end = NULL;
  strtod (str.c_str (), &conversion_end);
  if (conversion_end == str.data ())
    {
      Result = false;
    }
  return Result;
}

double
number_array_list_parser_3::double_from_string (const std::string & str)
{
  REQUIRE (is_double (str));
  return strtod (str.c_str (), NULL);
}


const
  std::string & number_array_list_parser_3::historical_data_terminator = "stop";

const
  std::string & number_array_list_parser_3::inline_comment_begin = "--";

bool number_array_list_parser_3::last_line_is_blank (void)
{
  if (last_line ().length () == 0)
    {
      return true;
    }
  else
    {
      return false;
    }
}

void
number_array_list_parser_3::parse_last_line (void)
{
  if (last_line_is_blank ())
    {
      return;
    }
  if ( state == Parsing_historical_data )
    {
      if ( last_line () == historical_data_terminator)
	{
	  parse_last_line_as_end_of_historical_data ();
	}
      else
	{
	  parse_last_line_as_historical_data ();
	}
    }
  else
    {
      parse_last_line_as_command();
    }
}


void
number_array_list_parser_3::parse_last_line_as_historical_data (void)
{
  if ( last_line_is_valid_historical_data() )
    {
      const numeric_array this_entry = numbers_from_string( last_line() );
      number_list.add_entry( this_entry );
    }
  else
    {
      error_log errlog;
      errlog.log_error( std::string( "Invalid data entry: " ) + last_line() );
    }
}
  
void
number_array_list_parser_3::parse_last_line_as_end_of_historical_data (void)
{
  REQUIRE (last_line () == historical_data_terminator);
  cout << "Historical data read; " 
       << number_list.entry_count() << " entries, " 
       << number_list.field_count() << " fields.\n";
  state = Parsing_commands;
}

number_array_list_parser_3::number_array_list_parser_3 (void)
{
  make();
}


bool
number_array_list_parser_3::last_line_is_valid_historical_data( void ) const
{
  const std::vector< std::string > substrings = split_string( last_line(), "," );
  bool Result = true;
  //make sure we have a valid field count for the number_array_list
  if ( ( number_list.entry_count() > 0 ) && ( substrings.size() != number_list.field_count() ) )
    {
      Result = false;
    }
  //...and that each substring can be interpreted as a double
  for ( int i = 0; i < substrings.size(); ++i )
    {
      if ( !is_double( substrings[ i ] ) )
	{
	  Result = false;
	}
    }
  return Result;
}

std::vector< std::string >
number_array_list_parser_3::split_string( const std::string& string_to_split, const std::string& separator ) 
{
  std::vector< std::string > Result;
  const std::string prefix = string_before_separator( string_to_split, separator );
  const std::string remainder = string_after_separator( string_to_split, separator );
  Result.push_back( prefix );
  if ( remainder.size() > 0 )
    {
      const std::vector< std::string > split_remainder = split_string( remainder, separator );
      Result.insert( Result.end(), split_remainder.begin(), split_remainder.end() );
    }
  return Result;
}

std::string
number_array_list_parser_3::string_before_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
    //not found; result is entire string
      Result = string_to_split;
    }
  else
    {
      Result = string_to_split.substr( 0, separator_position );
    }
  return Result;
}

std::string
number_array_list_parser_3::string_after_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
      //not found; result is empty
      Result = "";
    }
  else
    {
      Result = string_to_split.substr( separator_position + separator.size(), string_to_split.size() );
    }
  return Result;
}

numeric_array
number_array_list_parser_3::numbers_from_string( const std::string& string_to_split )
{
  const std::vector< std::string > number_strings = split_string( string_to_split, "," );
  numeric_array Result;
  for ( std::vector< std::string >::const_iterator iter = number_strings.begin(); iter != number_strings.end(); ++iter )
    {
      CHECK( is_double( *iter ) );
      const double new_value = double_from_string( *iter );
      Result.push_back( new_value );
    }
  return Result;
}

void
number_array_list_parser_3::parse_last_line_as_command( void )
{
  const std::string command = string_before_separator( last_line(), " " );
  const std::string argument_string = string_after_separator( last_line(), " " );
  std::vector< string > arguments = split_string( argument_string, " " );
  if ( command == "normality_check_on_series" )
    {
      print_normalization_check( static_cast< int >( double_from_string( arguments[0] ) ) );
    }
  else if ( command == "multiple-regression-prediction" )
    {
      print_multiple_regression_prediction( numbers_from_string( argument_string ) );
    }
  else
    {
      cout << "unknown command: " << last_line() << "\n";
    }
  }

void
number_array_list_parser_3::print_normalization_check( int field_index ) const
{
  cout << "Normalization check on series: " << field_index << "\n";
  cout << "Q: " << number_list.chi_squared_q( field_index ) << "\n";
  cout << "(1-p): " << number_list.chi_squared_1_minus_p( field_index ) << "\n";
}


void
number_array_list_parser_3::print_multiple_regression_prediction( const numeric_array& parameters ) const
{
  if ( number_list.entry_count() <= number_list.field_count() )
    {
      cout << "Too few entries for multiple regression prediction!\n";
    }
  else if ( parameters.size() != number_list.field_count() - 1 ) 
    {
      cout << "Incorrect number of parameters for multiple regression; expected " 
	   << number_list.field_count() - 1 << ", got " << parameters.size() << "\n";
    }
  else
    {
      cout << "Multiple regression prediction: \n";
      cout << "Solution: " << number_list.to_gaussian_solver().solution().to_string() << "\n";
      cout << "Variance: " << number_list.full_variance() << "\n";
      cout << "Standard Deviation: " << number_list.full_standard_deviation() << "\n\n";
      cout << "Prediction from values: " << parameters.to_string() << "\n";
      cout << "Projected result: " << number_list.prediction( parameters ) << "\n";
      cout << "UPI( 70% ): " << number_list.upper_prediction_interval( 0.7, parameters ) << "\n";
      cout << "LPI( 70% ): " << number_list.lower_prediction_interval( 0.7, parameters ) << "\n";
      cout << "UPI( 90% ): " << number_list.upper_prediction_interval( 0.9, parameters ) << "\n";
      cout << "LPI( 90% ): " << number_list.lower_prediction_interval( 0.9, parameters ) << "\n";
    }
}

  

/*
*/
class NUMBER_ARRAY_LIST_PARSER_3
--reads a list of number pairs, and performs linear regression analysis

inherit 
   SIMPLE_INPUT_PARSER
      redefine parse_last_line, transformed_line
      end; 
   
creation {ANY} 
   make

feature {ANY} 
   
   inline_comment_begin: STRING is "--";
   
   string_stripped_of_comment(string: STRING): STRING is 
      --strip the string of any comment
      local 
         comment_index: INTEGER;
      do  
         if string.has_string(inline_comment_begin) then 
            comment_index := string.index_of_string(inline_comment_begin);
            if comment_index = 1 then 
               !!Result.make_from_string("");
            else 
               Result := string.substring(1,comment_index - 1);
            end; 
         else 
            Result := string.twin;
         end; 
      end -- string_stripped_of_comment
   
   string_stripped_of_whitespace(string: STRING): STRING is 
      --strip string of whitespace
      do  
         Result := string.twin;
         Result.left_adjust;
         Result.right_adjust;
      end -- string_stripped_of_whitespace
   
   transformed_line(string: STRING): STRING is 
      --strip comments and whitespace from parseable line      
      do  
         Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
      end -- transformed_line
   
   number_list: NUMBER_ARRAY_LIST_3;
   
   state: INTEGER;
   
   Parsing_historical_data: INTEGER is unique;
   
   Parsing_commands: INTEGER is unique;
   
   Command_normality_check_on_series: STRING is 
      once 
         Result := "normality_check_on_series";
      end -- Command_normality_check_on_series
   
   Command_multiple_regression_prediction: STRING is 
      once 
         Result := "multiple-regression-prediction";
      end -- Command_multiple_regression_prediction
   
   historical_data_terminator: STRING is "stop";
   
   double_from_string(string: STRING): DOUBLE is 
      require 
         string.is_double or string.is_integer; 
      do  
         if string.is_double then 
            Result := string.to_double;
         elseif string.is_integer then 
            Result := string.to_integer.to_double;
         end; 
      end -- double_from_string

feature {ANY} --parsing

   reset is 
      --resets the parser and makes it ready to go again
      do  
         state := Parsing_historical_data;
         number_list.make;
      end -- reset
   
   make is 
      do  
         !!number_list.make;
         reset;
      end -- make
   
   parse_last_line_as_historical_data is 
      --interpret last_line as a pair of comma-separated values
      local 
         error_log: ERROR_LOG;
         this_line_numbers: NUMERIC_ARRAY;
      do  
         !!error_log.make;
         if last_line_is_valid_historical_data then 
            this_line_numbers := numbers_from_string(last_line);
            number_list.add_entry(this_line_numbers);
         else 
            error_log.log_error("Invalid historical data: " + last_line + "%N");
         end; 
      end -- parse_last_line_as_historical_data
   
   parse_last_line_as_end_of_historical_data is 
      --interpret last line as the end of historical data
      require 
         last_line.compare(historical_data_terminator) = 0; 
      local 
         i: INTEGER;
      do  
         state := Parsing_commands;
         std_output.put_string("Historical data read; " + number_list.entry_count.to_string + "items read%N");
      end -- parse_last_line_as_end_of_historical_data
   
   parse_last_line_as_command is 
      --interpret the last line as a command
      local 
         command: STRING;
         argument_string: STRING;
         arguments: ARRAY[STRING];
      do  
         command := string_before_separator(last_line," ");
         argument_string := string_after_separator(last_line," ");
         arguments := split_string(argument_string," ");
         if command.is_equal(Command_normality_check_on_series) then 
            print_normalization_check_on_series(arguments.first.to_integer + 1);
         elseif command.is_equal(Command_multiple_regression_prediction) then 
            print_multiple_regression_prediction(numbers_from_string(argument_string));
         else 
            std_output.put_string("Unrecognized command string: " + last_line);
         end; 
      end -- parse_last_line_as_command
   
   parse_last_line is 
      --parse the last line according to state
      do  
         if not last_line.empty then 
            if state = Parsing_historical_data then 
               if last_line.compare(historical_data_terminator) = 0 then 
                  parse_last_line_as_end_of_historical_data;
               else 
                  parse_last_line_as_historical_data;
               end; 
            else 
               parse_last_line_as_command;
            end; 
         end; 
      end -- parse_last_line
   
   last_line_is_valid_historical_data: BOOLEAN is 
      --whether last line is valid historical data
      local 
         substrings: ARRAY[STRING];
         i: INTEGER;
      do  
         substrings := split_string(last_line,",");
         Result := true;
         if number_list.entry_count > 0 and substrings.count /= number_list.field_count then 
            Result := false;
         end; 
         from 
            i := substrings.lower;
         until 
            not substrings.valid_index(i)
         loop 
            if not (substrings.item(i).is_double or substrings.item(i).is_integer) then 
               Result := false;
            end; 
            i := i + 1;
         end; 
      end -- last_line_is_valid_historical_data
   
   split_string(string_to_split, separator: STRING): ARRAY[STRING] is 
      --a list of components of a string, separated by the given 
      --separator, ie split_string( "1,2,3", "," ) = [ "1", "2", "3" ]
      local 
         prior_to_separator: STRING;
         remainder: STRING;
         split_remainder: ARRAY[STRING];
         i: INTEGER;
      do  
         prior_to_separator := string_before_separator(string_to_split,separator);
         remainder := string_after_separator(string_to_split,separator);
         !!Result.make(1,0);
         Result.add_last(prior_to_separator);
         if remainder.count > 0 then 
            split_remainder := split_string(remainder,separator);
            from 
               i := split_remainder.lower;
            until 
               not split_remainder.valid_index(i)
            loop 
               Result.add_last(split_remainder.item(i));
               i := i + 1;
            end; 
         end; 
      end -- split_string
   
   string_before_separator(string_to_split, separator: STRING): STRING is 
      --the part of a string which comes before the separator, or 
      --the whole string if it's not found
      local 
         separator_index: INTEGER;
      do  
         separator_index := string_to_split.substring_index(separator,1);
         if separator_index = 0 then 
            --not found; copy whole string
            Result := string_to_split.twin;
         else 
            Result := string_to_split.substring(1,separator_index - 1);
         end; 
      end -- string_before_separator
   
   string_after_separator(string_to_split, separator: STRING): STRING is 
      --the part of the string after the separator,
      local 
         separator_index: INTEGER;
      do  
         separator_index := string_to_split.substring_index(separator,1);
         if separator_index = 0 then 
            --not found; result is empty
            !!Result.make_from_string("");
         else 
            Result := string_to_split.substring(separator_index + separator.count,string_to_split.count);
         end; 
      end -- string_after_separator
   
   numbers_from_string(string_to_split: STRING): NUMERIC_ARRAY is 
      --an array of numbers, from a string of numbers separated by commas
      local 
         number_strings: ARRAY[STRING];
         i: INTEGER;
      do  
         !!Result.make_empty;
         number_strings := split_string(string_to_split,",");
         from 
            i := number_strings.lower;
         until 
            not number_strings.valid_index(i)
         loop 
            check 
               number_strings.item(i).is_double or number_strings.item(i).is_integer; 
            end;
            Result.add_last(double_from_string(number_strings.item(i)));
            i := i + 1;
         end; 
      end -- numbers_from_string
   
   print_normalization_check_on_series(field_index: INTEGER) is 
      do  
         std_output.put_string("Normalization check on series ");
         std_output.put_integer(field_index);
            --std_error.put_string( number_list.sorted_by( field_index ).table_as_string + "%N" )
            --std_error.put_string( number_list.normalized_series_table( field_index ).table_as_string )
         std_output.put_string("%NQ: ");
         std_output.put_double(number_list.chi_squared_q(field_index));
         std_output.put_string("%N(1-p): ");
         std_output.put_double(number_list.chi_squared_1_minus_p(field_index));
         std_output.put_new_line;
      end -- print_normalization_check_on_series
   
   print_multiple_regression_prediction(parameters: NUMERIC_ARRAY) is 
      --print a multiple regression prediction, including all beta 
      --values and prediction ranges
      do  
         if number_list.entry_count <= number_list.field_count then 
            std_output.put_string("Too few entries for multiple regression prediction%N");
         elseif parameters.count /= number_list.field_count - 1 then 
            std_output.put_string("Incorrect number of parameters for multiple regression.  ");
            std_output.put_string("Expected " + (number_list.field_count - 1).to_string + ", got " + parameters.count.to_string + "%N");
         else 
            std_output.put_string("Multiple regression prediction.%N");
            std_output.put_string("Solution: " + number_list.to_gaussian_solver.solution.to_string + "%N");
            std_output.put_string("Variance: " + number_list.full_variance.to_string + "%N");
            std_output.put_string("Standard Deviation: " + number_list.full_standard_deviation.to_string + "%N");
            std_output.put_string("Prediction from values: " + parameters.to_string + "%N");
            std_output.put_string("Projected result: " + number_list.prediction(parameters).to_string + "%N");
            std_output.put_string("UPI( 70%% ): " + number_list.upper_prediction_interval(0.7,parameters).to_string + "%N");
            std_output.put_string("LPI( 70%% ): " + number_list.lower_prediction_interval(0.7,parameters).to_string + "%N");
            std_output.put_string("UPI( 90%% ): " + number_list.upper_prediction_interval(0.9,parameters).to_string + "%N");
            std_output.put_string("LPI( 90%% ): " + number_list.lower_prediction_interval(0.9,parameters).to_string + "%N");
         end; 
      end -- print_multiple_regression_prediction

end -- class NUMBER_ARRAY_LIST_PARSER_3

Code Review

Again, sloppy processes; I let several easy ones get past me (including several instances of my personal bugaboo, forgetting parentheses on zero-argument features). Embarassing.

Compile

Twice as many defects faced my in compile as I'd expected, both on cycle 1 and cycle 2. My review procedures are not doing well.

Test

Test took almost twice as long as predicted, thanks to a couple of sneaky little problems which could have been caught sooner (such as by using defined methods in my design review!). Finally, however, I got reasonable output:

Table 11-3. D17: Test Results Format -- Program 10A

TestParameterExpected ValueActual Value - C++Actual Value - Eiffel
Table D16    
 Beta00.566450.5664570.566457
 Beta10.065330.0653290.065329
 Beta20.0087190.0087190.008719
 Beta30.151050.1510490.151049
 Projected Hours20.7620.757420.757369
 UPI(70%)26.8926.888826.888841
 LPI(70%)14.6314.625914.625897
 UPI(90%)33.6733.673133.673056
 LPI(90%)7.847.841687.841682
Appendix A    
 Beta06.70136.7013376.701337
 Beta10.07840.0783660.078336
 Beta20.01500.0150410.015041
 Beta30.24610.2460560.246056
 Projected Hours140.9140.902140.901986
 UPI(70%)179.7179.752179.751770
 LPI(70%)102.1102.052102.052202
 UPI(90%)222.74222.737222.737419
 LPI(90%)59.0659.066659.066552

Postmortem

PSP3 Project Plan Summary

Table 11-4. Project Plan Summary

Student:Victor B. PutzDate:000206
Program:Multiple RegressionProgram#10A
Instructor:WellsLanguage:C++
SummaryPlanActualTo date
Loc/Hour 4645.646
Planned time285 1228
Actual time 4981746
CPI (cost/performance index)  0.70
%reused766751
Test Defects/KLOC302429
Total Defects/KLOC140164145
Yield (defects before test/total defects)798580
% Appraisal COQ613.77
% Failure COQ2829.528.5
COQ A/F Ratio0.210.460.25
Program SizePlanActualTo date
Base2020 
Deleted00 
Modified11 
Added238378 
Reused8048042502
Total New and Changed2393792110
Total LOC106212024874
Upper Prediction Interval (70%)326  
Lower Prediction Interval (70%)152  
Time in Phase (min):PlanActualTo DateTo Date%
Planning576551423
High-Level Design038381
High-Level Design Review044< 1
Detailed Design407238614
Detailed Design Review6561094
Code718865624
Code Review912853
Compile17511837
Test639660322
Postmortem20161766
Total2854982754100
Total Time UPI (70%)322   
Total Time LPI (70%)249   
Defects Injected ActualTo DateTo Date %
Plan 000
High-Level Design0000
High-Level Design Review0000
Detailed Design11209732
Design Review0000
Code224020066
Code Review0000
Compile003< 1
Test0252
Total development3362305100
Defects Removed ActualTo DateTo Date %
Planning 000
High-Level Design0000
High-Level Design Review0000
Detailed Design 000
Design Review28238
Code685317
Code Review362910
Compile153113845
Test796220
Total development3362305100
After Development000 
Defect Removal EfficiencyPlanActualTo Date 
Defects/Hour - Design Review22.58.512.66 
Defects/Hour - Code Review24.23020.5 
Defects/Hour - Compile56.336.545.2 
Defects/Hour - Test6.95.66.2 
DRL (design review/test)3.261.542.04 
DRL (code review/test)3.55.363.3 
DRL (compile/test)8.166.527.3 
Eiffel code/compile/test
Time in Phase (min)ActualTo DateTo Date %
Code6543448
Code Review14546
Compile2615918
Test5624828
Total161895100
Defects InjectedActualTo DateTo Date %
Design6127
Code3016893
Compile000
Test01< 1
Total36181100
Defects RemovedActualTo DateTo Date %
Code232
Code Review42011
Compile2110960
Test94927
Total36181100
Defect Removal EfficiencyActualTo Date 
Defects/Hour - Code Review17.1422.2 
Defects/Hour - Compile48.4641.1 
Defects/Hour - Test9.6411.9 
DRL (code review/test)1.81.9 
DRL (compile/test)5.033.5 

Time Recording Log

Table 11-5. Time Recording Log

Student: Date:000204
  Program: 
StartStopInterruption TimeDelta timePhaseComments
000204 07:43:21000204 08:50:11264plan 
000205 09:12:21000205 09:50:14037high-level design 
000205 09:51:42000205 09:55:3803high-level design review#cycle 1
000206 09:42:35000206 11:07:554342design 
000206 11:34:37000206 12:21:08046design review 
000206 12:28:47000206 14:20:535755code 
000206 14:20:54000206 14:28:1607code review 
000206 14:28:48000206 14:52:31023compile 
000206 14:52:34000206 15:20:15027test#cycle 2
000206 15:25:24000206 15:55:05029design 
000206 15:56:18000206 16:10:4759design review 
000206 16:10:49000206 16:51:55734code 
000206 16:52:00000206 16:56:1304code review 
000206 16:59:43000206 17:26:32026compile 
000206 17:26:39000206 18:34:56068test 
000206 18:45:02000206 19:00:35015postmortem 
      

Table 11-6. Time Recording Log

Student: Date:000207
  Program: 
StartStopInterruption TimeDelta timePhaseComments
000207 18:43:01000207 19:20:52235code 
000207 19:27:21000207 19:34:5407code review 
000207 19:35:03000207 19:43:0408compile 
000207 19:43:47000207 20:03:43019test#cycle 2
000207 20:05:57000207 20:35:38029code 
000207 20:36:21000207 20:42:2406code review 
000207 20:42:27000207 21:01:18018compile 
000207 21:02:31000207 21:38:25035test 
      

Defect Recording Log

Table 11-7. Defect Recording Log

Student: Date:000204
  Program: 
Defect foundTypeReasonPhase InjectedPhase RemovedFix timeComments
000206 11:37:47waigdesigndesign review5changed reduction_coefficients algorithm to properly produce a row of reduced coefficients
000206 11:48:13wnigdesigndesign review1was using obsolete mult_and_sum_array instead of mult_array and sum_array
000206 11:50:27wnigdesigndesign review1again, using obsolete mult_and_sum_array
000206 11:53:00mdigdesigndesign review1added implementation for mult_array_by_scalar and add_arrays
000206 11:56:45macmdesigndesign review0forgot to increment loop counter
000206 11:58:27waigdesigndesign review22factored array tail/head/sum etc logic into numeric_array class, changed base of number_array_list to reflect
000206 12:30:33wtigdesigncode2had designated sum's return type as numeric_array instead of like item
000206 12:41:56ctomdesigncode1forgot some ensure contracts
000206 12:54:21wtomdesigncode1minor-- declared no return type for first_solution_coefficient
000206 12:58:33ctomdesigncode3improved contracts on add_coefficients
000206 13:02:48maigdesigncode1missed inclusion of reduced_solver.solution in solution
000206 13:16:47waigdesigncode1missed correct extraction of first_row_arguments from coefficients
000206 14:24:42sytycodecode review0forgot parentheses on no-arg feature
000206 14:25:15sytycodecode review0forgot parentheses on no-arg feature
000206 14:25:41syomcodecode review0forgot const on features
000206 14:27:04wncmcodecode review0was using coefficients.size rather than first_row.size
000206 14:31:45wnomcodecompile0forgot to rename number_array_list to number_array_list_3, etc.
000206 14:32:20syomcodecompile0forgot to #include gaussian_solver.h
000206 14:33:22wncmcodecompile0gaussian_solver uses add_coefficients, not add_entry...
000206 14:34:20wncmcodecompile0didn't change std::vector< double > to numeric_array
000206 14:39:18sycmcodecompile0forgot to put in a missing variable name
000206 14:40:06sytycodecompile0used = instead of == for comparison
000206 14:40:25sytycodecompile0mistakenly put "." instead of "_" in variable name
000206 14:41:22macmcodecompile0forgot to return Result
000206 14:41:35sycmcodecompile0forgot parentheses on no-arg feature
000206 14:43:37wncmcodecompile0used tail() instead of end() for last iterator in array
000206 14:44:22sycmcodecompile0forgot parentheses on no-arg feature
000206 14:44:44sycmcodecompile0forgot to #include stdio
000206 14:45:34sytycodecompile0didn't separate class and feature with scope operator
000206 14:50:42iuigcodecompile1lists in C++ use different interface (not scalar-index)
000206 14:53:33wncmcodetest7for fully-reduced solution, was calculating the inverse of the correct answer
000206 15:02:10wtcmcodetest17OH, COME ON! I accidentally declared the Result type as bool, and the compiler never warned me, so the Result was always 1...
000206 16:00:20wcigcodedesign review0wrong loop termination (i+1, not i)
000206 16:02:04mcigcodedesign review0missed sqrt on subtotal for prediction_range
000206 16:18:34ctigdesigncode0missed ensure contract in gaussian_solver_term
000206 16:27:29ctigdesigncode0didn't have ensure contract in variance_series
000206 16:53:58sytycodecode review0misplaced parentheses
000206 16:54:55macmcodecode review0didn't return Result
000206 17:00:14wtcmcodecompile0full_variance_series was returning incorrect type (number_array_list instead of number_array_list_3)
000206 17:06:40syigcodecompile3grrr... C++ can't handle recursive class definitions, so had to break off a separate gaussian_solver_implementation
000206 17:11:54sycmcodecompile0parentheses on no-arg feature
000206 17:12:11sycmcodecompile0parentheses on no-arg feature
000206 17:12:58mccmcodecompile0forgot to sum a series
000206 17:13:28sycmcodecompile0parentheses on no-arg feature
000206 17:13:55sytycodecompile0mistyped variable name
000206 17:14:15sytycodecompile0:= instead of = for assignment
000206 17:14:59wncmcodecompile0wrong name for a feature-- mean_by_series instead of by field!
000206 17:15:52iucmcodecompile0forgot to pass field index to mapped_to!
000206 17:17:18wacmcodecompile0was selecting a series when I shouldn't have been
000206 17:17:52sycmcodecompile0parentheses, no-arg feature
000206 17:18:49wacmcodecompile0tried sum instead of sum_by_field
000206 17:19:40sycmcodecompile0parentheses, no-arg feature
000206 17:20:14sycmcodecompile0parentheses, no-arg feature
000206 17:22:56syomcodecompile0did not #include t_distribution.h
000206 17:25:12syomtestcompile0extracting the implementation details from gaussian_solver means I have to write TWO access functions (const/non-const) for coefficients. Lucky me.
000206 17:28:57iuomdesigntest5multiplied_by_list requires field counts to be the same, so you must select the series first. I bet I'll see this again.
000206 17:36:00waigdesigntest5multiplied_by_list was broken; wasn't using rhs.tail
000206 17:42:13mdigtesttest17Ah, yes... when we converted gaussian_solver to a dynamic implementation, we forgot to make a copy constructor. Hijinks ensued
000206 18:01:12ctigdesigntest0bad contract on gauss_solution == field_count - 1; should be == field_count
000206 18:02:00ctigdesigntest1...same contract; entry_count > 0 should imply the previous contract
000206 18:05:07wcomdesigntest16variance was using entry_count - 4 instead of entry_count - field_count
000206 18:22:21wacmdesigntest7DANG! Missed obvious incorrect assignment in prediction_range( the numerator calculation )
       

Table 11-8. Defect Recording Log

Student: Date:000207
  Program: 
Defect foundTypeReasonPhase InjectedPhase RemovedFix timeComments
000207 19:01:00ctigdesigncode0contract in add_coefficients needed work
000207 19:29:27wntycodecode review0used count instead of sum in calculation
000207 19:29:55miomcodecode review2forgot to implement make_empty
000207 19:35:37sycmcodecompile0forgot "then" after "if"
000207 19:36:01sycmcodecompile0forgot colon between feature and its return value
000207 19:37:15wtcmcodecompile1forgot-- had to update number_array_list_3 to use numeric_array
000207 19:39:03wncmcodecompile0mistyped item as Item
000207 19:41:46syigcodecompile0can't list a non-exported feature in the require clause of an exported feature
000207 19:42:24wncmcodecompile0mistyped possible_solution as test_solution
000207 19:44:06wcigcodetest1testing for equality with doubles is dangerous... used in_range and it got better.
000207 19:45:52mccmcodetest3forgot that manifest strings in Eiffel are not constant!
000207 19:49:54exigcodetest11bad test data
000207 20:01:47wccmdesigntest1is_valid_solution was using count + 1 instead of count - 1 for valid number of terms in possible solution
000207 20:19:57waigdesigncode1same as in the C++ version-- bad alg in design for gaussian_solver_term
000207 20:37:21macmcodecode review0forgot to increment loop counter
000207 20:41:09wacmdesigncode review0wrong algorithm in prediction_range (same as C++ error)
000207 20:42:36sycmcodecompile0= instead of :=...
000207 20:43:22sycmdesigncompile0poorly mismatched parentheses
000207 20:44:14sycmcodecompile0separated different-type parameters by , instead of ;
000207 20:45:08wncmcodecompile0wrong name for variance; should have been full_variance, etc.
000207 20:46:26syigcodecompile4forgot-- you don't separate "else if"
000207 20:51:48wntycodecompile0mistyped a name
000207 20:52:18wncmcodecompile0silly person! mistyped function name
000207 20:53:10wncmcodecompile1renamed "prediction_values" to "parameters"
000207 20:54:34wncmcodecompile0typed "sum_by_series" instead of "sum_by_field"
000207 20:55:15wcigcodecompile0er... can't use "0 < x < 4", must be "0 < x and x < 4" etc
000207 20:55:56wncmcodecompile0had to change the variance, standard_deviation, etc calls to full_variance, etc.
000207 20:56:58mccmcodecompile0had to separate make_from_entry and suffixed_by_list
000207 20:58:16mccmcodecompile0didn't initialize t correctly
000207 20:59:40iucmcodecompile0wasn't declaring a field to use in mapped_to
000207 21:00:06mccmcodecompile0wasn't doing appropriate to_string calls to print doubles
000207 21:02:34macmcodetest5er... durn thang; eiffel arrays start at index 1, not 0. This will probably cause me pain.
000207 21:11:43waigcodetest0was not allowing zero entries for full_variation_series, disrupting recursion
000207 21:14:58wacmdesigntest0was calculating predictions wrong!
000207 21:16:28macmcodetest0missed initialization of square!
000207 21:18:11wacmcodetest19used / instead of - for a calculation. Stupid, stupid... (and this was the most time-consuming)