The Personal Software Process: an Independent Study
Prev	Chapter 10. Lesson 10: Software Design, part II	Next

Program 9a: Normal Distribution fit

Using a linked list, write a program to do a chi-squared test for a normal distribution.

Given Requirements

Requirements: Write program 9a to calculate the degree to which a string of n real numbers is normally distributed. The methods and formulas used in this calculation are explained on page 529 [of the text]. Use program 5a to calculate the values of the normal and chi-squared distributions. Assume than n is > 20 and an even multiple of 5. Use program 8a to sort the numbers in ascending order.

Testing: Thoroughly test the program. As one test, use the LOC/Method data in table D14 as a test case [note: this is the same data from program 8a]. Here, the result should be Q=34.4 with a probability value < 0.005 that the data are normally distributed. The solution to this case is shown [in the text]. Submit a test report that includes the test results and uses the format in table D15

Table 10-1. D15: Test Results Format -- program 9a

Test	Expected Result	Actual Result
Table D14
Q	34.4
1-p	7.60*10^-5

--[Humphrey95]

Planning

Requirements

I'll make heavy reuse of the parsing and number array list classes used in program 8a, which means that the data input format will remain the same (note-- in program 8a, I suggested that the first usable line of the input file be a single number, the number of fields in the file; since the number of fields can be extrapolated from the first historical data entry, this requirement has been removed).

So much for getting the numbers into the program. Where we run into a problem is the next bit-- which series are we going to check for normality? If this is going to be reused, how shall we decide which series to use?

The simple response would be to simply check all the series, but I'll do something slightly different-- after the historical data terminator, I'll make the input consist of commands and numbers; in other words, something like this:

...
58, 19.33
305, 25.42
stop -- end of data
normality-check-on-series 1 -- request a normality check on the first series

This will allow us to add further commands in the future, paving the way for some minor expandibility (considering that there is only one more program in this series, that may be unnecessary).

When the normality-check-on-series command is encountered, the program will perform a normality check on the given series (series 1 being the first set of numbers, and so on), printing the Q and 1-p results, thusly:

Normality check on series: 1
Q: 34.4
(1-p): 7.6e-5

Size Estimate

To do the size estimate-- and to make it worthwhile-- I'm going to do a bit more of a conceptual design than I have earlier, listing my appropriate classes and the methods I think I'll need. The algorithm in use in program 9a is considerably more complex than those used in other programs (or at least it seems to me), so I need to do more preparation here.

A quick conceptual design using Dia gives us the following:

Preliminary design diagram, using a form of UML

Note the heavy use of single_variable_function subclasses (it's not depicted well, but each class in the preliminary design except for number_array_list_2 and number_array_list_parser_2 is a subclass of single_variable_function. The normal_distribution_inverse class will be used to calculate the values of the normal distribution for S segments. The tex lists these for certain values of S (table A24 in [Humphrey95]), and we can use the table (since the requirements guarantee the numbers will be divisible by 5) but this will be more general.

With that in mind, our size estimating template gives us a PROBE-generated estimate of 239 new/changed LOC.

Resource/Time Estimate

Historical data gives us a PROBE-generated estimate of 285 minutes for this project.

Development

Design

Once again, I'm using a cluster of tiny single_variable_function classes to do much of the work. I also reconstruct the "by-hand" tables as per the algorithm in [Humphrey95]-- not because they are necessary for the computation, but because they will simplify testing by showing results similar to those in the text. Our preliminary design is fleshed out as follows:

Design diagram, using a form of UML

Design Review

Actually, not all that many defects were caught by the design review this time around, and few of those were show-stoppers. The basic design seems reasonable.

Code

Reused Code

The error_log, gamma_function, is_double_equal, normal_function_base, simple_input_parser, simpson_integrator, square, and whitespace_stripper modules were used in their entirety. Much code from number_array_list and number_array_list_parser was reused, but those classes are reprinted here since much was changed in number_array_list.

Part of the design had to be changed because I had originally planned to inherit from number_array_list to create number_array_list_2-- unfortunately, many of the functions returned by number_array_list did not (of course) return values of number_array_list_2-- meaning that I couldn't chain together functions as easily as I wanted to, which was unacceptable; as a result, I just expanded the original class. I did not have a chance to verify if "like current" in Eiffel would fix these problems (it may well have done so!)

adder

#ifndef ADDER_H
#define ADDER_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class adder : public single_variable_function
{
 protected:
  double addend;
 public:
  adder( double new_addend );
  virtual double at( double x ) const;
};


#endif

#include "adder.h"

adder::adder( double new_addend ) :
addend( new_addend )
{
}

double
adder::at( double x ) const
{
  return x + addend;
}

chi_squared_base

/*
*/

#ifndef CHI_SQUARED_BASE_H
#define CHI_SQUARED_BASE_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class chi_squared_base : public single_variable_function
{
 protected:
  int n;
 public:
  virtual double at( double x ) const;
  double base( void ) const;
  chi_squared_base( int new_n );
};

#endif


/*
*/

/*
*/

#include "chi_squared_base.h"

#ifndef GAMMA_FUNCTION_H
#include "gamma_function.h"
#endif

#include <math.h>
#include <iostream>

double
chi_squared_base::at( double x )  const
{
  const double Result = base()
    * pow( x, ( static_cast<double>(n) / 2.0 - 1.0 ) )
    * exp( -x / 2.0 );
//    if ( x == 0 || x == 10 )
//      {
//        gamma_function gamma;
//        cout << "Chibase at " << x << "\n"
//  	   << "n: " << n << "\n"
//  	   << "2^(n/2) " << pow(2.0,static_cast<double>(n)/2.0) << "\n"
//  	   << "gamma(n/2)" << gamma.at( static_cast<double>(n)/2.0 ) << "\n"
//  	   << "Base: " << base() << "\n"
//  	   << "x^(n/2 - 1): " << pow( x, ( static_cast< double >(n) / 2.0 - 1.0 ) ) << "\n"
//  	   << "e^(-x/2) :" << exp( -x / 2.0 ) << "\n";
//      }
  return Result;
}

chi_squared_base::chi_squared_base( int new_n ) :
n ( new_n )
{
}

double
chi_squared_base::base( void ) const
{
  gamma_function gamma;
  const double Result = 1 / ( pow( 2.0, static_cast<double>(n)/2.0 ) * gamma.at( static_cast<double>(n)/2.0 ) );
  return Result;
}

/*
*/

chi_squared_integral

#ifndef CHI_SQUARED_INTEGRAL_H
#define CHI_SQUARED_INTEGRAL_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class chi_squared_integral : public single_variable_function
{
 protected:
  int n;
 public:
  virtual double at( double x ) const;
  chi_squared_integral( int new_n );
};

#endif

/*
*/

#include "chi_squared_integral.h"

#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef SIMPSON_INTEGRATOR_H
#include "simpson_integrator.h"
#endif

#include <math.h>

double
chi_squared_integral::at( double x ) const
{
  REQUIRE( x >= 0 );
  simpson_integrator homer;
  chi_squared_base chi_base( n );
  double Result = homer.integral( chi_base, 0, x );
  return Result;
}

chi_squared_integral::chi_squared_integral( int new_n ) :
n( new_n )
{
  REQUIRE( new_n > 0 );
}

/*
*/

normal_distribution_inverse

#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#define NORMAL_DISTRIBUTION_INVERSE_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class normal_distribution_inverse : public single_variable_function
{
 public:
  virtual double at( double x ) const;
  double next_guess( double arg, double last_result, double target ) const;
};


#endif

/*
*/

#include "normal_distribution_inverse.h"

#ifndef NORMAL_DISTRIBUTION_INTEGRAL_H
#include "normal_distribution_integral.h"
#endif

#include <math.h>

double
normal_distribution_inverse::at( double x ) const
{
  const double target = x;
  const double error_margin = 0.0000001;
  double last_error = 0;
  double Result = 0;
  double this_result = 0;
  bool has_tried_once = false;
  normal_distribution_integral norm;
  while ( !has_tried_once || ( last_error > error_margin ) )
    {
      const double last_result = this_result;
      this_result = norm.at( Result );
      last_error = fabs( this_result - target );
      if ( last_error > error_margin ) 
	{
	  Result = next_guess( Result, last_result, target );
	}
      has_tried_once = true;
    }
  return Result;
}


double
normal_distribution_inverse::next_guess( double arg, double last_result, double target ) const
{
  double Result = arg + ( target - last_result );
  return Result;
}

/*
*/

normalizer

#ifndef NORMALIZER_H
#define NORMALIZER_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class normalizer : public single_variable_function
{
 protected:
  double data_mean;
  double standard_deviation;
 public:
  normalizer( double new_data_mean, double new_standard_deviation );
  virtual double at( double x ) const;
};

#endif

#include "normalizer.h"

normalizer::normalizer( double new_data_mean, double new_standard_deviation ) :
data_mean( new_data_mean ),
standard_deviation( new_standard_deviation )
{
}

double
normalizer::at( double x ) const
{
  return ( x - data_mean ) / standard_deviation;
}

number_array_list

/*
*/

#ifndef NUMBER_ARRAY_LIST_H
#define NUMBER_ARRAY_LIST_H

#include <list>
#include <vector>
#include <string>

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class number_array_list : public std::list< std::vector< double > >
{
 protected:
  int m_field_count;

 public:
  int field_count( void ) const;
  std::vector< double > head( void ) const;
  number_array_list tail( void ) const;
  number_array_list suffixed_by_item( const std::vector< double >& entry ) const;
  number_array_list suffixed_by_list( const number_array_list& rhs ) const;
  number_array_list mapped_to( int field_index, const single_variable_function& f ) const;
  number_array_list multiplied_by_list( int field_index, const number_array_list& rhs ) const;
  number_array_list sorted_by( int field_index ) const;
  number_array_list items_less_than( int field_index, double value ) const;
  number_array_list items_greater_than_or_equal_to( int field_index, double value ) const;

  bool is_valid_entry( const std::vector< double >& entry ) const;
  bool is_valid_field_index( int field_index ) const;
  bool is_sorted_by( int field_index ) const;
  double sum_by_field( int field_index ) const;
  double mean_by_field( int field_index ) const;
  int entry_count( void ) const;
  
  void set_field_count( int new_field_count );
  void add_entry( const std::vector< double >& entry );

  void make_from_entry( const std::vector< double >& entry );
  void make( void );
  void make_from_list( const number_array_list& rhs );
  number_array_list( void );

  double standard_deviation( int field_index, bool is_full_population ) const;
  double variance( int field_index, bool is_full_population ) const;
  number_array_list normalized_series( int field_index ) const;
  int normalization_s_value( void ) const;
  number_array_list normalized_series_table( int field_index ) const;
  number_array_list normal_distribution_segments( int s ) const;
  number_array_list chi_squared_table( int field_index ) const;
  double chi_squared_q( int field_index ) const;
  double chi_squared_1_minus_p( int field_index ) const;
  number_array_list select_series( int field_index ) const;
  int items_in_range( int field_index, double lower_limit, double upper_limit ) const;
  std::string entry_as_string( const std::vector< double >& entry ) const;
  std::string table_as_string( void ) const;

};

#endif

/*
*/

/*
*/

#include "number_array_list.h"

#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef ADDER_H
#include "adder.h"
#endif
#ifndef SQUARE_H
#include "square.h"
#endif
#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#include "normal_distribution_inverse.h"
#endif
#ifndef CHI_SQUARED_INTEGRAL_H
#include "chi_squared_integral.h"
#endif
#ifndef NORMALIZER_H
#include "normalizer.h"
#endif

#include <math.h>
#include <stdio.h>

int
number_array_list::field_count( void ) const
{
  return m_field_count;
}

std::vector< double >
number_array_list::head( void ) const
{
  REQUIRE( entry_count() >= 1 );
  return *(begin());
}

number_array_list
number_array_list::tail( void ) const
{
  number_array_list Result;
  if ( entry_count() > 1 )
    {
      number_array_list::const_iterator tail_head = begin();
      ++tail_head;
      Result.insert( Result.begin(), tail_head, end() );
      Result.m_field_count = field_count();
      ENSURE( Result.entry_count() == entry_count() - 1 );
    }
  return Result;
}

number_array_list
number_array_list::suffixed_by_item( const std::vector< double >& entry ) const
{
  number_array_list Result;
  Result.make_from_list( *this );
  Result.add_entry( entry );
  ENSURE( Result.entry_count() == entry_count() + 1 );
  return Result;
}

number_array_list
number_array_list::suffixed_by_list( const number_array_list& rhs ) const
{
  number_array_list Result;
  Result.make_from_list( *this );
  for ( number_array_list::const_iterator iter = rhs.begin(); iter != rhs.end(); ++iter )
    {
      Result.add_entry( *iter );
    }
  ENSURE( Result.entry_count() == entry_count() + rhs.entry_count() );
  return Result;
}

number_array_list
number_array_list::mapped_to( int field_index, const single_variable_function& f ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector<double> new_entry = head();
      new_entry[ field_index ] = f.at( head()[ field_index ] );
      Result.make_from_list( Result.suffixed_by_item( new_entry ).suffixed_by_list( tail().mapped_to( field_index, f )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::multiplied_by_list( int field_index, const number_array_list& rhs ) const
{
  REQUIRE( entry_count() == rhs.entry_count() );
  REQUIRE( field_count() == rhs.field_count() );
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector< double > new_entry = head();
      new_entry[ field_index ] = head()[ field_index ] * rhs.head()[field_index];
      Result.make_from_list( Result.suffixed_by_item( new_entry ).
			      suffixed_by_list( tail().multiplied_by_list( field_index, rhs ) ) );
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( is_sorted_by( field_index ) )
    {
      Result.make_from_list( *this );
    }
  else
    {
      Result.make_from_list( tail().items_less_than( field_index, head()[field_index] ).sorted_by( field_index ).
			     suffixed_by_item( head() ).
			     suffixed_by_list( tail().items_greater_than_or_equal_to( field_index, head()[field_index] ).sorted_by( field_index )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::items_less_than( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] < value )
	{
	  Result.add_entry( *iter );
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

number_array_list
number_array_list::items_greater_than_or_equal_to( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] >= value )
	{
	  Result.add_entry(*iter);
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

bool
number_array_list::is_valid_entry( const std::vector<double>& entry ) const
{
  bool Result = false;
  if ( ( entry_count() == 0 ) 
       || ( ( entry_count() > 0 ) && ( entry.size() == field_count() ) ) )
    {
      Result = true;
    }
  return Result;
}

bool
number_array_list::is_valid_field_index( int field_index ) const
{
  bool Result = true;
  if ( entry_count() > 0 )
    {
      Result = ( 0 <= field_index ) && ( field_index < m_field_count );
    }
  return Result;
}

bool
number_array_list::is_sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  bool Result = true;
  if ( entry_count() > 1 )
    {
      Result = ( head()[field_index] < tail().head()[field_index] ) && tail().is_sorted_by( field_index );
    }
  return Result;
}

double
number_array_list::sum_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) )
  double Result = 0;
  if ( entry_count() > 0 )
    {
      Result = head()[field_index] + tail().sum_by_field( field_index );
    }
  return Result;
}

double
number_array_list::mean_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  REQUIRE( entry_count() > 0 );
  double Result = sum_by_field( field_index ) / static_cast<double>( entry_count() );
  return Result;
}

int
number_array_list::entry_count( void ) const
{
  return size();
}

void
number_array_list::set_field_count( int new_field_count )
{
  REQUIRE( entry_count() == 0 );
  m_field_count = new_field_count;
  ENSURE( field_count() == new_field_count );
}

void
number_array_list::add_entry( const std::vector< double >& entry )
{
  if ( entry_count() == 0 )
    {
      set_field_count( entry.size() );
    }
  REQUIRE( is_valid_entry( entry ) );
  const int old_entry_count = entry_count();
  push_back( entry );
  ENSURE( entry_count() == old_entry_count + 1 );
}

void
number_array_list::make_from_entry( const std::vector<double>& entry )
{
  make();
  add_entry( entry );
  ENSURE( entry_count() == 1 );
  ENSURE( head() == entry );
}

void
number_array_list::make( void )
{
  clear();
  m_field_count = -1;
  ENSURE( m_field_count == -1 );
  ENSURE( entry_count() == 0 );
}

void
number_array_list::make_from_list( const number_array_list& rhs )
{
  make();
  insert( begin(), rhs.begin(), rhs.end() );
  m_field_count = rhs.field_count();
  ENSURE( entry_count() == rhs.entry_count() );
}

number_array_list::number_array_list( void )
{
  make();
}

double
number_array_list::standard_deviation( int field_index, bool is_full_population ) const
{
  double Result = sqrt( variance( field_index, is_full_population ) );
  return Result;
}

double
number_array_list::variance( int field_index, bool is_full_population ) const
{
  adder an_adder( -mean_by_field( field_index ) );
  square a_square;
  if ( is_full_population )
    {
      CHECK( entry_count() > 0 );
    }
  else
    {
      CHECK( entry_count() > 1 );
    }
  const double divisor = static_cast< double >( is_full_population ? entry_count() : entry_count() - 1 );
  const double Result = mapped_to( field_index, an_adder ).
    mapped_to( field_index, a_square ).sum_by_field( field_index ) / divisor;
  return Result;
}

number_array_list
number_array_list::normalized_series( int field_index ) const
{
  //cout << "Standard dev: " << standard_deviation( field_index, false ) << "\n";
  normalizer a_normalizer( mean_by_field( field_index ), standard_deviation( field_index, false ) );
  number_array_list Result = mapped_to( field_index, a_normalizer ).select_series( field_index );
  //cout << "**in normalized_series**\n" << mapped_to( field_index, a_normalizer ).table_as_string() << "\n";
  ENSURE( Result.entry_count() == entry_count() );
  //cout << "**normalized_series result**\n" << Result.table_as_string() << "\n";
  return Result;
}

int
number_array_list::normalization_s_value( void ) const
{
  REQUIRE( entry_count() % 5 == 0 );
  int Result = entry_count() / 5;
  return Result;
}

number_array_list
number_array_list::normalized_series_table( int field_index ) const
{
  REQUIRE( entry_count() > 0 );
  number_array_list sorted_list;
  sorted_list.make_from_list( sorted_by( field_index ) );
  number_array_list sorted_normalized_list = sorted_list.normalized_series( field_index );
  //cout << "Sorted list: \n" << sorted_list.table_as_string() << "\n";
  //cout << "sorted normalized_list: \n" << sorted_normalized_list.table_as_string() << "\n";
  CHECK( sorted_list.entry_count() == entry_count() );
  CHECK( sorted_normalized_list.entry_count() == entry_count() );
  number_array_list::const_iterator sorted_list_iter;
  number_array_list::const_iterator sorted_normalized_list_iter;
  int i = 0;
  number_array_list Result;
  for( i = 0, sorted_list_iter = sorted_list.begin(), sorted_normalized_list_iter = sorted_normalized_list.begin();
       sorted_list_iter != sorted_list.end() && sorted_normalized_list_iter != sorted_normalized_list.end();
       )//++i, ++sorted_list_iter, ++sorted_normalized_list_iter )
    {
      std::vector<double>new_entry;
      new_entry.push_back( static_cast<double>(i + 1) );
      new_entry.push_back( static_cast<double>(i + 1) / static_cast<double>(entry_count()));
      new_entry.push_back( (*sorted_list_iter)[ field_index ] );
      new_entry.push_back( (*sorted_normalized_list_iter)[ 0 ] );
      Result.add_entry( new_entry );

      ++i;
      ++sorted_list_iter;
      ++sorted_normalized_list_iter;
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::normal_distribution_segments( int s ) const
{
  REQUIRE( s > 0 );
  const double norm_floor = -10000;
  const double norm_ceiling = 10000;
  normal_distribution_inverse norm_inverse;
  number_array_list Result;
  for ( int i = 1; i <= s; ++i )
    {
      std::vector< double > new_entry;
      if ( i == 1 ) 
	{
	  new_entry.push_back( norm_floor );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast<double>(i - 1) ) 
						* ( 1.0 / static_cast< double >( s ) )));
	}
      if ( i == s )
	{
	  new_entry.push_back( norm_ceiling );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast< double >( i ) )
						* ( 1.0 / static_cast< double >( s ) )));
	}
      Result.add_entry( new_entry );
    }
  return Result;
}

number_array_list
number_array_list::chi_squared_table( int field_index ) const
{
  number_array_list normal_segments = normal_distribution_segments( normalization_s_value() );
  number_array_list norm_table = normalized_series_table( field_index );
  number_array_list Result;
  square a_square;
  //cout << "norm table\n" << norm_table.table_as_string() << "\n\n";
  int i = 0;
  number_array_list::const_iterator iter;
  for ( i = 0, iter = normal_segments.begin(); iter != normal_segments.end(); ++iter, ++i )
    {
      std::vector< double > new_entry;
      new_entry.push_back( i );
      new_entry.push_back( (*iter)[0] );
      new_entry.push_back( (*iter)[1] );
      new_entry.push_back( static_cast< double >(norm_table.entry_count()) / norm_table.normalization_s_value() );
      new_entry.push_back( static_cast< double >(norm_table.items_in_range( 3, new_entry[1], new_entry[2] ) ) );
      new_entry.push_back( a_square.at( new_entry[3] - new_entry[4] ) );
      new_entry.push_back( new_entry[5] / new_entry[ 3 ] );
      Result.add_entry( new_entry );
    }
  return Result;
}

double
number_array_list::chi_squared_q( int field_index ) const
{
  //cout << chi_squared_table( field_index ).table_as_string();  
  double Result = chi_squared_table( field_index ).sum_by_field( 6 );
  return Result;
}

#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif

double
number_array_list::chi_squared_1_minus_p( int field_index ) const
{
  chi_squared_integral chi( normalization_s_value() - 1 );
  double Result = 1 - chi.at( chi_squared_q( field_index ) );
  return Result;
}

number_array_list
number_array_list::select_series( int field_index ) const
{
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector< double >new_entry;
      new_entry.push_back( head()[field_index] );
      Result.add_entry( new_entry );
    }
  if ( entry_count() > 1 )
    {
      Result.make_from_list( Result.suffixed_by_list( tail().select_series( field_index ) ));
    }
  ENSURE( Result.entry_count() == entry_count() );
  ENSURE( Result.field_count() == 1 );
  return Result;
}

int
number_array_list::items_in_range( int field_index, double lower_limit, double upper_limit ) const
{
  int Result = 0;
  if (( entry_count() > 0 )
      && ( head()[field_index] > lower_limit )
      && ( head()[field_index] <= upper_limit ) )
    {
      Result = 1;
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().items_in_range( field_index, lower_limit, upper_limit );
    }
  return Result;
}

std::string double_as_string( double x ) 
{
  char buffer[ 1000 ];
  sprintf( buffer, "%f", x );
  return std::string( buffer );
}

std::string
number_array_list::entry_as_string( const std::vector< double >& entry ) const
{
  std::string Result = "( ";
  for ( std::vector< double >::const_iterator iter = entry.begin(); iter != entry.end(); ++iter )
    {
      Result = Result + double_as_string( *iter );
      if ( ( iter + 1 ) != entry.end() )
	{
	  Result = Result + ", ";
	}
    }
  Result = Result + " )";
  return Result;
}  

std::string
number_array_list::table_as_string( void ) const
{
  std::string Result = "";
  if ( entry_count() > 0 ) 
    {
      Result = Result + entry_as_string( head() ) + "\n";
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().table_as_string();
    }
  return Result;
}

/*
*/

number_array_list_parser_2

#ifndef NUMBER_ARRAY_LIST_PARSER_H
#define NUMBER_ARRAY_LIST_PARSER_H

#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif
#ifndef NUMBER_ARRAY_LIST_H
#include "number_array_list.h"
#endif

class number_array_list_parser_2: public simple_input_parser
{
 protected:
  number_array_list number_list;
  enum state_t { Parsing_historical_data, Parsing_commands };
  state_t state;

 public:
  virtual std::string transformed_line (const std::string & line) const;
  std::string string_stripped_of_comments (const std::string & str) const;
  static bool is_double (const std::string & str);
  static double double_from_string (const std::string & str);
  static const std::string & historical_data_terminator;
  static const std::string & inline_comment_begin;
  bool last_line_is_blank (void);

  virtual void parse_last_line (void);
  void parse_last_line_as_historical_data (void);
  void parse_last_line_as_end_of_historical_data (void);
  void parse_last_line_as_command( void );
  void print_normalization_check( int field_index );
  void make( void );
  number_array_list_parser_2( void );

  bool last_line_is_valid_historical_data( void ) const;
  static std::vector< std::string > split_string( const std::string& string_to_split, const std::string& separator );
  static std::string string_before_separator( const std::string& string_to_split, const std::string& separator );
  static std::string string_after_separator( const std::string& string_to_split, const std::string& separator );
  static std::vector< double > numbers_from_string( const std::string& string_to_split );

};

#endif

/*
*/

#include "number_array_list_parser_2.h"
#ifndef WHITESPACE_STRIPPER_H
#include "whitespace_stripper.h"
#endif
#ifndef ERROR_LOG_H
#include "error_log.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif

void
number_array_list_parser_2::make (void)
{
  simple_input_parser::reset ();
  state = Parsing_historical_data;
  number_list.make ();
}

std::string number_array_list_parser_2::transformed_line (const std::string & str) const
{
  return whitespace_stripper::string_stripped_of_whitespace (string_stripped_of_comments (str));
}

std::string
  number_array_list_parser_2::string_stripped_of_comments (const std::string & str) const
{
  const std::string::size_type comment_index = str.find (inline_comment_begin);
  return str.substr (0, comment_index);
}

bool 
number_array_list_parser_2::is_double (const std::string & str)
{
  bool
    Result = true;
  char *
    conversion_end = NULL;
  strtod (str.c_str (), &conversion_end);
  if (conversion_end == str.data ())
    {
      Result = false;
    }
  return Result;
}

double
number_array_list_parser_2::double_from_string (const std::string & str)
{
  REQUIRE (is_double (str));
  return strtod (str.c_str (), NULL);
}


const
  std::string & number_array_list_parser_2::historical_data_terminator = "stop";

const
  std::string & number_array_list_parser_2::inline_comment_begin = "--";

bool number_array_list_parser_2::last_line_is_blank (void)
{
  if (last_line ().length () == 0)
    {
      return true;
    }
  else
    {
      return false;
    }
}

void
number_array_list_parser_2::parse_last_line (void)
{
  if (last_line_is_blank ())
    {
      return;
    }
  if ( state == Parsing_historical_data )
    {
      if ( last_line () == historical_data_terminator)
	{
	  parse_last_line_as_end_of_historical_data ();
	}
      else
	{
	  parse_last_line_as_historical_data ();
	}
    }
  else
    {
      parse_last_line_as_command();
    }
}


void
number_array_list_parser_2::parse_last_line_as_historical_data (void)
{
  if ( last_line_is_valid_historical_data() )
    {
      const std::vector< double >this_entry = numbers_from_string( last_line() );
      number_list.add_entry( this_entry );
    }
  else
    {
      error_log errlog;
      errlog.log_error( std::string( "Invalid data entry: " ) + last_line() );
    }
}
  
void
number_array_list_parser_2::parse_last_line_as_end_of_historical_data (void)
{
  REQUIRE (last_line () == historical_data_terminator);
  cout << "Historical data read; " 
       << number_list.entry_count() << " entries, " 
       << number_list.field_count() << " fields.\n";
  state = Parsing_commands;
}

number_array_list_parser_2::number_array_list_parser_2 (void)
{
  make();
}


bool
number_array_list_parser_2::last_line_is_valid_historical_data( void ) const
{
  const std::vector< std::string > substrings = split_string( last_line(), "," );
  bool Result = true;
  //make sure we have a valid field count for the number_array_list
  if ( ( number_list.entry_count() > 0 ) && ( substrings.size() != number_list.field_count() ) )
    {
      Result = false;
    }
  //...and that each substring can be interpreted as a double
  for ( int i = 0; i < substrings.size(); ++i )
    {
      if ( !is_double( substrings[ i ] ) )
	{
	  Result = false;
	}
    }
  return Result;
}

std::vector< std::string >
number_array_list_parser_2::split_string( const std::string& string_to_split, const std::string& separator ) 
{
  std::vector< std::string > Result;
  const std::string prefix = string_before_separator( string_to_split, separator );
  const std::string remainder = string_after_separator( string_to_split, separator );
  Result.push_back( prefix );
  if ( remainder.size() > 0 )
    {
      const std::vector< std::string > split_remainder = split_string( remainder, separator );
      Result.insert( Result.end(), split_remainder.begin(), split_remainder.end() );
    }
  return Result;
}

std::string
number_array_list_parser_2::string_before_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
    //not found; result is entire string
      Result = string_to_split;
    }
  else
    {
      Result = string_to_split.substr( 0, separator_position );
    }
  return Result;
}

std::string
number_array_list_parser_2::string_after_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
      //not found; result is empty
      Result = "";
    }
  else
    {
      Result = string_to_split.substr( separator_position + separator.size(), string_to_split.size() );
    }
  return Result;
}

std::vector< double >
number_array_list_parser_2::numbers_from_string( const std::string& string_to_split )
{
  const std::vector< std::string > number_strings = split_string( string_to_split, "," );
  std::vector< double > Result;
  for ( std::vector< std::string >::const_iterator iter = number_strings.begin(); iter != number_strings.end(); ++iter )
    {
      CHECK( is_double( *iter ) );
      const double new_value = double_from_string( *iter );
      Result.push_back( new_value );
    }
  return Result;
}

void
number_array_list_parser_2::parse_last_line_as_command( void )
{
  std::string command = string_before_separator( last_line(), " " );
  std::vector< string > arguments = split_string( string_after_separator( last_line(), " " ), " " );
  if ( command == "normality_check_on_series" )
    {
      print_normalization_check( static_cast< int >( double_from_string( arguments[0] ) ) );
    }
  else
    {
      cout << "unknown command: " << last_line() << "\n";
    }
}

void
number_array_list_parser_2::print_normalization_check( int field_index )
{
  cout << "Normalization check on series: " << field_index << "\n";
  cout << "Q: " << number_list.chi_squared_q( field_index ) << "\n";
  cout << "(1-p): " << number_list.chi_squared_1_minus_p( field_index ) << "\n";
}

/*
*/

main

main.cpp

adder.e

class ADDER
   
inherit
   SINGLE_VARIABLE_FUNCTION   
      redefine
	 at
      
creation
   make

feature {NONE}
   
   addend : DOUBLE
	 --number added to each argument in at
   
   make( new_addend : DOUBLE ) is
	 --create with given addend
      do
	 addend := new_addend
      end
   
feature {ANY}
   
   at( x : DOUBLE ) : DOUBLE is
	 --x + addend
      do
	 Result := x + addend
      end
   
end

chi_squared_base.e

class CHI_SQUARED_BASE
   --base calculation for the chi-squared distribution
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at;
	 
creation
   make

feature {ANY}
   
   n : INTEGER
	 --sample size of the distribution
   
   make( new_n : INTEGER ) is
	 --creation, setting the sample size
      require
	 new_n > 0
      do
	 n := new_n;
      end
   
   base : DOUBLE is
	 --base of the equation
      local
	 gamma : GAMMA_FUNCTION
      do
	 !!gamma
	 Result := 1 / ( ( 2 ^ n ).sqrt * gamma.at( n.to_double / 2.0 ) )
      end
   
   at( x : DOUBLE ) : DOUBLE is
	 --value of the function at x
      do
	 Result := base * ( ( x ^ (n-2) ).sqrt ) * ( -x/2.0 ).exp
      end
   
end

chi_squared_integral.e

class CHI_SQUARED_INTEGRAL
   
inherit
   SINGLE_VARIABLE_FUNCTION   
   redefine
      at;
      
creation
   make
   
feature {ANY}
   
   n : INTEGER
	 --size of the distribution, in samples
   
   make( new_n : INTEGER ) is
	 --creation
      require
	 new_n > 0
      do
	 n := new_n
      end
   
   at( x : DOUBLE ) : DOUBLE is
	 --integral evaluated from zero to the given number
      require
	 x > 0.0
      local
	 homer : SIMPSON_INTEGRATOR
	 chi_base : CHI_SQUARED_BASE
      do
	 !!homer.make
	 !!chi_base.make(n)
	 Result := homer.integral( chi_base, 0, x )
      end
   
end

normal_distribution_inverse.e

class NORMAL_DISTRIBUTION_INVERSE
   --inverse of the normal distribution, used to make segment tables 
   --for normalization fit
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at;
	 
feature { ANY }   

   at( x : DOUBLE ) : DOUBLE is
	 --inverse of the normal distribution
      local
	 target : DOUBLE
	 last_error : DOUBLE
	 last_result : DOUBLE
	 this_result : DOUBLE
	 has_tried_once : BOOLEAN
	 normal : NORMAL_DISTRIBUTION_INTEGRAL
	 error_margin : DOUBLE
      do
	 from
	    target := x;
	    last_error := 0;
	    last_result := 0;
	    this_result := 0;
	    Result := 0
	    has_tried_once := false
	    error_margin := 0.0000001
	    !!normal.make
	 until
	    has_tried_once and error_margin > last_error   
	 loop
	    last_result := this_result   
	    this_result := normal.at( Result );
	    last_error := (this_result - target).abs
	    if ( last_error > error_margin ) then
	       Result := next_guess( Result, last_result, target )
	    end
	    has_tried_once := true;   
	 end
      end   
   
   next_guess( arg, last_result, target : DOUBLE ) : DOUBLE is
      --next guess in the iterative scheme of things
      do
	 Result := arg + ( target - last_result )
      end
   
end

normalizer.e

class NORMALIZER
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at
	 
creation
   make

feature { NONE }
   
   data_mean : DOUBLE
   
   standard_deviation : DOUBLE
   
   make( new_data_mean, new_standard_deviation : DOUBLE ) is
	 --create with given feature values
      do
	 data_mean := new_data_mean
	 standard_deviation := new_standard_deviation
      end
   
feature {ANY}
   
   at( x : DOUBLE ) : DOUBLE is
	 --(x-mean)/standard_dev
      do
	 Result := ( x - data_mean ) / standard_deviation
      end
   
end

number_array_list.e

class NUMBER_ARRAY_LIST
   
inherit
   LINKED_LIST[ ARRAY[ DOUBLE ] ]
      redefine
	 make;
   
creation
   make, make_from_entry, make_from_list
      
feature { ANY }
   
   field_count : INTEGER
	 --number of fields allowed in an entry
   
   head : like item is
      --first item
      require
	 entry_count >= 1
      do
	 Result := first;
      end
   
   tail : like Current is
	 --all items after the first
      do
	 !!Result.make
	 if ( entry_count > 1 ) then
	    Result := slice( lower + 1, upper );
	    Result.set_field_count( field_count )
	 end --if
      ensure
	 ( entry_count > 1 ) implies Result.entry_count = entry_count - 1 and Result.field_count = field_count
      end
   
   suffixed_by_list( rhs: like Current ) : like Current is
	 --the list, suffixed by another list
      local
	 i : INTEGER
      do
	 !!Result.make_from_list( Current );
	 from
	    i := rhs.lower
	 until
	    not rhs.valid_index ( i )
	 loop
	    Result.add_entry( rhs.item( i ) );
	    i := i + 1
	 end --from
      ensure
	 Result.entry_count = entry_count + rhs.entry_count
      end
   
   make_from_entry( entry : like item ) is
	 --clear the list, then add the entry
      do
	 make
	 add_entry( entry )
      ensure
	 head.is_equal( entry )
	 field_count = entry.count
      end
   
   make is
	 --clear the list
      do
	 Precursor
	 field_count := -1
      ensure
	 entry_count = 0
	 field_count = -1
      end
   
   make_from_list( rhs: like Current ) is
	 --clear the list, setting it equal to another list
      do
	 from_collection( rhs )
	 field_count := rhs.field_count
      end
   
   sum_by_field( field_index : INTEGER ) : DOUBLE is
	 --the sum of a given field over all entries
      require
	 is_valid_field_index( field_index )
      do
	 if entry_count = 0 then
	    Result := 0
	 else
	    Result := head.item( field_index ) + tail.sum_by_field( field_index )
	 end
      end
   
   mean_by_field( field_index : INTEGER ) : DOUBLE is
	 --the mean of a given field over all entries
      require
	 is_valid_field_index( field_index )
	 entry_count >= 1
      do
	 Result := sum_by_field( field_index ) / entry_count.to_double
      end
   
   entry_count : INTEGER is
	 --the number of entries
      do
	 Result := count
      end
   
   add_entry( new_entry : like item ) is
	 -- adds an entry to the end of the list
      require
	 is_valid_entry( new_entry )
      do
	 if entry_count = 0 then
	    set_field_count( new_entry.count )
	 end
	 add_last( new_entry )
      ensure
	 entry_count = old entry_count + 1
      end
   
   mapped_to( field_index : INTEGER; f: SINGLE_VARIABLE_FUNCTION ) : like Current is
	 --the elements, with the given field mapped to f
      require
	 is_valid_field_index( field_index )
      local
	 new_entry : like item
      do
	 !!Result.make
	 if entry_count > 0 then
	    new_entry := head.twin
	    new_entry.put( f.at( head.item(field_index) ), field_index )
	    Result.add_entry( new_entry );
	    Result := Result.suffixed_by_list( tail.mapped_to( field_index, f ) )
	 end -- if
      ensure
	 Result.entry_count = entry_count
      end
   
   multiplied_by_list( field_index : INTEGER; rhs : like Current ) : like Current is
      --the elements, with the given field multiplied by the 
	 --corresponding field in rhs
      require
	 entry_count = rhs.entry_count
	 field_count = rhs.field_count
	 is_valid_field_index( field_index )
      local
	 new_entry : like item
      do
	 !!Result.make
	 if entry_count > 0 then
	    new_entry := head
	    new_entry.put( head.item( field_index ) * rhs.head.item( field_index ), field_index )
	    Result.add_entry( new_entry )
	    Result := Result.suffixed_by_list( tail.multiplied_by_list( rhs.tail ) )
	 end
      ensure
	 Result.entry_count = entry_count
      end
   
   sorted_by( field_index : INTEGER ) : like Current is
	 --the list, sorted by the given field
      require
	 is_valid_field_index( field_index )
      do
	 if is_sorted_by( field_index ) then
	    !!Result.make_from_list( Current )
	 else
	    !!Result.make_from_list( tail.items_less_than( field_index, head.item( field_index ) ).sorted_by( field_index ).
				     suffixed_by_item( head ).
				     suffixed_by_list( tail.items_greater_than_or_equal_to( field_index, head.item( field_index ) ).sorted_by( field_index ) ) )
	 end -- if
      ensure
	 Result.entry_count = entry_count
      end
   
   is_valid_entry( entry : like item ) : BOOLEAN is
	 --whether entry is a valid entry
      do
	 if ( entry_count = 0 and entry.count > 0 ) or ( entry.count = field_count and entry.lower = 1 ) then
	    Result := true
	 else
	    Result := false
	 end
      end
   
   items_less_than( field_index : INTEGER; value : DOUBLE ) : like Current is
	 --list of items less than the given value in the given field
      require
	 is_valid_field_index( field_index )
      local
	 i : INTEGER
      do
	 !!Result.make
	 from
	    i := lower
	 until
	    not valid_index( i )
	 loop
	    if not( item( i ).item( field_index ) >= value ) then
	       Result.add_entry( item( i ) )
	    end
	    i := i + 1
	 end
      ensure
	 Result.entry_count <= entry_count
      end
   
   items_greater_than_or_equal_to( field_index : INTEGER; value : DOUBLE ) : like Current is
      --list of items greater than or equal to the given value in the 
	 --given field
      require
	 is_valid_field_index( field_index )
      local
	 i : INTEGER
      do
	 !!Result.make
	 from
	    i := lower
	 until
	    not valid_index( i )
	 loop
	    if item(i).item( field_index ) >= value then
	       Result.add_entry( item( i ) )
	    end
	    i := i + 1
	 end
      ensure
	 Result.entry_count <= entry_count
      end
   
   suffixed_by_item( entry : like item ) : like Current is
	 --the list, suffixed by a single item
      require
	 is_valid_entry( entry )
      do
	 !!Result.make_from_list( Current )
	 Result.add_entry( entry )
      ensure
	 Result.entry_count = entry_count + 1
      end
   
   set_field_count( new_field_count : INTEGER ) is
	 --sets the field count
      require
	 entry_count = 0 or ( entry_count > 0 implies head.count = new_field_count )
      do
	 field_count := new_field_count
      ensure
	 field_count = new_field_count
      end
   
   is_valid_field_index( field_index : INTEGER ) : BOOLEAN is
	 --whether the given field index is valid
      do
	 if entry_count = 0 then
	    Result := true
	 else
	    Result := ( ( 1 <= field_index ) and ( field_index <= field_count ) )
	 end
      end
   
   is_sorted_by( field_index : INTEGER ) : BOOLEAN is
	 --whether the list is sorted by the given field index
      require
	 is_valid_field_index( field_index )
      do
	 Result := true
	 if entry_count > 1 then
	    Result := ( head.item( field_index ) < tail.head.item( field_index ) ) 
	       and tail.is_sorted_by( field_index )
	 end
      end
   
feature {ANY}
   --chi-squared distribution calcs
   
   variance( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
      require
	 is_valid_field_index( field_index )
	 is_full_population implies entry_count > 0
	 ( not is_full_population ) implies entry_count > 1
      local
	 divisor : DOUBLE
	 adder : ADDER
	 square : SQUARE
      do
	 if is_full_population then
	    divisor := entry_count.to_double
	 else
	    divisor := (entry_count - 1).to_double
	 end
	 !!adder.make( - mean_by_field( field_index ) )
	 !!square
	 Result := ( mapped_to( field_index, adder ).
		     mapped_to( field_index, square ).
		     sum_by_field( field_index ) ) / divisor
      end
   
   standard_deviation( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
      require
	 is_valid_field_index( field_index )
      do
	 Result := variance( field_index, is_full_population ).sqrt
      end
   
   normalized_series( field_index : INTEGER ) : like Current is
      --one series of the table, "normalized" into standard deviations
      require
	 is_valid_field_index( field_index )
      local
	 normalizer : NORMALIZER
      do
	 !!normalizer.make( mean_by_field( field_index ), standard_deviation( field_index, false ) )
	 Result := mapped_to( field_index, normalizer ).select_series( field_index )
      ensure
	 Result.entry_count = entry_count
      end
   
   normalization_s_value : INTEGER is
      --number of segments in a normalization table
      require
	 entry_count.divisible( 5 )
      do
	 Result := entry_count // 5
      end
   
   normalized_series_table( field_index : INTEGER ) : like Current is
      --a normalized series table, used for error-checking and used 
      --in the chi-squared normalization test
      require
	 entry_count > 0
	 is_valid_field_index( field_index )
      local
	 sorted_list : like Current
	 sorted_normalized_list : like Current
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
      do
	 sorted_list := sorted_by( field_index )
	 sorted_normalized_list := sorted_list.normalized_series( field_index )
	 check
	    sorted_list.entry_count = entry_count
	    sorted_normalized_list.entry_count = entry_count
	 end
	 from
	    i := sorted_list.lower
	    !!Result.make
	 until
	    not sorted_list.valid_index( i )
	 loop
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( i.to_double )
	    new_entry.add_last( i.to_double / entry_count.to_double )
	    new_entry.add_last( sorted_list.item( i ).item( field_index ) )
	    new_entry.add_last( sorted_normalized_list.item(i).first )
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   normal_distribution_segments( s : INTEGER ) : like Current is
	 --segments of the normal distribution; see [Humphrey95] for use
      require
	 s > 0
      local
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
	 norm_inverse : NORMAL_DISTRIBUTION_INVERSE
      do
	 from 
	    i := 1
	    !!Result.make
	    !!norm_inverse
	 until 
	    i > s
	 loop
	    !!new_entry.make( 1, 0 )
	    if i = 1 then
	       new_entry.add_last( -1000.0 )
	    else
	       new_entry.add_last( norm_inverse.at( ( i - 1 ).to_double * ( 1.0 / s.to_double ) ) )
	    end
	    if i = s then
	       new_entry.add_last( 1000.0 )
	    else
	       new_entry.add_last( norm_inverse.at( ( i.to_double * 1.0 / s.to_double ) ) )
	    end
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   chi_squared_table( field_index : integer ) : like Current is
	 -- chi-squared table, used to calculate the chi-squared distribution
      require
	 is_valid_field_index( field_index )
      local
	 norm_segments : like Current
	 norm_table : like Current
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
      do
	 from	 
	    norm_segments := normal_distribution_segments( normalization_s_value )
	    norm_table := normalized_series_table( field_index )
	    !!Result.make
	    i := norm_segments.lower
	 until
	    not norm_segments.valid_index( i )
	 loop
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( i.to_double )
	    new_entry.add_last( norm_segments.item( i ).item( 1 ) )
	    new_entry.add_last( norm_segments.item( i ).item( 2 ) )
	    new_entry.add_last( norm_table.entry_count.to_double / norm_table.normalization_s_value.to_double )
	    new_entry.add_last( norm_table.items_in_range( 4, new_entry.item(2), new_entry.item(3) ))
	    new_entry.add_last( ( new_entry.item(4) - new_entry.item(5) ) ^ 2 )
	    new_entry.add_last( new_entry.item(6) / new_entry.item(4) )
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   chi_squared_q( field_index : INTEGER ) : DOUBLE is				
	 --"Q" value of chi-squared normalization test
      do
	 Result := chi_squared_table( field_index ).sum_by_field( 7 )
      end
   
   chi_squared_1_minus_p( field_index : INTEGER ) : DOUBLE is
	 --1-p value of chi-squared normalization test
      local
	 chi_squared_integral : CHI_SQUARED_INTEGRAL
      do
	 !!chi_squared_integral.make( normalization_s_value - 1 )
	 Result := 1 - chi_squared_integral.at( chi_squared_q( field_index ) )
      end
   
   select_series( field_index : INTEGER ) : like Current is
	 --the given series, extracted as a separate list
      require
	 is_valid_field_index( field_index )
      local
	 new_entry : ARRAY[ DOUBLE ]
      do
	 if ( entry_count = 0 ) then
	    !!Result.make
	 else
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( head.item( field_index ) )
	    !!Result.make_from_entry( new_entry )
	    if entry_count > 1 then
	       Result := Result.suffixed_by_list( tail.select_series( field_index ) )
	    end
	 end
      end
   
   items_in_range( field_index : INTEGER; lower_limit, upper_limit : DOUBLE ) : INTEGER is
      --the items from the given field which fit in (lower limit < item <= upper_limit)
      require
	 is_valid_field_index( field_index )
      do	 
	 if entry_count > 0 and head.item(field_index) > lower_limit and upper_limit >= head.item( field_index ) then
	    Result := 1
	 end
	 if entry_count > 1 then
	    Result := Result + tail.items_in_range( field_index, lower_limit, upper_limit )
	 end
      end
   
   entry_as_string( entry : like item ) : STRING is
	 --entry as a string, ie "( 1, 2, 3 )"
      local
	 i : INTEGER
      do
	 !!Result.make_from_string( "( " )
	 from
	    i := entry.lower
	 until
	    not entry.valid_index( i )
	 loop
	    entry.item( i ).append_in( Result )
	    if entry.valid_index( i + 1 ) then
	       Result.append_string( ", " )
	    end
	    i := i + 1
	 end
	 Result.append_string( " )" )
      end
   
   table_as_string : STRING is
      --table, as a set of entry_as_string lines
      do
	 if entry_count > 0 then
	    Result := entry_as_string( head ) + "%N"
	    if entry_count > 1 then
	       Result.append_string( tail.table_as_string )
	    end
	 end
      end
      
end

number_array_list_parser_2.e

class NUMBER_ARRAY_LIST_PARSER_2
--reads a list of number pairs, and performs linear regression analysis

inherit 
   SIMPLE_INPUT_PARSER
      redefine parse_last_line, transformed_line
      end; 
   
creation {ANY} 
   make

feature {ANY} 
   
   inline_comment_begin: STRING is "--";
   
   string_stripped_of_comment(string: STRING): STRING is 
      --strip the string of any comment
      local 
         comment_index: INTEGER;
      do  
         if string.has_string(inline_comment_begin) then 
            comment_index := string.index_of_string(inline_comment_begin);
            if comment_index = 1 then 
               !!Result.make_from_string( "" );
            else 
               Result := string.substring(1,comment_index - 1);
            end; 
         else 
            Result := string.twin;
         end; 
      end -- string_stripped_of_comment
   
   string_stripped_of_whitespace(string: STRING): STRING is 
      --strip string of whitespace
      do  
         Result := string.twin;
         Result.left_adjust;
         Result.right_adjust;
      end -- string_stripped_of_whitespace
   
   transformed_line(string: STRING): STRING is 
      --strip comments and whitespace from parseable line      
      do  
         Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
      end -- transformed_line
   
   number_list: NUMBER_ARRAY_LIST;
   
   state : INTEGER
   
   Parsing_historical_data : INTEGER is unique
   
   Parsing_commands : INTEGER is unique
   
   Command_normality_check_on_series : STRING is
      once
	 Result := "normality_check_on_series"
      end
   
   historical_data_terminator: STRING is "stop";
   
   double_from_string(string: STRING): DOUBLE is 
      require 
         string.is_double or string.is_integer; 
      do  
         if string.is_double then 
            Result := string.to_double;
         elseif string.is_integer then 
            Result := string.to_integer.to_double;
         end; 
      end -- double_from_string
   

feature {ANY} --parsing

   reset is 
      --resets the parser and makes it ready to go again
      do  
         state := Parsing_historical_data;
         number_list.make;
      end -- reset
   
   make is 
      do  
         !!number_list.make;
         reset;
      end -- make
   
   parse_last_line_as_historical_data is 
      --interpret last_line as a pair of comma-separated values
      local 
         error_log: ERROR_LOG;
	 this_line_numbers : ARRAY[ DOUBLE ]
      do  
	 !!error_log.make
	 if last_line_is_valid_historical_data then
	    this_line_numbers := numbers_from_string( last_line )
	    number_list.add_entry( this_line_numbers )
	 else
	    error_log.log_error( "Invalid historical data: " + last_line + "%N" )
	 end
      end
   
   parse_last_line_as_end_of_historical_data is 
      --interpret last line as the end of historical data
      require 
         last_line.compare(historical_data_terminator) = 0; 
      local
	 i : INTEGER
      do  
	 state := Parsing_commands
	 std_output.put_string( "Historical data read; " + number_list.entry_count.to_string + 
				"items read%N" )
      end -- parse_last_line_as_end_of_historical_data
   
   parse_last_line_as_command is
	 --interpret the last line as a command
      local
	 command : STRING
	 arguments : ARRAY[ STRING ]
      do
	 command := string_before_separator( last_line, " " )
	 arguments := split_string( string_after_separator( last_line, " ", ), " " )
	 if command.is_equal( Command_normality_check_on_series )  then
	    print_normalization_check_on_series( arguments.first.to_integer + 1)
	 else
	    std_output.put_string( "Unrecognized command string: " + last_line )
	 end
      end
   
   parse_last_line is 
      --parse the last line according to state
      do  
         if not last_line.empty then 
	    if state = Parsing_historical_data then
	       if last_line.compare(historical_data_terminator) = 0 then 
		  parse_last_line_as_end_of_historical_data;
	       else 
                  parse_last_line_as_historical_data;
               end; 
            else
	       parse_last_line_as_command
	    end
         end; 
      end -- parse_last_line
   
   last_line_is_valid_historical_data : BOOLEAN is
	 --whether last line is valid historical data
      local
	 substrings : ARRAY[ STRING ]
	 i : INTEGER
      do
	 substrings := split_string( last_line, "," )
	 Result := true
	 if ( number_list.entry_count > 0 and substrings.count /= number_list.field_count ) then
	    Result := false;
	 end
	 --check and see if each substring is convertible to a double
	 from
	    i := substrings.lower
	 until
	    not substrings.valid_index( i )
	 loop
	    if not ( substrings.item(i).is_double or substrings.item(i).is_integer ) then
	       Result := false;
	    end
	    i := i + 1
	 end
      end
   
   split_string( string_to_split, separator : STRING ) : ARRAY[ STRING ] is
      --a list of components of a string, separated by the given 
      --separator, ie split_string( "1,2,3", "," ) = [ "1", "2", "3" ]
      local
	 prior_to_separator : STRING
	 remainder : STRING
	 split_remainder : ARRAY[ STRING ]
	 i : INTEGER
      do
	 prior_to_separator := string_before_separator( string_to_split, separator )
	 remainder := string_after_separator( string_to_split, separator )
	 !!Result.make( 1, 0 )
	 Result.add_last( prior_to_separator )
	 if remainder.count > 0 then
	    split_remainder := split_string( remainder, separator )
	    from
	       i := split_remainder.lower
	    until
	       not split_remainder.valid_index( i )
	    loop
	       Result.add_last( split_remainder.item( i ) )
	       i := i + 1
	    end
	 end
      end
   
   string_before_separator( string_to_split, separator : STRING ) : STRING is
	 --the part of a string which comes before the separator, or 
	 --the whole string if it's not found
      local
	 separator_index : INTEGER
      do
	 separator_index := string_to_split.substring_index( separator, 1 )
	 if ( separator_index = 0 ) then
	    --not found; copy whole string
	    Result := string_to_split.twin
	 else
	    Result := string_to_split.substring( 1, separator_index - 1 )
	 end
      end
   
   string_after_separator( string_to_split, separator : STRING ) : STRING is
	 --the part of the string after the separator,
      local
	 separator_index : INTEGER
      do
	 separator_index := string_to_split.substring_index( separator, 1 )
	 if ( separator_index = 0 ) then
	    --not found; result is empty
	    !!Result.make_from_string( "" )
	 else
	    Result := string_to_split.substring( separator_index + separator.count, string_to_split.count )
	 end
      end
   
   numbers_from_string( string_to_split : STRING ) : ARRAY[ DOUBLE ] is
	 --an array of numbers, from a string of numbers separated by commas
      local
	 number_strings : ARRAY[ STRING ]
	 i : INTEGER
      do
	 !!Result.make( 1, 0 )
	 number_strings := split_string( string_to_split, "," )
	 from
	    i := number_strings.lower
	 until
	    not number_strings.valid_index( i )
	 loop
	    check
	       number_strings.item(i).is_double or number_strings.item(i).is_integer
	    end
	    Result.add_last( double_from_string( number_strings.item( i ) ) )
	    i := i + 1
	 end
      end
   
   print_normalization_check_on_series( field_index : INTEGER ) is
      do
	 std_output.put_string( "Normalization check on series " )
	 std_output.put_integer( field_index )
	 --std_error.put_string( number_list.sorted_by( field_index ).table_as_string + "%N" )
	 --std_error.put_string( number_list.normalized_series_table( field_index ).table_as_string )
	 std_output.put_string( "%NQ: " )
	 std_output.put_double( number_list.chi_squared_q( field_index ) )
	 std_output.put_string( "%N(1-p): ")
	 std_output.put_double( number_list.chi_squared_1_minus_p( field_index ) )
	 std_output.put_new_line 
      end
   
end -- class NUMBER_ARRAY_LIST_PARSER_2

main.e

class MAIN

creation {ANY} 
   make

feature {ANY} 
   
   make is 
      local 
         parser: NUMBER_ARRAY_LIST_PARSER_2;
      do  
         !!parser.make;
         parser.set_input(io);
         parser.parse_until_eof;
      end -- make

end -- MAIN

Code Review

I need to get better at code reviews. On both the Eiffel and C++ side, I missed several obvious problems-- on the Eiffel side, a problem which caused a great deal of delay (I had forgotten that Eiffel programs only copy by reference, and I had copied something and changed it, changing the original-- and completely baffling me for some time-- something to add to my code review checklist!)

Compile

Once again, the compiler caught many sneaky items. I'm impressed by the error-checking in the Eiffel compiler, particularly with regard to type compliance-- it caught several small things which I just didn't notice in the code review (including the fact that one of my comments was mismatched!). Very nice.

Test

My strategy of reproducing the calculation tables as in the text paid good dividends here, as I was able to print out the tables and search for problems. It worked very well, and though I had several problems with the programs themselves, they were relatively easy to find and fix due to the availability of interim values, etc.

Table 10-2. D15: Test Results Format -- program 9a

Test	Expected Result	Actual Result - C++	Actual Result - Eiffel
Table D14
Q	34.4	34.4	34.400000
1-p	7.60*10^-5	7.61098e-05	0.000079

I'm curious about the difference in calculation between the C++ and Eiffel programs with regard t othe 1-p entry; the values are extremely close, and I can't find significant differences in the calculations, but something seems a touch off kilter. I can't figure out if it's the level of precision involved, a difference in standard libraries, or what.

Postmortem

PSP2.1 Project Plan Summary

Table 10-3. Project Plan Summary

Student:	Victor B. Putz	Date:	000201
Program:	Normalization	Program#	9A
Instructor:	Wells	Language:	C++

Summary	Plan	Actual	To date
Loc/Hour	46	45	46
Planned time	285		943
Actual time		342	1248
CPI (cost/performance index)			0.76
%reused	47	69	46
Test Defects/KLOC	31	27	30.6
Total Defects/KLOC	141	135	140.4
Yield (defects before test/total defects)	78	80	78
% Appraisal COQ	5	8.5	5.58
% Failure COQ	29.8	19.6	28.32
COQ A/F Ratio	0.16	0.43	0.20

Program Size	Plan	Actual	To date
Base	20	20
Deleted	0	18
Modified	1	1
Added	237	257
Reused	231	565	1698
Total New and Changed	239	258	1731
Total LOC	489	824	3672
Upper Prediction Interval (70%)	326
Lower Prediction Interval (70%)	152

Time in Phase (min):	Plan	Actual	To Date	To Date%
Planning	54	77	449	20
Design	37	67	314	14
Design Review	6	13	53	2
Code	74	78	568	25
Code Review	9	16	73	3
Compile	17	18	132	6
Test	68	49	507	22
Postmortem	20	24	160	7
Total	285	342	2256	100
Total Time UPI (70%)	322
Total Time LPI (70%)	249
Defects Injected		Actual	To Date	To Date %
Plan		0	0	0
Design	10	12	77	32
Design Review	0	0	0	0
Code	22	23	160	66
Code Review	0	0	0	0
Compile	1	0	3	1
Test	1	0	3	1
Total development	34	35	243	100
Defects Removed		Actual	To Date	To Date %
Planning		0	0	0
Design		0	0	0
Design Review	1	6	15	6
Code	7	1	45	19
Code Review	3	8	23	9
Compile	15	13	107	44
Test	8	7	53	22
Total development	34	35	243	100
After Development	0	0	0
Defect Removal Efficiency	Plan	Actual	To Date
Defects/Hour - Design Review	13.5	27.6	22.5
Defects/Hour - Code Review	15.8	30	24.2
Defects/Hour - Compile	49.5	43.3	56.3
Defects/Hour - Test	6	8.57	6.9
DRL (design review/test)	2.25	3.22	2.26
DRL (code review/test)	2.63	3.5	3.5
DRL (compile/test)	8.25	5.05	8.16

Eiffel code/compile/test

Time in Phase (min)	Actual	To Date	To Date %
Code	62	369	50
Code Review	10	40	6
Compile	15	133	18
Test	41	192	26
Total	128	734	100
Defects Injected	Actual	To Date	To Date %
Design	1	6	4
Code	22	138	95
Compile	0	0	0
Test	0	1	1
Total	23	145	100
Defects Removed	Actual	To Date	To Date %
Code	0	1	1
Code Review	9	14	11
Compile	16	88	60
Test	5	40	28
Total	23	145	100
Defect Removal Efficiency	Actual	To Date
Defects/Hour - Code Review	12	24
Defects/Hour - Compile	64	40
Defects/Hour - Test	7.3	12.5
DRL (code review/test)	1.6	1.9
DRL (compile/test)	8.8	3.2

Time Recording Log

Table 10-4. Time Recording Log

Student:	Victor B. Putz	Date:	000130
		Program:	9a

Start	Stop	Interruption Time	Delta time	Phase	Comments
000130 10:50:04	000130 12:07:06	0	77	plan
000130 12:14:58	000130 13:25:57	4	66	design
000130 13:59:50	000130 14:12:30	0	12	design review
000130 14:24:05	000130 15:50:58	9	77	code
000130 16:00:00	000130 16:21:52	5	16	code review
000130 16:22:00	000130 16:39:45	0	17	compile
000130 16:42:18	000130 17:31:05	0	48	test
000130 17:31:29	000130 17:55:28	0	23	postmortem

Table 10-5. Time Recording Log

Student:		Date:	000201
		Program:

Start	Stop	Interruption Time	Delta time	Phase	Comments
000201 07:56:14	000201 09:00:44	2	62	code
000201 09:01:18	000201 09:11:07	0	9	code review
000201 09:11:20	000201 09:26:12	0	14	compile
000201 09:26:36	000201 10:07:25	0	40	test

Defect Reporting Logs

Table 10-6. Defect Recording Log

Student:	Victor B. Putz	Date:	000130
		Program:	9a

Defect found	Type	Reason	Phase Injected	Phase Removed	Fix time	Comments
000130 13:59:52	ct	ig	design	design review	1	missed minor contracts
000130 14:03:53	ma	ig	design	design review	0	didn't increment loop indices
000130 14:05:35	ct	ig	design	design review	0	missed contract for variance
000130 14:07:06	ct	ig	design	design review	0	require standard_deviation /= 0
000130 14:07:58	ct	ig	design	design review	0	require s > 0 in normal_distribution_segments
000130 14:09:41	ct	ig	design	design review	0	minor contracts
000130 15:44:53	md	ig	design	code	0	forgot to add parse_last_line_as_command
000130 16:02:40	ct	ig	code	code review	0	forgot to put in the require contract for make/constructor
000130 16:05:07	sy	om	code	code review	0	forgot to #include normal_distribution_integral
000130 16:06:00	sy	om	code	code review	0	forgot to make last_guess const in implementation
000130 16:13:33	sy	om	code	code review	0	forgot to return return value!
000130 16:17:23	wc	om	code	code review	0	was testing the integral of the normal distribution as 100, rather than 1
000130 16:19:06	sy	ty	code	code review	0	forgot parentheses around no-argument operation
000130 16:19:53	sy	ty	code	code review	0	forgot parentheses around no-argument operation
000130 16:20:27	sy	om	code	code review	0	forgot to return return value!
000130 16:23:13	sy	ig	design	compile	6	er... forgot that inheriting and mixing types gets very gross; combined number_array_list and number_array_list_2 into one
000130 16:30:48	wn	ty	code	compile	0	misspelled normal_distribution_inverse
000130 16:31:54	wn	cm	code	compile	0	used normalized_series_table instead of norm_table
000130 16:33:25	sy	cm	code	compile	0	didn't make the at function const as required
000130 16:34:07	sy	cm	code	compile	0	missed parentheses on no-arg function
000130 16:34:40	sy	cm	code	compile	0	forgot return type for select_series
000130 16:35:05	sy	ty	code	compile	0	declared Result with type::Result
000130 16:35:34	sy	om	code	compile	0	forgot const in implementation
000130 16:35:57	wn	cm	code	compile	0	er.. didn't type the correct name for the argument
000130 16:36:17	sy	om	code	compile	0	forgot parentheses for no-arg function
000130 16:36:58	sy	cm	code	compile	0	mistyped integer instead of int
000130 16:37:57	sy	om	code	compile	0	forgot to #include gamma_function.h
000130 16:38:22	sy	om	code	compile	0	forgot to #include simpson_integrator.h
000130 16:45:22	wc	om	code	test	0	had to add an exception that if the first guess was in, don't try a second guess!
000130 16:46:46	ct	om	design	test	0	forgot contract in normalized_series and select_series
000130 16:52:18	ct	om	design	test	0	forgot ensure contract in normalized_series_table
000130 16:53:01	wn	ty	code	test	1	was setting an iterator to end instead of begin for startup
000130 16:56:12	we	ig	design	test	7	was using field_index as index of normalized series, instead of zero index
000130 17:07:17	sy	ig	code	test	14	dangit-- C++ was doing silent conversions from int to double without thinking
000130 17:21:45	wa	ig	design	test	8	was not setting the correct degrees of freedom for chi2 integral

Table 10-7. Defect Recording Log

Student:		Date:	000201
		Program:

Defect found	Type	Reason	Phase Injected	Phase Removed	Fix time	Comments
000201 09:04:25	ct	om	code	code review	0	forgot to put in comment
000201 09:07:29	wn	cm	code	code review	0	forgot to change "Parsing_prediction_data" to "Parsing_commands"
000201 09:11:30	sy	ig	code	compile	0	forgot that comments which end a class must be correct!
000201 09:12:08	sy	cm	code	compile	0	separated arguments of different type with comma rather than semicolon
000201 09:12:44	sy	om	code	compile	0	accidentally put contract under the local block
000201 09:13:20	wn	om	code	compile	0	forgot to qualify item with feature call
000201 09:14:05	sy	ty	code	compile	0	forgot colon before return value
000201 09:15:14	sy	ty	code	compile	0	forgot "is" after feature declaration
000201 09:15:39	wa	cm	code	compile	1	forgot to take out program 8a code to print the sorted lists
000201 09:17:11	wn	ig	code	compile	0	used "equals" instead of "is_equal" for string equality test
000201 09:18:29	wn	cm	code	compile	0	used "lower" instead of "first" for first argument of an array
000201 09:19:12	wn	cm	code	compile	0	typed "entry" instead of "new_entry"
000201 09:19:49	wn	cm	code	compile	0	typed "i" rather than "field_index"
000201 09:20:12	wn	cm	code	compile	0	typed "n" rather than "entry_count"
000201 09:21:16	wn	ig	code	compile	0	apparently "/" is used to produce a double from two integers; // is what I wanted
000201 09:22:04	sy	ig	code	compile	0	used "is" in local block
000201 09:23:11	wt	ig	code	compile	0	urg... returned a double rather than an integer
000201 09:25:52	mc	cm	code	compile	0	was not instantiating my normal_distribution_inverse object
000201 09:27:00	wc	cm	design	test	1	wrong value (100 instead of 1) in normal_distribution_segments
000201 09:29:20	ct	ty	code	test	2	was doing test on old value of n, not new value!
000201 09:33:55	ma	om	code	test	0	forgot to update i in loop!
000201 09:35:00	wa	ig	code	test	7
000201 09:45:46	wa	ig	code	test	20	Okay, this hurt-- forgot that Eiffel doesn't copy things by default, so I was modifying the original list!