Program 9a: Normal Distribution fit

Using a linked list, write a program to do a chi-squared test for a normal distribution.

Given Requirements

 

Requirements: Write program 9a to calculate the degree to which a string of n real numbers is normally distributed. The methods and formulas used in this calculation are explained on page 529 [of the text]. Use program 5a to calculate the values of the normal and chi-squared distributions. Assume than n is > 20 and an even multiple of 5. Use program 8a to sort the numbers in ascending order.

Testing: Thoroughly test the program. As one test, use the LOC/Method data in table D14 as a test case [note: this is the same data from program 8a]. Here, the result should be Q=34.4 with a probability value < 0.005 that the data are normally distributed. The solution to this case is shown [in the text]. Submit a test report that includes the test results and uses the format in table D15

Table 10-1. D15: Test Results Format -- program 9a

TestExpected ResultActual Result
Table D14  
Q34.4 
1-p7.60*10-5 
 
--[Humphrey95] 

Planning

Requirements

I'll make heavy reuse of the parsing and number array list classes used in program 8a, which means that the data input format will remain the same (note-- in program 8a, I suggested that the first usable line of the input file be a single number, the number of fields in the file; since the number of fields can be extrapolated from the first historical data entry, this requirement has been removed).

So much for getting the numbers into the program. Where we run into a problem is the next bit-- which series are we going to check for normality? If this is going to be reused, how shall we decide which series to use?

The simple response would be to simply check all the series, but I'll do something slightly different-- after the historical data terminator, I'll make the input consist of commands and numbers; in other words, something like this:

...
58, 19.33
305, 25.42
stop -- end of data
normality-check-on-series 1 -- request a normality check on the first series

This will allow us to add further commands in the future, paving the way for some minor expandibility (considering that there is only one more program in this series, that may be unnecessary).

When the normality-check-on-series command is encountered, the program will perform a normality check on the given series (series 1 being the first set of numbers, and so on), printing the Q and 1-p results, thusly:

Normality check on series: 1
Q: 34.4
(1-p): 7.6e-5

Size Estimate

To do the size estimate-- and to make it worthwhile-- I'm going to do a bit more of a conceptual design than I have earlier, listing my appropriate classes and the methods I think I'll need. The algorithm in use in program 9a is considerably more complex than those used in other programs (or at least it seems to me), so I need to do more preparation here.

A quick conceptual design using Dia gives us the following:

Preliminary design diagram, using a form of UML

Note the heavy use of single_variable_function subclasses (it's not depicted well, but each class in the preliminary design except for number_array_list_2 and number_array_list_parser_2 is a subclass of single_variable_function. The normal_distribution_inverse class will be used to calculate the values of the normal distribution for S segments. The tex lists these for certain values of S (table A24 in [Humphrey95]), and we can use the table (since the requirements guarantee the numbers will be divisible by 5) but this will be more general.

With that in mind, our size estimating template gives us a PROBE-generated estimate of 239 new/changed LOC.

Resource/Time Estimate

Historical data gives us a PROBE-generated estimate of 285 minutes for this project.

Development

Design

Once again, I'm using a cluster of tiny single_variable_function classes to do much of the work. I also reconstruct the "by-hand" tables as per the algorithm in [Humphrey95]-- not because they are necessary for the computation, but because they will simplify testing by showing results similar to those in the text. Our preliminary design is fleshed out as follows:

Design diagram, using a form of UML

Design Review

Actually, not all that many defects were caught by the design review this time around, and few of those were show-stoppers. The basic design seems reasonable.

Code

Reused Code

The error_log, gamma_function, is_double_equal, normal_function_base, simple_input_parser, simpson_integrator, square, and whitespace_stripper modules were used in their entirety. Much code from number_array_list and number_array_list_parser was reused, but those classes are reprinted here since much was changed in number_array_list.

Part of the design had to be changed because I had originally planned to inherit from number_array_list to create number_array_list_2-- unfortunately, many of the functions returned by number_array_list did not (of course) return values of number_array_list_2-- meaning that I couldn't chain together functions as easily as I wanted to, which was unacceptable; as a result, I just expanded the original class. I did not have a chance to verify if "like current" in Eiffel would fix these problems (it may well have done so!)

adder

#ifndef ADDER_H
#define ADDER_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class adder : public single_variable_function
{
 protected:
  double addend;
 public:
  adder( double new_addend );
  virtual double at( double x ) const;
};


#endif
#include "adder.h"

adder::adder( double new_addend ) :
addend( new_addend )
{
}

double
adder::at( double x ) const
{
  return x + addend;
}

chi_squared_base

/*
*/

#ifndef CHI_SQUARED_BASE_H
#define CHI_SQUARED_BASE_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class chi_squared_base : public single_variable_function
{
 protected:
  int n;
 public:
  virtual double at( double x ) const;
  double base( void ) const;
  chi_squared_base( int new_n );
};

#endif


/*
*/
/*
*/

#include "chi_squared_base.h"

#ifndef GAMMA_FUNCTION_H
#include "gamma_function.h"
#endif

#include <math.h>
#include <iostream>

double
chi_squared_base::at( double x )  const
{
  const double Result = base()
    * pow( x, ( static_cast<double>(n) / 2.0 - 1.0 ) )
    * exp( -x / 2.0 );
//    if ( x == 0 || x == 10 )
//      {
//        gamma_function gamma;
//        cout << "Chibase at " << x << "\n"
//  	   << "n: " << n << "\n"
//  	   << "2^(n/2) " << pow(2.0,static_cast<double>(n)/2.0) << "\n"
//  	   << "gamma(n/2)" << gamma.at( static_cast<double>(n)/2.0 ) << "\n"
//  	   << "Base: " << base() << "\n"
//  	   << "x^(n/2 - 1): " << pow( x, ( static_cast< double >(n) / 2.0 - 1.0 ) ) << "\n"
//  	   << "e^(-x/2) :" << exp( -x / 2.0 ) << "\n";
//      }
  return Result;
}

chi_squared_base::chi_squared_base( int new_n ) :
n ( new_n )
{
}

double
chi_squared_base::base( void ) const
{
  gamma_function gamma;
  const double Result = 1 / ( pow( 2.0, static_cast<double>(n)/2.0 ) * gamma.at( static_cast<double>(n)/2.0 ) );
  return Result;
}

/*
*/

chi_squared_integral

#ifndef CHI_SQUARED_INTEGRAL_H
#define CHI_SQUARED_INTEGRAL_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class chi_squared_integral : public single_variable_function
{
 protected:
  int n;
 public:
  virtual double at( double x ) const;
  chi_squared_integral( int new_n );
};

#endif
/*
*/

#include "chi_squared_integral.h"

#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef SIMPSON_INTEGRATOR_H
#include "simpson_integrator.h"
#endif

#include <math.h>

double
chi_squared_integral::at( double x ) const
{
  REQUIRE( x >= 0 );
  simpson_integrator homer;
  chi_squared_base chi_base( n );
  double Result = homer.integral( chi_base, 0, x );
  return Result;
}

chi_squared_integral::chi_squared_integral( int new_n ) :
n( new_n )
{
  REQUIRE( new_n > 0 );
}

/*
*/

normal_distribution_inverse

#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#define NORMAL_DISTRIBUTION_INVERSE_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class normal_distribution_inverse : public single_variable_function
{
 public:
  virtual double at( double x ) const;
  double next_guess( double arg, double last_result, double target ) const;
};


#endif
/*
*/

#include "normal_distribution_inverse.h"

#ifndef NORMAL_DISTRIBUTION_INTEGRAL_H
#include "normal_distribution_integral.h"
#endif

#include <math.h>

double
normal_distribution_inverse::at( double x ) const
{
  const double target = x;
  const double error_margin = 0.0000001;
  double last_error = 0;
  double Result = 0;
  double this_result = 0;
  bool has_tried_once = false;
  normal_distribution_integral norm;
  while ( !has_tried_once || ( last_error > error_margin ) )
    {
      const double last_result = this_result;
      this_result = norm.at( Result );
      last_error = fabs( this_result - target );
      if ( last_error > error_margin ) 
	{
	  Result = next_guess( Result, last_result, target );
	}
      has_tried_once = true;
    }
  return Result;
}


double
normal_distribution_inverse::next_guess( double arg, double last_result, double target ) const
{
  double Result = arg + ( target - last_result );
  return Result;
}

/*
*/

normalizer

#ifndef NORMALIZER_H
#define NORMALIZER_H

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class normalizer : public single_variable_function
{
 protected:
  double data_mean;
  double standard_deviation;
 public:
  normalizer( double new_data_mean, double new_standard_deviation );
  virtual double at( double x ) const;
};

#endif
#include "normalizer.h"

normalizer::normalizer( double new_data_mean, double new_standard_deviation ) :
data_mean( new_data_mean ),
standard_deviation( new_standard_deviation )
{
}

double
normalizer::at( double x ) const
{
  return ( x - data_mean ) / standard_deviation;
}

number_array_list

/*
*/

#ifndef NUMBER_ARRAY_LIST_H
#define NUMBER_ARRAY_LIST_H

#include <list>
#include <vector>
#include <string>

#ifndef SINGLE_VARIABLE_FUNCTION_H
#include "single_variable_function.h"
#endif

class number_array_list : public std::list< std::vector< double > >
{
 protected:
  int m_field_count;

 public:
  int field_count( void ) const;
  std::vector< double > head( void ) const;
  number_array_list tail( void ) const;
  number_array_list suffixed_by_item( const std::vector< double >& entry ) const;
  number_array_list suffixed_by_list( const number_array_list& rhs ) const;
  number_array_list mapped_to( int field_index, const single_variable_function& f ) const;
  number_array_list multiplied_by_list( int field_index, const number_array_list& rhs ) const;
  number_array_list sorted_by( int field_index ) const;
  number_array_list items_less_than( int field_index, double value ) const;
  number_array_list items_greater_than_or_equal_to( int field_index, double value ) const;

  bool is_valid_entry( const std::vector< double >& entry ) const;
  bool is_valid_field_index( int field_index ) const;
  bool is_sorted_by( int field_index ) const;
  double sum_by_field( int field_index ) const;
  double mean_by_field( int field_index ) const;
  int entry_count( void ) const;
  
  void set_field_count( int new_field_count );
  void add_entry( const std::vector< double >& entry );

  void make_from_entry( const std::vector< double >& entry );
  void make( void );
  void make_from_list( const number_array_list& rhs );
  number_array_list( void );

  double standard_deviation( int field_index, bool is_full_population ) const;
  double variance( int field_index, bool is_full_population ) const;
  number_array_list normalized_series( int field_index ) const;
  int normalization_s_value( void ) const;
  number_array_list normalized_series_table( int field_index ) const;
  number_array_list normal_distribution_segments( int s ) const;
  number_array_list chi_squared_table( int field_index ) const;
  double chi_squared_q( int field_index ) const;
  double chi_squared_1_minus_p( int field_index ) const;
  number_array_list select_series( int field_index ) const;
  int items_in_range( int field_index, double lower_limit, double upper_limit ) const;
  std::string entry_as_string( const std::vector< double >& entry ) const;
  std::string table_as_string( void ) const;

};

#endif

/*
*/
/*
*/

#include "number_array_list.h"

#ifndef CONTRACT_H
#include "contract.h"
#endif
#ifndef ADDER_H
#include "adder.h"
#endif
#ifndef SQUARE_H
#include "square.h"
#endif
#ifndef NORMAL_DISTRIBUTION_INVERSE_H
#include "normal_distribution_inverse.h"
#endif
#ifndef CHI_SQUARED_INTEGRAL_H
#include "chi_squared_integral.h"
#endif
#ifndef NORMALIZER_H
#include "normalizer.h"
#endif

#include <math.h>
#include <stdio.h>

int
number_array_list::field_count( void ) const
{
  return m_field_count;
}

std::vector< double >
number_array_list::head( void ) const
{
  REQUIRE( entry_count() >= 1 );
  return *(begin());
}

number_array_list
number_array_list::tail( void ) const
{
  number_array_list Result;
  if ( entry_count() > 1 )
    {
      number_array_list::const_iterator tail_head = begin();
      ++tail_head;
      Result.insert( Result.begin(), tail_head, end() );
      Result.m_field_count = field_count();
      ENSURE( Result.entry_count() == entry_count() - 1 );
    }
  return Result;
}

number_array_list
number_array_list::suffixed_by_item( const std::vector< double >& entry ) const
{
  number_array_list Result;
  Result.make_from_list( *this );
  Result.add_entry( entry );
  ENSURE( Result.entry_count() == entry_count() + 1 );
  return Result;
}

number_array_list
number_array_list::suffixed_by_list( const number_array_list& rhs ) const
{
  number_array_list Result;
  Result.make_from_list( *this );
  for ( number_array_list::const_iterator iter = rhs.begin(); iter != rhs.end(); ++iter )
    {
      Result.add_entry( *iter );
    }
  ENSURE( Result.entry_count() == entry_count() + rhs.entry_count() );
  return Result;
}

number_array_list
number_array_list::mapped_to( int field_index, const single_variable_function& f ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector<double> new_entry = head();
      new_entry[ field_index ] = f.at( head()[ field_index ] );
      Result.make_from_list( Result.suffixed_by_item( new_entry ).suffixed_by_list( tail().mapped_to( field_index, f )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::multiplied_by_list( int field_index, const number_array_list& rhs ) const
{
  REQUIRE( entry_count() == rhs.entry_count() );
  REQUIRE( field_count() == rhs.field_count() );
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector< double > new_entry = head();
      new_entry[ field_index ] = head()[ field_index ] * rhs.head()[field_index];
      Result.make_from_list( Result.suffixed_by_item( new_entry ).
			      suffixed_by_list( tail().multiplied_by_list( field_index, rhs ) ) );
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  if ( is_sorted_by( field_index ) )
    {
      Result.make_from_list( *this );
    }
  else
    {
      Result.make_from_list( tail().items_less_than( field_index, head()[field_index] ).sorted_by( field_index ).
			     suffixed_by_item( head() ).
			     suffixed_by_list( tail().items_greater_than_or_equal_to( field_index, head()[field_index] ).sorted_by( field_index )));
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::items_less_than( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] < value )
	{
	  Result.add_entry( *iter );
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

number_array_list
number_array_list::items_greater_than_or_equal_to( int field_index, double value ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  number_array_list Result;
  for ( number_array_list::const_iterator iter = begin(); iter != end(); ++iter )
    {
      if ( (*iter)[field_index] >= value )
	{
	  Result.add_entry(*iter);
	}
    }
  ENSURE( Result.entry_count() <= entry_count() );
  return Result;
}

bool
number_array_list::is_valid_entry( const std::vector<double>& entry ) const
{
  bool Result = false;
  if ( ( entry_count() == 0 ) 
       || ( ( entry_count() > 0 ) && ( entry.size() == field_count() ) ) )
    {
      Result = true;
    }
  return Result;
}

bool
number_array_list::is_valid_field_index( int field_index ) const
{
  bool Result = true;
  if ( entry_count() > 0 )
    {
      Result = ( 0 <= field_index ) && ( field_index < m_field_count );
    }
  return Result;
}

bool
number_array_list::is_sorted_by( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  bool Result = true;
  if ( entry_count() > 1 )
    {
      Result = ( head()[field_index] < tail().head()[field_index] ) && tail().is_sorted_by( field_index );
    }
  return Result;
}

double
number_array_list::sum_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) )
  double Result = 0;
  if ( entry_count() > 0 )
    {
      Result = head()[field_index] + tail().sum_by_field( field_index );
    }
  return Result;
}

double
number_array_list::mean_by_field( int field_index ) const
{
  REQUIRE( is_valid_field_index( field_index ) );
  REQUIRE( entry_count() > 0 );
  double Result = sum_by_field( field_index ) / static_cast<double>( entry_count() );
  return Result;
}

int
number_array_list::entry_count( void ) const
{
  return size();
}

void
number_array_list::set_field_count( int new_field_count )
{
  REQUIRE( entry_count() == 0 );
  m_field_count = new_field_count;
  ENSURE( field_count() == new_field_count );
}

void
number_array_list::add_entry( const std::vector< double >& entry )
{
  if ( entry_count() == 0 )
    {
      set_field_count( entry.size() );
    }
  REQUIRE( is_valid_entry( entry ) );
  const int old_entry_count = entry_count();
  push_back( entry );
  ENSURE( entry_count() == old_entry_count + 1 );
}

void
number_array_list::make_from_entry( const std::vector<double>& entry )
{
  make();
  add_entry( entry );
  ENSURE( entry_count() == 1 );
  ENSURE( head() == entry );
}

void
number_array_list::make( void )
{
  clear();
  m_field_count = -1;
  ENSURE( m_field_count == -1 );
  ENSURE( entry_count() == 0 );
}

void
number_array_list::make_from_list( const number_array_list& rhs )
{
  make();
  insert( begin(), rhs.begin(), rhs.end() );
  m_field_count = rhs.field_count();
  ENSURE( entry_count() == rhs.entry_count() );
}

number_array_list::number_array_list( void )
{
  make();
}

double
number_array_list::standard_deviation( int field_index, bool is_full_population ) const
{
  double Result = sqrt( variance( field_index, is_full_population ) );
  return Result;
}

double
number_array_list::variance( int field_index, bool is_full_population ) const
{
  adder an_adder( -mean_by_field( field_index ) );
  square a_square;
  if ( is_full_population )
    {
      CHECK( entry_count() > 0 );
    }
  else
    {
      CHECK( entry_count() > 1 );
    }
  const double divisor = static_cast< double >( is_full_population ? entry_count() : entry_count() - 1 );
  const double Result = mapped_to( field_index, an_adder ).
    mapped_to( field_index, a_square ).sum_by_field( field_index ) / divisor;
  return Result;
}

number_array_list
number_array_list::normalized_series( int field_index ) const
{
  //cout << "Standard dev: " << standard_deviation( field_index, false ) << "\n";
  normalizer a_normalizer( mean_by_field( field_index ), standard_deviation( field_index, false ) );
  number_array_list Result = mapped_to( field_index, a_normalizer ).select_series( field_index );
  //cout << "**in normalized_series**\n" << mapped_to( field_index, a_normalizer ).table_as_string() << "\n";
  ENSURE( Result.entry_count() == entry_count() );
  //cout << "**normalized_series result**\n" << Result.table_as_string() << "\n";
  return Result;
}

int
number_array_list::normalization_s_value( void ) const
{
  REQUIRE( entry_count() % 5 == 0 );
  int Result = entry_count() / 5;
  return Result;
}

number_array_list
number_array_list::normalized_series_table( int field_index ) const
{
  REQUIRE( entry_count() > 0 );
  number_array_list sorted_list;
  sorted_list.make_from_list( sorted_by( field_index ) );
  number_array_list sorted_normalized_list = sorted_list.normalized_series( field_index );
  //cout << "Sorted list: \n" << sorted_list.table_as_string() << "\n";
  //cout << "sorted normalized_list: \n" << sorted_normalized_list.table_as_string() << "\n";
  CHECK( sorted_list.entry_count() == entry_count() );
  CHECK( sorted_normalized_list.entry_count() == entry_count() );
  number_array_list::const_iterator sorted_list_iter;
  number_array_list::const_iterator sorted_normalized_list_iter;
  int i = 0;
  number_array_list Result;
  for( i = 0, sorted_list_iter = sorted_list.begin(), sorted_normalized_list_iter = sorted_normalized_list.begin();
       sorted_list_iter != sorted_list.end() && sorted_normalized_list_iter != sorted_normalized_list.end();
       )//++i, ++sorted_list_iter, ++sorted_normalized_list_iter )
    {
      std::vector<double>new_entry;
      new_entry.push_back( static_cast<double>(i + 1) );
      new_entry.push_back( static_cast<double>(i + 1) / static_cast<double>(entry_count()));
      new_entry.push_back( (*sorted_list_iter)[ field_index ] );
      new_entry.push_back( (*sorted_normalized_list_iter)[ 0 ] );
      Result.add_entry( new_entry );

      ++i;
      ++sorted_list_iter;
      ++sorted_normalized_list_iter;
    }
  ENSURE( Result.entry_count() == entry_count() );
  return Result;
}

number_array_list
number_array_list::normal_distribution_segments( int s ) const
{
  REQUIRE( s > 0 );
  const double norm_floor = -10000;
  const double norm_ceiling = 10000;
  normal_distribution_inverse norm_inverse;
  number_array_list Result;
  for ( int i = 1; i <= s; ++i )
    {
      std::vector< double > new_entry;
      if ( i == 1 ) 
	{
	  new_entry.push_back( norm_floor );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast<double>(i - 1) ) 
						* ( 1.0 / static_cast< double >( s ) )));
	}
      if ( i == s )
	{
	  new_entry.push_back( norm_ceiling );
	}
      else
	{
	  new_entry.push_back( norm_inverse.at( ( static_cast< double >( i ) )
						* ( 1.0 / static_cast< double >( s ) )));
	}
      Result.add_entry( new_entry );
    }
  return Result;
}

number_array_list
number_array_list::chi_squared_table( int field_index ) const
{
  number_array_list normal_segments = normal_distribution_segments( normalization_s_value() );
  number_array_list norm_table = normalized_series_table( field_index );
  number_array_list Result;
  square a_square;
  //cout << "norm table\n" << norm_table.table_as_string() << "\n\n";
  int i = 0;
  number_array_list::const_iterator iter;
  for ( i = 0, iter = normal_segments.begin(); iter != normal_segments.end(); ++iter, ++i )
    {
      std::vector< double > new_entry;
      new_entry.push_back( i );
      new_entry.push_back( (*iter)[0] );
      new_entry.push_back( (*iter)[1] );
      new_entry.push_back( static_cast< double >(norm_table.entry_count()) / norm_table.normalization_s_value() );
      new_entry.push_back( static_cast< double >(norm_table.items_in_range( 3, new_entry[1], new_entry[2] ) ) );
      new_entry.push_back( a_square.at( new_entry[3] - new_entry[4] ) );
      new_entry.push_back( new_entry[5] / new_entry[ 3 ] );
      Result.add_entry( new_entry );
    }
  return Result;
}

double
number_array_list::chi_squared_q( int field_index ) const
{
  //cout << chi_squared_table( field_index ).table_as_string();  
  double Result = chi_squared_table( field_index ).sum_by_field( 6 );
  return Result;
}

#ifndef CHI_SQUARED_BASE_H
#include "chi_squared_base.h"
#endif

double
number_array_list::chi_squared_1_minus_p( int field_index ) const
{
  chi_squared_integral chi( normalization_s_value() - 1 );
  double Result = 1 - chi.at( chi_squared_q( field_index ) );
  return Result;
}

number_array_list
number_array_list::select_series( int field_index ) const
{
  number_array_list Result;
  if ( entry_count() > 0 )
    {
      std::vector< double >new_entry;
      new_entry.push_back( head()[field_index] );
      Result.add_entry( new_entry );
    }
  if ( entry_count() > 1 )
    {
      Result.make_from_list( Result.suffixed_by_list( tail().select_series( field_index ) ));
    }
  ENSURE( Result.entry_count() == entry_count() );
  ENSURE( Result.field_count() == 1 );
  return Result;
}

int
number_array_list::items_in_range( int field_index, double lower_limit, double upper_limit ) const
{
  int Result = 0;
  if (( entry_count() > 0 )
      && ( head()[field_index] > lower_limit )
      && ( head()[field_index] <= upper_limit ) )
    {
      Result = 1;
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().items_in_range( field_index, lower_limit, upper_limit );
    }
  return Result;
}

std::string double_as_string( double x ) 
{
  char buffer[ 1000 ];
  sprintf( buffer, "%f", x );
  return std::string( buffer );
}

std::string
number_array_list::entry_as_string( const std::vector< double >& entry ) const
{
  std::string Result = "( ";
  for ( std::vector< double >::const_iterator iter = entry.begin(); iter != entry.end(); ++iter )
    {
      Result = Result + double_as_string( *iter );
      if ( ( iter + 1 ) != entry.end() )
	{
	  Result = Result + ", ";
	}
    }
  Result = Result + " )";
  return Result;
}  

std::string
number_array_list::table_as_string( void ) const
{
  std::string Result = "";
  if ( entry_count() > 0 ) 
    {
      Result = Result + entry_as_string( head() ) + "\n";
    }
  if ( entry_count() > 1 )
    {
      Result = Result + tail().table_as_string();
    }
  return Result;
}

/*
*/

number_array_list_parser_2

#ifndef NUMBER_ARRAY_LIST_PARSER_H
#define NUMBER_ARRAY_LIST_PARSER_H

#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif
#ifndef NUMBER_ARRAY_LIST_H
#include "number_array_list.h"
#endif

class number_array_list_parser_2: public simple_input_parser
{
 protected:
  number_array_list number_list;
  enum state_t { Parsing_historical_data, Parsing_commands };
  state_t state;

 public:
  virtual std::string transformed_line (const std::string & line) const;
  std::string string_stripped_of_comments (const std::string & str) const;
  static bool is_double (const std::string & str);
  static double double_from_string (const std::string & str);
  static const std::string & historical_data_terminator;
  static const std::string & inline_comment_begin;
  bool last_line_is_blank (void);

  virtual void parse_last_line (void);
  void parse_last_line_as_historical_data (void);
  void parse_last_line_as_end_of_historical_data (void);
  void parse_last_line_as_command( void );
  void print_normalization_check( int field_index );
  void make( void );
  number_array_list_parser_2( void );

  bool last_line_is_valid_historical_data( void ) const;
  static std::vector< std::string > split_string( const std::string& string_to_split, const std::string& separator );
  static std::string string_before_separator( const std::string& string_to_split, const std::string& separator );
  static std::string string_after_separator( const std::string& string_to_split, const std::string& separator );
  static std::vector< double > numbers_from_string( const std::string& string_to_split );

};

#endif
/*
*/

#include "number_array_list_parser_2.h"
#ifndef WHITESPACE_STRIPPER_H
#include "whitespace_stripper.h"
#endif
#ifndef ERROR_LOG_H
#include "error_log.h"
#endif
#ifndef CONTRACT_H
#include "contract.h"
#endif

void
number_array_list_parser_2::make (void)
{
  simple_input_parser::reset ();
  state = Parsing_historical_data;
  number_list.make ();
}

std::string number_array_list_parser_2::transformed_line (const std::string & str) const
{
  return whitespace_stripper::string_stripped_of_whitespace (string_stripped_of_comments (str));
}

std::string
  number_array_list_parser_2::string_stripped_of_comments (const std::string & str) const
{
  const std::string::size_type comment_index = str.find (inline_comment_begin);
  return str.substr (0, comment_index);
}

bool 
number_array_list_parser_2::is_double (const std::string & str)
{
  bool
    Result = true;
  char *
    conversion_end = NULL;
  strtod (str.c_str (), &conversion_end);
  if (conversion_end == str.data ())
    {
      Result = false;
    }
  return Result;
}

double
number_array_list_parser_2::double_from_string (const std::string & str)
{
  REQUIRE (is_double (str));
  return strtod (str.c_str (), NULL);
}


const
  std::string & number_array_list_parser_2::historical_data_terminator = "stop";

const
  std::string & number_array_list_parser_2::inline_comment_begin = "--";

bool number_array_list_parser_2::last_line_is_blank (void)
{
  if (last_line ().length () == 0)
    {
      return true;
    }
  else
    {
      return false;
    }
}

void
number_array_list_parser_2::parse_last_line (void)
{
  if (last_line_is_blank ())
    {
      return;
    }
  if ( state == Parsing_historical_data )
    {
      if ( last_line () == historical_data_terminator)
	{
	  parse_last_line_as_end_of_historical_data ();
	}
      else
	{
	  parse_last_line_as_historical_data ();
	}
    }
  else
    {
      parse_last_line_as_command();
    }
}


void
number_array_list_parser_2::parse_last_line_as_historical_data (void)
{
  if ( last_line_is_valid_historical_data() )
    {
      const std::vector< double >this_entry = numbers_from_string( last_line() );
      number_list.add_entry( this_entry );
    }
  else
    {
      error_log errlog;
      errlog.log_error( std::string( "Invalid data entry: " ) + last_line() );
    }
}
  
void
number_array_list_parser_2::parse_last_line_as_end_of_historical_data (void)
{
  REQUIRE (last_line () == historical_data_terminator);
  cout << "Historical data read; " 
       << number_list.entry_count() << " entries, " 
       << number_list.field_count() << " fields.\n";
  state = Parsing_commands;
}

number_array_list_parser_2::number_array_list_parser_2 (void)
{
  make();
}


bool
number_array_list_parser_2::last_line_is_valid_historical_data( void ) const
{
  const std::vector< std::string > substrings = split_string( last_line(), "," );
  bool Result = true;
  //make sure we have a valid field count for the number_array_list
  if ( ( number_list.entry_count() > 0 ) && ( substrings.size() != number_list.field_count() ) )
    {
      Result = false;
    }
  //...and that each substring can be interpreted as a double
  for ( int i = 0; i < substrings.size(); ++i )
    {
      if ( !is_double( substrings[ i ] ) )
	{
	  Result = false;
	}
    }
  return Result;
}

std::vector< std::string >
number_array_list_parser_2::split_string( const std::string& string_to_split, const std::string& separator ) 
{
  std::vector< std::string > Result;
  const std::string prefix = string_before_separator( string_to_split, separator );
  const std::string remainder = string_after_separator( string_to_split, separator );
  Result.push_back( prefix );
  if ( remainder.size() > 0 )
    {
      const std::vector< std::string > split_remainder = split_string( remainder, separator );
      Result.insert( Result.end(), split_remainder.begin(), split_remainder.end() );
    }
  return Result;
}

std::string
number_array_list_parser_2::string_before_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
    //not found; result is entire string
      Result = string_to_split;
    }
  else
    {
      Result = string_to_split.substr( 0, separator_position );
    }
  return Result;
}

std::string
number_array_list_parser_2::string_after_separator( const std::string& string_to_split, const std::string& separator )
{
  const std::string::size_type separator_position = string_to_split.find( separator );
  std::string Result;
  if ( separator_position == string_to_split.npos )
    {
      //not found; result is empty
      Result = "";
    }
  else
    {
      Result = string_to_split.substr( separator_position + separator.size(), string_to_split.size() );
    }
  return Result;
}

std::vector< double >
number_array_list_parser_2::numbers_from_string( const std::string& string_to_split )
{
  const std::vector< std::string > number_strings = split_string( string_to_split, "," );
  std::vector< double > Result;
  for ( std::vector< std::string >::const_iterator iter = number_strings.begin(); iter != number_strings.end(); ++iter )
    {
      CHECK( is_double( *iter ) );
      const double new_value = double_from_string( *iter );
      Result.push_back( new_value );
    }
  return Result;
}

void
number_array_list_parser_2::parse_last_line_as_command( void )
{
  std::string command = string_before_separator( last_line(), " " );
  std::vector< string > arguments = split_string( string_after_separator( last_line(), " " ), " " );
  if ( command == "normality_check_on_series" )
    {
      print_normalization_check( static_cast< int >( double_from_string( arguments[0] ) ) );
    }
  else
    {
      cout << "unknown command: " << last_line() << "\n";
    }
}

void
number_array_list_parser_2::print_normalization_check( int field_index )
{
  cout << "Normalization check on series: " << field_index << "\n";
  cout << "Q: " << number_list.chi_squared_q( field_index ) << "\n";
  cout << "(1-p): " << number_list.chi_squared_1_minus_p( field_index ) << "\n";
}

/*
*/

main

main.cpp

adder.e

class ADDER
   
inherit
   SINGLE_VARIABLE_FUNCTION   
      redefine
	 at
      
creation
   make

feature {NONE}
   
   addend : DOUBLE
	 --number added to each argument in at
   
   make( new_addend : DOUBLE ) is
	 --create with given addend
      do
	 addend := new_addend
      end
   
feature {ANY}
   
   at( x : DOUBLE ) : DOUBLE is
	 --x + addend
      do
	 Result := x + addend
      end
   
end
   

chi_squared_base.e

class CHI_SQUARED_BASE
   --base calculation for the chi-squared distribution
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at;
	 
creation
   make

feature {ANY}
   
   n : INTEGER
	 --sample size of the distribution
   
   make( new_n : INTEGER ) is
	 --creation, setting the sample size
      require
	 new_n > 0
      do
	 n := new_n;
      end
   
   base : DOUBLE is
	 --base of the equation
      local
	 gamma : GAMMA_FUNCTION
      do
	 !!gamma
	 Result := 1 / ( ( 2 ^ n ).sqrt * gamma.at( n.to_double / 2.0 ) )
      end
   
   at( x : DOUBLE ) : DOUBLE is
	 --value of the function at x
      do
	 Result := base * ( ( x ^ (n-2) ).sqrt ) * ( -x/2.0 ).exp
      end
   
end

	 

chi_squared_integral.e

class CHI_SQUARED_INTEGRAL
   
inherit
   SINGLE_VARIABLE_FUNCTION   
   redefine
      at;
      
creation
   make
   
feature {ANY}
   
   n : INTEGER
	 --size of the distribution, in samples
   
   make( new_n : INTEGER ) is
	 --creation
      require
	 new_n > 0
      do
	 n := new_n
      end
   
   at( x : DOUBLE ) : DOUBLE is
	 --integral evaluated from zero to the given number
      require
	 x > 0.0
      local
	 homer : SIMPSON_INTEGRATOR
	 chi_base : CHI_SQUARED_BASE
      do
	 !!homer.make
	 !!chi_base.make(n)
	 Result := homer.integral( chi_base, 0, x )
      end
   
end

   

normal_distribution_inverse.e

class NORMAL_DISTRIBUTION_INVERSE
   --inverse of the normal distribution, used to make segment tables 
   --for normalization fit
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at;
	 
feature { ANY }   

   at( x : DOUBLE ) : DOUBLE is
	 --inverse of the normal distribution
      local
	 target : DOUBLE
	 last_error : DOUBLE
	 last_result : DOUBLE
	 this_result : DOUBLE
	 has_tried_once : BOOLEAN
	 normal : NORMAL_DISTRIBUTION_INTEGRAL
	 error_margin : DOUBLE
      do
	 from
	    target := x;
	    last_error := 0;
	    last_result := 0;
	    this_result := 0;
	    Result := 0
	    has_tried_once := false
	    error_margin := 0.0000001
	    !!normal.make
	 until
	    has_tried_once and error_margin > last_error   
	 loop
	    last_result := this_result   
	    this_result := normal.at( Result );
	    last_error := (this_result - target).abs
	    if ( last_error > error_margin ) then
	       Result := next_guess( Result, last_result, target )
	    end
	    has_tried_once := true;   
	 end
      end   
   
   next_guess( arg, last_result, target : DOUBLE ) : DOUBLE is
      --next guess in the iterative scheme of things
      do
	 Result := arg + ( target - last_result )
      end
   
end
   

normalizer.e

class NORMALIZER
   
inherit
   SINGLE_VARIABLE_FUNCTION
      redefine
	 at
	 
creation
   make

feature { NONE }
   
   data_mean : DOUBLE
   
   standard_deviation : DOUBLE
   
   make( new_data_mean, new_standard_deviation : DOUBLE ) is
	 --create with given feature values
      do
	 data_mean := new_data_mean
	 standard_deviation := new_standard_deviation
      end
   
feature {ANY}
   
   at( x : DOUBLE ) : DOUBLE is
	 --(x-mean)/standard_dev
      do
	 Result := ( x - data_mean ) / standard_deviation
      end
   
end
   

number_array_list.e

class NUMBER_ARRAY_LIST
   
inherit
   LINKED_LIST[ ARRAY[ DOUBLE ] ]
      redefine
	 make;
   
creation
   make, make_from_entry, make_from_list
      
feature { ANY }
   
   field_count : INTEGER
	 --number of fields allowed in an entry
   
   head : like item is
      --first item
      require
	 entry_count >= 1
      do
	 Result := first;
      end
   
   tail : like Current is
	 --all items after the first
      do
	 !!Result.make
	 if ( entry_count > 1 ) then
	    Result := slice( lower + 1, upper );
	    Result.set_field_count( field_count )
	 end --if
      ensure
	 ( entry_count > 1 ) implies Result.entry_count = entry_count - 1 and Result.field_count = field_count
      end
   
   suffixed_by_list( rhs: like Current ) : like Current is
	 --the list, suffixed by another list
      local
	 i : INTEGER
      do
	 !!Result.make_from_list( Current );
	 from
	    i := rhs.lower
	 until
	    not rhs.valid_index ( i )
	 loop
	    Result.add_entry( rhs.item( i ) );
	    i := i + 1
	 end --from
      ensure
	 Result.entry_count = entry_count + rhs.entry_count
      end
   
   make_from_entry( entry : like item ) is
	 --clear the list, then add the entry
      do
	 make
	 add_entry( entry )
      ensure
	 head.is_equal( entry )
	 field_count = entry.count
      end
   
   make is
	 --clear the list
      do
	 Precursor
	 field_count := -1
      ensure
	 entry_count = 0
	 field_count = -1
      end
   
   make_from_list( rhs: like Current ) is
	 --clear the list, setting it equal to another list
      do
	 from_collection( rhs )
	 field_count := rhs.field_count
      end
   
   sum_by_field( field_index : INTEGER ) : DOUBLE is
	 --the sum of a given field over all entries
      require
	 is_valid_field_index( field_index )
      do
	 if entry_count = 0 then
	    Result := 0
	 else
	    Result := head.item( field_index ) + tail.sum_by_field( field_index )
	 end
      end
   
   mean_by_field( field_index : INTEGER ) : DOUBLE is
	 --the mean of a given field over all entries
      require
	 is_valid_field_index( field_index )
	 entry_count >= 1
      do
	 Result := sum_by_field( field_index ) / entry_count.to_double
      end
   
   entry_count : INTEGER is
	 --the number of entries
      do
	 Result := count
      end
   
   add_entry( new_entry : like item ) is
	 -- adds an entry to the end of the list
      require
	 is_valid_entry( new_entry )
      do
	 if entry_count = 0 then
	    set_field_count( new_entry.count )
	 end
	 add_last( new_entry )
      ensure
	 entry_count = old entry_count + 1
      end
   
   mapped_to( field_index : INTEGER; f: SINGLE_VARIABLE_FUNCTION ) : like Current is
	 --the elements, with the given field mapped to f
      require
	 is_valid_field_index( field_index )
      local
	 new_entry : like item
      do
	 !!Result.make
	 if entry_count > 0 then
	    new_entry := head.twin
	    new_entry.put( f.at( head.item(field_index) ), field_index )
	    Result.add_entry( new_entry );
	    Result := Result.suffixed_by_list( tail.mapped_to( field_index, f ) )
	 end -- if
      ensure
	 Result.entry_count = entry_count
      end
   
   multiplied_by_list( field_index : INTEGER; rhs : like Current ) : like Current is
      --the elements, with the given field multiplied by the 
	 --corresponding field in rhs
      require
	 entry_count = rhs.entry_count
	 field_count = rhs.field_count
	 is_valid_field_index( field_index )
      local
	 new_entry : like item
      do
	 !!Result.make
	 if entry_count > 0 then
	    new_entry := head
	    new_entry.put( head.item( field_index ) * rhs.head.item( field_index ), field_index )
	    Result.add_entry( new_entry )
	    Result := Result.suffixed_by_list( tail.multiplied_by_list( rhs.tail ) )
	 end
      ensure
	 Result.entry_count = entry_count
      end
   
   sorted_by( field_index : INTEGER ) : like Current is
	 --the list, sorted by the given field
      require
	 is_valid_field_index( field_index )
      do
	 if is_sorted_by( field_index ) then
	    !!Result.make_from_list( Current )
	 else
	    !!Result.make_from_list( tail.items_less_than( field_index, head.item( field_index ) ).sorted_by( field_index ).
				     suffixed_by_item( head ).
				     suffixed_by_list( tail.items_greater_than_or_equal_to( field_index, head.item( field_index ) ).sorted_by( field_index ) ) )
	 end -- if
      ensure
	 Result.entry_count = entry_count
      end
   
   is_valid_entry( entry : like item ) : BOOLEAN is
	 --whether entry is a valid entry
      do
	 if ( entry_count = 0 and entry.count > 0 ) or ( entry.count = field_count and entry.lower = 1 ) then
	    Result := true
	 else
	    Result := false
	 end
      end
   
   items_less_than( field_index : INTEGER; value : DOUBLE ) : like Current is
	 --list of items less than the given value in the given field
      require
	 is_valid_field_index( field_index )
      local
	 i : INTEGER
      do
	 !!Result.make
	 from
	    i := lower
	 until
	    not valid_index( i )
	 loop
	    if not( item( i ).item( field_index ) >= value ) then
	       Result.add_entry( item( i ) )
	    end
	    i := i + 1
	 end
      ensure
	 Result.entry_count <= entry_count
      end
   
   items_greater_than_or_equal_to( field_index : INTEGER; value : DOUBLE ) : like Current is
      --list of items greater than or equal to the given value in the 
	 --given field
      require
	 is_valid_field_index( field_index )
      local
	 i : INTEGER
      do
	 !!Result.make
	 from
	    i := lower
	 until
	    not valid_index( i )
	 loop
	    if item(i).item( field_index ) >= value then
	       Result.add_entry( item( i ) )
	    end
	    i := i + 1
	 end
      ensure
	 Result.entry_count <= entry_count
      end
   
   suffixed_by_item( entry : like item ) : like Current is
	 --the list, suffixed by a single item
      require
	 is_valid_entry( entry )
      do
	 !!Result.make_from_list( Current )
	 Result.add_entry( entry )
      ensure
	 Result.entry_count = entry_count + 1
      end
   
   set_field_count( new_field_count : INTEGER ) is
	 --sets the field count
      require
	 entry_count = 0 or ( entry_count > 0 implies head.count = new_field_count )
      do
	 field_count := new_field_count
      ensure
	 field_count = new_field_count
      end
   
   is_valid_field_index( field_index : INTEGER ) : BOOLEAN is
	 --whether the given field index is valid
      do
	 if entry_count = 0 then
	    Result := true
	 else
	    Result := ( ( 1 <= field_index ) and ( field_index <= field_count ) )
	 end
      end
   
   is_sorted_by( field_index : INTEGER ) : BOOLEAN is
	 --whether the list is sorted by the given field index
      require
	 is_valid_field_index( field_index )
      do
	 Result := true
	 if entry_count > 1 then
	    Result := ( head.item( field_index ) < tail.head.item( field_index ) ) 
	       and tail.is_sorted_by( field_index )
	 end
      end
   
feature {ANY}
   --chi-squared distribution calcs
   
   variance( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
      require
	 is_valid_field_index( field_index )
	 is_full_population implies entry_count > 0
	 ( not is_full_population ) implies entry_count > 1
      local
	 divisor : DOUBLE
	 adder : ADDER
	 square : SQUARE
      do
	 if is_full_population then
	    divisor := entry_count.to_double
	 else
	    divisor := (entry_count - 1).to_double
	 end
	 !!adder.make( - mean_by_field( field_index ) )
	 !!square
	 Result := ( mapped_to( field_index, adder ).
		     mapped_to( field_index, square ).
		     sum_by_field( field_index ) ) / divisor
      end
   
   standard_deviation( field_index : INTEGER; is_full_population : BOOLEAN ) : DOUBLE is
      require
	 is_valid_field_index( field_index )
      do
	 Result := variance( field_index, is_full_population ).sqrt
      end
   
   normalized_series( field_index : INTEGER ) : like Current is
      --one series of the table, "normalized" into standard deviations
      require
	 is_valid_field_index( field_index )
      local
	 normalizer : NORMALIZER
      do
	 !!normalizer.make( mean_by_field( field_index ), standard_deviation( field_index, false ) )
	 Result := mapped_to( field_index, normalizer ).select_series( field_index )
      ensure
	 Result.entry_count = entry_count
      end
   
   normalization_s_value : INTEGER is
      --number of segments in a normalization table
      require
	 entry_count.divisible( 5 )
      do
	 Result := entry_count // 5
      end
   
   normalized_series_table( field_index : INTEGER ) : like Current is
      --a normalized series table, used for error-checking and used 
      --in the chi-squared normalization test
      require
	 entry_count > 0
	 is_valid_field_index( field_index )
      local
	 sorted_list : like Current
	 sorted_normalized_list : like Current
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
      do
	 sorted_list := sorted_by( field_index )
	 sorted_normalized_list := sorted_list.normalized_series( field_index )
	 check
	    sorted_list.entry_count = entry_count
	    sorted_normalized_list.entry_count = entry_count
	 end
	 from
	    i := sorted_list.lower
	    !!Result.make
	 until
	    not sorted_list.valid_index( i )
	 loop
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( i.to_double )
	    new_entry.add_last( i.to_double / entry_count.to_double )
	    new_entry.add_last( sorted_list.item( i ).item( field_index ) )
	    new_entry.add_last( sorted_normalized_list.item(i).first )
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   normal_distribution_segments( s : INTEGER ) : like Current is
	 --segments of the normal distribution; see [Humphrey95] for use
      require
	 s > 0
      local
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
	 norm_inverse : NORMAL_DISTRIBUTION_INVERSE
      do
	 from 
	    i := 1
	    !!Result.make
	    !!norm_inverse
	 until 
	    i > s
	 loop
	    !!new_entry.make( 1, 0 )
	    if i = 1 then
	       new_entry.add_last( -1000.0 )
	    else
	       new_entry.add_last( norm_inverse.at( ( i - 1 ).to_double * ( 1.0 / s.to_double ) ) )
	    end
	    if i = s then
	       new_entry.add_last( 1000.0 )
	    else
	       new_entry.add_last( norm_inverse.at( ( i.to_double * 1.0 / s.to_double ) ) )
	    end
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   chi_squared_table( field_index : integer ) : like Current is
	 -- chi-squared table, used to calculate the chi-squared distribution
      require
	 is_valid_field_index( field_index )
      local
	 norm_segments : like Current
	 norm_table : like Current
	 i : INTEGER
	 new_entry : ARRAY[ DOUBLE ]
      do
	 from	 
	    norm_segments := normal_distribution_segments( normalization_s_value )
	    norm_table := normalized_series_table( field_index )
	    !!Result.make
	    i := norm_segments.lower
	 until
	    not norm_segments.valid_index( i )
	 loop
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( i.to_double )
	    new_entry.add_last( norm_segments.item( i ).item( 1 ) )
	    new_entry.add_last( norm_segments.item( i ).item( 2 ) )
	    new_entry.add_last( norm_table.entry_count.to_double / norm_table.normalization_s_value.to_double )
	    new_entry.add_last( norm_table.items_in_range( 4, new_entry.item(2), new_entry.item(3) ))
	    new_entry.add_last( ( new_entry.item(4) - new_entry.item(5) ) ^ 2 )
	    new_entry.add_last( new_entry.item(6) / new_entry.item(4) )
	    Result.add_entry( new_entry )
	    i := i + 1
	 end
      end
   
   chi_squared_q( field_index : INTEGER ) : DOUBLE is				
	 --"Q" value of chi-squared normalization test
      do
	 Result := chi_squared_table( field_index ).sum_by_field( 7 )
      end
   
   chi_squared_1_minus_p( field_index : INTEGER ) : DOUBLE is
	 --1-p value of chi-squared normalization test
      local
	 chi_squared_integral : CHI_SQUARED_INTEGRAL
      do
	 !!chi_squared_integral.make( normalization_s_value - 1 )
	 Result := 1 - chi_squared_integral.at( chi_squared_q( field_index ) )
      end
   
   select_series( field_index : INTEGER ) : like Current is
	 --the given series, extracted as a separate list
      require
	 is_valid_field_index( field_index )
      local
	 new_entry : ARRAY[ DOUBLE ]
      do
	 if ( entry_count = 0 ) then
	    !!Result.make
	 else
	    !!new_entry.make( 1, 0 )
	    new_entry.add_last( head.item( field_index ) )
	    !!Result.make_from_entry( new_entry )
	    if entry_count > 1 then
	       Result := Result.suffixed_by_list( tail.select_series( field_index ) )
	    end
	 end
      end
   
   items_in_range( field_index : INTEGER; lower_limit, upper_limit : DOUBLE ) : INTEGER is
      --the items from the given field which fit in (lower limit < item <= upper_limit)
      require
	 is_valid_field_index( field_index )
      do	 
	 if entry_count > 0 and head.item(field_index) > lower_limit and upper_limit >= head.item( field_index ) then
	    Result := 1
	 end
	 if entry_count > 1 then
	    Result := Result + tail.items_in_range( field_index, lower_limit, upper_limit )
	 end
      end
   
   entry_as_string( entry : like item ) : STRING is
	 --entry as a string, ie "( 1, 2, 3 )"
      local
	 i : INTEGER
      do
	 !!Result.make_from_string( "( " )
	 from
	    i := entry.lower
	 until
	    not entry.valid_index( i )
	 loop
	    entry.item( i ).append_in( Result )
	    if entry.valid_index( i + 1 ) then
	       Result.append_string( ", " )
	    end
	    i := i + 1
	 end
	 Result.append_string( " )" )
      end
   
   table_as_string : STRING is
      --table, as a set of entry_as_string lines
      do
	 if entry_count > 0 then
	    Result := entry_as_string( head ) + "%N"
	    if entry_count > 1 then
	       Result.append_string( tail.table_as_string )
	    end
	 end
      end
      
end

	 
			     
	 

number_array_list_parser_2.e

class NUMBER_ARRAY_LIST_PARSER_2
--reads a list of number pairs, and performs linear regression analysis

inherit 
   SIMPLE_INPUT_PARSER
      redefine parse_last_line, transformed_line
      end; 
   
creation {ANY} 
   make

feature {ANY} 
   
   inline_comment_begin: STRING is "--";
   
   string_stripped_of_comment(string: STRING): STRING is 
      --strip the string of any comment
      local 
         comment_index: INTEGER;
      do  
         if string.has_string(inline_comment_begin) then 
            comment_index := string.index_of_string(inline_comment_begin);
            if comment_index = 1 then 
               !!Result.make_from_string( "" );
            else 
               Result := string.substring(1,comment_index - 1);
            end; 
         else 
            Result := string.twin;
         end; 
      end -- string_stripped_of_comment
   
   string_stripped_of_whitespace(string: STRING): STRING is 
      --strip string of whitespace
      do  
         Result := string.twin;
         Result.left_adjust;
         Result.right_adjust;
      end -- string_stripped_of_whitespace
   
   transformed_line(string: STRING): STRING is 
      --strip comments and whitespace from parseable line      
      do  
         Result := string_stripped_of_whitespace(string_stripped_of_comment(string));
      end -- transformed_line
   
   number_list: NUMBER_ARRAY_LIST;
   
   state : INTEGER
   
   Parsing_historical_data : INTEGER is unique
   
   Parsing_commands : INTEGER is unique
   
   Command_normality_check_on_series : STRING is
      once
	 Result := "normality_check_on_series"
      end
   
   historical_data_terminator: STRING is "stop";
   
   double_from_string(string: STRING): DOUBLE is 
      require 
         string.is_double or string.is_integer; 
      do  
         if string.is_double then 
            Result := string.to_double;
         elseif string.is_integer then 
            Result := string.to_integer.to_double;
         end; 
      end -- double_from_string
   

feature {ANY} --parsing

   reset is 
      --resets the parser and makes it ready to go again
      do  
         state := Parsing_historical_data;
         number_list.make;
      end -- reset
   
   make is 
      do  
         !!number_list.make;
         reset;
      end -- make
   
   parse_last_line_as_historical_data is 
      --interpret last_line as a pair of comma-separated values
      local 
         error_log: ERROR_LOG;
	 this_line_numbers : ARRAY[ DOUBLE ]
      do  
	 !!error_log.make
	 if last_line_is_valid_historical_data then
	    this_line_numbers := numbers_from_string( last_line )
	    number_list.add_entry( this_line_numbers )
	 else
	    error_log.log_error( "Invalid historical data: " + last_line + "%N" )
	 end
      end
   
   parse_last_line_as_end_of_historical_data is 
      --interpret last line as the end of historical data
      require 
         last_line.compare(historical_data_terminator) = 0; 
      local
	 i : INTEGER
      do  
	 state := Parsing_commands
	 std_output.put_string( "Historical data read; " + number_list.entry_count.to_string + 
				"items read%N" )
      end -- parse_last_line_as_end_of_historical_data
   
   parse_last_line_as_command is
	 --interpret the last line as a command
      local
	 command : STRING
	 arguments : ARRAY[ STRING ]
      do
	 command := string_before_separator( last_line, " " )
	 arguments := split_string( string_after_separator( last_line, " ", ), " " )
	 if command.is_equal( Command_normality_check_on_series )  then
	    print_normalization_check_on_series( arguments.first.to_integer + 1)
	 else
	    std_output.put_string( "Unrecognized command string: " + last_line )
	 end
      end
   
   parse_last_line is 
      --parse the last line according to state
      do  
         if not last_line.empty then 
	    if state = Parsing_historical_data then
	       if last_line.compare(historical_data_terminator) = 0 then 
		  parse_last_line_as_end_of_historical_data;
	       else 
                  parse_last_line_as_historical_data;
               end; 
            else
	       parse_last_line_as_command
	    end
         end; 
      end -- parse_last_line
   
   last_line_is_valid_historical_data : BOOLEAN is
	 --whether last line is valid historical data
      local
	 substrings : ARRAY[ STRING ]
	 i : INTEGER
      do
	 substrings := split_string( last_line, "," )
	 Result := true
	 if ( number_list.entry_count > 0 and substrings.count /= number_list.field_count ) then
	    Result := false;
	 end
	 --check and see if each substring is convertible to a double
	 from
	    i := substrings.lower
	 until
	    not substrings.valid_index( i )
	 loop
	    if not ( substrings.item(i).is_double or substrings.item(i).is_integer ) then
	       Result := false;
	    end
	    i := i + 1
	 end
      end
   
   split_string( string_to_split, separator : STRING ) : ARRAY[ STRING ] is
      --a list of components of a string, separated by the given 
      --separator, ie split_string( "1,2,3", "," ) = [ "1", "2", "3" ]
      local
	 prior_to_separator : STRING
	 remainder : STRING
	 split_remainder : ARRAY[ STRING ]
	 i : INTEGER
      do
	 prior_to_separator := string_before_separator( string_to_split, separator )
	 remainder := string_after_separator( string_to_split, separator )
	 !!Result.make( 1, 0 )
	 Result.add_last( prior_to_separator )
	 if remainder.count > 0 then
	    split_remainder := split_string( remainder, separator )
	    from
	       i := split_remainder.lower
	    until
	       not split_remainder.valid_index( i )
	    loop
	       Result.add_last( split_remainder.item( i ) )
	       i := i + 1
	    end
	 end
      end
   
   string_before_separator( string_to_split, separator : STRING ) : STRING is
	 --the part of a string which comes before the separator, or 
	 --the whole string if it's not found
      local
	 separator_index : INTEGER
      do
	 separator_index := string_to_split.substring_index( separator, 1 )
	 if ( separator_index = 0 ) then
	    --not found; copy whole string
	    Result := string_to_split.twin
	 else
	    Result := string_to_split.substring( 1, separator_index - 1 )
	 end
      end
   
   string_after_separator( string_to_split, separator : STRING ) : STRING is
	 --the part of the string after the separator,
      local
	 separator_index : INTEGER
      do
	 separator_index := string_to_split.substring_index( separator, 1 )
	 if ( separator_index = 0 ) then
	    --not found; result is empty
	    !!Result.make_from_string( "" )
	 else
	    Result := string_to_split.substring( separator_index + separator.count, string_to_split.count )
	 end
      end
   
   numbers_from_string( string_to_split : STRING ) : ARRAY[ DOUBLE ] is
	 --an array of numbers, from a string of numbers separated by commas
      local
	 number_strings : ARRAY[ STRING ]
	 i : INTEGER
      do
	 !!Result.make( 1, 0 )
	 number_strings := split_string( string_to_split, "," )
	 from
	    i := number_strings.lower
	 until
	    not number_strings.valid_index( i )
	 loop
	    check
	       number_strings.item(i).is_double or number_strings.item(i).is_integer
	    end
	    Result.add_last( double_from_string( number_strings.item( i ) ) )
	    i := i + 1
	 end
      end
   
   print_normalization_check_on_series( field_index : INTEGER ) is
      do
	 std_output.put_string( "Normalization check on series " )
	 std_output.put_integer( field_index )
	 --std_error.put_string( number_list.sorted_by( field_index ).table_as_string + "%N" )
	 --std_error.put_string( number_list.normalized_series_table( field_index ).table_as_string )
	 std_output.put_string( "%NQ: " )
	 std_output.put_double( number_list.chi_squared_q( field_index ) )
	 std_output.put_string( "%N(1-p): ")
	 std_output.put_double( number_list.chi_squared_1_minus_p( field_index ) )
	 std_output.put_new_line 
      end
   
end -- class NUMBER_ARRAY_LIST_PARSER_2

main.e

class MAIN

creation {ANY} 
   make

feature {ANY} 
   
   make is 
      local 
         parser: NUMBER_ARRAY_LIST_PARSER_2;
      do  
         !!parser.make;
         parser.set_input(io);
         parser.parse_until_eof;
      end -- make

end -- MAIN

Code Review

I need to get better at code reviews. On both the Eiffel and C++ side, I missed several obvious problems-- on the Eiffel side, a problem which caused a great deal of delay (I had forgotten that Eiffel programs only copy by reference, and I had copied something and changed it, changing the original-- and completely baffling me for some time-- something to add to my code review checklist!)

Compile

Once again, the compiler caught many sneaky items. I'm impressed by the error-checking in the Eiffel compiler, particularly with regard to type compliance-- it caught several small things which I just didn't notice in the code review (including the fact that one of my comments was mismatched!). Very nice.

Test

My strategy of reproducing the calculation tables as in the text paid good dividends here, as I was able to print out the tables and search for problems. It worked very well, and though I had several problems with the programs themselves, they were relatively easy to find and fix due to the availability of interim values, etc.

Table 10-2. D15: Test Results Format -- program 9a

TestExpected ResultActual Result - C++Actual Result - Eiffel
Table D14  
Q34.434.434.400000
1-p7.60*10-57.61098e-050.000079

I'm curious about the difference in calculation between the C++ and Eiffel programs with regard t othe 1-p entry; the values are extremely close, and I can't find significant differences in the calculations, but something seems a touch off kilter. I can't figure out if it's the level of precision involved, a difference in standard libraries, or what.

Postmortem

PSP2.1 Project Plan Summary

Table 10-3. Project Plan Summary

Student:Victor B. PutzDate:000201
Program:NormalizationProgram#9A
Instructor:WellsLanguage:C++
SummaryPlanActualTo date
Loc/Hour 464546
Planned time285 943
Actual time 3421248
CPI (cost/performance index)  0.76
%reused476946
Test Defects/KLOC312730.6
Total Defects/KLOC141135140.4
Yield (defects before test/total defects)788078
% Appraisal COQ58.55.58
% Failure COQ29.819.628.32
COQ A/F Ratio0.160.430.20
Program SizePlanActualTo date
Base2020 
Deleted018 
Modified11 
Added237257 
Reused2315651698
Total New and Changed2392581731
Total LOC4898243672
Upper Prediction Interval (70%)326  
Lower Prediction Interval (70%)152  
Time in Phase (min):PlanActualTo DateTo Date%
Planning547744920
Design376731414
Design Review613532
Code747856825
Code Review916733
Compile17181326
Test684950722
Postmortem20241607
Total2853422256100
Total Time UPI (70%)322   
Total Time LPI (70%)249   
Defects Injected ActualTo DateTo Date %
Plan 000
Design10127732
Design Review0000
Code222316066
Code Review0000
Compile1031
Test1031
Total development3435243100
Defects Removed ActualTo DateTo Date %
Planning 000
Design 000
Design Review16156
Code714519
Code Review38239
Compile151310744
Test875322
Total development3435243100
After Development000 
Defect Removal EfficiencyPlanActualTo Date 
Defects/Hour - Design Review13.527.622.5 
Defects/Hour - Code Review15.83024.2 
Defects/Hour - Compile49.543.356.3 
Defects/Hour - Test68.576.9 
DRL (design review/test)2.253.222.26 
DRL (code review/test)2.633.53.5 
DRL (compile/test)8.255.058.16 
Eiffel code/compile/test
Time in Phase (min)ActualTo DateTo Date %
Code6236950
Code Review10406
Compile1513318
Test4119226
Total128734100
Defects InjectedActualTo DateTo Date %
Design164
Code2213895
Compile000
Test011
Total23145100
Defects RemovedActualTo DateTo Date %
Code011
Code Review91411
Compile168860
Test54028
Total23145100
Defect Removal EfficiencyActualTo Date 
Defects/Hour - Code Review1224 
Defects/Hour - Compile6440 
Defects/Hour - Test7.312.5 
DRL (code review/test)1.61.9 
DRL (compile/test)8.83.2 

Time Recording Log

Table 10-4. Time Recording Log

Student:Victor B. PutzDate:000130
  Program:9a
StartStopInterruption TimeDelta timePhaseComments
000130 10:50:04000130 12:07:06077plan 
000130 12:14:58000130 13:25:57466design 
000130 13:59:50000130 14:12:30012design review 
000130 14:24:05000130 15:50:58977code 
000130 16:00:00000130 16:21:52516code review 
000130 16:22:00000130 16:39:45017compile 
000130 16:42:18000130 17:31:05048test 
000130 17:31:29000130 17:55:28023postmortem 
      

Table 10-5. Time Recording Log

Student: Date:000201
  Program: 
StartStopInterruption TimeDelta timePhaseComments
000201 07:56:14000201 09:00:44262code 
000201 09:01:18000201 09:11:0709code review 
000201 09:11:20000201 09:26:12014compile 
000201 09:26:36000201 10:07:25040test 
      

Defect Reporting Logs

Table 10-6. Defect Recording Log

Student:Victor B. PutzDate:000130
  Program:9a
Defect foundTypeReasonPhase InjectedPhase RemovedFix timeComments
000130 13:59:52ctigdesigndesign review1missed minor contracts
000130 14:03:53maigdesigndesign review0didn't increment loop indices
000130 14:05:35ctigdesigndesign review0missed contract for variance
000130 14:07:06ctigdesigndesign review0require standard_deviation /= 0
000130 14:07:58ctigdesigndesign review0require s > 0 in normal_distribution_segments
000130 14:09:41ctigdesigndesign review0minor contracts
000130 15:44:53mdigdesigncode0forgot to add parse_last_line_as_command
000130 16:02:40ctigcodecode review0forgot to put in the require contract for make/constructor
000130 16:05:07syomcodecode review0forgot to #include normal_distribution_integral
000130 16:06:00syomcodecode review0forgot to make last_guess const in implementation
000130 16:13:33syomcodecode review0forgot to return return value!
000130 16:17:23wcomcodecode review0was testing the integral of the normal distribution as 100, rather than 1
000130 16:19:06sytycodecode review0forgot parentheses around no-argument operation
000130 16:19:53sytycodecode review0forgot parentheses around no-argument operation
000130 16:20:27syomcodecode review0forgot to return return value!
000130 16:23:13syigdesigncompile6er... forgot that inheriting and mixing types gets very gross; combined number_array_list and number_array_list_2 into one
000130 16:30:48wntycodecompile0misspelled normal_distribution_inverse
000130 16:31:54wncmcodecompile0used normalized_series_table instead of norm_table
000130 16:33:25sycmcodecompile0didn't make the at function const as required
000130 16:34:07sycmcodecompile0missed parentheses on no-arg function
000130 16:34:40sycmcodecompile0forgot return type for select_series
000130 16:35:05sytycodecompile0declared Result with type::Result
000130 16:35:34syomcodecompile0forgot const in implementation
000130 16:35:57wncmcodecompile0er.. didn't type the correct name for the argument
000130 16:36:17syomcodecompile0forgot parentheses for no-arg function
000130 16:36:58sycmcodecompile0mistyped integer instead of int
000130 16:37:57syomcodecompile0forgot to #include gamma_function.h
000130 16:38:22syomcodecompile0forgot to #include simpson_integrator.h
000130 16:45:22wcomcodetest0had to add an exception that if the first guess was in, don't try a second guess!
000130 16:46:46ctomdesigntest0forgot contract in normalized_series and select_series
000130 16:52:18ctomdesigntest0forgot ensure contract in normalized_series_table
000130 16:53:01wntycodetest1was setting an iterator to end instead of begin for startup
000130 16:56:12weigdesigntest7was using field_index as index of normalized series, instead of zero index
000130 17:07:17syigcodetest14dangit-- C++ was doing silent conversions from int to double without thinking
000130 17:21:45waigdesigntest8was not setting the correct degrees of freedom for chi2 integral
       

Table 10-7. Defect Recording Log

Student: Date:000201
  Program: 
Defect foundTypeReasonPhase InjectedPhase RemovedFix timeComments
000201 09:04:25ctomcodecode review0forgot to put in comment
000201 09:07:29wncmcodecode review0forgot to change "Parsing_prediction_data" to "Parsing_commands"
000201 09:11:30syigcodecompile0forgot that comments which end a class must be correct!
000201 09:12:08sycmcodecompile0separated arguments of different type with comma rather than semicolon
000201 09:12:44syomcodecompile0accidentally put contract under the local block
000201 09:13:20wnomcodecompile0forgot to qualify item with feature call
000201 09:14:05sytycodecompile0forgot colon before return value
000201 09:15:14sytycodecompile0forgot "is" after feature declaration
000201 09:15:39wacmcodecompile1forgot to take out program 8a code to print the sorted lists
000201 09:17:11wnigcodecompile0used "equals" instead of "is_equal" for string equality test
000201 09:18:29wncmcodecompile0used "lower" instead of "first" for first argument of an array
000201 09:19:12wncmcodecompile0typed "entry" instead of "new_entry"
000201 09:19:49wncmcodecompile0typed "i" rather than "field_index"
000201 09:20:12wncmcodecompile0typed "n" rather than "entry_count"
000201 09:21:16wnigcodecompile0apparently "/" is used to produce a double from two integers; // is what I wanted
000201 09:22:04syigcodecompile0used "is" in local block
000201 09:23:11wtigcodecompile0urg... returned a double rather than an integer
000201 09:25:52mccmcodecompile0was not instantiating my normal_distribution_inverse object
000201 09:27:00wccmdesigntest1wrong value (100 instead of 1) in normal_distribution_segments
000201 09:29:20cttycodetest2was doing test on old value of n, not new value!
000201 09:33:55maomcodetest0forgot to update i in loop!
000201 09:35:00waigcodetest7 
000201 09:45:46waigcodetest20Okay, this hurt-- forgot that Eiffel doesn't copy things by default, so I was modifying the original list!