| The Personal Software Process: an Independent Study | ||
|---|---|---|
| Prev | Chapter 1. Lesson 1: Introduction to the Personal Software Process | Next |
Using a linked list, write a program to calculate the mean and standard deviation of a set of data.
From [Humphrey95]:
Write a program to calculate the mean and standard deviation of a series of n real numbers. The mean is the average of the numbers. The formula for the standard deviation is given in [Humphrey95], and is described on page 508 of the text.
Use a linked list to hold the n numbers for the calculations.
Program 1A Testing: Thoroughly test the program. At least three of the tests should use the data in each of the three columns on the right of the following table:
Table 1-5. Sample data for program 1A: Pascal Object Size and Development Hours
Program Number Object LOC New and Changed LOC Development Hours 1 160 186 15.0 2 591 699 69.9 3 114 132 6.5 4 229 272 22.4 5 230 291 28.4 6 270 331 65.9 7 128 199 19.4 8 1657 1890 198.7 9 624 788 38.8 10 1503 1601 138.2 Standard Dev 572.03 625.63 62.26
The basic problem description does not give us a great deal of direction, which means we're free to work as we see fit. With that in mind, I'll iron out a few requirements ambiguities.
The problem statement does not specify how we get numbers into the program; with that in mind, I'll decide that the program by default accepts a list of ascii-text real numbers from standard input, terminated by an EOF. This will allow both interactive input for testing as well as redirection via standard unix pipes, etc. We will also allow the input to be retrieved from a file using the -f flag on the command line, i.e. psp_1a -f numbers.txt.
The problem statement does not specify what to do in the case of invalid input; since we will be using the program primarily for automated i/o on files, we will treat invalid input as unrecoverable; the program should display an error message with the offending line number and filename, if applicable. Since I am developing under Linux with EMACS, the error message format will be in the standard emacs-style error output of file:line:message.
Without any other guidance than a guess, I'll estimate 1.5 hours (90 minutes) for this program.
The basic design is fairly simple, with a single class (number_list), which inherits from the STL template class list< double >. After subclassing list< double >, we add two features for reading the list from an input stream (read_from_stream and read_entry_from_stream), one feature (add_entry) for simply adding a number to the list (to encapsulate some of the STL functionality), and several read-only constant attributes: sum (the sum of all elements), mean, standard_deviation, and entry_count (to count the number of entries, again just encapsulating some STL functionality)
The code was straightforward, although it went through several iterations in the test phase before turning into anything useful. The completed code is below:
/*
*/
#ifndef NUMBER_LIST_H
#define NUMBER_LIST_H
#include <list>
#include <iostream>
//a class which encapsulates a list of double values, adding the features of
//mean and standard deviation
class number_list : public list< double >
{
public:
void read_from_stream( istream& input );
void read_entry_from_stream( istream& input, int line_number );
double sum( void ) const;
double mean( void ) const;
double standard_deviation( void ) const;
int entry_count( void ) const;
void add_entry( double new_entry );
};
#endif
/*
*/ |
/*
*/
#include "number_list.h"
#include <assert.h>
#include <stdlib.h>
#include <math.h>
void
number_list::read_from_stream( istream& input )
{
int line_number = 1;
while ( !(input.eof()) )
{
read_entry_from_stream( input, line_number );
}
}
void
number_list::read_entry_from_stream( istream& input, int line_number )
{
double next_entry;
const int input_buffer_size = 255;
char input_buffer[ input_buffer_size ];
input.getline( input_buffer, input_buffer_size );
//now convert to floating point using strtod
char* conversion_end = NULL;
next_entry = strtod( input_buffer, &conversion_end );
if ( conversion_end != input_buffer )
{
add_entry( next_entry );
}
else if ( !(input.eof()) )
{
cerr << "(unknown file):" << line_number << ":Error in conversion to double\n";
}
}
double
number_list::sum( void ) const
{
double Result = 0;
for ( list<double>::const_iterator iter = begin();
iter != end();
++iter )
{
Result += *iter;
}
return Result;
}
double
number_list::mean( void ) const
{
assert( entry_count() > 0 );
return sum() / entry_count();
}
double
number_list::standard_deviation( void ) const
{
assert( entry_count() > 1 );
double sum_of_square_differences = 0;
for ( list<double>::const_iterator iter = begin();
iter != end();
++iter )
{
const double this_square_difference = *iter - mean();
sum_of_square_differences += this_square_difference * this_square_difference;
}
return sqrt( sum_of_square_differences / ( entry_count() - 1 ));
}
int
number_list::entry_count( void ) const
{
return size();
}
void
number_list::add_entry( double new_entry )
{
push_back( new_entry );
}
/*
*/ |
/*
*/
#include <fstream>
#include <iostream>
#include "string.h"
#ifndef NUMBER_LIST_H
#include "number_list.h"
#endif
istream *
input_stream_from_args( int arg_count, const char** arg_vector )
{
istream* Result = NULL;
if ( arg_count == 1 )
{
Result = &cin;
}
else
{
const char* help_text = "\
PSP exercise 1A: Get the mean and standard deviation of a list\n
of ASCII real numbers, from either standard input or a text\n
file.\n\n\
Usage: \n \
\tpsp_1a\n\n";
cout << help_text;
}
return Result;
}
int main( int arg_count, const char** arg_vector )
{
//get the input stream, or print the help text as appropriate
istream* input_stream = input_stream_from_args( arg_count, arg_vector );
if ( input_stream != NULL )
{
//read the entries from the input stream
number_list a_list;
a_list.read_from_stream( *input_stream );
cout << a_list.entry_count() << " numbers processed.\n";
//output the mean, as appropriate
if ( a_list.entry_count() > 0 )
{
cout << "Mean: " << a_list.mean() << "\n";
}
else
{
cout << "Too few entries to calculate mean\n";
}
//output the standard deviation, as appropriate
if ( a_list.entry_count() > 1 )
{
cout << "Standard Deviation: " << a_list.standard_deviation() << "\n";
}
else
{
cout << "Too few entries to calculate standard deviation.\n";
}
}
}
/*
*/ |
The compilation did reveal a few errors, mostly mistyped names and developer omissions (forgetting to include needed header files, etc). Otherwise, no significant problems here.
The design and code sections for this went so easily that I confess I was disappointed by the test results. I made several simple mistakes, but these mistakes formed the bulk of my time spent developing what should have been a very simple program. Some defects were minor and easy to fix-- for example, the number_list::read_entry_from_stream feature added an entry even for an invalid line, even processing the EOF signal as a number. More significantly, the original design called for a naive use of the C++ iostream capability, in other words, usage like this:
while ( !a_stream.eof() )
{
a_stream >> new_entry;
} |
The only problem with this is that apparently it will never generate an EOF signal at all, a fact of which I wasn't aware. I converted that section to use getline, which worked acceptably.
The most significant problem, though, resulted in some deviations from the original design spec: I was unable to use the standard C++ library stream classes to open a file from disk! In fact, even creating the input stream object resulted in a segment fault interrupt:
ifstream* a_stream = new ifstream() |
In fairness to myself, it has been some time since I've used the standard C++ stream library, but this behaviour did not mesh with my previous experience. A brief experiment with statically allocated streams had similar results when I attempted to check for EOF conditions.
I'll pursue this in the mean time, to check whether or not my unfamiliarity with the library resulted in these problems or if my compiler is simply acting peculiar. C++ is not always well supported in Linux, and I have had problems with their standard library implementation in the past. For the present, though, I will stick with standard input.
The program was tested with three files which contained the sample data from [Humphrey95], and one sample file with errors to check the error-reporting capability.
class NUMBER_LIST
inherit
LINKED_LIST[DOUBLE];
creation {ANY}
make
feature {ANY}
line_counter: INTEGER;
read_from_stream(input: INPUT_STREAM) is
do
line_counter := 0;
from
until
input.end_of_input
loop
read_entry_from_stream(input);
end;
end -- read_from_stream
report_error(error_type: STRING) is
do
std_error.put_string("(unknown file):");
std_error.put_integer(line_counter);
std_error.put_string(error_type);
std_error.put_new_line;
end -- report_error
read_entry_from_stream(input: INPUT_STREAM) is
do
input.read_line;
if not input.end_of_input then
if input.last_string.is_double then
add_last(input.last_string.to_double);
elseif input.last_string.is_integer then
add_last(input.last_string.to_integer.to_double);
else
report_error("Not a double");
end;
end;
end -- read_entry_from_stream
sum: DOUBLE is
local
i: INTEGER;
do
from
i := lower;
until
not valid_index(i)
loop
Result := Result + item(i);
i := i + 1;
end;
end -- sum
mean: DOUBLE is
require
count > 0;
do
Result := sum / count.to_double;
end -- mean
standard_deviation: DOUBLE is
require
count > 1;
local
i: INTEGER;
this_term_difference: DOUBLE;
top_term: DOUBLE;
do
from
i := lower;
until
not valid_index(i)
loop
this_term_difference := item(i) - mean;
top_term := top_term + this_term_difference ^ 2;
i := i + 1;
end;
Result := (top_term / (count - 1)).sqrt;
end -- standard_deviation
end -- NUMBER_LIST |
class MAIN
creation
make
feature { ANY }
make is
local
number_list : NUMBER_LIST
do
!!number_list.make
number_list.read_from_stream( io )
--now output results
if ( number_list.count > 0 ) then
std_output.put_string( "Mean: " )
std_output.put_double( number_list.mean )
else
std_output.put_string( "Not enough entries for mean" )
end --if
std_output.put_new_line
if ( number_list.count > 1 ) then
std_output.put_string( "Standard Deviation: " )
std_output.put_double( number_list.standard_deviation )
else
std_output.put_string( "Not enough entries for standard deviation" )
end --if
std_output.put_new_line
end
end-- MAIN |
A fairly simple tallying of time spent in phases and defects injected and removed. The forms used are listed in the following sections
Table 1-6. Project Plan Summary
| Student: | Victor B. Putz | Date: | 991213 |
| Program: | Standard Deviation | Program# | 1A |
| Instructor: | Wells | Language: | C++ |
| Time in Phase (min): | Plan | Actual | To Date | To Date% |
| Planning | 14 | 14 | 100 | |
| Design | 17 | 17 | 100 | |
| Code | 24 | 24 | 100 | |
| Compile | 5 | 5 | 100 | |
| Test | 51 | 51 | 100 | |
| Postmortem | 3 | 3 | 100 | |
| Total | 90 | 114 | 114 | 100 |
| Defects Injected | Actual | To Date | To Date % | |
| Design | 0 | 0 | 100 | |
| Code | 8 | 8 | 100 | |
| Compile | 1 | 1 | 100 | |
| Test | 3 | 3 | 100 | |
| Total development | 12 | 12 | 100 | |
| Defects Removed | Actual | To Date | To Date % | |
| Planning | 0 | 0 | 100 | |
| Design | 0 | 0 | 100 | |
| Code | 0 | 0 | 100 | |
| Compile | 7 | 7 | 100 | |
| Test | 5 | 5 | 100 | |
| Total development | 12 | 12 | 100 | |
| After Development | 12 | 12 |
| Eiffel code/compile/test |
| Time in Phase (min) | Actual | To Date | To Date % |
| Code | 15 | 45 | 100 |
| Compile | 10 | 30 | 100 |
| Test | 8 | 25 | 100 |
| Total | 33 | 33 | 100 |
| Defects Injected | Actual | To Date | To Date % |
| Code | 6 | 6 | 100 |
| Compile | 0 | 0 | 0 |
| Test | 0 | 0 | 0 |
| Total | 6 | 6 | 100 |
| Defects Removed | Actual | To Date | To Date % |
| Code | 0 | 0 | 0 |
| Compile | 3 | 3 | 50 |
| Test | 3 | 3 | 50 |
| Total | 6 | 6 | 100 |
Table 1-7. Time Recording Log
| Student: | Victor B. Putz | Date: | 991219 |
| Instructor: | Wells | Program# | 1A |
| Date | Start | Stop | Interruption time | Delta time | Phase | Comments |
| 991213 | 1558 | 1612 | 14 | plan | ||
| 1612 | 1622 | 10 | design | Setting up environment (automake etc) | ||
| 1623 | 1630 | 7 | design | Class design, etc. | ||
| 1630 | 1654 | 24 | code | |||
| 1655 | 1701 | 1 | 5 | compile | 1 interruption (phone call) | |
| 1702 | 1753 | 51 | test | problems with "new ifstream" construct | ||
| 1828 | 1831 | 3 | postmortem | does not include sgml writeups |
Table 1-8. Defect Reporting Log
| Student: | Victor B. Putz | Date: | 991219 |
| Instructor: | Wells | Program# | 1A |
| Date | Number | Type | Inject | Remove | Fix time | Fix defect | Description |
| 991219 | 1 | 20 | code | compile | 1 | Mistyped "input_stream" instead of "istream" | |
| 991219 | 2 | 210 | code | compile | 1 | forgot to include "fstream.h" | |
| 991219 | 3 | 20 | code | compile | 1 | attempted non-const operation on const reference | |
| 991219 | 4 | 20 | code | compile | 1 | attempted use of ^ operator on doubles | |
| 991219 | 5 | 20 | code | compile | 1 | mistyped input_stream instead of istream (again) | |
| 991219 | 6 | 20 | compile | compile | 1 | ? (forgot to put a description on the form) | |
| 991219 | 7 | 50 | code | compile | 1 | changed return type of main() unacceptably | |
| 991219 | 8 | 50 | test | test | 36 | Extreme difficulty dealing with ifstream-- eventually terminated | |
| 991219 | 9 | 50 | test | test | 1 | Processed EOF as a number (misuse of atof) | |
| 991219 | 10 | 80 | code | test | 1 | forgot to take sqrt of partial result in standard deviation | |
| 991219 | 11 | 50 | code | test | 3 | forgot to add error message in number_list::read_entry_from_stream | |
| 991219 | 12 | 20 | test | test | 1 | 11 | Forgot to update call to number_list::read_entry_from_stream in number_list::read_from_stream after adding a parameter. |
| Eiffel |
| 991220 | 1 | 20 | code | compile | 1 | Forgot "then" on if-then block | |
| 991220 | 2 | 20 | code | compile | 1 | Numerous-- used "=" instead of ":=" for assignment | |
| 991220 | 3 | 20 | code | compile | 3 | couldn't figure out how to redefine make and call precursor | |
| 991220 | 4 | 80 | code | test | 1 | wasn't allowing for integer-style (no decimal) entries in data | |
| 991220 | 5 | 80 | code | test | 1 | wasn't incrementing indices in loops | |
| 991220 | 6 | 80 | code | test | 1 | Erroneously typed "-" instead of "/" for standard dev calc. |