Program 2A: Simple LOC counter

Write a program to count program LOC.

Given Requirements

From [Humphrey95]:

Requirements: Write a program to count the logical lines in a program, omitting comments and blank lines. Use the counting standard produced by report exercise R1 to place one logical line on each physical line and count physical lines. Produce a single count for the entire program source file.

Testing: Thoroughly test the program. As one test, count the LOC in [program 1A] and 2A. Submit these data with your homework results, using the format in Table D6:

Table 2-7. Test Results Format -- Program 2A

Program Number    LOC
1A
2A

Planning

Program Requirements

With the same philosophy I used in program 1A, I'm going to take input not from a file, but from standard input. This can be used to our advantage, particularly since under Linux we can use program 2A as an element in a pipe, and combine LOC counts from several different source files thusly:

Example 2-1. Counting many source files

[vputz@yak_prime lesson_2]$ cat *.{h,cpp} | ./psp_2a
LOC: 233

The program will NOT attempt to analyze the syntax or logical construction of the source code. It will simply count physical lines according to the coding standards.

Size estimate

Sigh... I missed this entry in the planning worksheet and produced no size estimate for program 2A.

Resource estimate

This should have been done using a total size estimate and the To Date % entries in the planning blocks. Instead, I took a brash guess for each phase, which serves me right for not paying closer attention to the planning script.
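For reference, the distribution that the planning script calls for can be sketched roughly as below. This is only an illustration: it assumes a total time estimate has already been derived from the size estimate, and the function name phase_estimates and whatever percentages are passed to it are hypothetical, not taken from the PSP forms.

```cpp
#include <map>
#include <string>

// Splits a total time estimate across phases in proportion to historical
// "To Date %" figures. The percentages are supplied by the caller; none
// of them come from the forms in this report.
std::map<std::string, double>
phase_estimates (double total_minutes,
                 const std::map<std::string, double> & to_date_percent)
{
  std::map<std::string, double> result;
  for (const auto & entry : to_date_percent)
    {
      // each phase gets its historical share of the total
      result[entry.first] = total_minutes * entry.second / 100.0;
    }
  return result;
}
```

With a total of 200 minutes and a phase that historically took 25% of the time, the phase would be planned at 50 minutes.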

Development

Design

The basic design involves two classes: a simple_input_parser and a simple_loc_counter; the main() function simply invokes a simple_loc_counter to parse standard input.

The simple_input_parser class is indeed simple: once assigned an input stream, a call to simple_input_parser::parse_until_eof reads lines, one at a time, transforms them according to transformed_line, and stores the result in m_last_line. After each line is read, parse_until_eof calls parse_last_line to act on the last line read and transformed.

The simple_loc_counter class subclasses simple_input_parser; the transformation applied to each line is simple: leading and trailing whitespace is skipped. Each line is classified according to a number of functions such as last_line_is_comment, last_line_is_compiler_directive, etc. An integer counter takes care of "nested" block comments.
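As a rough sketch of the nesting counter idea: the function below scans one line and returns the nesting level in effect afterwards. The name nesting_after_line is hypothetical, and two simplifications are assumed: a matched delimiter is skipped as a unit, and a closing delimiter with nothing open is ignored. Note that counting nesting like this treats "/* /* */" as still open, which is not how a C++ compiler reads comments.

```cpp
#include <string>

// Returns the block-comment nesting level after scanning one line,
// starting from the level that was in effect before the line.
int nesting_after_line (const std::string & line, int level)
{
  for (std::string::size_type i = 0; i + 1 < line.size (); ++i)
    {
      if (line.compare (i, 2, "/*") == 0)
        {
          ++level;
          ++i;  // skip the second character of the delimiter
        }
      else if (line.compare (i, 2, "*/") == 0 && level > 0)
        {
          --level;
          ++i;
        }
    }
  return level;
}
```

A line is then inside a block comment whenever the level carried over from the previous line is greater than zero.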

The simple_loc_counter class appends each LOC to a buffer of countable lines, and simply returns the size of that buffer (in LOC) when asked. It can also print each countable line, useful during testing.

Code/Compile/Test: C++

Several problems were discovered in the compilation phase, mostly having to do with missing features in the interfaces (missing reset methods, missing min/max functionality, etc.). Most of these were quickly fixed (1-2 minute fix times).

Code

simple_input_parser.h
/*
*/

#ifndef SIMPLE_INPUT_PARSER_H
#define SIMPLE_INPUT_PARSER_H

#include <string>
#include <iostream>

//a standard "framework" for parsing a set of input lines
class simple_input_parser
{
  public:
    //sets input stream
  void set_input_stream (std::istream * new_input);
  //parse input until EOF
  void parse_until_eof (void);
  //reads single line from input, transforms it, stores it in last_line
  void read_line (void);
  //sets the last line read
  void set_last_line (const std::string & new_line);
  //last line read from input
  const std::string & last_line (void) const;
  //parses the last line read
  virtual void parse_last_line (void);
  //returns a "transformed" copy of the given line
  virtual std::string transformed_line (const std::string & line) const;
  //constructor
    simple_input_parser (void);
  //virtual destructor
    virtual ~ simple_input_parser (void);
  //resets the parser
  virtual void reset (void);

    private:
    //the input stream
    std::istream * m_input_stream;
  //the last line of input
    std::string m_last_line;
};

#endif

/*
*/
simple_input_parser.cpp
/*
*/

#include "simple_input_parser.h"
#ifndef CONTRACT_H
#include "contract.h"
#endif

void
simple_input_parser::set_input_stream (std::istream * new_input)
{
  m_input_stream = new_input;
}

void
simple_input_parser::parse_until_eof (void)
{
  REQUIRE (m_input_stream != NULL);
  while (!(m_input_stream->eof ()))
    {
      read_line ();
      parse_last_line ();
    }
}

void
simple_input_parser::read_line (void)
{
  REQUIRE (!(m_input_stream->eof ()));
  //std::getline grows the string as needed, so there is no fixed limit
  //on the length of a source line
  std::string input_line;
  std::getline (*m_input_stream, input_line);
  //peek at the next character so that the EOF condition is registered
  //as soon as the last line has been consumed
  m_input_stream->peek ();
  set_last_line (transformed_line (input_line));
}

void
simple_input_parser::set_last_line (const std::string & new_line)
{
  m_last_line = new_line;
}

const
std::string & simple_input_parser::last_line (void) const
{
  return m_last_line;
}

void
simple_input_parser::parse_last_line (void)
{
  //basic version does nothing
}

std::string simple_input_parser::transformed_line (const std::string & line) const
{
  return line;
}

simple_input_parser::simple_input_parser (void)
{
  reset ();
}

simple_input_parser::~simple_input_parser (void)
{
}

void
simple_input_parser::reset (void)
{
  m_input_stream = NULL;
  m_last_line = "";
}

/*
*/
simple_loc_counter.h
/*
*/

#ifndef SIMPLE_LOC_COUNTER_H
#define SIMPLE_LOC_COUNTER_H

#ifndef SIMPLE_INPUT_PARSER_H
#include "simple_input_parser.h"
#endif

#include <string>
#include <vector>

//subclass of simple_input_parser that stores countable lines of code in a buffer
//and can return their count.
class simple_loc_counter:public simple_input_parser
{
  public:
    //adds last line to the buffered lines if it is countable
  void parse_last_line (void);
  //the count of LOC
  int loc_count (void) const;
  //whether last line was comment
  bool last_line_is_comment (void) const;
  //whether last line was compiler directive
  bool last_line_is_compiler_directive (void) const;
  //whether we are in a block comment
  bool is_in_block_comment (void) const;
  //whether last line was part of a begin/end pair
  bool last_line_is_begin_or_end (void) const;
  //whether the last line was countable
  bool last_line_is_countable (void) const;
  //whether the last line was empty
  bool last_line_is_empty (void) const;
  //updates the block comment count
  void update_block_comment_count (void);
  //whether the last line starts with the given string
  bool last_line_starts_with (const std::string & search_string) const;
  //whether a given string starts with a search string
  static bool string_starts_with (const std::string & given_string,
				  const std::string & search_string);

  //returns the input string stripped of leading/trailing whitespace
  std::string string_stripped_of_whitespace (const std::string & input_string) const;
  //returns the transformed line (here, the line stripped of whitespace)
  virtual std::string transformed_line (const std::string & input_string) const;

  //constructor
    simple_loc_counter (void);
  //destructor
    virtual ~ simple_loc_counter (void);
  //resets the object
  void reset (void);

  //writes the countable lines to the given output stream
  void write_countable_lines (std::ostream & ostr) const;

    protected:
    //the buffered countable lines
    std::vector < std::string > m_countable_lines;
  //the "block comment" nesting level
  int m_block_comment_nesting_level;
  //the beginning of a block comment
  static const std::string & block_comment_begin;
  //the end of a block comment
  static const std::string & block_comment_end;
  //the beginning of an inline comment
  static const std::string & inline_comment_begin;
  //the beginning of a compiler directive
  static const std::string & compiler_directive_begin;
  //the "begin block" string
  static const std::string & block_begin;
  //the "end block" string
  static const std::string & block_end;
  //whitespace characters
  static const std::string & whitespace_characters;
};

#endif

/*
*/
simple_loc_counter.cpp
/*
*/

#include "simple_loc_counter.h"

#ifndef YAK_MIN_MAX_H
#include "yak_min_max.h"
#endif

void
simple_loc_counter::parse_last_line (void)
{
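  if (last_line_is_countable ())
    {
      m_countable_lines.push_back (last_line ());
    }
  //update the nesting level only after classifying the line, so that the
  //line closing a block comment is still treated as part of the comment
  update_block_comment_count ();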
}

int
simple_loc_counter::loc_count (void) const
{
  return m_countable_lines.size ();
}

bool
simple_loc_counter::last_line_is_comment (void) const
{
  bool Result = false;
  if (last_line_starts_with (block_comment_begin)
      || last_line_starts_with (inline_comment_begin)
      || is_in_block_comment ())
    {
      Result = true;
    }
  return Result;
}


bool
simple_loc_counter::last_line_is_compiler_directive (void) const
{
  bool Result = false;
  if (last_line_starts_with (compiler_directive_begin))
    {
      Result = true;
    }
  return Result;
}

bool
simple_loc_counter::is_in_block_comment (void) const
{
  bool Result = false;
  if (m_block_comment_nesting_level > 0)
    {
      Result = true;
    }
  return Result;
}

bool
simple_loc_counter::last_line_is_begin_or_end (void) const
{
  bool Result = false;
  if (last_line_starts_with (block_begin)
      || last_line_starts_with (block_end))
    {
      Result = true;
    }
  return Result;
}

bool
simple_loc_counter::last_line_is_empty (void) const
{
  return (last_line ().length () == 0);
}

bool
simple_loc_counter::last_line_is_countable (void) const
{
  bool Result = true;
  if ((last_line_is_comment ())
      || (last_line_is_begin_or_end ())
      || (last_line_is_compiler_directive ()) || (last_line_is_empty ()))
    {
      Result = false;
    }
  return Result;
}


void
simple_loc_counter::update_block_comment_count (void)
{
  //count through the string; add 1 to the block comment count if the begin
  //string is encountered, subtract one if the end string is encountered.
  for (unsigned int i = 0; i < last_line ().length (); ++i)
    {
      std::string line_remaining =
	last_line ().substr (i, last_line ().length ());
      if (string_starts_with (line_remaining, block_comment_begin))
	{
	  ++m_block_comment_nesting_level;
	}
      else if (string_starts_with (line_remaining, block_comment_end))
	{
	  --m_block_comment_nesting_level;
	}
    }
}

bool
  simple_loc_counter::
last_line_starts_with (const std::string & search_string) const
{
  return string_starts_with (last_line (), search_string);
}

bool
  simple_loc_counter::string_starts_with (const std::string & given_string,
					  const std::string & search_string)
{
  int
    substring_size =
    yak_min (given_string.length (), search_string.length ());
  std::string substring = given_string.substr (0, substring_size);
  bool Result = (substring == search_string);
  return Result;
}

std::string
  simple_loc_counter::
string_stripped_of_whitespace (const std::string & input_string) const
{
  std::string::size_type start =
    input_string.find_first_not_of (whitespace_characters);
  if (start == input_string.npos)
    {
      //empty or whitespace-only line: nothing survives stripping
      return "";
    }
  std::string::size_type end =
    input_string.find_last_not_of (whitespace_characters);
  //substr takes a start position and a length, not an end position
  std::string Result = input_string.substr (start, end - start + 1);
  return Result;
}

std::string
  simple_loc_counter::transformed_line (const std::string & input_string) const
{
  return string_stripped_of_whitespace (input_string);
}

simple_loc_counter::simple_loc_counter (void)
{
  reset ();
}

simple_loc_counter::~simple_loc_counter (void)
{
}

void
simple_loc_counter::reset (void)
{
  m_countable_lines.clear ();
  m_block_comment_nesting_level = 0;
}

void
simple_loc_counter::write_countable_lines (std::ostream & ostr) const
{
  for (std::vector < std::string >::const_iterator iter =
       m_countable_lines.begin (); iter != m_countable_lines.end (); ++iter)
    {
      ostr << *iter << "\n";
    }
}

const
  std::string & simple_loc_counter::block_comment_begin = "/*";

const
  std::string & simple_loc_counter::block_comment_end = "*/";

const
  std::string & simple_loc_counter::inline_comment_begin = "//";

const
  std::string & simple_loc_counter::compiler_directive_begin = "#";

const
  std::string & simple_loc_counter::block_begin = "{";

const
  std::string & simple_loc_counter::block_end = "}";

const
  std::string & simple_loc_counter::whitespace_characters = " \t\n";

/*
*/
main.cpp
/*
*/

#ifndef SIMPLE_LOC_COUNTER_H
#include "simple_loc_counter.h"
#endif

std::istream *
input_stream_from_args (int arg_count, char **arg_vector)
{
  std::istream *Result = NULL;
  if (arg_count == 1)
    {
      Result = &std::cin;
    }
  else
    {
      const char *help_text =
        "PSP exercise 2A: Count the physical LOC from standard input\n"
        " according to the style and counting guidelines in reports \n"
        " 1A and 2A. \n \n Usage:\n \tpsp_2a \n \n";
      std::cout << help_text;
    }
  return Result;
}

int
main (int arg_count, char **arg_vector)
{
  //get the input stream, or print the help text as appropriate
  std::istream *input_stream = input_stream_from_args (arg_count, arg_vector);
  if (input_stream != NULL)
    {
      simple_loc_counter counter;
      counter.set_input_stream (input_stream);
      counter.parse_until_eof ();
      //output the loc
      std::cout << "LOC: " << counter.loc_count () << "\n";
    }
  return 0;
}


/*
*/
simple_input_parser.e
deferred class SIMPLE_INPUT_PARSER

feature {ANY} 
   
   parse_until_eof is 
      --parses all input until an EOF is reached
      require 
         input_stream /= Void; 
      do  
         from 
         until 
            input_stream.end_of_input
         loop 
            read_line;
            if not input_stream.end_of_input then 
               parse_last_line;
            end; 
         end; 
      end -- parse_until_eof
   
   set_input(new_input_stream: INPUT_STREAM) is 
      --sets the input stream
      do  
         input_stream := new_input_stream;
      end -- set_input
   
   read_line is 
      --reads a line from standard input
      do  
         input_stream.read_line;
         last_line := transformed_line(input_stream.last_string);
      end -- read_line
   
   last_line: STRING;
   
   input_stream: INPUT_STREAM;
   
   parse_last_line is 
      deferred
      end -- parse_last_line
   
   transformed_line(to_transform: STRING): STRING is 
      --transforms the line according to rules defined in subclasses
      do  
         Result := to_transform;
      end -- transformed_line

end -- class SIMPLE_INPUT_PARSER
simple_loc_counter.e
class SIMPLE_LOC_COUNTER
-- counts one form of LOC in eiffel files

inherit 
   SIMPLE_INPUT_PARSER
      redefine parse_last_line, transformed_line
      end; 
   
creation {ANY} 
   make

feature {ANY} 
   
   make is 
      do  
         !!counted_lines.make(1,0);
      end -- make
   
   parse_last_line is 
      -- store countable lines in an array
      do  
         if last_line_is_countable then 
            counted_lines.add_last(last_line);
         end; 
      end -- parse_last_line
   
   counted_lines: ARRAY[STRING];
      --array containing countable lines 
   
   loc_count: INTEGER is 
      -- number of lines counted as LOC
      do  
         Result := counted_lines.count;
      end -- loc_count
   
   last_line_is_comment: BOOLEAN is 
      do  
         if last_line_starts_with(comment_begin) then 
            Result := true;
         else 
            Result := false;
         end; 
      end -- last_line_is_comment
   
   comment_begin: STRING is "--";
   
   last_line_is_compiler_directive: BOOLEAN is false;
   
   in_block_comment: BOOLEAN is false;
   
   last_line_starts_with(test_string: STRING): BOOLEAN is 
      do  
         if last_line.has_prefix(test_string) then 
            Result := true;
         else 
            Result := false;
         end; 
      end -- last_line_starts_with
   
   last_line_is_countable: BOOLEAN is 
      do  
         if last_line_is_comment or last_line_is_begin_or_end or last_line_is_empty then 
            Result := false;
         else 
            Result := true;
         end; 
      end -- last_line_is_countable
   
   last_line_is_begin_or_end: BOOLEAN is 
      do  
         if last_line_starts_with("do") or last_line_starts_with("end") then 
            Result := true;
         else 
            Result := false;
         end; 
      end -- last_line_is_begin_or_end
   
   last_line_is_empty: BOOLEAN is 
      do  
         if last_line.empty then 
            Result := true;
         else 
            Result := false;
         end; 
      end -- last_line_is_empty
   
   transformed_line(string: STRING): STRING is 
      do  
         Result := string_stripped_of_whitespace(string);
      end -- transformed_line
   
   string_stripped_of_whitespace(string: STRING): STRING is 
      do  
         Result := string.twin;
         Result.replace_all('%T',' ');
         Result.left_adjust;
         Result.right_adjust;
      end -- string_stripped_of_whitespace
   
   print_counted_lines(output: OUTPUT_STREAM) is 
      local 
         index: INTEGER;
      do  
         from 
            index := counted_lines.lower;
         until 
            not counted_lines.valid_index(index)
         loop 
            output.put_string(counted_lines.item(index));
            output.put_string("%N");
            index := index + 1;
         end; 
      end -- print_counted_lines

end -- class SIMPLE_LOC_COUNTER
main.e
class MAIN

creation {ANY} 
   make

feature {ANY} 
   
   make is 
      local 
         simple_loc_counter: SIMPLE_LOC_COUNTER;
      do  
         !!simple_loc_counter.make;
         simple_loc_counter.set_input(io);
         simple_loc_counter.parse_until_eof;
         std_output.put_string("LOC: ");
         std_output.put_integer(simple_loc_counter.loc_count);
      end -- make

end -- class MAIN

Compile

Minor problems in the compile phase; most of these were relatively straightforward typos from the coding phase (forgot to clean up "virtual" keyword when editing copied text, forgot to add simple_loc_counter:: to a declaration, etc). Most were quickly fixed, except for a 7-minute period where I had to jockey makefiles to get them to work properly.

Test

Ah, testing. Many problems were discovered here. Some were easy to fix: the standard C++ string class, or at least this implementation, appeared to give different results between length and size. Others were downright tricky: if the standard C++ string class doesn't find what it's looking for in find_first_not_of or find_last_not_of, it returns a mysterious value, npos, where I confess I was expecting the length of the string or some such. That one occupied about 21 minutes of the testing phase.
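To make the npos behaviour concrete, here is a minimal sketch of whitespace stripping that checks for it explicitly. The function name stripped_of_whitespace and the particular whitespace set are assumptions for illustration; this is not the routine from the listing above.

```cpp
#include <string>

// Returns text without leading or trailing spaces, tabs, or line ends.
std::string stripped_of_whitespace (const std::string & text)
{
  const std::string whitespace = " \t\r\n";
  const std::string::size_type first = text.find_first_not_of (whitespace);
  if (first == std::string::npos)
    {
      // nothing but whitespace (or nothing at all): no search hit
      return "";
    }
  const std::string::size_type last = text.find_last_not_of (whitespace);
  // substr takes a start position and a length
  return text.substr (first, last - first + 1);
}
```

Because npos is the largest value of size_type, using it unchecked as a position or length silently produces the wrong substring rather than an obvious failure, which is why the early return matters.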

My decision to allow the simple_loc_counter class to print counted lines made testing much nicer, as I could more easily compare what the program was actually counting to what was in the submitted files.

As per the testing requirements, I have used program 2A to calculate LOC counts for programs 1A and 2A:

Table 2-8. LOC Results -- Program 2A, C++

Program Number    LOC
1A                94
2A                233

Table 2-9. LOC Results -- Program 2A, Eiffel

Program Number    LOC
1A                82
2A                95

Postmortem

PSP0.1 Project Plan Summary

Table 2-10. Project Plan Summary

Student:     Victor B. Putz    Date:      991228
Program:     LOC counter       Program#:  2A
Instructor:  Wells             Language:  C++

Program Size             Plan                 Actual    To Date
Base                                          0
Deleted                                       0
Modified                                      0
Added                                         233
Reused                                        0         0
Total New and Changed    (no planned size)    233       233
Total LOC                                     233       233
Total new/reused

Time in Phase (min)      Plan      Actual    To Date   To Date %
Planning                 10        10        24        7
Design                   30        27        44        13
Code                     60        75        99        30
Compile                  20        26        31        9
Test                     45        59        110       34
Postmortem               10        17        20        6
Total                    175       214       328       100

Defects Injected                   Actual    To Date   To Date %
Plan                               0         0         0
Design                             11        11        26
Code                               19        27        63
Compile                            1         2         5
Test                               0         3         7
Total development                  31        43        100

Defects Removed                    Actual    To Date   To Date %
Planning                           0         0         0
Design                             0         0         0
Code                               11        11        26
Compile                            11        18        42
Test                               9         14        33
Total development                  31        43        100
After Development                  0         0

Eiffel code/compile/test

Time in Phase (min)                Actual    To Date   To Date %
Code                               31        46        53
Compile                            18        28        32
Test                               5         13        15
Total                              54        87        100

Defects Injected                   Actual    To Date   To Date %
Design                             3         3         13
Code                               14        20        87
Compile                            0         0         0
Test                               0         0         0
Total                              17        23        100

Defects Removed                    Actual    To Date   To Date %
Code                               1         1         4
Compile                            11        14        61
Test                               5         8         35
Total                              17        23        100

Time Recording Log

Table 2-11. Time Recording Log - C++

Student:     Victor B. Putz    Date:      991219
Instructor:  Wells             Program#:  1A

Start            Stop             Interruption Time  Delta time  Phase       Comments
991228 10:56:15  991228 11:25:13  1                  27          design
991228 11:28:59  991228 12:48:48  4                  75          code
991228 12:49:09  991228 13:15:23  0                  26          compile
991228 13:15:45  991228 14:15:25  0                  59          test
991228 14:58:59  991228 15:19:37  3                  17          postmortem

Table 2-12. Time Recording Log - Eiffel

Student:     Victor B. Putz    Date:      000104
Instructor:  Wells             Program#:  2A

Start            Stop             Interruption Time  Delta time  Phase       Comments
000104 14:08:00  000104 14:39:23  0                  31          code
000104 14:39:29  000104 14:58:26  0                  18          compile
000104 15:00:37  000104 15:16:01  0                  15          test

Defect Recording Log

Table 2-13. Defect Recording Log - C++

Student:     Victor B. Putz    Date:      991228
Instructor:  Wells             Program#:  2A

Defect found     Type  Reason  Phase Injected  Phase Removed  Fix time  Comments
991228 11:54:19  ic    om      design          code           1         Forgot to add setter method for last_line
991228 11:58:06  ic    om      design          code           1         Forgot to add reset method
991228 12:08:01  ic    om      design          code           2         Added last_line_starts_with method
991228 12:12:39  ic    om      design          code           1         Added compiler_directive_begin string
991228 12:15:30  ic    om      design          code           1         Added block begin/end strings
991228 12:18:56  id    ty      design          code           1         renamed "is_last_line..." methods to "last_line_is..."
991228 12:24:17  ic    om      design          code           2         Had to add yak_min_max to get min, max functionality
991228 12:26:54  ic    om      code            code           2         Refactored last_line_starts_with into more generic string_starts_with method
991228 12:35:45  ic    om      design          code           0         Added whitespace_characters string
991228 12:38:17  ic    om      design          code           1         Added constructor and reset method
991228 12:42:35  ic    om      design          code           1         Added write_countable_lines method
991228 12:49:50  wn    ty      code            compile        0         Called "set_input" instead of "set_input_stream"
991228 12:52:26  id    om      code            compile        1         Forgot to add yak_defs, yak_exception to include path
991228 12:56:19  iu    ty      code            compile        0
991228 12:58:06  wn    ty      code            compile        0         Mistyped name of last_line_is_countable
991228 12:58:55  wn    ty      code            compile        0         Didn't fully qualify std::string::size_type
991228 13:00:17  sy    om      code            compile        0         Forgot to clean up "virtual" from method declaration in .cpp file
991228 13:01:15  sy    om      code            compile        0         Forgot to add "simple_loc_counter::" to method declaration
991228 13:02:52  iu    om      code            compile        0         Misused const qualifier in method declaration for write_countable_lines
991228 13:04:02  iu    om      code            compile        0         Needed to add const qualifier in write_countable_lines
991228 13:04:44  sy    om      code            compile        0         Forgot to remove "static" from in-cpp declaration of class static variables
991228 13:06:30  iu    ex      compile         compile        7         Had to jockey the makefile setup to reuse some external code not meant for this reuse
991228 13:20:28  wn    kn      code            test           4         Used "size" instead of "length" for string length
991228 13:25:22  is    om      code            test           1         Forgot to add virtual destructors (warned by compiler)
991228 13:27:43  iu    om      code            test           0         Used "int" instead of "unsigned int" to compare to size()
991228 13:28:25  wn    om      code            test           0         Multiple uses of "size" instead of "length" for string length
991228 13:30:41  iu    kn      code            test           21        find_*_not_of return a mysterious "npos" when the string only contained the search characters.
991228 13:54:32  ic    om      design          test           2         Forgot to check for empty lines!
991228 13:58:59  iu    kn      code            test           0         Off-by-one; was not including the end character on string_stripped_of_whitespace
991228 14:01:20  iu    kn      code            test           10        Off-by-the-other-one; was including blank lines consisting of one space
991228 14:12:07  wa    kn      code            test           1         Was only checking for "}" to close blocks, discounting possible "};"...

Table 2-14. Defect Recording Log - Eiffel

Student:     Victor B. Putz    Date:      000104
Instructor:  Wells             Program#:  2A

Defect found     Type  Reason  Phase Injected  Phase Removed  Fix time  Comments
000104 14:28:23  ic    om      design          code           3         Adding last_line_starts_with method
000104 14:39:40  sy    ty      code            compile        1         forgot a comma
000104 14:43:13  is    om      code            compile        0         forgot to add return type in function declaration
000104 14:44:28  sy    ty      code            compile        0         Forgot to add "end" to end of class declaration
000104 14:44:59  sy    ty      code            compile        0         forgot to add "end" to deferred feature declaration
000104 14:46:15  sy    ty      code            compile        0         used "return result" semantics instead of "Result := result" semantics
000104 14:46:56  ic    ig      design          compile        1         forgot to add "last_line_is_countable" feature
000104 14:49:02  wn    cm      code            compile        0         Used "last_string" in feature name instead of "last_line"
000104 14:50:36  wn    cm      code            compile        0         Used "last_line" in feature name instead of "last_string" (external class)
000104 14:51:34  wn    om      code            compile        2         used "copy" instead of proper "twin"
000104 14:54:50  wn    cm      code            compile        0         Used "null" instead of "void"
000104 14:56:19  ic    om      code            compile        1         forgot to add "make" feature to simple_loc_counter
000104 15:01:11  ic    om      code            test           2         forgot to add "print_counted_lines" feature (for debugging)
000104 15:03:50  mi    cm      code            test           1         forgot to increment index in loop
000104 15:06:34  mi    cm      code            test           1         forgot to add newline to print_counted_lines
000104 15:08:45  wn    cm      code            test           0         changed "begin" constant to "do" constant (er... Eiffel doesn't use "begin")
000104 15:09:16  md    om      design          test           4         Er... forgot to exclude empty lines from count