TTIOK KWIC Programmer's Manual

This version of KWIC is a fusion of Chaz's code and Aaron's code.  Originally, we attempted to just use Aaron's code, but each person's code had enough specific strengths (Aaron had implemented comparators and a rudimentary process class, and Chaz had a more modular program design) to make it worthwhile to use both.

HOLISTIC OVERVIEW

       The KWIC program is designed to return KWICs, or KeyWords In Context.   The program has been redesigned for modularity.  The main 'kwic' class has been split into four classes: input, process, config and output.  This keeps the main aspects of the program clearly separated, so it is easier to figure out where (and how) to implement new features.

DESIGN DECISIONS

       One aspect of particular interest is the processing of the kwic.properties file.  We wanted some form of error checking for the file, but it was not immediately obvious how to perform it.   One idea was to attempt to convert each argument to an integer to check whether it contains valid integer data.  What we settled on was the K_PROP structure (see below).  A K_PROP has a field containing the input string as well as a field containing the integer value of that string as returned by atoi.  It is left to the programmer to know which field is meaningful for a given setting.  This was done to keep settings flexible: instead of hard-coding each property, there is a uniform interface for adding new settings.
    Another interesting design point is how the settings data is passed to the various parts of the program.  We came up with a fairly elegant solution: a function called getSetting inside the k_config class which, when called with a string argument (example: k_config.getSetting("order")), returns the corresponding K_PROP.
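    The K_PROP and getSetting ideas above can be sketched roughly as follows.  This is a minimal illustration, not the project's actual code: the field names (name, strVal, intVal), the makeProp and addSetting helpers, the "width" setting and the behavior for a missing setting are all assumptions.

```cpp
#include <cstddef>   // std::size_t
#include <cstdlib>   // std::atoi
#include <string>
#include <vector>

// Hypothetical layout of K_PROP; the field names are assumptions.
struct K_PROP {
    std::string name;    // setting name, e.g. "order"
    std::string strVal;  // value exactly as read from kwic.properties
    int intVal;          // std::atoi of strVal (0 when strVal is not numeric)
};

// Hypothetical helper: build a K_PROP, filling both value fields.
K_PROP makeProp(const std::string& name, const std::string& value) {
    K_PROP p;
    p.name = name;
    p.strVal = value;
    p.intVal = std::atoi(value.c_str());
    return p;
}

// Minimal stand-in for the k_config class.
class k_config {
public:
    // Hypothetical: register one name/value pair from the properties file.
    void addSetting(const std::string& name, const std::string& value) {
        props.push_back(makeProp(name, value));
    }

    // Return the K_PROP with the given name.  Returning an empty K_PROP
    // for an unknown name is an assumption of this sketch.
    K_PROP getSetting(const std::string& name) const {
        for (std::size_t i = 0; i < props.size(); ++i) {
            if (props[i].name == name) return props[i];
        }
        return makeProp(name, "");
    }

private:
    std::vector<K_PROP> props;
};
```

    With this layout, a caller that expects text reads strVal and a caller that expects a number reads intVal.  Note that atoi yields 0 for non-numeric input, so intVal alone cannot distinguish a real 0 from a string such as "length".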
    A less elegant design decision was implementing comparator selection as a 6-branch if-tree.  This choice was primarily motivated by time, as it was very quick to implement using cut/paste.
    Another design decision was the creation of the k_config class.  Though this was not in our original plan, it soon became clear that this was the most modular and efficient way to handle the kwic.properties file.
    The WORD structure was modified from Aaron's code.   The WORD struct allows the code to be sorted and modified (for example by excluding certain keywords) rather easily.  The usefulness of the WORD struct in streamlining the program is clear when you consider that Chaz's individual implementation had intertwined the process and output functionalities.

PROGRAM STRUCTURE

       The program includes the following files:
          kwic.cpp -- contains main
          k_input.h/cpp -- contains input functions for the program.  The input subsystem reads in the command line arguments, and, from them, reads the code from specified files and directories.  Its final product, which it passes to the other classes, is a vector of KEYS (see below).
          k_process.h/cpp -- contains sorting and other miscellaneous functions.  The process class takes the vector of KEYS and converts it to a vector of WORDs.  It then sorts the WORDs and processes any exclude data.
          k_output.h/cpp -- contains output functions, such as text formatting.  The output subsystem was not successfully debugged by the end of the project.  This is solely due to running out of time.
          k_config.h/cpp -- contains functions pertaining to the kwic.properties file.  This class was separated from the other three because it can be implemented without being interwoven with them.  This added modularity made the program easier to code, read and modify.
          *comparator.h -- contains a comparator for use with sort.  Multiple comparators are implemented.
          kwic.properties -- ASCII file containing properties that affect the behavior of the KWIC program.
          globals.h -- contains structs which are universally used in the 4 classes.

GLOBALS.H STRUCTS
        K_PROP -- contains a string field and an integer field, as well as a field for the setting's name.
        KEYS -- contains string fields for the name and filename, and an int for the line number.
        WORD -- a more complex struct.  Contains strings for the keyword and filename, vectors of strings for the before and after fields, and ints for line number and number of occurrences in file.  WORDs are sorted and then displayed as final output.
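
The KEYS and WORD descriptions above might translate into declarations along these lines.  This is only a sketch: every field name is an assumption chosen to match the prose, and the real definitions may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical layout of KEYS, as produced by the input subsystem.
struct KEYS {
    std::string name;      // the name field
    std::string filename;  // file the entry came from
    int line;              // line number within that file
};

// Hypothetical layout of WORD, as sorted and displayed by the program.
struct WORD {
    std::string keyword;              // the keyword itself
    std::string filename;             // file the keyword came from
    std::vector<std::string> before;  // context words preceding the keyword
    std::vector<std::string> after;   // context words following the keyword
    int line;                         // line number of the occurrence
    int count;                        // number of occurrences in the file
};
```

Keeping the context in separate before and after vectors means the output stage can format each side of the keyword independently.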

INCLUDED COMPARATORS
   
          word -- alphabetical sort
          reverseword -- reverse alphabetical sort
          length -- sort by word length (shortest to longest)
          reverselength -- sort by word length (longest to shortest)
          number -- sorts by number of occurrences (most to least)
          reversenumber -- sorts by number of occurrences (least to most)
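
As an illustration of how the six comparators listed above could plug into std::sort, here is a minimal sketch.  The trimmed-down WORD, the function names and the sortWords helper are assumptions; the string-based selection mirrors the 6-branch if-tree mentioned under DESIGN DECISIONS but is not the project's actual code.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Trimmed-down WORD holding only the fields the comparators need (assumed names).
struct WORD {
    std::string keyword;
    int count;  // number of occurrences in the file
};

// Alphabetical here means std::string ordering, which is case-sensitive.
bool byWord(const WORD& a, const WORD& b) { return a.keyword < b.keyword; }
bool byReverseWord(const WORD& a, const WORD& b) { return a.keyword > b.keyword; }

// Shortest to longest, and the reverse.
bool byLength(const WORD& a, const WORD& b) { return a.keyword.size() < b.keyword.size(); }
bool byReverseLength(const WORD& a, const WORD& b) { return a.keyword.size() > b.keyword.size(); }

// Most to least occurrences, and the reverse.
bool byNumber(const WORD& a, const WORD& b) { return a.count > b.count; }
bool byReverseNumber(const WORD& a, const WORD& b) { return a.count < b.count; }

// Six-branch selection on the comparator name; unknown names leave the
// vector untouched (an assumption of this sketch).
void sortWords(std::vector<WORD>& words, const std::string& order) {
    if (order == "word") {
        std::sort(words.begin(), words.end(), byWord);
    } else if (order == "reverseword") {
        std::sort(words.begin(), words.end(), byReverseWord);
    } else if (order == "length") {
        std::sort(words.begin(), words.end(), byLength);
    } else if (order == "reverselength") {
        std::sort(words.begin(), words.end(), byReverseLength);
    } else if (order == "number") {
        std::sort(words.begin(), words.end(), byNumber);
    } else if (order == "reversenumber") {
        std::sort(words.begin(), words.end(), byReverseNumber);
    }
}
```

Because std::sort is not stable, WORDs that compare equal (for example, two keywords of the same length) may appear in either order; std::stable_sort would preserve their input order if that matters.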