CS-285 -- Lab 7: Spell Checker Comparisons

Winter Quarter 1999-2000

Electrical Engineering and Computer Science Department
Dr. Christopher C. Taylor

CC-27C, 277-7339

www.msoe.edu/~taylor/

Purpose

The purpose of this lab assignment is to compare the appropriateness of different data structures, in particular: the list, set, and hash table, for use with the task of spell checking.
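As a rough illustration of the three structures being compared, the sketch below shows what a dictionary lookup might look like with each one. It assumes modern standard-library containers (std::list, std::set, and std::unordered_set as a stand-in for a hash table) rather than the classes written for labs 1, 4, and 6, and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <set>
#include <string>
#include <unordered_set>

// List: a linear search that may walk every dictionary word.
bool inList(const std::list<std::string>& dict, const std::string& word) {
    return std::find(dict.begin(), dict.end(), word) != dict.end();
}

// Set: std::set keeps its keys ordered, so find() searches a balanced tree.
bool inSet(const std::set<std::string>& dict, const std::string& word) {
    return dict.find(word) != dict.end();
}

// Hash table: std::unordered_set is an assumption standing in for the lab's hash table.
bool inHashTable(const std::unordered_set<std::string>& dict, const std::string& word) {
    return dict.count(word) > 0;
}

// Usage sketch: all three lookups should agree on the same (made-up) dictionary.
void checkLookupsAgree() {
    const std::list<std::string> words = {"apple", "banana", "cherry"};
    const std::set<std::string> asSet(words.begin(), words.end());
    const std::unordered_set<std::string> asHash(words.begin(), words.end());

    assert(inList(words, "banana") && inSet(asSet, "banana") && inHashTable(asHash, "banana"));
    assert(!inList(words, "bananna") && !inSet(asSet, "bananna") && !inHashTable(asHash, "bananna"));
}
```

The three functions return the same answers; only the amount of work behind each lookup differs, which is what the benchmarks in this lab are meant to expose.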

Assignment (Due at the end of your week 10 lab time)

You should modify your spell checker programs from lab 1, lab 4, and lab 6 so that their run times can be benchmarked. You may use the Benchmark class (bmark.h) to do the timing. You should also remove any user interaction (prompting the user to replace a word, etc.). You may also wish to compile your programs in release mode instead of debug mode.
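The interface of the Benchmark class in bmark.h is not reproduced here, so the following is only a minimal sketch of the same idea using std::chrono from the standard library. The names countMisspelled, secondsToRun, and demoTimedRun are hypothetical, and the in-memory dictionary and text stand in for the real files.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Counts the words in `text` that are missing from `dict`.
// There is no user interaction, so the loop can be timed cleanly.
std::size_t countMisspelled(const std::set<std::string>& dict, std::istream& text) {
    std::size_t misses = 0;
    std::string word;
    while (text >> word) {
        if (dict.find(word) == dict.end()) {
            ++misses;
        }
    }
    return misses;
}

// Runs `task` once and returns the elapsed wall-clock time in seconds.
template <typename Task>
double secondsToRun(Task task) {
    const auto start = std::chrono::steady_clock::now();
    task();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Usage sketch with made-up data in place of the dictionary and test file.
std::size_t demoTimedRun() {
    const std::set<std::string> dict = {"the", "quick", "brown", "fox"};
    std::istringstream text("the quick brwn fox jumpz");
    std::size_t misses = 0;
    const double elapsed = secondsToRun([&] { misses = countMisspelled(dict, text); });
    assert(elapsed >= 0.0);  // steady_clock never runs backwards
    return misses;
}
```

Timing only the checking loop, and not the loading of the dictionary, is one possible design choice; whichever you pick, apply it consistently to all three programs so the numbers are comparable.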

Once you have made the appropriate modifications to each program, download a new dictionary file and three new test files (lab7a.txt, lab7b.txt, and lab7c.txt). The lab7a.txt file should be used when testing the set and hash table implementations; however, in all likelihood, your list implementation will be too slow for a test file this large. The files lab7b.txt and lab7c.txt contain excerpts from the lab7a.txt file and should be used to test your list implementation. (See the specific steps below.)

To avoid timing complications caused by variations in network access time, you may wish to store these files on the local hard disk instead of your f: drive. Be sure to remove these files from the hard disk when you are done with them. Before running your programs with these data files, make some predictions on how quickly each of your programs will execute.

Once you have your programs ready to run, do the following (in this order):

  • Make some predictions on how quickly each of your programs will run.
  • Run your set and hash table implementations using the new dictionary and the lab7a.txt test file.
  • Take note of the run times for each of the programs.
  • Run your list implementation using the new dictionary and the two excerpt files.
  • Take note of the run times for each excerpt file.
  • Write a brief email message to your instructor indicating:
    • Predictions of how quickly each program would run.
    • How quickly each program did run.
    • The answer to the following questions:
      • Assume that the dictionary has N words and the test file has M words. What is the big-oh notation for the time complexity of each of your programs?
      • Was there a noticeable difference in the run times between processing lab7b.txt and lab7c.txt? Why do you suppose there was (or wasn't)?
      • Based on your benchmarks of the list implementation with the two excerpt files, how long do you anticipate it would take your list implementation to process the lab7a.txt file?
    • List any ways that this lab assignment could have been improved.
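For the last question above, one possible first-order estimate is sketched below. It assumes, purely for illustration, that with a fixed dictionary the list implementation's run time grows in proportion to the number of words checked; whether that assumption holds for your program is part of what you should reason about. The function name and the numbers in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Scales a measured excerpt time up to a larger file, ASSUMING the time per
// word stays constant (that is, run time is linear in the number of words).
double estimateFullRunSeconds(double excerptSeconds,
                              std::size_t excerptWords,
                              std::size_t fullWords) {
    assert(excerptWords > 0);  // avoid dividing by zero
    const double secondsPerWord = excerptSeconds / static_cast<double>(excerptWords);
    return secondsPerWord * static_cast<double>(fullWords);
}
```

For example, if a made-up excerpt of 1024 words took 4.0 seconds, estimateFullRunSeconds(4.0, 1024, 4096) would predict 16.0 seconds for a 4096-word file. Comparing the estimates derived from lab7b.txt and lab7c.txt separately is one way to see how much the result depends on the excerpt chosen.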

As with any report you submit, correct spelling and grammar are required; however, it need not follow the electronic submission guidelines. A standard email message is just fine. Be sure to send yourself a copy of your email message in case something gets lost. It may be wise to keep a diskette backup as well.

If you have any questions, consult the instructor.


This page was created by Dr. Christopher C. Taylor, copyright 1999-2000.