CS385 -- Lab 3: Routing
Fall 2004
Objectives Addressed
- Be able to apply asymptotic time complexity analysis to choose among
competing algorithms.
- Be familiar with engineering applications for many of the fundamental
algorithms discussed in the course.
Purpose
This lab assignment is designed to stimulate creative solutions
to a problem that is difficult to solve efficiently.
Overview
While asymptotic analysis allows us to evaluate an algorithm's
efficiency, it ignores other important issues like implementation complexity
and time constants. In this assignment, you should focus on developing
an algorithm that will produce the correct answer, scale well to large data
sets, and run as quickly as possible on the given data set (listed in order of
priority).
Assignment
Input Data
The GPS data available to you consists of a
file with multiple trips. Each trip represents one commute from my home to
MSOE or back. The first few lines of the file are:
Start time: 2004-03-05 11:09:23
record,fix,hour,min,sec,msec,latitude,longitude,alt
1,3,11,9,24,710,43.0835367,-88.0390017,702.09
2,3,11,9,26,310,43.0835367,-88.0390017,702.09
3,3,11,9,27,310,43.0835367,-88.0390017,702.09
...
The first line indicates the starting time for the first trip. The second
line contains labels for the different fields associated with each record.
The third line contains the data associated with the first data record. Each
record includes (separated by commas) the record number, the number of
satellites used to get a fix on the GPS receiver's position, the time the
record was made (in hours, minutes, seconds, and milliseconds), the latitude,
longitude, and altitude. These records continue until the end of the trip.
Data for the second trip follow immediately after the first trip. In this
particular example, the first trip has 1336 records. Here are lines 1337-1345
of the data file:
1335,4,11,31,52,750,43.0445550,-87.9081050,639.76
1336,4,11,31,53,750,43.0445550,-87.9081033,639.76
Start time: 2004-03-05 17:10:35
record,fix,hour,min,sec,msec,latitude,longitude,alt
1,3,17,10,36,380,43.0445983,-87.9081033,583.98
2,3,17,10,37,380,43.0445950,-87.9081017,583.98
3,3,17,10,38,380,43.0445917,-87.9081000,583.98
4,3,17,10,39,380,43.0445917,-87.9081033,587.27
5,3,17,10,40,380,43.0445850,-87.9081067,587.27
I suggest that you begin testing with the
smallgps.zip file, which contains a few
very short trips. At a minimum, your program should be able to produce all
of the required statistics for this file. This
file contains a subset of the data in biggps.zip
and may be used for testing.
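As a starting point, the file format described above could be read along the following lines. This is only a minimal sketch in Python, not a required design: the names Position and parse_trips are made up for illustration, and the sketch assumes that only latitude, longitude, and altitude are needed and that malformed lines can be skipped silently.

```python
from typing import Iterable, List, NamedTuple


class Position(NamedTuple):
    """One GPS record, keeping only the fields used for routing (assumption)."""
    latitude: float
    longitude: float
    altitude: float


def parse_trips(lines: Iterable[str]) -> List[List[Position]]:
    """Split the lines of a GPS data file into trips (one list per trip)."""
    trips: List[List[Position]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Start time:"):
            trips.append([])  # every trip begins with a "Start time:" line
            continue
        if line.startswith("record,"):
            continue  # column-label line, carries no data
        fields = line.split(",")
        if len(fields) != 9 or not trips:
            continue  # assumption: ignore malformed or orphaned lines
        # Fields 6, 7, 8 are latitude, longitude, alt (zero-based indices).
        trips[-1].append(
            Position(float(fields[6]), float(fields[7]), float(fields[8]))
        )
    return trips
```

Passing an open file object (for example, `parse_trips(open(path))`, where `path` names the unzipped data file) yields one list of positions per trip.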
Details
- A trip is defined as a collection of GPS positions
tracking the position of my vehicle from when the car is started
until it is turned off. Each trip in the data file begins with
Start time: ....
- Two trips are said to be on the same commute if and
only if both trips begin at the same position and end at the same
position.
- Two trips are said to be on the same route if and
only if every position in the first trip is also in the second trip
and every position in the second trip is also in the first trip.
- Generally speaking, the data file consists of two commutes (from my home
to MSOE and from MSOE to my home). However, the data file may contain
some additional commutes (e.g., from MSOE to church). The most
common commute is defined as the commute with the greatest number of
trips in the data file. If multiple commutes tie for the most common
commute, any one of these commutes may be selected as the most common.
- The most common route is defined as the route with the
greatest number of trips in the data file.
- GPS positions should be considered equivalent if they are within
X meters of each other. Specifically, two trips
are from the same commute if and only if their starting positions
are within X meters of each other and their ending
positions are within X meters of each other.
- Your program should be able to work with arbitrary GPS trip data provided
it conforms to the format described above.
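The definitions above can be written down directly, as in the Python sketch below. It is deliberately naive and is meant only to pin down the definitions, not to suggest an efficient algorithm. Several things here are assumptions rather than requirements: the helper names, the use of the haversine formula with a mean Earth radius of 6,371,000 m, the decision to ignore altitude, and the reading of "within X meters" as "less than or equal to X".

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an approximation


def distance_m(a: Point, b: Point) -> float:
    """Great-circle distance in meters between two points (haversine)."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def same_position(a: Point, b: Point, x_m: float) -> bool:
    """True if the two points are within x_m meters of each other."""
    return distance_m(a, b) <= x_m


def same_commute(t1: Sequence[Point], t2: Sequence[Point], x_m: float) -> bool:
    """Same commute: matching start positions and matching end positions."""
    return (same_position(t1[0], t2[0], x_m)
            and same_position(t1[-1], t2[-1], x_m))


def same_route(t1: Sequence[Point], t2: Sequence[Point], x_m: float) -> bool:
    """Same route: every position of each trip has a match in the other."""
    def covered(src: Sequence[Point], dst: Sequence[Point]) -> bool:
        return all(any(same_position(p, q, x_m) for q in dst) for p in src)
    return covered(t1, t2) and covered(t2, t1)
```

Note that same_route compares every position of one trip with every position of the other, so it takes time proportional to the product of the two trip lengths. That cost is exactly the kind of thing this lab asks you to think about.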
Algorithm Results
Your program should read a data file with GPS trip information (like
the input file above) and indicate the following:
- Number of unique starting positions.
- Number of unique ending positions.
- Number of unique commutes.
- For the most common commute:
- Number of unique trips.
- Starting position.
- Ending position.
- Number of unique routes.
- Number of unique trips for the most common route.
You should calculate the starting/ending position as the geometric
centroid of all the positions considered to be the same position (using
the X meter rule).
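One simple reading of "geometric centroid" is the arithmetic mean of the latitudes and of the longitudes, sketched below in Python. This is an assumption, not the only valid interpretation: it treats the positions as points on a flat plane, which is reasonable for points only a few meters apart and far from the poles and the 180-degree meridian. The function name is made up for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def centroid(points: Iterable[Point]) -> Point:
    """Mean latitude and mean longitude of a group of nearby positions."""
    pts = list(points)
    if not pts:
        raise ValueError("centroid of an empty group is undefined")
    mean_lat = sum(p[0] for p in pts) / len(pts)
    mean_lon = sum(p[1] for p in pts) / len(pts)
    return (mean_lat, mean_lon)
```

For example, the reported starting position of a commute would be the centroid of the starting positions of all trips assigned to that commute.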
Interim report (due 11:00pm, the day prior to week 7, lab)
You should submit a partially completed lab report (fill in header
information and questions section) with answers to the following questions.
- What effect do you expect the choice of X meters to
have on the speed of your program? How significant is this effect
(if any) on the speed of your algorithm? (E.g., changing it from
15 to 7 meters will make it (a little bit faster)/(a little bit
slower)/(twice as fast)/(twice as slow).)
- Suppose the data file has K unique trips,
L unique commutes, and M unique
routes.
- Which will have the largest effect on the speed of your program?
- Which will have the least effect on the speed of your program?
- Which (if any) of K, L, and
M can be ignored when calculating O()
for your algorithm?
Lab report (due 11:00pm, the day prior to week 8, lecture 3)
Each pair should submit one lab report. Your report should include:
- Complete description of your algorithm. Be sure to discuss other
alternatives that you considered and why you chose your method over
other alternatives.
- Time complexity analysis for your algorithm.
- Original answers to above questions and any additional discussion.
- Results of your program for the
smallgps.zip and
biggps.zip files using
X = 15 meters and X = 2 meters.
- Benchmarks for your algorithm for the
given data files using X = 15, 10, 5, and 2 meters.
- Reactions/suggestions (optional)
- CSP spreadsheet (Continue with the spreadsheet you used for
lab 2, but use the naming convention:
385MSOEloginL3.xls.)
- Documented source code
- Results associated with the
biggps.zip file are not required;
however, please include them if you do have these results.
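For the benchmark item in the list above, one possible timing harness is sketched below in Python. It is only an illustration: `analyze` is a hypothetical callable standing in for your own program (it takes X in meters and runs the full analysis on an already chosen data file), and taking the best of a few repeated runs is a design choice made here, not a requirement.

```python
import time
from typing import Callable, Dict, Sequence


def benchmark(analyze: Callable[[float], object],
              thresholds: Sequence[float] = (15, 10, 5, 2),
              repeats: int = 3) -> Dict[float, float]:
    """Return the best wall-clock time in seconds for each value of X."""
    results: Dict[float, float] = {}
    for x in thresholds:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            analyze(x)  # hypothetical: run the whole analysis with this X
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)
        results[x] = best
    return results
```

Reporting the best of several runs reduces the influence of other activity on the machine. Whichever convention you choose, state it in your report.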
Oral Presentations (in class week 9, lecture 1)
The first lecture of week 9 will be devoted to student presentations.
- Each team should summarize the results of their solution to lab 3,
explain the algorithm they used, and, time permitting, describe
the alternatives they considered.
- Presentations are limited to five minutes in duration. Presenters
exceeding this limit risk being unceremoniously cut off. A short
period for questions and comments from class members will be provided
and is not included in the five minute limit. The duration of this
period is at the instructor's discretion.
- The presenters should arrange for any necessary audio-visual equipment.
The instructor will consult on this process if requested.
- Grading will be based on the likelihood that I would hire your team
to work for me on a similar project. (Think of this as a presentation
given as part of an interview process.)
- You should take this assignment seriously as it accounts for 40% of
your grade for this lab assignment. Students who do not devote adequate
time to preparing their presentation should expect to receive a poor
grade.
As with any report you submit, correct spelling and grammar are
required. In addition, your report should be submitted electronically
following the Electronic Submission
Guidelines. (You may wish to consult the
sample report before submitting your
report.) Be sure to keep copies of all your files, in case something
gets lost.
If you have any questions, consult the instructor.