Compsci 101, Fall 2017, Airport HOWTO

Getting Started

There are no files to snarf for this assignment. You'll need to write your program from scratch. Feel free to look at other programs you have written to help you get started, such as Assignment 3, which also reads data from a URL.

You should read the datafile from the URL given.

It would be helpful to understand the example program we went over in the October 24 lecture on using dictionaries to calculate information about names, such as the first name that occurs the most often.

Using a small data file for testing

To convince yourself that your program is working, consider testing with a small data file first. We have provided a smaller file that you can read from this URL:

http://www.cs.duke.edu/courses/fall17/compsci101/data/airportSmall.txt

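Each line of data represents one month at one airport, with the fields separated by $ characters. The fields are, in this order: airport code, month, year, number of cancelled flights, number of carriers, number of delayed flights, total number of flights, number of ontime flights, and airport name.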

The first line below represents the Atlanta airport (code ATL) in June 2003. That month it had 216 cancelled flights, 11 carriers, 5843 delayed flights, 30060 total flights and 23974 ontime flights. Its name is: Atlanta, GA: Hartsfield-Jackson Atlanta International.
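
As a small sketch of how one line could be split into its fields, assuming Python 3. The function name parse_line and the dictionary keys are made up for illustration; the assignment does not require them.

```python
def parse_line(line):
    """Split one '$'-separated line into a dictionary of named fields."""
    parts = line.strip().split("$")
    return {
        "code": parts[0],
        "month": parts[1],
        "year": int(parts[2]),
        "cancelled": int(parts[3]),
        "carriers": int(parts[4]),
        "delayed": int(parts[5]),
        "total": int(parts[6]),
        "ontime": int(parts[7]),
        "name": parts[8],   # the name contains commas and colons but no '$'
    }

# First line of the small file.
record = parse_line("ATL$June$2003$216$11$5843$30060$23974$"
                    "Atlanta, GA: Hartsfield-Jackson Atlanta International")
print(record["code"], record["month"], record["cancelled"], record["total"])
```

Converting the numeric fields with int() right away means later code can add and compare them without repeated conversions.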

Here is the small file:

ATL$June$2003$216$11$5843$30060$23974$Atlanta, GA: Hartsfield-Jackson Atlanta International
BOS$June$2003$138$14$1623$9639$7875$Boston, MA: Logan International
BWI$June$2003$29$11$1245$8287$6998$Baltimore, MD: Baltimore/Washington International Thurgood Marshall
CLT$June$2003$73$11$1562$8670$7021$Charlotte, NC: Charlotte Douglas International
DCA$June$2003$74$13$1100$6513$5321$Washington, DC: Ronald Reagan Washington National
DEN$June$2003$34$13$1611$11691$10024$Denver, CO: Denver International
ATL$July$2003$484$11$7452$30875$22868$Atlanta, GA: Hartsfield-Jackson Atlanta International
BOS$July$2003$226$14$1710$9972$8033$Boston, MA: Logan International
BWI$July$2003$57$11$1679$8611$6855$Baltimore, MD: Baltimore/Washington International Thurgood Marshall
CLT$July$2003$163$11$1895$8965$6883$Charlotte, NC: Charlotte Douglas International
DCA$July$2003$181$13$1260$6695$5241$Washington, DC: Ronald Reagan Washington National
DEN$July$2003$71$13$1753$12702$10869$Denver, CO: Denver International
ATL$May$2004$457$11$8802$34896$25565$Atlanta, GA: Hartsfield-Jackson Atlanta International
BOS$May$2004$225$16$2489$10985$8262$Boston, MA: Logan International
BWI$May$2004$49$12$2037$8902$6779$Baltimore, MD: Baltimore/Washington International Thurgood Marshall
CLT$May$2004$80$11$1291$8857$7479$Charlotte, NC: Charlotte Douglas International
DCA$May$2004$138$15$1476$7615$5976$Washington, DC: Ronald Reagan Washington National
DEN$May$2004$106$14$2012$12820$10653$Denver, CO: Denver International
ATL$June$2006$693$13$8470$33787$24447$Atlanta, GA: Hartsfield-Jackson Atlanta International
BOS$June$2006$367$13$3963$10746$6397$Boston, MA: Logan International
BWI$June$2006$92$14$2582$9086$6371$Baltimore, MD: Baltimore/Washington International Thurgood Marshall
CLT$June$2006$229$11$2558$9366$6550$Charlotte, NC: Charlotte Douglas International
DCA$June$2006$271$14$2078$7794$5375$Washington, DC: Ronald Reagan Washington National
DEN$June$2006$263$15$4082$20096$15675$Denver, CO: Denver International
ATL$May$2008$173$15$5217$34572$29152$Atlanta, GA: Hartsfield-Jackson Atlanta International
BOS$May$2008$156$14$1914$10276$8198$Boston, MA: Logan International
BWI$May$2008$28$14$1162$8988$7788$Baltimore, MD: Baltimore/Washington International Thurgood Marshall
CLT$May$2008$111$13$1698$10996$9179$Charlotte, NC: Charlotte Douglas International
DCA$May$2008$120$13$1185$7435$6118$Washington, DC: Ronald Reagan Washington National
DEN$May$2008$186$16$4782$20765$15765$Denver, CO: Denver International

Processing the data

After reading in the data, you might want to put it in a list first to make it easier to process. Alternatively, you could put it in a dictionary and then use that dictionary to create other dictionaries. Either way is fine.
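
Here is one possible sketch of the list approach, assuming Python 3 and assuming the data file is UTF-8 text. The function names read_lines_from_url and lines_to_records are hypothetical. The example calls only the second function, on two lines copied from the small file, so it runs without a network connection.

```python
import urllib.request

def read_lines_from_url(url):
    """Fetch the text at url and return its non-empty lines as strings."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")  # assumption: file is UTF-8
    return [line for line in text.splitlines() if line.strip()]

def lines_to_records(lines):
    """Turn each '$'-separated line into a list of its fields (strings)."""
    return [line.strip().split("$") for line in lines]

# Two lines copied from the small file.
sample_lines = [
    "DEN$June$2003$34$13$1611$11691$10024$Denver, CO: Denver International",
    "DEN$July$2003$71$13$1753$12702$10869$Denver, CO: Denver International",
]
records = lines_to_records(sample_lines)
print(len(records), records[0][0], records[1][6])
```

With a real URL you would pass the result of read_lines_from_url(url) to lines_to_records instead of sample_lines.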

You are required to create at least three dictionaries.

  1. To calculate which airport has had the most months with 100 or more cancellations, you may want to create a dictionary with the key being an airport code (such as RDU) and the value being the number of months with 100 or more cancellations. This count covers all of the years in the data, so it may be more than 12.

  2. To calculate which month is the busiest for each airport code, you may want to write a function that creates and returns a dictionary for a given airport code. For example, if RDU is passed in, the function would create a dictionary mapping each month to a list of the total flights from RDU for that month, one entry per year. The busiest month is the one with the highest average of those totals.

  3. To calculate which airports have 80 percent or more of their flights on time, for each airport you will need to calculate the total number of flights (over all months and years) and the total number of ontime flights (over all months and years). You could use two dictionaries for these two totals.
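
The three dictionaries described above could be built roughly like this. This is only a sketch, assuming Python 3 and assuming each record is a list of string fields in the order given earlier. The function names are hypothetical, and the six sample lines are copied from the small file.

```python
SAMPLE = """BOS$June$2003$138$14$1623$9639$7875$Boston, MA: Logan International
DEN$June$2003$34$13$1611$11691$10024$Denver, CO: Denver International
BOS$May$2004$225$16$2489$10985$8262$Boston, MA: Logan International
DEN$May$2004$106$14$2012$12820$10653$Denver, CO: Denver International
BOS$May$2008$156$14$1914$10276$8198$Boston, MA: Logan International
DEN$May$2008$186$16$4782$20765$15765$Denver, CO: Denver International"""
records = [line.split("$") for line in SAMPLE.splitlines()]

def cancellation_months(records, threshold=100):
    """Map airport code -> number of months with >= threshold cancellations."""
    counts = {}
    for rec in records:
        code, cancelled = rec[0], int(rec[3])
        if code not in counts:
            counts[code] = 0
        if cancelled >= threshold:
            counts[code] += 1
    return counts

def flights_by_month(records, airport):
    """Map month -> list of total flights for that month at one airport."""
    months = {}
    for rec in records:
        if rec[0] == airport:
            if rec[1] not in months:
                months[rec[1]] = []
            months[rec[1]].append(int(rec[6]))
    return months

def flight_totals(records):
    """Return two dictionaries: code -> total flights, code -> ontime flights."""
    total, ontime = {}, {}
    for rec in records:
        code = rec[0]
        total[code] = total.get(code, 0) + int(rec[6])
        ontime[code] = ontime.get(code, 0) + int(rec[7])
    return total, ontime

print(cancellation_months(records))
print(flights_by_month(records, "DEN"))
total, ontime = flight_totals(records)
print(total["DEN"], ontime["DEN"])
```

Note that these numbers come from only six of the lines, so they will not match the sample run below, which uses the whole small file.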

Sample Run with Sample data file above

Note this is a sample run with the SMALL data file that is shown above.

Information about major airports in the United States

Airport with most months with 100 or more cancellations is  BOS
It had 5 months with 100 or more cancellations

Busiest month for each airport:
busiest month for ATL is May with 34734.0 average flights that month
busiest month for BOS is May with 10630.5 average flights that month
busiest month for DCA is May with 7525.0 average flights that month
busiest month for DEN is May with 16792.5 average flights that month
busiest month for CLT is May with 9926.5 average flights that month
busiest month for BWI is May with 8945.0 average flights that month

Airports that have >= 80 percent on time flights:
DEN 0.806747444732

NOTE: For the first question, both BOS and ATL have 5 months. Since there is a tie you can print out either one of them. Only print one if there is a tie.
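
One way to pick a single winner when there is a tie is sketched below, assuming Python 3. The counts in cancel_counts were worked out by hand from the small file above, and the variable names are only for illustration.

```python
# Months with 100 or more cancellations, counted by hand from the small file.
cancel_counts = {"ATL": 5, "BOS": 5, "BWI": 0, "CLT": 3, "DCA": 4, "DEN": 3}

best_code = None
best_count = -1
for code in cancel_counts:
    # Strictly greater: on a tie, the airport seen first is kept.
    if cancel_counts[code] > best_count:
        best_code = code
        best_count = cancel_counts[code]

print("Airport with most months with 100 or more cancellations is", best_code)
print("It had", best_count, "months with 100 or more cancellations")
```

Which of the tied airports gets printed depends on the order in which the dictionary is visited, so your output may show ATL where the sample run shows BOS. Either is acceptable.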

NOTE: For the second question, they do not have to be in alphabetical order but it might be a good idea so you can tell they are all there!
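
As a sketch of how the busiest month could be found from a month-to-list dictionary, assuming Python 3 (where / always gives a float). The function name busiest_month is hypothetical, and the DEN numbers are taken from the small file.

```python
def busiest_month(month_to_flights):
    """Return (month, average) for the month with the highest average flights."""
    best_month, best_average = None, -1.0
    for month, flights in month_to_flights.items():
        average = sum(flights) / len(flights)
        if average > best_average:
            best_month, best_average = month, average
    return best_month, best_average

# Total flights per month for DEN in the small file, one entry per year.
den = {"June": [11691, 20096], "July": [12702], "May": [12820, 20765]}
month, average = busiest_month(den)
print("busiest month for DEN is", month, "with", average, "average flights that month")
```

To print the airports in alphabetical order, you could loop over sorted(codes) where codes is your collection of airport codes.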

NOTE: For the third question, the answer does need to be in alphabetical order by airport code. In this case there is only one airport. If there is more than one airport they need to be in sorted order by airport codes.
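
A minimal sketch of the sorted on-time listing, assuming Python 3 and two dictionaries of totals as suggested earlier. The function name ontime_airports is hypothetical. The totals for the three airports shown were added up by hand from the small file; the other three airports are left out to keep the example short.

```python
# Totals over all months and years in the small file, for three airports.
total_flights = {"DEN": 78074, "ATL": 164190, "BWI": 43874}
ontime_flights = {"DEN": 62986, "ATL": 126006, "BWI": 34791}

def ontime_airports(total, ontime, cutoff=0.80):
    """Return (code, fraction) pairs at or above cutoff, sorted by code."""
    result = []
    for code in sorted(total):          # sorted() visits codes alphabetically
        fraction = ontime[code] / total[code]
        if fraction >= cutoff:
            result.append((code, fraction))
    return result

print("Airports that have >= 80 percent on time flights:")
for code, fraction in ontime_airports(total_flights, ontime_flights):
    print(code, fraction)
```

Python 3 may print more digits of the fraction than the sample run shows.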

Note that you can print out the dictionary with the small file but it is NOT recommended to print out the dictionaries with the large file. THEY ARE TOO BIG.