Inspired by The Baby Name Wizard's NameVoyager visualization of the data from this government website, we have downloaded the last 30 years, 1981-2010, of the top 500 most popular baby names in the United States and made the data available here and also in the file you will snarf. You will complete five programs that use these yearly rankings to print out information similar to what can be seen using the Wizard.
Complete the following five modules:
BarChartPlot: Pick a name and gender and find all its ranks over the last 30 years. If the name was not one of the top 500 for a given year, use 501 for that year (one more than the maximum rank). As your program's output, plot its popularity as a bar chart, using the function FileUtilitiesProvided.createBarChart.
AverageRank: Pick a name and compute its average rank over the last 30 years. If the name was not one of the top 500 for a given year, use 0 for that year; these years should not count as part of your name's average rank, only those years in which it was one of the top 500. As your program's output, print your name and its average rank together on one line; then, for each of the last thirty years, print the year and the name's rank that year together on a line.
PopularGirlNames: Find the letter with which the most girl names start over the last 30 years. As your program's output, print all the names that start with this letter in alphabetical order, each on a line by itself.
PopularBoyNames: Find the boy name from which the most other boy names are derived over the last 30 years. For example, the names Frankie and Franklin both start with the name Frank; thus, for our purposes, they are derived from it. Note that this definition will also count some alternate spellings as derivations, such as Glen and Glenn. In the case of a tie, return just the alphabetically first name. As your program's output, print the prefix name and all derived names in alphabetical order, each on a line by itself.
PopularName: Each year two names are ranked #1, one for each sex. Find the name, of either sex, that has most often been ranked #1 over the last 30 years. As your program's output, print the name and how often it has ranked #1 on a single line. Note that only the name's rank matters, not the number of babies with that name, i.e., the first column, not either of the other numeric columns.
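Two of the rules above are easy to get subtly wrong: the AverageRank rule that years marked 0 do not count toward the average, and the PopularBoyNames rule for derived names and ties. The following is only a rough sketch of those two rules on in-memory lists; the function names average_rank and most_derived_prefix are made up for illustration and are not part of FileUtilitiesProvided, and reading the data files is left out entirely.

```python
def average_rank(ranks):
    """Average a list of yearly ranks, skipping years marked 0.

    A rank of 0 means the name was not in the top 500 that year,
    so that year is left out of both the sum and the count.
    """
    ranked = [r for r in ranks if r != 0]
    if not ranked:
        return 0.0  # the name never appeared in the top 500
    return float(sum(ranked)) / len(ranked)


def most_derived_prefix(names):
    """Return (prefix_name, derived_names) for the name that the most
    other names start with.

    Candidates are visited in alphabetical order and only a strictly
    larger count replaces the current best, so a tie keeps the
    alphabetically first name.
    """
    unique = sorted(set(names))  # a name may appear in many years
    best_prefix, best_derived = None, []
    for candidate in unique:
        derived = [n for n in unique
                   if n != candidate and n.startswith(candidate)]
        if len(derived) > len(best_derived):
            best_prefix, best_derived = candidate, derived
    return best_prefix, best_derived
```

For example, with the ranks [10, 0, 20] only two years count, so the average is 15.0 rather than 10.0.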
We have included the Python module FileUtilitiesProvided, which includes several functions you can use; you should not modify or add code to this file. Although these functions are general enough to be used in multiple assignments, you should assume the files for the yearly data are named in the following way in the data folder: "../data/fileXXXX.txt", where XXXX is the year for those rankings. You can snarf the starting files for this assignment or view them here.
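As a small illustration of the naming convention, the file names for all 30 years can be generated rather than typed out. This is just a sketch: the helper ranking_file_name and its folder parameter are hypothetical names chosen here, not functions from FileUtilitiesProvided.

```python
def ranking_file_name(year, folder="../data"):
    """Build the path of the rankings file for one year,
    following the pattern ../data/fileXXXX.txt."""
    return "%s/file%d.txt" % (folder, year)


# The data covers 1981 through 2010 inclusive (30 years).
YEARS = list(range(1981, 2011))
file_names = [ranking_file_name(year) for year in YEARS]
```

Keeping the folder as a parameter with a default means the same helper still works if the data is moved.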
A-credit/challenge: Again, try to develop as many general functions as you can for working with the baby name data files that can be reused to simplify writing these programs. These general functions should be written in a separate module, UtilityFunctions, that is imported into each of the five programs you write. In other words, try to reduce the amount of new code you need to write to solve each problem as much as possible.
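One way to think about generalizing is to turn whatever differs between two programs into a parameter. For instance, BarChartPlot and AverageRank both need a name's rank in every year and differ only in the value used for a missing year (501 versus 0). The sketch below assumes the rankings for one gender have already been read into a dictionary mapping each year to a dictionary of name to rank; that in-memory layout and the function name ranks_over_years are assumptions made for this example, not something the provided module is known to produce.

```python
def ranks_over_years(rank_tables, name, missing_rank):
    """Return a list of (year, rank) pairs for name, in year order.

    rank_tables  -- assumed layout: {year: {name: rank}} for one gender
    missing_rank -- value to use when the name is not ranked that year
                    (e.g., 501 for a bar chart, 0 for an average)
    """
    return [(year, rank_tables[year].get(name, missing_rank))
            for year in sorted(rank_tables)]
```

A program that plots would call it with a missing_rank of 501, while one that averages would pass 0, and neither needs its own copy of the loop.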
Submit your source code: the five programs mentioned above and UtilityFunctions, as well as a README file and an ANALYSIS file, described below. Use the submit name assign5-names.
In your ANALYSIS file, discuss the steps you took to generalize
functions from your UtilityFunctions module so they could be used
by other modules (i.e., how they are different than if you had just
written them for one or the other specifically). Additionally, document any bugs or problems in your
program that you were not able to resolve (i.e., there may be certain kinds
of input that you know are not handled properly). If you document bugs that
you cannot fix, and how you tried to fix them, they will affect your grade
far less than bugs we discover in running your program.
Your grade will be based on how well your programs function and whether you have included appropriate README and ANALYSIS files.