15 points
See the howto pages for details on starting this project and its code, and the details pages for more information. The pages here describe in broad strokes what this assignment is about.
Get the snarf file that has the data files you need for this assignment or get them here.
Collaborative filtering and content-based filtering are two kinds of recommender systems that provide users with information to help them find and choose anything from books, to movies, to restaurants, to courses based on their own preferences compared to the preferences of others.
In 2009 Netflix awarded one million dollars to a group that had developed a better recommender system than Netflix's in-house system. This NY Times Magazine article describes the competition, the winning teams, and how the movie Napoleon Dynamite caused problems for the algorithms and ranking/rating systems developed by contest participants.
In this assignment, you'll develop a program to test three different algorithms for recommending items based on the responses made by others. You'll be practicing reading data from files, using Python dictionaries and lists, and sorting data to find good matches.
The assignment comes in two conceptual parts:
We're providing three sources of data. Sometimes ratings are stored in a single file, sometimes in more than one file. You'll need to write a separate Python module to deal with each data source, then use what these modules return to develop ratings.
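To make the idea of one module per data source concrete, here is a minimal sketch of a reader. The file layout it parses (one person,item,rating triple per line) is purely hypothetical, as are the sample names; the real formats are described on the details pages and differ between data sources. The sketch also assumes that each person's ratings are kept as a list parallel to the item list, with 0 marking an item that person has not rated.

```python
import os
import tempfile


def processData(filename):
    """Read 'person,item,rating' lines (a hypothetical layout) and
    return an item list plus a dictionary of ratings."""
    triples = []
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            person, item, rating = line.split(",")
            triples.append((person.strip(), item.strip(), int(rating)))

    # itemlist: every item exactly once, in alphabetical order
    itemlist = sorted({item for _, item, _ in triples})

    # dictratings: person -> list of ratings parallel to itemlist,
    # starting at 0 (assumed to mean "not rated") for every item
    dictratings = {}
    for person, item, rating in triples:
        ratings = dictratings.setdefault(person, [0] * len(itemlist))
        ratings[itemlist.index(item)] = rating
    return itemlist, dictratings


# Made-up sample data, written to a temporary file for the demonstration.
sample = "Ann,Noodle Bar,5\nAnn,Taco Stand,-3\nBob,Taco Stand,1\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
itemlist, dictratings = processData(tmp.name)
os.remove(tmp.name)

print(itemlist)
print(dictratings)
```

Whatever the input layout, returning the same two structures from every module is what lets a single set of recommendation functions work on all three data sources.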
This first set of recommendations for a particular CS Professor comes from Netflix a while ago. As you can see, these recommendations are based on two movies seen and then all the data Netflix has on similar movies.
This next set of recommendations is for Prof. Rodger a while back, when she bought some small bags, and you can guess what types of things she buys for her kids.
The ratings in all the data files range from 5 to -5.
Rating  Meaning
5       Really liked it!
3       Liked it!
1       Okay, neither hot nor cold about it
0       Have not rated it
-1      Not bad, but nothing really to say about it
-3      Didn't like it
-5      Hated it!
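As a small illustration of the scale above, the sketch below converts rating text to an integer and rejects anything that is not one of the seven listed values. The helper name parse_rating is hypothetical and not part of the assignment, and the sketch assumes ratings appear in the data files as integer text.

```python
# The seven rating values listed in the scale above.
VALID_RATINGS = {5, 3, 1, 0, -1, -3, -5}


def parse_rating(text):
    """Convert rating text such as '-3' to an int.

    Hypothetical helper: raises ValueError for text that is not an
    integer or not one of the seven values on the scale.
    """
    value = int(text.strip())
    if value not in VALID_RATINGS:
        raise ValueError("unexpected rating: " + text)
    return value
```

A check like this can catch a misread line early, before a stray value ends up in the ratings dictionary.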
Expectations: For this assignment you are expected to have well-documented code and good style. This means you should have a comment for each main block of code to describe in words what the code is doing. You should have your name near the top of each file as a comment. See other expectations for style on the main assignment page.
PLEASE NOTE you will lose points for code that is not well commented!
ITEMS to Write:
Module ProcessAllFood.py: This module's function reads in the data file AllFoodRatings.txt, a small file with just 9 people who have rated a few restaurants. You will create and return a list of the restaurants, named itemlist, and a dictionary of information on the ratings for the restaurants, named dictratings.
  processData(filename) returns itemlist and dictratings.
Module ProcessAllBooks.py: This module's function reads in the data file AllBooksAuthors.txt, which has information about books and their authors, and the data file AllBooksRatings.txt, which has information about ratings of those books. This function puts this information into the same format as the restaurant information. In particular, this function returns a list of the books, named itemlist, and a dictionary of information on the ratings for the books, named dictratings.
  processData(booktitles, bookratings) returns itemlist and dictratings.
Module ProcessAllMovies.py: This module's function reads in the data file AllMoviesRatings.txt, which has information about ratings of movies. This function puts this information into the same format as the restaurant and book information. In particular, this function returns a list of the movies, named itemlist, and a dictionary of information on the ratings for the movies, named dictratings.
  processData(filename) returns itemlist and dictratings.
Module RecommenderForAll.py:
  average(itemlist, dictratings) returns averageList
  similarities(name, dictratings) returns similarList
  recommended(similarList, itemlist, dictratings, n) returns recommendList
Module RecommenderFood.py:
Module RecommenderBooks.py:
Module RecommenderMovies.py:
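The three RecommenderForAll.py functions are only named above, so the following is a rough sketch of one plausible reading, not the required implementation; the details pages are the authority. It assumes that dictratings maps each person to a list of ratings parallel to itemlist, that a 0 rating is skipped when averaging, that similarity is the dot product of two rating lists, and that recommended weights each of the n most similar people's ratings by their similarity score. The sample names and ratings are made up.

```python
def average(itemlist, dictratings):
    """Return (item, average rating) pairs, highest average first.

    Assumption: ratings of 0 ("have not rated it") are left out.
    """
    averageList = []
    for i, item in enumerate(itemlist):
        rated = [r[i] for r in dictratings.values() if r[i] != 0]
        avg = sum(rated) / len(rated) if rated else 0.0
        averageList.append((item, avg))
    # highest average first; ties broken alphabetically
    averageList.sort(key=lambda pair: (-pair[1], pair[0]))
    return averageList


def similarities(name, dictratings):
    """Return (other person, score) pairs, most similar first.

    Assumption: the score is the dot product of the two rating lists.
    """
    mine = dictratings[name]
    similarList = []
    for other, theirs in dictratings.items():
        if other != name:
            score = sum(a * b for a, b in zip(mine, theirs))
            similarList.append((other, score))
    similarList.sort(key=lambda pair: (-pair[1], pair[0]))
    return similarList


def recommended(similarList, itemlist, dictratings, n):
    """Return (item, score) pairs based on the n most similar people.

    Assumption: each rating is multiplied by that person's similarity
    score, and unrated (0) entries are skipped when averaging.
    """
    recommendList = []
    for i, item in enumerate(itemlist):
        weighted = [score * dictratings[person][i]
                    for person, score in similarList[:n]
                    if dictratings[person][i] != 0]
        value = sum(weighted) / len(weighted) if weighted else 0.0
        recommendList.append((item, value))
    recommendList.sort(key=lambda pair: (-pair[1], pair[0]))
    return recommendList


# Made-up example data in the assumed shape.
itemlist = ["Noodle Bar", "Salad Shop", "Taco Stand"]
dictratings = {
    "Ann": [5, 0, -3],
    "Bob": [3, 1, -5],
    "Cat": [-5, 3, 5],
}

averageList = average(itemlist, dictratings)
similarList = similarities("Ann", dictratings)
recommendList = recommended(similarList, itemlist, dictratings, 1)
print(averageList)
print(similarList)
print(recommendList)
```

Because these functions touch only itemlist and dictratings, the same code can serve the food, book, and movie data once each processData function returns that shape.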
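Be sure to submit the following SEVEN files: ProcessAllFood.py, ProcessAllBooks.py, ProcessAllMovies.py, RecommenderForAll.py, RecommenderFood.py, RecommenderBooks.py, and RecommenderMovies.py.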
Submit the items to the folder assign8-recommender using eclipse/ambient or websubmit.