During today's lab, we'll be creating word clouds. A word cloud (or tag cloud) represents the most popular words in a body of work and shows which of these words are more popular than others by drawing each word at a size proportional to its importance (e.g., its number of occurrences). Perhaps the most famous word cloud is the one that collects all important Presidential Speeches and puts them in a scrollable interface so that you can compare important issues from today's president all the way back to the first one. This tool has been referenced in a number of scholarly papers because of the ease with which it supports comparisons. You can also create your own word clouds online at TagCrowd --- this is the site we used on the first day of class to get to know the class.
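To make the "size proportional to number of occurrences" idea concrete, here is a minimal Python sketch. It is not the lab's solution: the function name `word_sizes`, its parameters, and the choice to split words on whitespace are all assumptions made for illustration.

```python
from collections import Counter


def word_sizes(text, top_n=5, max_size=40):
    """Return {word: font size} for the top_n most frequent words.

    Hypothetical helper for illustration only. The most frequent word
    gets max_size; every other word is scaled in proportion to its count.
    """
    # Lowercase so "The" and "the" count as the same word.
    counts = Counter(text.lower().split())
    top = counts.most_common(top_n)
    if not top:
        return {}
    highest = top[0][1]  # count of the most frequent word
    return {word: round(max_size * count / highest) for word, count in top}


sizes = word_sizes("the cat and the dog and the bird", top_n=2)
print(sizes)  # {'the': 40, 'and': 27}
```

Here "the" appears three times and "and" twice, so "and" is drawn at two thirds of the largest size.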
To modify the textual representation of a word, we'll be creating our output in HTML. The actual code to generate HTML is accessible via the module HTMLWriter. The HTML is written to a separate file, which you can view in a browser (or in Eclipse!).
To write formatted HTML you could call the following three functions (note: this is already done for you in the code):
start --- writes the beginning HTML code that is the start of a web page.
format_sized_word --- writes a word/string in a specified size to the file.
finish --- writes the ending HTML code that finishes a web page.
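To show how three functions like these fit together, here is a rough Python stand-in. It is not the actual HTMLWriter module: the parameter lists, the use of a `<span>` with a pixel font size, and the helper `write_cloud` are all assumptions, so check the provided module for the real signatures.

```python
import html
import io


def start(out, title):
    """Write the opening HTML of a page (assumed signature)."""
    out.write("<html>\n<head><title>%s</title></head>\n<body>\n" % html.escape(title))


def format_sized_word(out, word, size):
    """Write one word at the given font size in pixels (assumed signature)."""
    out.write('<span style="font-size: %dpx;">%s</span>\n' % (size, html.escape(word)))


def finish(out):
    """Write the closing HTML of a page (assumed signature)."""
    out.write("</body>\n</html>\n")


def write_cloud(out, sizes, title="Word Cloud"):
    """Hypothetical helper: start the page, write every word, finish the page."""
    start(out, title)
    for word, size in sizes.items():
        format_sized_word(out, word, size)
    finish(out)


# Write to an in-memory buffer here; a real run would pass an open file instead.
buffer = io.StringIO()
write_cloud(buffer, {"lab": 32, "cloud": 18})
print(buffer.getvalue())
```

The order matters: the opening HTML has to be written once before any words, and the closing HTML once after the last word, or the browser will not get a complete page.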