Digital HUAC: Project Update

This week, our team found the answer to our biggest development hurdle: DocumentCloud. Before this discovery, we were trying to figure out how to create a relational database that would store the metadata tags for our corpus and respond to user input in our website’s search form.

It turns out that DocumentCloud, with an OpenCalais backend, can generate semantic metadata from uploaded documents and extract the entities within the text. The ability to recognize entities (places, people, organizations) is particularly helpful for our project, since these are potential search categories. We can also create customized search categories in DocumentCloud by defining key-value pairs. On Tuesday, we uploaded our five HUAC testimonies and started creating key-value pairs based on our taxonomy. (Earlier this week, we finalized the taxonomy after receiving feedback from Professor Schrecker at Yeshiva University and Professor Cuordileone at CUNY City Tech.) To create these key-value pairs, we had to read through each transcript and pull out our answers, like this:

| Field | Notes & Examples | Rand | Brecht | Disney | Reagan | Seeger |
|---|---|---|---|---|---|---|
| Hearing Date | year-mo-day, 2015-03-10 | 1947-10-20 | 1947-10-30 | 1947-10-24 | 1947-10-23 | 1955-08-18 |
| Congressional Session | number | 80th | 80th | 80th | 80th | 84th |
| Subject of Hearing | | Hollywood | Hollywood | Hollywood | Hollywood | Hollywood |
| Hearing location | City, 2 letter state | Washington, DC | Washington, DC | Washington, DC | Washington, DC | New York, NY |
| Witness Name | Last Name, First Middle | Rand, Ayn | Brecht, Bertolt | Disney, Walt | Reagan, Ronald W. | Seeger, Pete |
| Witness Occupation | or profession | Author | Playwright | Producer | Actor | Musician |
| Witness Organizational Affiliation | | | | Walt Disney Studios | Screen Actors Guild | People’s Songs |
| Type of Witness | Friendly or Unfriendly | Friendly | Unfriendly | Friendly | Friendly | Unfriendly |
| Result of appearance | contempt charge, blacklist, conviction | | Blacklist | | | Contempt charge, but successfully appealed; Blacklist |
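
As a rough illustration, one transcript's key-value pairs could be written down as a simple mapping before being entered into DocumentCloud. The sketch below uses the Seeger column; the key names (`hearing_date`, `witness_type`, and so on) and the `validate` helper are hypothetical, chosen only to mirror the taxonomy fields, and are not names DocumentCloud requires.

```python
from datetime import datetime

# Hypothetical key names mirroring the taxonomy; values are from the Seeger column.
seeger = {
    "hearing_date": "1955-08-18",
    "congressional_session": "84th",
    "subject_of_hearing": "Hollywood",
    "hearing_location": "New York, NY",
    "witness_name": "Seeger, Pete",
    "witness_occupation": "Musician",
    "witness_affiliation": "People’s Songs",
    "witness_type": "Unfriendly",
    "result_of_appearance": "Contempt charge, but successfully appealed; Blacklist",
}

ALLOWED_WITNESS_TYPES = {"Friendly", "Unfriendly"}


def validate(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Hearing dates follow the year-mo-day convention from the taxonomy.
    try:
        datetime.strptime(record.get("hearing_date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("hearing_date must be year-mo-day, e.g. 1947-10-20")
    # Type of Witness is restricted to the two values in the taxonomy.
    if record.get("witness_type") not in ALLOWED_WITNESS_TYPES:
        problems.append("witness_type must be Friendly or Unfriendly")
    return problems
```

Checking records like this before upload is one way to keep values consistent across transcripts, which matters because a search on a key only matches values that are spelled the same way.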

With DocumentCloud thrown back into the mix, we had to take a step back and start again with site schematics. We discussed each step of how the user would move through the site, down to the click, and how the backend would work to fulfill the user input in the search form. (Thanks, Amanda!) In terms of development, we will need to create a script (Python or PHP) that will allow the user’s input in the search box to “talk” to the DocumentCloud API and pull the appropriate data.
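
A minimal sketch of what the Python version of that script might start from is below. It only turns search-form input into a query URL and does not send the request. The endpoint URL, the `key:"value"` query syntax, and the `per_page` parameter are assumptions to check against the DocumentCloud API documentation, and the field names are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the search endpoint; confirm against the DocumentCloud API docs.
SEARCH_ENDPOINT = "https://www.documentcloud.org/api/search.json"


def build_query(form_input):
    """Turn non-empty form fields into a space-separated key:"value" query."""
    terms = []
    for key, value in form_input.items():
        # Drop surrounding whitespace and any quotes that would break the term.
        value = value.strip().replace('"', "")
        if value:
            terms.append('{}:"{}"'.format(key, value))
    return " ".join(terms)


def build_search_url(form_input, per_page=10):
    """Build the full search URL for the given form input."""
    params = {"q": build_query(form_input), "per_page": per_page}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Example: a user searching for unfriendly witnesses, with one field left blank.
url = build_search_url({"witness_type": "Unfriendly", "witness_name": ""})
```

The site's backend would then fetch this URL and format the returned documents for the results page; that step is left out here because it depends on the response format.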


Amanda mentioned DocumentCloud to us a while ago, but our group thought it was more of a repository than a tool, so our plan was to investigate it later, after we figured out how to build a database. After we had hounded the Digital Fellows for a couple of weeks about how to create a relational database, they finally told us, “You need to look at DocumentCloud.” Moral of the story: Question what you think you know.

On the design front, we started working in Bootstrap and have been experimenting with GitHub. We were able to publish a test site through GitHub Pages, but we still need to work out how to upload the rest of our site directory. This is our latest design of the site: