Digital HUAC- Project Update

This week, our team found the answer to our biggest development hurdle- DocumentCloud. Prior to this discovery, we were trying to figure out how to create a relational database, which would store meta tags of our corpus, that would respond to user input in our website’s search form.

It turns out that DocumentCloud, with an Open Calais backend, is able to create semantic metadata from document uploads and can pull the entities within the text. The ability to recognize entities (places, people, organizations) is particularly helpful for our project since these would be potential search categories. We are also able to create customized search categories through DocumentCloud by creating key value pairs. On Tuesday, we uploaded our 5 HUAC testimonies and started to create key value pairs, which are based on our taxonomy. (Earlier this week, we finalized our taxonomy after receiving feedback on our taxonomy from Professor Schrecker at Yeshiva University and Professor Cuordileone at CUNY City Tech.) In order to create these key value pairs, we had to read through each transcript and pull our answers, like this:

Field	Notes & Examples	Rand	Brecht	Disney	Reagan	Seeger
Hearing Date	year-mo-day, 2015-03-10	1947-10-20	1947-10-30	1947-10-24	1947-10-23	1955-08-18
Congressional Session number		80th	80th	80th	80th	84th
Subject of Hearing		Hollywood	Hollywood	Hollywood	Hollywood	Hollywood
Hearing location	City, 2 letter state	Washington, DC	Washington, DC	Washington, DC	Washington, DC	New York, NY
Witness Name	Last Name, First Middle	Rand, Ayn	Brecht, Bertolt	Disney, Walt	Reagan, Ronald W.	Seeger, Pete
Witness Occupation	or profession	Author	Playwright	Producer	Actor	Musician
Witness Organizational Affiliation				Walt Disney Studios	Screen Actors Guild	People’s Songs
Type of Witness	Friendly or Unfriendly	Friendly	Unfriendly	Friendly	Friendly	Unfriendly
Result of appearance	contempt charge, blacklist, conviction		Blacklist			Contempt charge, but successfully appealed; Blacklist

With DocumentCloud thrown back into the mix, we had to take a step back and start again with site schematics. We discussed each step of how the user would move through the site, down to the click, and how the backend would work to fulfill the user input in the search form. (Thanks, Amanda!) In terms of development, we will need to create a script (Python or PHP) that will allow the user’s input in the search box to “talk” to the DocumentCloud API and pull the appropriate data.

Amanda mentioned DocumentCloud to us a while ago, but our group thought it was more of a repository than a tool, so our plan was to investigate it later, after we figured out how to build a database. After hounding the Digital Fellows for the past couple of weeks on how to create a relational database, they finally told us, “You need to look at DocumentCloud.” Moral of the story: Question what you think you know.

On the design front, we started working in Bootstrap and have been experimenting with Github. We were able to push a test site through Github pages, but we still need to work on how to upload the rest of our site directory. This is our latest design of the site:

Digital Praxis Seminar Fall 2014 – Spring 2015

Need help with the Commons?