Tag Archives: Project Update

Final presentations

We were happy to have the chance to see everyone’s presentations last week. Over the course of the semester we’ve all been working so intently on our own projects that it sometimes felt like we didn’t have enough opportunity to fully appreciate what everyone else was working on.

Not surprisingly, all of the projects are awesome!

This has been an amazing group of classmates, and we would like to thank you all for your support and input over the last few months. We’re excited to see where things go from here–not just for the projects, but for everyone individually as well.

Looking forward to Tuesday. You’re all rockstars!


Gossip Girl Team Digital HUAC

DHUAC Update

This week, we’re polishing up our website and writing content for our presentation & paper. We are thinking about how to best showcase the work we have done and how to demonstrate the need for and usefulness of a full scale project. So we’re working on plans for next steps. In a way, thinking about the future helps us reflect on where we are and how we got here, which is useful as part of the praxis element of this project.
As part of this process, we’ve been asking around about HUAC testimonies, trying to figure out how many are out there and where they all might be. It turns out that no one actually seems to know. The Wilson Center and the John Jay library told us to contact the NYPL. The NYPL has some material, but couldn’t give us more than links to its holdings. The LoC said to contact NARA. NARA said, essentially, that it is impossible to know: in addition to whatever is published and out in the world, they also hold many boxes of closed executive hearings that are only barely indexed. What we do know is that there is no one place where all the testimonies live–certainly not online, and not searchable. It’s sort of baffling that this is the case–one’s mind goes immediately to all of the other incomplete, sub-optimal, or dark archives of important material that must be out there–though it certainly reinforces the value of our project.
Speaking of search, we’ve got a fancy new search interface. You can check it out and let us know what you think of it.  o_O

TANDEM Project Update 4.26.15


This has been a week of accelerated achievement on all fronts for TANDEM. Thanks to Steve, we have a working MVP hosted at www.dhtandem.com/tandem. We have also made huge strides on the front end with Kelly’s robust initial set of HTML/CSS pages for the site. While the two ends are not tied together just yet, they are within sight as of this weekend. Jojo continues to surprise the group with her intuitive mix of outreach and awesome, having sent out personalized invitations to key members of our contact list and to people who have shown interest over the past few months. Keep reading for more detailed information about these and other developments.


MVP functionality added this week includes:

  • Ability to upload multiple files
  • Ability to persist data via a SQLite database containing project data and pointers to file locations
  • Backend analytic code connected to the front end
  • Ability to zip and download results
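The persistence item above boils down to storing pointers to uploaded files rather than the files themselves. A minimal sketch of that idea (the table and column names are hypothetical, not TANDEM’s actual schema):

```python
import sqlite3

def init_db(path=":memory:"):
    """Create a table that stores project data and pointers to file locations."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS uploads (
               id INTEGER PRIMARY KEY,
               project TEXT NOT NULL,
               file_path TEXT NOT NULL
           )"""
    )
    return conn

def record_upload(conn, project, file_path):
    """Persist a pointer to an uploaded file, not the file contents."""
    conn.execute(
        "INSERT INTO uploads (project, file_path) VALUES (?, ?)",
        (project, file_path),
    )
    conn.commit()

def files_for_project(conn, project):
    """Return the stored file pointers for one project, in upload order."""
    rows = conn.execute(
        "SELECT file_path FROM uploads WHERE project = ? ORDER BY id",
        (project,),
    )
    return [r[0] for r in rows]
```

Keeping only paths in the database keeps it small, and the zip-and-download step can then walk `files_for_project` to collect results.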

Remaining tasks are:

  • Implement polished UI
  • Implement error handling
  • Handle session management so that simultaneous users keep their data separate
  • Look for opportunities to gain efficiency
  • Correct a small bug in the OpenCV output
  • Review security and backup/file-storage approaches, and rework as needed to achieve best practices
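On the session-management item above, one common approach is to issue each visitor an opaque session id and scope every file path under it, so simultaneous users never touch each other’s data. A framework-free sketch (not TANDEM’s actual code; in a web app the id would live in a session cookie):

```python
import os
import tempfile
import uuid

def new_session_id():
    """Issue an opaque, unguessable id for one user's session."""
    return uuid.uuid4().hex

def session_dir(root, session_id):
    """Give each session its own working directory so uploads never collide."""
    path = os.path.join(root, session_id)
    os.makedirs(path, exist_ok=True)
    return path
```

Because every upload and result file lands inside its own session directory, cleanup is also simple: delete the directory when the session expires.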


Continuing to garner community support, Jojo attended a GC Digital Initiatives event Tuesday as well as the English department’s Friday Forum. Additionally, initial invites for the launch went out to the digital fellows and DH Praxis friends and family via Paperless Post. Digital Fellow Ex Officio Micki Kaufman has already replied that she wouldn’t miss it. I’m now working to organize outreach with the other teams.

The press release is coming along on the class wiki, too!!


With functionality ironed out, we continue to work with the dataset we have generated via TANDEM for the Mother Goose corpus. As part of our release, we will include work that we have done in both analysis and data visualization for the initial test corpus. If you have questions or points of interest in Mother Goose feel free to comment them below! We are interested in hearing the kinds of questions one might ask of a text/image corpus.


As we close in on the final weeks, we’ve come to realize that we may not be able to write a script that will do all that we want, search-wise. Fortunately, working with DocumentCloud as our database has allowed us to utilize their robust functionality, and we have used their tools to provide basic search and browse functions on our site. With these in place, we’re focused on polishing our front end, pitch, and documentation. We are also considering adding one more layer of fun…

We would like to position this project as part useful tool for historians, part template for a replicable front end to DocumentCloud, and part participatory digital scholarship.

On the participatory front, we’re considering building a crowdsourcing platform to help assign the needed metadata to the individual testimonies.

One of the early and lasting story lines behind the project has been making publicly accessible a collection of materials with a shadowy past and a curious relationship to public/private spaces, agendas, politics, and notions of guilt. At this stage, scholars would appreciate having the transcripts collected and rendered (simply) searchable; the scattered nature of the testimonies themselves is a major roadblock to HUAC studies that we’re trying to level out. But beyond that, incorporating crowdsourcing would resonate with the true spirit of the Digital HUAC project, which in a sense is the anti-HUAC project, by relying on contributions from the public. To include a wide array of contributors in documenting and publicizing material whose origins lie in silencing or coercing folks seems powerful.

We’d love to hear input from you, our classmates, on this potential new addition to the project.

CUNYcast weekly update

CUNYcast has hit the ground running with great work all around.

Outreach & Management:

This week outreach set up an amazing tabling event! We received over 30 signatures from people who are interested in casting! This Tuesday evening we will be hosting a workshop for interested casters!


To get our front-end calendar and show-info widgets working, we had to do two things:

Define sourceDomain: The installation guide for the widgets does a poor job of explaining this, but the sourceDomain is the site from which you are pulling information. The tutorials we were using said the sourceDomain should be your public site address, but it actually isn’t. Our public site address is cunycast.net (the site we are sending information to); the information filling the widgets comes from airtime.cunycast.net (the proper sourceDomain).

Remove the iframe: Airtime recommends putting its widgets in an iframe, which stands for inline frame. An inline frame allows you to embed another HTML document in a page, so you can define and manipulate rules within a specific section of your page. That said, iframes are finicky and hard to configure: not only do you have to figure out the proper dimensions for the frame on the page, but you also need to work between two .html files to get everything running properly. So we took the markup from the iframe and embedded it directly in the <head> and <body> of our page.


The website has been fleshed out; we now have five pages:




The FAQ list is rounding out a lot of uses for the site, and the Process page is going to be a great space for us to manage our longer-form tutorials and our project development. This week we will formally document our user testing and make the necessary changes to our site. We will perform five user-testing experiments: we will open the website for a subject and ask them to respond, then ask them to use the tutorial page specifically and respond again. This should be a great way to finalize the language about our project.

YOU CANNOT SEE THE CHANGES LIVE YET but… you will be able to see them Monday!


Stay tuned and I will update this post when the pages go live!


This week, team Digital HUAC worked on refining our project narrative. This work dovetails with both outreach and site content: we’ll use narrative material to pitch potential users and partners and to beef up the site itself. Juliana developed a thorough “pitch kit” with relevant topics and questions, and in response we filled out sections such as “Challenges with the Current State of HUAC Records” and “Our Solution.” We feel that such an approach effectively communicates vital information to all parties. It also helps us think through issues concretely. Nothing forces you to articulate your project’s means and aims better than thinking about how strangers will interact with it all.

We also demoed a new MVP as a fallback plan. Given that we are gravitating towards fully leveraging DocumentCloud’s search interface, we experimented with embedding the DC viewer and search mechanism in our site. This is less than ideal: for one thing, it only rendered string-search results that didn’t make use of the robust, standardized metadata we took the time to tag each transcript with. But it was helpful to think about recasting our MVP just in case, and we welcomed the chance to get under the hood of DocumentCloud in more detail.

Digital HUAC Update

A short update today, as we continue to push forward on getting our search functional. We’re stalled out on a few specific questions that are, hopefully, the final barriers in putting it all together. We’ve reached out to the digital fellows and a few other people we hope can help us on these questions–

-What is the best way to connect to a REST API? Our code is currently configured using curl. Is that the best approach?

-What is the best way to structure our search results in JSON: a list (with search results indexed by position) or an associative array of key-value pairs? We have created key-value metatags for our documents in DocumentCloud, but the resulting JSON search results only display the built-in metadata fields (e.g., title: “”, id: “”) and not our created metadata tags. Is that an issue on the DocumentCloud side or on the coding side?
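For what it’s worth, the two JSON shapes we’re weighing can be sketched side by side. The payload below is made up, in the style of the search results we’ve seen (only built-in fields; the `hearing_year` tag is a hypothetical example of our own metadata):

```python
import json

# A made-up search response containing only built-in fields,
# as we currently see from DocumentCloud.
payload = json.loads("""{
  "documents": [
    {"id": "101-smith", "title": "Testimony of John Smith"},
    {"id": "102-jones", "title": "Testimony of Mary Jones"}
  ]
}""")

# Option 1: a list, with results addressed by position.
as_list = [doc["title"] for doc in payload["documents"]]

# Option 2: an associative array keyed by document id, which makes it
# straightforward to merge in our own key-value metatags later.
as_map = {doc["id"]: doc for doc in payload["documents"]}
as_map["101-smith"]["hearing_year"] = "1952"  # hypothetical custom tag
```

The list is simplest for rendering a results page in order; the associative array is friendlier if we end up joining the API results against metadata we store ourselves.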

We’ve added a bunch more testimonies to our DocumentCloud group, and have started entering the metadata for them. The writing and outreach processes continue to move forward, along with some of the smaller aspects of UX and development.

Digital HUAC update

This week we are working on some large items:

Our number one goal this week has been to get our search functionality up and running. Daria has been a coding machine, working on this non-stop, and we’re nearly there. Some of the things Daria has been grappling with are connecting to the DocumentCloud API using a REST API call function and figuring out the best taxonomy to be read by both PHP and JSON. The existing tutorials and scripts either explain using PHP to connect to a MySQL database or use Python to connect to the DocumentCloud API; however, Google Developers has a tutorial on using PHP to connect to the Google Books and Google News APIs, which has proven a useful tool in working out the PHP-to-DocumentCloud situation. For a peek behind the scenes, check out some of Daria’s code here.

Juliana and Chris have been hitting Twitter hard, and our followers have doubled in the last week. Juliana created an NYC DH account and is exploring it as a place for potential groups and people who might be interested in our project. We continue to amass a list of historians and institutions that will be interested in Digital HUAC. All of this outreach is working toward our short-term goal of getting our project name out there, and also our long-term goal of finding an institution to partner with (which is one more step on the road to Digital HUAC world-domination).

Juliana and Chris have also begun to write up our overarching narrative (the theme: NO APOLOGIES!) as a way to create a story to pitch, but also to look toward the future beyond class. What direction do we want the project to go in, and how is the narrative helpful in this regard? Along these same lines, we’re simultaneously writing content for our site, since many of our current pages are just placeholders. We’re slowly but steadily working toward a functional, robust site.

We started with 5 testimonies, because that seemed like a manageable number when we had a lot of technological unknowns. Now that we’ve gotten over some of our biggest technology hurdles, we’re able to increase the size of our corpus with relative ease. I am adding new testimonies to our DocumentCloud group daily, and the associated metadata will be added in the coming week as well. We don’t have an updated target number of testimonies, but would like to get as many in as possible. This process of adding testimonies will continue throughout the rest of the semester. The added testimonies will make search testing significantly more interesting and showcase more of the project’s full potential.

We’ve also been at work on some smaller items:

–Getting our contact form to send an email to us.
–Getting the browse functionality going, at least in a very beta way. For now, this will just be an alphabetical list of names. Each name, when clicked, will lead to a results page of all the documents that person is named in.
–We agreed upon a Creative Commons license and have added that to our site in place of the ©.
–We have a new week-by-week action plan that details what needs to get done to get us to a fully-functional MVP by May 19.
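The browse feature in the list above amounts to an index from names to the documents naming them. A minimal sketch of building that index (names and document ids here are invented examples, not our actual corpus):

```python
from collections import defaultdict

def build_browse_index(documents):
    """Map each person named in a testimony to the documents naming them,
    then return the entries in alphabetical order."""
    index = defaultdict(list)
    for doc_id, names in documents:
        for name in names:
            index[name].append(doc_id)
    return sorted(index.items())

# Example input: (document id, people named in that document).
docs = [
    ("doc-1", ["Brecht, Bertolt", "Disney, Walt"]),
    ("doc-2", ["Disney, Walt"]),
]
browse = build_browse_index(docs)
```

Rendering the alphabetical name list is then a loop over `browse`, with each name linking to its list of document ids.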

TANDEM project update

The code merge was completed, tested on two local machines, and uploaded to the server at Reclaimhosting.com. According to Tim Owens at Reclaim, the necessary Python packages were loaded on the server, but the code cannot find three of them, so, as of this date, the code has not been run. (Note: running this code on the server is an interim step to verify that the core logic of the text analysis and image analysis works properly.) However, the server was built out so that the demonstration Django application launches successfully. Unfortunately, once it launches, some of the pages cause errors, as does any attempt to write to the database. Our subject matter expert has been contacted to help debug these errors.

On a separate development path, multiple members of the team are working on building the Django components we need to turn the analytics engine into an interactive web application. Steve is working on linking the core program to a template or view. Chris, Kelly, and Jojo are working on designing and building the templates in a Django framework. Current UI/UX concerns involve potential upload sizes combined with processing time, button prompts that launch the analysis, and ways to convey best-practice documentation so that it’s clear, concise, and facilitates proactive troubleshooting. The next part of this process will be to address the presentation of the final page, where the user is prompted to download their file. This page has great potential to be underwhelming, but there are some simple features we can apply to jazz it up, such as data visualization examples and external links to next-step options.

On the outreach front, Jojo went to a Django hacknight Wednesday to get a handle on people building Django apps. She made contact with several new advocates, in addition to garnering further support from Django Girls participants and web developers Nicole Dominguez and Jeri Rosenblum, as well as hacknight organizer Geoff Sechter. The new contacts include Michel Biezunski, who has used Django to upload and redistribute files for his app InstantPhotoAlbum and could help when we work out potential options for storing and giving back data.

Last but not least, Chris attended a meetup at the DaniPad NYC Tech Coworking space in Queens this past week. There, he met a handful of Python developers who had insight into working with Django-based web apps. They brainstormed commercial uses for TANDEM-like tools, and people responded with interest in testing a prototype. Along with academic beta-testers, some of these contacts will be included in the outreach list when TANDEM is deployed.