

Publishing Case:

Chris is a data analyst for the advertising department of XYZ Publishing. He has the banner ads from this year’s holiday campaign and wants to analyze what generated the highest click-through rates for the company. Chris has previously downloaded and installed TANDEM on his desktop. He drags and drops his folder of ads onto the TANDEM interface. A progress bar appears, and a .csv file is generated in the backend to store the output. The completion page gives Chris a downloadable CSV and directs him to brief guides on how the data could be used or visualized. Chris goes the basic route and opens Excel to explore his data. He compares the data to the click-through rates in the ad server and notices a relationship between brightness, saturation, the number of words on the advertisement, and how many users clicked the ad. The brightest ads with 10 words or fewer had the highest click-through rates. Chris is able to make a data-driven argument to the design team for brighter ads with minimal text in future campaigns.
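To make the kind of output Chris opens in Excel a little more concrete, here is a minimal Python sketch of how per-ad brightness, saturation, and word count might end up in a CSV. This is not TANDEM's actual code: the function names `color_stats` and `write_rows`, the column names, and the assumption that each ad arrives as a list of (r, g, b) pixels plus already-extracted text are all hypothetical.

```python
import colorsys
import csv
import io


def color_stats(pixels):
    """Return (mean brightness, mean saturation) for (r, g, b) pixels in 0-255.

    Brightness and saturation are taken here as the V and S channels of HSV,
    which is one common choice and only an assumption for this sketch.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total_v = 0.0
    total_s = 0.0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total_s += s
        total_v += v
    n = len(pixels)
    return total_v / n, total_s / n


def write_rows(rows, out):
    """Write one CSV line per ad; rows are (filename, pixels, text) tuples."""
    writer = csv.writer(out)
    writer.writerow(["filename", "brightness", "saturation", "word_count"])
    for name, pixels, text in rows:
        brightness, saturation = color_stats(pixels)
        writer.writerow(
            [name, round(brightness, 3), round(saturation, 3), len(text.split())]
        )


if __name__ == "__main__":
    # Hypothetical one-pixel "ad" with three words of text.
    buffer = io.StringIO()
    write_rows([("ad1.png", [(255, 0, 0)], "Buy now today")], buffer)
    print(buffer.getvalue())
```

A table like this could then be joined to the ad server's click-through figures on the filename column, which is the comparison Chris makes by hand in Excel.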

Scholar Case:

Professor Plum is studying how advertising strategies have been affected by a significant historical event such as World War I. He has collected a corpus of print advertising materials spanning multiple product categories from both before and after the event. Plum wants to know what has changed and has developed theories about a number of features, which raise the following questions:

  • Has the proportion of text to image changed? How?
  • Has the word usage changed? How?
  • Has the iconography changed? How?
  • How has the visual style changed? Are different colors being used? Are the images more contrasty?
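The "contrasty" question is one that can be put into numbers. As a rough illustration only, the Python sketch below computes RMS contrast, meaning the standard deviation of pixel luminance, which is one of several possible contrast measures. The function names `luminance` and `rms_contrast` are hypothetical, and nothing here is taken from TANDEM itself. It assumes pixels are already available as (r, g, b) tuples.

```python
import statistics


def luminance(pixel):
    """Approximate luminance in 0-1 from an (r, g, b) pixel in 0-255.

    Uses the Rec. 601 luma weights; other weightings are equally defensible.
    """
    r, g, b = pixel
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255


def rms_contrast(pixels):
    """Return the population standard deviation of luminance over all pixels."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    return statistics.pstdev([luminance(p) for p in pixels])


if __name__ == "__main__":
    flat = [(120, 120, 120)] * 4            # uniform gray: no contrast
    stark = [(0, 0, 0), (255, 255, 255)]    # half black, half white
    print(rms_contrast(flat), rms_contrast(stark))
```

Averaging a score like this over the pre-event and post-event halves of the corpus would give Plum one number per period to compare.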

Using a tool outside of TANDEM, Professor Plum scans the materials into a digital format such as JPG, TIFF, PDF or GIF. After the image files have been built, he downloads a copy of TANDEM from the Internet and installs it on his desktop computer. Plum launches TANDEM and starts the analysis process by inputting the name of the folder that contains the electronic documents being studied. TANDEM outputs OCR, NLTK and FeatureExtractor data into a database, which can be saved.

Professor Plum can now use TANDEM (or some other visualization tool) to produce visualizations or tables on the parameters that are of particular interest to the scholar. Based on the results of these visualizations, Plum may make some adjustments to the settings in TANDEM to produce a more useful result. He may choose to export the results database to another application for further work or study.

Educator Case:

An early childhood educator, Yasya Berezovskiy, wants to study the effects of children’s literature on neurological development, exploring factors such as narrative, image representations, and lexiles (or word complexity/reading level) together. To date, Berezovskiy has worked with empirical evidence and collected fieldwork data.

Berezovskiy will be analyzing a number of children’s books that vary by factors such as author collection, time published, and theme.

Using TANDEM, Berezovskiy can upload page images or entire works to process the work’s text in comparison to the visual information. Once complete, Berezovskiy can visualize the processed files in split screen, with the original image beside the visualized data. From there, Berezovskiy can choose to isolate individual elements to analyze, such as opacity, density, text to image ratio, text to color ratio, shape to text ratio, and more. Alternatively, Berezovskiy can download the raw processed data to analyze using a separate visualization program.

The processed data will be complementary to other observational research being done by Berezovskiy’s colleagues. Without TANDEM, the evidence from the children’s books would have been only descriptive. Further, without TANDEM it would have taken Berezovskiy multiple programs and more effort.

Fairy Tale Nerd Case:

The user, a woman interested in creating a data visualization for a pop lit site (let’s say her name is Ella), wants to look at Victorian illustrated fairy tale collections. Ella wants to analyze captions for art plates in all available published works. She wants a computer to process all available picture books to give her more information on the content of a work based on its visual properties as well as its textual content. She wants the computer to pull all the words included in the illustrations, as well as the ratio of those words to what is written in the story (Are they direct quotes? Are they distinct?). She goes to the TANDEM interface. There, she sees a simple description of what files the application will yield. It’s so understandable! All the fields are so well explained! She clicks the upload button, finds the files on her computer, uploads the picture book scans, and runs the application. Once the TANDEM program has run, another window appears offering a number of file types. Each file type has a scroll-over description of its applications and recommended datavis links. Once she has selected one, she can download the data file (CSV or …. …..).

Ella takes it to her favorite datavis site and goes wild with joy at the new capabilities and bases for comparison. All her dreams have been answered. Thanks, TANDEM!
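Ella's questions about captions (are they direct quotes, or distinct from the story?) could be approached along these lines. This is only a sketch in Python under stated assumptions: it assumes the caption and story text have already been extracted, and the helper names `tokenize`, `caption_overlap`, and `is_direct_quote` are made up for illustration rather than taken from TANDEM.

```python
import re


def tokenize(text):
    """Lowercase the text and return its words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def caption_overlap(caption, story):
    """Return the share of caption words that also appear in the story."""
    caption_words = tokenize(caption)
    if not caption_words:
        return 0.0
    story_words = set(tokenize(story))
    shared = sum(1 for word in caption_words if word in story_words)
    return shared / len(caption_words)


def is_direct_quote(caption, story):
    """Return True if the caption's words occur in the story as one run."""
    caption_words = tokenize(caption)
    if not caption_words:
        return False
    # Pad with spaces so that partial words (e.g. "he" in "she") never match.
    needle = " " + " ".join(caption_words) + " "
    haystack = " " + " ".join(tokenize(story)) + " "
    return needle in haystack


if __name__ == "__main__":
    story = "The wolf knocked at the door, and the girl opened it."
    print(caption_overlap("The wolf at the window", story))
    print(is_direct_quote("the girl opened it", story))
```

A caption with high overlap that is not a direct quote would be a paraphrase of the story, while low overlap suggests the illustrator's caption is distinct from the text.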


Suggested Resource – Full Stack Python

In our roaming around the internet, we discovered Full Stack Python written by Matt Makai of Twilio. Take a look at the TOC below for more specific info. Matt does a thorough job of documenting interesting and helpful resources and breaks down more complicated processes into smaller tasks.

He is also very responsive on Twitter @mattmakai.

TIL Dropbox and BitTorrent both employ Python in their workflows.

Table of Contents

Every topic below with a link currently has a page on Full Stack Python. If there isn’t a link I’m working on getting a page for that topic up.

Tokyo Destruction Diary Pre-Pitch

The basic idea is to create an interactive map of Tokyo, charting instances of rapid destruction (1923 earthquake, WWII), social upheaval (protests of the 1960s), and random acts of violence (1995 sarin gas attack, the 2008 Akihabara massacre), along with the city’s own growth and changes during the post-war years. Then I would juxtapose this historical data with trends in media related to the destruction of Tokyo, to see how media becomes a barometer for fears generated from past trauma or changes.

Not all change and destruction in Tokyo is the result of horrific disasters or war, though. Tokyo is a city that almost perpetually has buildings being torn down and new ones being built up. According to a Freakonomics podcast, half of all homes in Japan are demolished after only 38 years. Death and rebirth become cyclical parts of daily life that shape Tokyo, literally and figuratively.

So why focus on Japan and Tokyo? From the 1980s (arguably earlier) to today, we have seen Japanese pop culture become more and more present in the American cultural landscape. It informs how we perceive Japan’s history and culture (though sometimes these perceptions may be skewed), and once-obscure portions of Japanese arts and media have now become common knowledge thanks to fan communities, bloggers, publishers, and other people bridging the gap between our culture and Japan’s. Through this exchange, we’ve seen the Japanese death/rebirth cycle take form in movies, TV shows, books, video games, and more. Mothra snaps Tokyo Tower in twain, only for it to be in one piece again the next time Godzilla emerges from the murky depths.

This project would act as a way to chart Japan’s history and its changes in media, and it would ultimately take the form of a website for people interested in media, history, and Japanese culture.


Journal #1 Julia

I am excited and anxious for this process.

It was really nice to see all the familiar faces from DH1. The project I put forward for DH1 is something I will be pursuing on an individual level. I think using DH in a performative space to help me construct my artwork will push my work in ways that I have yet to know and understand. But this work is dependent on travel and is not appropriate for a group project; it is too self-serving. I will have no idea what information I have until after I take my trip and sift through my collections.

I am not one to get anxious in class. I love everything about school: workshopping, talking, and getting through ideas. But I will admit that I am anxious to know how this DH class project will move forward. My favorite teachers, teachers I base my own teaching ideals on, have always had a skills component and a theoretical component to class. I appreciate projects that have something to show at the end. I appreciate final projects that have a bit of showmanship and theater. Even if a project does not function in exactly the way it set out to, I like projects that have finished edges.

I like to have a plan, an idea of the knowns and unknowns, and I am anxious to know what I will be working on (besides my own side projects).

My goals for this class are to expand my technical DH skillset. I hope to learn enough about this workshopping teaching process to find ways to implement this kind of project-based learning in my library profession. Digital Humanities librarian is a job that has hit the librarian listserv. I have an idea about how I would brand my skillset for a position like this, and there are some skills that I want to shore up before I claim this job title. This class will be a chance to showcase examples of my DH deliverables.

@yougenee skill set posting

I would like to describe my basic characteristics and strong points, along with a couple of weak points…
- social science studies background
- organized/neat
- analytic: good at summarizing long paragraphs
- good at mediating opposing ideas
- liberal & open-minded/flexible
- punctual, with good time management
- arranging time
- negotiation, communication
- I think I am a trainable person. I easily get used to a novel environment.
I have experience in designing Google Sites for my undergraduate coursework.

I am thinking of taking the project manager role.
I want to avoid the designer role because I lack knowledge of coding and computer languages.
Outreach collaborator… maybe I will think about it.

To be honest, I am still not familiar with coding. I am afraid of making huge mistakes if I have to serve in a role related to coding.

I look forward to hearing interesting ideas for next class.


@jojokarlin’s Memory Trip pre-pitch

I am massively intimidated by the awesome pitches people are composing so concisely! I feel like Little Red in Into the Woods— scared, well ExcITED AND scared. I offer a rather hasty outline of what my pitch might be on Tuesday…

Memory is tricky stuff. In these digital times, it is a tradable commodity. How many gigs is your phone?

I want to create a memory map of my grandmother’s memory (loosely based on the map of a road trip) and in the process model a platform that others could use to assemble their own memory map with elderly relations who are not particularly digitally inclined. (My grandmother buys disposable wind-up cameras.)

1. Memory Map: I am interested in modeling, in a map of sorts, my soon-to-be-97-year-old grandmother’s remarkable (largely pre-digital) memory. The Dodge ad from the Super Bowl somewhat made my argument for tying my grandmother’s memories to a road trip. She’s been driving a long time and her life almost spans the history of the automobile industry in America. Not only is the road trip a tradition I have with her, but time in the car also tends to be fairly meditative. The metaphor is useful (roads more and less traveled in life take us down paths we maybe remember), and the project becomes more memory tourism than memorial monument. (I don’t want to build a museum or a family archive; it’s not about ossifying the “true” facts of my grandmother’s life. I want the map to be an interactive spatialization of the way memories from all her years live in her today.)

2. Platform for others to use: I have been thinking it should be done in Neatline with some fancy plugins. I would love to make something that doesn’t require elaborate tools for data collection (I’ve done initial interviews and video with my iPhone). Ideally, once built, the memory map could be available to people wanting a way to digitally document the way older generations go about remembering.


I offer a photo of my grandmother at the Getty Museum — I bit their social media bait and had her pose and tweeted it. Naturally @theGettymuseum responded:

Screen Shot 2015-02-08 at 7.05.35 PM

I would like to help my grandmother continue to win the internet.




Lab Journal #1 – James Mason

The first day of class was much more exciting than I thought it would be. Sort of like The Hunger Games, but with more desks…. I’ve never actually seen/read The Hunger Games. Did they have desks? Well then, I guess it was more like the Triwizard Tournament; they TOTALLY had desks:

“Each competing school is allowed one Champion to represent them during the Tournament. Students wishing to participate write their names and the school they attend on a piece of parchment, and enter it into the Goblet of Fire. The Goblet is an impartial judge, and selects what it considers to be the best student from each school. At the appointed time, the Goblet ejects the names, making each selected student the official Champion for their school. Each selected Champion is then bound by a magical contract to see the Tournament through to the end.” –Thanks, Harry Potter Wiki.

Sure, there are some differences, but let’s roll with it. We write our name on the parchment and cast it into the flames… those flames being the will of our peers. This isn’t exactly how I thought the process would go; I figured we’d come in on day one and the projects would already be selected for us by the teachers… less dirty, but less fun as well.

As for my project, I am still on the fence about pitching it, and it being so close to the eleventh hour means that I might shelve it for the time being. A few others have expressed interest in seeing it come to fruition, and I think that it would also be a good “refuge” project for those who are afraid of serious coding and development. Not only that, but as many skillset posts have suggested, we have a group very strong in outreach. Those truly looking for a challenge in that regard might have found one…. if I were pitching it.

Let’s suppose for a moment I were to pitch it. Not only is the project itself a work of DH, but it would create new opportunities to develop additional DH projects. Every podcast, curated via CUNYCast or not, is an object of Digital Humanities; specifically, they are, as Matthew Kirschenbaum puts it, born-digital objects. While this seems a mundane classification, that they are born digital allows them to be manipulated in ways that true-material artifacts cannot be. That said, I notice the aim of many projects proposed seems to be taking these true-material artifacts and digitizing them, so that they might be as malleable as born-digital objects already are. In that regard, working with born-digital media skips several steps, and allows us to instead focus on different ways of making these artifacts even more useful and more malleable. I’m not the only one who thinks this, as I’ve already heard several great ideas from other students, such as Min’s desire for interactivity via podcasting and Julia’s idea of creating a way to automatically tack on intros and outros without editing the audio feed. Already two awesome ideas in the FIRST WEEK…

…you know, if I were pitching it.



Syllabi DHify (pre-pitch)

In preparation for Wednesday’s class, here’s my pre-pitch for Syllabi DHify:

Screen Shot 2015-02-08 at 12.20.05 PM

A syllabus should be a living document that evolves as the semester progresses. However, in practice a syllabus becomes quickly outdated — from the moment a single student scrawls marginalia onto a handout, indicating that something was incomplete, something had changed.

At the most basic level, Syllabi DHify will be a platform for both students and professors to quickly access and update course syllabi, removing the need for erroneous print-outs or Word documents shared via e-mail.

For teaching professionals, Syllabi DHify will go a step further by providing a space for active pedagogical collaboration. Users at this level will have the opportunity to share existing syllabi, collaborate with peers, and re-use shared content. Syllabi DHify will facilitate the incorporation of new pedagogical methods across disciplines. The platform itself will be an exercise in Digital Humanities methods and practices, drawing on the open sharing principles behind Massive Open Online Courses (MOOCs) and Open Access (OA). As such, it will provide teaching professionals not familiar with Digital Humanities a means to incorporate its technological, collaborative, and systematic practices into existing student course work.

Syllabi DHify aims to improve upon the way in which information is shared, allowing for a more fluid, collaborative learning experience.

If you’re interested in sharing the work of DH to the larger knowledge community — come join me.
If you’re ready to see higher education move forward — come join me.
We can take our methods from DH and share them. With anyone.