Tag Archives: use case


Publishing Case:

Chris is a Data Analyst in the Advertising department of XYZ Publishing. He has the banner ads from this year’s holiday campaign and wants to analyze what generated the highest click-through rates for the company. Chris has previously downloaded and installed TANDEM on his desktop. He drags and drops his folder of ads onto the TANDEM interface. A progress bar appears, and a .csv file is generated on the backend to store the output. The completion page gives Chris a downloadable CSV and directs him to brief guides on how the data might be used or visualized. Chris goes the basic route and opens Excel to explore his data. He compares the data to the click-through rates in the ad server and notices a relationship between brightness and saturation, the number of words on an advertisement, and how many users clicked the ad: the brightest ads with 10 words or fewer had the highest click-through rates. Chris is able to make a data-driven argument to the design team for brighter ads with minimal text in future campaigns.
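Chris’s comparison could be sketched in a few lines of Python. This is a minimal illustration only: the column names and all of the numbers below are invented, not actual TANDEM output.

```python
import csv
import io

# Hypothetical sample of the kind of CSV TANDEM might emit for each ad,
# joined with click-through rates (ctr) pulled from the ad server.
SAMPLE = """ad,brightness,word_count,ctr
holiday_01,0.91,8,0.042
holiday_02,0.88,7,0.039
holiday_03,0.45,22,0.011
holiday_04,0.52,18,0.013
holiday_05,0.93,10,0.044
"""

def mean_ctr(rows):
    """Average click-through rate for a group of ads."""
    return sum(float(r["ctr"]) for r in rows) / len(rows)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Split the campaign into "bright, minimal text" ads vs. everything else.
bright_minimal = [r for r in rows
                  if float(r["brightness"]) >= 0.8 and int(r["word_count"]) <= 10]
other = [r for r in rows if r not in bright_minimal]

print(f"bright, <=10 words: {mean_ctr(bright_minimal):.4f}")
print(f"everything else:    {mean_ctr(other):.4f}")
```

The same grouping is a one-line pivot table in Excel; the point is only that TANDEM’s CSV puts brightness and word count side by side so the split is possible at all.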

Scholar Case:

Professor Plum is studying how advertising strategies have been affected by a significant historical event such as World War I. He has collected a corpus of print advertising materials spanning multiple product categories, dating from both before and after the event under study. Plum wants to know what has changed and has developed theories around a number of features, among them the following questions:

  • Has the proportion of text to image changed? How?
  • Has the word usage changed? How?
  • Has the iconography changed? How?
  • How has the visual style changed? Are different colors being used? Are the images higher in contrast?

Using a tool outside of TANDEM, Professor Plum scans the materials into a digital format such as JPG, TIFF, PDF, or GIF. After the image files have been built, he downloads a copy of TANDEM from the Internet and installs it on his desktop computer. Plum launches TANDEM and starts the analysis process by inputting the name of the folder that contains the electronic documents being studied. TANDEM outputs OCR, NLTK, and FeatureExtractor data into a database, which can be saved.

Professor Plum can now use TANDEM (or some other visualization tool) to produce visualizations or tables for the parameters of particular interest to him. Based on the results of these visualizations, Plum may adjust the settings in TANDEM to produce a more useful result. He may choose to export the results database to another application for further work or study.
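Plum’s word-usage question could be answered with a simple frequency comparison over the OCR output. A minimal sketch, assuming the pre- and post-war ad text has already been reduced to word lists (the words and counts here are invented):

```python
from collections import Counter

# Toy pre- and post-war ad copy; in practice these would be the OCR word
# lists TANDEM stores for each document.
pre_war  = "quality value craftsmanship quality tradition".split()
post_war = "modern speed efficiency modern value".split()

pre_freq, post_freq = Counter(pre_war), Counter(post_war)

# Change in each word's relative frequency across the event:
# positive means the word became more common after the war.
vocab = set(pre_freq) | set(post_freq)
shift = {w: post_freq[w] / len(post_war) - pre_freq[w] / len(pre_war)
         for w in vocab}

for word, delta in sorted(shift.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:15s} {delta:+.2f}")
```

The same table, computed over real corpora (and with NLTK doing the tokenizing), is what a "has the word usage changed?" visualization would be built on.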

Educator Case:

An early childhood educator, Yasya Berezovskiy, wants to study the effects of children’s literature on neurological development, exploring factors such as narrative, image representations, and lexiles (or word complexity/reading level) together. To date, Berezovskiy has worked with empirical evidence and collected fieldwork data.

Berezovskiy will be analyzing a number of children’s books that vary by factors such as author, publication date, and theme.

Using TANDEM, Berezovskiy can upload page images or entire works to process each work’s text alongside its visual information. Once processing is complete, Berezovskiy can view the processed files in split screen, with the original image beside the visualized data. From there, Berezovskiy can choose to isolate individual elements to analyze, such as opacity, density, text-to-image ratio, text-to-color ratio, shape-to-text ratio, and more. Alternately, Berezovskiy can download the raw processed data to analyze in a separate visualization program.

The processed data will complement other observational research being done by Berezovskiy’s colleagues. Without TANDEM, the evidence from the children’s books would have been only descriptive, and producing it would have taken Berezovskiy multiple programs and considerably more effort.

Fairy Tale Nerd Case:

The user, a woman interested in creating a data visualization for a pop-lit site like Toast.net (let’s say Ella), wants to look at Victorian illustrated fairy tale collections. Ella wants to analyze the captions for art plates in all available published works. She wants a computer to process all available picture books and give her more information on the content of a work based on its visual properties as well as its textual content: to pull all the words included in the illustrations, and to report how those words relate to what is written in the story (Are they direct quotes? Are they distinct?). She goes to the TANDEM interface. There, she sees a simple description of what files the application will yield. It’s so understandable! All the fields are so well explained! She clicks the upload button, finds the files on her computer, uploads the picture book scans, and runs the application. Once the TANDEM program has run, another window appears offering a number of file types. Each file type has a rollover description of its applications and recommended data-vis links. Once she has selected, she can download the data file (CSV or …).
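Ella’s “direct quote or distinct?” question reduces to measuring how much of a caption’s vocabulary also appears in the story text. A minimal sketch, with invented captions and a simplified whitespace tokenization:

```python
def caption_overlap(caption: str, story: str) -> float:
    """Fraction of the caption's words that also appear in the story text."""
    cap_words = set(caption.lower().split())
    story_words = set(story.lower().split())
    return len(cap_words & story_words) / len(cap_words)

story = "cinderella fled the ball as the clock struck twelve"
direct = "the clock struck twelve"        # verbatim quote from the story
distinct = "a midnight escape in silk"    # illustrator's own phrasing

print(caption_overlap(direct, story))
print(caption_overlap(distinct, story))
```

An overlap near 1.0 flags a caption as a likely direct quote; an overlap near 0 flags distinct wording. Run over every plate in a collection, those scores are exactly the kind of column Ella could feed to her favorite data-vis site.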

Ella takes it to her favorite datavis site and goes wild with joy at the new capabilities and bases for comparison. All her dreams have been answered. Thanks, TANDEM!


HUAC User Stories

User Story #1: A forensic computational linguist doing research on how interviewing style impacts witness responses. The value of the site to this user is being able to compare friendly vs. unfriendly witnesses (difficult to determine in general court transcripts) and the sheer number of court transcripts available (also difficult to collect from general court records). The user clicks the API link and follows the prompts to extract a cluster of readings from unfriendly witness testimony, then does a second export for a cluster from friendly witness testimony. The API exports the two corpora into an intermediary location (such as Zotero), which can be used with Python (NLTK) to compare, for example, the number of times interviewers repeated questions for friendly vs. unfriendly witnesses.
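The repeated-question comparison could be sketched as follows. The transcript lines and speaker tags below are invented; a real pass would run over the corpora exported through the site’s API, with NLTK handling tokenization and normalization.

```python
from collections import Counter

def repeated_questions(lines):
    """Count distinct interviewer questions asked more than once
    in a single transcript (lines are (speaker, text) pairs)."""
    questions = [text.strip().lower()
                 for speaker, text in lines
                 if speaker == "INTERVIEWER" and text.strip().endswith("?")]
    return sum(1 for q, n in Counter(questions).items() if n > 1)

# Invented toy transcripts for illustration only.
unfriendly = [
    ("INTERVIEWER", "Are you now a member of the Communist Party?"),
    ("WITNESS", "I decline to answer."),
    ("INTERVIEWER", "Are you now a member of the Communist Party?"),
    ("WITNESS", "I decline to answer."),
]
friendly = [
    ("INTERVIEWER", "Please state your occupation?"),
    ("WITNESS", "Screenwriter."),
]

print(repeated_questions(unfriendly))
print(repeated_questions(friendly))
```

Aggregating this count across each exported cluster gives the friendly vs. unfriendly comparison the linguist is after.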

User Story #2: A high school civics & US history teacher, Chris. He wants to assign his students to search the archive for primary source documents from the McCarthy era. Students will have a list of topics and names to choose from as their research areas. Chris tests the site to see if it will be useful to his students. He uses the simple search box to search for topics such as ‘treason’ and ‘democracy’, and the advanced search options to combine topics with names. Chris is looking for clean results pages, the option to save and export searches, and help with citation.

User Story #3: An American political scientist, Jennie, doing research on US government responses during periods of perceived national security threats. Specifically, she is interested in the Foreign Intelligence Surveillance Courts (FISC), which are closed courts. Jennie wants to read the now-released documents that record the conduct of closed courts, and she wants to do a mixture of qualitative and quantitative analysis. Qualitative: Jennie uses the advanced search to specify that she wants to look only at hearings categorized as ‘closed’, then does the same for ‘open’ hearings. Quantitative: Jennie uses the API and follows the prompts to extract data with the category ‘closed’ and a date filter, to do a statistical analysis of the number of closed hearings by year and whether it correlates with outside events.