Tag Archives: Python

TANDEM Project Update 4.11.15

TANDEM Week 9 Presentation

TANDEM: A Brief Agenda

I. Review our project goals

Discuss new interested users (advertising, biodiversity cataloging)
Discuss output applications in “Mother Goose Counts”

II. Describe our development drive

Branches of Dev underway
- UI/UX dynamic pages
- Django framework
- TANDEM tool python script

III. Explain our development steps

Two parallel paths were followed building Python “backend” code to run the analytics on the users’ input files
The paths were merged and tested on a laptop
The Python environment was then built on the server
A command-line version of TANDEM will now run on the server using local server-based files.
@sreal19 will Demo TANDEM! (Fasten your seatbelts, folks!)

IV. Discuss next steps

What still needs doing: hooking up the front and back ends.
Getting polished examples of our output up along with clear links to available datavis resources.
Getting Kelly’s best practices documentation live.
Outreach (not just to beta testers, but to users who might not have considered these tools before; we are looking for education and journalism applications).
Now is also the time to start considering the life beyond Praxis:
Grants for continuing work?
How much labor/manpower/development would be needed to move beyond MVP?
What does 1.0 look like?

Thanks for following and stay tuned for updates!

@dhTANDEM #picturebookshare

TANDEM project update

The code merge was completed and tested on two local machines and uploaded to the server at Reclaimhosting.com. According to Tim Owens at Reclaim, the necessary Python packages were loaded on the server, but the code cannot find three of them, so, as of this date, the code has not been run (Note: running this code on the server is an interim step to verify that the core logic of the text analysis and image analysis works properly). However, the server was built out so that the demonstration Django application launches successfully. Unfortunately, once it launches, some of the pages cause errors as does any attempt to write to the database. Our subject matter expert has been contacted to help debug these errors.
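One quick way to narrow down a "package not found" problem like this is to ask the server's Python interpreter directly which modules it can see. The sketch below is a minimal, hypothetical check, not the team's actual script. It assumes Python 3, and because the three missing packages are not named above, any package names passed to it are placeholders.

```python
import importlib.util
import sys


def find_missing(module_names):
    """Return the top-level module names that this interpreter cannot locate."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not on the import path
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


def report(module_names):
    """Build a short text report, including which interpreter was used."""
    lines = ["Python executable: " + sys.executable]
    for name in find_missing(module_names):
        lines.append("MISSING: " + name)
    return "\n".join(lines)
```

Running `print(report([...]))` with the real package names, using the same interpreter that the server uses, would show whether the packages were installed for a different Python than the one running the code.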

On a separate development path, multiple members of the team are working on building the Django components we need to turn the analytics engine into an interactive web application. Steve is working on linking the core program to a template or view. Chris, Kelly, and Jojo are working on designing and building the templates in a Django framework. Current UI/UX concerns involve potential upload sizes combined with processing time, button prompts that launch the analysis, and ways to convey best practice documentation so that it’s clear and concise and facilitates proactive troubleshooting. The next part of this process will be to address the presentation of the final page, where the user is prompted to download their file. This page has great potential to be underwhelming, but there are some simple features we can apply to jazz it up, such as data visualization examples and external links to next-step options.

On the outreach front, Jojo went to a Django hacknight on Wednesday to get a handle on people building Django apps. She made contact with several new advocates and garnered further support from Django Girls participants and web developers Nicole Dominguez and Jeri Rosenblum, as well as hacknight organizer Geoff Sechter. The new contacts include Michel Biezunski, who has used Django to upload and redistribute files for his app InstantPhotoAlbum, so he could help when we work on figuring out potential options for placing and giving back data.

Last but not least, Chris attended a meetup at DaniPad NYC Tech Coworking space in Queens, NY this past week. There, he met a handful of Python developers who had insight into working with Django-based web apps. Commercial uses for TANDEM-like tools were brainstormed, and people responded with interest in testing a prototype. Along with academic beta-testers, some of these people will be included in the contact list when TANDEM is deployed.

DigitalHUAC group update

DigitalHUAC project update

Search Form Update

After finalizing the taxonomy with our historian experts, we created a public project on DocumentCloud, where we uploaded the five sample testimonies. For each testimony, we input key value pairs based on our taxonomy.
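To make the key-value idea concrete, here is a minimal sketch of how such metadata could be represented and searched in Python. The field names (`witness`, `year`) and all of the values are made-up placeholders, not our actual taxonomy or testimony data.

```python
def matches(metadata, criteria):
    """True if every key in criteria appears in metadata with an equal value (ignoring case)."""
    for key, wanted in criteria.items():
        actual = metadata.get(key)
        if actual is None or actual.lower() != wanted.lower():
            return False
    return True


def search(documents, criteria):
    """Return the titles of documents whose key-value data satisfies all criteria."""
    return [doc["title"] for doc in documents if matches(doc["data"], criteria)]


# Placeholder records: each document carries a dict of key-value pairs.
documents = [
    {"title": "Testimony A", "data": {"witness": "Person One", "year": "1947"}},
    {"title": "Testimony B", "data": {"witness": "Person Two", "year": "1951"}},
    {"title": "Testimony C", "data": {"witness": "Person Three", "year": "1951"}},
]
```

A search form with one input per key maps naturally onto the `criteria` dictionary: each filled-in field becomes one key-value pair, and empty fields are simply left out.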

We are still working on the script that will talk to the DocumentCloud API. In the meantime, we started working on a search form with HTML only. After making some very basic search forms, we came across a form builder for Bootstrap which allowed us to add more search options very easily. The form builder also provided the HTML, which we pasted into our test website.

Below is a screenshot:

API Script (Form Action) Update

Working with DocumentCloud, we found a) an app that allows users to work with DocumentCloud-documents through a (Django-powered) CMS (built by The Bay Citizen):

https://www.baycitizen.org/blogs/sandbox/djangodocumentcloud-integration-theres/

https://github.com/BayCitizen/django-doccloud

And b) a Python wrapper built for the DocumentCloud API:

https://github.com/datadesk/python-documentcloud

We looked at other documentation that explains how to post HTML form values to a Python script (e.g., http://stackoverflow.com/questions/15965646/posting-html-form-values-to-python-script).

We are currently working with the Python API wrapper, however, which required downloading a more recent version of Python, with Pip installed, and then installing the python-documentcloud library:

[Screenshot: Screen Shot 2015-03-22 at 9.40.18 PM]

Though the initial attempt(s) return the following:

We are continuing with the following Python-documentcloud tutorial:

http://python-documentcloud.readthedocs.org/en/latest/index.html#

https://media.readthedocs.org/pdf/python-documentcloud/latest/python-documentcloud.pdf

Our aim is to extract text from the HUAC PDFs uploaded to DocumentCloud and return the excerpted text to the user:

http://python-documentcloud.readthedocs.org/en/latest/documents.html
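Once a document's full text is in hand as a Python string, pulling out an excerpt around a search term needs only the standard library. The sketch below assumes the text has already been retrieved (the DocumentCloud call itself is not shown), and the function name, the `width` parameter, and the sample sentence in the usage note are illustrative only.

```python
import re


def excerpts(text, term, width=40):
    """Return one snippet per occurrence of term, with up to `width` characters of context on each side."""
    if not term:
        return []
    # re.escape treats the term as literal text; IGNORECASE makes the match case-insensitive
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(text):
        left = max(0, match.start() - width)
        right = min(len(text), match.end() + width)
        snippets.append(text[left:right].strip())
    return snippets
```

For example, `excerpts("The committee will come to order.", "committee", width=10)` returns a single snippet containing the word and a few characters on either side of it.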

Meanwhile, we are also experimenting with getting input from a browser via:

-Web forms in Django:

https://docs.djangoproject.com/en/1.7/topics/forms/

-And by using GET/POST methods inside a Python class index:

http://learnpythonthehardway.org/book/ex51.html
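Whichever framework ends up handling the request, a form submitted with GET arrives as a URL-encoded query string. As a framework-free sketch of that step (assuming Python 3), the standard library can decode it into a dictionary. The field names `witness`, `year`, and `notes` and the example.com URL are hypothetical, not the inputs in our actual search form.

```python
from urllib.parse import parse_qs, urlparse


def form_values(url):
    """Decode the query string of a GET request URL into a {field: value} dict.

    Only the first value is kept for a repeated field; blank fields are dropped.
    """
    query = urlparse(url).query
    parsed = parse_qs(query)  # maps each field to a list of values; blank values are omitted by default
    return {field: values[0] for field, values in parsed.items()}


# Hypothetical URL, as a browser would build it from a submitted search form.
values = form_values("http://example.com/search?witness=Person+One&year=1951&notes=")
```

Here `values` holds the two filled-in fields, with the `+` in the witness name decoded back to a space and the empty `notes` field left out.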

Joy Report – Data Tech [E]mmersion

It’s good to know your strengths.

I’m never going to be a data dude. Thanks to Stephen Real, who turned me on to Lynda.com (forwarded from Matt), I watched several tutorials trying to recreate what Micki shared during her workshop on Thursday, Oct. 30th.

But, let me back up a moment. Since acknowledging that I’m probably never going to be a data-dude, it occurs to me that my particular strength is as a communicator. To that end, let me share the last two weeks’ adventures in tech. I have been to EVERY available workshop except the ones on Thursday evenings when I have a previously scheduled class.

This has amounted to six in-person workshops at GC, one FB page, one WordPress site, three online tutorials and an impulsive registration for a Feminist technology course at Barnard (thank you Kelly for referring the info).

Here is what the last month of data-tech-[E]mmersion has looked like:

Tuesday, September 30 – Digital Fellows’ Social Media & Academia: Creating Digital Research Communities Workshop (Andrew G. McKinney & Laura Kane), Library GC
Friday, October 1 – I wrote a “Twitter” review for the workshop and shared it with my classmates in the DH Praxis 2014 blog site on the Commons.
I also tweaked the Mother Studies webpage on the Commons blog post-workshop
Friday, October 24 – Fellows consultation with Patrick Smyth who showed me Ngram, “Python for kids” workbook, and some other cool things like “Internet Time Machine”, and “Distance Machine.”
Saturday, October 26 – Blogged about my experience with Patrick, and Ngramed two of my other classes at GC, American Studies and Sociology of Gender, to compare words and texts from a gender perspective.

Monday, October 27 – Wiki Art + Feminism workshop GC – we learned some Wiki code and also found out that only 5% of Wiki contributors are women.
Tuesday, October 28 – WordPress Advanced level users, Library GC. This workshop really helped me see some of the advanced options available to edit my site on the commons. These workshops are also frustrating, though, because often we aren’t actually able to try things in the class and it’s tough to remember everything once you get back to your desk. Workshops should have an additional help session, or a follow-up lab (or an online resource attached to them).
Wednesday, October 29 – Data Mapping for social media, Library GC
Came home that night and built a FB page and blog site called “OurHealthStories.” Thought this might serve as a repository for the big data project and these notes from class. Too much for the DH Blog (Don’t wanna be a “Blog Hog”). I’ve combed through a lot of data sets at this point, and many of them are health related. My own health issues, the state of health care in America today, and recent stories like the one about the creator of the game “Operation” who can’t afford an operation really touched me, and made me want to take action.
Below is a list of the data sites I’ve investigated thus far. I was envisioning a project comparing midwife activity to OBGYN deliveries in America because there is a section of my thesis that would benefit from this. Wrote my advisor.

B-
I have to run a data project for my DH class.
Have you ever, or do you have data on this:
Compare midwife assisted birth to physician assisted birth in US, and data map it.
I want to see the measurable comparisons of how midwives practice relative to doctors. Please let me know if you have anything, also because I want to use it in my thesis paper.

She wrote back and advised against it:
“There is a TON of data on this, and it’s kinda complicated. How are you defining midwife? Nurse-Midwife in-hospital? all midwives? all locations, birth centers, hospitals and homes? How are you controlling for maternal status? Just go take a quick look at the literature and you’ll see. I would not encourage you to include this in the thesis — not in this kind of oversimplistic ‘docs’ vs ‘midwives’ way — as I say, WAY too complicated for that.”

I wrote a friend of mine who is a public health nurse at Hunter.
She wrote back:
“Here are some sources. Is it by state or national data you need to map? Do you know Google Scholar search?
Here’s a link to a report published in 2012 re Midwifery Births
Here’s a link to an article comparing births MD versus CNM
National Vital Statistics Report ***** best resource for raw data
CMS Hospital Compare
https://data.medicare.gov/data/hospital-compare
National Center for Health Statistics Vital Data
http://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm
NYC Dept. of Health Data & Statistics http://www.nyc.gov/html/doh/html/data/data.shtml”

Thursday, October 30 – Data Visualization, (Micki Kaufman), Library GC; Impressive project and great demo. Again, I wish we could have actually tried to do some of the things Micki demoed.
Friday, October 31 – Can’t attend the Fellows open hours this week or next week. Wrote Micki to see if she could meet with me at any point during next week for specific questions/answers? Began to export and clean a data set from last year’s academic MOM Conference, thinking it would be interesting to map the geographic locations attendees hailed from.
Saturday, November 1 – Began the day online taking tutorials. Stephen Real and I met before class on Thursday and he suggested a few things after we discussed how we could create a collaborative project. Today I’m watching Lynda.com videos, but I can’t try a lot of what they demo in the tutorials that pick up where Micki left off on Excel documents, because I work on a Mac and don’t have a left/right click mouse. Going to try PDF conversion and scraping now.
Thursday, Nov. 6 – Stephen Real and I met up. He and I “played” with some data cleaning stuff. He told me about his “Great Expectations” project. Sounds cool. Spoke with Chris Vitale, who generously shared some of his tech finds (which people have already been writing about here). Stayed late to talk research ideas with Stephen Brier.
Friday, Nov. 7 – Technology and Pedagogy Certificate Program at the Library. We talked WordPress, plug-ins, and server technology.
Weekend, Nov. 8 – Did some research on potential final projects. Explored DH in a Box. I have three ideas. Can’t decide which one to go with. Thinking about creating a SurveyMonkey survey to ask classmates which idea they like best.

I signed up for “Technologies of Feminism” at Barnard. Starts November 18 and runs for 5 weeks. Here’s what it’s about. Feminism has always been interested in science and technology. Twitter feminists, transgender hormone therapy, and women in STEM are only more recent developments in the long entangled history of tech, science, and gender. And because feminism teaches that technology embodies societal values and that scientific knowledge is culturally situated, it is one of the best intellectual tools for disentangling that history. In this five-week course, we will revisit foundational texts in feminist science studies and contextualize current feminist issues. Hashtag activism and cyberfeminism, feminist coding language and feminized labor, and the eugenic past of reproductive medicine will be among our topics. Readings will include work by Donna Haraway, Maria Fernandez, Lisa Nakamura, Beatriz Preciado and more. Participants of all genders are welcome. No prior knowledge in feminist theory is required.

During the fall 2014 semester, courses similar to this one are taking place across North America in a feminist learning experiment called the Distributed Open Collaborative Course, organized by the international Feminist Technology Network (FemTechNet). As a node in this network, our class will open opportunities for collaboration in online feminist knowledge building—through organizing, content creation, Wikipedia editing, and other means. Together, we will discuss how these technologies might extend the knowledge created in our classroom to audiences and spaces beyond it.

Still haven’t pulled together a comprehensive plan amidst the massive choices available for the data project yet.

WHEW!

I’m en-JOY-ing the journey, but I’m not sure if I can pinpoint a location or product YET. Onward I suppose.

Some other good Python resources

Just wanted to add to the list of good resources for (teaching yourself) Python:

Learn Python the Hard Way and the Codecademy Python tutorial, both mentioned in the previous blog post, are excellent (and free). Both are interactive, which I’ve found to be *hugely* helpful in writing code that actually runs (instead of spending ages rewriting the thing before realizing the problem is something mundane, like a missing colon). LPTHW is also rote (i.e., you type out the prompt code verbatim), which, unlike pretty much any other example of good pedagogy, I’m finding to be a pretty good way to learn this stuff.
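For anyone who hasn’t hit it yet, this is what the missing-colon problem looks like. The snippet is a small illustration only (it is not taken from either tutorial, and the helper name `compiles` is made up): it hands Python the same loop twice, once with the colon and once without, and checks whether each version even parses.

```python
def compiles(source):
    """Return True if Python can parse the source text, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


good = "for word in ['a', 'b']:\n    print(word)\n"
bad = "for word in ['a', 'b']\n    print(word)\n"  # same loop, colon left off

print(compiles(good), compiles(bad))
```

The second version is rejected before a single line runs, which is why an interactive tutorial that checks your code as you go catches this kind of thing so quickly.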

Another very good resource (which is also the book assigned in my computational linguistics class) is John Zelle’s Python Programming: An Introduction to Computer Science, which can be accessed for free at http://www.maths.nuigalway.ie/~gettrick/teach/cs102/pythonbook.pdf as well as Zelle’s companion website, which has slides and downloadable code for all the program examples in the book: http://mcsp.wartburg.edu/zelle/python/.

There’s also learnpython.org, which also has an interactive function (in addition to countless examples and explanations), and an MIT OpenCourseWare class, A Gentle Introduction to Programming Using Python: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-189-a-gentle-introduction-to-programming-using-python-january-iap-2011/, which I haven’t yet looked at but which appealingly promises to be a “gentle, yet intense, introduction” to programming.

Digital Praxis Seminar Fall 2014 – Spring 2015