Tag Archives: Django

TANDEM Project Update 4.11.15

TANDEM Week 9 Presentation

TANDEM: A Brief Agenda

I. Review our project goals

  • Discuss new interested users (advertising, biodiversity cataloging)
  • Discuss output applications in “Mother Goose Counts”

II. Describe our development drive

  • Branches of Dev underway
    • UI/UX dynamic pages
    • Django framework
    • TANDEM tool python script

III. Explain our development steps

  • Two parallel paths were followed in building the Python “backend” code that runs the analytics on the users’ input files
  • The paths were merged and tested on a laptop
  • The Python environment was then built on the server
  • A command-line version of TANDEM will now run on the server using local server-based files.
  • @sreal19 will Demo TANDEM! (Fasten your seatbelts, folks!)

IV. Discuss next steps

  • What still needs doing: hooking up the front and back ends.
  • Getting polished examples of our output up along with clear links to available datavis resources.
  • Getting Kelly’s best practices documentation live.
  • Outreach (not just to beta testers, but to users who might not have considered these tools before; we are looking for education and journalism applications)
  • Now is also the time to start considering life beyond Praxis:
    • Grants for continuing work?
    • How much labor/manpower/development would be needed to move beyond MVP?
    • What does 1.0 look like?

Thanks for following and stay tuned for updates!

@dhTANDEM #picturebookshare

tufte retweet

 

 

TANDEM project update

The code merge was completed, tested on two local machines, and uploaded to the server at Reclaimhosting.com. According to Tim Owens at Reclaim, the necessary Python packages were installed on the server, but the code cannot find three of them, so as of this date the code has not been run there. (Note: running this code on the server is an interim step to verify that the core logic of the text analysis and image analysis works properly.) However, the server was built out so that the demonstration Django application launches successfully. Unfortunately, once it launches, some of the pages cause errors, as does any attempt to write to the database. Our subject matter expert has been contacted to help debug these errors.
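When installed packages cannot be found, a common cause is that the code runs under a different interpreter or search path than the one the packages were installed into. The following is a minimal diagnostic sketch, not the script we ran, and it assumes Python 3.4 or later. The package names in the list are placeholders, since this post does not name the three packages that failed.

```python
import importlib.util
import sys


def find_missing(package_names):
    """Return the top-level module names this interpreter cannot locate."""
    missing = []
    for name in package_names:
        # find_spec returns None when a top-level module is not on sys.path
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Placeholder names: substitute the packages the program actually imports.
    required = ["json", "csv", "package_that_is_not_installed"]
    print("Interpreter:", sys.executable)
    print("Search path:")
    for entry in sys.path:
        print("   ", entry)
    print("Missing:", find_missing(required))
```

Comparing the interpreter and search path printed on the server with the location where the packages were installed is usually the quickest way to see whether the two line up.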

On a separate development path, multiple members of the team are building the Django components we need to turn the analytics engine into an interactive web application. Steve is working on linking the core program to a template or view. Chris, Kelly, and Jojo are designing and building the templates in the Django framework. Current UI/UX concerns involve potential upload sizes combined with processing time, button prompts that launch the analysis, and ways to present best-practice documentation so that it is clear and concise and facilitates proactive troubleshooting. The next part of this process will be to address the presentation of the final page, where the user is prompted to download their file. This page has great potential to be underwhelming, but there are some simple features we can apply to jazz it up, such as data visualization examples and external links to next-step options.
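One way to address the upload-size concern is to check each file before any analysis starts, so the user gets an immediate, readable message instead of a long wait. The sketch below is only an illustration in plain Python: the size limit, the list of accepted extensions, and the function name are all hypothetical, and none of them has been decided by the team.

```python
import os

# Hypothetical limits; neither value has been settled for TANDEM.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".pdf"}


def check_upload(filename, size_in_bytes):
    """Return a list of problems; an empty list means the file can be queued."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("Unsupported file type: %s" % (extension or "(none)"))
    if size_in_bytes > MAX_UPLOAD_BYTES:
        limit_mb = MAX_UPLOAD_BYTES // (1024 * 1024)
        problems.append("File is larger than %d MB" % limit_mb)
    return problems
```

Returning a list of messages rather than raising an error would let a template show every problem at once, which fits the goal of proactive troubleshooting.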

On the outreach front, Jojo went to a Django hacknight on Wednesday to get a handle on the people building Django apps. She made contact with several new advocates and garnered further support from web developers and Django Girls participants Nicole Dominguez and Jeri Rosenblum, as well as hacknight organizer Geoff Sechter. The new contacts include Michel Biezunski, who has used Django to upload and redistribute files for his app InstantPhotoAlbum, so he could help when we work out options for storing and returning data.

Last but not least, Chris attended a meetup at the DaniPad NYC Tech Coworking space in Queens, NY this past week. There, he met a handful of Python developers who had insight into working with Django-based web apps. The group brainstormed commercial uses for TANDEM-like tools, and people responded with interest in testing a prototype. Along with academic beta testers, some of these people will be included in the contact list when TANDEM is deployed.

DigitalHUAC group update

DigitalHUAC project update

Search Form Update

After finalizing the taxonomy with our historian experts, we created a public project on DocumentCloud, where we uploaded the five sample testimonies. For each testimony, we entered key-value pairs based on our taxonomy.
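To illustrate how key-value pairs can drive a search, here is a small sketch in plain Python that filters records by the pairs a user asks for. It does not call DocumentCloud, and the keys, values, and titles are placeholders rather than our actual taxonomy or testimonies.

```python
def matches(document_data, criteria):
    """True when every requested key is present with the requested value."""
    return all(
        key in document_data
        and str(document_data[key]).lower() == str(value).lower()
        for key, value in criteria.items()
    )


def search(documents, criteria):
    """Return the titles of documents whose key-value pairs meet the criteria."""
    return [doc["title"] for doc in documents if matches(doc["data"], criteria)]


# Placeholder records; keys and values are illustrative only.
SAMPLE_DOCUMENTS = [
    {"title": "Testimony A", "data": {"year": "1951", "category": "a"}},
    {"title": "Testimony B", "data": {"year": "1952", "category": "b"}},
]

print(search(SAMPLE_DOCUMENTS, {"year": "1951"}))
```

With the placeholder records above, searching for the year 1951 returns only "Testimony A". Adding more pairs to the criteria narrows the result, which is the behavior a multi-field search form would need.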

We are still working on the script that will talk to the DocumentCloud API. In the meantime, we started working on a search form using only HTML. After making some very basic search forms, we came across a form builder for Bootstrap, which allowed us to add more search options very easily. The form builder also provided the HTML, which we pasted into our test website.

Below is a screenshot:

1

API Script (Form Action) Update

Working with DocumentCloud, we found a) an app that allows users to work with DocumentCloud documents through a Django-powered CMS (built by The Bay Citizen):

https://www.baycitizen.org/blogs/sandbox/djangodocumentcloud-integration-theres/

https://github.com/BayCitizen/django-doccloud

Screen Shot 2015-03-22 at 10.08.15 PM

And b) a Python wrapper built for the DocumentCloud API:

https://github.com/datadesk/python-documentcloud

We looked at other documentation that explains how to post HTML form values to a Python script (e.g., http://stackoverflow.com/questions/15965646/posting-html-form-values-to-python-script)

But we are currently working with the Python API wrapper, which required downloading a more recent version of Python with pip installed, and then installing the python-documentcloud library:

Screen Shot 2015-03-22 at 9.40.18 PM

Though the initial attempt(s) return the following:

Screen Shot 2015-03-22 at 9.55.58 PM

We are continuing with the following Python-documentcloud tutorial:

http://python-documentcloud.readthedocs.org/en/latest/index.html#

https://media.readthedocs.org/pdf/python-documentcloud/latest/python-documentcloud.pdf

The goal is to extract text from the HUAC PDFs uploaded to DocumentCloud and return the excerpted text to the user:

http://python-documentcloud.readthedocs.org/en/latest/documents.html
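A minimal sketch of the excerpting step, in plain Python: given a document's text and a search term, it returns each match with some surrounding context. The text is assumed to have been fetched already (the documentation linked above covers how the wrapper exposes a document's text); the function name and the default context width are our own illustrative choices.

```python
import re


def excerpts(full_text, term, context=60):
    """Return each occurrence of term with `context` characters on either side."""
    if not term:
        return []
    snippets = []
    # re.escape treats the term literally; IGNORECASE makes the search case-blind
    for match in re.finditer(re.escape(term), full_text, flags=re.IGNORECASE):
        left = max(0, match.start() - context)
        right = min(len(full_text), match.end() + context)
        # Collapse line breaks and runs of spaces so each snippet fits on one line
        snippets.append(" ".join(full_text[left:right].split()))
    return snippets
```

Cutting on a fixed number of characters can split words at the edges of a snippet; trimming to word or sentence boundaries would be a natural refinement.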

We are meanwhile also experimenting with getting input from a browser via:

-Web forms in Django:

https://docs.djangoproject.com/en/1.7/topics/forms/

-And by using GET/POST methods inside a Python class index:

http://learnpythonthehardway.org/book/ex51.html
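Whichever route we take, the submitted form reaches the server as URL-encoded text that has to be decoded into field names and values. Here is a small sketch using only the standard library, assuming Python 3; the field names in the example body are hypothetical and do not come from our search form.

```python
from urllib.parse import parse_qs


def read_form(body):
    """Decode a URL-encoded form body into a dict of single values."""
    # Blank fields are dropped so that empty search boxes are ignored
    parsed = parse_qs(body, keep_blank_values=False)
    # parse_qs returns a list per field; keep the first value of each
    return {name: values[0] for name, values in parsed.items()}


# Hypothetical form submission with one field left blank
fields = read_form("name=Example+Witness&year=1951&keyword=")
print(fields)
```

A framework such as Django does this decoding for us, but seeing it spelled out makes it clearer what the form action hands to the search script.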

 

 

Tandem Team Report Week 7

PROJECT:

With our corpus defined and development goals set, the team is taking a two-pronged approach to reaching the final project. While Chris and Steve focus on continuing to develop and code the working project, Kelly and Jojo have turned their attention to the work to be done with the corpus. Just as important as building TANDEM is the ability to show a proof of concept and illustrate the value of the output TANDEM generates. Duties will still overlap, since design work may arise for Kelly, there is outreach to be done by Jojo, and there are theoretical questions for Chris and Steve to weigh in on, but our focus is now much more pointed on the particular pieces needed for a functioning and valuable tool and methodology.

 

DEVELOPMENT:

A key milestone was reached this week when the text processing backend coding was completed. It will need to be thoroughly tested, which the team expects to complete by 3/31. The program must also be merged with the image processing code; this integration step is targeted for completion on 3/24. The current version of the program can be found on GitHub. The repo contains a number of test data files as well as documentation. The core program is TandemText.py.

The team decided to abandon the Flask web framework in favor of Django, primarily because there is much more local support (from the Digital Fellows) for Django. We were able to switch because we did not have a significant code base built in Flask and much of the work done on Flask may transfer well to Django. Optimistically, the team should be able to get a pilot “Hello World” application running under Django on the Reclaimhosting.com server (with help from Zach Davis and Tim Owens).

Finally, on the development front, the team needs to envision and plan for how we will persist data on the website. Will persistence even be seen as a valuable feature by the users? If so, how will we store and secure the data? How will we handle requests to amend or edit an existing result set? These decisions are pending and will likely be addressed at the 3/24 class.

 

 

OUTREACH:

This week TANDEM has maintained its Twitter activity. Jojo is also reaching out to new communities while developing useful skills: she has taken on work at the Tow Center for Digital Journalism and is exploring possible applications of TANDEM there, and she was accepted to Django Girls next weekend and has been assigned her team. She looks forward to meeting a number of people across disciplines and fields.