Thanks for all the energy, collegiality, thoughtfulness, and good vibes yesterday. The sense of community in the class is palpable and admirable, and it will help your projects succeed.

I’ve been mulling over the projects since last night, and though I’m sure you’ll clarify much over the next week, I wanted to share some quick thoughts about each project that I hope you grapple with as you write your plan. Please read the comments on each other’s projects, as they may trigger thoughts about your own.

We need to know more about the work that’s already been done around these questions elsewhere, and the marriageability of the technologies that you imagine bringing together to enable this mode of looking. You also need to identify a usable corpus for your test case.

We need a clear sense of the amount of plain text that is available for you to work with, and whether an OCR component will be necessary for this project (if you have a substantial run of the proceedings available as plain text, I think you should eliminate the OCR bit for now). We also need to know how the XML markup will be generated (manually, automatically, or both; each choice comes with its own set of questions), and what your taxonomy will be. And we need to know what tools you’re going to need for each step from processing to presentation (gathering, parsing, storing, retrieval, display).

We need to know what this project makes easier or possible. What do other podcast networks lack that your project can provide for CUNY? What material needs do you have for the project (mics? software?)? Is this something that is targeted only towards CUNY, something that’s generalizable, or both?

We need a clear sense of the technical challenges you’ll face in querying, processing, parsing, and displaying data pulled from #sprezzetura. What do you know about this process, and what do you need to learn?

Finally, we’re going to need to hear from you where you want to host your project. As I noted, we can arrange Reclaim Hosting accounts… but before doing so, take a look and make sure Reclaim has what you need. I can put you in touch with Tim Owens (who runs Reclaim) if you have questions.


  1. Chris Vitale

    Thank you for the plot points! We will make sure to address these and other issues/concerns we’ve discovered internally.

    We are open to suggestions for a corpus. Erin Glass mentioned some ideas about digitized magazines, and I believe you mentioned cartoons.

    We will dig into the public domain.

