Author Archives: Liam Sweeney

#skillset @LiamSweeney

Hi All, great to see everyone last week! Thoughts on how I could contribute to a project below.

Project Management: In my day job I do a lot of this. I'm currently managing two surveys: one measuring the diversity of museum employees in North America, and another requesting data from academic libraries to measure Amazon's market share of university press print books. I'm getting okay at juggling.

Outreach: This also jibes with my day job, particularly a project where I'm a sustainability consultant for an open access journal (PPJ) starting out of Penn State and Michigan State's Matrix, and where I'm working to identify partners to help grow the project. I'm also eager to explore CUNY's infrastructure (it is so vast!) and its relationship with NYPL, to identify different homes for various kinds of work.

Developer: I don't have much experience but am eager to learn, especially because it would be a departure from the daily grind. I've completed the Codecademy HTML/CSS and JavaScript courses and some Python, and have played around in R a bit. I'm into hunting down answers on GitHub and Stack Overflow.

Designer: I have played around with the basics here: building a personal site and learning the fundamentals of Photoshop. But I don't have any real training.

Link

Hi All,

I can't wait to see you all for this last class! I know this is a busy time for everyone, and I don't want to add to your plate. But if you can find a few minutes to read through this request, follow the link, and provide top-of-mind texts, it would be immensely helpful for me and anyone else interested in analyzing this data. If this project interests you, I'll be maintaining a link to the spreadsheet on my Commons profile so anyone can play with it.

Request:

Social Citation intends to map the personal connections that give rise to the dissemination of influential texts. At this data-gathering stage, I ask that you share with me the texts that have been significant to your work, whether intellectually or aesthetically engaging in a way that was somehow transformative for you. This can be as comprehensive or as bare-bones as you like. To share the texts, follow this link: bit.ly/socialcitationdata and find your name among the tabs at the bottom of the page. Your name will appear as it does on your Academic Commons profile. Next, list the author alongside the text. Then, under referrer, list the person who referred you to the text (use NA if you found it yourself), followed by the location of the discovery (if outside an institution, write the city and state; if the text was encountered within an institution, just include the institution). Finally, list the duration of time spent in that place or institution. An example might look like:

Text | Author | Referrer | Location | Duration
Graphs, Maps, Trees | Franco Moretti | Matt Gold_Stephen Brier | CUNY | 2014-Present
HyperCities | Todd Presner_David Shepard_Yoh Kawano | Matt Gold_Stephen Brier | CUNY | 2014-Present
Planned Obsolescence | Kathleen Fitzpatrick | Matt Gold_Stephen Brier | CUNY | 2014-Present
Feel free to use these to start your list if you care to. Thank you for your time. I will keep this link open so that anyone may use the data to experiment with network maps and visualizations of their own.

Thanks!

Social Citation

Hi All,

It's so exciting to see the progress on this blog; so many of us are now able to do things we couldn't at the beginning of this class, thanks in no small part to all the great workshops. My final project is largely facilitated by the Gephi workshop. In this post I want to share my process in case it's useful to anyone, but also, crucially, to ask for your help in bringing it to life. In case this gets long, I'll say now that in the final two weeks of class I hope to ask the praxisers to complete a short (and fun!) exercise: mapping your favorite authors, as well as the people who helped you discover them, in a simple text file. (I'll be more specific when I ask this in earnest.) Now into the weeds!

My goal here is to make the citation process more social: to draw the connections between impactful texts/authors and the friends, partners, mentors, teachers, scholars, family, etc. who helped you discover them. I began with a basic tab-delimited text file that looked like this:

[Screenshot: the tab-delimited data file]

The categories here, from right to left, are: author, my person, relationship, location. I didn't get too hung up on the content; I just typed what came to mind for maybe 15 minutes. It took a bit of tinkering to figure out how to display this in Gephi, but eventually I got this:

[Screenshot: the first, messy Gephi graph]

Sorry if this is hard to see, but it is a very messy graph. There are some interesting things going on: the connections are by relationship and location, and they create pockets. Declan Meade is off on his own to the right because he's the only Dubliner and the only editor I have. I tried to make this a little more cohesive by changing my data to look like this:

[Screenshot: the simplified data file]

So I got rid of relationship and location, and I made everything one-to-one, where "Me" was connected to each of my people, and each of my people was connected to the work they'd introduced me to. Then the graph changed to this:

[Screenshot: the simplified Gephi graph]
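For anyone who wants to try the same restructuring, here's a minimal sketch in R of how that simple edge list might be built and exported for Gephi. All of the names, titles, and the output file name are placeholders; Gephi's spreadsheet importer looks for Source and Target columns.

```r
# A sketch of the simplified one-to-one edge list: "Me" -> person, person -> work.
# Everything below is illustrative placeholder data.
intros <- data.frame(
  person = c("Matt Gold", "Stephen Brier", "Declan Meade"),
  work   = c("Planned Obsolescence", "Graphs, Maps, Trees", "HyperCities")
)

edges <- rbind(
  data.frame(Source = "Me",          Target = intros$person),  # Me -> each person
  data.frame(Source = intros$person, Target = intros$work)     # person -> their work
)

# Gephi's Data Laboratory > Import Spreadsheet reads a Source/Target CSV as an edge list
write.csv(edges, "social_citation_edges.csv", row.names = FALSE)
```

A nice side effect of keeping the data in this shape is that you could retain a third column, like relationship, and import it as an edge attribute, which would make the color-coded-edges idea below straightforward to test.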

This sacrificed some of the nuance of place and relationship, but it gained a simplicity that I think is critical for these visualizations to make sense at a glance.

I'm not sure whether I'd like to add relationship back in as a node, or maybe offer it on hover or something (color-coding edges with a key linking them to relationships?). I have more playing to do and would love feedback. But I think this project gets way more interesting when "Me" is connected to "You." And so I wonder if folks would be willing to participate in this exercise. I think we can all safely use the three print texts assigned in this course, creating a link between everyone. I'll finalize the model over the weekend so I have a more developed request for you, but I think the easiest thing would be for me to set up a Google Doc with everyone's name on a separate page and ask you to type out the data. It's important to the project because only YOU know these things; there's no way to scrape this. Thanks for your consideration, and looking forward to NYPL Labs tomorrow!


JSchool Jan Workshops

Regarding Sandeep Junnarkar’s intro to the JSchool last week:

I've been looking for something like those coding modules for a while, so I dove into the JSchool site to figure out what was available. I found the data scraping module, which has HTML/CSS and JavaScript/jQuery prerequisites. Sandeep mentioned that there's a short scraping module in January that is good for folks to take first; it's more basic, with no coding, and runs January 8th, 9 AM to 3 PM.

The JavaScript/jQuery module is 2 credits, and the full scraping module (not the January one mentioned above) is 1.

Here’s a link to the short Jan workshops: http://www.journalism.cuny.edu/academics/january-academy/

Data Set: Topic Modeling DfR

Hello Praxisers, I’m writing today about a dataset I’ve found. I’ll be really interested to hear any thoughts on how best to proceed, or more general comments.

I queried JSTOR's Data for Research (dfr.jstor.org) for citations, keywords, bigrams, trigrams, and quadgrams for the full run of PMLA. JSTOR provides this data on request for all archived content. To do this I had to request an extension of the standard 1,000-document limit on DfR requests. I then submitted the query and received an email notification several hours later that the dataset was ready for download at the DfR site. Both the query and the download are managed through the "Dataset Requests" tab at the top right of the website. The download was a little over a gigabyte; I unzipped it and began looking at the files one by one in R.

Here's where I ran into my first problem: I basically have thousands of small files, with citation info for one issue per file, or a list of 40 trigrams from a single issue. My next step is to figure out how to prepare these files so that I'm working with a single large dataset instead of thousands of small ones.
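In case it's useful (or in case anyone spots a better way), here's a minimal sketch of the kind of R I have in mind for that combining step. It assumes the unzipped download contains a folder of per-issue CSVs; the folder name, file pattern, and column handling are guesses, so check the actual file names and headers in your own download.

```r
# Sketch: stack thousands of small per-issue CSVs into one data frame.
# "trigrams" and the .CSV pattern are assumptions about the DfR layout.
files <- list.files("trigrams", pattern = "\\.CSV$", full.names = TRUE)

read_one <- function(f) {
  d <- read.csv(f, strip.white = TRUE, stringsAsFactors = FALSE)
  d$id <- sub("\\.CSV$", "", basename(f))  # tag each row with its source file's id
  d
}

# One large dataset instead of thousands of small ones
trigrams <- do.call(rbind, lapply(files, read_one))
```

(For very large downloads, do.call(rbind, ...) gets slow; rbinding in chunks would help, but the shape of the operation is the same.)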

I googled "DfR R analysis" and found a scholar, Andrew Goldstone, who has been working on analyzing the history of literary studies with DfR sets. His GitHub contains a lot of the code and methodology for this analysis, including a description of his use of MALLET topic modeling through an R package. Not only is the methodology available, but so is the resulting artifact, a forthcoming article in New Literary History. My strategy now is simply to try to replicate some of his processes with my own dataset.
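To give a sense of what that replication might look like, here is a minimal sketch using the mallet R package (the topic-modeling step is the part of his pipeline I'm most interested in). The documents data frame, stop list file, and topic count below are all placeholder assumptions, and since DfR supplies word counts rather than full text, the text column would first have to be reconstructed as a bag of words from the counts, which is essentially what Goldstone's code does.

```r
# Sketch: LDA topic modeling via the mallet package (requires Java/rJava).
# Assumes `documents` is a data frame with an `id` and a `text` column;
# "stoplist.txt" and num.topics = 20 are placeholder choices.
library(mallet)

instances <- mallet.import(documents$id, documents$text,
                           "stoplist.txt", token.regexp = "[\\p{L}]+")

topic.model <- MalletLDA(num.topics = 20)
topic.model$loadDocuments(instances)
topic.model$train(200)  # sampling iterations

# Inspect the ten most probable words in the first topic
topic.words <- mallet.topic.words(topic.model, smoothed = TRUE, normalized = TRUE)
mallet.top.words(topic.model, topic.words[1, ], num.top.words = 10)
```

If that works on my PMLA set, a natural next step would be tracking topic proportions over time, as Goldstone does.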