Dataset Project: Testing Gephi

I found the projects on the Visual Complexity site really beautiful and interesting, and I was inspired to start playing with Gephi in anticipation of using it for my dataset project.

 

I’m happy I started early! I downloaded the most recent version of Gephi and went through the tutorial using the Les Miserables sample dataset with no problems. I figured since that was so easy, I’d go ahead and visualize my Facebook network, just for fun.

 

I used Netvizz to extract my FB data. I immediately ran into problems getting the data into a format Gephi could read. Netvizz says to ‘right click, save as’, which wasn’t actually an option. Ultimately I opened the .gdf data in the browser, cut and pasted it into an Excel file to save as a csv, and also pasted the same data into a text file and saved it. The Excel csv data would load into Gephi, but the IDs and labels were all wonky, and the graph was clearly a mess with number strings as node labels. I then tried the text file, which threw up an error message and wouldn’t even open. Some amount of Googling & trial and error later, I discovered I had to change the file’s encoding to UTF-8, and change the file extension from .txt to .gdf.
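The re-encoding and renaming step could also be scripted rather than done by hand. Below is a minimal Python sketch, assuming the pasted text file was saved as Mac OS Roman (that is a guess; swap in whatever encoding the file really uses). The function name `save_as_utf8_gdf` and the file name `facebook_network.txt` are made up for illustration.

```python
from pathlib import Path


def save_as_utf8_gdf(txt_path, source_encoding="mac_roman"):
    """Re-save a text file as UTF-8 with a .gdf extension.

    source_encoding is an assumption: use the encoding the text
    file was actually saved in.
    """
    src = Path(txt_path)
    # Decode the file using the encoding it was originally saved in.
    text = src.read_text(encoding=source_encoding)
    # Same folder and base name, but with .gdf instead of .txt.
    dest = src.with_suffix(".gdf")
    # Write the same text back out, this time encoded as UTF-8.
    dest.write_text(text, encoding="utf-8")
    return dest


if __name__ == "__main__":
    # Hypothetical file name; only runs if the file is present.
    sample = Path("facebook_network.txt")
    if sample.exists():
        print(save_as_utf8_gdf(sample))
```

The original .txt file is left untouched, so if the guessed encoding turns out to be wrong (names with accents coming out garbled is the usual sign), it is easy to try again with a different one.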

 

Once that was sorted, I had trouble displaying the data in the ‘data laboratory’ view of Gephi. I eventually discovered that Macs (or maybe just my really old Mac) are not, for some reason, entirely compatible with the current version of Gephi. OK. Uninstall the current version, re-install an older version. Fortunately that solved that particular problem.

 

So! Eventually I was able to get the data to open properly, and the graph to start looking like it should. I used the same steps from the tutorial to create a graph of my FB network. This part was easy, as it had been with the sample dataset–just following the instructions for a really basic visualization. Beautiful.

 

[Image: FB_noname_viz]

 

Feeling emboldened by my problem solving and subsequent success, I started to play with a small part of my (anticipated) dataset. Several hours (and one trip to the grocery store) later, I’ve not been able to get the data into the proper format so that 1) Gephi will take it, and 2) it will display connections correctly. I can get one or the other, but not both.

 

This will definitely take more research & finessing, but I’m hopeful that I will be able to get it all to work. Stay tuned for the scintillating conclusion!

10 thoughts on “Dataset Project: Testing Gephi”

  1. Micki

    Great to see your enthusiasm and progress… and here if you’d like any assistance – but it seems like you are doing a great job troubleshooting!

  2. Steve Brier

    I assume you have a Mac, Sarah, which has a one button mouse? You right click by holding down the Command key and then clicking and you get the “save as” menu. Hope that helps a little moving forward. Great start on your data project.

  3. Christopher Stein

    I got here seeing Steve’s comment in the new My Commons stream. Sorry to jump in but I had a couple of quick comments that may, or may not, be of some help. Steve, sorry to correct you but it is the “control” key to right click on a one button mouse.

    Sarah, I feel your pain on this process. It’s an all too common part of working with data vis. You’re doing a great job working through it. I don’t know how much you all have talked about Github but if the project is hosted here, as Gephi is, it can be a good first place to go to find out whether your problem is known and being addressed by the developers (and the problem you worked around is there https://github.com/gephi/gephi/issues/748 ). Also there is often a more recent build on Github. For Gephi it is 0.9 (https://github.com/gephi/gephi) vs 0.8.2 on their main site (https://gephi.github.io/).

    The other thing I look for is someone who is doing similar things and blogging about it like Amanda Visconti here http://www.literaturegeek.com/2013/09/09/dataintogephi/

    Ok, not so quick a comment, as usual.

  4. Matthew K. Gold (he/him)

    Excellent to see you moving forward with this experiment, Sarah! It’s telling that a significant part of the struggle involved text transformations — one of the most important steps in preparing data for visualization or other analysis. I hope that Chris’s suggestions are useful and I urge you to follow up with Micki and to keep at it.

  5. Sarah Cohn Post author

    Thanks for everyone’s advice and enthusiasm! I do have the ‘right click’ option enabled on my trackpad (no mouse) but, maddeningly, ‘save as’ was never an option on the list opened by the right click.

    Before I started I did poke around the documentation of Gephi on GitHub, but a lot of it seemed like it was for developers and people whose technical knowledge was far beyond my own. And I was impatient and wanted to produce something! I will definitely revisit it and spend more time reading.

    Having spent some time away from the computer thinking about my data problem, I have some ideas on how to try and fix it. If (when?) none of those work, I’ll be in touch asking questions.

  6. Mary Catherine Kinniburgh

    Christopher–thank you for pointing out the differences in updates between Github and Gephi’s main site. I’d been experimenting with Gephi for projects, but didn’t realize that Github was the best place to look for documentation and updates.

    Sarah–very cool that your data experimentation is coming right along! For sharing Gephi, a great tool is Sigma.js–it’s a JavaScript library that allows you to mount your Gephi visualization online in an interactive way, as opposed to taking a screenshot. Fun tools for future work…

  7. Renzo Adler

    Have you learned something about FB or Data Visualization that you were previously unaware of or does this make you look at FB/social networking in a new light?

  8. Selena Williams

    I found the site on Visual Complexity fascinating as well. I did not go as far as downloading, however it started me thinking about my research and how I could utilize large data to chart childhood obesity in urban areas and link that to academics over the past twenty to fifty years. Not sure how to begin to tap into this, but the visual complexity was a good start.
