Let’s talk about networks. Even though this may be a long blog post, it is actually pretty short given how complex networks can be. Networks are used to show the relationships between things. These “things” can be people, words, topics, ideas, et cetera. The “things” are called nodes (this is the term we tend to use in DH, while the sciences usually use the term “vertices”), while the relationships between them are edges (the lines that connect the nodes). Networks can be unimodal (one type of node), bimodal (two types of node), or multimodal (several types of node). While I could go into great detail about what networks are and how they are best used in the humanities, there is no reason to reinvent the wheel. A great article on networks as they relate to the humanities was written by Scott Weingart, called Demystifying Networks (parts I and II), and can be found here. Even so, there are some important points from Scott’s work that we at CyberDH think need to be stressed.
Firstly, beware of “shiny new toy syndrome” or SNTS for short (because everything can be an acronym just like everything can be made into a network). Just because you can make a network does not mean that you should, and the same goes for acronyms. It is tempting to use networks for everything because everything has some relationship to everything else. However, just because the relationship exists does not mean that the network is valid or that the relationship is really meaningful for your research. Remember, you are still doing humanities research and will need to thoroughly explain why the network is relevant, what the network proves or disproves, what new research questions it might raise, and exactly what the program you used to create the network did to make this beautiful new visualization.
Secondly, you may need to change the steps you follow for different tools. Sometimes you may need to skip a recommended step, and sometimes you may need to add one. This is because many of these tools were developed by computer scientists and intended for use by those in the hard sciences, and humanists have since hijacked them for other purposes. A recommended step is often there to help you avoid information that is not very “scientific” or “mathy.” That kind of result is not such a problem for humanists and, in fact, may be exactly the information we are after. Ted Underwood gives a good example of this phenomenon on his blog.
Lastly, and very briefly, remember that Digital Humanities is meant to emphasise the humanities, not the digital. Basically, make sure the network has some literary, historical, cultural, or philosophical value, or any combination of these, as one of the beautiful (and also difficult) things about Digital Humanities and networks is that they are both very interdisciplinary.
With that said, there are several tools with which we can create these networks. We here at CyberDH go back and forth as to which tool is preferred. As with most things in life, the preferred network visualization tool greatly depends on the situation and the details. These include your data (how much, what format, how much it needs to be cleaned up, et cetera), your overall goal (paper publication, web publication, a research tool, crowdsourcing, et cetera), and your intended audience (academics, students, Joe Public, your mom). Cytoscape is often used for our network creation needs, and the download for Cytoscape can be found here. The pros of Cytoscape are that the interface is user friendly and that it handles undirected networks as well as very large data sets and, therefore, very large networks (it was built for scientific data visualizations, which can have nodes in the millions). However, it has difficulty with multimodal networks. A great tutorial on the use of Cytoscape for humanists was created by Miriam Posner at UCLA and can be found here.
The second tool that I will mention is Google Fusion Tables. Fusion Tables has an option for creating a network, amongst other things, from the data you provide. The data must be in a spreadsheet, either in Google Sheets or as a .csv file. You also need a Google account to use Fusion Tables, but no downloading is required. You can access a tutorial on using Fusion Tables here.
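To make the spreadsheet requirement a bit more concrete, here is a minimal Python sketch of turning a list of relationships into .csv text with one row per edge. The character pairs are made up, and the column names (node_a, node_b, weight) are my own assumption rather than anything a particular tool requires, so rename them to whatever your tool expects.

```python
import csv
import io
from collections import Counter

# Made-up observations: each pair is one scene in which two characters appear together.
observations = [
    ("Bilbo", "Gandalf"),
    ("Gandalf", "Bilbo"),
    ("Bilbo", "Thorin"),
    ("Gandalf", "Frodo"),
]

def to_edge_csv(pairs):
    """Collapse repeated pairs into weighted edges and return them as CSV text."""
    # Sort each pair so ("Gandalf", "Bilbo") and ("Bilbo", "Gandalf") count as the same edge.
    counts = Counter(tuple(sorted(pair)) for pair in pairs)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header names; adjust them for the tool you are importing into.
    writer.writerow(["node_a", "node_b", "weight"])
    for (a, b), weight in sorted(counts.items()):
        writer.writerow([a, b, weight])
    return out.getvalue()

edge_csv = to_edge_csv(observations)
print(edge_csv)
```

Each row is one edge, the two name columns are the nodes it connects, and the weight simply counts how often that pair turned up. Saving the printed text to a file gives you something you can open in a spreadsheet and check by eye before handing it to any tool.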
The third tool that some may prefer is Gephi, and here is the link to the download for Gephi. As you might have guessed, there are tutorials for using Gephi as well, but none that are overly useful (if you have made one or know of one that was particularly useful, let me know in the comments below). You can, however, find Gephi’s own tutorial here. Gephi also has pros and cons. Gephi has trouble handling undirected networks, that is, networks where the relationship runs both ways, such as character relationships in a book (meaning the statement “Bilbo has a relationship with Gandalf” is also true if you swap their names), but you can fix this manually in your .csv file. The fact that you can fix things manually is a strength for those who do not mind getting their hands dirty. A weakness is that some find Gephi not to be user friendly, and you often have to do a bit more data cleaning and prep. On the other hand, Gephi (arguably) creates more aesthetically pleasing networks.
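If fixing the .csv by hand sounds tedious, the same fix can be scripted. This is only a sketch of one way to do it, assuming an edge list whose two columns are headed Source and Target (change the names if your file differs): every relationship is written out in both directions, so Bilbo to Gandalf is always matched by Gandalf to Bilbo.

```python
import csv
import io

def add_reverse_edges(csv_text):
    """Return CSV text in which every Source,Target row also appears reversed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()   # pairs already written, so nothing is duplicated
    rows = []
    for row in reader:
        source, target = row["Source"].strip(), row["Target"].strip()
        # Keep the original direction, then add the mirrored one.
        for pair in ((source, target), (target, source)):
            if pair not in seen:
                seen.add(pair)
                rows.append(pair)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(rows)
    return out.getvalue()

# Made-up sample data standing in for a real edge list file.
sample = "Source,Target\nBilbo,Gandalf\nGandalf,Frodo\n"
symmetric = add_reverse_edges(sample)
print(symmetric)
```

The sketch reads from and writes to text in memory to stay self-contained. With a real file you would read its contents in, pass them through the function, and save the result under a new name so the original stays untouched.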
Another network tool that can be used, and is a bit more technical, is igraph. igraph can be used with the R programming language or with Python, so you can pick your poison (preferably the one you have spent some time building up an immunity to, just like iocane powder). In R, igraph comes as a package that you can install from within RStudio. igraph definitely requires more programming knowledge than the other tools mentioned here, but it also seems to allow for more manipulation, again, as long as you know how to code. In addition, igraph, in either R or Python, is not primarily a visualization tool, which means that it can be used alongside either Gephi or Cytoscape above, but not Google Fusion Tables. There are various examples of code used to create network visualizations using igraph, but few good tutorials. This one by Katherine Ognyanova is pretty decent, though, and a good place to start.
If you want more information on using networks for humanities purposes, feel free to come to the CyberDH/AVL workshop on the topic. It will be on Friday, November 18th, from 12pm to 1pm at the IQ Wall in the Wells Library. I know it is a long way off, but I also know the end of the semester can fill up fast. I hope to see many of you there. Until then, enjoy some examples of networks created using some of the tools mentioned above.