…and then the world

“where nothing we’ve actually seen has been mapped or outlined…”

several dots on a map

with 10 comments

2010 is already looking like it’ll be fairly busy, not least because nearly a quarter of it is gone already. Over the next twelve months, I should finish my thesis, while other projects are also being developed and carried out: I’m tutoring in a first-year unit this semester, and am currently writing up new work on the French political blog research, first outlined at IR10 last year, for both my thesis and a conference presentation.

That presentation will be in June, at the International Communication Association conference in Singapore, as a paper co-authored with Lars Kirchhoff and Thomas Nicolai from Sociomantic Labs in Germany. Where my IR10 presentation looked at the text content of blog posts, this paper will be covering the links being made, in their various guises.

As part of this work, and indeed in preparation for research into topical networks, the links made around particular events or themes, I’ve been busy looking into the more permanent/static networks created by blogroll links from sites in the sample population. As with the IR10 work, I’m using data collected by Thomas Nicolai and Lars Kirchhoff over the first eight months of 2009, with 217 political blogs, media resources, and other related websites represented in the final collected data. For this stage, I’ve taken these sites as a starting point, making a list of each blogroll out-link from each of the 217 sites as a two-column spreadsheet (host site, site linked to), and then importing the final list into Gephi for visualisation purposes.
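As a rough illustration of that two-column list, here is a minimal Python sketch that flattens a set of blogrolls into (host site, site linked to) rows. The site names are made up, removing duplicate pairs is my own assumption rather than a step described above, and the `Source`/`Target` header is an assumption about what a recent Gephi spreadsheet import expects.

```python
import csv
import io

# Hypothetical sample: each host site mapped to the sites in its blogroll.
blogrolls = {
    "blog-a.example": ["blog-b.example", "news.example"],
    "blog-b.example": ["blog-a.example"],
}


def blogroll_edges(blogrolls):
    """Return sorted, de-duplicated (host site, site linked to) pairs."""
    edges = set()
    for host, targets in blogrolls.items():
        for target in targets:
            edges.add((host, target))
    return sorted(edges)


def write_edge_list(edges, handle):
    """Write the pairs as a two-column CSV, one blogroll link per row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # assumed header names
    writer.writerows(edges)


# Write to an in-memory buffer here; a real run would open a .csv file instead.
buffer = io.StringIO()
write_edge_list(blogroll_edges(blogrolls), buffer)
edge_csv = buffer.getvalue()
```

Keeping the list as plain (source, target) pairs means the same rows can be reused for other export formats later on.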

[Because I was using a slightly older version of Gephi, I was also using Excel 2 Pajek to convert the spreadsheet into Pajek’s .net format so that Gephi could import it. However, the latest version of Gephi imports .csv files, with extra import options through the .gdf format too.]
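For anyone curious what that conversion produces, the sketch below writes a list of (source, target) pairs as basic Pajek .net text: a `*Vertices` section that numbers each site, followed by an `*Arcs` section of directed links. It only illustrates the file layout and is not a description of how Excel 2 Pajek works internally; the function name and sample sites are hypothetical.

```python
def to_pajek(edges):
    """Render (source, target) pairs as Pajek .net text with directed arcs."""
    # Pajek refers to vertices by 1-based integer id, so number each site
    # in order of first appearance.
    ids = {}
    for source, target in edges:
        for site in (source, target):
            ids.setdefault(site, len(ids) + 1)

    lines = [f"*Vertices {len(ids)}"]
    lines += [f'{number} "{site}"' for site, number in ids.items()]
    lines.append("*Arcs")  # arcs are directed; "*Edges" would be undirected
    lines += [f"{ids[source]} {ids[target]}" for source, target in edges]
    return "\n".join(lines) + "\n"


# Hypothetical sample links.
sample = [
    ("blog-a.example", "blog-b.example"),
    ("blog-b.example", "news.example"),
]
pajek_text = to_pajek(sample)
```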

Having not used Gephi before (I couldn’t get it to work when I tested out visualisation options quite a long time ago), I was greatly helped in testing it out by the Gephi team’s release of a step-by-step tutorial for new users. Importing every individual link originating from the 217 sites and following each tutorial step led to something that looks rather spectacular, although it doesn’t really say much:

here comes sciencey

Of course, the risk with visualisation is that too much attention is spent on the ‘pretty’ side of things, or on preparing diagrams that look impressive (or ‘sciencey’), but don’t aid the research’s argument (or even confuse it further). While the initial aim of creating a blogroll network is to help me see the groups of sites that associate with each other, trying to get a handle on how these sites in the sample relate to each other, the warnings and advice from people such as Bernie Hogan at last year’s OII Summer Doctoral Programme have stayed in the back of my mind. As such, I’ve spent a fair amount of time over the last few weeks trying to clean up the data and improve the visualisations, not from an aesthetic point of view, but so I get a clearer sense of what I’m trying to describe.

here comes sciencey (part two)

The full list of links contains over 5000 nodes, each receiving at least one in-link from one of the 217 initial sites. One of the main problems in the first visualisation is therefore the sheer number of nodes, along with the implied overimportance of sites with many out-links (especially when these sites are the only ones linking to many nodes, which leads to large groups of satellites around them). The next step then, as seen above, was to restrict the nodes to those sites receiving two or more in-links from the initial 217 sites. A number of loose groupings were immediately apparent (see, for example, the top-left of the diagram), and these were followed up after the next round of cleaning the data, with the results shown in the next two visualisations.
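The two-or-more in-links restriction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function and sample names are hypothetical, in-links are counted per distinct linking site, and the originating sites are left in as link sources even if they receive few in-links themselves.

```python
from collections import defaultdict


def filter_by_in_links(edges, seeds, min_in_links=2):
    """Keep edges whose target is linked from enough distinct seed sites."""
    linkers = defaultdict(set)
    for source, target in edges:
        if source in seeds:
            linkers[target].add(source)  # count each linking site once

    kept = {t for t, sources in linkers.items() if len(sources) >= min_in_links}
    return [(source, target) for source, target in edges if target in kept]


# Hypothetical sample: "x" and "b" each receive two in-links, "y" only one.
seeds = {"a", "b", "c"}
edges = [("a", "x"), ("b", "x"), ("a", "y"), ("c", "b"), ("a", "b")]
filtered = filter_by_in_links(edges, seeds)
```

Raising `min_in_links` thins the network further, at the cost of dropping sites that only one or two of the initial sites care about.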

here comes sciencey (5b)

here comes sciencey (part five)

In the first of these two visualisations, some nodes are coloured by their affiliation to particular political parties (either by being official sites or by containing the party name/acronym in their URL). A loose grouping of sites from the Front National (brown) and UMP (blue) in particular is apparent. In the second visualisation, I located sites that were members of three different blog communities or networks, organised around different themes or beliefs. Again, there is some loose grouping – unsurprising, considering this is a blogroll-oriented network, and sites will often link in their blogrolls either to the main page of the group or to the other members – but what is most interesting is the general location of the anti-Sarkozy group Les vigilants (in pink) between the left-wing and centrist party groupings (in the first of the two visualisations).

For more details and visualisations-in-progress, check out my Flickr (and look out for updates on the related paper over the next few months!). The next important step, particularly in terms of new information, is comparing the blogroll links to the topical networks, and seeing whether the same associations are in play regardless of time or topic – this will be investigated further over the next few weeks.

At this stage, in particular because of its ease of use (and not being restricted to the latest version of operating system-specific software), I’ll most likely continue to work with Gephi while I work on my thesis. I’d still like to try out Prefuse at some point, but that may have to wait until after all this work is out of the way…


Written by Tim

19 March, 2010 at 3:46 pm

10 Responses


  1. […] recently, Tim outlined some of his recent observations over on his blog. The images below are just a few of the network representations that Highfield has created by […]

  2. Tim! Facebook! TIIIM!!!!

    *stern looks*

    sky

    27 April, 2010 at 10:34 am

    • 🙂 I’ll probably reactivate it sooner rather than later, but, to be honest, I’m enjoying not being on it all the time and actually getting work done – not that that is facebook’s fault, but more that using things like Leechblock didn’t work for as long as deactivating my account in order to stop being distracted.

      Tim

      27 April, 2010 at 6:45 pm

      • Fair enough. I’m using leechblock myself 🙂

        sky

        27 April, 2010 at 6:46 pm

    • Hi,

      this looks interesting. I am a bit new to the field, so I might ask a silly question, but how do you actually get data for your analysis?
      (Spinn3r?)

      Gephi is just a tool for visualization as I see it.

      chris

      6 May, 2010 at 9:29 pm

      • Hi Chris,

        yes, I’m using Gephi only for visualisation. The data I have used for the above work was collected as part of a larger, ongoing research project between QUT and Sociomantic Labs, who very kindly crawl and scrape content from a specific list of sites for us. I haven’t tried out Spinn3r although it certainly looks interesting and has a substantial data set available to users.

        Tim

        7 May, 2010 at 7:35 pm

  3. […] few weeks ago we showed you some social graphs of the French political blogosphere created by our research partner Tim Highfield using an open-source network visualization software […]

  4. […] Several dots on a maps Bibliography For more extensive bibliographic references, see my article “Methods for mapping hyperlink networks: examining the environments of Belgian news sites” (PDF) Barabasi, A., 2003. Linked: How Everything Is Connected to Everything Else and What It Means for Business, Science, and Everyday Life Reissue., Plume Books. […]

  5. […] you can see some social graphs of the French political blogosphere created by researcher Tim Highfield using an open-source network visualization software called […]


