Data & Stuff // Neil Houston

Yeap, data and stuff
  • July 30th, 2009 | Neil H | Public Data, Visualisation

    In response to the Guardian Data Blog – Environmental Hackday, I decided to take my first look at some environmental data.  I originally envisaged this as a world map showing the difference between the 1950 actual figures and the 2030 predictions.

    The main issue was that the map offered no easy way to show scale or difference, as most countries in Western and Eastern Europe (WE/EE) are very similar in percentage terms.  The disparity only shows when you compare, say, Africa with North America.

    Below, therefore, is a very quick visualisation of the spread, by country, of populations living in the ‘urban environment’.  To build it, I turned the original spreadsheet (Percentage of global population living in cities) into a three-column layout of Country, Population % and Year (To Be Published Soon).
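A minimal sketch of that reshaping step, assuming the original spreadsheet has one row per country and one column per year. The column names and the figures in `WIDE_CSV` are made up for illustration and are not taken from the real dataset.

```python
import csv
import io

# Hypothetical sample in a wide layout: one row per country, one column per year.
# These figures are invented for illustration only.
WIDE_CSV = """Country,1950,2030
Exampleland,40.0,75.5
Sampleland,5.0,20.0
"""

def wide_to_long(wide_text):
    """Turn one-row-per-country data into (country, percent, year) rows."""
    reader = csv.DictReader(io.StringIO(wide_text))
    rows = []
    for record in reader:
        country = record["Country"]
        for column, value in record.items():
            if column == "Country":
                continue  # every other column is assumed to be a year
            rows.append((country, float(value), int(column)))
    return rows

long_rows = wide_to_long(WIDE_CSV)
for row in long_rows:
    print(row)
```

Each country now contributes one row per year, which is the shape most charting tools expect.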

    I used a vertical layout, sorted in descending order from the highest % to the lowest, so we can see that places like Monaco are at or near 100%, whereas in Burundi it is predicted that less than 25% of the population will live in cities in 2030.
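The ordering described above can be sketched as a filter on one year followed by a descending sort. The rows below are placeholders with invented values, and `ranked_for_year` is a hypothetical helper name.

```python
# Hypothetical (country, percent, year) rows; the values are invented.
SAMPLE_ROWS = [
    ("Exampleland", 75.5, 2030),
    ("Sampleland", 20.0, 2030),
    ("Otherland", 98.0, 2030),
    ("Exampleland", 40.0, 1950),
]

def ranked_for_year(rows, year):
    """Return the rows for one year, highest urban percentage first."""
    selected = [row for row in rows if row[2] == year]
    return sorted(selected, key=lambda row: row[1], reverse=True)

ranking_2030 = ranked_for_year(SAMPLE_ROWS, 2030)
for country, percent, _ in ranking_2030:
    print(f"{country:12s} {percent:5.1f}%")
```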

    [more to come - it's late - bad point about this - tooooo much information]

    Top to Bottom - Urban Populations - Vert

  • July 29th, 2009 | Neil H | Birmingham, Visualisation

    Last weekend (24th-26th July) was the Supersonic Festival at the Custard Factory.

    Pete Ashton was ‘in charge’ of the Twitter account @supersonicfest, which was used over the weekend to interact with festival goers.  As well as the account, the hashtag #supersonic was used.

    Over on ash10.com Pete has conducted some preliminary analysis, focussing on the numbers and proportion of tweets sent, which contained one of:

    • @supersonicfest
    • #supersonic
    • supersonic festival
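As a rough illustration of that kind of filter (not Pete's actual method), the three terms can be matched case-insensitively against each tweet. The sample tweets are invented, and `mentions_festival` is a hypothetical helper name.

```python
# The three search terms listed above, lower-cased for matching.
KEYWORDS = ("@supersonicfest", "#supersonic", "supersonic festival")

def mentions_festival(text):
    """True if the tweet text contains any of the search terms."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)

# Invented sample tweets, for illustration only.
sample_tweets = [
    "Heading to the Custard Factory for #supersonic",
    "Thanks @supersonicfest, great weekend",
    "Making a cup of tea",
    "Supersonic Festival starts today",
]

matching = [tweet for tweet in sample_tweets if mentions_festival(tweet)]
proportion = len(matching) / len(sample_tweets)
print(len(matching), proportion)
```

This is a plain substring match, so `#supersonic` would also catch longer hashtags that start the same way.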

    I offered to do some quick analysis, and thought it might be interesting to look at text analysis.

    Using IBM’s Many Eyes service, I uploaded the dataset that Pete provided (plain extract and on Many Eyes) and did some very quick analysis.

    A wordle is a simple map of common words.  In this particular example I’ve removed the ‘common English words’ as well as the keywords identified above.
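The counting behind a word cloud can be sketched as follows. This is not how Many Eyes works internally, just the general idea: tally words, skipping common English words and the search terms. The stop-word list is a tiny stand-in and the tweets are invented.

```python
import re
from collections import Counter

# A tiny stand-in for a real list of common English words.
STOP_WORDS = {"the", "a", "at", "is", "was", "and", "to", "of", "in", "on"}
# The search terms (and their component words) are excluded as well.
SEARCH_TERMS = {"@supersonicfest", "#supersonic", "supersonic", "festival"}

def word_counts(tweets):
    """Count words across tweets, ignoring stop words and search terms."""
    counts = Counter()
    for tweet in tweets:
        # Keep a leading # or @ attached so hashtags and mentions stay whole.
        for word in re.findall(r"[#@]?\w+", tweet.lower()):
            if word not in STOP_WORDS and word not in SEARCH_TERMS:
                counts[word] += 1
    return counts

# Invented sample tweets, for illustration only.
sample_tweets = [
    "Goblin at #supersonic was amazing",
    "Watching Goblin at the festival",
    "@supersonicfest Goblin is on",
]

counts = word_counts(sample_tweets)
print(counts.most_common(3))
```

A word cloud then simply scales each word's font size by its count.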

    Supersonic Wordle

    For each of these visualisations, you can click on the image and interact with it on Many Eyes.

    So, let’s see if we can spot any relationships using a ‘phrase net’.

    Supersonic Phrase Net

    I’ve not excluded any words in the above, but we can see that ‘Goblin’ still shows up as a popular word.  With phrase nets, the idea is that you can see the relationships between words.
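One way to picture a phrase net is as a tally of word pairs joined by a connecting word. The sketch below assumes the connector is "at", which is a guess, since the post does not say which pattern was used. The tweets are invented.

```python
import re
from collections import Counter

def phrase_pairs(tweets, connector="at"):
    """Count (left, right) word pairs that appear as 'left <connector> right'."""
    pattern = re.compile(r"(\w+)\s+" + re.escape(connector) + r"\s+(\w+)")
    pairs = Counter()
    for tweet in tweets:
        # findall returns non-overlapping matches, so in 'a at b at c'
        # only the pair (a, b) is counted.
        for left, right in pattern.findall(tweet.lower()):
            pairs[(left, right)] += 1
    return pairs

# Invented sample tweets, for illustration only.
sample_tweets = [
    "Goblin at Supersonic",
    "goblin at supersonic tonight",
    "queue at bar",
]

pairs = phrase_pairs(sample_tweets)
print(pairs.most_common())
```

Each pair becomes an edge in the network, drawn thicker the more often it occurs.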

    A better way to drill down into the patterns is to use word trees.  In the case below, I’ve focussed on the phrases that include ‘rhubarbradio’, which covered the event (Listen Again on Rhubarb Radio).
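The data behind a word tree can be approximated by collecting the words that follow a chosen root word. This is a simplified sketch rather than the Many Eyes algorithm; the tweets are invented and `branches_after` is a hypothetical helper name.

```python
from collections import Counter

def branches_after(tweets, root, depth=3):
    """Count the word sequences (up to `depth` words) that follow `root`."""
    branches = Counter()
    for tweet in tweets:
        words = tweet.lower().split()
        for index, word in enumerate(words):
            if word == root:
                following = tuple(words[index + 1:index + 1 + depth])
                if following:  # skip cases where root is the last word
                    branches[following] += 1
    return branches

# Invented sample tweets, for illustration only.
sample_tweets = [
    "listening to rhubarbradio live from supersonic",
    "rhubarbradio live from the custard factory",
    "tune in to rhubarbradio",
]

branches = branches_after(sample_tweets, "rhubarbradio", depth=2)
print(branches)
```

Sequences that share a prefix would be merged into a single branch when drawn as a tree.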

    Rhubarbradio & Supersonic Word Tree

    Finally, ignore what was actually said.  Data is about patterns, so I’d suppose that the number of tweets including #supersonic increased exponentially over the weekend.  The visualisation below is a streamgraph, drawn directly from the Twitter search engine.
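To check a trend like this against the raw data rather than by eye, the tweets can be counted per day. The sketch below assumes each tweet has a timestamp string in `YYYY-MM-DD HH:MM` form, which is an assumption about the extract's format, and the timestamps are invented.

```python
from collections import Counter
from datetime import datetime

def tweets_per_day(timestamps):
    """Count tweets per calendar day, returned in date order."""
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        counts[moment.date().isoformat()] += 1
    return dict(sorted(counts.items()))

# Invented timestamps covering the festival weekend, for illustration only.
sample_timestamps = [
    "2009-07-24 19:05",
    "2009-07-25 14:30",
    "2009-07-25 21:10",
    "2009-07-26 16:00",
    "2009-07-26 18:45",
    "2009-07-26 22:20",
]

daily = tweets_per_day(sample_timestamps)
for day, count in daily.items():
    print(day, count)
```

Comparing the ratio between consecutive days would show whether the growth is closer to exponential or linear.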

    Supersonic Streamgraph

    The good thing about the streamgraph is that we can see some key words at the peaks.  (If you click the graph, you will be able to enter your own search and then see the actual tweets that included the keyword.)

    This was a whistle-stop tour of some text/trend analysis tools.  I’d highly recommend having a play on Many Eyes yourself.  Also take a look at this post on Juice Analytics regarding text analysis.

    In the second part of the analysis, I’ll be looking into the date, time and people trends.  So check it out soon.
