"The Uncertainty of Identity: Linking Spatiotemporal Information between Virtual and Real Worlds".

This UK Engineering and Physical Sciences Research Council (EPSRC) funded project is a collaboration between University College London (Geography), Birmingham University (Computer Science), and City University London (Engineering), in partnership with experts in Visual Analytics at Purdue University and Arizona State University in the United States. Our goal is to link information about human characteristics in real and virtual worlds in order to better understand and manage the uncertainties inherent in establishing human identity in different geographic locations. The three-year research programme began in November 2011 under EPSRC's 'Who Do You Think You Are?' initiative. The other projects funded under this initiative are Imprints (http://www.imprintsfutures.org/) and SuperIdentity (http://www.southampton.ac.uk/superidentity/).

Search engine data analytics offer another route to finding these relationships, by identifying patterns in entity mentions among “authority” websites. However, that method requires an accurate way of identifying the most authoritative pages on a given topic. Here we do not have to concern ourselves with which source is more trustworthy. Many other trends are analysed via co-occurrence on Twitter.

For example, researchers can test whether mentions of a Mont Blanc pen, and the text of the tweets that contain them, correlate strongly with positive or negative sentiment. Can we accurately gauge positive associations with products from reactions to such reviews? Sentiment research can also indicate whether pens in general are losing popularity as interest in handwriting declines. What's behind this change? By analysing tweets about pens and grouping the words that appear nearby, we can see whether they mention an interest in handwriting, for example, or whether it is the writing tool itself, with its luxurious look, that drives spending.
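As a rough illustration of the co-occurrence idea, the sketch below counts the words that appear within a few tokens of a target term across a handful of tweets. The function name `cooccurring_words`, the window size, the stopword list and the sample tweets are all illustrative assumptions, not part of the project's actual pipeline.

```python
from collections import Counter
import re


def cooccurring_words(tweets, target, window=4, stopwords=frozenset()):
    """Count words appearing within `window` tokens of `target` (hypothetical helper)."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep alphabetic tokens only; punctuation is dropped.
        tokens = re.findall(r"[a-z']+", text.lower())
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            start = max(0, i - window)
            # Words before and after the target, excluding the target itself.
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts.update(w for w in neighbours if w not in stopwords)
    return counts


# Invented sample tweets, for illustration only.
tweets = [
    "Love my new pen, handwriting feels great",
    "This pen looks so luxurious",
    "Handwriting with a good pen is relaxing",
]
stop = frozenset({"my", "a", "is", "so", "this", "with", "new"})
counts = cooccurring_words(tweets, "pen", window=4, stopwords=stop)
print(counts.most_common(3))
```

On real data, the neighbouring words would normally be paired with a sentiment score before any conclusion is drawn about why a product is gaining or losing popularity.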

Our basic premise is that uncertainty in characterising and hence identifying individuals may be understood and managed by:

  • Detecting and exploring spatio-temporal profiles of lifestyles and activity patterns;
  • Concatenating and conflating detailed, but under-exploited, datasets in the virtual and real domains; and
  • Soliciting, triangulating and analysing crowd-sourced volunteered data that link physical and virtual identities.

Great Britain's Most Common Surnames

We analysed 2007 Electoral Roll data to identify the most common surnames in each of the 8920 wards across Great Britain. For the Twitter surnames, we analysed the 'user names' attached to 32 million geo-tagged tweets downloaded across Britain between September 2012 and April 2013. Surnames from both sources were then mapped to the wards and colour-coded according to their origin.

Click here to view the map.
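The per-ward surname ranking described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it assumes each tweet has already been assigned to a ward (for example by a point-in-polygon lookup), and the helper names `extract_surname` and `top_surname_by_ward`, the "last alphabetic token" heuristic, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict


def extract_surname(user_name):
    """Hypothetical heuristic: take the last token of a name made of 2+ alphabetic tokens."""
    tokens = [t for t in user_name.split() if t.isalpha()]
    if len(tokens) < 2:
        return None  # handles like "cool_user99" carry no usable surname
    return tokens[-1].title()


def top_surname_by_ward(records):
    """Return the most frequent surname per ward from (ward, user_name) pairs."""
    counts = defaultdict(Counter)
    for ward, user_name in records:
        surname = extract_surname(user_name)
        if surname is not None:
            counts[ward][surname] += 1
    return {ward: c.most_common(1)[0][0] for ward, c in counts.items()}


# Invented sample records; ward labels are placeholders.
sample = [
    ("Ward A", "John Smith"),
    ("Ward A", "Jane Smith"),
    ("Ward A", "Tom Jones"),
    ("Ward B", "Priya Patel"),
    ("Ward B", "cool_user99"),
]
result = top_surname_by_ward(sample)
print(result)
```

Many Twitter user names are not real names, so a real analysis would need a stricter filter, such as matching candidate tokens against a reference list of known surnames.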


Top Twitter Names in London

We downloaded approximately 4 million geo-tagged tweets between August and November 2012. The 'user names' associated with these tweets were analysed to produce a map of the 'Top Twitter Names in London'.
Click here to view the map.