Great Britain's Most Common Surnames

We analysed 2007 Electoral Roll data to find the most common surnames in each of the 8920 wards across Great Britain. For the Twitter surnames, we analysed the user names attached to 32 million geotagged tweets downloaded across Britain between September 2012 and April 2013. Surnames were then mapped to the 8920 wards and colour-coded according to their origin.
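The ward-level count described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the maps: the `(ward_code, surname)` record layout, the upper-case normalisation and the alphabetical tie-break are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def most_common_surname_by_ward(records):
    """Return {ward_code: most frequent surname}.

    `records` is an iterable of (ward_code, surname) pairs; this layout is
    hypothetical and not taken from the Electoral Roll files themselves.
    """
    counts = defaultdict(Counter)
    for ward, surname in records:
        # Assumed normalisation: ignore case and surrounding whitespace.
        counts[ward][surname.strip().upper()] += 1

    # Highest count wins; ties are broken alphabetically so that the
    # result is reproducible.
    return {
        ward: min(c.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for ward, c in counts.items()
    }


# Made-up sample records for two wards.
sample = [
    ("W001", "Smith"), ("W001", "Jones"), ("W001", "smith"),
    ("W002", "Patel"), ("W002", "Khan"), ("W002", "Patel"),
]
top = most_common_surname_by_ward(sample)
```

With real data the same dictionary could be joined to ward boundaries to colour each ward by the origin of its top surname.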


Data analysis of this kind works best in countries with high levels of internet use. As internet use grows in less advantaged countries such as Peru, similar methods could help more traditional communities and cultures understand their naming conventions. What if we found connections between names dating back 500 years to the cultural peak of the Incas living in and around Machu Picchu and Cusco, Peru? This sort of analysis might provide insights that archaeologists never dreamed of.

Twitter Ethnicity Maps of London

We used Twitter's API to examine a million geotagged tweets sent in London during September and October 2012. We then used Onomap, created at UCL, to assign the most likely ethnicity to each name. This is the result. Click on the image to view an enlarged version. Other studies have drawn on different sources, for example Facebook advertising data in a case study, but we prefer the non-proprietary data found "in the wild" on Twitter, which gives us far larger sample sizes.


Mapping Surname Searches

Since July 2011, the Worldnames website has been archiving the IP addresses of visitors who searched for a surname. The IP addresses were later converted to their corresponding latitude/longitude values.
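Converting an IP address to coordinates usually means looking it up in a table of address ranges. The sketch below shows one way this could work, assuming a small hypothetical range table; the ranges are the blocks reserved for documentation and the coordinates are placeholders, so none of this reflects the geolocation database actually used by Worldnames.

```python
import bisect
import ipaddress

# Hypothetical lookup table: (first_ip, last_ip, latitude, longitude).
# The ranges are documentation-only blocks; the coordinates are made up.
RANGES = [
    ("192.0.2.0", "192.0.2.255", 51.5074, -0.1278),
    ("198.51.100.0", "198.51.100.255", 53.4808, -2.2426),
    ("203.0.113.0", "203.0.113.255", 55.9533, -3.1883),
]

# Store the ranges as integers, sorted by start address, for binary search.
_TABLE = sorted(
    (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), lat, lon)
    for first, last, lat, lon in RANGES
)
_STARTS = [row[0] for row in _TABLE]


def locate(ip):
    """Return (latitude, longitude) for `ip`, or None if no range covers it."""
    n = int(ipaddress.ip_address(ip))
    # Find the last range whose start address is <= n.
    i = bisect.bisect_right(_STARTS, n) - 1
    if i >= 0 and n <= _TABLE[i][1]:
        return _TABLE[i][2], _TABLE[i][3]
    return None
```

For example, `locate("192.0.2.17")` returns the coordinates of the first range, while an address outside every range returns `None`. IP-based locations are approximate, so they are better suited to regional maps than to street-level detail.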

The surname search data was then mapped to show where different surnames were searched from. Click here to view this map.


Who is engaged with new communications technologies and social media and who is not?

Geodemographics is the study of people according to the small neighbourhood areas in which they live, and a variety of data sources such as censuses, address registers and official surveys allow us to measure and summarise how population characteristics vary over space. But there is no equivalent framework for understanding the geography of who is online, or who uses social media (and where) despite the fact that such activities shape increasing amounts of our personal identities.

As a first step towards understanding the uncertainties inherent in mapping virtual identities, we have produced the interactive map that appears below. The background to the map is provided by the citizen OpenStreetMap project, and it is possible to display a conventional geodemographic classification, the Office for National Statistics Output Area Classification, against this back-cloth by clicking the 'Output Area Classification' box and sliding the bar to display the classes described in the map legend. You can pan around the map and zoom into it.

The ticked 'email data coverage' box means that the map also displays the most popular email service supplier for each Output Area in the UK, using 1.1 million records collected over the period 2002-8 and maintained by CACI Ltd. Clicking on the '2010 Population Estimates' box allows you to identify the proportion of the residents in each Output Area who had an email address registered in this database.
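The two measures described above, the most popular email supplier and the share of residents with a registered address, could be derived along the following lines. This is a minimal sketch under stated assumptions: the `(output_area, email)` record layout, the `population` dictionary and the idea that each record represents one resident are all hypothetical and not taken from the CACI data.

```python
from collections import Counter, defaultdict


def summarise_email_by_area(records, population):
    """Return {output_area: (top_domain, share_of_residents)}.

    `records` is an iterable of (output_area, email_address) pairs and
    `population` maps output_area to a resident estimate; both layouts are
    assumptions made for this example.
    """
    domains = defaultdict(Counter)
    for area, email in records:
        # Treat the part after the last "@" as the service supplier.
        domain = email.rsplit("@", 1)[-1].strip().lower()
        domains[area][domain] += 1

    summary = {}
    for area, counts in domains.items():
        # Highest count wins; ties are broken alphabetically.
        top_domain = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        registered = sum(counts.values())
        pop = population.get(area)
        # Assumes one record per resident; None if no estimate is available.
        share = registered / pop if pop else None
        summary[area] = (top_domain, share)
    return summary


# Made-up sample data for two Output Areas.
records = [
    ("OA1", "a@hotmail.com"), ("OA1", "b@Hotmail.com"), ("OA1", "c@gmail.com"),
    ("OA2", "d@yahoo.co.uk"),
]
population = {"OA1": 300, "OA2": 250}
summary = summarise_email_by_area(records, population)
```

Areas with no records are simply absent from the result, which matters when reading the map: a blank Output Area means no data, not no email users.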

This information allows us to begin to identify the geography of online behaviour, and to supplement conventional geodemographic measures of neighbourhood characteristics with indicators of online participation and behaviour. Watch this space for further developments!

OAC Legend:
     Blue Collar Communities
     City Living
     Prospering Suburbs
     Constrained by Circumstances
     Typical Traits
