PRESS RELEASE -

New machine learning algorithm can predict age and gender from just your Twitter profile

Published on
16 May 2019
  • A new “demographic inference” tool developed by academics can make predictions based solely on the information in a person’s social media profile (i.e. screen name, biography, profile photo, and name)
  • The tool—which works in 32 languages—could pave the way for views expressed on social media to be factored in to popular survey methods.

Researchers at the University of Oxford, University of Michigan, University of Massachusetts, GESIS – Leibniz Institute for the Social Sciences, the Max Planck Institute, and Stanford University have developed a method to infer information about a social media account's owner from the details disclosed in their Twitter profile.

The new machine learning system, unveiled at the Web Conference in San Francisco this week, learned the patterns associated with different ages and genders, and those that distinguish organizations from individuals, from a dataset of over four million Twitter accounts in 32 languages. The inferred demographics were then combined with estimated locations and re-weighted against census data to produce more accurate population estimates for 1,101 statistical regions across the EU.
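To illustrate the general idea of re-weighting against census data, the sketch below uses simple post-stratification, a standard survey technique. It is not necessarily the researchers' exact method. The function name, the age bands, and all counts are hypothetical and chosen only for illustration. Each demographic group receives a weight equal to its share of the census population divided by its share of the online sample, so under-represented groups count for more and over-represented groups for less.

```python
from collections import Counter


def poststratification_weights(sample_cells, census_counts):
    """Return a weight per demographic cell: census share / sample share."""
    sample_counts = Counter(sample_cells)
    n_sample = sum(sample_counts.values())
    n_census = sum(census_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        sample_share = count / n_sample
        census_share = census_counts[cell] / n_census
        weights[cell] = census_share / sample_share
    return weights


# Hypothetical inferred (gender, age band) labels for ten accounts in one region.
sample = (
    [("male", "19-29")] * 6
    + [("female", "19-29")] * 2
    + [("male", "40+")] * 1
    + [("female", "40+")] * 1
)

# Hypothetical census counts for the same region and the same cells.
census = {
    ("male", "19-29"): 200,
    ("female", "19-29"): 200,
    ("male", "40+"): 300,
    ("female", "40+"): 300,
}

weights = poststratification_weights(sample, census)
for cell, weight in sorted(weights.items()):
    print(cell, round(weight, 2))
```

In this made-up region, young men are over-represented online and receive a weight below 1, while both older groups are under-represented and receive a weight above 1.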

This could pave the way for a more representative understanding of people’s views on key societal issues and topics, based on what they post on social media and attributed to specific geographical locations and demographic groups.

Dr Scott Hale, Senior Research Fellow, Oxford Internet Institute, University of Oxford, said: “Despite providing lots of data points, social media has long been an unreliable tool for understanding what issues are most important to a wider population given how people self-select into using any one platform.

“This first study of its kind performs demographic predictions about a social media account’s owner based purely on the account’s profile information in 32 languages and then re-weights the online sample to be more similar to an offline population.

“We see this as a significant step towards using social media to get a more accurate picture on the issues and topics that most interest the public and understanding which groups’ views are over- or under-represented.”

The information and data underpinning this research have been made available in an open source library, and you can test the Twitter profile inference tool at http://www.euagendas.org/m3demo.

For more information or to request an interview, please contact Mark Malbas on 01865 287220 or email mark.malbas@oii.ox.ac.uk

Notes for editors

Publication to be released at WWW conference:

Zijian Wang, Scott Hale, David Ifeoluwa Adelani, Przemyslaw Grabowicz, Timo Hartmann, Fabian Flöck, and David Jurgens. 2019. Demographic Inference and Representative Population Estimates from Multilingual Social Media Data. In The World Wide Web Conference (WWW ’19), Ling Liu and Ryen White (Eds.). ACM, New York, NY, USA, 2056-2067. DOI: https://doi.org/10.1145/3308558.3313684.
