Tracking fluctuations in genetic and environmental influence on dynamic mood measurements via social media in young adults

PhD project (3/4 yr research project leading to independent research at the doctorate level)

Dr Oliver Davis, Claire Haworth


The rapid evolution of genotyping and sequencing technologies means that genetic variation data are becoming readily available in the large populations necessary for research into the aetiology of complex traits and disorders. Now, rather than being limited by genotyping, we are starting to be restricted by the availability of phenotypic and environmental information. To understand the dynamics of genetic influences across development and in different contexts, we must develop new approaches that will complement traditional questionnaires and clinical data to give us affordable, repeatable and detailed assessments on a scale to match our vast repositories of genetic data.

Fortunately, new digital technologies can help. Our EMBERS (Emotion Monitoring by Electronic Remote Sensing) project uses online social networks to collect high-resolution phenotypic and environmental data. See our lab website for more information.

Aims & objectives

We aim to compare mood coded from tweets with standard questionnaire data collected at the same time, to establish the most effective method for using Twitter to measure mood in young adults. These coding methods will then be used to track the changing influences of genes and environment on positive and negative mood through emerging adulthood.
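As a rough illustration of what comparing coded tweets with questionnaire data could involve, the sketch below scores each tweet against a small word list and correlates the result with a questionnaire score. It is a minimal sketch under stated assumptions, not the project's coding method: the word lists, the function names `mood_score` and `pearson`, and the scoring rule are all hypothetical.

```python
import re
from statistics import mean

# Hypothetical toy lexicons, for illustration only; a real study would use
# a validated sentiment lexicon or a trained classifier.
POSITIVE = {"happy", "great", "love", "excited"}
NEGATIVE = {"sad", "tired", "angry", "worried"}


def mood_score(tweet: str) -> float:
    """Return (positive words - negative words) / total words for one tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return cov / (sx * sy)
```

In use, each participant's tweet scores would be averaged over the questionnaire window and `pearson` applied across participants. Competing coding methods could then be compared by how strongly each one tracks the questionnaire measure.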


We have collected 5 million tweets from a sample of 2,500 young adult twins in the Twins Early Development Study (TEDS). Previous studies of online networks have typically used convenience samples. Because our sample is drawn from TEDS, we can validate our Twitter data against, and link it to, twenty years’ worth of more traditional longitudinal data on cognitive and behavioural development already collected from these participants.

Methods will include automated text analysis, machine learning and multilevel modelling to characterize the relationship between questionnaire measures and coded tweets. In addition, twin data and polygenic risk scores will be used to estimate the importance of genetic and environmental influences on dynamic measurements of mood. Emotional responses to real-world events can also be tracked using this sample, allowing us to investigate the importance of interactions between genetics and environment in shaping our resilience to, and recovery from, life events.
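To make the twin-based estimation concrete, the sketch below applies Falconer's formulas, a classical first approximation that splits trait variance into additive genetic (A), shared environmental (C) and non-shared environmental (E) components using the trait correlations in identical (MZ) and fraternal (DZ) twin pairs. It is an illustrative sketch only: the project does not say which model it will use, twin studies usually fit structural equation models instead, and the function name `falconer_ace` is hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict[str, float]:
    """Estimate A, C and E variance proportions from twin-pair correlations.

    r_mz: trait correlation within identical (monozygotic) twin pairs
    r_dz: trait correlation within fraternal (dizygotic) twin pairs
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic influence (heritability)
    c2 = 2.0 * r_dz - r_mz     # shared (family-wide) environment
    e2 = 1.0 - r_mz            # non-shared environment plus measurement error
    # Estimates are returned unclamped; values outside [0, 1] suggest that
    # the simple ACE assumptions do not hold for these correlations.
    return {"A": a2, "C": c2, "E": e2}
```

With made-up correlations of 0.6 for MZ pairs and 0.4 for DZ pairs, `falconer_ace(0.6, 0.4)` gives roughly A = 0.4, C = 0.2 and E = 0.4. Repeating the calculation on mood scores from successive time windows is one simple way to see how these proportions change over time.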


Haworth & Davis (2014). From observational to dynamic genetics. Frontiers in Genetics, 5:6. doi: 10.3389/fgene.2014.00006

Created on Nov. 7, 2016, 1:52 p.m.