Data Science Events
Yulan He: Unsupervised Event Extraction and Storyline Generation from Text
Abstract: Newsworthy events are widely scattered across not only traditional news media but also social media. In this talk, I will present a series of unsupervised Bayesian models for the extraction of structured representations of events from Twitter without the use of any labelled data. The extracted events are automatically clustered into coherent event-type groups. In addition, event extraction and visualisation are jointly modelled to allow simultaneous extraction of events and visualisation of both events and tweets. To deal with lexical variations of certain named entities, word embeddings are incorporated into model learning. Moreover, to automatically infer the number of events from the data, a Dirichlet process mixture model is used for event generation.
Related events published in temporal proximity can be linked to form a storyline, which allows a quick glimpse of how events evolve over time. I will also present two models for storyline generation from news articles. The first is a non-parametric generative model which simultaneously extracts structured representations and evolution patterns of storylines. The second is a neural storyline generation model which approximates storyline distributions using neural networks and explores the similarity between the title and the main body of a news article, as well as among temporally related news articles.
Short bio:
Yulan He is a Reader in Computer Science and the Director of the Systems Analytics Research Institute at Aston University, UK. She is experienced in statistical modelling and text mining, particularly the integration of machine learning and natural language processing for social media analysis. She has published over 150 papers on topics including sentiment analysis, information extraction, clinical text mining, recommender systems, learning analytics and spoken dialogue systems. She served as an Area Chair for ACL 2018, EMNLP 2018, NAACL 2016, EMNLP 2015, CCL 2015 and NLPCC 2015, and co-organised ECIR 2010 and PRIB 2007. She obtained her PhD in spoken language understanding from the University of Cambridge.