News
High Value Manufacturing Data Summit at The Shard, London: 9th March 2016
What are the analytic challenges holding back high value manufacturing?
How should the emerging discipline of data science – at the intersection of computer science, mathematics, statistics and systems engineering – address these challenges?
With talks from industry leaders and top data scientists, this summit will create a conversation to steer research at the Alan Turing Institute. Launched in 2015, the Alan Turing Institute is charged with advancing data science research, which includes industrial collaborations that will ultimately produce societal and economic impact.
Full details of the summit, including the list of confirmed speakers, can be found by following the link.
To register your interest for your complimentary place, please email the summit organiser.
Professor Chenlei Leng chairs the RSC of the RSS
Professor Chenlei Leng has been appointed as chair of the Research Section Committee (RSC) of the Royal Statistical Society (RSS), with effect from 1 January 2016. He is the second member of Warwick academic staff to chair the RSC; Professor David Firth chaired the committee on two occasions previously, from 2001 to 2003 and in 2009.
The RSC is responsible for promoting the theory of statistics and the development and applications of statistical methods, and in particular handles papers submitted to the Journal of the Royal Statistical Society Series B as discussion papers.
PhD Open Day on 25th November 2015
There will be an Open Day for potential PhD and OxWaSP applicants on Wednesday 25th November.
The event will commence at 2pm in the Statistics Common Room (Zeeman Building).
Details of the agenda for the day can be found by following the link.
Warwick researcher develops effective method for diagnosing diabetic retinopathy
The ground-breaking work of Ben, a WDSI member who works in Statistics and Complexity, is featured in a recent Economist article (19 September 2015) about the success of machine-learning approaches to rapid diagnosis of a common disease from retinal images.
Ben's work made him the global winner of a recent Kaggle competition on this machine-diagnosis problem. (This was not Ben's first such success: in 2013 he won a similar competition organised by ICDAR, with a novel method for recognition of online Chinese handwriting.)
To explain in a bit more detail what he did, Ben writes:
Kaggle obtained pairs of retinal images from Eyepacs/the California Healthcare Foundation from about 44,000 people at risk of diabetic retinopathy. Each of the 88,000 images was graded by a human expert on a scale from 0 to 4; most of the images were healthy zeros. Elevated scores indicate a risk of vision loss. The scores were made available for about 18,000 people to form a training set, with the rest of the scores kept secret to form a test set. The images varied substantially in quality, from in-focus 3000x3000 pixel images to images that were completely blank. Most of the images were of fairly high quality.
I trained a convolutional neural network (or three) to classify the images. To try to boost accuracy, I used the classifications for each left-right pair of images to produce the final per-eye classifications, combining scores with a random forest.
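Ben's exact pipeline isn't described in detail here, but the pair-combination idea can be sketched roughly: feed each eye's CNN-derived severity score, together with its partner eye's score, into a random forest, since disease severity is correlated between a patient's two eyes. The scores below are synthetic stand-ins for real CNN outputs, and the toy labels are derived from them purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical per-image CNN severity scores (0-4 scale), one per eye,
# for 200 patients. These stand in for real network outputs.
left_scores = rng.uniform(0, 4, size=(200, 1))
right_scores = rng.uniform(0, 4, size=(200, 1))

# Features for the left eye combine its own score with the right eye's,
# exploiting the correlation between a patient's two eyes.
X_left = np.hstack([left_scores, right_scores])
y_left = np.rint(left_scores.ravel()).astype(int)  # toy grade labels

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_left, y_left)
pred = clf.predict(X_left)
```

The same construction, mirrored, would produce the right-eye predictions; in a real pipeline the labels would come from the human graders, not from the scores themselves.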
The quadratically weighted kappa agreement between my program and the human graders was 0.850. The quadratically weighted kappa agreement between the first and second place computer programs was 0.933, substantially higher. There are two possible explanations:
- The computers are making systematic errors, and totally missing information available to the human graders. This would be the case if the training set is too small to contain a full range of relevant symptoms.
- The human graders make mistakes, and the computers have learnt to classify the images more accurately than humans.

The truth is likely to be a mix of the two.
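For readers unfamiliar with the metric, quadratically weighted kappa measures agreement between two raters on an ordinal scale, penalising disagreements by the squared distance between the grades and correcting for chance agreement. A minimal implementation, assuming grades are integers 0 to 4 as in the competition:

```python
import numpy as np

def quadratic_weighted_kappa(a, b, n_classes=5):
    """Agreement between two ordinal ratings, 1 = perfect, 0 = chance."""
    # Observed agreement matrix (confusion matrix of the two raters).
    O = np.zeros((n_classes, n_classes))
    for i, j in zip(a, b):
        O[i, j] += 1
    # Quadratic weights: penalty grows with squared grade distance.
    idx = np.arange(n_classes)
    W = (idx[:, None] - idx[None, :]) ** 2 / (n_classes - 1) ** 2
    # Expected matrix if the two raters were independent.
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()
```

Perfect agreement gives 1.0; systematic maximal disagreement can go as low as -1.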
The example images below are of a healthy retina (first image) and a high-risk case (second image).