Archive for April, 2016


A small model of ArXiv abstracts

April 12, 2016

I’ve been working with Dorota Glowacka of University of Helsinki on a search system built on the  We have a demo appearing in SIGIR 2016.  Here is a model with 100 topics built using normalised gamma priors on topics (giving each topic a variance parameter as well as a mean parameter) on the 1.1 million abstracts to March 2016.  The model took about 6 hours to run on my desktop.

This is a huge PNG file (2.8 MB). You will not be able to view it unless you:

  • load it up on a big screen,
  • click on the image to enter image view mode,
  • scroll down to the bottom right and click on “View full size” to bring it up,
  • and then zoom around to view it.

Talk at Data Science Meetup

April 4, 2016

Today I’ll be giving a version of my “document analysis” grand tour talk to the Data Science Meetup in Melbourne. The slides for the talk in PDF are here. I also did a smaller version of one of my new graphics, this one on obesity. It needs to be readable by general viewers some distance away, so everything must appear larger. The standard ones need a really big screen, or you need to be up close!