Posts Tagged ‘ICML’


Reviewing in Machine Learning

June 11, 2017

A common subject for mutual commiseration in the community is the quality of reviewing. In huge and specialised conferences like NIPS and ICML, there are so many papers and so many reviewers that the match-up between reviewers and papers is generally quite good, as good as or better than for a journal article. In smaller conferences, like ACML, and for grant applications in relatively small places like Australia (e.g., the ARC), the match-up can be a lot poorer. This causes reviewer misunderstandings.

Of course, one needs to be aware of The Great NIPS Reviewing Experiment of 2014.  This is a grand applied statistical experiment that only machine learning folks could think of 😉  I’ll just mention this because it is important to understand that the reviewing process is very challenging, and we as a community are trying our hardest.

Now, I think it’s very reasonable for some reviewers to not be specialists in the subject of the paper, merely “knowledgeable”. After all, we would like the paper to be readable by more than just the 20 people who focus on that very specific topic. These non-specialist reviewers generally flag themselves, so the meta-reviewers and authors know to take their comments with a (respectful) grain of salt. But they can still be excellent in related and broadly applicable areas like experimental methodology and mathematical definitions of models, so they are still an important part of the reviewing ecosystem. This works when reviewers know their limitations. Unfortunately, reviewers don’t always do so.

But I still find general aspects of reviewing enlightening.

A case in point is our recent ICML 2017 paper “Leveraging Node Attributes for Incomplete Relational Data”. Two reviewers said strong accept and one a mild reject. For the would-be rejecter, the method was too simple. We knew this paper was not full of the usual theoretical complexities expected of an ICML one, of course, so we made sure the experimental work was rock solid. It was a risk submitting to ICML anyway, as anyone with experience knows the experimental work at ICML can be patchy; it’s not something generally looked for by reviewers. If you want quality experimental work in machine learning, go to the knowledge discovery conferences like SIGKDD, certainly not a machine learning conference!

We submitted the paper to ICML because this simple method beat all previous work handily, in predictive performance, in speed, or in both. Simplicity, it seems, has its advantages, and people should find out about it when it happens. But if it was so damn simple, why didn’t someone try it already? (In truth, it wasn’t that simple.) And given it works so much better, shouldn’t people find out that for this problem all the ICML-ish model complexity of previous methods was unnecessary? 😉 Now we did add a tricky hierarchical part to our otherwise simple model, just to appease the “meta is better” crowd, and we’re now busy trying to figure out how to add a novel stochastic process (something I love to do).

But unnecessary complexity is something I’m not a big fan of. My favorite example of this is papers starting off with 2 pages of stochastic process theory before, finally, getting to the model and implementation. But the model they implement is a truncated one: it is completely finite and requires no stochastic process theory to analyse in any way. In a longer journal format, linking the truncated version with the full stochastic process theory is important to round off the treatment. In a short format paper with considerable experimental work, details of Lévy processes are unnecessary if real non-parametric methods are not actually used in the algorithmic theory.
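To make the point about truncation concrete, here is a minimal sketch of a truncated stick-breaking prior over mixture weights. It is only an illustration, not code from any of the papers I have in mind; the function name and the parameters `alpha` (concentration) and `K` (truncation level) are assumptions chosen for the example. Once you truncate at `K`, all that is left is `K - 1` ordinary Beta draws and some multiplication.

```python
import random


def truncated_stick_breaking(alpha, K, rng):
    """Draw K mixture weights from a stick-breaking prior truncated at K sticks.

    alpha and K are illustrative parameters, not taken from any specific paper.
    """
    weights = []
    remaining = 1.0  # length of stick not yet broken off
    for _ in range(K - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # the last component takes whatever mass is left
    return weights


rng = random.Random(0)
w = truncated_stick_breaking(alpha=2.0, K=10, rng=rng)
print(len(w), sum(w))
```

The result is just a finite vector of non-negative weights that sums to one (up to floating-point rounding), so any inference built on it can be analysed as a plain finite model.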


ICML 2017 paper: Leveraging Node Attributes for Incomplete Relational Data

May 19, 2017

Here is a paper with Ethan Zhao and Lan Du, both of Monash, that we’ll present in Sydney.

Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction. This paper presents a Bayesian probabilistic approach that incorporates various kinds of node attributes encoded in binary form in relational models with Poisson likelihood. Our method works flexibly with both directed and undirected relational networks. The inference can be done by efficient Gibbs sampling which leverages sparsity of both networks and node attributes. Extensive experiments show that our models achieve the state-of-the-art link prediction results, especially with highly incomplete relational data.

As usual, the reviews were entertaining, and there were some interesting results we didn’t get into the paper. It’s always enlightening doing comparative experiments.