(subjective, to be confirmed)
Tutorials
- Langford, Structured prediction. http://www.hunch.net/~l2s/
Invited talk: Bottou +++
- Two learning modules in interaction (one explores, one classifies): misleading effects (the exploration results look bad on average, so one concludes there is no need to explore and that we were right the first time)
- Testing must be reconsidered (consider the tails of the distribution and the coverage)
Papers
- Unsupervised Domain Adaptation by Backpropagation Yaroslav Ganin, Victor Lempitsky
- two objectives on the features: be discriminative wrt the class; not discriminative wrt source/target (a gradient-reversal sketch follows after this list)
- Learning Transferable Features with Deep Adaptation Networks Mingsheng Long, Yue Cao, Jianmin Wang, Michael Jordan
- related to the previous + kernels.
- Strongly Adaptive Online Learning Amit Daniely, Alon Gonen, Shai Shalev-Shwartz
- Online and adaptive weights on experts + intervals + doubling trick
- Adaptive Belief Propagation Georgios Papachristoudis, John Fisher
- Question for me: why not consider several spanning trees...
- Weight Uncertainty in Neural Network Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra
- each weight is a Gaussian; Bayes by Backprop (cf. Graves).
- Gradient-based Hyperparameter Optimization through Reversible Learning, Dougal Maclaurin, David Duvenaud, Ryan Adams
- derivative of misclassification wrt hyper-parameters.
- On Symmetric and Asymmetric LSHs for Inner Product Search Behnam Neyshabur, Nathan Srebro
- Different random projections for queries and for data points.
- The Ladder: A Reliable Leaderboard for Machine Learning Competitions Avrim Blum, Moritz Hardt
- validation set, test set, multiple trials.
- Learning to Search Better than Your Teacher Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daume, John Langford
- ??
- Learning Fast-Mixing Models for Structured Prediction Jacob Steinhardt, Percy Liang
- ??
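For the Ganin & Lempitsky entry above, a minimal PyTorch sketch (my own toy code, not the authors' implementation; the layer sizes, loss weighting and data are made up) of the gradient-reversal idea: the domain classifier's gradient is flipped before reaching the shared features, so the features become discriminative for the class label but not for the source/target domain.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the incoming gradient by -lam backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

features = nn.Sequential(nn.Linear(20, 64), nn.ReLU())   # shared feature extractor
label_head = nn.Linear(64, 10)                            # class predictor (source labels)
domain_head = nn.Linear(64, 2)                            # source/target discriminator

opt = torch.optim.SGD(list(features.parameters()) +
                      list(label_head.parameters()) +
                      list(domain_head.parameters()), lr=0.01)

x = torch.randn(32, 20)                 # toy batch (source + target mixed)
y_class = torch.randint(0, 10, (32,))   # class labels (only meaningful for source rows)
y_domain = torch.randint(0, 2, (32,))   # 0 = source, 1 = target

h = features(x)
class_loss = nn.functional.cross_entropy(label_head(h), y_class)
domain_loss = nn.functional.cross_entropy(domain_head(GradReverse.apply(h, 1.0)), y_domain)

opt.zero_grad()
(class_loss + domain_loss).backward()   # reversed gradient pushes features to be domain-invariant
opt.step()
```

In a real run the reversal strength lam is typically annealed from 0 to 1 and the class loss restricted to labelled source samples; the sketch applies both losses to the whole toy batch for brevity.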
Papers where I think one could do otherwise
- On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments Yifan Wu, Andras Gyorgy, Csaba Szepesvari
- MCTS ?
Papers where I must have missed something (because otherwise, ...)
- Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection Julie Nutini, Mark Schmidt, Issam Laradji, Michael Friedlander, Hoyt Koepke
- not compared to BFGS!!! (a small sketch of the Gauss-Southwell rule follows below)
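For reference, a toy sketch (mine, not from the paper) of the rule under discussion: at each step the Gauss-Southwell rule updates the coordinate with the largest gradient magnitude, versus a coordinate chosen at random, here on a simple strictly convex quadratic.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((50, 50))
A = G @ G.T + 50 * np.eye(50)          # positive definite quadratic f(x) = 0.5 x'Ax - b'x
b = rng.standard_normal(50)

def coordinate_descent(rule, iters=500):
    x = np.zeros(50)
    for _ in range(iters):
        g = A @ x - b                                  # full gradient of the quadratic
        i = np.argmax(np.abs(g)) if rule == "gauss-southwell" else rng.integers(50)
        x[i] -= g[i] / A[i, i]                         # exact minimization along coordinate i
    return 0.5 * x @ A @ x - b @ x                     # final objective value

print("Gauss-Southwell:", coordinate_descent("gauss-southwell"))
print("random        :", coordinate_descent("random"))
```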
Old papers I had missed
- Label-Embedding for Attribute-Based Classification (embarrassingly simple).
What might be of interest to...
Cécile: (related to Dawei's thesis).
On Identifying Good Options under Combinatorially Structured Feedback in Finite Noisy Environments
Yifan Wu, Andras Gyorgy, Csaba Szepesvari
Top 100 (and a bit more) of ICML 2015
Bibliography automatically generated with Mendeley.
There are a few errors in the bibliography, notably in the dates, but the paper titles and author names should be correct.
ICML2015.pdf
Seminar, October 14th, 14h00-16h00
Talks
From Word Embeddings To Document Distances, Matt Kusner, Yu Sun, Nicholas Kolkin, Kilian Weinberger
presented by Gregory Grefenstette
abstract:
We present the Word Mover’s Distance (WMD), a novel distance function between text documents. Our work is based on recent results in word embeddings that learn semantically meaningful representations for words from local co-occurrences in sentences. The WMD distance measures the dissimilarity between two text documents as the minimum amount of distance that the embedded words of one document need to “travel” to reach the embedded words of another document. We show that this distance metric can be cast as an instance of the Earth Mover’s Distance, a well studied transportation problem for which several highly efficient solvers have been developed. Our metric has no hyperparameters and is straight-forward to implement. Further, we demonstrate on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the WMD metric leads to unprecedented low k-nearest neighbor document classification error rates.
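A minimal sketch of the idea (toy two-dimensional embeddings and a tiny vocabulary of my own; in practice one would use pretrained word2vec vectors and a dedicated EMD solver rather than a generic LP call): each document becomes a normalized bag of embedded words, and WMD is the minimum-cost transport of one bag onto the other.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

# toy word embeddings (in practice: pretrained word2vec vectors)
emb = {"obama": np.array([1.0, 0.1]), "president": np.array([0.9, 0.2]),
       "speaks": np.array([0.1, 1.0]), "greets": np.array([0.2, 0.9]),
       "media": np.array([0.5, 0.5]), "press": np.array([0.55, 0.45])}

def wmd(doc1, doc2):
    w1, w2 = sorted(set(doc1)), sorted(set(doc2))
    d1 = np.array([doc1.count(w) for w in w1], float); d1 /= d1.sum()  # nBOW weights
    d2 = np.array([doc2.count(w) for w in w2], float); d2 /= d2.sum()
    C = cdist([emb[w] for w in w1], [emb[w] for w in w2])              # word travel costs
    n, m = C.shape
    # transportation LP: ship weights d1 onto d2 at minimal total embedded-word travel cost
    A_eq, b_eq = [], []
    for i in range(n):                      # each source word sends exactly its weight
        row = np.zeros((n, m)); row[i, :] = 1.0
        A_eq.append(row.ravel()); b_eq.append(d1[i])
    for j in range(m - 1):                  # each target word receives exactly its weight
        col = np.zeros((n, m)); col[:, j] = 1.0   # (last column constraint is implied, dropped)
        A_eq.append(col.ravel()); b_eq.append(d2[j])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun

print(wmd(["obama", "speaks", "media"], ["president", "greets", "press"]))
```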
--
Weight Uncertainty in Neural Network, Charles Blundell, Julien Cornebise, Koray Kavukcuoglu, Daan Wierstra
presented by Gaetan Marceau-Caron
abstract:
We introduce a new, efficient, principled and backpropagation-compatible algorithm for learning a probability distribution on the weights of a neural network, called Bayes by Backprop. It regularises the weights by minimising a compression cost, known as the variational free energy or the expected lower bound on the marginal likelihood. We show that this principled kind of regularisation yields comparable performance to dropout on MNIST classification. We then demonstrate how the learnt uncertainty in the weights can be used to improve generalisation in non-linear regression problems, and how this weight uncertainty can be used to drive the exploration-exploitation trade-off in reinforcement learning.
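A heavily condensed sketch of the idea in PyTorch (mine, reduced to a single Gaussian weight on a toy regression; the prior, noise level and optimizer settings are arbitrary): the weight's variational mean and scale are trained by backpropagation through a reparameterized sample, minimizing the variational free energy KL(q || prior) minus the expected log-likelihood.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 64).unsqueeze(1)          # toy 1-d regression data
y = 2.0 * x + 0.1 * torch.randn_like(x)

mu = torch.zeros(1, 1, requires_grad=True)          # variational mean of the single weight
rho = torch.full((1, 1), -3.0, requires_grad=True)  # sigma = softplus(rho) > 0
opt = torch.optim.Adam([mu, rho], lr=0.05)
prior_sigma = 1.0

for step in range(500):
    sigma = torch.nn.functional.softplus(rho)
    w = mu + sigma * torch.randn_like(mu)            # reparameterized sample w ~ q(w)
    nll = ((x @ w - y) ** 2).sum() / (2 * 0.1 ** 2)  # Gaussian likelihood, known noise 0.1
    kl = (torch.log(prior_sigma / sigma)             # KL(N(mu, sigma^2) || N(0, prior_sigma^2))
          + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2) - 0.5).sum()
    loss = nll + kl                                  # variational free energy (single sample)
    opt.zero_grad(); loss.backward(); opt.step()

print("posterior mean %.2f, std %.3f" % (mu.item(), torch.nn.functional.softplus(rho).item()))
```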
--
The Kendall and Mallows Kernels for Permutations, Yunlong Jiao, Jean-Philippe Vert
presented by Guillaume Charpiat
abstract:
We show that the widely used Kendall tau correlation coefficient is a positive definite kernel for permutations. It offers a computationally attractive alternative to more complex kernels on the symmetric group to learn from rankings, or to learn to rank. We show how to extend it to partial rankings or rankings with uncertainty, and demonstrate promising results on high-dimensional classification problems in biomedical applications.
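A small sketch of how such a result is typically used (the data, labels and the SVM step are illustrative assumptions, not from the paper): since the Kendall tau correlation is positive definite on permutations, a Gram matrix of tau values can be passed directly to a precomputed-kernel method.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.svm import SVC

rng = np.random.default_rng(0)
perms = np.array([rng.permutation(8) for _ in range(40)])   # toy rankings of 8 items
labels = (perms[:, 0] < 4).astype(int)                      # toy binary labels

# Gram matrix of Kendall tau correlations between rankings (a valid kernel per the paper)
gram = np.array([[kendalltau(p, q)[0] for q in perms] for p in perms])

clf = SVC(kernel="precomputed").fit(gram, labels)
print("training accuracy:", clf.score(gram, labels))
```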