The 5 requested papers
ESs are robust to noise (Algorithmica 2010)
Jebalia, M., Auger, A. and Hansen, N. Log-linear convergence and divergence of the scale-invariant (1+1)-ES in noisy environments. In Algorithmica. To appear in 2010.
- PDF (draft): http://www.lri.fr/~hansen/AlgorithmicaPaper.pdf
- Abstract: Noise is present in many real-world continuous optimization problems. Stochastic search algorithms such as Evolution Strategies (ESs) have been proposed as effective search methods in such contexts. In this paper, we provide a mathematical analysis of the convergence of a (1+1)-ES on unimodal spherical objective functions in the presence of noise. We prove for a multiplicative noise model that for a positive expected value of the noisy objective function, convergence or divergence happens depending on the infimum of the support of the noise. Moreover, we investigate convergence rates and show that log-linear convergence is preserved in the presence of noise. This result is a strong theoretical foundation for the robustness of ESs with respect to noise.
Statistical Learning Theory to obtain lower bounds for comparison-based EAs (Algorithmica 2010)
Fournier, H. and Teytaud, O. Lower Bounds for Comparison Based Evolution Strategies using VC-dimension and Sign Patterns. In Algorithmica. To appear in 2010.
- PDF (draft): http://hal.inria.fr/docs/00/45/27/91/PDF/evolution.pdf
- Abstract: We derive lower bounds on the convergence rate of comparison based or selection based algorithms, improving existing results in the continuous setting, and extending them to non-trivial results in the discrete case. This is achieved by considering the VC-dimension of the level sets of the fitness functions; results are then obtained through the use of the shatter function lemma. In the special case of optimization of the sphere function, improved lower bounds are obtained by an argument based on the number of sign patterns.
Affinity propagation for data streaming (KDD 2009)
Zhang, X., Furtlehner, C., Perez, J., Germain, C. and Sebag, M. Toward Autonomic Grids: Analysing the Job Flow with Affinity Streaming. In: 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD): p. 987-995. Paris, France. 2009.
- PDF: http://hal.inria.fr/docs/00/39/38/25/PDF/sub_sigkdd_v3.pdf
- Abstract: The Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007) provides an understandable, nearly optimal summary of a dataset, albeit with quadratic computational complexity. This paper, motivated by Autonomic Computing, extends AP to the data streaming framework. Firstly, a hierarchical strategy is used to reduce the complexity to O(N^(1+eps)); the distortion loss incurred is analysed in relation to the dimension of the data items. Secondly, a coupling with a change detection test is used to cope with non-stationary data distributions and rebuild the model as needed. The presented approach, Strap, is applied to the stream of jobs submitted to the EGEE grid, providing an understandable description of the job flow and enabling the system administrator to spot online some sources of failures.
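For illustration, the change-detection ingredient can be sketched with a standard Page-Hinkley test for an upward mean shift (shown here as a generic example; the paper's exact test and thresholds may differ, and the `delta`/`lam` values are illustrative):

```python
def page_hinkley(stream, delta=0.1, lam=5.0):
    """Return the index at which an upward mean shift is flagged, or None.

    delta: tolerated drift per step; lam: detection threshold."""
    mean, m, m_min = 0.0, 0.0, 0.0
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t      # running mean of the stream
        m += x - mean - delta       # cumulative positive deviation
        m_min = min(m_min, m)
        if m - m_min > lam:         # alarm: the distribution has drifted up
            return t - 1
    return None

# A stream whose mean jumps from 0 to 5 at index 100 is caught shortly after:
stream = [0.0] * 100 + [5.0] * 50
```

In a streaming clusterer, such an alarm is the trigger to discard or rebuild the current model, which is the role the change-detection test plays in the paper.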
Monte-Carlo Tree Search for Active Learning (ECML 2009)
Rolet, P., Sebag, M. and Teytaud, O. Boosting Active Learning to Optimality: a Tractable Monte-Carlo, Billiard-based Algorithm. In: ECML: p. 302-317. Bled, Slovenia. 2009.
- PDF: http://hal.inria.fr/docs/00/43/38/66/PDF/BALO.pdf
- Abstract: This paper focuses on Active Learning with a limited number of queries; in application domains such as Numerical Engineering, the size of the training set might be limited to a few dozen or hundred examples due to computational constraints. Active Learning under bounded resources is formalized as a finite horizon Reinforcement Learning problem, where the sampling strategy aims at minimizing the expectation of the generalization error. A tractable approximation of the optimal (intractable) policy is presented: the Bandit-based Active Learner (BAAL) algorithm. Viewing Active Learning as a single-player game, BAAL combines UCT, the tree-structured multi-armed bandit algorithm proposed by Kocsis and Szepesvári (2006), and billiard algorithms. A proof of principle of the approach demonstrates its good empirical convergence toward an optimal policy and its ability to incorporate prior AL criteria. Its hybridization with the Query-by-Committee approach is found to improve on both stand-alone BAAL and stand-alone QbC.
Bandit Algorithms for Operator selection: comparison-based rewards (GECCO 2010)
Fialho, A., Schoenauer, M. and Sebag, M. Toward Comparison-based Adaptive Operator Selection. In J. Branke et al., eds.: Genetic and Evolutionary Computation Conference (GECCO), ACM Press. July 2010. To appear.
- PDF (submitted version): http://www.lri.fr/~marc/Papers/banditGECCO10.pdf
- Abstract: Adaptive Operator Selection (AOS) turns the impact of applying variation operators into Operator Selection through a Credit Assignment mechanism. However, most Credit Assignment mechanisms make direct use of the fitness gain between parent and offspring. A first issue is that Operator Selection based on such Credit Assignment is likely to be highly dependent on the a priori unknown bounds of the fitness function. Additionally, those bounds are likely to change along evolution, as fitness gains tend to get smaller as convergence occurs. Furthermore, and maybe more importantly, a fitness-based credit assignment forbids any invariance under monotonic transformations of the fitness, which is a source of robustness for comparison-based Evolutionary Algorithms. In this context, this paper proposes two new Credit Assignment mechanisms, one inspired by the Area Under the Curve paradigm, and the other close to the Sum of Ranks. Using fitness improvement as raw reward, and directly coupled to a Multi-Armed Bandit Operator Selection rule, the resulting AOS obtains very good performance on both the OneMax problem and some artificial scenarios, while demonstrating robustness with respect to hyperparameter settings and fitness transformations. Furthermore, using fitness ranks as raw reward results in a fully comparison-based AOS with reasonable performance.
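For illustration, here is a toy sketch of bandit-based operator selection using plain UCB1. Note the simplifications: the paper's AUC and Sum-of-Ranks credit assignment mechanisms are replaced by a bare 0/1 "offspring improved" reward, and the per-operator improvement probabilities are hypothetical:

```python
import math
import random

def ucb1_operator_selection(success_rates, steps=500, c=2.0, seed=0):
    # UCB1 over variation operators: balance exploiting the operator with
    # the best empirical reward against exploring rarely tried ones.
    rng = random.Random(seed)
    k = len(success_rates)
    counts, sums = [0] * k, [0.0] * k
    for t in range(1, steps + 1):
        if t <= k:
            op = t - 1                          # play each operator once
        else:
            op = max(range(k),
                     key=lambda i: sums[i] / counts[i]
                     + math.sqrt(c * math.log(t) / counts[i]))
        counts[op] += 1
        # Simulated credit: 1 if the operator produced an improvement.
        sums[op] += 1.0 if rng.random() < success_rates[op] else 0.0
    return counts

# Two hypothetical operators improving offspring 70% vs. 20% of the time:
counts = ucb1_operator_selection([0.7, 0.2])
```

The bandit quickly concentrates its trials on the more useful operator; the paper's contribution is to feed such a bandit with rank-based rewards so the whole loop stays invariant under monotonic fitness transformations.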
A few more papers we consider important
Monte-Carlo Tree Search for code optimization (ICML 2009)
De Mesmay, F., Rimmel, A., Voronenko, Y. and Püschel, M. Bandit-Based Optimization on Graphs with Application to Library Performance Tuning. In: ICML. Montréal, Canada. 2009.
- PDF: http://hal.inria.fr/docs/00/37/95/23/PDF/icmlspiral.pdf
- Abstract: The problem of choosing fast implementations for a class of recursive algorithms such as the fast Fourier transforms can be formulated as an optimization problem over the language generated by a suitably defined grammar. We propose a novel algorithm that solves this problem by reducing it to maximizing an objective function over the sinks of a directed acyclic graph. This algorithm evaluates nodes using Monte-Carlo and grows a subgraph in the most promising directions by considering local maximum k-armed bandits. When used inside an adaptive linear transform library, it cuts down the search time by an order of magnitude compared to the existing algorithm. In some cases, the performance of the implementations found is also increased by up to 10%, which is of considerable practical importance since it consequently improves the performance of all applications using the library.
Improving CMA-ES for large-scale problems (Machine Learning 2009)
Suttorp, T., Hansen, N. and Igel, C. Efficient Covariance Matrix Update for Variable Metric Evolution Strategies. In: Machine Learning, Vol. 72(2) : p. 167–197. 2009.
- PDF: http://www.lri.fr/~hansen/SuttorpEtAl.pdf
- Abstract: Randomized direct search algorithms for continuous domains, such as evolution strategies, are basic tools in machine learning. They are especially needed when the gradient of an objective function (e.g., loss, energy, or reward function) cannot be computed or estimated efficiently. Application areas include supervised and reinforcement learning as well as model selection. These randomized search strategies often rely on normally distributed additive variations of candidate solutions. In order to efficiently search in non-separable and ill-conditioned landscapes the covariance matrix of the normal distribution must be adapted, amounting to a variable metric method. Consequently, covariance matrix adaptation (CMA) is considered state-of-the-art in evolution strategies. In order to sample the normal distribution, the adapted covariance matrix needs to be decomposed, requiring in general O(n^3) operations, where n is the search space dimension. We propose a new update mechanism which can replace a rank-one covariance matrix update and the computationally expensive decomposition of the covariance matrix. The newly developed update rule reduces the computational complexity of the rank-one covariance matrix adaptation to O(n^2) without resorting to outdated distributions. We derive new versions of the elitist covariance matrix adaptation evolution strategy (CMA-ES) and the multi-objective CMA-ES. These algorithms are equivalent to the original procedures except that the update step for the variable metric distribution scales better in the problem dimension. We also introduce a simplified variant of the non-elitist CMA-ES with the incremental covariance matrix update and investigate its performance. Apart from the reduced time-complexity of the distribution update, the algebraic computations involved in all new algorithms are simpler compared to the original versions.
The new update rule improves the performance of the CMA-ES for large scale machine learning problems in which the objective function can be evaluated fast.
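For illustration, the core trick can be sketched as a rank-one update applied directly to a matrix square root of the covariance, so that sampling never requires re-decomposing C. This is a simplified sketch: it uses a generic linear solve for brevity, whereas the paper also maintains the inverse factor incrementally to keep every step O(n^2):

```python
import numpy as np

def rank_one_factor_update(A, v, c):
    """Return A' such that A' A'^T = (1 - c) A A^T + c v v^T.

    A is any invertible square root of the covariance C = A A^T;
    the update touches A only through a solve and an outer product,
    avoiding the O(n^3) decomposition of the updated covariance."""
    w = np.linalg.solve(A, v)            # w = A^{-1} v
    w2 = float(w @ w)
    alpha = np.sqrt(1.0 - c)
    beta = alpha / w2 * (np.sqrt(1.0 + c / (1.0 - c) * w2) - 1.0)
    # Note: the updated factor need not stay triangular; it is still
    # a valid square root of the new covariance matrix.
    return alpha * A + beta * np.outer(v, w)

# Sanity check against the explicit covariance update:
rng = np.random.default_rng(0)
A = np.tril(rng.standard_normal((5, 5))) + 5.0 * np.eye(5)  # a Cholesky-like factor
v = rng.standard_normal(5)
updated = rank_one_factor_update(A, v, c=0.3)
```

One can verify algebraically (expanding A' A'^T and using A w = v) that the two cross terms and the squared term sum exactly to c v v^T, which is what makes the shortcut exact rather than approximate.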
Theory of Hypervolume-based Multi-objective Evolutionary Algorithms (FOGA 2009)
A. Auger, J. Bader, D. Brockhoff, and E. Zitzler, Theory of the Hypervolume Indicator: Optimal mu-Distributions and the Choice of the Reference Point. In Foundations of Genetic Algorithms (FOGA 2009), pages 87-102. ACM, 2009.
- PDF: http://www.lri.fr/~auger/abbz2009a.pdf
- Abstract: The hypervolume indicator is a set measure used in evolutionary multiobjective optimization to evaluate the performance of search algorithms and to guide the search. Multiobjective evolutionary algorithms using the hypervolume indicator transform multiobjective problems into single objective ones by searching for a finite set of solutions maximizing the corresponding hypervolume indicator. In this paper, we theoretically investigate how those optimal µ-distributions—finite sets of µ solutions maximizing the hypervolume indicator—are spread over the Pareto front of biobjective problems. This problem is of high importance for practical applications as these sets characterize the preferences that the hypervolume indicator encodes, i.e., which types of Pareto set approximations are favored. In particular, we tackle the question of whether the hypervolume indicator is biased towards certain regions. For linear fronts we prove that the distribution is uniform with constant distance between two consecutive points. For general fronts where it is presumably impossible to characterize exactly the distribution, we derive a limit result when the number of points grows to infinity, proving that the empirical density of points converges to a density proportional to the square root of the negative of the derivative of the front. Our analyses show that it is not the shape of the Pareto front but only its slope that determines how the points that maximize the hypervolume indicator are distributed. Experimental results illustrate that the limit density is a good approximation of the empirical density for small µ. Furthermore, we analyze the issue of where to place the reference point of the indicator such that the extremes of the front can be found if the hypervolume indicator is optimized. We derive an explicit lower bound (possibly infinite) ensuring the presence of the extremes in the optimal distribution. This result contradicts the common belief that the reference point has to be chosen close to the nadir point: for certain types of fronts, we show that no finite reference point allows the extremes to be included in the optimal µ-distribution.
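The linear-front result can be checked numerically with a small biobjective hypervolume computation (minimization of both objectives; the reference point and the point placements below are illustrative):

```python
def hypervolume(points, ref=(1.1, 1.1)):
    # 2-D hypervolume (minimization): points are mutually non-dominated
    # (f1, f2) pairs; sweep them in increasing f1 and sum the slabs
    # dominated by each point but by none of its predecessors.
    pts = sorted(points)
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

# Linear Pareto front f2 = 1 - f1: uniform spacing beats a clumped placement.
uniform = [(x, 1.0 - x) for x in (0.0, 0.5, 1.0)]
clumped = [(x, 1.0 - x) for x in (0.0, 0.2, 1.0)]
```

With the same extreme points, the equally spaced set covers a strictly larger dominated area, in line with the proved uniformity of optimal µ-distributions on linear fronts.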
Reinforcement Learning for Autonomic Grids (IEEE ICAC 2008)
Perez, J., Germain-Renaud, C., Kégl, B. and Loomis, C. Utility-based Reinforcement Learning for Reactive Grids. In: The 5th IEEE International Conference on Autonomic Computing. Chicago, United States. 2008.
- PDF: http://hal.inria.fr/docs/00/28/73/54/PDF/RLICAC08.pdf
- Abstract: Large scale production grids are an important case for autonomic computing. They follow a mutualization paradigm: decision-making (human or automatic) is distributed and largely independent, and, at the same time, it must implement the high-level goals of the grid management. This paper deals with the scheduling problem with two partially conflicting goals: fair-share and Quality of Service (QoS). Fair sharing is a well-known issue motivated by return on investment for participating institutions. Differentiated QoS has emerged as an important and unexpected requirement in the current usage of production grids.
The combination of utility functions and reinforcement learning (RL) provides a general and efficient method for dynamically allocating grid resources in order to satisfy both end users with differentiated requirements and participating institutions. Combining RL methods and utility functions for resource allocation was pioneered by Tesauro and Vengerov. While the application contexts are different, the resource allocation issues are very similar. The main difference in our work is that we consider a multi-criteria optimization problem that includes a fair-share objective. A first contribution of our work is the definition of a set of variables describing states and actions that allows us to formulate the grid scheduling problem as a continuous action-state space reinforcement learning problem. To capture the immediate goals of end users and the long-term objectives of administrators, we propose automatically derived utility functions. Finally, our experimental results on a synthetic workload and a real EGEE trace show that RL clearly outperforms the classical schedulers, so it is a realistic alternative to empirical scheduler design.
Embodied Evolution (EA'09)
Bredèche, N., Haasdijk, E. and Eiben, A.E. On-line, On-board Evolution of Robot Controllers. In: Selected Papers from the 9th international conference on Artificial Evolution (Evolution Artificielle). To appear in 2010.
- PDF: http://hal.inria.fr/docs/00/41/32/59/PDF/bredeche09ea_final2-LNCS.pdf
- Abstract: This paper reports on a feasibility study into the evolution of robot controllers during the actual operation of robots (on-line), using only the computational resources within the robots themselves (on-board). We identify the main challenges that these restrictions imply and propose mechanisms to handle them. The resulting algorithm is evaluated in a hybrid system, using the actual robots’ processors interfaced with a simulator that represents the environment. The results show that the proposed algorithm is indeed feasible and the particular problems we encountered during this study give hints for further research.
An original representation for Evolutionary Domain-independent Satisficing Planning (ICAPS 2010)
Bibai, J., Savéant, P., Schoenauer, M. and Vidal, V. An Evolutionary Metaheuristic Based on State Decomposition for Domain-Independent Satisficing Planning. In: ICAPS 2010. May 2010. To appear.
- PDF: http://hal.archives-ouvertes.fr/docs/00/45/62/92/PDF/icaps10.pdf
- Abstract: DAEX is a metaheuristic designed to improve the plan quality and the scalability of an encapsulated planning system. DAEX is based on a state decomposition strategy, driven by an evolutionary algorithm, which benefits from the use of a classical planning heuristic to maintain an ordering of atoms within the individuals. The proof of concept is achieved by embedding the domain-independent satisficing YAHSP planner and using the critical path h1 heuristic. Experiments with the resulting algorithm are performed on a selection of IPC benchmarks from classical, cost-based and temporal domains. Under the experimental conditions of the IPC, and in particular with a universal parameter setting common to all domains, DAEYAHSP is compared to the best planner for each
[...] solutions, making it competitive with state-of-the-art planners. This article gives a full account of the algorithm, reports on the experiments, and provides some insights into the algorithm's behavior.
A hybrid one-class/regression SVM for multi-objective surrogate design (GECCO 2010)
Ilya Loshchilov, Marc Schoenauer and Michèle Sebag. A Pareto-Compliant Surrogate Approach for Multiobjective Optimization. In J. Branke et al., eds., GECCO 2010. To appear.
- PDF: http://www.lri.fr/~marc/Papers/gecco10.pdf
- Abstract: Most surrogate approaches to multi-objective optimization build a surrogate model for each objective. The models for the objectives can then be used in different ways: inside a classical Evolutionary Multiobjective Optimization Algorithm (EMOA) in lieu of the actual objectives, without modifying the underlying EMOA; or to filter out points that the models predict as uninteresting. In contrast, the proposed approach aims at building a global surrogate model defined on the decision space and tightly characterizing the current Pareto set and the dominated region, in order to speed up the evolution progress toward the true Pareto set. This surrogate model is specified by combining a One-class SVM (to characterize the dominated points) and a Regression SVM (to clamp the Pareto front on a single value). The resulting surrogate model is then used within state-of-the-art EMOAs to pre-screen the individuals generated by application of standard variation operators, significantly reducing the number of evaluations of the actual objective functions on classical benchmark problems.