AutoML - the automated search for machine learning (ML) pipelines that achieve high model accuracy - is increasingly important for making ML accessible to a wider audience. However, model accuracy is often not the only requirement of an ML application. There are additional constraints on metrics such as inference/training time and ML pipeline size. One approach to AutoML problems is successive halving [1]. Instead of training on all available instances, successive halving starts by evaluating configurations on a small number of instances, which makes it cheap to evaluate a large number of hyperparameter configurations. It then prunes the unsuccessful configurations and continues to evaluate the successful ones on more training instances. In this way, it successively halves the number of evaluated configurations while correspondingly increasing the number of training instances. However, at the moment, these systems do not support constraints out of the box.
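The core loop described above can be sketched as follows. This is a minimal illustration, not BOHB's actual implementation; the function name, signature, and default budgets are assumptions made for the example:

```python
def successive_halving(configs, evaluate, n_min=16, eta=2, n_max=256):
    """Minimal successive-halving sketch (illustrative, not BOHB's API).

    configs:  list of hyperparameter configurations
    evaluate: callable (config, n_instances) -> validation score (higher is better)
    n_min:    number of training instances in the first round
    eta:      halving factor (keep the top 1/eta configurations per round)
    n_max:    training-instance budget per configuration in the final round
    """
    survivors = list(configs)
    n = n_min
    while len(survivors) > 1 and n <= n_max:
        # Rank every surviving configuration by its score on n instances.
        ranked = sorted(survivors, key=lambda c: evaluate(c, n), reverse=True)
        # Keep the top 1/eta fraction, prune the rest.
        survivors = ranked[:max(1, len(survivors) // eta)]
        n *= eta  # grow the training-instance budget for the next round
    return survivors[0]
```

With eta=2, each round halves the number of surviving configurations while doubling the per-configuration instance budget, so the total training cost per round stays roughly constant.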

Problem / Task:  

The task is to extend the successive halving algorithm to support known constraints, such as inference/training time and ML pipeline size. More specifically, one has to find a way to prune constraint-violating configurations as early as possible. Finally, the new algorithm should be compared to other state-of-the-art constrained Bayesian optimization algorithms, such as Spearmint [2].
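One possible starting point for such an extension is sketched below: the evaluation function additionally reports the constrained metrics, and configurations that violate a constraint are discarded before ranking. The metric names, dictionary-based API, and pruning strategy are all assumptions for illustration; finding a good strategy is precisely the open question of this topic:

```python
def constrained_successive_halving(configs, evaluate, constraints,
                                   n_min=16, eta=2, n_max=256):
    """Sketch of one possible constraint-aware variant (hypothetical API).

    evaluate:    callable (config, n_instances) -> dict with a 'score' key
                 plus one key per constrained metric, e.g. 'inference_time'
    constraints: dict mapping a metric name to its maximum allowed value
    """
    survivors = list(configs)
    n = n_min
    while len(survivors) > 1 and n <= n_max:
        results = [(c, evaluate(c, n)) for c in survivors]
        # Prune constraint violators as early as possible, before ranking.
        feasible = [(c, r) for c, r in results
                    if all(r[m] <= bound for m, bound in constraints.items())]
        if not feasible:  # every remaining configuration violates a constraint
            return None
        feasible.sort(key=lambda cr: cr[1]['score'], reverse=True)
        survivors = [c for c, _ in feasible[:max(1, len(feasible) // eta)]]
        n *= eta
    return survivors[0] if survivors else None
```

Note that metrics such as inference time measured on a small instance budget may not reflect their final values, so when to trust an early constraint measurement is itself part of the design space.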


Requirements:

  • programming experience in Python (+ sklearn)
  • interest in data integration
  • experience in machine learning & database technologies


[1] Falkner, S., Klein, A. and Hutter, F., 2018, July. BOHB: Robust and efficient hyperparameter optimization at scale. In International Conference on Machine Learning (pp. 1437-1446). PMLR.

[2] Gelbart, M.A., Snoek, J. and Adams, R.P., 2014. Bayesian optimization with unknown constraints. arXiv preprint arXiv:1403.5607.

For a detailed introduction to the topic, please get in contact via email with Felix Neutatz. 

Advisor and Contact:

Felix Neutatz < > (LUH) 

Prof. Dr. Ziawasch Abedjan < > (LUH)