Professional Experience

Education

Awards

  • 2022 Best Paper Nominee, MLArchSys
  • 2022 Distinguished Paper Award, CGO (29% acceptance rate)
  • 2020 SICSA PhD Award for Best Dissertation in Scotland 2019-2020
  • 2019 HiPEAC Travel Grant
  • 2018 Distinguished Paper Award, ISSTA (112 submissions, 28% acceptance rate)
  • 2017 Best Paper Award, PACT (109 submissions, 23% acceptance rate)
  • 2017 Best Paper Award, CGO (116 submissions, 22% acceptance rate)
  • 2015 PhD studentship, EPSRC grant EP/L01503X/1
  • 2014 IET (Institution of Engineering & Technology) Prize
  • 2009 Arkwright Scholarship, Rolls-Royce Plc
  • 2009 Engineering Education Scheme of England (EES)
  • 2008 AESSEAL Design Innovation Award

Publications

  • 2022
    Authors: F. Tsimpourlas, P. Petoumenos, M. Xu, C. Cummins, K. Hazelwood, A. Rajan, H. Leather
    Publication: PACT

    BenchPress is a directed program synthesizer for compiler benchmarks. Using active learning, it ranks compiler features based on their significance and produces executables that target them.

  • 2022
    Authors: C. Fu, H. Huang, B. Wasti, C. Cummins, R. Baghdadi, K. Hazelwood, Y. Tian, J. Zhao, H. Leather
    Publication: PACT

    The high computation cost is one of the key bottlenecks for adopting deep neural networks (DNNs) in different hardware. Q-gym consists of a compiler which leverages equality saturation to generate computation expressions, and various parallelization methods to accelerate DNN inference on different hardware.

  • 2022
    Authors: M. Almakki, A. Izzeldin, Q. Huang, A. Haj Ali, C. Cummins
    Publication: MLArchSys

    Phase ordering improves program performance by specializing compiler optimizations to individual programs. In this work we propose further specializing the phase ordering for each function. Results show up to 9% improvement.
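As a toy illustration of what per-function phase ordering means, the sketch below searches pass orderings independently for each function. The passes, cost model, and function names are all hypothetical stand-ins, not the paper's system:

```python
import itertools

# Hypothetical passes modelled as cost -> cost functions. The numbers are
# made up purely to show that different functions prefer different orders.
PASSES = {
    "inline": lambda c: c * 0.9 if c > 50 else c * 1.1,   # helps large functions
    "unroll": lambda c: c * 1.2 if c > 80 else c * 0.8,   # helps small functions
    "dce":    lambda c: c - 5,                            # always trims a little
}

def run_pipeline(cost, order):
    """Apply passes in the given order to a function's estimated cost."""
    for name in order:
        cost = PASSES[name](cost)
    return cost

def best_order(cost):
    """Exhaustively search pass orderings for one function."""
    return min(itertools.permutations(PASSES),
               key=lambda order: run_pipeline(cost, order))

# Per-function specialization: pick a separate ordering for each function,
# rather than one ordering for the whole program.
program = {"small_fn": 30.0, "large_fn": 100.0}
per_function = {fn: best_order(c) for fn, c in program.items()}
```

By construction, each function's specialized ordering is never worse than any single program-wide ordering, which is the intuition behind the per-function approach.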

  • 2022
    Authors: C. Cummins, B. Wasti, J. Guo, B. Cui, J. Ansel, S. Gomez, S. Jain, J. Liu, O. Teytaud, B. Steiner, Y. Tian, H. Leather
    Publication: CGO

    We aim to lower the barrier to entry for compiler optimization research. We present CompilerGym, a suite of tools that removes the significant engineering investment required to try out new ideas on production compiler problems.

  • 2022
    Authors: N. Rotem, C. Cummins
    Publication: arXiv

    Profile guided optimization is an effective technique for improving the optimization ability of compilers based on dynamic behavior, but collecting profile data is expensive, cumbersome, and requires regular updating to remain fresh. We present a novel statistical approach to inferring branch probabilities that improves the performance of programs that are compiled without profile guided optimizations.
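For illustration, profile-free branch prediction can be sketched with hand-written static heuristics. The paper's approach is a learned statistical model; the rules, feature names, and probabilities below are purely hypothetical:

```python
# Toy static branch-probability heuristics: estimate how likely a branch
# is to be taken from static features alone, with no profile data.
# All feature names and probabilities here are illustrative assumptions.

def predict_taken_probability(branch):
    """Estimate P(taken) for a branch described by simple static features."""
    p = 0.5  # no information: assume 50/50
    if branch.get("is_loop_backedge"):
        p = 0.9           # loops usually iterate more than once
    elif branch.get("target_calls_noreturn"):
        p = 0.05          # error/abort paths are rarely taken
    elif branch.get("compares_pointer_to_null"):
        p = 0.3           # null checks usually find a non-null pointer
    return p

branches = [
    {"is_loop_backedge": True},
    {"target_calls_noreturn": True},
    {"compares_pointer_to_null": True},
    {},
]
probs = [predict_taken_probability(b) for b in branches]
```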

  • 2021
    Authors: S. Kourta, A. Namani, F. Tayeb, K. Hazelwood, C. Cummins, H. Leather, R. Baghdadi
    Publication: arXiv

    Accelerating e-graph construction is crucial for making the use of e-graphs practical in compilers. In this paper, we present Caviar, an e-graph-based TRS for proving expressions within compilers. Caviar is a fast (20x faster than base e-graph TRS) and flexible TRS.

  • 2021
    Authors: K. Yang, T. Zhang, C. Cummins, B. Cui, B. Steiner, L. Wang, J. Gonzalez, D. Klein, Y. Tian
    Publication: NeurIPS

    Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function. We develop a novel formal regret analysis for when and why an adaptive region partitioning scheme works. We also propose a new path planning method which improves the function value estimation within each sub-region, and uses a latent representation of the search space.

  • 2021
    Authors: C. Cummins, Z. Fisches, T. Ben-Nun, T. Hoefler, M. O’Boyle, H. Leather
    Publication: ICML

    Most machine learning methods cannot replicate even the simplest of the abstract interpretations of data flow analysis that are critical to making good optimization decisions. To benchmark current and future learning techniques for compiler analyses we introduce an open dataset of 461k Intermediate Representation files for LLVM, covering five source programming languages, and 15.4M corresponding data flow results. We formulate data flow analysis as an MPNN and show that standard analyses can be learned, yielding improved performance on downstream compiler optimization tasks.
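The kind of computation being learned can be made concrete: classic liveness analysis is an iterative fixpoint that passes "messages" backwards along control-flow edges, which is exactly the structure a message-passing network mimics. A minimal stdlib-only version, on a made-up five-block CFG:

```python
# Classic iterative data flow analysis (live variables) written as
# backwards message passing over a control-flow graph. The tiny program
# below is illustrative, not from the paper's dataset.

# Each node: (defs, uses, successors)
CFG = {
    "entry": (set(), set(), ["b1"]),
    "b1":    ({"x"}, set(), ["b2"]),          # x = ...
    "b2":    ({"y"}, {"x"}, ["b3", "exit"]),  # y = x + 1
    "b3":    (set(), {"y"}, ["b2"]),          # use(y); loop back
    "exit":  (set(), set(), []),
}

def liveness(cfg):
    """Propagate live-variable sets backwards until a fixpoint."""
    live_in = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for node, (defs, uses, succs) in cfg.items():
            # live-out is the union of the successors' live-in "messages"
            live_out = set().union(*(live_in[s] for s in succs)) if succs else set()
            new_in = uses | (live_out - defs)
            if new_in != live_in[node]:
                live_in[node] = new_in
                changed = True
    return live_in

result = liveness(CFG)
# e.g. x is live into b3 because the loop back-edge reaches its use in b2.
```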

  • 2021
    Authors: B. Steiner, C. Cummins, H. He, H. Leather
    Publication: MLSys

    Scheduling deep learning workloads by predicting the expected performance of a partial schedule using an LSTM and carefully engineered features. Achieves 2.6x speedup over Halide and 1.5x speedup over TVM in orders of magnitude less search time.

  • 2020
    Authors: C. Cummins, Z. Fisches, T. Ben-Nun, T. Hoefler, H. Leather, M. O’Boyle
    Publication: arXiv

    Most machine learning methods cannot replicate even the simplest of the abstract interpretations of data flow analysis that are critical to making good optimization decisions. To benchmark current and future learning techniques for compiler analyses we introduce an open dataset of 461k Intermediate Representation files for LLVM, covering five source programming languages, and 15.4M corresponding data flow results. We formulate data flow analysis as an MPNN and show that standard analyses can be learned, yielding improved performance on downstream compiler optimization tasks.

  • 2020
    Authors: C. Cummins, Z. Fisches, T. Ben-Nun, T. Hoefler, H. Leather, M. O’Boyle
    Publication: ML for Systems Workshop, NeurIPS

    We present ProGraML - Program Graphs for Machine Learning - a language-independent, portable representation of program semantics that enables analysis through deep learning. We show that standard analyses can be learned, significantly outperforming state-of-the-art approaches.

  • 2020
    Authors: B. Steiner, C. Cummins, H. He, H. Leather
    Publication: ML for Systems Workshop, NeurIPS

    Scheduling deep learning workloads by predicting the expected performance of a partial schedule using an LSTM and carefully engineered features. Achieves 2.6x speedup over Halide and 1.5x speedup over TVM in orders of magnitude less search time.

  • 2020
    Authors: H. Leather, C. Cummins
    Publication: FDL, Kiel, Germany

    This paper provides a retrospective of machine learning in compiler optimisation from its earliest inception, through some of the works that set themselves apart, to today’s deep learning, finishing with our vision of the field’s future.

  • 2020
    Authors: C. Cummins, Z. V. Fisches, T. Ben-Nun, T. Hoefler, and H. Leather
    Publication: arXiv

    Novel graph-based representation for machine learning over programs. We capture whole-program control, data, and call flow at the IR level, equipping machine learning models to replicate the types of compiler analyses critical to optimization. We set new state-of-the-art performance in two downstream tasks: heterogeneous device mapping and algorithm classification.

  • 2020
    Authors: C. Cummins
    Publication: PhD Thesis, University of Edinburgh

    Deep learning over programs. Developed novel machine learning methods for random program generation, compiler optimisations, and representative benchmarking. Applications for heterogeneous parallelism, compiler testing, and adaptive performance tuning.

  • 2019
    Authors: A. Goens, A. Brauckmann, S. Ertel, C. Cummins, H. Leather, J. Castrillon
    Publication: MAPL, Arizona, USA

  • 2018
    Authors: C. Cummins, P. Petoumenos, A. Murray, and H. Leather
    Publication: ISSTA (28% acceptance rate), Amsterdam, Netherlands

    Unsupervised machine learning to derive program generators for compiler fuzz testing. Implemented in 100x less code than state-of-the-art program generator, and 3.03x faster. Discovered 67 new bugs in OpenCL compilers, many of which are now fixed.
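The differential-testing loop behind this kind of compiler fuzzing can be sketched in a few lines. Here a toy random expression generator and a deliberately buggy evaluator stand in for the learned OpenCL program generator and the real compilers under test:

```python
import random

# Toy differential-testing harness: generate random programs, run them
# through two implementations, and flag disagreements. Everything below
# is a stand-in: random arithmetic expressions instead of OpenCL kernels,
# and a seeded bug instead of a real compiler defect.

def gen_expr(rng, depth=3):
    """Generate a random arithmetic expression string."""
    if depth == 0 or rng.random() < 0.3:
        return str(rng.randint(0, 9))
    op = rng.choice(["+", "-", "*"])
    return f"({gen_expr(rng, depth - 1)} {op} {gen_expr(rng, depth - 1)})"

def reference_eval(expr):
    """The trusted 'reference compiler'."""
    return eval(expr)

def buggy_eval(expr):
    """The 'compiler under test': deliberately mis-handles `* 7`."""
    return eval(expr.replace("* 7", "* 8"))

def fuzz(n=200, seed=0):
    """Report generated programs on which the two implementations disagree."""
    rng = random.Random(seed)
    return [expr for expr in (gen_expr(rng) for _ in range(n))
            if reference_eval(expr) != buggy_eval(expr)]

found = fuzz()  # each entry is a bug-triggering test case
```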

  • 2018
    Authors: C. Cummins, P. Petoumenos, A. Murray, and H. Leather
    Publication: ACACES (extended abstract), Fiuggi, Italy

    Extended-abstract preview of my ISSTA’18 paper.

  • 2017
    Authors: C. Cummins, P. Petoumenos, Z. Wang, and H. Leather
    Publication: PACT (23% acceptance rate), Portland, Oregon

    Learning optimization heuristics directly from raw source code, without the need for feature extraction. Exceeds the performance of state-of-the-art predictive models using hand-crafted features, and can transfer knowledge gained from one optimization task to another, even if the learned tasks are dissimilar.

  • 2017
    Authors: C. Cummins, P. Petoumenos, Z. Wang, and H. Leather
    Publication: CGO (22% acceptance rate), Austin, Texas

    Deep learning over massive codebases from GitHub to generate benchmark programs. Automatically synthesizes OpenCL kernels which are indistinguishable from hand-written code, and improves state-of-the-art predictive model performance by 4.30×.

  • 2016
    Authors: C. Cummins, P. Petoumenos, M. Steuwer, and H. Leather
    Publication: ACACES (extended abstract), Fiuggi, Italy

    Machine learning-enabled autotuning of multi-GPU OpenCL workgroup sizes. Static tuning achieves only 26% of the maximum performance; my approach achieves 92%.

  • 2016
    Authors: C. Cummins, P. Petoumenos, M. Steuwer, and H. Leather
    Publication: HLPGPU, HiPEAC, Prague

    A distributed framework for dynamic prediction of optimisation parameters using machine learning. Automatically exceeds human experts by 1.22x.
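A minimal sketch of the underlying autotuning problem, using a synthetic runtime model in place of real measurements or the paper's distributed machine-learned prediction (all numbers below are made up):

```python
# Toy parameter autotuner: evaluate candidate workgroup sizes against a
# synthetic cost model and keep the fastest. The model is a hypothetical
# trade-off between number of work groups and per-group scheduling cost.

def synthetic_runtime(workgroup_size, problem_size=4096):
    """Hypothetical runtime: more groups cost time, but so do huge groups."""
    waves = -(-problem_size // workgroup_size)      # ceil division
    overhead = workgroup_size * 0.05                # per-group scheduling cost
    return waves * 1.0 + overhead

def autotune(candidates):
    """Measure each candidate and return (best_value, best_runtime)."""
    timings = {c: synthetic_runtime(c) for c in candidates}
    best = min(timings, key=timings.get)
    return best, timings[best]

best_size, best_time = autotune([16, 32, 64, 128, 256, 512])
```

A real tuner replaces `synthetic_runtime` with actual kernel timings (or, as in this work, a model that predicts them), since exhaustively measuring every candidate is exactly the cost the learned approach avoids.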

  • 2016
    Authors: C. Cummins, P. Petoumenos, M. Steuwer, and H. Leather
    Publication: ADAPT, HiPEAC, Prague

    Three methodologies to autotune stencil patterns using machine learning. Speedups of 3.79× over the best possible static size, 94% of the maximum performance.

  • 2015
    Authors: E. Bunkute, C. Cummins, F. Crofts, G. Bunce, I. T. Nabney, and D. R. Flower
    Publication: Bioinformatics, 31(2), 295-296

    An open source search engine of protein isoelectric points. Provides public access to bioinformatics data from the literature for comparison and benchmarking purposes.

Invited Talks

Other Academic Activities

Committees MLArchSys'22 Organizing Committee (2022), COSMIC'19 General Co-chair (2019), PACT'18 HotCRP Chair (2018), and CGO'18 Web Chair (2018).
Peer Reviews PLDI (2022), MLSys (2022), CC (2022), ACM TACO (2020), CGO (2018), ACM TACO (2016), LCTES (2016), and CGO (2016).
Posters ISSTA (2018), ACACES (2018), PPar (2017), Google (2016), PPar (2016), ACACES (2016), PLDI (2016), HiPEAC (2016), Google (2015), and PPar (2015).

Key Technical Skills

Python
C/C++
Git
GNU / Linux
Bash
Jupyter
Bazel
SQL
TensorFlow
PyTorch
HTML+CSS+JS
Java
OpenCL

AI researcher specializing in compilers and code optimization.