The 4th Workshop on Machine Learning and Systems (EuroMLSys)

co-located with EuroSys '24

April 22nd 2024, Athens, Greece


The recent wave of research focusing on machine intelligence (machine learning and artificial intelligence) and its applications has been fuelled by both hardware improvements and deep learning frameworks that simplify the design and training of neural models. Advances in AI have also accelerated research in Reinforcement Learning (RL), where dynamic control mechanisms are designed to tackle complex tasks. Further, machine learning-based optimisation, such as Bayesian Optimisation, is gaining traction in the computer systems community, where optimisation needs to scale to complex and large parameter spaces; areas of interest range from hyperparameter tuning to system configuration tuning.
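To make the configuration-tuning use case concrete, the following is a minimal, self-contained sketch of a Bayesian Optimisation loop: a Gaussian-process surrogate fitted to a handful of benchmark results, with a lower-confidence-bound acquisition choosing the next configuration to try. The `latency` objective and all names here are hypothetical stand-ins for an expensive system benchmark; real tuning pipelines would typically use an established library rather than this hand-rolled loop.

```python
import numpy as np

# Hypothetical stand-in for an expensive system benchmark: latency as a
# function of a single normalised configuration knob in [0, 1].
def latency(x):
    return (x - 0.3) ** 2 + 0.05 * np.sin(15 * x)

def rbf(a, b, lengthscale=0.1):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-6):
    # Gaussian-process posterior mean and stddev at candidate points.
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf(x_obs, x_cand)
    mu = K_s.T @ np.linalg.solve(K, y_obs)
    v = np.linalg.solve(K, K_s)
    var = 1.0 - np.sum(K_s * v, axis=0)  # diagonal of posterior covariance
    return mu, np.sqrt(np.clip(var, 1e-12, None))

rng = np.random.default_rng(0)
x_obs = rng.uniform(0.0, 1.0, 3)      # a few random initial configurations
y_obs = latency(x_obs)
x_cand = np.linspace(0.0, 1.0, 200)   # candidate configurations

for _ in range(10):
    mu, sd = gp_posterior(x_obs, y_obs, x_cand)
    # Lower-confidence-bound acquisition: trade off predicted latency
    # against uncertainty, then benchmark the most promising config.
    x_next = x_cand[np.argmin(mu - 2.0 * sd)]
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, latency(x_next))

best_x = x_obs[np.argmin(y_obs)]
```

The appeal for systems tuning is sample efficiency: each "evaluation" is a full benchmark run, so the surrogate model is used to decide where the next run is most likely to pay off.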

The EuroMLSys workshop will provide a platform for discussing emerging trends in building frameworks, programming models, optimisation algorithms, and software engineering to support AI/ML applications, as well as the use of ML to build such frameworks and optimisation tools. EuroMLSys aims to bridge the gap between AI research and practice through a technical program of fresh ideas on software infrastructure, tools, design principles, and theory/algorithms, from a systems perspective. We will also explore potential applications that take advantage of ML.

News

  • The keynote speaker has been announced! Tianqi Chen (CMU, Chief Technologist of OctoAI) will give a talk on "Universally Deploy Large-language Models via ML Compilation".
  • The workshop program is up! It will start at 9:00 am.

Key dates

  • Paper submission deadline: February 26, 2024 (23:59 AoE) (extended from February 20, 2024)
  • Acceptance notification: March 22, 2024 (extended from March 18, 2024)
  • Final paper due: March 29, 2024 (extended from March 26, 2024)
  • Workshop: April 22, 2024 (full-day workshop)

Past Editions

Call for Papers

A growing area of interest in machine intelligence is the intersection of AI/ML and systems design. At the same time, applications of ML are growing in complexity, and so is the volume of data they produce and consume. For computer systems to scale, new learning approaches and advanced optimisation techniques are needed. We also need a better understanding of current AI/ML frameworks: their functionality, limitations, and target applications. This will clarify desired functionality and future architectures. Novel machine learning methods to optimise and accelerate software and hardware systems must also be developed.

EuroMLSys is an interdisciplinary workshop that brings together researchers in computer architecture, systems and machine learning, along with practitioners who are active in these emerging areas.

Topics of interest include, but are not limited to, the following:

  • Scheduling algorithms for data processing clusters
  • Custom hardware for machine learning
  • Programming languages for machine learning
  • Benchmarking systems (for machine learning algorithms)
  • Synthetic input data generation for training
  • Systems for training and serving machine learning models at scale
  • Graph neural networks
  • Neural network compression and pruning in systems
  • Systems for incremental learning algorithms
  • Large scale distributed learning algorithms in practice
  • Database systems for large scale learning
  • Model understanding tools (debugging, visualisation, etc.)
  • Systems for model-free and model-based Reinforcement Learning
  • Optimisation in end-to-end deep learning
  • System optimisation using Bayesian Optimisation
  • Acceleration of model building (e.g., imitation learning in RL)
  • Use of probabilistic models in ML/AI applications
  • Learning models for inferring network attacks, device/service fingerprinting, congestion, etc.
  • Techniques to collect and analyze network data in a privacy-preserving manner
  • Learning models to capture network events and control actions
  • Machine learning in networking (e.g., use of Deep RL in networking)
  • Analysis of distributed ML algorithms
  • Semantics for distributed ML languages
  • Probabilistic modelling for distributed ML algorithms
  • Synchronisation and state control of distributed ML algorithms

Accepted papers will be published in the ACM Digital Library (you can opt out from this).

Program

The full-text PDFs will become available on April 22, 2024 in the ACM Digital Library.

Program timezone is EEST (UTC+3).

9:00 Opening
09:15 Session 1: GPUs, Training and Optimisation - 15min presentations - Eiko Yoneki (University of Cambridge)
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs Connor Espenshade, Rachel Peng, Eumin Hong (Columbia University); Max Calman, Yue Zhu, Pritish Parida, Eun Lee (IBM Research); Martha Kim (Columbia University)
An Analysis of Collocation on GPUs for Deep Learning Training Ties Robroek, Ehsan Yousefzadeh-Asl-Miandoab, Pınar Tözün (IT University of Copenhagen)
SIP: Autotuning GPU Native Schedules via Stochastic Instruction Perturbation Guoliang He, E. Yoneki (University of Cambridge)
ML Training with Cloud GPU Shortages: Is Cross-Region the Answer? Foteini Strati, Paul Elvinger, Tolga Kerimoglu, Ana Klimovic (ETH Zurich)
ALTO: An Efficient Network Orchestrator for Compound AI Systems Keshav Santhanam, Deepti Raghavan, Muhammad Shahir Rahman, Thejas Venkatesh, Neha Kunjal (Stanford University); Pratiksha Thaker (CMU); Philip Levis (Stanford University and Google); Matei Zaharia (UC Berkeley)
10:30 Coffee Break / Poster Session (Browsing)
11:00 Session 2: LLM - 15min presentations - Aaron Zhao (Imperial College London)
Deploying Stateful Network Functions Efficiently using Large Language Models Hamid Ghasemirahni (KTH); Alireza Farshin (NVIDIA); Mariano Scazzariello, Marco Chiesa, Dejan Kostic (KTH Royal Institute of Technology)
The Importance of Workload Choice in Evaluating LLM Inference Systems Konstantinos Papaioannou (IMDEA Software Institute, Universidad Politécnica de Madrid); Thaleia Dimitra Doudali (IMDEA Software Institute)
Priority Sampling of Large Language Models for Compilers Dejan Grubisic (Rice University); Volker Seeker, Gabriel Synnaeve, Hugh Leather (Meta AI); John Mellor-Crummey (Rice University); Chris Cummins (Meta AI)
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving Yongjun He (ETH Zurich); Yao Lu (National University of Singapore); Gustavo Alonso (ETH Zurich)
De-DSI: Decentralised Differentiable Search Index Petru Neague, Marcel Gregoriadis, Johan Pouwelse (Delft University of Technology)
Towards Pareto Optimal Throughput in Small Language Model Serving Pol Garcia Recasens (Barcelona Supercomputing Center); Yue Zhu, Chen Wang, Eun Kyung Lee, Olivier Tardieu, Alaa Youssef (IBM Research); Jordi Torres (Barcelona Supercomputing Center); Josep Ll Berral (Universitat Politècnica de Catalunya)
12:30 Lunch Break / Poster Session (Browsing)
13:45 Session 3: FL, Pipeline of Data Processing - 15min presentations - Ahmed Sayed (Queen Mary University of London)
ALS Algorithm for Robust and Communication-Efficient Federated Learning Neil Hurley, Erika Duriakova (Insight Centre for Data Analytics); James Geraci (Samsung Electronics Co., Ltd.); Diarmuid O’Reilly-Morgan, Elias Tragos, Barry Smyth, Aonghus Lawlor (Insight Centre for Data Analytics)
SpeedyLoader: Efficient Pipelining of Data Preprocessing and Machine Learning Training Rahma Nouaji, Stella Bitchebe, Oana Balmau (McGill)
FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou (Beijing University of Posts and Telecommunications)
14:30 Keynote: Universally Deploy Large-language Models via ML Compilation Tianqi Chen (CMU, Chief Technologist of OctoAI)
15:30 Coffee Break / Poster Session
16:00 Session 4: Edge AI, GNN, RL - 15min presentations - Hamed Haddadi (Imperial College London)
GuaranTEE: Towards Attestable and Private ML with CCA Sandra Siby, Sina Abdollahi, Mohammad Maheri, Marios Kogias, Hamed Haddadi (Imperial College London)
Towards Low-Energy Adaptive Personalization for Resource-Constrained Devices Yushan Huang, Josh Millar, Yuxuan Long (Imperial College London); Yuchen Zhao (University of York); Hamed Haddadi (Imperial College London)
Temporal Graph Generative Models: An empirical study Houssem Eddine Souid, Lucas Ody, Valentin Lemaire, Youssef Achenchabe, Gianmarco Aversano, Sabri Skhiri (Euranova)
IA2: Leveraging Instance-Aware Index Advisor with Reinforcement Learning for Diverse Workloads Taiyi Wang, Eiko Yoneki (University of Cambridge)
17:00 Poster Elevator Pitch - 3min each
Evaluating Deep Learning Recommendation Model Training Scalability with the Dynamic Opera Network Connor Imes, Andrew Rittenbach, Peng Xie, Dong In D. Kang, John Paul Walters, Stephen P. Crago (Information Sciences Institute, University of Southern California)
The Environmental Cost of Engineering Machine Learning-Enabled Systems: A Mapping Study Kouider Chadli (University of Galway); Goetz Botterweck (Trinity College Dublin); Takfarinas Saber (University of Galway)
Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling Kamran Razavi (Technical University of Darmstadt); Saeid Ghafouri (Queen Mary University of London); Max Mühlhäuser (Technische Universität Darmstadt); Pooyan Jamshidi (University of South Carolina); Lin Wang (Paderborn University)
Do Predictors for Resource Overcommitment Even Predict? Georgia Christofidi (IMDEA, Universidad Politécnica de Madrid); Thaleia Dimitra Doudali (IMDEA Software Institute)
Navigating Challenges and Technical Debt in Large Language Models Deployment Ahmed Menshawy, Zeeshan Nawaz, Mahmoud Fahmy (Mastercard)
A Hybrid Decentralised Learning Topology for Recommendations with Improved Privacy Diarmuid O’Reilly-Morgan, Elias Tragos (Insight Centre for Data Analytics); James Geraci (Samsung Electronics Co. Ltd); Qinqin Wang, Neil Hurley, Barry Smyth, Aonghus Lawlor (Insight Centre for Data Analytics)
Enhancing Named Entity Recognition for Agricultural Commodity Monitoring with Large Language Models Abir Chebbi (University of Geneva); Guido Kniesel (Lucerne University of Applied Sciences and Arts); Nabil Abdennadher (University of Applied Sciences and Arts Western Switzerland); Giovanna Dimarzo (University of Geneva)
Comparative Profiling: Insights into Latent Diffusion Model Training Bradley Aldous, Ahmed M. Abdelmoniem (Queen Mary University of London)
17:25 Wrapup and Closing

Keynote

  • Tianqi Chen

    14:30 Tianqi Chen Carnegie Mellon University, Chief Technologist of OctoAI

    Universally Deploy Large-language Models via ML Compilation

    Deploying deep learning models on various devices has become an important topic. Machine learning compilation is an emerging field that leverages compiler and automatic search techniques to accelerate AI models. ML compilation brings a unique set of challenges: emerging machine learning models; increasing hardware specialization, which brings a diverse set of acceleration primitives; and a growing tension between flexibility and performance. In this talk, I discuss our experience in bringing foundational models to a variety of devices and hardware environments through machine learning compilation.

    Bio: Tianqi Chen is currently an Assistant Professor in the Machine Learning Department and Computer Science Department at Carnegie Mellon University. He is also the Chief Technologist of OctoAI. He received his PhD from the Paul G. Allen School of Computer Science & Engineering at the University of Washington. He has created many major learning systems that are widely adopted: XGBoost, TVM, and MLC-LLM. His personal webpage is https://tqchen.com/.

Sponsors


Committees

Workshop and TPC Chairs

Technical Program Committee

  • Ahmed Sayed, Queen Mary University of London
  • Alex Iacob, University of Cambridge
  • Alexandros Koliousis, Northeastern University London and Institute for Experiential AI
  • Amir Payberah, KTH
  • Amitabha Roy, Google
  • Andrei Paleyes, University of Cambridge
  • Chi Zhang, Brandeis University
  • Christina Giannoula, University of Toronto
  • Christos Bouganis, Imperial College London
  • Daniel Goodman, Oracle
  • Daniel Mendoza, Stanford University
  • Davide Sanvito, NEC Laboratories Europe
  • Dawei Li, Amazon
  • Deepak George Thomas, Iowa State University
  • Dimitris Chatzopoulos, University College Dublin
  • Fiodar Kazhamiaka, Stanford University
  • Guilherme H. Apostolo, Vrije Universiteit Amsterdam
  • Guoliang He, University of Cambridge
  • Guy Leroy, MSR Cambridge
  • Hamed Haddadi, Imperial College London
  • Holger Pirk, Imperial College London
  • Jenny Huang, NVIDIA
  • Jon Crowcroft, University of Cambridge
  • Jose Cano, University of Glasgow
  • Laurent Bindschaedler, MPI-SWS
  • Liang Zhang, Oracle
  • Luigi Nardi, Lund University
  • Luo Mai, University of Edinburgh
  • Mark Zhao, Stanford University
  • Mengying Zhou, Fudan University
  • Nasrullah Sheikh, IBM Research Almaden
  • Nikhil Sarda, Google
  • Nikolas Ioannou, Google
  • Paul Patras, University of Edinburgh
  • Peter Triantafillou, University of Warwick
  • Pouya Hamadanian, MIT
  • Pratik Fegade, Google
  • Sam Ainsworth, University of Edinburgh
  • Sami Alabed, Deepmind
  • Taiyi Wang, University of Cambridge
  • Thaleia Dimitra Doudali, IMDEA
  • Valentin Radu, University of Sheffield
  • Veljko Pejovic, University of Ljubljana
  • Wayne Luke, Imperial College London
  • Xupeng Miao, Peking University
  • Youhe Jiang, University of Cambridge
  • Zak Singh, University of Cambridge
  • Zheng Wang, University of Leeds
  • Zhihao Jia, CMU

Web Chair

  • Alexis Duque, Net AI

Contact

For any questions related to EuroMLSys 2024, please contact the TPC Chairs Eiko Yoneki and Aaron Zhao.

Follow us on Twitter: @euromlsys