I am currently a part of LDOS at UT-Austin. I graduated from UW-Madison, where I was advised by Dimitris Papailiopoulos and Shivaram Venkataraman. My research interests are primarily in Systems for Machine Learning, especially distributed training and inference of ML workloads. During my PhD I have been very fortunate to intern with Bilge Acun at FAIR, Amar Phanishayee at Microsoft Research, and Yucheng Low at Apple.
During my time in Madison, when I was not being a grad student, I was very likely racing keelboats on Lake Mendota or alpine skiing in the winter. I also doubled as a sailing instructor at UW-Madison’s Hoofers Sailing Club. Since moving to Austin, I have been racing keelboats on Lake Travis and teaching sailing with the Austin Yacht Club, while my skis languish in storage, covered in storage wax.
Teaching
CS 395T, Principles of Learned Systems
Service
Reviewer: ICML '23, ICLR '23, NeurIPS '22, NeurIPS '21
ERC: MLSys '22, USENIX ATC '23
Publications
- CHAI: Clustered Head Attention for Efficient LLM Inference.
  S Agarwal, B Acun, B Hosmer, M Elhoushi, Y Lee, S Venkataraman, D Papailiopoulos, C Wu. ICML '24.
  [paper]
- LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding.
  M Elhoushi, A Shrivastava, D Liskovich, B Hosmer, B Wasti, L Lai, A Mahmoud, B Acun, S Agarwal, A Roman, A Aly, B Chen, C Wu. ACL '24.
  [paper]
- Decoding Speculative Decoding.
  M Yan, S Agarwal, S Venkataraman.
  [paper]
- Blox: A Modular Toolkit for Deep Learning Schedulers.
  S Agarwal, A Phanishayee, S Venkataraman. EuroSys '24.
  [paper] [source]
- BagPipe: Accelerating Deep Recommendation Model Training.
  S Agarwal, C Yan, Z Zhang, S Venkataraman. SOSP '23.
  [paper] [source]
- Cuttlefish: Low-Rank Model Training without All the Tuning.
  H Wang, S Agarwal, Y Tanaka, E Xing, D Papailiopoulos. MLSys '23.
  [paper]
- Pufferfish: Communication-Efficient Models at No Extra Cost.
  H Wang, S Agarwal, D Papailiopoulos. MLSys '22.
  [paper]
- On the Utility of Gradient Compression.
  S Agarwal, H Wang, S Venkataraman, D Papailiopoulos. MLSys '22.
  [paper] [source]
- Adaptive Gradient Communication via Critical Learning Regime Identification.
  S Agarwal, H Wang, K Lee, S Venkataraman, D Papailiopoulos. MLSys '21.
  [paper] [source]
- AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning.
  Y Liu, S Agarwal, S Venkataraman.
  [paper]
- Attack of the Tails: Yes, You Really Can Backdoor Federated Learning.
  H Wang, K Sreenivasan, S Rajput, H Vishwakarma, S Agarwal, J Sohn, K Lee, D Papailiopoulos. NeurIPS '21.
  [paper]