Publications
* indicates equal contribution
2026
- ICML’26 (In Submission)
HELIOS: Heterogeneous Lightweight VLA Model Serving System. 2026. In submission.
- MLSys’26
AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents. 2026.
- ICML’26 (In Submission)
QUESO: Storage-Assisted Quantization Error Compensation for On-Device LLM Inference. 2026. In submission.