I am a research scientist at Google Research. Previously, from 2018 to 2020, I was a postdoctoral researcher in the Empirical Inference Department at the Max Planck Institute for Intelligent Systems, working with Bernhard Schölkopf. I work in machine learning, at the intersection of computer science and statistics. My current research topics include (but are not limited to):

  • Model routing and model cascading for efficient LLM inference.
  • Faster LLM inference with speculative decoding.
  • Distillation of a large LLM to a smaller LLM.
I completed my PhD in 2017 at the Gatsby Unit, UCL, where I worked with Arthur Gretton on various topics related to kernel-based statistical tests and approximate Bayesian inference. Please feel free to check out my list of publications and software, and to contact me for a research discussion.

Contact: Wittawat Jitkrittum (วิทวัส จิตกฤตธรรม)

 ArXiv  CV  DBLP  Github  Google Scholar  LinkedIn  Orcid ID  ResearchGate  Semantic Scholar  Twitter

Last update: 02-Mar-25

News

26 Feb 2025

:memo: New preprint “I Know What I Don’t Know: Improving Model Cascades Through Confidence Tuning”.

17 Feb 2025

:writing_hand: I will serve as an area chair for NeurIPS 2025.

2 Feb 2025

:memo: New preprint “Universal Model Routing for Efficient LLM Inference”. We propose a new model routing technique that can route queries to unseen LLMs without having to retrain the router.

11 Feb 2025

Our work “Faster Cascades via Speculative Decoding” has been accepted to ICLR 2025 as an oral presentation.

21 Jan 2025

:writing_hand: I will serve as an area chair for ICML 2025.

24 Oct 2024

:memo: New preprint “A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs”.

3 Oct 2024

:writing_hand: I will serve as an area chair for AISTATS 2025.

1 May 2024

:memo: “USTAD: Unified Single-model Training Achieving Diverse Scores for Information Retrieval” has been accepted to ICML 2024.

16 Jan 2024

:memo: :memo: :memo: :memo: Four papers accepted to ICLR 2024.

11 Dec 2023

:bullettrain_front: I am at NeurIPS 2023 presenting our paper “When Does Confidence-Based Cascade Deferral Suffice?”