I lead the team and architect the infrastructure for developing and deploying AI/ML across Apple's devices, with frameworks like Core AI and Core ML. This infrastructure powers AI running on Apple's devices, from Apple Intelligence and Siri AI to many other Apple products (e.g. camera, health, ...), as well as the ecosystem of app developers.
Over the past 15+ years, I've built the frameworks that run and deploy on-device AI — at Google, Meta, and Apple.
At Apple, I founded and architected Core AI, and currently lead the project and the on-device infrastructure team. At Meta, I was Tech Lead for PyTorch, where I founded and architected ExecuTorch, the framework used to deploy AI across Meta's family of apps on Android and iOS and to power AI on Meta's wearables. At Google, I served as Tech Lead in TensorFlow, co-founded TensorFlow Lite (now LiteRT), and built the internal frameworks running on-device AI for Google's early AI products, including Google Assistant (now Gemini).
I got into AI frameworks by trying to deploy my own team's speech recognition research at Google. What followed was 15 years of building infrastructure and keeping pace with research — particularly around deep learning optimization.
Along the way, I've played every role in building AI products: data gathering, applied research, framework development, hardware optimization, and influencing hardware roadmaps.
ExecuTorch: An end-to-end solution for enabling on-device inference across mobile and edge devices, including wearables, embedded devices, and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of a wide range of PyTorch models (vision, speech, generative AI, and more) to edge devices.
Selected work I was involved in:
TensorFlow Model Optimization Toolkit: A suite of tools for optimizing machine learning models for deployment and execution, offering easy-to-use, consistent APIs that implement powerful machine learning optimization techniques.
Selected work I was involved in:
TensorFlow Lite: Google's open-source deep learning framework for on-device machine learning. It has billions of installs across mobile phones, smart displays and speakers, cars, and wearables, powering products from Google and other companies.
Selected work I was involved in:
I worked on the Google speech team to make speech and related technologies run entirely on-device. I was part of the team that developed the very-low-power "hey Google" capabilities across devices, where I developed the first end-to-end system (and the latest iteration of the ML model). I also built the pre-TensorFlow ML inference engine that brought a new generation of speech recognizers, text-to-speech generators, and keyboard technology on-device.
This infrastructure became the foundation of newer infrastructure and production models, along with many techniques I implemented, promoted, and refined, such as quantization, sparsity, and the replacement of vanilla LSTMs with the Coupled Input-Forget Gate (CIFG) variant (introduced here).
US-9767410B1 ⬀, US-9372675B1 ⬀, US-9542948B2 ⬀, US-9842608B2 ⬀, US-10460735B2 ⬀, EP-3121809B1 ⬀, US-20200126537A1 ⬀, WO-2020092532A1 ⬀, US-9953216B2 ⬀
A more complete (and likely more up-to-date) list is available at Google Scholar.
[2003 – 2005] Instituto Tecnológico y de Estudios Superiores de Monterrey
Master in Computer Science, Artificial Intelligence, Image Processing, Robotics
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup ⚽, Research assistant
[1998 – 2003] Instituto Tecnológico y de Estudios Superiores de Monterrey
Bachelor in Computer Science, Artificial Intelligence
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup ⚽