Raziel Alvarez

Software Engineer

Lead AI/ML Infrastructure for Apple Platforms

I lead the team and architect the infrastructure for developing and deploying AI/ML across Apple's devices, with frameworks like Core AI and Core ML. This powers AI running on Apple's devices, from Apple Intelligence and Siri AI to many other Apple products (e.g. camera, health), as well as the ecosystem of app developers.

tl;dr

Over the past 15+ years, I've built the frameworks that run and deploy on-device AI — at Google, Meta, and Apple.

At Apple, I founded and architected Core AI, and currently lead the project and the on-device infrastructure team. At Meta I was Tech Lead for PyTorch, where I founded and architected ExecuTorch — the framework used to deploy Meta's family of apps across Android and iOS, and to power AI on Meta's wearables. At Google, I served as Tech Lead in TensorFlow, co-founded TensorFlow Lite (now LiteRT), and built the internal frameworks running on-device AI for Google's early AI products, including Google Assistant (now Gemini).

I got into AI frameworks by trying to deploy my own team's speech recognition research at Google. What followed was 15 years of building infrastructure and keeping pace with research — particularly around deep learning optimization.

Along the way, I've played every role in building AI products: data gathering, applied research, framework development, hardware optimization, and influencing hardware roadmaps.

Founded PyTorch's ExecuTorch.

An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of various PyTorch models (vision, speech, Generative AI, and more) to edge devices.

Selected work I was involved in:

  • Redefined the strategy for PyTorch's own on-device ML suite: "PyTorch Edge"
  • Created the vision and strategy for ExecuTorch
  • Defined the ExecuTorch technical architecture
  • Defined the relationship and integration of PyTorch Core, Intermediate Representations, the new 2.0 APIs and ExecuTorch
  • Key contributor to defining PyTorch's torch.export IR
  • Key contributor to defining PyTorch's EXIR dialects
  • Redefined the strategy and architecture of PyTorch's Architecture Optimization and ML optimization around torch.export
  • Worked on the first embedded use of ExecuTorch within Meta's smart glasses

Founded the TensorFlow Model Optimization Toolkit.

A suite of tools for optimizing machine learning models for deployment and execution, via easy to use and consistent APIs implementing powerful machine learning optimization techniques.

Selected work I was involved in:

  • Introducing the Model Optimization Toolkit for TensorFlow
  • TensorFlow Model Optimization Toolkit—Pruning API
  • TensorFlow Model Optimization Toolkit—Post-Training Integer Quantization
  • TensorFlow Model Optimization Toolkit—Post-training reduced-precision fp16 quantization
  • TensorFlow Model Optimization Toolkit—Quantization Aware Training API
  • EfficientNet-EdgeTPU: Accelerator-aware neural network design with AutoML
  • Coral summer updates: Post-training quant support, TF Lite delegate, and new models!
  • TensorFlow Model Optimization Toolkit — Weight Clustering API

Co-Founded TensorFlow Lite.

Google's open source deep learning framework for on-device machine learning. It has billions of installs across mobile phones, smart displays and speakers, cars, and wearables, powering products from Google and other companies.

Selected work I was involved in:

  • Announcing TensorFlow Lite
  • TensorFlow operation fusion in the TensorFlow Lite converter
  • ML Kit expands into NLP with Language Identification and Smart Reply
  • Higher accuracy on vision models with EfficientNet-Lite
  • Accelerating TensorFlow Lite with XNNPACK Integration

Google On-device speech recognition

I worked on the Google speech team to make speech and related technologies run entirely on-device. I was part of the team that developed the very-low-power "Hey Google" capabilities across devices, where I developed the first end-to-end system (and the latest iteration of the ML model). I also built the pre-TensorFlow ML inference engine that brought a new generation of speech recognizers, text-to-speech generators, and keyboard technology on-device.

This infrastructure, along with many techniques I implemented, promoted, and refined, such as quantization, sparsity, and the replacement of vanilla LSTMs with the Coupled Input-Forget Gate (CIFG) variant (introduced here), became the foundation of newer infrastructure and production models.

Publications
  • Lico-net: Linearized convolution network for hardware-efficient keyword spotting
  • On the quantization of recurrent neural networks
  • A streaming on-device end-to-end model surpassing server-side conventional model quality and latency
  • End-to-end streaming keyword spotting
  • Optimizing speech recognition for the edge
  • Streaming end-to-end speech recognition for mobile devices
  • Lingvo: a modular and scalable framework for sequence-to-sequence modeling
  • A cascade architecture for keyword spotting on mobile devices
  • On the efficient representation and execution of deep acoustic models
  • Personalized speech recognition on mobile devices
  • Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks
  • Locally-connected and convolutional neural networks for small footprint speaker recognition
  • Compressing deep neural networks using a rank-constrained topology

Patents

US-9767410B1, US-9372675B1, US-9542948B2, US-9842608B2, US-10460735B2, EP-3121809B1, US-20200126537A1, WO-2020092532A1, US-9953216B2

A more complete (and likely more up-to-date) list is available at Google Scholar.

[2003 – 2005] Instituto Tecnológico y de Estudios Superiores de Monterrey
Master's in Computer Science, Artificial Intelligence, Image Processing, Robotics
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup, Research assistant

[1998 – 2003] Instituto Tecnológico y de Estudios Superiores de Monterrey
Bachelor's in Computer Science, Artificial Intelligence
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup