Raziel Alvarez

Software Engineer

Engineering Lead and Architect, Core ML at Apple.

Providing the foundational infrastructure for Apple's Machine Learning research to be deployed across products and devices, including Apple Intelligence.

tl;dr

Over the past 10+ years I've focused on applied research and machine learning infrastructure deployed in products used by billions of users. My contributions have centered on framework portability across devices and on machine learning optimization techniques. Before my current role, I co-authored and architected the "on-device" ML frameworks of TensorFlow and PyTorch, and built the production infrastructure at Google and Meta.

Today I do this as part of the Core ML team at Apple, where I lead the development of state-of-the-art infrastructure to deploy machine learning across Apple's products and devices (playing a key role in the rollout of Apple Intelligence), as well as third-party applications.

Previously I spent 3 years at Facebook as Tech Lead within the PyTorch team, where I championed PyTorch 2.0 technology and founded and led the architecture of PyTorch ExecuTorch, PyTorch's end-to-end solution for enabling on-device inference capabilities across mobile and edge devices.

Before that I spent 8 years at Google developing machine learning frameworks, like TensorFlow, that power Google's and other companies' applications for billions of users. I also participated in research applied to Google’s products, such as the Assistant, where I worked on speech recognition and developed technologies that power "Hey Google". Oh, and in my first few months I worked on the Fonts project, which I use on this site.

Prior to that I spent 7 years at Appian, where I led a number of projects, most significantly co-authoring the SAIL (Self-Assembling Interface Layer) technology which underpins the company’s low-code platform.

Founded PyTorch's ExecuTorch.

An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of various PyTorch models (vision, speech, Generative AI, and more) to edge devices.

Selected work I was involved in:

  • Redefined the strategy for PyTorch's own on-device ML suite: "PyTorch Edge"
  • Created the vision and strategy for ExecuTorch
  • Defined the ExecuTorch technical architecture
  • Defined the relationship and integration of PyTorch Core, Intermediate Representations, the new 2.0 APIs and ExecuTorch
  • Key contributor to the definition of PyTorch's torch.export IR
  • Key contributor to the definition of PyTorch's EXIR dialects
  • Redefined the strategy and architecture of PyTorch's Architecture Optimization and ML optimization around torch.export
  • Worked on the first embedded use of ExecuTorch within Meta's smart glasses

Founded the TensorFlow Model Optimization Toolkit.

A suite of tools for optimizing machine learning models for deployment and execution, via easy-to-use and consistent APIs that implement powerful machine learning optimization techniques.

Selected work I was involved in:

  • Introducing the Model Optimization Toolkit for TensorFlow
  • TensorFlow Model Optimization Toolkit—Pruning API
  • TensorFlow Model Optimization Toolkit—Post-Training Integer Quantization - blog
  • TensorFlow Model Optimization Toolkit—Post-training reduced-precision fp16 quantization
  • TensorFlow Model Optimization Toolkit—Quantization Aware Training API
  • EfficientNet-EdgeTPU: Accelerator-aware neural network design with AutoML
  • Coral summer updates: Post-training quant support, TF Lite delegate, and new models!
  • TensorFlow Model Optimization Toolkit — Weight Clustering API

Co-founded TensorFlow Lite.

Google's open-source deep learning framework for on-device machine learning. It has billions of installs, from mobile phones, smart displays and speakers, to cars and wearables, powering Google's and other companies' products.

Selected work I was involved in:

  • Announcing TensorFlow Lite
  • TensorFlow operation fusion in the TensorFlow Lite converter
  • ML Kit expands into NLP with Language Identification and Smart Reply
  • Higher accuracy on vision models with EfficientNet-Lite
  • Accelerating TensorFlow Lite with XNNPACK Integration

Google On-device speech recognition

I worked on the Google speech team to bring speech and related technologies to work entirely on-device. I was part of the team that developed the very-low-power "Hey Google" capabilities across devices, where I developed the first end-to-end system (and the latest iteration of the ML model). I also built the pre-TensorFlow ML inference engine that brought a new generation of speech recognizers, text-to-speech generators, and keyboard technology on-device.

This infrastructure, along with many techniques I implemented, promoted and refined, became the foundation of newer infrastructure and production models. Those techniques include quantization, sparsity, and the replacement of vanilla LSTMs with the Coupled Input-Forget Gate (CIFG) variant (introduced here).

Publications

  • Lico-net: Linearized convolution network for hardware-efficient keyword spotting
  • On the quantization of recurrent neural networks
  • A streaming on-device end-to-end model surpassing server-side conventional model quality and latency
  • End-to-end streaming keyword spotting
  • Optimizing speech recognition for the edge
  • Streaming end-to-end speech recognition for mobile devices
  • Lingvo: a modular and scalable framework for sequence-to-sequence modeling
  • A cascade architecture for keyword spotting on mobile devices
  • On the efficient representation and execution of deep acoustic models
  • Personalized speech recognition on mobile devices
  • Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks
  • Locally-connected and convolutional neural networks for small footprint speaker recognition
  • Compressing deep neural networks using a rank-constrained topology

Patents

US-9767410B1, US-9372675B1, US-9542948B2, US-9842608B2, US-10460735B2, EP-3121809B1, US-20200126537A1, WO-2020092532A1, US-9953216B2

A more complete (and likely more up-to-date) list is available at Google Scholar.

[2003 – 2005] Instituto Tecnológico y de Estudios Superiores de Monterrey
Master's in Computer Science, Artificial Intelligence, Image Processing, Robotics
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup, Research assistant

[1998 – 2003] Instituto Tecnológico y de Estudios Superiores de Monterrey
Bachelor's in Computer Science, Artificial Intelligence
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup