Raziel Alvarez

Software Engineer

Technical Lead, PyTorch at Facebook.

Particularly focused on framework portability across devices, as well as Machine Learning optimization techniques.


I’m a pragmatic engineer in constant search for the balance between beautiful, universal solutions and impact in the real world.

Today I do this as part of the PyTorch team at Facebook, developing a machine learning framework that powers state-of-the-art research and helps bring that research to products across an ever-increasing number of fields.

Previously, I spent 8 years at Google, also developing machine learning frameworks, like TensorFlow, that powered applications from Google and other companies used by billions of people. I also participated in research applied to Google’s products, such as the Assistant, where I worked in Speech Recognition and developed technologies that power "Hey Google". Oh, and for my first few months I worked on the Fonts project that I use on this site.

Prior to that, I spent 7 years at Appian, where I led a number of projects, most significantly co-authoring the SAIL (Self-Assembling Interface Layer) technology, which underpins the company’s low-code platform.

Founded the TensorFlow Model Optimization Toolkit.

A suite of tools for optimizing machine learning models for deployment and execution, via easy-to-use and consistent APIs that implement powerful machine learning optimization techniques.

Selected work I was involved in:

  • Introducing the Model Optimization Toolkit for TensorFlow
  • TensorFlow Model Optimization Toolkit—Pruning API
  • TensorFlow Model Optimization Toolkit—Post-Training Integer Quantization - blog
  • TensorFlow Model Optimization Toolkit—Post-training reduced-precision fp16 quantization
  • TensorFlow Model Optimization Toolkit—Quantization Aware Training API
  • EfficientNet-EdgeTPU: Accelerator-aware neural network design with AutoML
  • Coral summer updates: Post-training quant support, TF Lite delegate, and new models!
  • TensorFlow Model Optimization Toolkit — Weight Clustering API

Co-Founded TensorFlow Lite.

Google's open-source deep learning framework for on-device machine learning. It has billions of installs, from mobile phones, smart displays and speakers, to cars and wearables, powering Google's and other companies' products.

Selected work I was involved in:

  • Announcing TensorFlow Lite
  • TensorFlow operation fusion in the TensorFlow Lite converter
  • ML Kit expands into NLP with Language Identification and Smart Reply
  • Higher accuracy on vision models with EfficientNet-Lite
  • Accelerating TensorFlow Lite with XNNPACK Integration

Google On-device speech recognition

I worked on the Google speech team to make speech and related technologies work entirely on-device. I was part of the team that developed the very-low-power "Hey Google" capabilities across devices, where I developed the first end-to-end system (and the latest iteration of the ML model). I also built the pre-TensorFlow ML inference engine that brought a new generation of speech recognizers, text-to-speech generators, and keyboard technology on-device.

This infrastructure, along with many techniques I implemented, promoted, and refined, such as quantization, sparsity, and the replacement of vanilla LSTMs with the Coupled Input-Forget Gate (CIFG) variant (introduced here), became the foundation of newer infrastructure and production models.

  • On the quantization of recurrent neural networks
  • A streaming on-device end-to-end model surpassing server-side conventional model quality and latency
  • End-to-end streaming keyword spotting
  • Optimizing speech recognition for the edge
  • Streaming end-to-end speech recognition for mobile devices
  • Lingvo: a modular and scalable framework for sequence-to-sequence modeling
  • A cascade architecture for keyword spotting on mobile devices
  • On the efficient representation and execution of deep acoustic models
  • Personalized speech recognition on mobile devices
  • Automatic gain control and multi-style training for robust small-footprint keyword spotting with deep neural networks
  • Locally-connected and convolutional neural networks for small footprint speaker recognition
  • Compressing deep neural networks using a rank-constrained topology

US-9767410B1, US-9372675B1, US-9542948B2, US-9842608B2, US-10460735B2, EP-3121809B1, US-20200126537A1, WO-2020092532A1, US-9953216B2

A more complete (and likely more up-to-date) list is available at Google Scholar.

[2003 – 2005] Instituto Tecnológico y de Estudios Superiores de Monterrey
Master in Computer Science, Artificial Intelligence, Image Processing, Robotics
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup, Research assistant

[1998 – 2003] Instituto Tecnológico y de Estudios Superiores de Monterrey
Bachelor in Computer Science, Artificial Intelligence
Summa Cum Laude
Activities and Societies: ACM Collegiate Programming Contest 🎈, RoboCup World Cup