SWE/MLE · ML Systems

Felipe Felix Arias

SWE/MLE on Uber’s ML Training team. I build the systems that run large GPU training workloads and keep them reliable, efficient, and cost-aware.

Founder of Marovi AI, a wiki-style translation platform that makes research and learning materials accessible across languages. Former NSF Graduate Research Fellow at UIUC.

MS CS, UIUC (2023) · BS CS Honors (2019) · NSF GRF (2021)
ML Infrastructure · GPU Training · Kubernetes · Ray · Agentic Solutions · Python · Go

Bay Area based · Fluent in English and Spanish

Impact at a Glance

Signals from production systems, research, and Marovi AI.

0

Downtime migrating the core ML scheduler to regional federated compute across thousands of pipelines.

3.7%

Reduction in unsafe behavior from a multi-signal distracted-driving system.

Top 10% x2

Y Combinator selections for Marovi AI.

5x

Faster multi-agent motion planning via spatial regression models.

Technical Focus

Systems areas I work in.

Scalable Training Systems

Infrastructure for large GPU fleets: reliability, throughput, and orchestration.

GenAI Systems

Production GenAI: assistants, safety tooling, and evaluation loops.

Machine & Agentic Translation

Translation infrastructure for research and learning materials across languages.

Fine-Tuning

Fine-tuning strategy, partnerships, and production rollout.

Experience Highlights

Selected work across ML infrastructure, GenAI, and translation.

Uber — ML Training Team

Software Engineer · San Francisco · Mar 2024 — Present

  • Build job controllers and GPU training infrastructure; led the move from zonal to regional federated compute with zero downtime.
  • Co-designed Ray job submission across federated Kubernetes clusters; shipped AI safety systems and internal assistants for incident response.
  • Lead the mTLS rollout for batch workloads and serve as the primary point of contact for LLM fine-tuning partnerships.

Marovi AI

Founder · San Francisco · 2024 — Present

  • Built Marovi AI Wiki, a translation platform that makes research and learning materials accessible across languages.
  • Designed correction loops that blend user edits, reviewer votes, and canonical translations; selected top 10% at Y Combinator twice.
  • Implemented cost controls and enterprise-ready deployment options.

Research — UIUC Parasol Lab

NSF GRF · 2019 — 2023

  • Built spatial regression models that improved multi-agent motion planning speed and collision avoidance.
  • Released tools for indoor multi-agent navigation and dataset analysis.
  • Master’s thesis on motion pattern prediction; earlier work at Uber Search (2023) and Stanford Hazy Research (2018–2019).

Selected Projects

A few representative projects.

Video as a Sensor for Urban Risk

With Daniel Carmody, Richard Sowers, Jayati Singh, Kevin So

Weak supervision and behavioral heuristics surface risky roadway events when labeled data is scarce.

Publications

Selected academic work.

Now

Focused on ML training infrastructure at Uber, plus safety tooling and fine-tuning partnerships.

Outside Work

I make music and shoot photos/video. Highlights on Instagram.