technical program manager · ai infrastructure · london

AI systems
teams can trust
in production.

Public engineering record: 28 merged upstream PRs across the AI infrastructure stack — HuggingFace Transformers, Ray Serve, LlamaIndex, LiteLLM, axolotl, OpenAI Agents Python, Continue — plus first-party in-flight contributions to Anthropic and Microsoft codebases. Day job: PM II at Amazon, leading production ML and Gen AI programs across European operations (internal hackathon Most Innovative 2026, 9-figure annualized impact). The AI-infrastructure work — inference, agents, post-training & fine-tuning, evaluation, alignment — happens in the OSS contributions and side products; the operations work is where I learned how to run programs at network scale. HKU law + business, St. Gallen Master's magna cum laude, three continents, six natural languages.

amazon · accenture · pwc · 3 continents · english · 中文 · 粵語 · español · français · português
01 — selected work

Production AI work, cross-system program delivery, and a published eval benchmark.

amazon · public
▲ Most Innovative · 2026 · internal AI hackathon
▸ Agentic AI for operations planning
▸ >1,400 hrs / year of manual work eliminated
▸ LLM intake + deterministic prep + async AWS orchestration

Agentic AI for operations planning

Won Most Innovative Solution at an internal Amazon AI hackathon in 2026. Built an agentic system that automates experiment setup and analysis: LLM-assisted intake layered over schema-disciplined, deterministic preprocessing of 1–5 GB experiment bundles, with async AWS-native orchestration. It eliminates more than 1,400 hours per year of recurring manual work and is live in production.
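The general pattern described above (validate the LLM's structured output against a strict schema, then hand it to deterministic preprocessing that runs asynchronously) can be sketched roughly as follows. This is an illustration under stated assumptions, not the production system: the field names, the allowed metrics, the `s3://` check, and the `preprocess` placeholder are all hypothetical, and a standard-library dataclass stands in for the Pydantic model the project tags mention.

```python
import asyncio
import json
from dataclasses import dataclass

# Hypothetical whitelist; the real schema is not public.
ALLOWED_METRICS = frozenset({"latency", "throughput", "cost"})


@dataclass(frozen=True)
class ExperimentRequest:
    """Hypothetical intake schema (stand-in for a Pydantic model)."""
    experiment_id: str
    metric: str
    bundle_uri: str

    def __post_init__(self) -> None:
        # Reject anything that is not a non-empty string.
        for name in ("experiment_id", "metric", "bundle_uri"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string")
        if self.metric not in ALLOWED_METRICS:
            raise ValueError(f"unknown metric: {self.metric}")
        if not self.bundle_uri.startswith("s3://"):
            raise ValueError("bundle_uri must be an s3:// URI")


def parse_intake(llm_output: str) -> ExperimentRequest:
    """Validate the LLM's JSON output before any downstream step runs."""
    payload = json.loads(llm_output)  # JSONDecodeError is a ValueError
    if not isinstance(payload, dict):
        raise ValueError("intake must be a JSON object")
    try:
        return ExperimentRequest(**payload)
    except TypeError as exc:  # missing or unexpected fields
        raise ValueError(f"intake does not match schema: {exc}") from exc


async def preprocess(req: ExperimentRequest) -> dict:
    """Placeholder for the deterministic preparation step."""
    await asyncio.sleep(0)  # yields control; real work would be I/O-bound
    return {"experiment_id": req.experiment_id,
            "metric": req.metric,
            "status": "prepared"}


async def run_batch(raw_outputs: list[str]) -> list[dict]:
    """Validate every request first, then preprocess them concurrently."""
    requests = [parse_intake(raw) for raw in raw_outputs]
    return await asyncio.gather(*(preprocess(r) for r in requests))
```

The design point is that the LLM only proposes a request; nothing nondeterministic runs past `parse_intake`, so a malformed or out-of-policy request fails fast with a `ValueError` instead of reaching the data pipeline.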

gen ai agentic pydantic aws-native in production
Production ML · cross-system delivery · 9-figure annualized (modeled, multi-year)
▸ continent-scale network · 50+ inputs · 30+ metrics
▸ BRDs · launches · eval loops · live-metric validation
▸ multi-system integration · rollout · audit · instrumentation
▸ millions of automated decisions / week
▲ amazon · pm II · london · since 2022

Production ML programs · cross-system delivery

Owner of the BRDs, launches, and simulation + eval loops behind production ML decision systems running on a continent-scale network of hundreds of nodes. Modeled across 50+ dynamic inputs and 30+ operational metrics; millions of automated decisions per week; multi-system integration across planning, audit, and instrumentation pipelines. Two production launches in 2025 produced record reliability gains, validated on live metrics, not backtests. 9-figure annualized impact (modeled, multi-year).

ml policy simulation multi-system live metrics
AI eval · published benchmark · LLM Judge Calibrator
▸ 6-model benchmark · Cohen's kappa
▸ position-bias + verbosity analysis
▸ inter-rater consistency metrics
▲ public results · github

LLM Judge Calibrator

Published 6-model judge-reliability benchmark with Cohen's kappa, position-bias detection, and verbosity analysis. Turns evaluator reliability into a quantity teams can measure before trusting it in model-selection or regression decisions. Public results in the repo.
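For readers unfamiliar with the metrics, here is a minimal sketch of two of them, written from the textbook definitions rather than taken from the repo. Cohen's kappa is (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each rater's label frequencies. The position check assumes a pairwise judge that answers "first" or "second"; the function names and that label convention are assumptions for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(a)
    # Observed agreement: share of items where both raters gave the same label.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is formally
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def position_inconsistency(original: Sequence[str], swapped: Sequence[str]) -> float:
    """Share of pairs where the judge picks the same slot after the two
    answers are swapped. A position-independent judge should flip its
    slot choice every time, so any repeated slot signals position bias."""
    if len(original) != len(swapped) or not original:
        raise ValueError("verdict sequences must be non-empty and equal length")
    same_slot = sum(o == s for o, s in zip(original, swapped))
    return same_slot / len(original)
```

As a usage example, `cohens_kappa(judge_labels, human_labels)` compares a model judge against a reference rater, and `position_inconsistency` is computed by running each pairwise comparison twice with the answer order reversed.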

eval judge calibration benchmark
02 — open source

Open-source contributions across the AI infrastructure stack.

28 merged upstream PRs across the AI infrastructure stack — LLM serving (HuggingFace Transformers, LiteLLM), compute & orchestration (Ray Serve), post-training & fine-tuning (axolotl), RAG & document AI (LlamaIndex, Docling), agents & eval (OpenAI Agents Python, Continue, Mastra). 40 in flight including first-party contributions to Anthropic (Cookbooks, Go SDK), Microsoft (Semantic Kernel), and Vercel (AI SDK security disclosure), plus active work in CrewAI, Chroma, and the agent-tooling stack.

28 merged · 24 mo
40 in flight
1.11M stars across repos
03 — side projects

Other things I've built on the side.

public urls · in development
04 — background

Three continents, six languages, one operating thesis.

education · work · footprint

I started in Hong Kong with dual LLB + BBA degrees at HKU and hold a magna cum laude master's in Strategy & International Management from St. Gallen, with a GenAI thesis on LLMs, agents, alignment, and governance. Along the way I went on exchange in Bogotá and Santiago and led an NGO in Guatemala. I consulted out of Beijing with Accenture (market-entry strategy, Oracle ERP migration, ML revenue modeling) and out of Shanghai with PwC (SAP system integration), and joined Amazon as a Business Analyst in Luxembourg before rotating to the L5 PM II role in London in 2024.

The legal training shapes how I build AI: not as unconstrained generation engines, but as systems operating inside explicit rules, authorities, and validation layers. The international footprint shapes the operator habit: most of a program is translation — of constraints between teams, of intent between locales, of model behavior between markets.

AWS Certified AI Practitioner (2025) · AWS Certified Cloud Practitioner (2022) · 3× AWS-certified stack

leadership · cross-functional programs · DEI

  • Cross-functional program ownership — coordinated production-ML rollouts across PM, SDE, ML research, and operations teams spanning an EU-wide footprint. Owned the BRDs, the launch artifacts, and the eval loops. Managed dependencies across 5+ stakeholder teams on each launch.
  • Mentorship track record — every intern and direct mentee I have mentored at Amazon to date has received a return offer. Coached them on both technical execution and PM craft.
  • Cross-cultural delivery — operated and shipped across 3 continents and 6 natural languages: consulting tenure in Beijing (Accenture: market-entry strategy, Oracle ERP migration, ML revenue modeling) and Shanghai (PwC: SAP integration), master's-era programs in Latin America, current EU operations leadership.
  • ERG involvement — active in Glamazon and the Asian & Latino affinity groups at Amazon. Diversity is a leadership lens, not a checkbox.
  • NGO leadership — led Niños de Xela for one year. Sustainable agricultural-product diversification and long-term business-model design with underprivileged indigenous women in Quetzaltenango, Guatemala. Main speaker and point of contact during in-country fieldwork; organized fundraising events.

master's thesis · st. gallen · 2024

A systems view of generative AI in no-code / low-code development — built around LLMs, agents, alignment, and governance rather than prompt theater.
Supervisor: Dr. Edona Elshan · graded magna cum laude
50K+ Reddit entries · chained pipeline
3 pillars · LLMs · agents · governance
Pydantic agent coordination layer
working thesis

The next decade of AI is won by the teams that make capability claims honest. Capabilities are the input. The eval loop is the product.
europe
  • 🇬🇧 London, UK
    2024 — present
  • 🇱🇺 Luxembourg, LU
    amazon · business analyst
  • 🇨🇭 St. Gallen, CH
    master's · magna cum laude
asia
  • 🇭🇰 Hong Kong, HK
    hku · llb + bba
  • 🇨🇳 Beijing, CN
    accenture · consultant
  • 🇨🇳 Shanghai, CN
    pwc · consultant · sap
latin america
  • 🇨🇴 Bogotá, CO
    u. de los andes · master exchange
  • 🇬🇹 Quetzaltenango, GT
    niños de xela · 1-yr ngo lead
  • 🇨🇱 Santiago, CL
    u. católica · undergrad exchange
🇬🇧 London, UK
2024 — present · pm II

Amazon PM II. Production ML & Gen AI programs across European operations. Active OSS contributor across the AI infrastructure stack, including direct in-flight contributions to Anthropic Cookbooks and the Anthropic Go SDK.

🇱🇺 Luxembourg, LU
amazon · business analyst

Amazon Business Analyst. Analytics, automation, and cloud-planning work across EU operations finance. Foundation for the later return to Amazon at PM II.

🇨🇭 St. Gallen, CH
master's · magna cum laude

University of St. Gallen — MA Strategy & International Management. GenAI thesis on LLMs, agents, alignment, and governance: chained pipeline over 50K+ Reddit entries with Pydantic-coordinated agents.

🇭🇰 Hong Kong, HK
hku · llb + bba

The University of Hong Kong — dual LLB + BBA. Scholarships and debating-society leadership. Legal training that shapes how I build AI systems: explicit rules, authorities, validation layers.

🇨🇳 Beijing, CN
accenture · consultant

Accenture Consultant. Three engagement areas: China market-entry strategy for global liquor clients; Oracle ERP integration and migration for a major electronics manufacturer; revenue-strategy modeling using ML. Cross-cultural delivery at large-org scale.

🇨🇳 Shanghai, CN
pwc · consultant · sap

PwC Consultant. SAP system integration and migration across Greater China clients. Trilingual delivery in English / 中文 / 粵語.

🇨🇴 Bogotá, CO
u. de los andes · master exchange

Universidad de los Andes — master's exchange. Latin American economy, accounting rules, and business context. Deepened my Spanish and Portuguese operating range.

🇬🇹 Quetzaltenango, GT
niños de xela · 1-yr ngo lead

Year-long lead of Niños de Xela — sustainable agricultural-product diversification and long-term business-model design with underprivileged indigenous women in Quetzaltenango. Main speaker and point of contact during in-country fieldwork; organized fundraising events.

🇨🇱 Santiago, CL
u. católica · undergrad exchange

Pontificia Universidad Católica de Chile — undergrad exchange. Sharpened cross-cultural operating range and Latin American context for early business-strategy work.

05 — contact