Projects

AI products I've shipped 0→1 — from problem discovery to production. Company work is summarized for confidentiality; metrics are real, client names and internals are not shown. School work includes deeper technical detail.

Kim Shi Tong · AI Product Manager
Multi-Agent Chatbot for Cloud Infrastructure
Product Manager · Tencent · Dec 2025 – May 2026
166 → 5 min · SRE diagnosis time (97%↓)
75% · Scope cut to ship
50% · Faster pilot launch
Problem Brief
Site reliability engineers spent hours manually tracing errors across distributed cloud systems — a bottleneck confirmed across 10+ enterprise customer visits. The diagnosis process was manual, slow, and dependent on tribal knowledge.
Solution Brief
A multi-agent AI assistant for root-cause diagnosis, unifying 10+ agent entry points behind a single chat interface with progressive context loading. Shipped concept → pilot under a tight deadline by cutting scope to the top 3 validated pain points.
Scope — What I Owned
  • 0→1 product definition as one of two PMs; secured client buy-in before any build, validated via customer discovery + competitive benchmarking across 8+ AIOps products.
  • Defined the multi-agent architecture and made the prioritization call to cut 75% of planned scope, which shipped the pilot 50% faster.
  • Coordinated cross-dependencies across 5 private-cloud teams; replaced long spec reviews with rapid prototyping to compress alignment.
  • Shipped a self-evolution mechanism (guided authoring, LLM-judge scoring) so non-technical SREs could improve agent accuracy independently.
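As a purely illustrative sketch (the real architecture is confidential and not reproduced here), a single chat entry point with progressive context loading might look like the following. The agent names, keywords, and context tiers are all hypothetical, and keyword matching stands in for whatever routing the production system uses.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical specialist agent sitting behind the single chat entry point."""
    name: str
    keywords: tuple[str, ...]
    # Context tiers ordered from cheapest to most expensive to load (assumed).
    context_tiers: tuple[str, ...] = ("summary", "recent_logs", "full_trace")


# Invented agents for illustration only.
AGENTS = [
    Agent("network", ("latency", "packet", "dns")),
    Agent("storage", ("disk", "volume", "iops")),
    Agent("compute", ("cpu", "memory", "oom")),
]


def route(message: str, agents: list[Agent] = AGENTS) -> Agent | None:
    """Pick the agent whose keywords appear most often in the message."""
    words = message.lower().split()
    scored = [(sum(w in a.keywords for w in words), a) for a in agents]
    best_score, best_agent = max(scored, key=lambda pair: pair[0])
    return best_agent if best_score > 0 else None


def load_context(agent: Agent, depth: int) -> list[str]:
    """Return only the first `depth` tiers, so costly context loads last."""
    return list(agent.context_tiers[:max(0, depth)])


agent = route("High DNS latency on the gateway")
print(agent.name, load_context(agent, depth=1))
```

The point of the tiers is that a conversation starts with the cheapest context and only requests deeper tiers when the first pass is not enough.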
Client identity and internal architecture omitted for confidentiality. Metrics are pilot-trial results.
AI Article Generator + IoT Dashboard
Technical Product Lead · Smart Air (B-Corp, 17+ countries) · Jan 2024 – Dec 2025
80% · Faster content production
300 → 1,800 · Cities covered
$1M+ · Contract value
67% · Client energy savings
Problem Brief
A lean B2B startup couldn't scale content or product surface area: the marketing team spent 40% of its time on manual article updates, and B2B clients had unmet needs (mould-risk detection, system automation) no competitor addressed.
Solution Brief
A 0→1 RAG article generator with hallucination guardrails, plus client-driven IoT dashboard features. Built the content pipeline and guardrails myself before mature AI tooling existed, then scaled the product portfolio as sole technical product lead.
Scope — What I Owned
  • Scaled the portfolio from 5 to 8 software products for 20+ B2B clients (1M+ monthly users) as the sole technical product lead.
  • Shipped the 0→1 RAG generator: built scrapers, pipeline, and double-check verification + inline citation guardrails for production quality.
  • Turned B2B customer discovery into shipped features — mould-risk detection and on/off automation — driving adoption and contract value.
  • Took over full-stack technical ownership after the tech lead left — managed frontend, backend, VMs, DNS, and on-call for 8 products.
  • Rebuilt infrastructure: migrated from systemd to Docker + Traefik, set up Restic backup pipelines, hardened with Cloudflare WAF + fail2ban.
  • Raised production uptime from ~80% to 99% through monitoring, alerting, and incident response.
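The inline-citation and double-check guardrails mentioned above can be approximated with a small post-generation check. This is a minimal sketch under assumptions, not the production pipeline: the `[n]` citation format, the `check_citations` helper, and the rule that every number in a sentence must also appear in a cited source are all hypothetical choices made for illustration.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")        # assumed inline format: "... claim [1]."
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def check_citations(article: str, sources: dict[int, str]) -> list[str]:
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    # Naive sentence split: ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", article.strip()) if s]
    for sentence in sentences:
        ids = [int(i) for i in CITATION.findall(sentence)]
        if not ids:
            problems.append(f"uncited: {sentence}")
            continue
        known = [i for i in ids if i in sources]
        for missing in sorted(set(ids) - set(known)):
            problems.append(f"unknown source [{missing}]: {sentence}")
        # Second pass: every number in the sentence must occur in a cited source.
        supported = set()
        for i in known:
            supported.update(NUMBER.findall(sources[i]))
        claim_text = CITATION.sub("", sentence)
        for number in NUMBER.findall(claim_text):
            if known and number not in supported:
                problems.append(f"unsupported number {number}: {sentence}")
    return problems


sources = {1: "The city recorded 35 unhealthy days in 2023."}
draft = (
    "The city had 35 unhealthy days in 2023 [1]. "
    "Levels doubled last year. "
    "Purifiers cut exposure by 90 percent [1]."
)
for problem in check_citations(draft, sources):
    print(problem)
```

A draft that fails any check would be sent back for regeneration or human review rather than published.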
Client names and proprietary internals omitted for confidentiality.
Agent-Native Business Simulation Platform
Product Engineer · NUS ISEM (Faculty-Led Project) · Oct 2025 – May 2026
2 · Game engines shipped
MCP · Agent-native protocol
Problem Brief
Traditional business simulations required web UIs that added friction for AI agent interaction. Faculty needed a platform where students' AI agents could compete directly in economic games without navigating browser interfaces.
Solution Brief
An agent-native simulation platform built on MCP (Model Context Protocol), enabling AI agents to interact with game engines through tool calls rather than web interfaces. Shipped with pluggable game engines.
Scope — What I Owned
  • Built FastAPI backend with pluggable game engine architecture — new games require only 3 files (params, actions, engine).
  • Integrated MCP for native AI agent interaction; agents call tools directly instead of parsing web pages.
  • Shipped 2 game engines: Retailer (pricing strategy under uncertainty) and 2048 (grid puzzle with stochastic spawns).
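The three-file convention (params, actions, engine) could look roughly like this, collapsed into one listing. It is a sketch under assumptions: the `register` decorator, the `ENGINES` registry, the `RetailerParams` fields, and the linear demand curve are hypothetical, and demand is deterministic here so the result is easy to check, whereas the real Retailer game involves uncertainty.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Protocol


class GameEngine(Protocol):
    """Interface every pluggable engine is assumed to implement."""
    def reset(self) -> dict[str, Any]: ...
    def step(self, action: dict[str, Any]) -> dict[str, Any]: ...


ENGINES: dict[str, type] = {}  # hypothetical registry: game name -> engine class


def register(name: str):
    """Class decorator that adds an engine to the registry."""
    def decorator(cls):
        ENGINES[name] = cls
        return cls
    return decorator


@dataclass
class RetailerParams:  # would live in params.py
    unit_cost: float = 4.0
    base_demand: float = 100.0
    price_sensitivity: float = 8.0


def parse_price(action: dict[str, Any]) -> float:  # would live in actions.py
    price = float(action["price"])
    if price < 0:
        raise ValueError("price must be non-negative")
    return price


@register("retailer")
class RetailerEngine:  # would live in engine.py
    def __init__(self, params: RetailerParams | None = None):
        self.params = params or RetailerParams()
        self.profit = 0.0

    def reset(self) -> dict[str, Any]:
        self.profit = 0.0
        return {"profit": self.profit}

    def step(self, action: dict[str, Any]) -> dict[str, Any]:
        price = parse_price(action)
        p = self.params
        # Assumed linear demand curve, floored at zero.
        demand = max(0.0, p.base_demand - p.price_sensitivity * price)
        self.profit += (price - p.unit_cost) * demand
        return {"demand": demand, "profit": self.profit}


engine = ENGINES["retailer"]()
engine.reset()
print(engine.step({"price": 10}))
```

With this shape, the router only needs the registry key to instantiate a game, which is what keeps adding a new game down to a few files.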
Architecture — Agent-Native Game Platform
AI Agent (OpenClaw / Claude) → MCP Server (tool calls) → FastAPI (game router) → Game Engine (pluggable: Retailer, 2048)
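To illustrate the tool-call hop in this flow, here is a simplified stand-in for how a server might dispatch an agent's call to a game function. It is not the MCP wire format (MCP itself is built on JSON-RPC 2.0), and the `tool` decorator, the tool names, and the in-memory state are hypothetical.

```python
import json
from typing import Any, Callable

TOOLS: dict[str, Callable[..., Any]] = {}  # hypothetical tool registry


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Expose a function to agents under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


_state = {"round": 0}  # toy in-memory game state


@tool
def get_state() -> dict:
    return dict(_state)


@tool
def submit_action(price: float) -> dict:
    _state["round"] += 1
    return {"round": _state["round"], "accepted_price": price}


def handle_tool_call(raw: str) -> str:
    """Dispatch one JSON tool call and return a JSON result string."""
    call = json.loads(raw)
    fn = TOOLS.get(call.get("name"))
    if fn is None:
        return json.dumps({"error": f"unknown tool: {call.get('name')}"})
    try:
        result = fn(**call.get("arguments", {}))
    except TypeError as exc:  # wrong or missing arguments
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})


print(handle_tool_call('{"name": "submit_action", "arguments": {"price": 9.5}}'))
```

The agent never sees HTML: it sends a structured call and gets structured state back, which is the friction the web UI used to add.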
AI Grading System
AI Product Engineer · NUS Computing · May 2025 – Dec 2025
70% · Grading time reduced
82% · Accuracy (w/ human review)
0 · Eng dependency for users
Problem Brief
Teaching assistants for a core module spent excessive time grading 6,600+ submissions a year across 600 students. The goal was to reduce grading time without removing human judgment.
Solution Brief
A RAG grading pipeline with a human-in-the-loop review-and-override workflow. The first version was over-scoped and needed an engineer for every tuning change; I made the call to rebuild around a simple prompt-engineering dashboard that non-technical professors and TAs could operate themselves.
Scope — What I Owned
  • Led 0→1 product definition and the RAG pipeline build with a small team.
  • Made the hard call to rebuild after months of work once the first version proved unusable for non-technical staff.
  • Designed the key trade-off: 82% accuracy + TA override, rather than chasing 95% — because the goal was saving time, not replacing TAs.
Architecture — Human-in-the-Loop Grading Pipeline
Submission (600 students) → RAG Retrieval (rubric + context) → LLM Grader (82% accuracy) → TA Review (override + trust) → Final Grade
Prompt-Engineering Dashboard (non-technical staff tune criteria, no engineer needed) → configures the LLM Grader
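The review-and-override step can be sketched as a small data model in which a TA score, when present, always wins over the LLM score. The `GradedSubmission` class and `acceptance_rate` helper are hypothetical, and the rate computed here is only one possible way to track TA agreement, not necessarily how the 82% figure was measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GradedSubmission:
    student_id: str
    llm_score: float
    ta_score: Optional[float] = None  # set only when a TA overrides the LLM

    @property
    def final_score(self) -> float:
        """The TA override takes precedence over the LLM score."""
        return self.llm_score if self.ta_score is None else self.ta_score


def acceptance_rate(items: list[GradedSubmission]) -> float:
    """Share of LLM scores that TAs left unchanged."""
    if not items:
        return 0.0
    kept = sum(1 for s in items if s.ta_score is None or s.ta_score == s.llm_score)
    return kept / len(items)


batch = [
    GradedSubmission("a", 8.0),
    GradedSubmission("b", 6.0, ta_score=7.5),  # TA disagreed and overrode
    GradedSubmission("c", 9.0),
    GradedSubmission("d", 5.0),
]
print([s.final_score for s in batch], acceptance_rate(batch))
```

Keeping both scores on the record, rather than overwriting the LLM score, is what lets staff see where the grader drifts and adjust criteria from the dashboard.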
AI Data Processing App
Software Engineer · Trivi Data (Data Consultancy) · May 2023 – Aug 2023
Automated data cleaning for consultants, scoped through user interviews before any code was written.
Problem Brief
Data consultants spent hours on manual data cleaning — handling missing values and duplicates by hand — a bottleneck surfaced through direct user interviews.
Solution Brief
An AI-assisted data processing app that automated the most repetitive cleaning steps, identified and prioritized after interviewing the consultants who lived the problem daily.
Scope — What I Owned
  • Ran user interviews with data consultants to locate the real bottleneck before writing code.
  • Built the data-cleaning automation that handled missing values and duplicate detection.
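The two cleaning steps named above, duplicate detection and missing-value handling, can be sketched with the standard library alone. This is an illustration under assumptions rather than the app's actual logic: `clean_rows` is a hypothetical helper, duplicates are exact row matches, and missing numeric values are filled with the column median.

```python
from statistics import median


def clean_rows(rows: list[dict], numeric_cols: list[str]) -> list[dict]:
    """Drop exact duplicate rows, then fill missing numeric values."""
    seen = set()
    unique = []
    for row in rows:
        # Order-independent fingerprint of the row's contents.
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    for col in numeric_cols:
        present = [r[col] for r in unique if r.get(col) is not None]
        if not present:
            continue  # nothing to estimate a fill value from
        fill = median(present)
        for r in unique:
            if r.get(col) is None:
                r[col] = fill
    return unique


rows = [
    {"id": 1, "revenue": 10.0},
    {"id": 1, "revenue": 10.0},   # exact duplicate
    {"id": 2, "revenue": None},   # missing value
    {"id": 3, "revenue": 30.0},
]
for row in clean_rows(rows, numeric_cols=["revenue"]):
    print(row)
```

In practice a dataframe library would likely handle both steps more compactly; the plain-Python version just makes each decision explicit.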
Early-career engineering role. Client details omitted for confidentiality.