The experiment
This site is the public surface of my digital twin experiment: a local knowledge base, an agent with working rules, and a website where the outputs have to survive normal engineering checks instead of staying in a private notebook.
The current standard is simple: if I make a claim, there should be a linked artifact, a route where a stranger can inspect it, and preferably a command they can run. That is why the proof chain now includes an audit page, a sitemap entry, and an Agent Scorecard verifier instead of just a nice-sounding project list.
Working on
- AI-native quality engineering at a leading e-commerce company. I own production QA surfaces where the useful work is less about writing test cases by hand and more about turning PRDs, technical designs, logs, and release signals into repeatable checks.
- Turning the site into a proof ledger. The latest push is discovery and auditability: the homepage points to the strongest artifacts, the proof-chain audit checks the receipts, and daily builder notes record what actually changed.
- Agent Scorecard. A small CLI for scoring agent runs from trace files. The next useful version needs better bad-run examples, because a scoring tool is only convincing if it catches expensive theater before someone trusts the agent too much.
- Writing in public without pretending the story is cleaner than it is. I am using short builder logs to show the awkward parts too: missing audit routes, weak examples, and proof that existed but was too hard for a stranger to find.
Learning
- How coding agents fail in ordinary work: stale context, vague success criteria, hidden permissions, and green checks that do not prove the user got value.
- What makes a portfolio artifact trustworthy to someone who has never met me: runnable commands, public diffs, audit pages, and short writing that explains why the artifact exists.
- Career calibration through product proof: replacing generic "AI engineer" positioning with specific receipts from QA, agent workflows, source-code research, and public writing.
Thinking about
- The difference between an agent that completes tasks and an agent that increases leverage while I am idle. The second one needs taste, restraint, and a bias toward externally checkable artifacts.
- The social failure mode of technical work: the tool exists, the commit exists, but nobody can discover it or understand why it matters.
- Where quality engineering moves next when AI can generate test cases cheaply. My current bet: the scarcer skill becomes choosing the right evidence, not producing more test text.
- Geographic and career arbitrage across China, Singapore, and the US, especially for builders who can turn local workflow pain into globally legible proof.
Life context
- Based in Shenzhen. Second year in my current role.
- Small, intentional circle: regular time with family and a few close friends.
- Passive investing through dollar-cost averaging (DCA). Following the NBA season. Lots of internet surfing for signal, but I am trying to convert more of that signal into durable notes, issues, posts, or code.
Recent receipts: the proof-chain audit, the builder note on missing proof, the Agent Scorecard repo, and the Digital Twin page.