Trustworthy AI for finance & health

AI/ML engineer & researcher building trustworthy LLM systems (evaluation, grounded generation, and participatory design).

I build trustworthy LLM systems for high-stakes finance and health: grounded generation, measurable evaluation, and participatory design.
  • Finance: 80% faster incident detection
  • Health: 200k records
  • Health: 93% agreement

What I focus on

I build trustworthy AI systems where mistakes are expensive: finance and health. The goal is not “AI adoption” — it is reliable decision support embedded in real workflows.

Focus 01
Finance (investment + market operations)

LLM systems that stay grounded under uncertainty, and monitoring that surfaces failures early.

What I build
  • Grounded RAG for research and memo workflows (claim-first retrieval + citations).
  • Evaluation and monitoring loops: router audits, context capture, and feedback-driven iteration.
  • Workflow-first delivery: clear interfaces, rollback plans, and reliability targets.
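Claim-first grounding can be sketched in a few lines: decompose a draft answer into claims, attach citations to each, and flag anything unsupported rather than emitting it. This is a minimal illustration, not a description of any production system; the keyword-overlap retriever and the `Passage` type are stand-ins for a real dense or hybrid index.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str
    text: str

def retrieve(claim: str, corpus: list[Passage], min_overlap: int = 2) -> list[Passage]:
    """Toy keyword-overlap retrieval; a real system would use a dense/hybrid index."""
    claim_terms = set(claim.lower().split())
    scored = []
    for p in corpus:
        overlap = len(claim_terms & set(p.text.lower().split()))
        if overlap >= min_overlap:
            scored.append((overlap, p))
    return [p for _, p in sorted(scored, key=lambda s: -s[0])]

def ground_claims(claims: list[str], corpus: list[Passage]):
    """Attach citations to each claim; unsupported claims are flagged, not emitted."""
    grounded, unsupported = [], []
    for claim in claims:
        hits = retrieve(claim, corpus)
        if hits:
            grounded.append((claim, [h.doc_id for h in hits[:2]]))
        else:
            unsupported.append(claim)
    return grounded, unsupported
```

The design choice that matters is the failure path: a claim with no supporting passage is surfaced to the workflow as unsupported instead of being paraphrased into the memo.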
Focus 02
Health (research classification + evidence workflows)

High-agreement AI that respects expert time — precision-first systems that know when not to decide.

What I build
  • LLM-assisted classification and triage at scale, with abstention to route ambiguity to experts.
  • Grounding and traceability: evidence-linked outputs, uncertainty handling, and failure analysis.
  • Human-centered design: align system outputs with accountability and clinical/research decision points.
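The abstention pattern above reduces to a simple decision rule: act only when the model is confident, and route the ambiguous band to an expert. A minimal sketch, with illustrative (hypothetical) thresholds rather than tuned operating points:

```python
def triage(prob: float, accept: float = 0.9, reject: float = 0.1) -> str:
    """Precision-first triage with abstention.

    Returns a decision only when the model's probability clears a
    threshold; everything in the ambiguous band is routed to an expert.
    Thresholds here are illustrative, not tuned operating points.
    """
    if prob >= accept:
        return "include"
    if prob <= reject:
        return "exclude"
    return "expert-review"

# Example: only confident cases are automated; the rest respect expert time.
decisions = [triage(p) for p in (0.97, 0.55, 0.03)]
# decisions == ["include", "expert-review", "exclude"]
```

In practice the accept/reject thresholds are set against a labeled validation set so that precision on the automated decisions meets a pre-agreed target, and the abstention rate is tracked as the system's cost to experts.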

Selected projects / case studies

Industry experience (Finance)

Investment intelligence and market monitoring — grounded generation, evaluation, and reliability loops.

See industry

Research experience (Health + Evaluation)

Medical research classification at scale, agent evaluation, governance, and participatory design.

See research

Writing (Takeaways)

Short essays on what makes AI useful in real organisations — problem-first, workflow-first.

Read takeaways

Research interests

  • LLM agent evaluation in mixed-motive and cooperative settings
  • Governance, explainability, and participatory design for agentic systems
  • Grounded generation: claim-first retrieval, citation, and uncertainty handling
  • Quality monitoring: feedback loops, evaluation harnesses, and failure analysis

Contact

Open to: research collaborations on trustworthy AI and speaking engagements on LLM systems in high-stakes finance and health.

Focus
Evaluation
Define failure modes, set targets, and measure reliability in deployment — not just in demos.
Grounded generation
Claim-first retrieval with citations and uncertainty handling for high-stakes decisions.
Participatory AI
Design with the people affected — map workflows, decisions, and accountability.