Andrew Ji · New Era Scientist
An Archive, 2024–2026

Selected Work

Selected Works · On the Road
Six Entries: Research · Applied · Data
01 Research · 1st Author

Physics-aware Causal Deep Learning Attribution

A block-based framework for attributing extreme precipitation under counterfactual warming.

Designed and trained CNN-based deep learning models to perform causal inference and attribution in complex multivariate climate systems, achieving R² > 0.96 on counterfactual precipitation regression. The work asks not just what the model predicts, but which physical structures drive extreme precipitation under warming.

Diagnosed structural limitations in an existing state-of-the-art Stanford framework, redesigned the architecture around a block-based attribution scheme, and improved both robustness and interpretability. The result is a framework where attribution is a property of the design, not a post-hoc explanation.

What I built

  • Block-based CNN architecture
  • Counterfactual training pipeline
  • Climate ensemble preprocessing
  • Attribution diagnostics
  • PyTorch

Outcome

  • R² > 0.96 on held-out evaluation
  • 1st-author manuscript in preparation
  • Framework adopted within lab
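The occlusion idea behind block-based attribution can be sketched in a few lines: hide one block of inputs at a time and measure how much predictive skill (R²) the model loses. Everything below is a toy stand-in, not the actual framework: the "model" is a linear map rather than the trained CNN, the data is synthetic, and `block_attribution` is an illustrative name.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def block_attribution(model, X, y, n_blocks):
    """Attribute predictive skill to contiguous input blocks by
    occluding each block and measuring the resulting drop in R^2."""
    base = r2_score(y, model(X))
    scores = []
    for idx in np.array_split(np.arange(X.shape[1]), n_blocks):
        X_occ = X.copy()
        # Replace the block with its column means (a crude "climatology").
        X_occ[:, idx] = X[:, idx].mean(axis=0)
        scores.append(base - r2_score(y, model(X_occ)))
    return np.array(scores)

# Toy demo: the target depends only on the first half of the features,
# so occluding block 0 should cost skill and block 1 should cost none.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
w = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ w
model = lambda X: X @ w
scores = block_attribution(model, X, y, n_blocks=2)
```

The point of the design is visible even in the toy: the attribution comes from the structure of the evaluation, not from a post-hoc explainer bolted onto the predictions.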
Attribution should be a property of the model, not an apology added afterward.
Entry 01 of 06
02 Research · Applied

ML-based ESG Investment Return Attribution

Decomposing what ESG signals actually do to long-term, risk-adjusted performance.

A new framework for decomposing and causally attributing ESG-related signals in asset returns — separating the structural drivers of performance from market noise and from confounded factor exposures. The motivation is simple: ESG investing is over-claimed and under-explained, and most published return decompositions confuse correlation with cause.

Built a custom causal attribution system that evaluates how non-financial signals affect long-term, risk-adjusted performance under controlled conditions. Designed to be auditable rather than impressive — the goal is a number you can defend, not one you can't reproduce.

What I built

  • Causal attribution layer
  • Factor decomposition
  • Time-series construction
  • Feature attribution diagnostics

Why it matters

  • Distinguishes structural drivers from noise
  • Auditable across time horizons
  • Transferable across portfolios
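One way to picture the decomposition step: regress returns on known factor exposures, and only what survives that regression is a candidate for an ESG-driven effect. This is a deliberately simplified OLS stand-in for the causal layer; `decompose_returns`, the three-factor setup, and all numbers are invented for illustration.

```python
import numpy as np

def decompose_returns(returns, factors):
    """Split a return series into a factor-explained part and a residual
    via ordinary least squares with an intercept."""
    X = np.column_stack([np.ones(len(returns)), factors])
    beta, *_ = np.linalg.lstsq(X, returns, rcond=None)
    explained = X @ beta
    residual = returns - explained
    return beta, explained, residual

# Synthetic returns driven by three factors (say market, size, value)
# plus noise; the fit should recover the loadings, not the noise.
rng = np.random.default_rng(1)
factors = rng.normal(size=(250, 3))
true_beta = np.array([0.8, 0.2, -0.1])
returns = 0.01 + factors @ true_beta + 0.05 * rng.normal(size=250)
beta, explained, residual = decompose_returns(returns, factors)
```

Auditable here means mechanical: the decomposition is exactly `explained + residual`, so anyone can recompute every term from the inputs.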
In ESG, the number is easy. The defensible number is the work.
Entry 02 of 06
03 Applied Analytics · McIntire

Solidcore Pricing & Customer Value System

Explainable ML for pricing decisions, built so non-technical users can actually use it.

Developed an AI-powered decision-support platform using explainable ML (XAI) to model and interpret the impact of pricing on customer lifetime value. The core idea is that a pricing model is only useful if the people setting prices can understand it — so I designed for legibility first and accuracy second.

Layered on top is an interactive training system that lets non-technical users run pricing simulations, compare scenarios side-by-side, and see why the model recommends what it recommends. The simulator was designed to feel like a workshop rather than a black box.

What I built

  • Explainable ML stack (XAI)
  • CLV regression model
  • Price elasticity decomposition
  • Interactive scenario simulator

Outcome

  • Decision tool usable by non-technical users
  • Side-by-side scenario comparisons
  • Reasoned recommendations, not opaque ones
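The elasticity piece of the simulator can be sketched as a constant-elasticity (log-log) fit: the slope of log-quantity on log-price is the elasticity, and a scenario is just that curve evaluated at a candidate price. `estimate_elasticity` and `simulate_revenue` are hypothetical helpers on synthetic data, not the production model.

```python
import numpy as np

def estimate_elasticity(prices, quantities):
    """Fit log(q) = a + e * log(p); the slope e is the price elasticity."""
    slope, intercept = np.polyfit(np.log(prices), np.log(quantities), 1)
    return slope

def simulate_revenue(prices, base_price, base_qty, elasticity):
    """Scenario revenue under the fitted constant-elasticity demand curve."""
    qty = base_qty * (prices / base_price) ** elasticity
    return prices * qty

# Synthetic demand with a true elasticity of -1.4 and mild noise.
rng = np.random.default_rng(2)
prices = rng.uniform(20, 60, size=300)
qty = 5000 * prices ** -1.4 * np.exp(0.05 * rng.normal(size=300))
e = estimate_elasticity(prices, qty)
rev = simulate_revenue(np.array([30.0, 40.0]), 30.0, 100.0, e)
```

A two-parameter model like this is the legibility-first trade explicitly: a business user can see the whole mechanism, which is what makes the side-by-side scenarios trustworthy.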
A model that no one trusts can't be deployed. Legibility is part of accuracy.
Entry 03 of 06
04 Data Infrastructure · Deloitte

Global Geospatial Site Selection & ROI

A 90+ variable location intelligence system for EV battery factory siting.

Built a global, multi-region location intelligence system integrating 90+ economic, regulatory, labor, and logistics variables to rank EV battery factory sites and quantify ROI under multiple policy and cost scenarios. Site selection is rarely a single clean decision — it's a sensitivity analysis across many futures, and the framework was designed to make that sensitivity visible.

Standardized and curated a cross-country, decision-grade dataset with automated normalization and sensitivity routines. The team adopted the framework for continued use in future location-strategy work, which I'm prouder of than any single ranking it produced.

What I built

  • 90+ variable integration
  • Cross-country normalization
  • Multi-scenario ROI modeling
  • Sensitivity analysis routines

Outcome

  • Adopted as reusable team framework
  • Decision-grade structured dataset
  • Scenario rankings under policy + cost variation
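The claim that the sensitivity analysis is the real answer can be made concrete: perturb the variable weights and ask how often the top-ranked site keeps its rank. This is a minimal sketch on toy data; `score_sites` and `rank_stability` are illustrative names, and the real system normalizes 90+ variables rather than three.

```python
import numpy as np

def score_sites(data, weights):
    """Min-max normalize each variable, then score sites by weighted sum."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    norm = (data - lo) / (hi - lo)
    return norm @ weights

def rank_stability(data, weights, n_perturb=200, scale=0.1, seed=0):
    """Fraction of random weight perturbations in which each site ranks first."""
    rng = np.random.default_rng(seed)
    wins = np.zeros(data.shape[0])
    for _ in range(n_perturb):
        w = weights * (1 + scale * rng.normal(size=len(weights)))
        wins[np.argmax(score_sites(data, np.clip(w, 0, None)))] += 1
    return wins / n_perturb

# Toy: 4 candidate sites x 3 variables (oriented so higher = better).
data = np.array([[0.9, 0.8, 0.7],
                 [0.5, 0.9, 0.6],
                 [0.2, 0.3, 0.9],
                 [0.1, 0.2, 0.1]])
weights = np.array([0.5, 0.3, 0.2])
stab = rank_stability(data, weights)
```

A ranking that survives 10% weight noise is a decision; one that flips is a prompt for more analysis, and the routine tells you which case you are in.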
A site ranking is just an opinion. The sensitivity analysis is the actual answer.
Entry 04 of 06
05 Venture · Data Pipeline

Investor × Startup Matching Engine

A scoring system that helps founders find the right capital, not just any capital.

Building a global investor–company matching dataset that aggregates fund theses, past deals, sector focus, check sizes, and geographic preferences — the structured raw material needed to ask: which capital partners actually fit this startup? Most founder–investor matching today is either nepotistic or coarse. Neither scales.

On top of the dataset I'm developing a scoring engine that helps startups prioritize outreach, improve fundraising efficiency, and avoid wasted cycles with mismatched funds. The goal is not to replace judgment, but to give founders a defensible shortlist they can argue with.

What I'm building

  • Investor & deal data pipeline
  • Thesis & sector tagging (NLP)
  • Matching score model
  • Outreach prioritization layer

Why it matters

  • Reduces wasted founder fundraising time
  • Surfaces non-obvious investor fit
  • Defensible shortlist, not a black box
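A minimal version of the scoring idea: combine sector overlap, check-size fit, and geography into one weighted score and sort. The field names, weights, and funds below are invented for illustration; the actual pipeline derives sector tags with NLP rather than taking them as given.

```python
def match_score(startup, investor, weights=(0.5, 0.3, 0.2)):
    """Score investor-startup fit from sector overlap, check-size fit,
    and geographic coverage (toy features, equal treatment within each)."""
    w_sector, w_check, w_geo = weights
    sector = len(startup["sectors"] & investor["sectors"]) / max(len(startup["sectors"]), 1)
    lo, hi = investor["check_range"]
    check = 1.0 if lo <= startup["raise"] <= hi else 0.0
    geo = 1.0 if startup["geo"] in investor["geos"] else 0.0
    return w_sector * sector + w_check * check + w_geo * geo

startup = {"sectors": {"climate", "ml"}, "raise": 2_000_000, "geo": "US"}
fund_a = {"sectors": {"climate", "energy"}, "check_range": (1e6, 5e6), "geos": {"US", "EU"}}
fund_b = {"sectors": {"crypto"}, "check_range": (1e5, 5e5), "geos": {"APAC"}}
shortlist = sorted([("fund_a", match_score(startup, fund_a)),
                    ("fund_b", match_score(startup, fund_b))],
                   key=lambda t: -t[1])
```

Because every component of the score is inspectable, a founder can disagree with a ranking for a stated reason, which is exactly what "a shortlist you can argue with" means.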
Founders deserve a shortlist they can argue with — not one they can only accept.
Entry 05 of 06
06 Internship · Brief

China Securities · Institutional Research

Visualizations and market briefs across seven-plus product lines, alongside AI & drone-data exploration.

Built data visualizations and market intelligence materials for institutional clients across seven-plus product lines, translating dense market activity into briefs that traders and PMs would actually read. Most of the value was in editing — choosing which signals to surface and which to drop entirely.

Alongside the client work I ran research into AI-powered automation (Mobile-ViT for image-based industry signals) and drone-based data systems for industry tracking. The exploration was speculative, but it forced a useful distinction: which workflows AI can legitimately accelerate, and which it just makes more confidently wrong.

What I built

  • Multi-product institutional briefs
  • Market intelligence visualizations
  • Mobile-ViT exploration
  • Drone-data feasibility study

What I learned

  • Briefs live or die on editing, not analysis
  • AI accelerates judgment; it doesn't replace it
  • Cross-product fluency compounds quickly
The job wasn't producing data — it was deciding what was worth a busy person's attention.
Entry 06 of 06 · End of archive