Methodology

Technical documentation of hazard scoring, financial analysis, and data sources.

Hazard Scoring Algorithm v4

Piecewise Thresholds, Tier-Weighted Aggregation, Delta Blending, Dynamic EP Convexity

Overview

Each location is scored across 8 hazard categories: Heat Stress, Cold Stress, Drought, Flood, Wildfire, Cyclone, Sea Level Rise, and Precipitation/Water Stress. For each hazard, every available indicator is scored using piecewise threshold normalization (science-based breakpoints), then aggregated via tier-weighted composition. A 20% delta-from-historical blend captures the rate of change as an additional risk factor. Scores are extracted at the user’s selected target year for horizon-specific assessment.

Algorithm Steps

  1. Score all real indicators — Each indicator is normalized using piecewise breakpoint tables that reflect physiological and structural danger thresholds (e.g., WBT 35°C survivability limit, FWI 50+ extreme fire danger).
  2. Delta blending — Final indicator score = 80% absolute + 20% delta-from-historical, capturing the rate of change as an additional risk factor.
  3. Tier-weighted aggregation — Headline = max(tier 1 best, weighted composite). Each indicator’s effective weight = tier_weight × source_quality. Tier 1 (direct measurements, weight 1.0), Tier 2 (derived counts, 0.7), and Tier 3 (statistical proxies, 0.4) contribute proportionally, scaled by the data source’s quality multiplier (0.50–1.0).
  4. Store all sub-scores — Every individual indicator score is preserved for transparency and drill-down.
  5. ThinkHazard fallback — Categorical hazard levels (HIG/MED/LOW/VLO) are used when zero real indicators exist, tagged as Categorical quality (0.5 weight in overall score).
  6. Overall score — Quality-weighted arithmetic mean: real data × 1.0, categorical × 0.5.

Piecewise Threshold Normalization

Climate risks are nonlinear. v4 replaces linear min-max normalization with piecewise breakpoint tables that reflect real physiological and structural danger thresholds. Each indicator has a table of (input, score) breakpoints in ascending order; values are linearly interpolated between adjacent points.

Example: Wet Bulb Temperature

20°C→0 | 26→20 | 28→40 | 30→60 | 32→80 | 35→100

Source: Sherwood & Huber 2010 (35°C survivability limit)

All 52 indicators have peer-reviewed breakpoint tables. Sources include IPCC AR6, NOAA Heat Index scale, ISO 7243 (occupational WBGT), Canadian FWI scale, and the Saffir-Simpson hurricane scale. Indicators without custom breakpoints fall back to linear normalization.
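The interpolation between breakpoints can be sketched as follows, using the WBT table from the example above. This is a minimal illustration, not the production implementation; the function and table names are ours.

```python
from bisect import bisect_right

# Wet Bulb Temperature table from the example above (°C -> score).
WBT_BREAKPOINTS = [(20, 0), (26, 20), (28, 40), (30, 60), (32, 80), (35, 100)]

def piecewise_score(value, breakpoints):
    """Linearly interpolate a 0-100 score from ascending (input, score) pairs.

    Values outside the table are clamped to the first/last score.
    """
    inputs = [x for x, _ in breakpoints]
    scores = [s for _, s in breakpoints]
    if value <= inputs[0]:
        return scores[0]
    if value >= inputs[-1]:
        return scores[-1]
    i = bisect_right(inputs, value)          # first breakpoint above value
    x0, x1 = inputs[i - 1], inputs[i]
    s0, s1 = scores[i - 1], scores[i]
    return s0 + (s1 - s0) * (value - x0) / (x1 - x0)

print(piecewise_score(31.0, WBT_BREAKPOINTS))  # 70.0, midway between 30->60 and 32->80
```

The same routine serves every breakpoint table; only the (input, score) pairs change per indicator.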

Damage-Function-Derived Scoring (Primary Indicators)

For 9 primary indicators with matching engineering damage functions, scores are derived directly from physics-based damage rather than arbitrary breakpoints. The formula score = 100 × (damage / max_damage)^(1/α) ensures coherence between hazard scores, EP curves, and engineering damage. The inverse exponent 1/α compensates for the EP curve's convexity s^α, so the round trip recovers the original damage fraction.

Indicator | Damage Function | Max Damage | α
wbgt | Foster & Smallcombe 2021 (heat productivity) | 0.80 | 1.0
jrc_flood_depth | FEMA HAZUS (depth-damage) | 0.95 | 1.3
tc_windspeed_max | Emanuel 2011 (cyclone cubic sigmoid) | 1.00 | 1.5
fwixx | De Groot 2013 (FWI danger classes) | 0.55 | 1.5
slr_total | Hinkel 2014 (SLR amplification) | 0.50 | 1.3
hdd65 | Sailor & Muñoz 1997 (heating energy cost) | 0.45 | 1.0
hi | Foster & Smallcombe 2021 (HI variant, R²=0.96) | 0.80 | 1.0
cdd65 | Sailor & Muñoz 1997 (cooling energy cost) | 0.50 | 1.0
aqueduct_water_stress | WRI Aqueduct (water cost multiplier) | 0.40 | 1.0

The remaining 43 indicators (day counts, percentiles, statistical proxies) continue using piecewise threshold tables only. Precipitation is the only hazard without an engineering function—its primary damage mechanism is flooding, already covered by the HAZUS depth-damage curve.
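The score/damage round trip implied by the formula above can be sketched like this; function names are illustrative, and the max damage 0.95 and α = 1.3 are taken from the jrc_flood_depth row of the table.

```python
def damage_to_score(damage, max_damage, alpha):
    """score = 100 * (damage / max_damage) ** (1 / alpha)."""
    return 100 * (damage / max_damage) ** (1 / alpha)

def score_to_damage(score, max_damage, alpha):
    """Invert via the EP curve's convexity s**alpha."""
    return max_damage * (score / 100) ** alpha

# jrc_flood_depth: max damage 0.95, alpha 1.3 (from the table above)
damage = 0.30
score = damage_to_score(damage, 0.95, 1.3)
recovered = score_to_damage(score, 0.95, 1.3)
assert abs(recovered - damage) < 1e-9   # round trip recovers the damage fraction
```

Because the exponents 1/α and α cancel, the hazard score and the EP curve stay mutually consistent by construction.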

Two-Tier Scoring Model

The system uses an industry-standard two-tier model (S&P, Munich Re, MSCI):

  • Exposure Score (0–100) — always shown, based on piecewise threshold normalization for all indicators. Feeds EP curves and financial metrics.
  • Engineering Damage (% AAL) — shown only for the 9 indicators with physics-based damage functions. Provides a complementary bottom-up loss estimate.

Hazards with engineering damage functions display the damage % below the exposure score.

Indicator Tiers

Each indicator is assigned a tier reflecting its data quality and directness of measurement:

Tier | Weight | Definition | Examples
1 | 1.0 | Primary direct measurement of the hazard | WBT, WBGT, FWI max, JRC flood depth, TC wind
2 | 0.7 | Secondary/derived count or threshold-based | HD35, Tropical Nights, FWI season length, RX1day
3 | 0.4 | Statistical proxy, percentile-based, or indirect | CDD65, TX84rr, TX90p, Aqueduct stress indices

The headline score for each hazard = max(best tier 1 score, weighted composite of all tiers). This prevents a single noisy proxy from driving the headline while ensuring tier 1 measurements anchor the score.
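A minimal sketch of that aggregation rule, assuming each indicator is represented as a (score, tier, source_quality) tuple (a structure we chose for illustration):

```python
TIER_WEIGHTS = {1: 1.0, 2: 0.7, 3: 0.4}

def headline_score(indicators):
    """headline = max(best tier 1 score, tier- and quality-weighted composite)."""
    tier1_scores = [s for s, tier, _ in indicators if tier == 1]
    weights = [TIER_WEIGHTS[tier] * q for _, tier, q in indicators]
    composite = sum(s * w for (s, _, _), w in zip(indicators, weights)) / sum(weights)
    best_tier1 = max(tier1_scores) if tier1_scores else 0
    return max(best_tier1, composite)

# A strong tier 1 measurement anchors the headline even when
# lower-tier proxies would drag the composite down.
hazard = [(72, 1, 1.0), (40, 2, 0.95), (15, 3, 0.8)]
print(headline_score(hazard))  # 72
```

Here the weighted composite is about 52, so the direct tier 1 measurement of 72 sets the headline.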

Dynamic Source-Toggle Tiers

Tiers are dynamic — they change based on the selected data source (NASA vs CCKP). When CCKP is selected, its indicators are promoted to their native tiers while NASA indicators are excluded (and vice versa). Independent sources (ETCCDI, SEDAC, Aqueduct, JRC, ISIMIP, ETH FWI, ThinkHazard) are always included regardless of the source toggle.

Source Toggle | Included Pipelines | Excluded
NASA | NASA NEX-GDDP + all independent sources | CCKP climatology, CCKP indicators
CCKP | CCKP climatology/indicators + all independent sources | NASA NEX-GDDP

Source Quality Multipliers

Each data source carries a quality multiplier (0.50–1.0) that reflects its resolution, methodology, and peer-review status. The effective weight of each indicator is tier_weight × source_quality, so a Tier 2 indicator from a 0.95 quality source has effective weight 0.7 × 0.95 = 0.665.

Source | Quality | Notes
ETCCDI, SEDAC, ETH FWI, JRC, ISIMIP, AR6 SLR | 1.00 | Independent gold-standard datasets
CCKP Median Climatology, CCKP Sea Level, CCKP Cyclone | 0.95 | Pre-computed 20-year medians (matches IPCC AR6)
CCKP Mean Climatology | 0.93 | Ensemble mean from per-model downloads
CCKP Indicators (timeseries) | 0.90 | Annual timeseries, less stable than climatology
NASA NEX-GDDP (Optimized) | 0.85 | Ensemble mean, biased high by hot models
Aqueduct Water Risk | 0.80 | Composite index, coarser resolution
ThinkHazard | 0.50 | Categorical only (HIG/MED/LOW/VLO), 0.5 weight in overall
No data | 0.00 | No source available for hazard

Change-from-Baseline Delta Blending

IPCC AR6 emphasizes the rate of change as a risk factor. A region warming rapidly is riskier than one stable at high temperatures due to adaptation lag and tipping point proximity. v4 blends:

final = round(0.8 × absolute_score + 0.2 × delta_score)

delta_score = normalize(|future_value - historical_value| / indicator_range). When no historical data is available, 100% absolute score is used (no penalty). Static datasets (ThinkHazard, JRC) produce delta = 0, so they remain pure absolute.
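As a sketch of the blend, assuming the delta is normalized by clamping the range fraction onto 0–100 (the exact normalization is ours for illustration):

```python
def blended_score(absolute_score, future_value=None, historical_value=None,
                  indicator_range=None):
    """final = round(0.8 * absolute + 0.2 * delta); pure absolute when no history."""
    if historical_value is None or indicator_range in (None, 0):
        return round(absolute_score)            # no historical data: no penalty
    delta = abs(future_value - historical_value) / indicator_range
    delta_score = min(100, 100 * delta)         # assumed clamped normalization
    return round(0.8 * absolute_score + 0.2 * delta_score)

# Absolute score 60, warming 3°C against a 15°C indicator range:
print(blended_score(60, future_value=34.0, historical_value=31.0,
                    indicator_range=15.0))  # 52
```

With no historical value the function returns the absolute score unchanged, matching the static-dataset case where delta = 0.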

Special Cases

Inverted indicators — Growing Season Length (shorter = more risk) and Annual Min Temperature (colder = more risk) have inversion embedded directly in their breakpoint tables (scores decrease as input value increases).

SPEI-12 (Drought) — The Standardized Precipitation-Evapotranspiration Index is negative for drought conditions. The value is negated before piecewise normalization so that drier conditions produce higher scores.

JRC Flood + Precipitation Trend — JRC flood depth receives a +10 bonus when precipitation trend is increasing, reflecting compounding flood risk from rising rainfall.

ThinkHazard Categorical Quality

When no real quantitative indicators are available for a hazard, the system falls back to World Bank ThinkHazard categorical levels (HIG→75, MED→50, LOW→25, VLO→10). These are now tagged as Categorical quality (not “Scored”), and receive 0.5 weight in the overall score. This ensures transparency—users know which hazards are backed by quantitative data vs. rough categorical estimates.

Indicator Metric Types & Dynamic EP Convexity

Each indicator is tagged as :intensity, :frequency, or :both. After scoring, the per-hazard intensity ratio is computed as the score-weighted share of intensity-type indicators. This ratio adjusts the EP curve’s convexity exponent (α) via:

α = base_alpha + 0.3 × (intensity_ratio − 0.5)

When intensity_ratio = 0.5 (balanced), α equals the base. When > 0.5 (intensity-dominated), α increases, making the EP curve more convex (acute peril behaviour). When < 0.5 (frequency-dominated), α decreases toward linearity (chronic peril behaviour). Range: ±0.15 from base.

Metric Type | Definition | Examples
intensity | Peak magnitude of the hazard | WBT, HI, WBGT, FWI max, TC wind, SLR
frequency | How often the hazard occurs | HD35, Frost Days, CSDI, R50mm, FWI season
both | Composite of frequency and intensity | CDD65, HDD65, Aqueduct variability
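The adjustment above can be sketched as follows; the formula is from the text, while the score-weighted ratio computation (with :both counted at half weight) is our illustrative assumption.

```python
def intensity_ratio(indicators):
    """Score-weighted share of intensity-type indicators.

    indicators: list of (score, metric_type); 'both' is assumed to count half.
    """
    share = {"intensity": 1.0, "both": 0.5, "frequency": 0.0}
    total = sum(s for s, _ in indicators)
    if total == 0:
        return 0.5                              # balanced by default
    return sum(s * share[t] for s, t in indicators) / total

def dynamic_alpha(base_alpha, ratio):
    """alpha = base_alpha + 0.3 * (intensity_ratio - 0.5); bounded to +/-0.15."""
    return base_alpha + 0.3 * (ratio - 0.5)

heat = [(80, "intensity"), (60, "frequency"), (40, "both")]
r = intensity_ratio(heat)                       # (80*1 + 40*0.5) / 180
print(round(dynamic_alpha(1.0, r), 3))          # 1.017, slightly intensity-dominated
```

A balanced ratio of exactly 0.5 leaves α at its base value, as the text describes.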

Quality-Weighted Overall Score

The overall location score is a quality-weighted arithmetic mean:

overall = round((Σ real_scores × 1.0 + Σ categorical_scores × 0.5) / (count_real × 1.0 + count_categorical × 0.5))

Data coverage is shown separately (e.g., “6/8 hazards with real data”) rather than being conflated with the risk score.
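The overall mean and the ThinkHazard fallback levels from the sections above can be sketched together; the function names are ours.

```python
# ThinkHazard categorical fallback levels (from the text above).
THINKHAZARD_LEVELS = {"HIG": 75, "MED": 50, "LOW": 25, "VLO": 10}

def overall_score(real_scores, categorical_scores):
    """Quality-weighted arithmetic mean: real x 1.0, categorical x 0.5."""
    num = sum(real_scores) * 1.0 + sum(categorical_scores) * 0.5
    den = len(real_scores) * 1.0 + len(categorical_scores) * 0.5
    return round(num / den)

real = [70, 55, 40, 62, 30, 48]                 # six hazards with real data
categorical = [THINKHAZARD_LEVELS[c] for c in ("MED", "LOW")]  # two fallbacks
print(overall_score(real, categorical))         # 49; coverage shown separately as 6/8
```

Each categorical hazard contributes only half a "vote" to both the numerator and the denominator, so rough estimates cannot dominate locations with good quantitative coverage.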

Horizon-Aware Value Extraction

Scores are computed at the user’s selected projection horizon (2030, 2040, or 2050) rather than averaging all time points. Historical data (1950–2014) serves as the baseline reference shown in charts and used for delta blending.

Data Type | Sources | Extraction Method
Static (1 value) | ThinkHazard, Aqueduct, JRC Flood | Use value as-is (time-invariant)
Decadal (2–10 values) | NASA Optimized | Pick nearest decade (values at 2030, 2040, …, 2100)
Annual (11+ values) | CCKP Indicators, ETCCDI, SEDAC, ETH FWI | Average ±5-year window centered on target year
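The three extraction methods in the table can be sketched as follows (function names and data shapes are illustrative):

```python
def extract_static(value, target_year):
    """Static datasets are time-invariant: use the value as-is."""
    return value

def extract_decadal(values_by_decade, target_year):
    """Pick the nearest decade, e.g. from keys 2030, 2040, ..., 2100."""
    nearest = min(values_by_decade, key=lambda yr: abs(yr - target_year))
    return values_by_decade[nearest]

def extract_annual(values_by_year, target_year, window=5):
    """Average a +/-5 year window centered on the target year."""
    window_vals = [v for yr, v in values_by_year.items()
                   if abs(yr - target_year) <= window]
    return sum(window_vals) / len(window_vals)

decadal = {2030: 1.2, 2040: 1.5, 2050: 1.9}
print(extract_decadal(decadal, 2044))   # 1.5, since 2040 is the nearest decade
```

The ±5-year window for annual series smooths out year-to-year model noise around the chosen horizon.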

Ensemble Statistics: Mean vs Median

CMIP6 ensembles contain 30–40+ climate models, each projecting a different future. The choice of summary statistic — mean vs median — materially affects risk scores because the ensemble distribution is skewed: a handful of “hot models” (CanESM5, UKESM1, IPSL-CM6A) project 2–3× more warming than the coolest models.

Worked Example: Mumbai Annual Max Temperature (TXx) at 2050 SSP2-4.5

Suppose 5 CMIP6 models produce these TXx projections (illustrative):

Model | TXx (°C) | Category
MRI-ESM2-0 | 42.1 | Cool model
GFDL-ESM4 | 43.5 | Cool model
ACCESS-CM2 | 44.8 | ← Median (middle value)
UKESM1-0-LL | 47.2 | Hot model
CanESM5 | 49.6 | Hot model (ECS ≈ 5.6°C)

Mean: 45.4°C = (42.1 + 43.5 + 44.8 + 47.2 + 49.6) / 5

Median: 44.8°C (middle value, 3rd of 5)

The mean is 0.6°C higher because CanESM5 (49.6°C) pulls the average upward. At a WBT breakpoint like 30°C, a 0.6°C shift can move a location from one risk band to another. With 35+ real CMIP6 models, the skew can be even more pronounced.
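The worked example can be reproduced directly with Python's statistics module:

```python
from statistics import mean, median

# The five illustrative CMIP6 TXx projections from the table above (°C).
txx = [42.1, 43.5, 44.8, 47.2, 49.6]

print(round(mean(txx), 1))   # 45.4, pulled upward by the hot models
print(median(txx))           # 44.8, the middle model (ACCESS-CM2)
```

Dropping or adding a single hot model shifts the mean noticeably but leaves the median untouched, which is the robustness trade-off discussed below.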

How Our Sources Use Each Statistic

Source | Statistic | Reason
NASA NEX-GDDP | Ensemble mean | Averages all 35 CMIP6 models equally; preserves total variance but is pulled toward hot-model outliers
CCKP (World Bank) | Ensemble median | Follows IPCC AR6 practice; reports p10/p50/p90 spread for uncertainty quantification

Pros and Cons

Ensemble Mean

Pros

  • Uses information from every model — no data discarded
  • Preserves total ensemble variance (important for uncertainty-aware applications)
  • Mathematically convenient: additive, works with variance decomposition
  • Conservative for risk assessment — tends to overestimate rather than underestimate

Cons

  • Sensitive to outliers — 2–3 hot models (ECS > 5°C) can pull the average 0.5–1.5°C above the median
  • CMIP6 “hot model” problem is well-documented (Hausfather et al. 2022) — models like CanESM5 and UKESM1 overshoot observed warming trends
  • Can give a “best estimate” that no individual model actually projects
  • May overstate risk, leading to inflated financial loss estimates

Ensemble Median

Pros

  • Robust to outliers — a single extreme model cannot shift the result
  • Matches IPCC AR6 “best estimate” practice (Chapter 4, Cross-Chapter Box 7.1)
  • Naturally pairs with p10/p90 spread for uncertainty communication
  • Better calibrated against observed warming (Tokarska et al. 2020)

Cons

  • Discards tail information — does not reflect the possibility of very high warming scenarios
  • May understate risk for hazards where tail outcomes dominate losses (e.g., extreme heat, cyclone)
  • Not additive: median of sums ≠ sum of medians
  • Fewer models have been evaluated for median-based bias correction

What this means in practice: When you toggle between NASA and CCKP sources, scores for the same location may differ by 5–15 points. NASA (mean) tends to produce higher scores for heat stress and other temperature-driven hazards; CCKP (median) generally produces lower scores but is better calibrated against observed warming. Neither is “wrong” — they reflect different choices about how to summarize multi-model uncertainty. For regulatory disclosure (TCFD, ISSB), the CCKP median is typically preferred; for conservative internal risk budgeting, the NASA mean provides a wider safety margin.

Score Interpretation

Score Range | Rating | Interpretation
0–20 | LOW | Minimal exposure; standard building codes sufficient
20–40 | MODERATE | Some exposure; monitor trends and adaptation needs
40–60 | HIGH | Material risk (materiality threshold at score 40); adaptation measures recommended
60–80 | VERY HIGH | Significant exposure; active risk management required
80–100 | EXTREME | Severe exposure; major adaptation or relocation needed