Methodology

How Find My Data Center evaluates and ranks data center development sites

Overview

Find My Data Center uses a weighted multi-dimensional scoring model to evaluate US land parcels for AI data center and neocloud development. Each site is scored across 8 dimensions comprising approximately 35 sub-dimensions. Every sub-dimension is evaluated by a priority-based rule engine: rules are checked in order from best to worst, and the first matching rule determines the score. This "first-match-wins" approach produces deterministic, fully explainable scores. Each score includes the specific threshold that triggered it and a human-readable reasoning string.
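The first-match-wins evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the product's actual implementation: the Rule class, the score_first_match function, the 0-100 per-rule score scale, and every threshold value are hypothetical; only the operator names (gte, lte, eq) come from this page.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Operator names follow this page; mapping them to Python functions is an assumption.
OPS: dict[str, Callable[[Any, Any], bool]] = {
    "gte": operator.ge,
    "lte": operator.le,
    "eq": operator.eq,
}


@dataclass
class Rule:
    """One threshold rule (hypothetical structure)."""
    op: str
    threshold: Any
    score: int
    reasoning: str


def score_first_match(value: Any, rules: list[Rule]) -> Optional[Rule]:
    """Return the first rule that matches, or None if the value is missing
    or no rule matches. Rules must be ordered from best to worst."""
    if value is None:
        return None
    for rule in rules:
        if OPS[rule.op](value, rule.threshold):
            return rule
    return None


# Hypothetical thresholds for available capacity in MW, ordered best to worst.
capacity_rules = [
    Rule("gte", 500, 100, "500+ MW available within 24 months"),
    Rule("gte", 200, 80, "200-499 MW available within 24 months"),
    Rule("gte", 50, 50, "50-199 MW available within 24 months"),
    Rule("gte", 0, 20, "Under 50 MW available within 24 months"),
]

matched = score_first_match(250, capacity_rules)
if matched is not None:
    print(matched.score, matched.op, matched.threshold, matched.reasoning)
```

Because the rules are ordered, a value of 250 also satisfies the later "gte 50" and "gte 0" rules, but only the first match is used, which is what keeps the result deterministic and traceable to a single threshold.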

The total site score is a weighted average of the dimension scores. Dimension weights reflect each dimension's importance to data center site viability and are configurable in Settings. Adjusting a weight recomputes all scores in real time.

Data Sources

Site data is aggregated from public and proprietary sources across federal, state, and local levels:

Power & Grid: EIA Forms 860/861, FERC 714, OASIS, utility tariff filings, PUC interconnection queues, ISO/RTO capacity reports, direct utility engagement
Environmental & Climate: FEMA NFHL flood maps, NOAA NCDC weather data (TMY3), US Drought Monitor (USDM), USGS seismic hazard maps, EPA databases, Phase I/II ESAs
Labor & Demographics: BLS Occupational Employment Statistics, Census ACS 5-Year, LEHD Origin-Destination, C2ER Cost of Living Index, National Right to Work Foundation
Regulatory & Incentives: State economic development agencies, county GIS/zoning, NEPA/CEQA registries, IDB/EDA filings, Board of Supervisors minutes, state PUC rulings
Network & Fiber: FCC Broadband Map, PeeringDB, RIPE Atlas latency measurements, carrier route GIS databases, direct fiber surveys
Real Estate & Deal: County assessor records, CoStar/LoopNet, MLS listing history, broker engagement data, SEC filings (public REIT acquisitions)

Scoring Model

The engine reads each field from the database via its field_path (e.g. site_power.capacity_available_24mo_mw), evaluates it against threshold rules using operators such as gte, lte, and eq, and assigns the score from the first matching rule. Two JSON fields get special handling: estimated_savings_per_mw is extracted from the state_incentives JSON field, and existing_dc_operators_60mi is scored by the length of its JSON array rather than its raw value.
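The field lookup and the two JSON special cases might look something like the sketch below. It assumes a field_path has the form "table.column" and that a site is available as a nested dict; the resolve_field function and the site_regulatory and site_market table names are hypothetical, while site_power, capacity_available_24mo_mw, state_incentives, estimated_savings_per_mw, and existing_dc_operators_60mi are taken from this page.

```python
import json
from typing import Any


def resolve_field(record: dict, field_path: str) -> Any:
    """Look up 'table.column' in a nested dict and apply the JSON special cases.
    Returns None when the table, column, or extracted key is missing."""
    table, _, column = field_path.partition(".")
    raw = (record.get(table) or {}).get(column)
    if raw is None:
        return None

    if column == "state_incentives":
        # JSON object: score on the estimated savings, not the whole blob.
        data = json.loads(raw) if isinstance(raw, str) else raw
        return data.get("estimated_savings_per_mw")

    if column == "existing_dc_operators_60mi":
        # JSON array: score on how many operators there are, not the raw value.
        data = json.loads(raw) if isinstance(raw, str) else raw
        return len(data)

    return raw


# Hypothetical site record; the values are illustrative only.
site = {
    "site_power": {"capacity_available_24mo_mw": 300},
    "site_regulatory": {"state_incentives": '{"estimated_savings_per_mw": 12000}'},
    "site_market": {"existing_dc_operators_60mi": '["Operator A", "Operator B"]'},
}

print(resolve_field(site, "site_power.capacity_available_24mo_mw"))
print(resolve_field(site, "site_regulatory.state_incentives"))
print(resolve_field(site, "site_market.existing_dc_operators_60mi"))
```

The resolved value is what the threshold rules would then be evaluated against.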

Dimension Weights (sum to 100)
Power Infrastructure (35%)
Behind-the-Meter (15%)
Regulatory & Permitting (15%)
Labor Supply (10%)
Water (8%)
Network & Fiber (7%)
Climate & Risk (5%)
Deal & Acquisition (5%)
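
As a worked example of how these weights combine into the total site score, here is a small sketch. The weights are the defaults listed above; the dictionary keys, the total_score function, and the example dimension scores are hypothetical.

```python
# Default dimension weights from this page (they sum to 100).
DIMENSION_WEIGHTS = {
    "power_infrastructure": 35,
    "behind_the_meter": 15,
    "regulatory_permitting": 15,
    "labor_supply": 10,
    "water": 8,
    "network_fiber": 7,
    "climate_risk": 5,
    "deal_acquisition": 5,
}


def total_score(dimension_scores: dict[str, float],
                weights: dict[str, float] = DIMENSION_WEIGHTS) -> float:
    """Weighted average of dimension scores (each assumed to be 0-100)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[name] * dimension_scores[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical dimension scores for one site.
example_scores = {
    "power_infrastructure": 80,
    "behind_the_meter": 60,
    "regulatory_permitting": 70,
    "labor_supply": 50,
    "water": 90,
    "network_fiber": 40,
    "climate_risk": 100,
    "deal_acquisition": 60,
}

print(total_score(example_scores))  # 7050 / 100 = 70.5
```

Dividing by the sum of the weights rather than a hard-coded 100 is a design choice in this sketch so that the result stays on a 0-100 scale even while weights are being adjusted.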

Confidence Scoring

Each score carries a confidence rating (0-100%). Confidence is computed per-field: null/missing values = 0%, string values of "unknown" or empty = 25%, all other values = 85% (baseline from primary-source data). The dimension confidence is the weighted average of its sub-dimension confidences. Fields with no matched rule are excluded from the dimension score and reduce confidence proportionally.
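
The confidence rules above can be sketched as follows. The per-field values (0%, 25%, 85%) come from this page; the function names and the (value, weight, rule_matched) tuple layout are hypothetical, and treating an unmatched field as contributing zero confidence while keeping its weight is only one possible reading of "reduce confidence proportionally".

```python
from typing import Any


def field_confidence(value: Any) -> float:
    """Per-field confidence as a fraction between 0 and 1."""
    if value is None:
        return 0.0
    if isinstance(value, str) and value.strip().lower() in ("", "unknown"):
        return 0.25
    return 0.85  # baseline for primary-source data


def dimension_confidence(fields: list[tuple[Any, float, bool]]) -> float:
    """Weighted average confidence over (value, weight, rule_matched) tuples.

    Assumption: a field whose value matched no rule keeps its weight in the
    denominator but contributes zero confidence.
    """
    total_weight = sum(weight for _, weight, _ in fields)
    if total_weight <= 0:
        raise ValueError("sub-dimension weights must sum to a positive number")
    weighted = sum(
        field_confidence(value) * weight
        for value, weight, matched in fields
        if matched
    )
    return weighted / total_weight


# Hypothetical sub-dimensions: one populated, one "unknown", one missing.
example_fields = [
    (300, 0.5, True),
    ("unknown", 0.3, True),
    (None, 0.2, True),
]
print(round(dimension_confidence(example_fields), 3))  # 0.425 + 0.075 + 0.0 = 0.5
```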

Dimension Breakdown

Power Infrastructure (35%)

Evaluates grid capacity, utility pricing, transmission access, and long-term power availability. This is the single most important factor for data center site selection.

Behind-the-Meter (15%)

Assesses opportunities for on-site or adjacent power generation including stranded gas, co-located renewables, and nuclear adjacency. BTM power can dramatically reduce operating costs.

Regulatory & Permitting (15%)

Measures zoning readiness, state/county incentive packages, permitting timelines, and community opposition risk. Regulatory friction can add 12-24 months to a project.

Labor Supply (10%)

Evaluates the local skilled labor pool including electricians, HVAC technicians, and IT professionals needed for construction and ongoing operations.

Water (8%)

Assesses water availability, cost, drought risk, and permitting for cooling systems. Evaporative cooling consumes 300,000-500,000+ gallons per day for a large campus.

Network & Fiber (7%)

Evaluates dark fiber availability, provider diversity, peering proximity, and latency to major metros. Network connectivity determines which workloads a site can serve.

Climate & Risk (5%)

Evaluates natural disaster risk profile, free cooling potential, and flood zone classification. Climate factors affect both CapEx (hardening) and OpEx (cooling costs).

Deal & Acquisition (5%)

Evaluates deal economics including price/acre relative to comps, seller motivation, broker relationships, and negotiation leverage.

Score Interpretation

80–100: Excellent. Top-tier site, minimal red flags
60–79: Good. Viable with manageable trade-offs
40–59: Fair. Significant issues to address
0–39: Poor. Major blockers, high risk
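
The score bands above map onto a simple lookup, sketched here. The interpret_score function is hypothetical, and treating each band's lower bound as an inclusive cutoff (so a fractional score such as 79.5 falls in "Good") is an assumption, since the bands are stated only for whole numbers.

```python
def interpret_score(score: float) -> str:
    """Map a 0-100 site score to its interpretation band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Fair"
    return "Poor"


print(interpret_score(70.5))  # Good
```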

Notes & Limitations

  • Dimension weights are configurable in Settings. Sub-dimension weights within each dimension always sum to 1.0 (100%). Dimension weights sum to 100 across all 8 dimensions.
  • Scores are fully deterministic. The engine uses no ML or probabilistic models; every score traces back to a specific threshold rule in the scoring configuration.
  • JSON fields get special extraction: state_incentives.estimated_savings_per_mw is parsed from the regulatory JSON, and existing_dc_operators_60mi is counted as array length (not raw value).
  • Data freshness varies by source. Utility capacity is typically refreshed quarterly, FEMA maps update irregularly, and BLS labor statistics are annual. Scores should be validated with primary research before acquisition decisions.