AI Strategy · Harla Data Room

Strategic framing

Harla goes to market through the transaction: one deal, one landlord, after which every party involved in that deal becomes a potential subscriber. The AI strategy is what makes owning that entry point defensible over time.

Every transaction processed on Harla produces structured execution data: how deals move, where they stall, which counterparties execute reliably, where compliance gaps recur. That data does not exist anywhere else. It cannot be purchased from a model provider or replicated by a new entrant without years of operational engagement. The goal is to compound that data and to build on it an AI layer that owns transaction execution, starting with UK commercial real estate.

The sections below describe the current architecture. Each capability is designed to serve that longer objective: extraction feeds intelligence, intelligence feeds automation, and automation drives the transaction volume that accumulates the data that makes every subsequent transaction faster and more accurate.

What AI does in the product

Harla uses AI across three layers of the property operations workflow.

At the extraction layer, AI interprets incoming documents — lease agreements, heads of terms, compliance records, KYC packs — and converts unstructured, inconsistently formatted content into structured, queryable data. This is the foundation of everything else: without reliable extraction, neither portfolio intelligence nor transaction automation is possible. AI handles the variance in document format, layout, and language that makes manual data entry slow, expensive, and error-prone at scale.

At the intelligence layer, the system surfaces what the data means for portfolio performance. Void forecasting, re-letting strategy, counterparty benchmarking, and obligation tracking are all derived from the structured data accumulated across every transaction. AI identifies patterns and anomalies — a lease approaching a break clause, a counterparty with a history of slow execution, a compliance gap that has appeared across multiple properties — before they become problems requiring reactive management.
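
As an illustration of the simplest kind of check at this layer, the sketch below flags leases whose break date falls inside a notice window. The `Lease` record, its field names, and the 180-day window are assumptions made for the example, not Harla's actual schema or thresholds.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Lease:
    # Hypothetical minimal record; field names are illustrative only.
    lease_id: str
    break_date: Optional[date]  # None when the lease has no break clause


def leases_approaching_break(leases, today, notice_window_days=180):
    """Return leases whose break date falls within the window, soonest first."""
    cutoff = today + timedelta(days=notice_window_days)
    due = [
        lease for lease in leases
        if lease.break_date is not None and today <= lease.break_date <= cutoff
    ]
    return sorted(due, key=lambda lease: lease.break_date)


portfolio = [
    Lease("L-001", date(2025, 3, 1)),
    Lease("L-002", None),
    Lease("L-003", date(2026, 6, 30)),
    Lease("L-004", date(2025, 1, 15)),
]
alerts = leases_approaching_break(portfolio, today=date(2024, 12, 1))
print([lease.lease_id for lease in alerts])  # ['L-004', 'L-001']
```

A production version would presumably combine rules like this with learned models over execution data. The point here is only that the alert is derived from structured fields rather than from re-reading documents.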

At the automation layer, AI drives the workflows that move transactions from first contact to cash. It orchestrates documentation, coordinates actions across multiple parties, drafts outbound communications with portfolio-specific context, and populates compliance records. This removes the administrative coordination load that currently makes commercial property operations expensive to scale and slow to execute.
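
The sketch below shows, in plain Python, one possible shape for a checkpointed workflow that pauses before any irreversible step until a person approves it, in line with the human-confirmation requirement in the risk table later in this document. It does not use LangGraph, and the class names, step names, and status strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    irreversible: bool = False  # e.g. issuing a document to a counterparty


@dataclass
class Workflow:
    steps: list
    completed: list = field(default_factory=list)  # checkpoint: finished step names
    audit_log: list = field(default_factory=list)  # append-only trail of events

    def run(self, approved=frozenset()):
        """Advance until done, or until an irreversible step lacks approval."""
        for step in self.steps[len(self.completed):]:
            if step.irreversible and step.name not in approved:
                self.audit_log.append(("paused_for_approval", step.name))
                return "awaiting_approval"
            self.completed.append(step.name)
            self.audit_log.append(("completed", step.name))
        return "done"

    def rollback(self, to_step):
        """Rewind workflow state so that `to_step` and later steps run again."""
        index = self.completed.index(to_step)
        self.audit_log.append(("rolled_back_to", to_step))
        self.completed = self.completed[:index]


wf = Workflow([
    Step("draft_heads_of_terms"),
    Step("collect_kyc"),
    Step("issue_lease", irreversible=True),
])
first = wf.run()                           # pauses before issue_lease
second = wf.run(approved={"issue_lease"})  # resumes from the checkpoint
print(first, second)  # awaiting_approval done
```

Note that `rollback` here only rewinds the workflow's own state. It cannot undo an external action, which is why irreversible steps are gated on approval in the first place.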

Together, these three layers mean a lease or asset manager using Harla handles a larger portfolio with less manual effort, a more complete compliance record, and faster execution from heads of terms to completion.

Models and tools

Function	Approach	Provider / Model	Build or Buy
Lease and document parsing	OCR + element extraction	Unstructured.io	Buy
Obligation and clause extraction	LLM structured output	Anthropic Claude API	Buy
KYC / AML agent workflow	Agentic LLM + third-party data	Anthropic Claude API + Creditsafe / ComplyAdvantage	Buy
Semantic search and retrieval	Dense vector embeddings	Voyage RE (real-estate fine-tuned)	Buy
Portfolio knowledge graph	Graph construction on transaction data	Internal (Neo4j / pgvector)	Build
Void forecasting and portfolio analytics	ML models on proprietary execution data	Internal	Build
Communication drafting	LLM with portfolio and counterparty context	Anthropic Claude API	Buy
Multi-party transaction orchestration	Stateful agentic workflow	Internal (LangGraph)	Build

Build vs. buy rationale

Where a capability is commodity — language understanding, document OCR, general text generation — Harla buys via API. Foundation model providers have invested at a scale no early-stage company can replicate, and the marginal cost of API access is falling consistently. Competing on general language capability is not the strategy.

Where defensibility comes from proprietary data, Harla builds. The knowledge of how a specific portfolio's transactions execute over time — which counterparties move quickly, where compliance gaps recur, how void duration correlates with lease type and market conditions in the UK commercial sector — is not available in any foundation model and cannot be purchased. It is accumulated through operational engagement with real portfolios.

The goal is a thin, fast API layer over foundation models, paired with a proprietary data and retrieval layer that compounds in value with every transaction processed. The model is the commodity; the operational graph built on top of it is the product.
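
A minimal sketch of what a thin, provider-agnostic layer could look like follows. It treats a provider as any callable from prompt to text and routes each task type to a model tier, which is one way to realise the provider abstraction and tier routing described under Risks and mitigations. The `ModelRouter` class, the tier names, and the task names are assumptions for illustration, and the stub functions stand in for real vendor clients so the example runs offline.

```python
from typing import Callable, Dict

# A provider is modelled as any callable taking a prompt and returning text.
# Real adapters around a vendor SDK would sit behind this type.
Completion = Callable[[str], str]


class ModelRouter:
    """Maps task types to model tiers, and tiers to interchangeable providers."""

    def __init__(self, providers: Dict[str, Completion],
                 task_tiers: Dict[str, str], default_tier: str):
        if default_tier not in providers:
            raise ValueError(f"no provider registered for tier {default_tier!r}")
        self._providers = providers
        self._task_tiers = task_tiers
        self._default_tier = default_tier

    def tier_for(self, task: str) -> str:
        # Unknown tasks fall back to the default tier.
        return self._task_tiers.get(task, self._default_tier)

    def complete(self, task: str, prompt: str) -> str:
        return self._providers[self.tier_for(task)](prompt)


# Stub providers (hypothetical) so the sketch needs no network access.
def light_stub(prompt: str) -> str:
    return f"[light] {prompt}"


def full_stub(prompt: str) -> str:
    return f"[full] {prompt}"


router = ModelRouter(
    providers={"light": light_stub, "full": full_stub},
    task_tiers={"field_extraction": "light", "orchestration": "full"},
    default_tier="full",
)
print(router.complete("field_extraction", "Extract the rent review date."))
print(router.complete("unknown_task", "Summarise the deal status."))
```

Swapping a provider then means registering a different callable for a tier; the calling code does not change.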

Data flywheel and defensibility

Each portfolio onboarded adds structured execution data: lease timelines, void duration records, counterparty performance histories, compliance gap patterns, and HoT-to-completion benchmarks. This data trains and continuously improves Harla's forecasting, routing, and anomaly detection capabilities in ways that become increasingly specific to UK commercial portfolios over time.
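
To make one of these data points concrete, the sketch below computes a HoT-to-completion benchmark as the median number of days per counterparty. The tuple layout, the counterparty names, and the choice of the median are assumptions for the example; the source does not specify how the benchmark is calculated.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_days_to_completion(deals):
    """Median days from heads of terms to completion, keyed by counterparty."""
    durations = defaultdict(list)
    for counterparty, hot_agreed, completed in deals:
        durations[counterparty].append((completed - hot_agreed).days)
    return {name: median(days) for name, days in durations.items()}


# Illustrative records: (counterparty, HoT agreed, completion date).
deals = [
    ("Solicitor A", date(2024, 1, 8), date(2024, 3, 8)),
    ("Solicitor A", date(2024, 2, 1), date(2024, 3, 22)),
    ("Solicitor B", date(2024, 1, 15), date(2024, 5, 14)),
]
benchmarks = median_days_to_completion(deals)
print(benchmarks)  # {'Solicitor A': 55.0, 'Solicitor B': 120}
```

Each additional transaction adds a row to `deals`, so the benchmark for a given counterparty becomes more reliable as volume grows.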

The flywheel is self-reinforcing. More transactions improve void forecasting accuracy. Better void forecasting improves re-letting strategy recommendations. Improved recommendations deliver measurable NAV outcomes for operators, which drives further adoption and further data accumulation.

Over time this creates a proprietary dataset that a new entrant cannot replicate without acquiring a comparable base of operational customers — and by that point, Harla's models will already be substantially more accurate for UK portfolios than any general-purpose tool. The moat is not the model; it is the operational graph built on top of it, and the time required to build that graph is the barrier to replication.

Risks and mitigations

Risk	Mitigation
Model hallucination in legal or financial context	Human-in-the-loop for all actions with legal or financial consequence. AI surfaces recommendations and drafts; a person approves before execution. Confidence thresholds flag low-certainty outputs for review.
Multi-party agentic workflow failure	Stateful workflow design with explicit checkpoints, rollback capability, and audit trail at every step. No irreversible action is taken without human confirmation. Failure modes are bounded by design.
Cost at scale	Prompt optimisation, response caching, and model tier routing: simpler extraction tasks are routed to lighter models, while complex reasoning and orchestration use full-capability models. Inference cost as a share of revenue is expected to fall as transaction volume grows.
Vendor dependency on foundation model providers	Abstraction layer across providers. Provider-specific logic is isolated; switching cost is bounded. Anthropic is the preferred provider today but the architecture does not require it.
Data privacy — tenant and counterparty data	UK GDPR-compliant data architecture with UK/EU data residency enforced. No personally identifiable information enters model training pipelines without explicit consent. Data processing agreements in place with all sub-processors. Row-level security enforced at the database layer.
Extraction accuracy on complex lease documents	Unstructured.io pre-processing normalises document layout before LLM extraction. Output validation rules flag implausible or missing fields for human review. Accuracy improves as the extraction pipeline is fine-tuned on a growing corpus of UK commercial lease formats.
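
The validation rules and confidence thresholds in the table above could be sketched as follows. The required field names, the 0.85 threshold, and the assumption that the extraction step returns a per-field confidence score are all illustrative; none of them is taken from Harla's pipeline.

```python
from datetime import date

REQUIRED_FIELDS = ("tenant_name", "annual_rent_gbp", "term_start", "term_end")
CONFIDENCE_THRESHOLD = 0.85  # illustrative value, not taken from the source


def review_reasons(fields, confidences, threshold=CONFIDENCE_THRESHOLD):
    """Return the reasons an extracted lease needs human review (empty if none)."""
    reasons = []
    for name in REQUIRED_FIELDS:
        if fields.get(name) is None:
            reasons.append(f"missing field: {name}")
        elif confidences.get(name, 0.0) < threshold:
            reasons.append(f"low confidence: {name}")
    rent = fields.get("annual_rent_gbp")
    if rent is not None and rent <= 0:
        reasons.append("implausible value: annual_rent_gbp must be positive")
    start, end = fields.get("term_start"), fields.get("term_end")
    if start is not None and end is not None and end <= start:
        reasons.append("implausible value: term_end is not after term_start")
    return reasons


extracted = {
    "tenant_name": "Example Tenant Ltd",
    "annual_rent_gbp": 48000,
    "term_start": date(2024, 6, 24),
    "term_end": date(2024, 3, 25),
}
scores = {"tenant_name": 0.97, "annual_rent_gbp": 0.62,
          "term_start": 0.91, "term_end": 0.93}
flags = review_reasons(extracted, scores)
print(flags)
```

For this record the function reports a low-confidence rent figure and a term end date that precedes the start date, so the lease would be queued for a person to check rather than written straight to the portfolio record.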