Financial Services · Global Investment Firm, $40B+ AUM

Automating Compliance Review for a Global Investment Firm

Review Time Reduction: 78%
Annual Cost Savings: $1.8M
Documents / Month: 10,000+
Extraction Accuracy: 91%

A global investment firm processing 10,000+ compliance documents monthly was drowning in a 3-week review backlog that cost $2M annually in analyst hours. We deployed a Document AI pipeline that automated extraction, classification, and risk-flagging — cutting review time by 78% and freeing the compliance team to focus on judgment, not data entry.

Client Background

The client manages $40B+ in assets across equities, fixed income, and alternative investments. Their compliance function processes regulatory filings, trade confirmations, counterparty agreements, and internal audit documents — all subject to FCA and MiFID II requirements. Prior to this engagement, all document review was manual: analysts reading, extracting, and logging data into their case management system by hand.

The Challenge

Three converging pressures brought the compliance function to a breaking point. Regulatory requirements were expanding — the volume of required documentation had grown 40% over three years. Headcount had not kept pace. And a 3-week review backlog meant that time-sensitive regulatory responses were routinely delayed, creating material compliance risk.

10,000+ documents per month processed entirely by hand

Average review time of 18 days per document batch — regulatory deadlines frequently missed

23% of analyst time spent on data entry with no value-add judgment

$2M annually in analyst hours on tasks that were rule-based and automatable

No audit trail for extraction decisions — could not demonstrate data lineage to regulators

Our Approach

We built a multi-stage Document AI pipeline: ingestion and OCR using AWS Textract, custom NER models for financial entity extraction trained on the client's document corpus, a classification layer to route documents by type and risk level, and an extraction layer that pulled structured data into the existing case management system via API. Human reviewers were retained for edge cases and final sign-off.
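The stage ordering described above can be sketched as a chain of small functions. This is a minimal illustration only, not the production code: the `Document` fields, the stage names, and the keyword rules are all hypothetical stand-ins, and the OCR stage is a stub where the real pipeline called AWS Textract.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Hypothetical container passed from stage to stage."""
    doc_id: str
    raw_bytes: bytes
    text: str = ""
    entities: dict[str, str] = field(default_factory=dict)
    doc_type: str = "unknown"
    risk_level: str = "unknown"


Stage = Callable[[Document], Document]


def ocr_stage(doc: Document) -> Document:
    # Stub: the real pipeline used AWS Textract; here we only decode bytes.
    doc.text = doc.raw_bytes.decode("utf-8", errors="replace")
    return doc


def ner_stage(doc: Document) -> Document:
    # Stub for the custom NER model: a naive "Key: value" lookup.
    wanted = {"counterparty", "isin", "trade date"}
    for line in doc.text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in wanted:
            doc.entities[key.strip().lower()] = value.strip()
    return doc


def classify_stage(doc: Document) -> Document:
    # Stub classifier: assumed rules, chosen only to make the flow visible.
    doc.doc_type = "trade_confirmation" if "trade date" in doc.entities else "other"
    doc.risk_level = "low" if "counterparty" in doc.entities else "high"
    return doc


def run_pipeline(doc: Document, stages: list[Stage]) -> Document:
    # Apply each stage in order; every stage returns the enriched document.
    for stage in stages:
        doc = stage(doc)
    return doc


PIPELINE: list[Stage] = [ocr_stage, ner_stage, classify_stage]

sample = Document(
    "doc-001",
    b"Trade Date: 2024-01-15\nCounterparty: Example Bank\nISIN: XX0000000000",
)
result = run_pipeline(sample, PIPELINE)
```

Keeping each stage behind the same signature means a stub can be swapped for a real model or service call without touching the stages around it.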

01. Document corpus audit: 6 weeks of document analysis to classify types, identify extraction fields, and baseline accuracy requirements

02. Custom OCR + NER pipeline built with LangChain, GPT-4, and AWS Textract, fine-tuned on 2,000 client documents

03. Risk scoring model that flags documents requiring human review based on content and regulatory exposure

04. API integration with the client's existing case management system, with zero workflow disruption

05. Audit trail layer: every extraction logged with source text, confidence score, and model version

06. Human-in-the-loop workflow for flagged documents, retaining analyst judgment where it adds value
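As an illustration of how the audit trail and the human-review flag might fit together, the sketch below logs each extraction with its source text, confidence score, and model version, then routes a document to an analyst when any field falls below a cut-off. The record layout, the function names, and the 0.85 cut-off are assumptions made for this example; the case study does not specify them.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Hypothetical per-field confidence cut-off; not a figure from the engagement.
REVIEW_THRESHOLD = 0.85


@dataclass(frozen=True)
class ExtractionRecord:
    """One audit-trail entry per extracted field (assumed layout)."""
    doc_id: str
    field_name: str
    value: str
    source_text: str
    confidence: float
    model_version: str
    logged_at: str


def log_extraction(audit_log: list[str], doc_id: str, field_name: str, value: str,
                   source_text: str, confidence: float,
                   model_version: str) -> ExtractionRecord:
    # Serialise the record as one JSON line so the log stays append-only.
    record = ExtractionRecord(
        doc_id, field_name, value, source_text, confidence, model_version,
        logged_at=datetime.now(timezone.utc).isoformat(),
    )
    audit_log.append(json.dumps(asdict(record), sort_keys=True))
    return record


def needs_human_review(records: list[ExtractionRecord],
                       high_regulatory_exposure: bool) -> bool:
    # Flag on regulatory exposure, or on any low-confidence extraction.
    return high_regulatory_exposure or any(
        r.confidence < REVIEW_THRESHOLD for r in records
    )


audit_log: list[str] = []
records = [
    log_extraction(audit_log, "doc-001", "counterparty", "Example Bank",
                   "Counterparty: Example Bank", 0.97, "ner-v1"),
    log_extraction(audit_log, "doc-001", "isin", "XX0000000000",
                   "ISIN: XX00000000O0", 0.62, "ner-v1"),
]
flagged = needs_human_review(records, high_regulatory_exposure=False)
```

Here the second field has a low confidence score, so `flagged` is `True` and the document would go to an analyst, while both extractions remain in the log either way.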

Implementation Timeline

Discovery & Audit (3 weeks)
Document corpus analysis; regulatory requirement mapping; extraction field specification; success metric baseline

Model Development (6 weeks)
Custom NER model training; OCR pipeline configuration; risk classification model; accuracy benchmarking

Integration & Testing (4 weeks)
Case management API integration; UAT with compliance team; audit trail implementation; regulatory review

Deployment & Hypercare (3 weeks)
Phased production rollout; analyst training; performance monitoring; 30-day hypercare

Results & Impact

Within 90 days of go-live, average document review time dropped from 18 days to 4 days. Analyst time spent on manual data entry fell from 23% to less than 5%. The system now processes 10,000+ documents per month with 91% extraction accuracy — above the client's 88% threshold for straight-through processing.

Review time: 18 days → 4 days (78% reduction)

$1.8M in annual analyst cost savings in Year 1

91% extraction accuracy — exceeding the 88% straight-through processing threshold

Zero regulatory deadline misses in the 6 months post-deployment

Full audit trail now available for regulator inspection
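For readers who want to check the headline figure, the 78% reduction follows directly from the 18-day and 4-day review times quoted above. The helper name below is ours, used only for this worked example.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# 18 days before, 4 days after: (18 - 4) / 18 is roughly 77.8%, reported as 78%.
review_time_reduction = percent_reduction(18, 4)
print(f"{review_time_reduction:.1f}%")
```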

Norvik didn't just build us a tool — they transformed how our compliance team operates. We went from drowning in paper to having real-time visibility into our review pipeline.

Sarah Mitchell

VP of Compliance, Global Investment Firm

Key Learnings

Fine-tuning on a client-specific corpus of just 2,000 documents produced substantially better results than a general model — domain-specific training data matters more than volume

Retaining human review for flagged documents accelerated internal adoption — the compliance team trusted a system that respected their judgment

Audit trail requirements should be designed first, not retrofitted — the data lineage architecture shaped every other architectural decision

Services Engaged

Document AI
AI Automation
Data Infrastructure
AI Strategy

Technology Stack

LangChain
GPT-4
AWS Textract
Python
FastAPI
PostgreSQL
