We architected a "Human-in-the-Loop" Document Intelligence Pipeline using Azure AI Document Intelligence (formerly Form Recognizer) and Large Language Models (LLMs). This took the pipeline beyond simple OCR to semantic understanding of each document.
Step 1: Intelligent Ingestion & OCR
We built a secure upload portal where officers dropped PDFs. The system automatically pre-processed them (cleaning scans, rotating pages) and used high-fidelity OCR to convert them into machine-readable text, preserving table structures and headers.
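To illustrate what "preserving table structures" can mean in practice, here is a minimal Python sketch, not the production code. It assumes the OCR stage returns table cells tagged with row and column indices, which is loosely how layout-style OCR services report tables. The `Cell` type and `cells_to_grid` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One OCR-detected table cell (hypothetical shape, for illustration only)."""
    row: int   # 0-based row index within the table
    col: int   # 0-based column index within the table
    text: str  # recognized text content of the cell


def cells_to_grid(cells: list[Cell]) -> list[list[str]]:
    """Rebuild a rectangular table from a flat list of cells.

    Positions with no detected cell are left as empty strings.
    """
    if not cells:
        return []
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.col for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.col] = c.text
    return grid


cells = [
    Cell(0, 0, "Party"), Cell(0, 1, "Jurisdiction"),
    Cell(1, 0, "Vendor A"), Cell(1, 1, "Bahrain"),
]
for row in cells_to_grid(cells):
    print(" | ".join(row))
```

Keeping the grid intact matters downstream: a clause that sits in a table row can still be tied to its column header when the document is analyzed.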
Step 2: Semantic Analysis & Risk Scoring
We trained a custom NLP model on Al Baraka's historical data and specific regulatory rulebooks. The AI scanned the document for 40+ specific risk indicators:
* "Does this contract reference an interest rate (Riba)?" (Strictly prohibited).
* "Is the governing law clause valid?"
* "Does the counterparty appear on the sanctions list?"
The AI assigned a "Risk Score" (0-100) and generated a "Compliance Summary" highlighting the exact page and paragraph of any flagged issues.
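The scoring and summary step could be sketched roughly as follows. This is an illustration under stated assumptions, not the model we deployed: the indicator names, the per-flag `weight` values, and the "sum and cap at 100" rule are all hypothetical, since the real score came from the trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    indicator: str  # e.g. "interest_rate_reference" (hypothetical name)
    page: int       # 1-based page where the clause was found
    paragraph: int  # 1-based paragraph on that page
    weight: int     # hypothetical severity weight, not from the real system


def risk_score(flags: list[Flag]) -> int:
    """Sum the severity weights and cap the result at 100."""
    return min(100, sum(f.weight for f in flags))


def compliance_summary(flags: list[Flag]) -> list[str]:
    """One line per flag, most severe first, with its exact location."""
    ordered = sorted(flags, key=lambda f: f.weight, reverse=True)
    return [f"{f.indicator}: page {f.page}, paragraph {f.paragraph}" for f in ordered]


flags = [
    Flag("governing_law_clause", page=7, paragraph=2, weight=25),
    Flag("interest_rate_reference", page=3, paragraph=4, weight=60),
]
print(risk_score(flags))  # 85
for line in compliance_summary(flags):
    print(line)
```

Carrying the page and paragraph on every flag is what lets a reviewer jump straight to the clause instead of searching the document.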
Step 3: The Officer's Workbench
Instead of reading the whole document, the officer opened a custom web dashboard.
* Green Documents (Low Risk): Auto-approved for processing. This covered 80% of standard vendor contracts.
* Red Documents (High Risk): The dashboard showed the PDF on the left and the AI's "Red Flags" on the right. Clicking a flag jumped to the exact clause. The officer could "Accept" or "Reject" the AI's finding.
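The Green/Red routing above can be pictured as a simple threshold on the Risk Score. The sketch below is a simplified illustration: the cutoff value `AUTO_APPROVE_BELOW` is an assumption made for the example, and the names `Route` and `route_document` are hypothetical.

```python
from enum import Enum

# Hypothetical cutoff for illustration; the real threshold is not stated here.
AUTO_APPROVE_BELOW = 30


class Route(Enum):
    GREEN = "auto_approved"   # low risk, no officer review needed
    RED = "officer_review"    # high risk, shown in the workbench


def route_document(score: int) -> Route:
    """Send a document to auto-approval or officer review based on its score."""
    if not 0 <= score <= 100:
        raise ValueError(f"risk score must be in 0-100, got {score}")
    return Route.GREEN if score < AUTO_APPROVE_BELOW else Route.RED


for score in (12, 30, 85):
    print(score, route_document(score).value)
```

In a setup like this, the cutoff is the main lever for trading officer workload against risk: lowering it sends more documents to review.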
Step 4: Continuous Learning
Every time an officer corrected the AI (e.g., "This clause is actually compliant because of exception X"), that feedback was looped back to retrain the model, making it smarter every week.
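One way to capture that feedback is to turn each "Accept" or "Reject" decision into a labeled training example. The following is a minimal sketch under assumptions: the `OfficerFeedback` record, its field names, and the label strings are hypothetical, and the actual retraining job is out of scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfficerFeedback:
    clause_text: str        # the clause the AI flagged
    indicator: str          # which risk indicator fired (hypothetical name)
    officer_accepted: bool  # True if the officer agreed with the AI's flag
    note: str = ""          # optional free-text reason from the officer


def to_training_example(fb: OfficerFeedback) -> dict[str, str]:
    """Convert one officer decision into a labeled example.

    An accepted flag confirms the clause as non-compliant; a rejected
    flag is a correction, so the clause is labeled compliant.
    """
    label = "non_compliant" if fb.officer_accepted else "compliant"
    return {
        "text": fb.clause_text,
        "indicator": fb.indicator,
        "label": label,
        "note": fb.note,
    }


batch = [
    OfficerFeedback("Late fees accrue at 2% per month.", "interest_rate_reference", True),
    OfficerFeedback("Profit rate per Schedule B.", "interest_rate_reference", False,
                    note="Compliant because of exception X"),
]
examples = [to_training_example(fb) for fb in batch]
corrections = [ex for ex in examples if ex["label"] == "compliant"]
print(len(examples), "examples,", len(corrections), "correction(s)")
```

Keeping the officer's note alongside the label preserves the reasoning behind each correction, which is useful when auditing why the model's behavior changed.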