The problem
A regional financial firm received thousands of free-text tickets per month. Back-office staff classified them manually for routing and priority. Misclassification caused missed SLAs and regulatory costs.
The hard constraint
Nothing could leave the client’s datacenter, because the tickets contain regulated end-customer data. External APIs (OpenAI, Anthropic) were off the table from the first meeting.
The solution
A classifier built on a local LLM, with automatic PII masking:
- Model: Llama 3.1 8B fine-tuned on 18 months of categorized tickets.
- PII masking: Microsoft Presidio anonymizes names, national IDs, emails and account numbers before any inference.
- Routing: classification → matching queue → notification to the assigned human agent.
- Audit log: every decision is recorded for regulatory audit.
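The four steps above can be sketched end to end as follows. This is a minimal, standard-library-only illustration and not the deployed code. The regex masker is a simplified stand-in for Microsoft Presidio (it covers only emails and account-like digit runs, not names or national IDs). `classify` is a keyword stub where the fine-tuned Llama 3.1 8B call would go. The category names, queue names, and audit-log fields are all hypothetical.

```python
import hashlib
import json
import re
from datetime import datetime, timezone

# Hypothetical patterns standing in for Presidio's recognizers.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ACCOUNT": re.compile(r"\b\d{10,16}\b"),
}

# Hypothetical category -> queue mapping.
QUEUES = {"card_dispute": "disputes-queue", "account_access": "access-queue"}
FALLBACK_QUEUE = "manual-triage"


def mask_pii(text: str) -> str:
    """Replace each PII match with a placeholder such as <EMAIL>."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def classify(masked_text: str) -> str:
    """Keyword stub; a real system would query the local fine-tuned model here."""
    lowered = masked_text.lower()
    if "charge" in lowered or "dispute" in lowered:
        return "card_dispute"
    if "password" in lowered or "login" in lowered:
        return "account_access"
    return "unknown"


def route(ticket_id: str, raw_text: str, audit_log: list) -> str:
    """Mask, classify, pick a queue, and append one audit record."""
    masked = mask_pii(raw_text)  # masking runs before any inference
    category = classify(masked)
    queue = QUEUES.get(category, FALLBACK_QUEUE)
    audit_log.append(json.dumps({
        "ticket_id": ticket_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_sha256": hashlib.sha256(masked.encode("utf-8")).hexdigest(),
        "category": category,
        "queue": queue,
    }))
    return queue
```

In this sketch the audit record stores a hash of the masked text rather than the text itself, and unknown categories fall back to a manual queue. Both are design assumptions for illustration; the actual log schema and fallback policy are not described in this write-up.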
Why fine-tuning and not just prompts
The client’s internal vocabulary (local regulatory terminology, in-house product names) wasn’t well covered by a base model. A short fine-tune with the client’s own data pushed precision above the threshold we needed to trust automatic routing.
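One way to make "precision above the threshold" concrete is to measure per-category precision on held-out labeled tickets and enable automatic routing only for categories that clear the bar. The sketch below assumes this per-category gating; the function names, the labels, and the 0.9 threshold in the usage note are hypothetical, since the write-up does not state the actual threshold or evaluation setup.

```python
from collections import Counter


def per_category_precision(y_true, y_pred):
    """Precision per predicted category: correct predictions / all predictions."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {cat: correct[cat] / n for cat, n in predicted.items()}


def auto_routable(y_true, y_pred, threshold):
    """Categories whose held-out precision meets the (assumed) threshold."""
    precision = per_category_precision(y_true, y_pred)
    return {cat for cat, p in precision.items() if p >= threshold}
```

For example, calling `auto_routable(labels, predictions, threshold=0.9)` before and after fine-tuning would show which categories the fine-tune moved from manual to automatic routing.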
Outcomes
Illustrative figures based on comparable projects: 82% of volume routed automatically, with 65% lower time-to-answer and 40% fewer re-escalated tickets than the previous manual workflow. The pipeline runs 100% on-prem.