Finance Process Flows
Reimagined with AI
AP & AR automation — invoice receipt to ERP posting in 3.2 seconds, not 25 minutes. Zero manual data entry on 95% of invoices.
From Invoice Chaos to Full Automation
A six-step transformation from an 8-person AP team to a serverless AI pipeline processing 15,000 invoices per month.
15,000 Invoices/Month, 8 People, 25 Minutes Each
The AP team received invoices by email, PDF, portal, and fax. Every invoice required manual keying into Acumatica — vendor lookup, GL code from memory, entity routing by gut instinct. At $25/invoice fully loaded, that is $375,000 per year in AP labor alone.
200+ Vendors, Zero Consistent Format
Ashford's 9 hospitality entities (Stirling, Remington, Premier, OpenKey…) each had different GL structures, approval thresholds, and vendor relationships. The same vendor invoiced different entities differently. No OCR tool alone could handle the variance without a judgment layer.
Azure Document Intelligence Extracts in One Call
The prebuilt invoice model in Azure AI Document Intelligence (formerly Form Recognizer) extracts vendor name, invoice number, amounts, and line items in a single analyze request at $0.01/page. Per-field confidence scores flag low-confidence extractions. Output: structured JSON ready for the judgment layer.
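The per-field confidence check described above can be sketched on plain dictionaries. This is a minimal illustration, not the production code: the input shape, the function name `flag_low_confidence`, and the sample values are assumptions, and the 0.85 threshold simply mirrors the one used in the extraction listing further down.

```python
# Hypothetical input shape: field name -> {"value": ..., "confidence": float}.
FIELD_CONFIDENCE_THRESHOLD = 0.85  # same value as the extraction listing


def flag_low_confidence(fields: dict,
                        threshold: float = FIELD_CONFIDENCE_THRESHOLD) -> dict:
    """Split extracted fields into accepted values and names needing review."""
    accepted, review = {}, []
    for name, field in fields.items():
        if field["confidence"] >= threshold:
            accepted[name] = field["value"]
        else:
            review.append(name)
    return {"accepted": accepted, "needs_review": sorted(review)}


# Made-up sample data for illustration only.
sample = {
    "VendorName": {"value": "Acme Linen Supply", "confidence": 0.97},
    "InvoiceId": {"value": "INV-1042", "confidence": 0.91},
    "InvoiceTotal": {"value": 1284.50, "confidence": 0.62},
}
result = flag_low_confidence(sample)
```

Here only `InvoiceTotal` falls below the threshold, so it is the one field routed to a reviewer while the other two pass through.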
Claude AI: The GL Code Judgment Layer
Rule-based GL coding works for 78% of invoices. It fails on new vendors, multi-line descriptions, and cross-entity allocations. A TF-IDF + logistic regression classifier trained on 2 years of GL history achieves 94% accuracy — 16 percentage points better than hand-coded rules.
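To see why hand-coded rules stall, consider a minimal keyword-rule sketch. The keywords, GL codes, and descriptions below are hypothetical and chosen only for illustration; a real chart of accounts differs per entity.

```python
from typing import Optional

# Hypothetical keyword -> GL code rules, checked in order.
GL_RULES = [
    ("linen", "6110"),
    ("electric", "6320"),
    ("plumbing", "6410"),
]


def rule_based_gl(description: str) -> Optional[str]:
    """Return the GL code of the first matching keyword, or None if no rule fires."""
    text = description.lower()
    for keyword, gl_code in GL_RULES:
        if keyword in text:
            return gl_code
    return None


known = rule_based_gl("Weekly linen service - pool towels")  # a rule exists
unknown = rule_based_gl("Q3 guest wifi managed service")     # no rule covers this
```

The second description returns `None` until someone writes a new rule, which is the maintenance burden a classifier trained on history is meant to reduce.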
3-Way Match + Hierarchical Approval Routing
PO vs. Invoice vs. Goods Receipt with 2%/5% tolerance bands. Invoices under $5K that pass 3-way match auto-approve — covering 73% of all invoice volume. Only exceptions or high-value invoices require a human decision. Approval queue cut by 73%.
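The routing rule in this step can be sketched as a small decision function. The $5,000 limit comes from the text; the function name and the route labels are assumptions made for the example.

```python
AUTO_APPROVE_LIMIT = 5_000.00  # from the text: invoices under $5K


def route_invoice(total: float, match_status: str) -> str:
    """Decide the approval path for one invoice (route labels are hypothetical)."""
    if match_status != "matched":
        return "exception_review"   # failed 3-way match: always a human decision
    if total < AUTO_APPROVE_LIMIT:
        return "auto_approve"       # low value and clean match
    return "manual_approval"        # clean match but high value


small_clean = route_invoice(1_200.00, "matched")
large_clean = route_invoice(18_500.00, "matched")
small_exception = route_invoice(900.00, "exception")
```

Checking the match status first means a low-value invoice with a variance still reaches a reviewer rather than slipping through on amount alone.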
Results: 95% Touchless, Full Audit Trail
OCR ($0.01) + Claude AI ($0.003) = $0.013/invoice vs. $15–$40 manual. 95% of invoices are processed with no human touch. Every invoice leaves an immutable audit trail in Azure SQL: extraction confidence, GL reasoning, match result, approval chain, ERP response code. DSO dropped 30% on AR.
See It From Every Angle
ELI5 mode explains the value to each stakeholder. Engineer mode runs a live 6-stage AP pipeline simulation on realistic synthetic invoices.
The Math That Matters
AP cost $25/invoice × 15,000/month = $375K/year. Now it is $0.013/invoice in compute cost. That is a $370K annual saving with tighter controls, faster month-end close, and a 30% DSO reduction on AR — freeing up working capital that was sitting in float.
Your Job Got Better
You no longer key 15,000 invoices a month. The system handles the 95% that are straightforward. You handle 750 complex, high-value, relationship-sensitive cases per month — the ones that actually require human expertise, negotiation, and judgment.
The Architecture is Serverless
Email / blob → Logic Apps trigger → Form Recognizer REST → Azure Function validates → GL classifier → Acumatica REST POST → SQL audit log. Zero VMs. The entire pipeline at 15K invoices/month costs roughly $195 in Azure compute.
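The stage chain above can be pictured as a list of functions applied in order. The sketch below is only a local stand-in: every stage is a stub returning canned data, and all function names, field names, and the example URL are hypothetical rather than the actual Logic Apps or Azure Functions code.

```python
from typing import Callable, Dict, List

Stage = Callable[[Dict], Dict]


# Stub stages: each stands in for a real service call and returns canned data.
def extract_fields(inv: Dict) -> Dict:
    return {**inv, "fields": {"vendor": "Acme Linen Supply", "total": 420.0}}


def validate(inv: Dict) -> Dict:
    return {**inv, "valid": inv["fields"]["total"] > 0}


def classify_gl(inv: Dict) -> Dict:
    return {**inv, "gl_code": "6110"}


def post_to_erp(inv: Dict) -> Dict:
    return {**inv, "erp_status": 200}


def write_audit(inv: Dict) -> Dict:
    return {**inv, "audited": True}


STAGES: List[Stage] = [extract_fields, validate, classify_gl,
                       post_to_erp, write_audit]


def run_pipeline(invoice: Dict, stages: List[Stage] = STAGES) -> Dict:
    """Apply each stage in order and record which stages ran."""
    trace: List[str] = []
    for stage in stages:
        invoice = stage(invoice)
        trace.append(stage.__name__)
    return {**invoice, "trace": trace}


result = run_pipeline({"blob_url": "https://example.invalid/invoice-1042.pdf"})
```

Because each stage takes and returns a plain dictionary, any one of them can be swapped for a real service call without touching the others.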
Every Decision Is Traceable
Every invoice carries an immutable record: OCR field confidence, GL assignment plus model reasoning, 3-way match result with variance details, approval chain with timestamps, and ERP HTTP response code. SOC 2-compatible. No more "who approved this?" investigations.
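One way to picture the per-invoice record is a frozen dataclass serialized to JSON before it is written. The class name, field names, and sample values are assumptions; a frozen dataclass only prevents in-process mutation, so immutability in the database itself would still depend on how the audit table is configured.

```python
import json
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical shape of one audit row; field names are illustrative."""
    invoice_id: str
    ocr_confidence: float
    gl_code: str
    gl_reasoning: str
    match_status: str
    approval_chain: Tuple[str, ...]
    erp_status_code: int


record = AuditRecord(
    invoice_id="INV-1042",
    ocr_confidence=0.93,
    gl_code="6110",
    gl_reasoning="description matched historical linen-service invoices",
    match_status="matched",
    approval_chain=("auto_approve",),
    erp_status_code=200,
)

# Serialize once; the JSON string is what would be written to the audit table.
row = json.dumps(asdict(record), sort_keys=True)

# Attempts to change the record after creation are rejected.
try:
    record.gl_code = "0000"
    mutated = True
except FrozenInstanceError:
    mutated = False
```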
How It Works — Lesson by Lesson
Six slides building from the business problem to fraud prevention. Each slide adds one layer of the system.
Four Engineering Decisions That Made It Work
The automation works because of specific, deliberate engineering choices — each with a measurable impact.
AI vs. Rules for GL Coding
Rule-based GL classification plateaus at 78% accuracy — it cannot generalize to new vendors or ambiguous multi-line invoices. A TF-IDF + logistic regression classifier on 2 years of history achieves 94%. The 16-point gap widens further on hospitality-specific vendor categories where rules require constant manual maintenance.
Why it matters: 94% accuracy is the threshold where AP team trust in the system exceeds resistance to adoption
Tolerance Bands Cut False Exceptions
Exact-match validation flags 40% of invoices as exceptions — creating a workload that overwhelms reviewers and destroys the ROI of automation. A 2% per-line and 5% total tolerance reduces false exceptions to 8%. Engineering the tolerance thresholds is the critical calibration: they must be loose enough to avoid noise but tight enough to catch the 2.3% genuine overpayment rate.
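The difference between exact matching and a tolerance band comes down to one comparison. The sketch below uses the 2% and 5% figures from the text; the helper name and the sample amounts are made up for illustration.

```python
TOLERANCE_LINE = 0.02   # 2% per line item
TOLERANCE_TOTAL = 0.05  # 5% on the invoice total


def within_tolerance(expected: float, actual: float, tolerance: float) -> bool:
    """True when the relative difference from `expected` is within `tolerance`."""
    if expected == 0:
        return actual == 0
    return abs(actual - expected) / abs(expected) <= tolerance


# A 1.5% line difference: exact matching would flag it, the 2% band accepts it.
exact_match = (100.00 == 101.50)
line_ok = within_tolerance(100.00, 101.50, TOLERANCE_LINE)

# A 3% line difference exceeds the line band and becomes an exception.
line_flagged = not within_tolerance(100.00, 103.00, TOLERANCE_LINE)

# A 4% difference on the total stays inside the 5% total band.
total_ok = within_tolerance(1_000.00, 1_040.00, TOLERANCE_TOTAL)
```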
Why it matters: false exception rate drives reviewer workload and determines whether the system is trusted or bypassed
Serverless Economics at Scale
The entire pipeline at 15,000 invoices/month: Form Recognizer OCR at $0.01/page, Azure Functions at near-zero, Claude API at $0.003/invoice, SQL writes at $0.001, Logic Apps orchestration at $0.001. OCR and the Claude API account for the headline $0.013/invoice, or $195/month; SQL writes and orchestration add about $0.002/invoice on top. Either way, that is a small fraction of the $375,000/year in AP labor it replaces. The cost structure is almost entirely variable: it falls toward zero in slow months and scales with volume without up-front infrastructure investment.
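The per-invoice figures can be added up directly. This sketch uses only the unit costs quoted above and assumes one page per invoice, which the text does not state; `Decimal` is used to avoid floating-point rounding in the cents.

```python
from decimal import Decimal

INVOICES_PER_MONTH = 15_000

# Unit costs quoted in the text (assumes one page per invoice).
ocr = Decimal("0.010")
claude = Decimal("0.003")
sql_writes = Decimal("0.001")
logic_apps = Decimal("0.001")

headline_per_invoice = ocr + claude                                 # 0.013
all_in_per_invoice = headline_per_invoice + sql_writes + logic_apps  # 0.015

monthly_headline = headline_per_invoice * INVOICES_PER_MONTH  # 195
monthly_all_in = all_in_per_invoice * INVOICES_PER_MONTH      # 225
```

On these numbers the quoted $0.013 and $195/month correspond to OCR plus the Claude API; including SQL writes and orchestration brings the total to about $0.015 per invoice, or $225/month.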
Why it matters: variable cost structure eliminates the risk of over-provisioning for peak capacity
Auto-Approval Covers the Majority
73% of invoices are under $5,000 and pass 3-way match validation. These auto-approve without any human touch. The approval queue drops from 15,000 to 4,050 invoices/month. Controllers now review only the invoices that genuinely benefit from human judgment — high-value, new-vendor, or exception-flagged. The spend velocity trigger adds automatic escalation when cost centers overspend, creating budget guardrails without policy changes.
Why it matters: routing 73% automatically lets approvers give full attention to the 27% that actually need them
Production Implementation
The three core algorithms — extraction, matching, and classification — that power the pipeline.
Invoice Data Extraction — Azure Form Recognizer (Python)
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.identity import DefaultAzureCredential
import os

CONFIDENCE_THRESHOLD = 0.85


def extract_invoice(blob_url: str) -> dict:
    """Extract structured fields from a PDF invoice
    using Azure AI Document Intelligence."""
    client = DocumentAnalysisClient(
        endpoint=os.environ["FORM_RECOGNIZER_ENDPOINT"],
        credential=DefaultAzureCredential()
    )
    poller = client.begin_analyze_document_from_url(
        "prebuilt-invoice", blob_url
    )
    result = poller.result()
    for invoice in result.documents:
        fields = invoice.fields
        extracted = {
            "vendor_name": _safe(fields, "VendorName"),
            "invoice_num": _safe(fields, "InvoiceId"),
            "invoice_date": _safe(fields, "InvoiceDate"),
            "due_date": _safe(fields, "DueDate"),
            "total": _safe(fields, "InvoiceTotal"),
            "line_items": _extract_lines(fields),
            "confidence": invoice.confidence,
        }
        if invoice.confidence < CONFIDENCE_THRESHOLD:
            extracted["needs_review"] = True
            extracted["review_reason"] = (
                f"Low confidence ({invoice.confidence:.2f})"
            )
        return extracted
    return {"error": "No invoice detected", "needs_review": True}


def _safe(fields, key):
    f = fields.get(key)
    return f.value if f else None


def _extract_lines(fields):
    items = fields.get("Items")
    if not items:
        return []
    lines = []
    for item in items.value:
        sf = item.value
        lines.append({
            "description": _safe(sf, "Description"),
            "quantity": _safe(sf, "Quantity"),
            "unit_price": _safe(sf, "UnitPrice"),
            "amount": _safe(sf, "Amount"),
            "confidence": item.confidence,
        })
    return lines
3-Way Match Engine — PO / Invoice / GRN (Python)
from dataclasses import dataclass
from Levenshtein import distance as lev_distance

TOLERANCE_LINE = 0.02   # 2% per line item
TOLERANCE_TOTAL = 0.05  # 5% on invoice total


@dataclass
class MatchResult:
    status: str        # "matched" | "exception"
    score: float       # 0.0 - 1.0
    exceptions: list
    explanation: str


def three_way_match(po, invoice, grn) -> MatchResult:
    """Compare PO, Invoice, and Goods Receipt line-by-line."""
    exceptions = []
    # 1. GRN must exist before matching
    if grn is None:
        return MatchResult("exception", 0.0,
            [{"type": "MISSING_GRN",
              "detail": f"No GRN for PO {po['po_number']}"}],
            "Goods receipt not yet recorded.")
    # 2. Match line items (fuzzy on description)
    matched, total_po, total_inv = 0, 0.0, 0.0
    for po_line in po["lines"]:
        best = _fuzzy_find(po_line, invoice["lines"])
        if best is None:
            exceptions.append({"type": "MISSING_LINE",
                "detail": f"PO line '{po_line['desc']}' not on invoice"})
            continue
        # Price variance check
        diff = abs(best["amount"] - po_line["amount"])
        if diff / max(po_line["amount"], 0.01) > TOLERANCE_LINE:
            exceptions.append({"type": "PRICE_VARIANCE",
                "detail": f"{po_line['desc']}: PO ${po_line['amount']:.2f}"
                          f" vs INV ${best['amount']:.2f}"})
        # Quantity check against GRN
        grn_line = _fuzzy_find(po_line, grn["lines"])
        if grn_line and grn_line["qty"] != best.get("qty", 0):
            exceptions.append({"type": "QTY_VARIANCE",
                "detail": f"{po_line['desc']}: GRN {grn_line['qty']}"
                          f" vs INV {best.get('qty', '?')}"})
        matched += 1
        total_po += po_line["amount"]
        total_inv += best["amount"]
    # 3. Total tolerance check
    if total_po > 0:
        var = abs(total_inv - total_po) / total_po
        if var > TOLERANCE_TOTAL:
            exceptions.append({"type": "TOTAL_VARIANCE",
                "detail": f"PO ${total_po:.2f} vs INV ${total_inv:.2f}"
                          f" ({var:.1%} variance)"})
    score = matched / max(len(po["lines"]), 1)
    status = "matched" if not exceptions else "exception"
    return MatchResult(status, score, exceptions,
        f"{matched}/{len(po['lines'])} lines, {len(exceptions)} exceptions")


def _fuzzy_find(target, candidates):
    best, best_d = None, 999
    for c in candidates:
        d = lev_distance(target["desc"].lower(), c["desc"].lower())
        if d < best_d:
            best, best_d = c, d
    return best if best_d < 3 else None
GL Code Classifier — TF-IDF + Logistic Regression (Python)
import os

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

CONFIDENCE_THRESHOLD = 0.70
MODEL_PATH = "models/gl_classifier.pkl"


# ── Training ──────────────────────────────────────────────────────
def train_gl_classifier(history_df):
    """Train on historical descriptions -> GL codes.
    Expects columns: 'description', 'gl_code'."""
    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(
            max_features=5000, ngram_range=(1, 2),
            sublinear_tf=True, stop_words="english")),
        ("clf", LogisticRegression(
            max_iter=1000, C=5.0,
            class_weight="balanced", solver="lbfgs")),
    ])
    pipe.fit(history_df["description"], history_df["gl_code"])
    # Create the output directory first so joblib.dump does not fail
    os.makedirs(os.path.dirname(MODEL_PATH), exist_ok=True)
    joblib.dump(pipe, MODEL_PATH)
    return pipe


# ── Prediction ────────────────────────────────────────────────────
def predict_gl(description: str, pipe=None):
    """Return GL code with confidence; top-3 fallback
    when confidence < threshold."""
    if pipe is None:
        pipe = joblib.load(MODEL_PATH)
    probas = pipe.predict_proba([description])[0]
    classes = pipe.classes_
    top_idx = probas.argsort()[::-1]  # class indices, most probable first
    best_code = classes[top_idx[0]]
    best_conf = probas[top_idx[0]]
    if best_conf >= CONFIDENCE_THRESHOLD:
        return {
            "gl_code": best_code,
            "confidence": round(float(best_conf), 3),
            "method": "auto",
        }
    # Low confidence: surface top 3 for human review
    return {
        "gl_code": best_code,
        "confidence": round(float(best_conf), 3),
        "method": "review",
        "suggestions": [
            {"gl_code": classes[i],
             "confidence": round(float(probas[i]), 3)}
            for i in top_idx[:3]
        ],
    }
Technology Stack
Every component in the AP/AR automation pipeline — from email ingestion to ERP posting.