CTO · fintech scale-up · Mumbai
7s → 0.4s
transaction scoring latency
The Emergency: The fraud-scoring model added 7s to every transaction and approvals timed out.
What happened: Booked QuickHire at 9pm; a PM and an AI infrastructure engineer joined within 10 minutes.
Result: Speculative decoding and INT4 quantisation cut scoring latency to under 400ms by midnight.











