Research Agenda

What does it take for financial institutions to confidently deploy AI within their high-stakes decisions?

Every day, financial institutions run on a current of critical decisions, from determining whether a transaction is fraudulent to calculating the amount of risk to accept when underwriting a loan.

We research the conditions required to deploy AI in this kind of high-stakes decision-making, where errors can jeopardize customer safety, regulatory standing, and the bottom line.

Why we’re launching now

AI models can now handle complex reasoning and multi-step tasks autonomously. Last year, DeepMind and OpenAI achieved gold-medal-level performance at the International Mathematical Olympiad, yet many financial institutions still wouldn’t trust AI to automate the financial analysis behind underwriting a business loan.

There’s a pressing need to break through this stasis. As model performance improves, we’re seeing new momentum for institutions to stay ahead of customer expectations, and even regulators have begun actively encouraging AI adoption.

Critically, the leaders tasked with creating value through AI are the same people whose jobs are at risk if something goes wrong. Supporting them with practical research has never been more important.

Why Taktile Labs can bridge the AI research gap in financial services

Most AI benchmarks are built on publicly available datasets, and real banking data is rarely public, so realistic benchmarks for banking use cases are extremely rare. Taktile is uniquely positioned to give financial institutions the insights they often lack:

Focus on financial services

In contrast to large AI labs, we focus our full energy and resources on researching AI applications within banking, fintech, and insurance.

Powered by high-quality, realistic data

We work closely with development partners and industry experts to create realistic, high-quality evaluation datasets, including annotations of human performance thresholds. We then assess how models perform in common use cases like underwriting, fraud detection, and anti-money laundering (AML).

Informed by our environment

Leading financial institutions run millions of decisions on Taktile, giving us a clear view into the context agents must operate within, and the real concerns of leaders driving AI adoption.

Our research pillars

Evaluations & Benchmarking

Academic research often relies on synthetic, manicured datasets that don’t account for the messiness and inconsistencies of real customer data. We use data supplied by our development partners to build trusted benchmarks for model performance in core financial services use cases, designed around the KPIs business teams care about most: accuracy, cost per decision, and latency.

Goal: Establish a trusted benchmark for state-of-the-art AI performance in financial services, so teams can confidently choose which models to deploy for which use cases.
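As a rough illustration of how the three KPIs above could be summarized for one model on one use case, here is a minimal Python sketch. It is not the FinSpread-Bench methodology. The record fields (`cost_usd`, `latency_s`) and the choice of median latency are assumptions made for this example only.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class EvalRecord:
    """One benchmark case (all field names are hypothetical)."""
    predicted: str    # decision produced by the model
    expected: str     # reference decision from the annotated dataset
    cost_usd: float   # assumed: total cost of producing this decision
    latency_s: float  # assumed: wall-clock seconds for this decision


def summarize(records: list[EvalRecord]) -> dict[str, float]:
    """Aggregate accuracy, cost per decision, and latency over a set of cases."""
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    correct = sum(1 for r in records if r.predicted == r.expected)
    return {
        "accuracy": correct / n,
        "cost_per_decision_usd": sum(r.cost_usd for r in records) / n,
        "median_latency_s": median(r.latency_s for r in records),
    }
```

Reporting the three numbers side by side, rather than folding them into one score, keeps the trade-off between quality, cost, and speed visible to the team choosing a model.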

View FinSpread-Bench →

Human-Agent Design Patterns

Not every decision should be fully automated, and not every decision requires a human to intervene. We study confidence-based routing that dynamically allocates tasks between agents and analysts based on case complexity, model uncertainty, and regulatory requirements. We research how to make agents that fail gracefully: systems that know when they don’t know, escalate appropriately, and learn from analyst feedback to continuously improve behavior.

Goal: Pursue balanced research that helps teams unlock the benefits of AI in complex decision-making while preserving the value of human judgment and engagement.
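To make the idea of confidence-based routing concrete, here is a minimal Python sketch under stated assumptions. The `Case` fields, the threshold values, and the rule ordering are all hypothetical. A real system would need calibrated confidence scores and institution-specific policy.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AGENT = "agent"
    ANALYST = "analyst"


@dataclass
class Case:
    """Inputs to the routing decision (all fields are hypothetical)."""
    model_confidence: float      # in [0, 1]; assumed to be calibrated
    complexity: float            # in [0, 1]; assumed complexity score
    requires_human_review: bool  # assumed flag derived from regulatory policy


def route(case: Case, min_confidence: float = 0.9,
          max_complexity: float = 0.7) -> Route:
    """Send a case to the agent only when every check passes."""
    # Regulatory requirements override everything else.
    if case.requires_human_review:
        return Route.ANALYST
    # Escalate when the model is unsure of its own answer.
    if case.model_confidence < min_confidence:
        return Route.ANALYST
    # Escalate cases judged too complex for automation.
    if case.complexity > max_complexity:
        return Route.ANALYST
    return Route.AGENT
```

For example, `route(Case(0.95, 0.2, False))` stays with the agent, while lowering the confidence to 0.6 escalates the same case to an analyst. Defaulting to escalation whenever any check fails is one simple way to express failing gracefully.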

Governance, Risk & Compliance

Deploying AI in financial services means operating under some of the most demanding regulatory requirements and internal risk policies. Agentic systems based on LLMs are stochastic and hard to inspect, which challenges assumptions behind existing frameworks such as SR 11-7 and traditional model risk management practices. We work with practitioners and regulators to clarify what responsible adoption looks like in practice.

Goal: Bridge the gap between innovation and compliance by partnering with regulators and financial institutions to co-create best practices for responsible AI deployment.

Foundation Models for Financial Data

Foundation models are trained primarily on public text, images, audio, and code. Financial decisioning, however, runs on data with a rich sequential and relational structure that is largely ignored by off-the-shelf LLMs. We explore transformer-based architectures that can better understand common data structures in financial services, and therefore drive more reliable decisions.

Goal: Design and benchmark foundation models purpose-built for financial data, clarifying where specialized models outperform general-purpose LLMs on real decisioning tasks.

Hybrid Decision Architectures

LLMs have reached a new level of sophistication, but that doesn’t make them the right solution for every problem. The most effective AI systems in financial services will be hybrid architectures: rules and heuristics to enforce strict business policies, machine learning models to capture well-understood numerical patterns, language models for unstructured data interpretation and reasoning, and agentic orchestration for multi-step workflows that span all three.

Goal: Help financial institutions navigate AI hype with clarity and choose the right tool for each task while balancing cost, risk, and performance.
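One way to picture such a hybrid architecture is a small orchestration function that applies each layer in turn. The Python sketch below is illustrative only: the application fields, the example rule, the 0.5 threshold, and the two callables standing in for a machine learning model and a language model are all assumptions, not a description of any production system.

```python
from typing import Any, Callable, Optional

Application = dict[str, Any]


def policy_rules(app: Application) -> Optional[str]:
    """Hard business rules: return a final decision, or None to continue."""
    if app["requested_amount"] > app["max_policy_amount"]:  # hypothetical rule
        return "decline"
    return None


def decide(
    app: Application,
    score_model: Callable[[Application], float],  # stand-in for an ML risk model
    interpret_text: Callable[[str], float],       # stand-in for an LLM reading notes
    refer_threshold: float = 0.5,                 # hypothetical cut-off
) -> str:
    """Orchestrate rules, numeric scoring, and text interpretation in order."""
    hard_decision = policy_rules(app)
    if hard_decision is not None:
        return hard_decision  # rules are strict and never overridden
    numeric_risk = score_model(app)
    text_risk = interpret_text(app["notes"])
    # Take the more cautious of the two signals before deciding.
    if max(numeric_risk, text_risk) > refer_threshold:
        return "refer"
    return "approve"
```

Because the rule layer runs first and returns early, the cheaper and more predictable components handle whatever they can, and the model calls are reserved for cases that actually need them.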

Our research model

Driven by an in-house research team

A dedicated in-house team of senior research engineers forms the hub of Taktile Labs, keeping our research activity consistent and productive.

Co-developed with financial institutions

We partner with leading banks, insurers, and fintechs on applied research projects, using real-world data to validate model performance in high-stakes decision use cases.

Guided by external expertise

We collaborate with a Research Council and Advisory Board of academics, regulators, industry leaders, and partners to help steer priorities and ensure relevance.