My Data Science Internship at ANNA by Jonah Hughes


We spoke to Data Science Intern Jonah Hughes about his time at ANNA.


Structures, not models

I spent three months as a Data Science Intern in ANNA's Business Admin team, working with Nick Turusin and Liev Garcia. Across three projects, one principle became clear: the effectiveness of AI systems depends less on the model than on the structure surrounding it. Thoughtful design, domain expertise, rigorous evaluation, and iterative refinement form the foundation for deploying AI at scale.

Improving the AI Accountant

Customers rely on ANNA's AI Accountant to process financial documents and extract accurate tax data. A document can contain several dates (invoice date, payment date, service period), and the system has to predict the correct tax date through contextual reasoning. When I started, tax dates were predicted correctly 77% of the time, with the remaining cases needing manual review. I ran a comparative analysis across 537 documents, benchmarking a range of models and processing strategies.

Each improvement was guided by iterative evaluation, with every architectural decision linked to a specific pattern in the data. Early analysis revealed the model misidentified relevant invoices when predicting tax dates, a pattern visible only through systematic error review.

This led to restructuring the prompt to classify invoice relevance before predicting tax dates, a technique known as grounding. Splitting results by document file type showed distinct failure patterns, prompting a hybrid routing approach that applies the best processing method for each file format.
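To illustrate how splitting results by file type can drive a routing decision, here is a minimal sketch. It is not ANNA's evaluation code: the record fields (`file_type`, `strategy`, `predicted`, `expected`), the strategy names, and the sample rows are all assumptions made up for the example.

```python
from collections import defaultdict


def accuracy_by_group(records, key):
    """Return {group: fraction of correct predictions} for evaluation records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["predicted"] == record["expected"]:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


def best_strategy_per_file_type(records):
    """For each file type, pick the strategy with the highest accuracy."""
    by_type = defaultdict(list)
    for record in records:
        by_type[record["file_type"]].append(record)
    routing = {}
    for file_type, rows in by_type.items():
        scores = accuracy_by_group(rows, "strategy")
        routing[file_type] = max(scores, key=scores.get)
    return routing


# Hypothetical evaluation rows, invented for illustration only.
records = [
    {"file_type": "pdf", "strategy": "text", "predicted": "2024-01-05", "expected": "2024-01-05"},
    {"file_type": "pdf", "strategy": "vision", "predicted": "2024-01-09", "expected": "2024-01-05"},
    {"file_type": "jpg", "strategy": "text", "predicted": "2024-02-01", "expected": "2024-02-11"},
    {"file_type": "jpg", "strategy": "vision", "predicted": "2024-02-11", "expected": "2024-02-11"},
]

routing = best_strategy_per_file_type(records)
```

With a table like `routing`, each incoming document could be sent to whichever processing method scored best for its format, and the per-group accuracies stay available to justify the choice.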

Multi-page documents also often carried the relevant context for tax dates on later pages, which motivated expanding from single-page to 10-page processing. Finally, having the model output its reasoning alongside predictions made remaining errors traceable and informed further tuning.

After making these changes, tax dates are now correctly predicted over 90% of the time. The individual changes delivered the gains, but the evaluation infrastructure made each change identifiable and justifiable.

Building Sole Trader Onboarding

To help sole traders get started confidently, I built a personalised AI chatbot onboarding pathway that detects user type and routes sole traders through a multi-step conversational flow. Each step provides guidance on relevant features such as Self Assessment, marketplace integrations, and smart invoicing, while allowing the AI to determine which features to surface based on the user's profile.

This required building a framework where each conversational step can trigger relevant UI elements, along with careful crafting of the AI's instructions, tools and new interactive components to create a smooth, hands-on introduction.

This project reinforced that AI systems don’t need open-ended access to be effective. They can work within tightly scoped stages, while still allowing flexible user interaction.
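One way to picture tightly scoped stages is a flow in which each step declares what the assistant may surface and which UI element it triggers. The sketch below is only an assumption about how such a structure could look; the step names, feature identifiers, and UI element names are hypothetical and not taken from ANNA's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    allowed_features: tuple  # features the assistant may surface at this step
    ui_element: str          # hypothetical UI component triggered by the step


# Hypothetical sole-trader flow; names are illustrative only.
SOLE_TRADER_FLOW = (
    Step("welcome", ("self_assessment",), "intro_card"),
    Step("selling", ("marketplace_integrations",), "integration_picker"),
    Step("getting_paid", ("smart_invoicing",), "invoice_demo"),
)


def select_flow(user_type):
    """Route sole traders to the scoped flow; other users get no special flow."""
    return SOLE_TRADER_FLOW if user_type == "sole_trader" else ()


def features_for(step, requested):
    """Keep only the requested features that this step permits."""
    return [feature for feature in requested if feature in step.allowed_features]


flow = select_flow("sole_trader")
surfaced = features_for(flow[0], ["self_assessment", "smart_invoicing"])
```

The model still chooses what to surface, but only from the options the current step allows, which keeps the conversation flexible without giving it open-ended access.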

Early results are positive: cancellation rate decreased from 23.8% to 19.0%, engagement increased from 76.2% to 81.0%, and users completing the full flow rose from near zero to 8.8%.

Invoice Reconciliation

ANNA's expense management system matches bank transactions to uploaded invoices. The existing approach used fuzzy matching, a technique that measures the similarity between two pieces of text on a character-by-character basis. While effective for clear cases, it produced false positives: an invoice from "SMITH J" might incorrectly match a transaction for "JANE SMITH" because the text is similar despite referring to different people.
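The false-positive problem can be reproduced with any character-level similarity measure. The sketch below uses Python's standard-library `difflib.SequenceMatcher` as a stand-in; the article does not say which fuzzy-matching method ANNA uses, and the 0.5 threshold is an assumption chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical cut-off above which two names count as a candidate match.
FUZZY_THRESHOLD = 0.5


def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_candidate(invoice_name, transaction_name):
    """True when the two names are similar enough to be proposed as a match."""
    return similarity(invoice_name, transaction_name) >= FUZZY_THRESHOLD


# Both strings share the substring "smith", so they score as similar
# even though they may refer to different people.
looks_like_match = is_candidate("SMITH J", "JANE SMITH")
```

Because the score only reflects shared characters, it cannot tell whether "J" stands for "Jane" or for someone else entirely.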

I constructed a dataset of hard examples to quantify these failures, then experimented with different ways to incorporate AI reasoning into the system. The naive solution would send every potential match to an AI model, but evaluation showed that most cases did not need AI reasoning. The optimal strategy uses AI only as a validation layer after fuzzy matching finds a candidate, catching false positives before reconciliation completes.
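The validation-layer idea can be sketched as a two-stage check: a cheap fuzzy filter first, and a validator called only for pairs that pass it. This is a minimal illustration, not ANNA's system. `difflib` stands in for the unspecified fuzzy matcher, the threshold is an assumption, and `stub_validator` is a hard-coded placeholder for where an AI model call would go.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.5  # hypothetical cut-off for proposing a candidate


def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(invoice_name, transaction_name, validate):
    """Accept a match only if fuzzy matching proposes it AND the validator agrees."""
    if similarity(invoice_name, transaction_name) < FUZZY_THRESHOLD:
        return False  # rejected cheaply; the validator is never called
    return validate(invoice_name, transaction_name)


# Placeholder for an AI validation call: records each invocation and
# rejects one pair that is assumed to be a known false positive.
validator_calls = []
ASSUMED_DIFFERENT = {("SMITH J", "JANE SMITH")}


def stub_validator(invoice_name, transaction_name):
    validator_calls.append((invoice_name, transaction_name))
    return (invoice_name, transaction_name) not in ASSUMED_DIFFERENT
```

Calling `reconcile("SMITH J", "ACME LTD", stub_validator)` never reaches the validator, whereas `reconcile("SMITH J", "JANE SMITH", stub_validator)` passes the fuzzy filter and is then rejected by it. The expensive step therefore runs only on the candidates that could be false positives.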

Experimenting with different configurations improved accuracy from 60% to 86% while maintaining fast processing times.

Domain knowledge shaped the solution. Understanding the reconciliation problem led to a cleaner architecture requiring minimal AI intervention. Without rigorous evaluation, the temptation would be to add AI where something more targeted was needed.

What I learned

Each project reinforced the same principle: the model is one component in a larger system. The internship reminded me that effective AI systems depend on systematic evaluation, thoughtful architecture, and iterative refinement more than model intelligence alone.

My time at ANNA taught me how to build AI that is technically robust, aligned with user needs, and able to deliver tangible value at scale.
