Updated Sep 19, 2023. Tax Terrapin, ANNA’s free-to-use ChatGPT-based tax bot, could potentially be certified as an official tax advisor, since it can already pass some of the ATT, ACCA and CIMA certification exams.
In this article
- What is ANNA’s Tax Terrapin?
- What is ATT, ACCA, CIMA, and how to get certified?
- How ANNA’s Tax Terrapin “took” the accounting tests (methodology)
- How well did ANNA’s Tax Terrapin do?
- How is ANNA’s Tax Terrapin different from ChatGPT?
- What this all means – for ANNA customers and for accountants
Since the release of GPT3.5 last year, quite a few chatbots have scored well on legal exams, so we thought it would be interesting to see if Terrapin could pass industry-recognised accounting exams and get itself qualified. In this blog post we’ll show how we’re training Terrapin and how close it is to passing the official certification tests (it has already passed some of them and is close to passing more).
What is ANNA’s Tax Terrapin?
Tax Terrapin is ANNA’s tax chatbot, powered by the GPT4 large language model and trained on over 100,000 of HMRC’s tax articles. It provides comprehensive, up-to-date answers to your business tax questions. As well as answering your questions, it gives you links to the specific HMRC articles where it found the information.
What is ATT, ACCA, CIMA, and how to get certified?
There are several professional associations of accountants in the UK, and to get certified as an accountant or accounting consultant, you need to pass one of their tests. Some of the best-known associations with certification tests are the Association of Taxation Technicians (ATT), the Association of Chartered Certified Accountants (ACCA) and the Chartered Institute of Management Accountants (CIMA). Each of them offers its own exams, with its own number of questions and test methodology. Some exams are relatively straightforward, with closed-ended questions and no free-form calculations. Others are more complex and aim to evaluate a general understanding of the abstract concepts behind practical accounting; these include custom problems to be solved. As you might guess, Terrapin produced different results depending on the type of examination it took.
How ANNA’s Tax Terrapin “took” the accounting tests (methodology)
In practical terms, how does an AI tax bot prepare for an exam and, most importantly, actually take it? Here’s what we did.
- We found the publicly available test samples (both questions and answers) and loaded them into a spreadsheet, then manually labelled the questions and the corresponding correct answers.
- We “made” Terrapin study that spreadsheet, thus training it on the labelled data.
- To take the exams, we (the humans of ANNA!) manually started each test on its website, copied and pasted each question into Terrapin as it came up, then copied Terrapin’s answers back into the exam one by one. In this way Terrapin “took the test” through a human proxy (yes, we see the irony in a human helping the taxbot).
- We submitted the answers and recorded the results.
Some important details:
- We only used publicly available test samples for training. Some certification exams don’t have public samples, sometimes because part of the test involves a human examiner presenting a problem to be solved live.
- The sample size was about 1,000 question-and-answer pairs.
- Terrapin was only trained on questions that had predetermined correct answers. We didn’t train it on open-ended questions or on the exam sections with a problem to solve.
- Most exams have different types of questions. We trained Terrapin on: assigning labels to text samples; filling in the gaps in text; typing in numbers; choosing one correct answer; and choosing multiple correct answers.
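The question types listed above all have a predetermined correct answer, which is what makes them easy to score automatically. Here is a minimal Python sketch of how such labelled question-and-answer pairs might be represented and marked. The names `Question`, `is_correct` and `score` are hypothetical, the sample questions are made up, and this is not ANNA’s actual training or grading code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Question:
    # kind is one of: "label", "gap", "number", "single", "multiple"
    kind: str
    prompt: str
    correct: object  # str, a number, or a collection of option letters


def is_correct(question: Question, given: object) -> bool:
    """Compare one answer against the labelled correct answer."""
    if question.kind == "number":
        # Typed-in numbers: compare numerically, so "12.5" matches 12.5.
        return abs(float(given) - float(question.correct)) < 1e-9
    if question.kind == "multiple":
        # Multiple correct options: order does not matter, the set must match.
        return frozenset(given) == frozenset(question.correct)
    # Labels, gap fills and single choice: case-insensitive text match.
    return str(given).strip().lower() == str(question.correct).strip().lower()


def score(questions: Sequence[Question],
          answer_fn: Callable[[Question], object]) -> float:
    """Return the fraction of questions answered correctly (0.0 to 1.0)."""
    if not questions:
        return 0.0
    right = sum(is_correct(q, answer_fn(q)) for q in questions)
    return right / len(questions)


# Made-up sample data, only to show the shape of the spreadsheet rows.
sample = [
    Question("single", "Pick one option", "B"),
    Question("number", "Type in the amount", 12.5),
    Question("multiple", "Pick all that apply", {"A", "C"}),
    Question("gap", "Fill in the gap", "accrual"),
]
```

A score on the same 0 to 1 scale can then be compared with an exam’s pass threshold.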
How well did ANNA’s Tax Terrapin do?
The results varied from test to test, but on average it did well. Here’s a breakdown of the results for each exam.
ATT – passed the CBE part of the exam
With ATT, we were only able to take the Compulsory Computer-Based Examinations, or CBEs, which make up the largest part of the test. We weren’t able to do the human-interaction part of the test, for obvious reasons. (We’d like to do this part eventually, so we’ve reached out to ATT to see if it’ll be possible.)
Since ATT’s CBE test consists only of closed-ended questions, Terrapin got good results and passed all 3 parts of the exam (Accounting, Law, and Professional Responsibilities and Ethics).
ACCA – passed both parts
ACCA’s examination consists of two parts: management accounting and financial accounting. Previously, Terrapin passed the management accounting part with a score of 0.6 (the pass threshold being 0.5) but failed the financial accounting part with a score of 0.4. The most recent version of Terrapin now passes the financial accounting part too, with a score of 0.5. This became possible because the LLM now uses a calculator to work out the answers to questions involving numbers and maths. Not every answer is correct, but we do see a gap in quality (very significant in some tests) between Terrapin and the standard GPT4 model.
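To give a feel for the calculator step, here is a minimal Python sketch of a tool that a language model could hand arithmetic to instead of working out the digits itself. It is an illustration under our own assumptions, not Terrapin’s implementation: the function name `calculate` is hypothetical, and it only supports plain arithmetic.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str) -> float:
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return float(_eval(ast.parse(expression, mode="eval")))


# The model would emit the expression as text; the tool returns the number.
print(calculate("(1500 - 300) * 0.25"))  # 300.0
```

The design idea is that the model only has to decide which expression to compute, while the exact arithmetic is done by ordinary code.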
CIMA – 75% of Certificate level – very good but not yet a pass
CIMA offers multiple levels of certification, the first one being the Certificate Level. The test consists of 4 parts (BA1, BA2, BA3, BA4), and Terrapin passed 3 of them (on BA3 it scored 50%, with 70% required to pass each part). While Terrapin didn’t pass the first time, there’s a good chance it will pass soon, thanks to the vast amount of learning material and the fact that the whole test is computer-based.
The next CIMA certification levels would be Operational, Management and Strategic, but these are currently out of Terrapin’s scope.
What’s notable about the CIMA exam is that it has the largest number of questions, which makes for the biggest training base for the model. In general, of all the exams we subjected it to, Terrapin showed the best result on this one, and it significantly outperformed the standard GPT4 model.
How is ANNA’s Tax Terrapin different from ChatGPT?
ChatGPT is a chatbot based on GPT4. GPT4 forms part of Terrapin as well, but Terrapin also has a few additional components.
GPT4 is OpenAI’s large language model (LLM) and is at the core of Terrapin. It is capable of very advanced conversations, and it can base what it says on data sources that it’s been instructed to study. While it works impressively on its own, we had to improve on it to make sure it provided more accurate and useful information.
On top of GPT4, Terrapin has 2 extra ingredients:
- A custom prompt (a rigid set of instructions) that we engineered to help GPT4 focus on the specific tax situation presented to it. We won’t get into technical details right now, but this is an important step in making Terrapin better at taxes than the core GPT4 setup.
- HMRC’s tax knowledge database, consisting of over 100,000 articles, from which Terrapin sources its answers.
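We haven’t published how these two ingredients are wired together, so the following Python sketch is only one plausible way to picture it: look up the most relevant articles for a question, then put them into a fixed prompt alongside the question. Everything here is an assumption for illustration, including the names `Article`, `retrieve` and `build_prompt`, the wording of `SYSTEM_PROMPT`, the toy word-overlap ranking and the placeholder URLs.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass(frozen=True)
class Article:
    url: str
    text: str


# Hypothetical instructions; the real custom prompt is not public.
SYSTEM_PROMPT = (
    "Answer UK business tax questions using only the numbered sources. "
    "Give the source link for each claim."
)


def _words(text: str) -> Set[str]:
    """Lower-cased words with surrounding punctuation stripped."""
    return {w.strip(".,?!()").lower() for w in text.split()} - {""}


def retrieve(question: str, articles: Sequence[Article], k: int = 2) -> List[Article]:
    """Toy ranking: articles sharing the most words with the question first."""
    q = _words(question)
    ranked = sorted(articles, key=lambda a: len(q & _words(a.text)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, articles: Sequence[Article], k: int = 2) -> str:
    """Combine the fixed instructions, the retrieved sources and the question."""
    sources = retrieve(question, articles, k)
    numbered = "\n".join(
        f"[{i}] {a.url}\n{a.text}" for i, a in enumerate(sources, start=1)
    )
    return f"{SYSTEM_PROMPT}\n\nSources:\n{numbered}\n\nQuestion: {question}"
```

Keeping the article URL next to each source is what would let the answer link back to the article it came from.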
We saw that Terrapin was much better (54% better) at calculations than GPT4 on its own. At the moment large language models are notorious for mixing up factual information like names, years and numbers. While this might not be critical in some situations (writing emails, marketing copy, etc.), you really wouldn’t want your accountant to mix up the numbers on your tax return. So naturally this was an area we focused on a lot, and we’re pleased with the current results.
When presented with a specific situation, such as “I run a small business in the UK. My annual profit is £120,000, but I’ve made an equipment purchase of £25,000 that qualifies for Annual Investment Allowance (AIA). How much corporation tax would I owe, assuming a tax rate of 25% and full AIA relief?”, Terrapin replies with very precise calculations and deadlines. While we can’t yet classify its answers as legal accounting advice, that might well happen soon.
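For reference, the arithmetic behind that sample question can be sketched in a few lines of Python. This is our reading of the question, not Terrapin’s output: it assumes the £120,000 profit is the figure before the equipment purchase is deducted, and it uses the flat 25% rate and full AIA relief stated in the question. The function name `corporation_tax` is hypothetical.

```python
from decimal import Decimal


def corporation_tax(profit: Decimal, aia_spend: Decimal, rate: Decimal) -> Decimal:
    """Tax on profit after deducting the full AIA-qualifying spend."""
    # Assumption: profit is stated before the AIA deduction.
    taxable_profit = max(profit - aia_spend, Decimal("0"))
    return taxable_profit * rate


# £120,000 - £25,000 = £95,000 taxable profit; 25% of that is £23,750.
tax = corporation_tax(Decimal("120000"), Decimal("25000"), Decimal("0.25"))
print(tax)  # 23750.00
```

`Decimal` is used rather than floating point so that money amounts stay exact.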
What this all means – for ANNA customers and for accountants
So, what happens when tax AIs start being as good at accounting as humans are – or even exceed us? Will it be worth aspiring accountants taking their exams if AIs can do it better?
We can’t really predict how this kind of tax automation will affect accountants, but we hope it’ll mean there’s no need for accountants to memorise and store vast amounts of very specific information about tax rules. We hope it’ll make their lives easier, giving them time to work with more businesses and to find new and better solutions, spending less time on the numbers and more time on their clients.
As for ANNA customers, they’ll soon be able to quickly access very precise and accurate answers to all their tax questions, tailored to their specific business. In fact, it won’t just be ANNA customers – Tax Terrapin is free for everyone to use.