The challenges of developing a language app for healthcare

Our co-founder and CTO, Alexander Gyllensvärd, shares insights on the application of language models in healthcare, and how they could, and sometimes shouldn't, be used.

You’ve probably read plenty of posts lately about how ChatGPT is changing the world as we know it, how the development of AI is reinventing your line of work in particular, and about the mind-blowing potential of it all. This is one of those posts, but a bit more dull, with a focus on language models and healthcare, and how they could, but maybe shouldn’t, affect how healthcare professionals work.

We recently hosted a webinar on the progress of language technology and how it can be used to communicate in healthcare or when developing translation tools. Quite impressive technology is already in place for real-time translation, and holding a conversation on the fly without sharing a language is a reality in a few major languages. In healthcare, when no human interpreters are available, this technology can and should be used to communicate with the patient, strengthening the alliance and building a relationship. However, there are still many situations where language models simply should not be relied on.

Language models are based on a guessing game

Language models are, and will continue to be, built on a guessing game: they predict output rather than know it. The more data the machines can practise on, and the better the algorithms get, the more accurate the guesses become, but they are still guesses.
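
To make the guessing concrete, here is a toy sketch in Python. It is my own illustration, not how any real model is implemented; the point is simply that a language model holds a probability distribution over possible continuations and samples from it.

```python
import random

# Toy "language model": given a context, all it knows is a probability
# distribution over possible next words. The numbers are invented
# purely for illustration.
next_word_probs = {
    "the patient has a": {
        "fever": 0.55,
        "headache": 0.30,
        "fracture": 0.10,
        "ferret": 0.05,  # unlikely, but never impossible
    }
}

def guess_next_word(context: str) -> str:
    """Sample the next word from the model's distribution: a weighted guess."""
    probs = next_word_probs[context]
    words = list(probs.keys())
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

print(guess_next_word("the patient has a"))  # usually "fever", sometimes not
```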

A thought experiment. Imagine you suffer from muscle pain, and the instructions on the medicine say that in 95% of cases it reduces muscle pain, while in 5% of cases it has no effect. You would not think twice before taking it. Now suppose the instructions instead say 95% chance of muscle pain relief, but in 5% of cases you get a headache and no relief. You would probably still take it, but it depends on how severe the muscle pain is and what history you have with headaches. As a final example, there is still a 95% chance of muscle pain relief, but in 5% of cases it causes a heart attack. Is it worth it?

You can think of machine translation in healthcare in a similar way. The consequences of an incorrect translation, the equivalent of the medicine not working in our thought experiment, should determine which method you use. Even if there are a few situations where a guessing game would be good enough, perhaps combined with body language, most situations in healthcare require absolute certainty. You simply can't take the risk of being wrong in 5% of cases, and often not in 1% either.

Also remember that 95% is a made-up number; for many languages, the chances of getting it right are considerably lower.
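
Even if we generously keep the made-up 95%, a quick back-of-the-envelope calculation shows how the per-sentence risk compounds over a conversation (assuming, for simplicity, that errors are independent):

```python
# If each sentence is translated correctly with probability p, and
# errors are independent, a conversation of n sentences goes through
# entirely error-free with probability p ** n.
p = 0.95  # the made-up per-sentence accuracy from the thought experiment

for n in (1, 5, 10, 20, 40):
    print(f"{n:>2} sentences: {p ** n:.0%} chance of zero errors")

#  1 sentences: 95% chance of zero errors
#  5 sentences: 77% chance of zero errors
# 10 sentences: 60% chance of zero errors
# 20 sentences: 36% chance of zero errors
# 40 sentences: 13% chance of zero errors
```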

Accuracy is an issue in healthcare

The guessing game gives you an idea of machine translation in general, but it doesn't really explain why it is so hard to guess accurately in healthcare specifically. That brings me to the challenges of using language models in a translation app for healthcare.

First of all, healthcare has a complicated language of its own. Even if you as a patient speak the same language as the caregiver, you don't necessarily understand the terminology. If you do share a language, you can solve it by simply asking what a phrase means, a luxury you won't get when speaking a different language. Caregivers are aware of this, but when everything is lost in translation, it is much harder to take into account.

Cultural differences also easily lead to misunderstandings. For example, depending on which culture you are from, answering “no” to the question “You don’t have any allergies?” could mean either “no, you’re wrong, I do have allergies” or “no, I don’t have any allergies”. There are also cultures where an equivalent of a word you use in your language simply doesn’t exist, or at least is not in common use among the general population. So you can't assume that a translation will be understood just because it is correct.

These challenges mean two things. First, using machine translation puts a lot of pressure on the user: you need to be aware of exactly what you put in, because otherwise you're not even giving the machine a chance to guess right. Second, even if you do put in perfect sentences, current language models simply can't always get every detail right, especially in languages without much standardised grammar or training data.

How we work with AI and language models

The basics of how language models work, together with the challenges presented above, give us a clear problem definition for the app we develop at Care to Translate. We want to bridge the gap between human interpreters and machine translation, offering an alternative for when no human interpreter is present but you still need to trust the translations. We use fixed phrases, and have a library of about 3,000 phrases in 40+ languages covering different situations in healthcare. By using fixed phrases we can make sure the phrases are understood from a patient perspective, we can take cultural and contextual differences into consideration, and, of course, we can actually double-check our translations and fix potential errors. All in all, we eliminate as much of the error margin as possible, and help caregivers feel confident that what they want to say is what the patient hears.
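
In technical terms, the core of a fixed-phrase library is deliberately simple. The sketch below is hypothetical, the identifiers and example phrases are mine rather than our actual schema, but it shows the essential property: every string a user can surface has been written and verified in advance, so nothing is ever guessed at runtime.

```python
# Hypothetical fixed-phrase library: every string is written and
# verified by humans in advance, so a lookup can never "guess".
PHRASES: dict[str, dict[str, str]] = {
    "allergy_question": {
        "en": "Do you have any allergies?",
        "sv": "Har du några allergier?",
        "es": "¿Tiene alguna alergia?",
    },
}

def translate(phrase_id: str, language: str) -> str:
    """Return a verified translation, or fail loudly rather than guess."""
    try:
        return PHRASES[phrase_id][language]
    except KeyError:
        raise LookupError(
            f"No verified translation of '{phrase_id}' in '{language}'"
        )

print(translate("allergy_question", "sv"))  # Har du några allergier?
```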

This does not mean that we avoid modern language technology altogether. For example, we are taking advantage of the incredible progress in text-to-speech technology. Using a neural voice (a machine-generated voice) to add audio to our fixed phrases helps us maintain consistent quality, empathy and pronunciation throughout the library, making the content more accessible to the patient.
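
As a sketch of that workflow, here is how pre-generating audio for a phrase library might look. I am using the open gTTS library purely as a stand-in; the actual voices and services behind our product are not the point. What matters is that the audio is generated once, reviewed, and shipped, rather than synthesised live.

```python
from gtts import gTTS  # freely available TTS library, used here only as a stand-in

# Pre-generate audio for every verified phrase, rather than synthesising
# speech on the fly. This keeps pronunciation and quality consistent and
# lets the audio itself be reviewed before it ever reaches a patient.
phrases = {
    "allergy_question": {"sv": "Har du några allergier?"},
}

for phrase_id, translations in phrases.items():
    for lang, text in translations.items():
        audio = gTTS(text=text, lang=lang)
        audio.save(f"{phrase_id}_{lang}.mp3")  # shipped with the app, not generated live
```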

We also use other language technology, such as grammatical frameworks, to make it easier for the user to navigate our content. Our search engine is built to help users find content related to what they are looking for, for example by returning results based on synonyms of the search terms, or by taking variations in conjugation or tense into account.
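
As a rough illustration, here is a much-simplified, hypothetical version of that kind of search: a small synonym map plus a naive stemmer, where a real system would use proper morphological analysis.

```python
# Hypothetical, much-simplified sketch of a synonym- and
# inflection-aware phrase search.
SYNONYMS = {"hurt": "pain", "ache": "pain", "meds": "medicine"}

def normalise(word: str) -> str:
    """Reduce a word to a crude canonical form via synonyms and stemming."""
    word = word.lower().strip("?.,!")
    word = SYNONYMS.get(word, word)
    for suffix in ("ing", "ed", "es", "s"):  # toy stemmer, not real morphology
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    return SYNONYMS.get(word, word)

def search(query: str, phrases: list[str]) -> list[str]:
    """Return phrases sharing at least one normalised term with the query."""
    terms = {normalise(w) for w in query.split()}
    return [p for p in phrases if terms & {normalise(w) for w in p.split()}]

phrases = ["Where does it hurt?", "Do you take any medicine?"]
print(search("hurting", phrases))  # ['Where does it hurt?']
print(search("meds", phrases))     # ['Do you take any medicine?']
```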

I am very excited about how language models keep improving, and I have already made some of the latest AI part of my day-to-day work. But I am certain that we will need more than one tool to overcome language barriers in healthcare. We can't fall into the trap of believing AI will solve all our problems; instead, we need to keep adjusting and build tools that help us in the reality we have, not the one we dream of.