Because LLMs are a recent innovation, their pros and cons are not yet widely or easily understood.
First of all, LLMs are not going to replace humans or do their work for them. An LLM's accuracy depends heavily on its training dataset: the smaller the dataset, the less accurate the model tends to be. On top of that, an LLM will confidently respond to any prompt, whether or not the answer is correct, so you need to decide who is responsible for verifying its output before problems reach the customers using your product.
Accuracy can be validated with an evaluation ("evals") framework, which raises your confidence that the LLM will answer correctly in a given scenario by testing it against common patterns and a known set of prompts. The difficulty is that even with evaluation you can never be certain every response is accurate. As mentioned before, you can try to increase accuracy by using a larger model, but that comes at a cost in performance.
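As a rough illustration, here is a minimal eval harness in Python. The prompt set, the substring scorer, and the `run_evals` helper are all illustrative stand-ins for a real evaluation framework such as OpenAI Evals:

```python
from typing import Callable

# Illustrative prompt set with known answers; a real suite would be
# domain-specific and much larger.
EVAL_CASES = [
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "How many days are in a leap year?", "expected": "366"},
]

def run_evals(call_llm: Callable[[str], str], cases=EVAL_CASES) -> float:
    """Return the fraction of cases whose answer contains the expected string.

    Substring matching is a crude scorer; production eval suites use
    graded rubrics or a second model acting as judge.
    """
    passed = sum(
        1 for case in cases
        if case["expected"].lower() in call_llm(case["prompt"]).lower()
    )
    return passed / len(cases)

if __name__ == "__main__":
    # Stand-in model so the script runs end to end; swap in a real API call.
    fake_model = lambda prompt: "Paris is the capital of France."
    print(f"Pass rate: {run_evals(fake_model):.0%}")  # 50% with this stub
```

Tracking a pass rate like this per model and per prompt set lets you compare candidate models before putting one in front of users.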
Integrating LLMs can introduce noticeable latency into your user journey. Even with the fastest LLM, response time grows with model size and conversation context, sometimes stretching to tens of seconds, so be careful about where you add LLMs in your existing product flows.
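One defensive pattern, sketched below under the assumption of a synchronous `call_llm` wrapper around your provider of choice, is to put a hard deadline on the call and fall back to a non-LLM path when it is exceeded:

```python
import concurrent.futures

# Shared pool; a call that outlives its deadline finishes in the background.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_deadline(call_llm, prompt: str, timeout_s: float = 5.0):
    """Run a (possibly slow) LLM call with a hard deadline.

    Returns None on timeout so the caller can degrade gracefully,
    e.g. show cached or non-LLM content instead of blocking the flow.
    """
    future = _POOL.submit(call_llm, prompt)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return None
```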
To integrate LLMs into your product successfully, you need to work within the limitations described above.
When an LLM is added to a product flow, it will certainly add an extra latency component, so think carefully about which flows your users will be tolerant and patient in. Treat the integration not just as an engineering task but as a user-experience question. For example, if an LLM or GenAI feature is introduced on a critical path, users may get even more frustrated by the slow response times. And even if they tolerate the performance hit, they may still get bitten by hallucination problems.
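One way to soften the perceived latency is to stream the response token by token so the user sees output immediately instead of watching a spinner for the whole generation. A minimal sketch using the openai Python SDK (v1-style client); the model name is illustrative and `OPENAI_API_KEY` is assumed to be set in the environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request a streamed completion and print tokens as they arrive.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Summarise my last order."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```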
To mitigate errors caused by hallucinations, introduce explicit human validation in critical workflows; this also increases users' confidence in your product. If verifying every response by hand does not scale, at least give users the option to request a human review. Depending on the domain you work in, human validation may even be required by law. Be aware of the legal implications of generated responses, and indicate clearly where the responsibility lies.
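A minimal human-in-the-loop sketch, with an entirely hypothetical `Draft` record and an in-memory review queue standing in for your real storage and reviewer UI:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Draft:
    prompt: str
    answer: str
    approved: bool = False

# Hypothetical in-memory queue; in production this would be a database
# table backing a reviewer UI.
review_queue: list[Draft] = []

def generate_for_review(call_llm: Callable[[str], str], prompt: str) -> Draft:
    """Generate an answer but hold it until a human approves it."""
    draft = Draft(prompt=prompt, answer=call_llm(prompt))
    review_queue.append(draft)
    return draft

def approve(draft: Draft) -> str:
    """Called from the reviewer UI; only approved answers reach the user."""
    draft.approved = True
    review_queue.remove(draft)
    return draft.answer
```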
Finally, use LLMs via cloud services: building and training your own is a very costly endeavour. As a product team you want to introduce new features as quickly as possible and validate them with your users, and calling APIs such as Claude or OpenAI, whether directly or through Azure or Amazon, will be quicker than training and hosting your own model, which adds a lot of engineering complexity with no clear benefit to your users.
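For a sense of scale, a hosted-API integration can be this small. The sketch below assumes the anthropic Python SDK with `ANTHROPIC_API_KEY` set in the environment; the model name is illustrative:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model name
    max_tokens=512,
    messages=[{"role": "user",
               "content": "Draft a friendly shipping-delay email."}],
)
print(message.content[0].text)
```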
LLMs and GenAI are rapidly becoming useful technologies for augmenting and enhancing product journeys. If you integrate them with your user experience front and centre, you can build delightful and successful products.
Do you have more questions about implementing LLMs in your product, or want to learn more about their advantages and disadvantages? Drop us a note at contact@datapebbles.com