LLM Engineering: Building Production-Ready LLM-enabled Systems
Petra Heck
Senior Researcher AI Engineering (Fontys Kenniscentrum Applied AI for Society)
With the advent of foundation models and generative AI, especially the recent explosion in Large Language Models (LLMs), we see our students and the companies around us build a whole new type of AI-enabled system: LLM-based systems, or LLM systems for short. The most well-known example of an LLM system is the chatbot ChatGPT. Inspired by ChatGPT and its possibilities, many developers want to build their own chatbots, grounded in their own set of documents, e.g. as an intelligent search engine. For this specific text generation task they have to:
1) select the most appropriate LLM, sometimes fine-tune it,
2) engineer the document retrieval step (Retrieval Augmented Generation, RAG),
3) engineer the prompt,
4) engineer a user interface that hides the complexity of prompts and answers from end users. Prompt engineering in particular is a new activity introduced by LLM systems.
It is intrinsically hard: the possibilities are endless, prompts are hard to test or compare, results may vary across LLM models or model versions, prompts are difficult to debug, and you need domain expertise (and language skills!) to engineer fitting prompts for the task at hand. For LLMs, however, prompt engineering is the main way the models can be adapted to support specific tasks. So, for LLM systems we must conclude that they are data + model + prompt + code. It must also be noted that with LLM systems the model is usually provided by an external party and is thus hard or impossible for the developer to control, other than by engineering prompts. The external party may, however, frequently update its LLM, and this may necessitate an update of the LLM system as well. In this talk, we analyze the quality characteristics of LLM systems and discuss the challenges of engineering LLM systems, illustrated by real examples. We also present the solutions we have found until now to address the quality characteristics and the challenges.
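Steps 2 and 3 above (document retrieval and prompt engineering) could be sketched as follows. This is a minimal, library-free illustration: the word-overlap scoring, the example documents, and all function names are invented for this sketch, and a real RAG system would use an embedding model for retrieval and send the resulting prompt to the selected LLM via its API.

```python
# Sketch of the RAG steps: retrieve relevant documents, then build a
# prompt around them. Toy word-overlap similarity stands in for a real
# embedding model; all names here are illustrative, not a real API.
import math
import re
from collections import Counter

def score(query: str, doc: str) -> float:
    """Cosine similarity on word counts -- a stand-in for embeddings."""
    q = Counter(re.findall(r"\w+", query.lower()))
    d = Counter(re.findall(r"\w+", doc.lower()))
    overlap = sum(q[w] * d[w] for w in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return overlap / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Step 2: pick the k documents most similar to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Step 3: wrap the retrieved context and the user question in a prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return (f"Answer using only the context below.\n"
            f"Context:\n{joined}\n"
            f"Question: {query}\nAnswer:")

docs = ["Leave requests are filed in the HR portal.",
        "The cafeteria opens at 8:00.",
        "Leave is approved by your team lead."]
query = "How do I request leave?"
prompt = build_prompt(query, retrieve(query, docs))
# `prompt` would then be sent to the LLM chosen in step 1 via its API.
```

Even this toy version shows why prompts are hard to test: the final prompt depends on both the retrieval step and the template, so a change in either can silently change the model's input.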
Leon Schrijvers
Docent-Onderzoeker (Fontys University of Applied Sciences)