Published: January 13, 2024
This is the first in a three-part series on LLMs and chatbots. Part 2 on building a chatbot with WebLLM and part 3 about using the Prompt API are already live.
Large language models (LLMs) are becoming an important building block in software development: LLMs are well suited for generating and processing natural language text, unlocking use cases such as data extraction, summarization, or facilitating dialogues with user data.
In this series, I discuss the benefits and drawbacks of on-device LLMs and guide you through adding chatbot capabilities to an existing application using two local and offline-capable approaches, the web-based LLM runtime WebLLM and Chrome's experimental Prompt API.
Potential use cases
 
We'll build a chatbot on top of a classic to-do list application, in which users can add new to-dos, mark them as done, and delete them. The source code for each step is available on GitHub.
You may want to add a feature for users to learn more about the to-do list data or perform additional functions. A chatbot feature can allow users to:
- Inquire about the number of open tasks.
- Identify duplicates or very similar to-dos.
- Categorize the to-dos into groups.
- Receive recommendations for new tasks based on completed ones.
- Translate tasks into different languages.
- Export the to-do list in XML format.
These are just a few examples of tasks that LLMs can handle.
What are large language models?
LLMs are artificial neural networks that process and generate natural language text. Most current LLMs are based on the Transformer architecture, developed at Google. Examples include Google's Gemini and Gemma models, OpenAI's GPT model series, and open-source models such as Llama by Meta AI and Mistral by Mistral AI.
Thanks to their training on vast amounts of data, LLMs possess an impressive range of capabilities. They can handle numerous languages, recall trivia knowledge, translate between languages, and generate programming code. The extent of these capabilities varies significantly with the size of the model, as discussed in Understand LLM Sizes.
LLMs lead to a paradigm shift in software architecture, as natural language now becomes a core feature in software engineering. Instead of calling APIs through well-defined interfaces, it is sufficient to express the intent in natural language, in a so-called prompt.
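To make the idea of a prompt more concrete, here is a minimal sketch of how an application might turn to-do list data and a user question into a single prompt string. The `Todo` shape, the `buildPrompt` function, and the wording of the instructions are assumptions made for illustration; they are not taken from the sample application.

```typescript
// Hypothetical shape of a to-do item; the sample app's data model may differ.
interface Todo {
  text: string;
  done: boolean;
}

// Builds a plain-text prompt that embeds the to-do list and the user's question.
// The instruction wording is illustrative, not a recommended template.
function buildPrompt(todos: Todo[], question: string): string {
  const list = todos
    .map((todo) => `- ${todo.text} (${todo.done ? "done" : "not done"})`)
    .join("\n");
  return [
    "You are a helpful assistant for a to-do list application.",
    "The user has the following to-dos:",
    list,
    `Answer this question about the list: ${question}`,
  ].join("\n");
}

const prompt = buildPrompt(
  [
    { text: "Buy milk", done: false },
    { text: "Walk the dog", done: true },
  ],
  "How many open tasks do I have?",
);
console.log(prompt);
```

The resulting string would then be passed to whichever LLM runtime the application uses. Note that the intent ("count the open tasks") is never expressed as a function call, only as text.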
Limitations of LLMs
LLMs also come with certain limitations:
- Non-deterministic behavior: LLMs can produce varying and occasionally even contradictory responses to the same prompt, as their outputs depend on probabilistic models rather than fixed rules.
- Hallucinations: These models may sometimes generate incorrect or nonsensical information, relying on learned patterns rather than factual accuracy.
- Prompt injections: LLMs can be susceptible to prompt injection attacks, where users craft input prompts that manipulate the model into deviating from its intended function or producing undesired outcomes.
Therefore, users must verify the results generated by LLMs before taking any consequential actions.
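In some cases, the application itself can help with this verification, namely when it can compute the expected answer deterministically. The following sketch compares the first number in a model reply against the actual count of open to-dos. All names (`Todo`, `countOpenTodos`, `extractFirstInteger`, `isReplyConsistent`) are hypothetical, and the check is deliberately naive: it only works for questions with a single numeric answer and doesn't replace review by the user.

```typescript
// Hypothetical shape of a to-do item; the sample app's data model may differ.
interface Todo {
  text: string;
  done: boolean;
}

// Deterministic ground truth computed by the application itself.
function countOpenTodos(todos: Todo[]): number {
  return todos.filter((todo) => !todo.done).length;
}

// Returns the first integer found in the reply, or null if there is none.
function extractFirstInteger(reply: string): number | null {
  const match = reply.match(/-?\d+/);
  return match ? Number.parseInt(match[0], 10) : null;
}

// True only if the reply contains a number that matches the real count.
function isReplyConsistent(reply: string, todos: Todo[]): boolean {
  return extractFirstInteger(reply) === countOpenTodos(todos);
}

const todos: Todo[] = [
  { text: "Buy milk", done: false },
  { text: "Walk the dog", done: true },
  { text: "File taxes", done: false },
];

console.log(isReplyConsistent("You have 2 open tasks.", todos));
console.log(isReplyConsistent("You have 5 open tasks.", todos));
```

A reply that fails such a check could be flagged in the UI or regenerated, depending on the use case.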
With on-device LLMs, you must also consider the size of the model. Models can reach file sizes of several gigabytes and must be downloaded to the user's device before first use. Smaller models tend to produce lower-quality responses, especially when compared to cloud-backed models.
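As a rough rule of thumb, the download size of a model can be estimated from its parameter count and the number of bits used to store each weight. The sketch below assumes a uniform precision for all weights and ignores tokenizer files, metadata, and runtime overhead, so real files will deviate from the estimate. The function name and the example figures are illustrative and don't refer to a specific model.

```typescript
// Estimates model size in gigabytes (10^9 bytes), assuming every weight
// is stored with the same number of bits. This is a simplification.
function estimateModelSizeGB(parameterCount: number, bitsPerWeight: number): number {
  const bytes = (parameterCount * bitsPerWeight) / 8;
  return bytes / 1e9;
}

// Illustrative figures: 8 billion parameters at 4 bits versus 16 bits per weight.
const quantizedSizeGB = estimateModelSizeGB(8e9, 4);
const halfPrecisionSizeGB = estimateModelSizeGB(8e9, 16);

console.log(`4-bit: ~${quantizedSizeGB} GB, 16-bit: ~${halfPrecisionSizeGB} GB`);
```

Under these assumptions, lowering the precision from 16 to 4 bits per weight cuts the estimated download to a quarter, which is one reason on-device models are often distributed in quantized form.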
Choose local solutions
When integrating an LLM into your web application, your first instinct may be to use a cloud provider. Numerous providers offer high-quality LLMs, some of which are exclusive to specific providers. Cloud-based LLMs provide rapid inference at a reasonable cost, which is usually calculated per processed token.
In contrast, local solutions present compelling advantages. By operating directly on the user's device, locally hosted LLMs provide more reliable response times, remain accessible even when the user is offline, and don't require developers to pay subscription fees or other ongoing costs. Notably, they can substantially enhance user safety: by keeping all activities on-device, you can avoid transmitting personally identifiable information (PII) to external providers or regions.
Demos
You can take a look at the finished demos with chatbot capabilities before you learn how to build them yourself.
- Original to-do application
- To-do application with WebLLM
- To-do application with Prompt API
- Source on GitHub
Next, you'll use WebLLM to add a chatbot to the to-do list application.
