Build a Custom LLM-Powered Chat App Using Chainlit
Building conversational AI applications has become increasingly accessible with the rise of large language models (LLMs). However, creating a user-friendly chat interface that can effectively interact with these models often requires significant frontend development skills. Chainlit changes this by providing a Python-first framework specifically designed for building LLM chat apps without the complexity of traditional web development.
Whether you’re a data scientist looking to prototype AI applications or a developer wanting to create production-ready chat interfaces, Chainlit offers an intuitive solution that lets you focus on the AI logic rather than the user interface complexities.
Let’s explore what makes Chainlit unique and why it’s the ideal framework for creating LLM-powered chat applications.
What is Chainlit?
Chainlit is an open-source Python framework designed specifically for building conversational AI applications and LLM chat apps. Unlike general-purpose web frameworks, Chainlit is purpose-built for creating chat interfaces that work seamlessly with large language models, making it easier to develop, test, and deploy AI-powered conversational applications.
Chainlit bridges the gap between AI model development and user-facing applications. While frameworks like LangChain help you build the AI logic, Chainlit focuses on providing the interface layer that users interact with. It automatically handles common chat application requirements like message history, real-time streaming, file uploads, and user session management.
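To get a feel for how little code this takes, here's a minimal sketch of a complete Chainlit app that simply echoes the user's message back (the `@cl.on_message` hook and `cl.Message` API used here are covered in depth later in this tutorial):

```python
import chainlit as cl

@cl.on_message
async def echo(message: cl.Message):
    # Chainlit renders the full chat UI; we only define what happens per message
    await cl.Message(content=f"You said: {message.content}").send()
```

Saved as `echo.py`, this runs with `chainlit run echo.py` and gives you a working chat interface in the browser.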
Now that we understand its strengths, let’s get hands-on and build an LLM-powered chat app from scratch.
Building an LLM app using Chainlit
In this step-by-step tutorial, you’ll build a fully functional LLM-powered chat application using Chainlit and Ollama, all running locally on your machine.
Prerequisites
Before we start building, ensure you have the following installed:
- Python 3.8 or higher
- Basic familiarity with Python programming
- Sufficient disk space for AI model downloads (several GB)
Step 1: Install Ollama
First, we need to install Ollama to run LLMs locally. Ollama provides an easy way to run large language models on your own machine.
On Windows:
Download the installer from ollama.ai and follow the installation instructions.
On macOS:
```bash
# Using Homebrew
brew install ollama

# Or download from https://ollama.ai
```
On Linux:
```bash
curl -fsSL https://ollama.ai/install.sh | sh
```
Step 2: Download a language model
Once Ollama is installed, download a language model. We’ll use Llama 2, which provides good performance for chat applications:
```bash
# Pull the Llama 2 model (this may take several minutes)
ollama pull llama2

# Verify the model is available
ollama list
```
You should see `llama2:latest` in the list of available models.
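Optionally, you can sanity-check the model from the terminal before writing any code. `ollama run` sends a one-off prompt (or opens an interactive session if you omit the prompt; type `/bye` to exit):

```bash
# Confirm the model loads and responds
ollama run llama2 "Say hello in one short sentence."
```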
Step 3: Set up your development environment
Create a new project directory and set up a virtual environment:
```bash
# Create project directory
mkdir chainlit-chat-app
cd chainlit-chat-app
```
Now create a virtual environment. The command varies depending on your system:
On Windows:
```bash
# Try python first
python -m venv venv
```
Note: If `python` doesn’t work, try using `python3` or `py` in this command.
On macOS/Linux:
```bash
# Try python3 first (most common on macOS)
python3 -m venv venv

# If python3 doesn't work, try python
python -m venv venv
```
Activate your virtual environment:
```bash
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate
```
You should see `(venv)` appear at the beginning of your terminal prompt, indicating the virtual environment is active.
Step 4: Install required dependencies
Install Chainlit and the necessary packages for Ollama integration:
```bash
# Install the latest version of Chainlit
# (the quotes prevent the shell from treating >= as a redirect)
pip install "chainlit>=1.0.0"

# Install requests for API communication
pip install requests

# Create requirements file
pip freeze > requirements.txt
```
With our environment set up, we’ll now create the core functionality of our chat app, starting with basic message handling.
Step 5: Create the basic chat module
Let’s start by creating the foundational chat functionality. Create a file called `basic_chat.py` in your `chainlit-chat-app` directory:
```python
import chainlit as cl
import requests
import json

# Ollama API configuration
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "llama2"

def call_ollama(message: str) -> str:
    """Send a message to Ollama and return the response."""
    payload = {
        "model": MODEL_NAME,
        "prompt": message,
        "stream": False
    }
    try:
        response = requests.post(OLLAMA_URL, json=payload)
        response.raise_for_status()
        result = response.json()
        return result.get("response", "Sorry, I couldn't generate a response.")
    except requests.exceptions.RequestException as e:
        return f"Error connecting to Ollama: {str(e)}"
    except json.JSONDecodeError:
        return "Error: Invalid response from Ollama"

@cl.on_chat_start
async def start():
    """Initialize the chat session."""
    await cl.Message(
        content="Hello! I'm your basic AI assistant powered by Llama 2. How can I help you today?"
    ).send()

@cl.on_message
async def main(message: cl.Message):
    """Handle incoming user messages."""
    # Show that we're processing the message
    msg = cl.Message(content="")
    await msg.send()

    # Get response from Ollama
    response = call_ollama(message.content)

    # Update the message with the response
    msg.content = response
    await msg.update()
```
Code Explanation:

- **API configuration:** Sets up constants for connecting to Ollama running locally on port 11434
- **`call_ollama()` function:** Handles the HTTP request to Ollama with proper error handling for network issues and JSON parsing
- **`@cl.on_chat_start` decorator:** Runs when a new chat session begins, sending a welcome message
- **`@cl.on_message` decorator:** Handles every user message, processes it through Ollama, and displays the response
- **Message handling:** Creates an empty message first, then updates it with the AI response for a smooth user experience
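You can try this basic version on its own before adding the remaining features. The `-w` flag enables auto-reload, so the app restarts whenever you edit the file:

```bash
chainlit run basic_chat.py -w
```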
Step 6: Create the streaming module
Now let’s create a separate module for streaming functionality. Create another file named `streaming.py`:
```python
import requests
import json

# Ollama API configuration
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "llama2"

def call_ollama_stream(prompt: str):
    """Send a message to Ollama and yield streaming responses."""
    payload = {
        "model": MODEL_NAME,
        "prompt": prompt,
        "stream": True,
        "options": {
            "temperature": 0.7,
            "top_p": 0.9
        }
    }
    try:
        response = requests.post(OLLAMA_URL, json=payload, stream=True)
        response.raise_for_status()
        for line in response.iter_lines():
            if line:
                try:
                    data = json.loads(line.decode('utf-8'))
                    if 'response' in data:
                        yield data['response']
                except json.JSONDecodeError:
                    continue
    except requests.exceptions.RequestException as e:
        yield f"Error connecting to Ollama: {str(e)}"

async def stream_response(message_content: str, chainlit_message):
    """Stream a response to a Chainlit message."""
    full_response = ""
    for chunk in call_ollama_stream(message_content):
        full_response += chunk
        chainlit_message.content = full_response
        await chainlit_message.update()
    return full_response
```
Code Explanation:

- **Streaming API call:** Uses the `stream=True` parameter to enable real-time response streaming from Ollama
- **Generator function:** `call_ollama_stream()` uses `yield` to return response chunks as they arrive, creating a more responsive experience
- **Response options:** Includes `temperature` (creativity) and `top_p` (response diversity) parameters for better conversation quality
- **Line-by-line processing:** Processes each line of the streaming response, handling JSON parsing errors gracefully
- **Helper function:** `stream_response()` provides a convenient way to stream responses to Chainlit messages
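Because `call_ollama_stream()` is a plain generator with no Chainlit dependency, you can test it from an ordinary Python script. This quick check (a throwaway script, not part of the app; the filename is just an example) prints chunks as they arrive:

```python
# quick_test.py - throwaway script to verify streaming works
from streaming import call_ollama_stream

for chunk in call_ollama_stream("Explain what a generator is in one sentence."):
    print(chunk, end="", flush=True)
print()
```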
Step 7: Create the history module
Create a module to handle conversation history. Create a file named `history.py`:
```python
import chainlit as cl

def initialize_history():
    """Initialize conversation history in user session."""
    cl.user_session.set("message_history", [])

def add_message_to_history(message_type: str, content: str):
    """Add a message to the conversation history."""
    message_history = cl.user_session.get("message_history", [])
    message_history.append({
        "type": message_type,
        "content": content
    })
    cl.user_session.set("message_history", message_history)

def build_conversation_context():
    """Build conversation context from message history."""
    message_history = cl.user_session.get("message_history", [])
    if not message_history:
        return "You are a helpful AI assistant."

    context = "You are a helpful AI assistant. Here's our conversation so far:\n\n"
    # Keep last 6 messages for context
    for msg in message_history[-6:]:
        role = "Human" if msg['type'] == 'user_message' else "Assistant"
        context += f"{role}: {msg['content']}\n"
    return context

def get_contextual_prompt(user_message: str):
    """Build a prompt with conversation context."""
    context = build_conversation_context()
    return f"{context}\nHuman: {user_message}\nAssistant:"
```
Code Explanation:

- **Session storage:** Uses Chainlit’s `user_session` to store conversation history per user, maintaining privacy and context
- **Message tracking:** Stores both user messages and AI responses with type labels for proper context reconstruction
- **Context building:** Creates a formatted conversation history that helps the AI understand previous interactions
- **Memory management:** Limits context to the last 6 messages to prevent overwhelming the AI model and stay within token limits
- **Prompt formatting:** Combines conversation history with the current message in a format that AI models understand well (see the example prompt below)
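To make the format concrete, here is roughly what `get_contextual_prompt()` produces after one exchange (the conversation content is illustrative; the template comes from `build_conversation_context()` above):

```
You are a helpful AI assistant. Here's our conversation so far:

Human: What is Chainlit?
Assistant: Chainlit is an open-source Python framework for building chat apps.

Human: Does it support streaming?
Assistant:
```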
Our chat app now has memory, but we can enhance the user experience further by adding file upload capabilities.
Step 8: Create the file handler module
Create a module for file upload functionality. Create `file_handler.py`:
Note: File upload functionality requires Chainlit 1.0.0+. If you have an older version, this module will provide fallback functionality.
```python
import chainlit as cl

async def handle_file_upload(files):
    """Handle uploaded files and store them in the session.

    Compatible with Chainlit 1.0.0+
    """
    for file in files:
        if file.type == "text/plain":
            # Read the file content
            try:
                content = file.content.decode('utf-8')
            except AttributeError:
                # Fallback for older Chainlit versions
                with open(file.path, 'r', encoding='utf-8') as f:
                    content = f.read()

            # Store file content in user session
            cl.user_session.set("uploaded_file", {
                "name": file.name,
                "content": content
            })

            await cl.Message(
                content=f"📄 **File uploaded successfully!**\n\nFile: `{file.name}`\nSize: {len(content)} characters\n\nYou can now ask me questions about this file's content!"
            ).send()
        else:
            await cl.Message(
                content=f"❌ Sorry, I can only process text files (.txt). The file `{file.name}` is not supported."
            ).send()

def get_file_context():
    """Get uploaded file context for the prompt."""
    uploaded_file = cl.user_session.get("uploaded_file")
    if uploaded_file:
        return f"\n\nAdditionally, the user has uploaded a file named '{uploaded_file['name']}' with the following content:\n\n{uploaded_file['content'][:3000]}..."
    return ""

def has_uploaded_file():
    """Check if user has uploaded a file."""
    return cl.user_session.get("uploaded_file") is not None

# Version compatibility check
def supports_file_upload():
    """Check if the current Chainlit version supports file uploads."""
    try:
        import chainlit
        version = chainlit.__version__
        major, minor = map(int, version.split('.')[:2])
        return major >= 1 and minor >= 0
    except Exception:
        return False
```
Code Explanation:

- **File validation:** Checks the file type to ensure only text files are processed, preventing errors with binary files
- **Version compatibility:** Handles different Chainlit versions gracefully with fallback methods for file reading
- **Content storage:** Stores file content in the user session, making it available throughout the conversation
- **Context integration:** Provides functions to check for uploaded files and include their content in AI prompts
- **Content truncation:** Limits file content to 3,000 characters to prevent exceeding AI model context limits
- **User feedback:** Provides clear success and error messages to guide users on supported file types
We’ve built all the individual components. Now it’s time to bring everything together into a cohesive application.
Step 9: Create the main application
Now let’s combine all modules into our main application. Create a new file called `app.py` in your `chainlit-chat-app` directory:

Make sure you have created all the previous module files (`streaming.py`, `history.py`, `file_handler.py`) before creating this main application file.
```python
import chainlit as cl
from streaming import call_ollama_stream
from history import initialize_history, add_message_to_history, get_contextual_prompt
from file_handler import handle_file_upload, get_file_context, has_uploaded_file, supports_file_upload

@cl.on_chat_start
async def start():
    """Initialize the chat session."""
    # Initialize conversation history
    initialize_history()

    # Check if file upload is supported
    if supports_file_upload():
        file_upload_info = "• **File upload support** for text files\n"
    else:
        file_upload_info = "• File upload not available (requires Chainlit 1.0.0+)\n"

    welcome_msg = f"""🤖 **Welcome to your Complete AI Chat Assistant!**

I'm powered by Llama 2 running locally through Ollama. Here's what I can do:

**Core Features:**
• **Real-time streaming** responses
• **Conversation memory** and context
{file_upload_info}• **Complete privacy** - everything runs locally

**Capabilities:**
• Answer questions on any topic
• Help with creative writing and brainstorming
• Analyze and explain complex concepts
• Process uploaded text files (if supported)
• Provide coding help and technical guidance

Feel free to ask me anything!"""

    await cl.Message(content=welcome_msg).send()

# Only add file upload handler if supported
if supports_file_upload():
    @cl.on_file_upload
    async def on_file_upload(files):
        """Handle file uploads using the file_handler module.

        Only available in Chainlit 1.0.0+
        """
        await handle_file_upload(files)

@cl.on_message
async def main(message: cl.Message):
    """Handle incoming user messages with all features combined."""
    # Add user message to history
    add_message_to_history("user_message", message.content)

    # Build prompt with conversation context
    prompt = get_contextual_prompt(message.content)

    # Add file context if available
    if has_uploaded_file():
        file_context = get_file_context()
        uploaded_file = cl.user_session.get("uploaded_file")
        prompt = prompt.replace("Human:", f"Human (with uploaded file '{uploaded_file['name']}'):")
        prompt = prompt.replace("\nAssistant:", f"{file_context}\n\nAssistant:")

    # Create message for streaming response
    msg = cl.Message(content="")
    await msg.send()

    # Stream the response
    full_response = ""
    for chunk in call_ollama_stream(prompt):
        full_response += chunk
        msg.content = full_response
        await msg.update()

    # Add assistant response to history
    add_message_to_history("assistant_message", full_response)
```
Code Explanation:

- **Module imports:** Imports functionality from our custom modules, demonstrating modular architecture
- **Dynamic feature detection:** Checks Chainlit version compatibility and adjusts features accordingly
- **Conditional decorators:** Uses `if supports_file_upload()` to register the file upload handler only when it’s available
- **Orchestration logic:** Combines conversation history, file context, and streaming responses seamlessly
- **Context management:** Merges file content with conversation history for comprehensive AI context
- **Error resilience:** Gracefully handles missing features and version incompatibilities
Step 10: Run your application
Now let’s test our chat application.
- Start Ollama (if not already running):

```bash
ollama serve
```

- Run the application:

```bash
chainlit run app.py -w
```

Chainlit starts a local web server and opens the chat interface in your browser (by default at http://localhost:8000). The `-w` flag watches your source files and reloads the app automatically when they change.
Congratulations! You’ve built a fully functional chat app. But how does Chainlit compare to other options you might consider?
Chainlit vs other frameworks
Here’s a quick comparison to help you understand when to choose Chainlit for your projects.
| Framework | Best for | Chat features | Learning curve |
|---|---|---|---|
| Chainlit | Conversational AI apps | Native streaming, history, file uploads | Low (for chat apps) |
| Streamlit | Data dashboards | Basic (custom work required) | Low (for data apps) |
| Gradio | ML model demos | Simple input/output | Very low |
| Flask/FastAPI | Custom web apps | Manual implementation | High |
When to choose Chainlit
Choose Chainlit when you need:

- Real-time streaming chat responses
- Built-in conversation history
- File uploads in chat context
- Quick development of conversational AI apps

Choose alternatives when:

- **Streamlit:** Building data dashboards or analytics tools
- **Gradio:** Creating quick ML model demonstrations
- **Flask/FastAPI:** You need complete control over application architecture
Now let’s dive deeper into Chainlit’s key components to help you build more sophisticated applications.
Key components of Chainlit
Each component serves a specific purpose in creating conversational AI experiences.
Messages and responses
The foundation of any chat application is message handling. Chainlit provides several message types to create rich conversational experiences:
Text messages: The most basic form of communication, supporting markdown formatting for rich text display.
Streaming messages: Allow real-time display of AI responses as they’re generated, creating a more natural conversational flow.
System messages: Special messages that provide context or instructions without appearing as regular conversation.
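As a concrete illustration of streaming messages, `cl.Message` exposes a `stream_token()` method that appends tokens to a message as they arrive. A minimal sketch (the hard-coded token list here is a stand-in for your model's streaming output):

```python
import chainlit as cl

@cl.on_message
async def on_message(message: cl.Message):
    msg = cl.Message(content="")

    # In a real app these tokens would come from your LLM's streaming API
    for token in ["Hello", ", ", "world", "!"]:
        await msg.stream_token(token)

    # Finalize the message once all tokens have been streamed
    await msg.send()
```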
User input handling
Chainlit offers various ways to capture user input beyond simple text messages:
Text input: Standard chat input with support for multiline messages and markdown.
File uploads: Built-in support for handling document uploads, images, and other file types that can be processed by your AI model.
Actions and buttons: Interactive elements that allow users to trigger specific functions or workflows.
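For example, actions can be attached to a message and handled with the `@cl.action_callback` decorator. A minimal sketch (note that `cl.Action`'s exact fields, such as `value` vs. `payload`, vary between Chainlit versions, so check the docs for your installed release):

```python
import chainlit as cl

@cl.on_chat_start
async def start():
    # Attach a clickable button to the welcome message
    actions = [cl.Action(name="summarize", value="summarize", label="Summarize chat")]
    await cl.Message(content="Need anything?", actions=actions).send()

@cl.action_callback("summarize")
async def on_summarize(action: cl.Action):
    # Runs when the user clicks the "Summarize chat" button
    await cl.Message(content=f"You triggered: {action.name}").send()
```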
Session management
Effective session management is crucial for maintaining conversation context:
User sessions: Automatic handling of individual user sessions with persistent storage options.
Conversation history: Built-in message history that maintains context across the conversation.
State management: Tools for maintaining application state between messages and user interactions.
Decorators and hooks
Chainlit uses Python decorators to define application behavior:
@cl.on_message: Handles incoming user messages and defines response logic.
@cl.on_chat_start: Executes when a new chat session begins, useful for initialization.
@cl.on_file_upload: Processes uploaded files and integrates them into the conversation.
Integration components
Chainlit provides seamless integration with popular AI frameworks:
LangChain Integration: Native support for LangChain chains, agents, and tools.
Custom LLM Integration: Flexible architecture for integrating any LLM or AI service.
Tool integration: Support for connecting external tools and APIs to your chat application.
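As an illustrative sketch of the LangChain route mentioned above (this assumes the `langchain-community` package is installed; exact import paths depend on your LangChain version), you can keep a LangChain LLM in the user session and call it from your message handler. Chainlit's `cl.make_async` wraps the blocking call so it doesn't stall the event loop:

```python
import chainlit as cl
from langchain_community.llms import Ollama  # assumes langchain-community is installed

@cl.on_chat_start
async def start():
    # One LLM instance per user session
    cl.user_session.set("llm", Ollama(model="llama2"))

@cl.on_message
async def on_message(message: cl.Message):
    llm = cl.user_session.get("llm")
    # Run the blocking LangChain call without freezing the event loop
    response = await cl.make_async(llm.invoke)(message.content)
    await cl.Message(content=response).send()
```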
Conclusion
Chainlit provides an excellent foundation for building LLM-powered chat applications with minimal complexity. By combining Chainlit’s intuitive framework with Ollama’s local model execution, you can create powerful conversational AI applications that prioritize privacy and control.
Ready to build more advanced applications? Explore our Build Chatbots with Python course to deepen your programming skills and create even more powerful AI solutions.
Frequently asked questions
1. What is Chainlit used for?
Chainlit is used for building conversational AI applications and LLM chat interfaces. Common use cases include creating AI-powered chatbots, building document analysis tools, prototyping conversational AI features, and developing customer support applications.
2. Is Chainlit better than Streamlit?
Chainlit and Streamlit serve different purposes. Chainlit is ideal for building chat applications, real-time message streaming, and conversational AI interfaces. Streamlit excels at creating data dashboards, building ML model demos, and developing analytical applications. Choose Chainlit for chat interfaces and Streamlit for data visualization and general web apps.
3. What is function calling in Chainlit?
Function calling allows your chat app to integrate external tools and APIs. You can define custom functions for tasks like web searches, database queries, or API calls. Using Chainlit’s tool integration with LangChain agents, your AI can perform actions like checking the weather, sending emails, or querying databases based on user requests.
4. What are the limitations of Chainlit?
Chainlit has limited UI customization compared to building custom React applications. It’s better suited for chat-based interactions than complex multi-step workflows, and designed more for prototyping and medium-scale applications than large enterprise deployments. There’s also a learning curve that requires understanding both Chainlit and LLM integration concepts. Many of these limitations can be addressed through custom components and integrations.
5. How do I turn off telemetry in Chainlit?
To disable telemetry in Chainlit, set `CHAINLIT_TELEMETRY=false` in your environment or `.env` file, add `telemetry = false` to your Chainlit configuration file, or use the `--no-telemetry` flag when running your application. You can verify telemetry is disabled by checking your application logs.
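For reference, in recent Chainlit releases the setting generated in `.chainlit/config.toml` (created the first time you run `chainlit run`) is named `enable_telemetry`; double-check the generated file for your installed version:

```toml
# .chainlit/config.toml
[project]
# Set to false to opt out of telemetry
enable_telemetry = false
```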