Jan-v1 Tutorial: Build a Smart Q&A Assistant

Jan-v1 is a 4B-parameter AI model that achieves 91.1% accuracy on the SimpleQA question-answering benchmark while running entirely on your own device. That score edges out even Perplexity Pro (90.6%), making Jan-v1 one of the most capable local AI models available. Built on Alibaba’s Qwen3-4B-Thinking foundation, it keeps your data private because inference happens on your computer rather than in the cloud.

In this comprehensive tutorial, we’ll show you how to set up Jan-v1 locally and build your own intelligent Q&A assistant.

Let’s start by understanding what makes Jan-v1 unique in the AI landscape.

What is Jan-v1?

Jan-v1 is a 4B-parameter language model tuned for question answering, reasoning, and tool use. Served locally through the Jan desktop app or any llama.cpp-compatible runtime, it combines the reasoning ability of its Qwen3-4B-Thinking base with optional web search to understand user inputs and provide accurate, context-aware responses.

At its core, Jan-v1 is lightweight yet powerful, offering developers the flexibility to integrate it with existing systems or use it as a standalone assistant. With customizable settings, it’s ideal for both small projects and enterprise-level solutions.

With this foundation, let’s move on to how we can set up Jan-v1 locally to start building our own assistant.

How to set up Jan-v1: Two installation methods

Jan-v1 offers two installation approaches depending on your technical preferences and use case needs.

Method 1: The Jan desktop app (recommended)

To install Jan-v1 via the Jan desktop app, follow these steps:

Step 1: Download and Install

Visit jan.ai, download the application for your operating system, and install it. Once the app is running, download the Jan-v1 model from the in-app model hub.

Step 2: Enable web search capabilities

To unlock Jan-v1’s full potential for Q&A tasks:

  1. Open the Jan application
  2. Go to Settings → Experimental Features and toggle it on
  3. Navigate to Settings → MCP Servers
  4. Enable a search-related MCP server (such as Serper)

Method 2: Manual Python setup (For advanced users)

For developers who prefer manual control or want to integrate Jan-v1 into custom applications, follow this approach:

Step 1: Create a virtual environment

Open the command line / terminal on your machine and execute this command to create a virtual environment named myenv:

python -m venv myenv

Then, activate the newly created virtual environment:

source myenv/bin/activate # For Linux/macOS
myenv\Scripts\activate # For Windows

Step 2: Install the required dependencies

Run these commands to install the necessary dependencies on the system:

pip install llama-cpp-python
pip install streamlit
pip install requests

Step 3: Download the Jan-v1 model

Firstly, create and navigate to a folder named models:

mkdir models
cd models

Then, download the Jan-v1 model named Jan-v1-4B-Q4_K_M.gguf:

curl -L https://huggingface.co/janhq/Jan-v1-4B-GGUF/resolve/main/Jan-v1-4B-Q4_K_M.gguf -o Jan-v1-4B-Q4_K_M.gguf

Once we complete these steps, we’re done setting up Jan-v1 locally.

In the next section, we’ll explore how to build a smart Q&A assistant using Jan-v1 locally.

Build a smart Q&A assistant using Jan-v1

Creating a smart Q&A assistant locally with Jan-v1 is about combining data, context, and the right configurations. Here’s how we can build one.

Step 1: Import required libraries

First, create a Python file named app.py to hold the main logic of the application:

touch app.py

Then, open the file and import the necessary libraries:

import requests
import streamlit as st
from llama_cpp import Llama

Step 2: Load the Jan-v1 model

Next, we define a function for loading the Jan-v1 model using the llama_cpp library. In this function, the model_path parameter specifies the location of the model to load, which is models/Jan-v1-4B-Q4_K_M.gguf in this case.

Additionally, the model is cached using @st.cache_resource to avoid reloading on every interaction, improving performance:

@st.cache_resource
def load_model():
    return Llama(model_path="models/Jan-v1-4B-Q4_K_M.gguf")

model = load_model()
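Conceptually, @st.cache_resource works like a process-wide memoizer: the first call runs the loader, and every later call reuses the stored object instead of reloading the multi-gigabyte model. A minimal stdlib-only sketch of the same idea, using functools.cache as a stand-in and a dummy loader in place of the real GGUF load:

```python
from functools import cache

load_count = 0  # tracks how many times the expensive loader actually runs

@cache
def load_model_once():
    """Stand-in for load_model(): pretend this is the expensive model load."""
    global load_count
    load_count += 1
    return {"name": "Jan-v1-4B-Q4_K_M"}  # dummy object in place of a Llama instance

m1 = load_model_once()
m2 = load_model_once()  # served from the cache; the loader does not run again
print(load_count)       # the loader ran exactly once
print(m1 is m2)         # both names refer to the same cached object
```

Without the cache decorator, Streamlit would re-run load_model() on every widget interaction, since the script is re-executed from the top each time.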

Step 3: Define the Serper search function

In this step, we will implement a function to perform a Google search using the Serper API. The search results will supplement the assistant’s knowledge by providing real-time information from the web:

def serper_search(query):
    url = "https://google.serper.dev/search"
    headers = {
        "X-API-KEY": "<Serper_API_Key>",
        "Content-Type": "application/json"
    }
    data = {
        "q": query,
        "num": 3,
        "gl": "us",
        "hl": "en",
    }
    response = requests.post(url, headers=headers, json=data)
    if response.status_code == 200:
        results = response.json().get("organic", [])
        return "\n".join([f"- {item['title']}: {item['link']}" for item in results])
    else:
        return "No results found."

Let’s break down what this function does, step by step.

1. API endpoint

The API endpoint https://google.serper.dev/search allows us to query Google’s search engine programmatically via the Serper service.

2. Authentication

The X-API-KEY header authenticates the request using a Serper API key. To get one, log in to the official Serper website and copy the API key from your dashboard. Then, replace the <Serper_API_Key> placeholder with it.

3. Search configuration

Several parameters are defined to specify what kind of search we want:

  • q: The search query entered by the user.
  • num: The number of results to fetch (set to 3 for brevity).
  • gl: The geographical location (set to "us" to prioritize results from the United States).
  • hl: The language for results (set to "en" for English).

4. Parsing the response

The function extracts the organic search results and formats them into a simple bulleted list that will be included in the prompt for the language model.

5. Error handling

If the request fails or returns a status code other than 200 (successful request), the function provides a fallback message to handle the error smoothly.
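To see the parsing step in isolation, here is the same extraction and formatting logic run against a hypothetical sample of Serper’s JSON response (the "organic", "title", and "link" fields follow Serper’s documented response shape; the payload contents themselves are made up for illustration):

```python
# Hypothetical sample of the JSON body Serper returns for a query.
sample_response = {
    "organic": [
        {"title": "Jan", "link": "https://jan.ai", "snippet": "Run AI locally."},
        {"title": "Jan-v1-4B-GGUF", "link": "https://huggingface.co/janhq/Jan-v1-4B-GGUF"},
    ]
}

# Same extraction and formatting as in serper_search().
results = sample_response.get("organic", [])
formatted = "\n".join(f"- {item['title']}: {item['link']}" for item in results)
print(formatted)
# - Jan: https://jan.ai
# - Jan-v1-4B-GGUF: https://huggingface.co/janhq/Jan-v1-4B-GGUF
```

Using .get("organic", []) rather than indexing means an empty or unexpected response yields an empty string instead of a KeyError.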

Step 4: Build the Streamlit UI

This step sets up the Streamlit user interface for our smart Q&A assistant app. It takes the user’s query, calls the search function, generates a prompt for the language model, and displays the answer. The assistant responds using search results without showing reasoning or internal processing:

st.title("Smart Q&A Assistant")

user_query = st.text_input("Enter your question:")

if st.button("Get Answer"):
    if user_query.strip() == "":
        st.warning("Please enter a question.")
    else:
        with st.spinner("Searching and generating response..."):
            search_results = serper_search(user_query)
            prompt = f"""
You are a knowledgeable assistant. Using the following search results, answer the user's question in 3-5 complete sentences. Do NOT show reasoning, internal thoughts, or step-by-step explanations. Start your answer immediately after 'Answer:'.

Search results:
{search_results}

Question:
{user_query}

Answer:
"""
            response = model(
                prompt=prompt,
                max_tokens=1024,
                temperature=0.5
            )
            answer = response['choices'][0]['text'].strip()
            if not answer.endswith(('.', '!', '?')):
                answer += "."
            st.write(answer)
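The prompt assembly above can also be pulled out into a small pure function, which makes it easy to unit-test without running Streamlit or loading the model (build_prompt is a name introduced here for illustration, not part of the app as written):

```python
def build_prompt(search_results: str, user_query: str) -> str:
    """Assemble the instruction, search context, and question into one prompt."""
    return f"""
You are a knowledgeable assistant. Using the following search results, answer the user's question in 3-5 complete sentences. Do NOT show reasoning, internal thoughts, or step-by-step explanations. Start your answer immediately after 'Answer:'.

Search results:
{search_results}

Question:
{user_query}

Answer:
"""

prompt = build_prompt("- Jan: https://jan.ai", "What is Jan-v1?")
print(prompt.strip().endswith("Answer:"))  # the model completes from here
```

Ending the prompt with "Answer:" nudges the model to start generating the answer directly, which is why the instruction tells it to begin immediately after that token.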

Step 5: Run the application

Finally, it’s time to run the application:

streamlit run app.py

Here is the output:

A GIF demonstrating our smart Jan-v1 Q&A assistant in action

As we can see, we have successfully created a functional and scalable smart Q&A assistant using Jan-v1.

Let’s now look at some of the key features that make Jan-v1 stand out.

Key features of Jan-v1

Jan-v1 offers a suite of features designed to enhance the development and usability of smart assistants:

  • Contextual awareness: The assistant remembers previous interactions, allowing for natural follow-up questions.
  • Customizable knowledge sources: Import documents, APIs, or structured data to tailor responses.
  • Lightweight & fast: Optimized for low-latency responses without heavy server requirements.
  • Extensible architecture: Supports plugins and integrations for analytics, language models, and external services.

These features make Jan-v1 highly suitable for developers aiming to create personalized, scalable, and efficient assistants.

Now that we’ve covered its strengths, it’s essential to understand the limitations of Jan-v1 before deploying it widely.

Limitations of Jan-v1

While Jan-v1 offers powerful tools, it is important to be aware of its limitations:

  • Dependency on data quality: Poorly structured or insufficient datasets can reduce accuracy.
  • Contextual drift: Extended conversations may confuse the assistant if not properly managed.
  • Web search dependency: The model itself runs offline, but real-time answers depend on online APIs and external services such as Serper.
  • Training overhead: Customizing the assistant for specific industries may require additional time and expertise.
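One common way to manage contextual drift is to keep only a sliding window of recent turns when assembling the prompt, so stale context cannot crowd out the current question. A minimal sketch (the limit of 6 turns is an arbitrary illustration, not a Jan-v1 setting):

```python
def trim_history(history: list[str], max_turns: int = 6) -> list[str]:
    """Keep only the most recent turns so old context can't derail the model."""
    return history[-max_turns:]

# Simulate a long-running conversation of 10 turns.
history = [f"turn {i}" for i in range(10)]
recent = trim_history(history)
print(recent[0], recent[-1])  # turn 4 turn 9
```

In practice you might trim by token count rather than turn count, since a 4B model like Jan-v1 has a bounded context window.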

Recognizing these limitations helps us plan better implementations and set realistic expectations while using Jan-v1.

Conclusion

In this tutorial, we took a detailed look at Jan-v1: what it is, what it offers, and how to set it up. We then built a smart Q&A assistant with it, and finally explored its key features and limitations to get a fair picture of how to use Jan-v1 effectively.

Jan-v1 is an excellent tool for developers and organizations looking to create responsive and intelligent assistants without the need for complex AI expertise. Its modular design, ease of customization, and scalability make it a top choice for projects requiring conversational AI. By combining its strengths with careful planning, developers can build efficient and context-aware assistants that deliver real value.

If you want to learn more about generative AI, check out the Intro to Generative AI course on Codecademy.

Frequently asked questions

1. Is Jan-v1 free to use?

Yes, Jan-v1 is free to download and run locally. Only optional third-party services, such as the Serper search API, come with their own pricing and usage limits.

2. Is Jan-v1 open source?

Yes, Jan-v1 is open source, allowing developers to inspect, modify, and extend the codebase according to their needs.

3. Does Jan-v1 work offline?

Yes. The model itself runs fully offline for chat and Q&A. Only optional features, such as web search through external APIs, require internet connectivity.

4. What makes Jan AI different from ChatGPT or other AI models?

Unlike cloud-based AI services, Jan-v1 runs entirely on your own machine while scoring 91.1% on SimpleQA, so inference needs no internet connection and your data never leaves your device.

5. How does Jan-v1 work?

Jan-v1 processes user inputs using natural language processing models and context-aware algorithms. It retrieves information from configured data sources and generates intelligent, real-time responses.

Codecademy Team

The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.
