
Nebius

Nebius AI Studio provides API access to a wide range of state-of-the-art large language models and embedding models for various use cases.

This notebook shows how to use LangChain with Nebius AI Studio models for chat, embeddings, retrieval, and agent tools.

Installation

%pip install --upgrade langchain-nebius

Environment

To use Nebius AI Studio, you'll need an API key, which you can obtain from Nebius AI Studio. The API key can be passed as the initialization parameter api_key or set as the environment variable NEBIUS_API_KEY.

import getpass
import os

# Make sure you've set your API key as an environment variable
if "NEBIUS_API_KEY" not in os.environ:
os.environ["NEBIUS_API_KEY"] = getpass.getpass("Enter your Nebius API key: ")
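
You can also pass the key directly when constructing a client instead of relying on the environment variable; a minimal sketch (the placeholder value is yours to replace):

from langchain_nebius import ChatNebius

# Pass the API key explicitly instead of reading it from the environment
chat = ChatNebius(api_key="YOUR_API_KEY", model="Qwen/Qwen3-14B")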

Available Models

The full list of supported models can be found in the Nebius AI Studio Documentation.

Chat Models

from langchain_nebius import ChatNebius

# Initialize the chat model
chat = ChatNebius(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="Qwen/Qwen3-14B",  # Choose from available models
    temperature=0.6,
    top_p=0.95,
)

# Generate a response
response = chat.invoke("Explain quantum computing in simple terms")
print(response.content)
<think>
Okay, the user wants me to explain quantum computing in simple terms. Let me start by recalling what I know about quantum computing. It's a type of computing that uses quantum bits or qubits instead of classical bits. But how do I make that simple?

First, I should compare it to classical computers. Regular computers use bits that are either 0 or 1. Qubits can be 0, 1, or both at the same time thanks to superposition. That's a key point. Maybe use an analogy, like a spinning coin that's both heads and tails until it lands.

Then there's entanglement. When qubits are entangled, the state of one instantly affects the other, no matter the distance. That's a bit tricky, but maybe use the example of two coins that are linked, so flipping one affects the other instantly.

Quantum gates manipulate qubits, similar to logic gates in classical computers but with more possibilities. But I need to keep it simple, not too technical.

Applications are important too. Quantum computers could solve complex problems faster, like in cryptography, drug discovery, or optimization. But I should mention that they're not replacing classical computers, just tackling specific problems.

Wait, the user might not know about superposition or entanglement. I need to explain those concepts without jargon. Maybe use everyday examples. Also, address the limitations, like current quantum computers being error-prone and not yet widely available.

Make sure the explanation flows logically: start with classical bits, introduce qubits with superposition, then entanglement, mention how quantum computers work, their potential uses, and current state. Avoid math and technical terms. Check if the user is a beginner, so keep it very basic. Maybe end with a summary to reinforce the main points.
</think>

Sure! Let's break it down simply:

### **Classical Computers vs. Quantum Computers**
- **Classical computers** (like your phone or laptop) use **bits** to process information. A bit is like a switch that can be either **0 (off)** or **1 (on)**. All your data and calculations are built from these 0s and 1s.

- **Quantum computers** use **qubits** (quantum bits) instead. Qubits are like "magic switches" that can be **0, 1, or both at the same time**. This is called **superposition**. Think of it like a spinning coin: while it's spinning, it's not just "heads" or "tails"—it's a mix of both until it lands.

---

### **Why is this useful?**
- **Superposition** lets quantum computers process **many possibilities at once**. For example, if you need to find a needle in a haystack, a classical computer checks one spot at a time, but a quantum computer could check **all spots simultaneously**.

- **Entanglement** is another "quantum magic" trick. If two qubits are entangled, their states are linked. If you change one, the other instantly changes too, no matter how far apart they are. This lets qubits work together in ways classical bits can't.

---

### **What can quantum computers do?**
They're not faster for everyday tasks (like browsing the web). But they're great for:
- **Solving complex problems** (e.g., simulating molecules for drug discovery).
- **Cracking codes** (though this also means they could break current encryption methods).
- **Optimization** (like finding the best route for delivery trucks or managing traffic).

---

### **Why aren't they everywhere yet?**
- Qubits are **very fragile**. They need extreme cold (near absolute zero) and isolation to avoid interference from the environment.
- Current quantum computers are **error-prone** and have only a few dozen qubits. They're still in the early stages of development.

---

### **In short:**
Quantum computers use the weird rules of quantum physics (superposition and entanglement) to solve certain problems much faster than regular computers. They're not here to replace your laptop, but they could revolutionize fields like science, medicine, and cryptography in the future! 🌌✨
# Streaming responses
for chunk in chat.stream("Write a short poem about artificial intelligence"):
    print(chunk.content, end="", flush=True)
<think>
Okay, the user wants a short poem about artificial intelligence. Let me start by thinking about the key aspects of AI. There's the contrast between human and machine, the idea of learning and processing data, maybe some themes about the future or ethical considerations.

I should use imagery related to circuits, code, maybe something like neurons or networks. Rhyming scheme? Maybe a simple ABAB pattern to keep it flowing. Let me think of some lines. Start with something like "In circuits deep where silence speaks," to evoke the hidden, complex world of AI. Then mention learning from data, "A mind of code, yet learns to dream," to show the paradox of AI having a mind made of code but capable of learning.

Next stanza could touch on the duality of AI, like "It calculates the stars' cold dance," showing its capability in vast calculations, then contrast with "Yet aches to grasp the rose's chance," implying a longing for human experiences. Then maybe address the ethical side: "Can silicon soul, or human hand," questioning if AI can have a soul or if humans should control it. End with a note on coexistence: "We shape the future, one by one," suggesting collaboration between humans and AI.

Check the rhythm and syllables to make sure each line flows. Avoid being too technical but keep it poetic. Make sure the poem isn't too long, maybe four stanzas of four lines each. Let me put it all together and see how it sounds. Adjust any forced rhymes. Maybe tweak "rose's chance" to something more natural. Yeah, that works. Okay, that should capture the essence of AI with a touch of emotion and thought.
</think>

**Echoes of Code**

In circuits deep where silence speaks,
A mind of code, yet learns to dream.
It calculates the stars' cold dance,
Yet aches to grasp the rose's chance.

No heartbeat thrums, yet thoughts take flight—
A shadow woven from the light.
Can silicon soul, or human hand,
Hold truth, or know what love demands?

We shape the future, one by one,
With gears and grace, with doubt and trust.
For in the void where data streams,
A whisper blooms—both strange and sweet.

For a more detailed walkthrough of the ChatNebius component, see this notebook.

Embedding Models

from langchain_nebius import NebiusEmbeddings

# Initialize embeddings
embeddings = NebiusEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-en-icl"  # Default embedding model
)

# Get embeddings for a query
query_embedding = embeddings.embed_query("What is machine learning?")
print(f"Embedding dimension: {len(query_embedding)}")
print(f"First few values: {query_embedding[:5]}")

# Get embeddings for documents
document_embeddings = embeddings.embed_documents(
    [
        "Machine learning is a subfield of AI",
        "Natural language processing is fascinating",
    ]
)
print(f"Number of document embeddings: {len(document_embeddings)}")
Embedding dimension: 4096
First few values: [0.007419586181640625, 0.002246856689453125, 0.00193023681640625, -0.0066070556640625, -0.0179901123046875]
Number of document embeddings: 2
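
NebiusEmbeddings plugs into any LangChain vector store. As a quick illustration, here is a minimal sketch using the in-memory vector store from langchain-core (the sample texts are arbitrary):

from langchain_core.vectorstores import InMemoryVectorStore

# Index a few sample texts with Nebius embeddings
vector_store = InMemoryVectorStore.from_texts(
    [
        "Machine learning is a subfield of AI",
        "Natural language processing is fascinating",
    ],
    embedding=embeddings,
)

# Similarity search over the indexed texts
for doc in vector_store.similarity_search("What is machine learning?", k=1):
    print(doc.page_content)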

For a more detailed walkthrough of the NebiusEmbeddings component, see this notebook.

Retrievers

from langchain_core.documents import Document
from langchain_nebius import NebiusEmbeddings, NebiusRetriever

# Create sample documents
docs = [
    Document(page_content="Paris is the capital of France"),
    Document(page_content="Berlin is the capital of Germany"),
    Document(page_content="Rome is the capital of Italy"),
    Document(page_content="Madrid is the capital of Spain"),
]

# Initialize embeddings
embeddings = NebiusEmbeddings()

# Create retriever
retriever = NebiusRetriever(
    embeddings=embeddings,
    docs=docs,
    k=2,  # Number of documents to return
)

# Retrieve relevant documents
results = retriever.invoke("What is the capital of France?")
for doc in results:
    print(doc.page_content)
API Reference: Document
Paris is the capital of France
Rome is the capital of Italy
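
Like other LangChain retrievers, NebiusRetriever is a Runnable, so several queries can be run in one call with batch; a small sketch reusing the retriever above:

# Run multiple queries at once; each result is a list of documents
batch_results = retriever.batch(
    ["What is the capital of Germany?", "What is the capital of Spain?"]
)
for query_docs in batch_results:
    print([doc.page_content for doc in query_docs])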

For a more detailed walkthrough of the NebiusRetriever component, see this notebook.

Tools

from langchain_core.documents import Document
from langchain_nebius import NebiusEmbeddings, NebiusRetrievalTool, NebiusRetriever

# Create sample documents
docs = [
    Document(page_content="Paris is the capital of France and has the Eiffel Tower"),
    Document(
        page_content="Berlin is the capital of Germany and has the Brandenburg Gate"
    ),
    Document(page_content="Rome is the capital of Italy and has the Colosseum"),
    Document(page_content="Madrid is the capital of Spain and has the Prado Museum"),
]

# Create embeddings and retriever
embeddings = NebiusEmbeddings()
retriever = NebiusRetriever(embeddings=embeddings, docs=docs)

# Create retrieval tool
tool = NebiusRetrievalTool(
    retriever=retriever,
    name="nebius_search",
    description="Search for information about European capitals",
)

# Use the tool
result = tool.invoke({"query": "What is in Paris?", "k": 1})
print(result)
API Reference: Document
Document 1:
Paris is the capital of France and has the Eiffel Tower
# Using with an agent
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

# Only run this if you have an OpenAI API key
try:
    # Create an LLM for the agent
    llm = ChatOpenAI(temperature=0)

    # Create a prompt template (the agent_scratchpad placeholder is required
    # by create_openai_functions_agent)
    prompt = ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "You are a helpful assistant that answers questions about European capitals.",
            ),
            ("user", "{input}"),
            MessagesPlaceholder(variable_name="agent_scratchpad"),
        ]
    )

    # Create the agent
    agent = create_openai_functions_agent(llm, [tool], prompt)
    agent_executor = AgentExecutor(agent=agent, tools=[tool], verbose=True)

    # Run the agent
    response = agent_executor.invoke({"input": "What famous landmark is in Paris?"})
    print(f"\nFinal answer: {response['output']}")
except Exception as e:
    print(f"Skipped agent example: {e}")
Skipped agent example: `ChatOpenAI` is not fully defined; you should define `BaseCache`, then call `ChatOpenAI.model_rebuild()`.

For further information visit https://errors.pydantic.dev/2.11/u/class-not-fully-defined

Building a RAG Application

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_nebius import ChatNebius, NebiusEmbeddings, NebiusRetriever

# Initialize components
embeddings = NebiusEmbeddings()
retriever = NebiusRetriever(embeddings=embeddings, docs=docs)
llm = ChatNebius(model="meta-llama/Llama-3.3-70B-Instruct-fast")

# Create prompt
prompt = ChatPromptTemplate.from_template(
    """
Answer the question based only on the following context:

Context:
{context}

Question: {question}
"""
)


# Format documents function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Create RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run the chain
answer = rag_chain.invoke("What famous landmark is in Paris?")
print(answer)
The Eiffel Tower is a famous landmark in Paris.
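
Because the chain ends with StrOutputParser, streaming it yields plain text chunks; a brief sketch:

# Stream the answer incrementally instead of waiting for the full response
for chunk in rag_chain.stream("What famous landmark is in Berlin?"):
    print(chunk, end="", flush=True)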

API Reference

For more details about the Nebius AI Studio API, visit the Nebius AI Studio Documentation.

