AttributeError: 'SentenceTransformer' object has no attribute 'embed_documents'

This error happens because sentence-transformers exposes encode, while LangChain’s embedding wrappers expose embed_documents.

The stack trace points to a mismatch between two similar but separate APIs. The SentenceTransformer class from the sentence-transformers library gives you a simple model.encode(texts) method to make embeddings. LangChain, by design, standardizes many providers behind an Embeddings interface that expects embed_documents() and embed_query(). If you pass a plain SentenceTransformer object where a LangChain Embeddings implementation is required, you’ll get this AttributeError. The fix is to either wrap the model with LangChain’s HuggingFaceEmbeddings or skip LangChain and call encode directly.
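
The mismatch can be reproduced without installing either library. The sketch below is only an illustration: `RawEncoder` and `build_store` are hypothetical stand-ins that mimic a bare model with `encode` and a vector store helper that calls `embed_documents` on whatever it receives.

```python
class RawEncoder:
    """Hypothetical stand-in for a bare model: it offers encode() and nothing else."""

    def encode(self, texts):
        # Toy "embedding": a single number per text (its character count).
        return [[float(len(t))] for t in texts]


def build_store(texts, embedding):
    """Hypothetical stand-in for a vector store helper.

    Like the helpers described above, it assumes the object it receives
    implements embed_documents().
    """
    return list(zip(texts, embedding.embed_documents(texts)))


texts = ["Doc one", "Doc two"]

try:
    build_store(texts, RawEncoder())
    error_message = None
except AttributeError as exc:
    # Same shape of failure as the error in the title.
    error_message = str(exc)

print(error_message)
```

The helper never looks at `encode`, so the object fails at the first attribute lookup, which is exactly what the traceback reports.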

Root Causes And The One-Line Fix

Quick check: confirm which object you are handing to your vector store or chain. If it’s a raw SentenceTransformer, switch to a LangChain wrapper that adds the missing methods.

Using raw sentence-transformers: Replace the plain model with a wrapper class that implements embed_documents/embed_query.
Using LangChain ≥ 0.2: Import HuggingFaceEmbeddings from langchain-huggingface (the new split package) instead of the deprecated community path.
Working outside LangChain: Call the native model.encode(texts) directly and skip embed_documents entirely.

# ✅ Correct: use LangChain's Embeddings wrapper
from langchain_huggingface import HuggingFaceEmbeddings
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectors = emb.embed_documents(["Doc one", "Doc two"])

# ✅ Also valid: old path (now deprecated, still common in code)
# from langchain_community.embeddings import HuggingFaceEmbeddings

# ✅ Native sentence-transformers (no LangChain)
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vectors_native = model.encode(["Doc one", "Doc two"], batch_size=32, convert_to_numpy=True)

The wrapper approach keeps your code consistent across providers and lets LangChain vector stores call a uniform API. The native approach keeps dependencies light when you do not need chains.
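
To make the wrapper idea concrete, here is a minimal hand-rolled adapter. It is a sketch, not LangChain's implementation: `EncodeAdapter` and `ToyModel` are hypothetical names, and the adapter assumes only that the wrapped object has an `encode(list_of_texts)` method returning one vector per text.

```python
class ToyModel:
    """Hypothetical stand-in for a model that only offers encode()."""

    def encode(self, texts):
        # Toy vectors: [character count, word count] for each text.
        return [[float(len(t)), float(len(t.split()))] for t in texts]


class EncodeAdapter:
    """Adds embed_documents/embed_query on top of any object with encode()."""

    def __init__(self, model):
        self._model = model

    @staticmethod
    def _to_lists(vectors):
        # Array types such as NumPy arrays expose tolist(); plain lists are copied.
        if hasattr(vectors, "tolist"):
            return vectors.tolist()
        return [list(v) for v in vectors]

    def embed_documents(self, texts):
        return self._to_lists(self._model.encode(list(texts)))

    def embed_query(self, text):
        return self.embed_documents([text])[0]


adapter = EncodeAdapter(ToyModel())
doc_vectors = adapter.embed_documents(["Doc one", "Doc two words"])
query_vector = adapter.embed_query("search term")
```

Some vector store helpers may also expect a subclass of LangChain's Embeddings base class, so the official `HuggingFaceEmbeddings` wrapper remains the safer choice inside LangChain code.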

SentenceTransformer embed_documents Error In LangChain: Causes And Fixes

Many tutorials mix libraries, which yields a method mismatch. Align your imports and pass an Embeddings implementation, not a bare encoder. The same issue appears when a vector store factory expects a wrapper while your code hands over a raw model.

Deeper fix: upgrade your imports to the current locations and pin versions so your wrappers and vector store agree on types.

# Shell command: install the related packages together so their versions stay in sync
pip install -U "langchain>=0.2" langchain-huggingface "sentence-transformers>=3.0" chromadb

# After installing, confirm the right class shows the expected methods
from langchain_huggingface import HuggingFaceEmbeddings
print(hasattr(HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2"), "embed_documents"))  # True

Symptoms, Causes, And Proven Fixes

Symptom	Likely Cause	Fix
AttributeError on `embed_documents`	Passing raw `SentenceTransformer` where LangChain expects an Embeddings interface	Wrap with `HuggingFaceEmbeddings` or use `model.encode`
Import path works on one machine, fails on another	Old `langchain_community` path vs new `langchain_huggingface` package	Switch imports to `from langchain_huggingface import HuggingFaceEmbeddings`
Vector store errors about dimensions	Indexes built with a different model size	Re-embed with one model and rebuild the index

Correct Patterns With Chroma And FAISS

Good pattern: let the vector store call the wrapper’s methods. This keeps batching and device handling neat.

# Chroma example with LangChain
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

texts = ["Alpha", "Beta", "Gamma"]
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vs = Chroma.from_texts(texts, embedding=emb, persist_directory="./chroma_db")
# Later: query
q_vec = emb.embed_query("search term")

Lean pattern: skip LangChain if you just need fast vectors and manual search logic.

# Pure sentence-transformers with FAISS
from sentence_transformers import SentenceTransformer
import faiss, numpy as np

texts = ["Alpha", "Beta", "Gamma"]
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
X = model.encode(texts, convert_to_numpy=True, batch_size=32)

index = faiss.IndexFlatIP(X.shape[1])
faiss.normalize_L2(X)
index.add(X)

query = "search term"
q = model.encode([query], convert_to_numpy=True)
faiss.normalize_L2(q)
D, I = index.search(q, k=3)
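
The FAISS snippet normalizes vectors before using an inner-product index because inner product equals cosine similarity only for unit-length vectors. The arithmetic can be checked in plain Python; `l2_normalize`, `dot`, and `cosine` below are illustrative helpers, not FAISS functions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


doc = [3.0, 4.0]
query = [4.0, 3.0]

# The raw inner product grows with vector length.
raw_inner_product = dot(doc, query)

# After L2 normalization, the inner product matches cosine similarity.
normalized_inner_product = dot(l2_normalize(doc), l2_normalize(query))
cosine_similarity = cosine(doc, query)
```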

Version Pitfalls And How To Avoid Them

Split packages in LangChain: Many integrations moved into dedicated packages. Use langchain-huggingface for the wrapper import going forward.
Deprecated paths: Code that imports from langchain_community.embeddings still runs in some setups, but docs point to the new package. Migrate when you touch this area.
Model families differ: Switching from all-MiniLM-L6-v2 (384-dimensional vectors) to a larger model often changes the vector size. Keep one model per index to avoid shape conflicts.
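
During a migration, one way to tolerate both import locations is to try the new package first and fall back to the old path. This is a sketch rather than an official LangChain utility: `resolve_hf_embeddings` is a hypothetical helper, and it returns `(None, None)` when neither package is installed.

```python
import importlib

# Module paths taken from the text above, newest first.
CANDIDATE_MODULES = ("langchain_huggingface", "langchain_community.embeddings")


def resolve_hf_embeddings():
    """Return (module_name, HuggingFaceEmbeddings class) or (None, None)."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Package not installed in this environment; try the next path.
            continue
        cls = getattr(module, "HuggingFaceEmbeddings", None)
        if cls is not None:
            return module_name, cls
    return None, None


module_name, embeddings_cls = resolve_hf_embeddings()
if embeddings_cls is None:
    print("No HuggingFaceEmbeddings wrapper found; install langchain-huggingface")
else:
    print(f"Using HuggingFaceEmbeddings from {module_name}")
```

Logging which path was used makes it easier to spot machines that still depend on the deprecated import.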

End-To-End Reference Snippets

Use LangChain’s Embeddings Interface

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

docs = ["Banana bread recipe", "How to brew oolong", "Caching with Redis"]
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(docs, emb)

Stick To Native sentence-transformers

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
doc_vecs = model.encode(["text A", "text B"], normalize_embeddings=True)

Checklist To Keep This Error Away

Match the API: Use wrappers when your code path expects embed_documents/embed_query.
Pin versions: Lock langchain, langchain-huggingface, and sentence-transformers in your project file.
Test imports: Run a quick hasattr probe in CI to catch wrong classes early.
Audit vector size: Keep vector dimension consistent across embedding and index builds.
Write small probes: Add a unit test that calls both embed_documents and encode on a tiny sample.
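
The hasattr probe from the checklist can be a few lines. In this sketch, `missing_embedding_methods`, `BareModel`, and `Wrapper` are hypothetical names; only the two method names come from the interface described in this article.

```python
REQUIRED_METHODS = ("embed_documents", "embed_query")


def missing_embedding_methods(obj):
    """Return the required method names that obj lacks (or that are not callable)."""
    return [
        name
        for name in REQUIRED_METHODS
        if not callable(getattr(obj, name, None))
    ]


class BareModel:
    """Stand-in for a raw encoder."""

    def encode(self, texts):
        return [[0.0] for _ in texts]


class Wrapper:
    """Stand-in for an Embeddings-style wrapper."""

    def embed_documents(self, texts):
        return [[0.0] for _ in texts]

    def embed_query(self, text):
        return [0.0]


bare_missing = missing_embedding_methods(BareModel())
wrapper_missing = missing_embedding_methods(Wrapper())
```

In CI, asserting that the list is empty for the object your app actually builds catches a wrong class before any vector store is touched.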

Why This Happens And When To Use Each Path

LangChain wraps providers so your chains can swap models without refactors. That contract lives in two methods: embed_documents for a list of texts and embed_query for a single string. The interface is simple and predictable, which is why vector store helpers take an Embeddings object instead of a bare model. sentence-transformers, on the other hand, keeps its surface area small by exposing one primary entry point: encode(). Neither is wrong; they solve different needs. Use one API per code path to avoid surprises.

Working Examples You Can Copy

Fix In A LangChain App That Used A Raw Model

# ❌ Before
from sentence_transformers import SentenceTransformer
from langchain_community.vectorstores import Chroma
texts = ["Alpha", "Beta", "Gamma"]
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# from_texts takes plain strings; from_documents expects Document objects
db = Chroma.from_texts(texts, embedding=model)  # raises AttributeError

# ✅ After
from langchain_huggingface import HuggingFaceEmbeddings
emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = Chroma.from_texts(texts, embedding=emb)

Fix When You Do Not Want LangChain

# Just call encode; no embed_documents here
from sentence_transformers import SentenceTransformer
texts = ["Alpha", "Beta", "Gamma"]
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vecs = model.encode(texts, convert_to_numpy=True)  # NumPy array, one row per text

Practical Migration Steps

Identify the call site — Search your code for places where a vector store or chain receives an embedding=... argument.
Swap in the wrapper — Replace the plain model with HuggingFaceEmbeddings from langchain_huggingface.
Recreate the index — Drop any old vector index built with a different model and re-embed your corpus.
Probe dimensions — Print one vector’s length to confirm the index and the embedder agree.
Pin versions — Record exact versions in pyproject.toml or requirements.txt.
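
The "probe dimensions" step can be automated with a small check. This is a sketch with hypothetical helper names (`vector_dimension`, `check_against_index`) and toy 4-dimensional vectors; in a real project the vectors would come from `embed_documents()` or `encode()`.

```python
def vector_dimension(vectors):
    """Return the shared length of all vectors, or raise if they disagree."""
    lengths = {len(v) for v in vectors}
    if len(lengths) != 1:
        raise ValueError(f"expected one vector length, found {sorted(lengths)}")
    return lengths.pop()


def check_against_index(vectors, index_dim):
    """Raise a readable error when the embedder and the index disagree."""
    dim = vector_dimension(vectors)
    if dim != index_dim:
        raise ValueError(
            f"embedder produces {dim}-d vectors but the index expects {index_dim}-d"
        )
    return dim


# Toy vectors standing in for real embeddings.
sample = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
dim = check_against_index(sample, index_dim=4)

try:
    check_against_index(sample, index_dim=8)
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)
```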

Device, Speed, And Memory Tips

Batch smart: pick a batch size that fills your GPU without spilling. Start with 32 and adjust.

from sentence_transformers import SentenceTransformer
texts = ["Alpha", "Beta", "Gamma"]
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device="cuda")  # or "cpu"
embs = model.encode(texts, batch_size=32, convert_to_numpy=True, normalize_embeddings=True)

Normalize vectors: inner-product search matches cosine similarity only when both the documents and the query are L2-normalized.
Avoid double encoding: cache document vectors and only encode new data.
Keep token budget in mind: long chunks waste compute, and text beyond the model's max_seq_length is truncated. 256–512 tokens per chunk is a good starting point; check the limit of the model you use.
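
The "avoid double encoding" tip can be sketched as a small in-memory cache keyed by a hash of each text. `EmbeddingCache` and `toy_encode` are hypothetical names; in practice `encode_fn` would be something like `model.encode` or `emb.embed_documents`.

```python
import hashlib


class EmbeddingCache:
    """Caches vectors by text hash so each distinct text is encoded once."""

    def __init__(self, encode_fn):
        self._encode_fn = encode_fn
        self._store = {}
        self.texts_encoded = 0  # how many texts were actually sent to encode_fn

    @staticmethod
    def _key(text):
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts):
        # Collect texts that are not cached yet, skipping duplicates in this call.
        new_texts, seen = [], set()
        for text in texts:
            key = self._key(text)
            if key not in self._store and key not in seen:
                seen.add(key)
                new_texts.append(text)
        if new_texts:
            vectors = self._encode_fn(new_texts)
            self.texts_encoded += len(new_texts)
            for text, vector in zip(new_texts, vectors):
                self._store[self._key(text)] = list(vector)
        # Return vectors in the order the caller asked for them.
        return [self._store[self._key(text)] for text in texts]


def toy_encode(texts):
    """Stand-in encoder: one number per text."""
    return [[float(len(t))] for t in texts]


cache = EmbeddingCache(toy_encode)
first = cache.embed(["alpha", "beta"])
second = cache.embed(["beta", "gamma"])  # only "gamma" is encoded here
```

A persistent version would also include the model name in the key, so a model switch never reuses stale vectors.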

Quality Guardrails For RAG Workflows

Keep context focused: chunk by sentence boundaries where possible to avoid partial thoughts. Add a small overlap so entities stay intact.
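
Sentence-boundary chunking with a small overlap might look like the sketch below. It makes two simplifying assumptions: sentences end at `.`, `!`, or `?` followed by whitespace, and word count is used as a rough proxy for tokens. `chunk_sentences` is a hypothetical helper.

```python
import re


def word_count(sentences):
    return sum(len(s.split()) for s in sentences)


def chunk_sentences(text, max_words=40, overlap_sentences=1):
    """Group whole sentences into chunks of roughly max_words words.

    The last overlap_sentences sentences of a chunk are repeated at the
    start of the next one when they fit. A single sentence longer than
    max_words becomes its own chunk.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        if current and word_count(current) + word_count([sentence]) > max_words:
            chunks.append(" ".join(current))
            carried = current[-overlap_sentences:] if overlap_sentences > 0 else []
            # Drop the overlap if carrying it would already overflow the next chunk.
            fits = word_count(carried) + word_count([sentence]) <= max_words
            current = carried if fits else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = (
    "Ada wrote the first program. It ran on the Analytical Engine. "
    "Babbage designed that machine. It was never finished."
)
chunks = chunk_sentences(text, max_words=12, overlap_sentences=1)
for chunk in chunks:
    print(chunk)
```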

Choose a consistent model: stick with one embedding model per project to maintain stable scoring.
Measure retrieval: track recall@k on a labelled set before and after a model switch.
Use reranking only when needed: a cross-encoder adds latency; add it after you hit accuracy walls.
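
Recall@k on a labelled set takes only a few lines to compute. The sketch below uses made-up document ids; `recall_at_k` and `mean_recall_at_k` are hypothetical helper names.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs, k):
    """Average recall@k over (retrieved_ids, relevant_ids) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Each entry: (ranked ids returned by the retriever, ids labelled relevant).
labelled_runs = [
    (["d1", "d7", "d3"], {"d1", "d3"}),
    (["d9", "d2", "d4"], {"d4", "d5"}),
]

score_at_3 = mean_recall_at_k(labelled_runs, k=3)
score_at_1 = mean_recall_at_k(labelled_runs, k=1)
```

Running the same labelled set before and after a model switch gives two comparable numbers instead of an impression.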

Common Mis-Wires That Cause The Error

Mismatched object type: passing a plain SentenceTransformer where a wrapper is expected triggers AttributeError: 'SentenceTransformer' object has no attribute 'embed_documents'.
Old tutorial code: examples written before the package split import from langchain_community, which can produce deprecation warnings or import errors on new setups.
Index built with another model: mixing models yields shape errors during add or search calls.
Shadowed variable names: naming a list embeddings can hide your class and create odd errors.
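
The shadowed-name case is easy to miss, so here is a small reproduction. `ToyEmbeddings` and `build_store` are hypothetical stand-ins for a wrapper and a vector store helper.

```python
class ToyEmbeddings:
    """Stand-in for an Embeddings-style wrapper."""

    def embed_documents(self, texts):
        return [[float(len(t))] for t in texts]


def build_store(texts, embedding):
    """Stand-in for a helper that calls embed_documents on its argument."""
    return list(zip(texts, embedding.embed_documents(texts)))


texts = ["Alpha", "Beta"]

# Mis-wire: the name that held the wrapper is reused for its output.
embeddings = ToyEmbeddings()
embeddings = embeddings.embed_documents(texts)  # now a plain list
try:
    build_store(texts, embeddings)
    shadow_error = None
except AttributeError as exc:
    shadow_error = str(exc)

# Fix: keep separate names for the embedder and the vectors it returns.
embedder = ToyEmbeddings()
vectors = embedder.embed_documents(texts)
store = build_store(texts, embedder)
```

The error text names `list` rather than `SentenceTransformer`, which is the hint that a variable was overwritten rather than that the wrong class was constructed.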

Make It Bulletproof In CI

# tests/test_embeddings_contract.py
from langchain_huggingface import HuggingFaceEmbeddings

def test_contract():
    emb = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    docs = ["a", "b", "c"]
    vecs = emb.embed_documents(docs)
    assert isinstance(vecs, list) and len(vecs) == 3
    q = emb.embed_query("a")
    assert isinstance(q, list) and len(q) == len(vecs[0])

Choosing A Model Without Regret

Pick for footprint: small models such as all-MiniLM-L6-v2 run fast and serve well for many search tasks. Larger encoders add cost and memory; move up only when your metrics demand it.

English-only vs multilingual: match the model to your corpus language.
Domain match: try a domain-tuned model when you see lots of near-ties in the top-k results.
Cold-start path: begin with a compact model, then A/B test upgrades against a held-out query set.

From Zero To A Clean Pipeline

Load documents — parse files and split cleanly.
Embed with one class — use either the wrapper or the native API, not both in the same code path.
Index once — persist vectors with metadata you can filter on.
Search and score — log hits, misses, and add a tiny suite of golden queries.
Refresh safely — when you change the model, re-embed and bump the index version number.
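
One way to make the "refresh safely" step mechanical is to store a small manifest next to the index and compare it before reuse. The file name and fields below are assumptions for illustration, as are the helper names `write_manifest` and `needs_rebuild`.

```python
import json
import tempfile
from pathlib import Path


def write_manifest(path, model_name, dimension, version):
    """Record which model and vector size the index was built with."""
    manifest = {
        "model_name": model_name,
        "dimension": dimension,
        "index_version": version,
    }
    Path(path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest


def needs_rebuild(path, model_name, dimension):
    """True when there is no manifest or it disagrees with the current embedder."""
    manifest_file = Path(path)
    if not manifest_file.exists():
        return True
    manifest = json.loads(manifest_file.read_text(encoding="utf-8"))
    return (
        manifest.get("model_name") != model_name
        or manifest.get("dimension") != dimension
    )


with tempfile.TemporaryDirectory() as tmp:
    manifest_path = Path(tmp) / "index_manifest.json"  # hypothetical file name
    small = "sentence-transformers/all-MiniLM-L6-v2"   # 384-d vectors
    large = "sentence-transformers/all-mpnet-base-v2"  # 768-d vectors

    before = needs_rebuild(manifest_path, small, 384)
    write_manifest(manifest_path, small, 384, version=1)
    same_model = needs_rebuild(manifest_path, small, 384)
    other_model = needs_rebuild(manifest_path, large, 768)
```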

When You See It In Notebooks

Quick fix: restart the kernel after changing imports. A live session can hold the old class in memory, which makes the same cell work once and fail later.

Reinstall cleanly: in a virtual env, reinstall the packages and verify import paths.
Print the class: print(type(emb)) should show a LangChain wrapper when you plan to call embed_documents.
Confirm device: if you toggle GPU/CPU, double-check the device arg on construction.

Real-World Layouts That Stay Stable

Keep a single source: centralize embedder creation in one factory function so every module uses the same object type and settings.

# my_project/embeddings.py
from langchain_huggingface import HuggingFaceEmbeddings

def make_embedder(model_name="sentence-transformers/all-MiniLM-L6-v2"):
    return HuggingFaceEmbeddings(model_name=model_name)

# elsewhere
emb = make_embedder()

FAQ-Free Clarifications

Short notes: you can keep your code lean by calling encode directly when you do not need chains, retrievers, or memory helpers. In projects that rely on LangChain tools, the wrapper keeps interfaces aligned and avoids the AttributeError: 'SentenceTransformer' object has no attribute 'embed_documents' surprise.

Troubleshooting Checklist

Confirm the class — print(type(embedding)) before passing it to a vector store.
Inspect methods — run dir(embedding) and look for embed_documents and embed_query.
Validate shapes — check that one vector length matches your index dimension.
Flush old caches — clear any persisted DB built with a different model.
Re-run tests — keep a tiny retrieval suite so regressions show up right away.

Notes On Licensing And Deployment

Check licenses: sentence-transformers models and data carry their own terms. Review the model card on Hugging Face and respect usage limits in hosted settings. When you ship a service, pin model names and package versions so rollouts stay predictable across environments.
