
[Langchain] Retriever

mugan1 2024. 11. 17. 16:23
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import UnstructuredFileLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings, CacheBackedEmbeddings
from langchain.storage import LocalFileStore

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

from uuid import uuid4

# chat model used at the end of the chain
llm = ChatOpenAI()

cache_dir = LocalFileStore("./.cache/")

loader = UnstructuredFileLoader(r"C:\Users\user\Desktop\LHS\Project\document\모욕.txt")

# splitter = RecursiveCharacterTextSplitter()
splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500,
    chunk_overlap=100,
)

docs = loader.load_and_split(text_splitter=splitter)
embeddings = OpenAIEmbeddings()
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(embeddings, cache_dir)

client = QdrantClient("http://localhost:6333")

# drop the collection if it already exists, then create it fresh
client.recreate_collection(
    collection_name="test1",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),  # 1536 dims = OpenAI ada-002 embeddings
)

vectorstore = QdrantVectorStore(
    client=client,
    collection_name="test1",
    embedding=cached_embeddings,
)

# uuids = [str(uuid4()) for i in range(len(docs))]
ids = [i+1 for i in range(len(docs))]
vectorstore.add_documents(documents=docs, ids=ids)

retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a competent lawyer. Answer questions in Korean using only the following context. If you don't know the answer just say you don't know, don't make it up:\n\n{context}",
        ),
        ("human", "{question}"),
    ]
)

chain = (
    {
        "context": retriever,
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
)

chain.invoke("무례한 표현만으로 모욕죄가 성립할 수 있어?")
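One detail from the listing worth noting: Qdrant only accepts unsigned integers or UUID strings as point IDs, which is why both the 1-based integer list and the commented-out `uuid4` variant work. A quick stdlib-only check of the two ID styles:

```python
from uuid import uuid4

docs = ["chunk-1", "chunk-2", "chunk-3"]  # stand-ins for the split documents

# Qdrant point IDs must be unsigned integers or UUID strings,
# so both of these lists are valid:
int_ids = [i + 1 for i in range(len(docs))]
uuid_ids = [str(uuid4()) for _ in docs]

print(int_ids)  # [1, 2, 3]
```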

 

Last time I built the vector DB; this is the LangChain code that uses it as a retriever.

Errors kept occurring, so I removed all the LangChain-related libraries and reinstalled them, and now it works..
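In the chain above, the dict `{"context": ..., "question": RunnablePassthrough()}` is coerced by LCEL into a parallel step: the same input question is fed to every value, so the retrieved context and the untouched question both reach the prompt. A pure-Python sketch of that fan-out (illustrative names, not LangChain internals):

```python
def run_parallel(steps, value):
    """Mimic what LCEL does with a dict of runnables: feed the same
    input to every step and collect the results in a dict."""
    return {key: step(value) for key, step in steps.items()}

def passthrough(x):            # stand-in for RunnablePassthrough
    return x

def fake_retriever(question):  # stand-in for the vector-store retriever
    return [f"doc matching: {question}"]

out = run_parallel(
    {"context": fake_retriever, "question": passthrough},
    "Can rude language alone constitute criminal insult?",
)
# out["context"] holds the retrieved docs; out["question"] keeps the query
```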

 
