	retriverText = """ This microservice integrates with the vector database to retrieve semantically relevant documents,\
	with optional reranking for precision, ready for seamless use in ChaBo RAG workflows.

	# Retriever and Reranker Microservice on Hugging Face Spaces

	[ChaBo_Retrieval](https://huggingface.co/spaces/GIZ/chatfed_retriever0.3) hosts a retrieval and reranker microservice.\
	Key features of the retrieval service:
	- The retriever embeds the user query itself, using Sentence-Transformers.
	- The reranker is available as an optional component.
	- Retrieval is the rate-determining step, because embedding the user query can be compute-intensive when using a dedicated model.
	- The model configuration, Qdrant server URL, and other parameters can be set through \
	[params.cfg](https://huggingface.co/spaces/GIZ/chatfed_retriever0.3/blob/main/params.cfg)

	```
	[vectorstore]
	# Qdrant-Server usage:
	PROVIDER = qdrant
	URL = giz-chatfed-qdrantserver.hf.space
	COLLECTION_NAME = EUDR

	[embeddings]
	MODEL_NAME = BAAI/bge-m3

	[retriever]
	TOP_K = 10
	SCORE_THRESHOLD = 0.6

	[reranker]
	MODEL_NAME = BAAI/bge-reranker-v2-m3
	TOP_K = 10
	ENABLED = true
	# use this to scale out the total docs retrieved prior to reranking (i.e. retriever top_k * TOP_K_SCALE_FACTOR)
	TOP_K_SCALE_FACTOR = 2
	```
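
The settings above use INI syntax, so they can be inspected with Python's standard `configparser`. The sketch below is illustrative only: it assumes the file is parsed as plain INI (how the service itself loads `params.cfg` is not shown here), embeds an abridged copy of the values as a string, and assumes that no over-fetching happens when the reranker is disabled. The helper name `candidates_before_rerank` is hypothetical.

```python
import configparser

# Abridged copy of the example params.cfg shown above.
PARAMS_CFG = '''
[retriever]
TOP_K = 10
SCORE_THRESHOLD = 0.6

[reranker]
MODEL_NAME = BAAI/bge-reranker-v2-m3
TOP_K = 10
ENABLED = true
TOP_K_SCALE_FACTOR = 2
'''


def candidates_before_rerank(cfg: configparser.ConfigParser) -> int:
    # Number of documents to fetch from the vector store before reranking.
    retriever_top_k = cfg.getint("retriever", "TOP_K")
    # Assumption: with the reranker disabled, only TOP_K documents are fetched.
    if not cfg.getboolean("reranker", "ENABLED"):
        return retriever_top_k
    # Documented rule: retriever top_k * TOP_K_SCALE_FACTOR.
    return retriever_top_k * cfg.getint("reranker", "TOP_K_SCALE_FACTOR")


cfg = configparser.ConfigParser()
cfg.read_string(PARAMS_CFG)
print(candidates_before_rerank(cfg))  # 20 with the values above
```

With `TOP_K = 10` and `TOP_K_SCALE_FACTOR = 2`, 20 candidates would be retrieved and the reranker would then keep its own `TOP_K` of 10.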

	API documentation: 1 API Endpoint

	### api_name: /retrieve

	Params:
	- query (str): Required.
	- collection_name (str): Name of the collection in the Qdrant server to query. Defaults to None.
	- filter_metadata (dict): Metadata filter for the Qdrant vector store, applied to the collection named above. Defaults to None.

	Returns: A list of retrieved contexts along with their metadata, returned as a string.
	Each context is a dict with two keys, 'answer' and 'answer_metadata'.
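
Since the result is described as a string, a client will usually need to decode it before working with the individual contexts. The exact serialization is not specified here, so this sketch assumes it is either JSON or a Python literal. The helper `parse_contexts` and the sample payload are hypothetical.

```python
import ast
import json


def parse_contexts(raw: str) -> list:
    # Decode the string returned by /retrieve into a list of dicts.
    try:
        contexts = json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: fall back to a Python literal such as "[{'answer': ...}]".
        contexts = ast.literal_eval(raw)
    if not isinstance(contexts, list):
        raise ValueError("expected a list of contexts")
    for ctx in contexts:
        # Each context should carry the two documented keys.
        if not isinstance(ctx, dict) or not {"answer", "answer_metadata"} <= ctx.keys():
            raise ValueError("context is missing 'answer' or 'answer_metadata'")
    return contexts


# Made-up payload for illustration; real metadata fields will differ.
sample = "[{'answer': 'Example passage.', 'answer_metadata': {'source': 'example.pdf'}}]"
for ctx in parse_contexts(sample):
    print(ctx["answer"], ctx["answer_metadata"])
```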

	How to Connect

	```python
	from gradio_client import Client
	# Replace with your actual Space URL (e.g., https://your-username-retriever_space.hf.space)
	retriever_url = "https://giz-chatfed-retriever0-3.hf.space/"
	client = Client(retriever_url)
	result = client.predict(
	    query="What is Circular Economy",
	    collection_name="Humboldt",
	    filter_metadata=None,
	    api_name="/retrieve",
	)
	```
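
For repeated calls it can help to wrap the request in a small function. The following is a minimal sketch and not part of the service: `retrieve` is a hypothetical helper, and it assumes only the three documented parameters and a client object that exposes `predict` (such as a `gradio_client.Client`).

```python
from typing import Any, Optional


def retrieve(client: Any, query: str,
             collection_name: Optional[str] = None,
             filter_metadata: Optional[dict] = None) -> Any:
    # Call the /retrieve endpoint with the documented parameters.
    if not query or not query.strip():
        raise ValueError("query is required")
    return client.predict(
        query=query,
        collection_name=collection_name,
        filter_metadata=filter_metadata,
        api_name="/retrieve",
    )


# Usage, reusing the client created above:
# result = retrieve(client, "What is Circular Economy", collection_name="Humboldt")
```

Passing the client in, instead of creating it inside the function, keeps one connection for many queries and makes the helper easy to exercise with a stub.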
	For more information on the retriever and its codebase, see the following links:
	- ChaBo_Retriever: [ReadMe](https://huggingface.co/spaces/GIZ/chatfed_retriever0.3/blob/main/README.md)
	- ChaBo_Retriever: [Codebase](https://huggingface.co/spaces/GIZ/chatfed_retriever0.3/tree/main)"""