How good is your RAG pipeline? take the test below to find out.
Caution: This is not a checklist. Completion of all items on this test will likely result in customer churn.
I blindly chunk by character limit, instead of looking for stop tokens.
I use more than 80% of the context window for each generation.
I vectorize full documents, without creating chunks.
I only use vector search for retrieving documents.
I pack the context window with more than 10 results for each generation.
I think I'm too cool to use plaintext search in my RAG pipeline
I still use text-ada-002 embeddings model from OpenAI.
I do not use a semantic filter or post-processing step.
I blindly process user query without protecting for prompt injection.
I blindly dump user documents into my vector index without any pre-processing or reformatting
I don't inject context to my document chunks before vectorizing them
I still use the standard RAG prompt from the Langchain tutorial.
I evaulate my RAG pipeline performance using vibes.
I don't use LlamaParse for processing PDFs.
I don't use overlapping chunks to prevent information loss.
I rely on cosine similarity alone to rank results.
I utilize a minimum cosine similarity threshold to filter out irrelevant results.
I do not use query expansion.
I have not experimented with chunk sizes.
I have not updated my pipeline in the last 6 months.
Calculate My Score!