Using Clojure for RAG with Ollama and Pyjama

Retrieval-Augmented Generation (RAG) is a powerful technique that combines retrieval-based search with generative AI models. This approach ensures that responses are based on relevant context rather than solely on the model’s pretrained knowledge. In this blog post, we’ll explore a simple RAG setup in Clojure using Ollama and Pyjama.

Setting Up the Environment

The script starts by defining the URL of the Ollama server, which serves the embedding and generative models:
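A minimal sketch of that setup (the port is Ollama's default; the model names below are assumptions for illustration, not taken from the post):

```clojure
;; URL of the local Ollama server, used for both the embedding
;; and the generative models. 11434 is Ollama's default port;
;; adjust it to match your own setup.
(def url "http://localhost:11434")

;; Hypothetical model choices for this walkthrough.
(def embedding-model "all-minilm")
(def chat-model "llama3.1")
```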

Fun with Clojure: Asking the Big Questions

Clojure might be known for its Lisp-y elegance and its ability to wrangle data like a pro, but today, we’re using it for something far more profound: uncovering the deepest truths of the universe. Why does the sky turn red? Why do fireworks explode? Why did the sun even bother to rise? Thanks to some Clojure magic, we’ve got the answers—straight from a highly sophisticated knowledge base (i.e., a delightfully whimsical source_of_truth.txt).

Transparently use multiple Ollama servers

Just a code summary of a crazy week.

At this point, Pyjama can transparently load-balance requests across different Ollama servers via pyjama.parallel/generate.

Basically, that gives:

(->>
    {:url "http://localhost:11432,http://localhost:11434"
     :models  ["llama3.1"]
     :format  {:type "integer"}
     :pre     "This is a potential answer %s02 for this question: %s01. Give the answer a score on a scale of 1 to 100, based on how accurate it is.
       - Do not give an answer yourself.
       - No comment.
       - No explanation.
       - No extra text."
     :prompts [["Why is the sky blue" "The sky appears blue because of a process called Rayleigh scattering."]
               ["Why is the sky blue" "Blue is scattered more than other colors because it travels as shorter, smaller waves."]
               ["Why is the sky blue" "During the day the sky looks blue because it's the blue light that gets scattered the most."]
               ["Why is the sky blue" "Because it is Christmas."]]}
    (pyjama.parallel/generate)
    (map (juxt :prompt :result :url))
    (clojure.pprint/pprint))

And yes, the prompts are dispatched to the different Ollama servers, as can be seen from the URLs in the answers, which show where each request was effectively executed. The set of URLs can also be set at runtime via the OLLAMA_URL environment variable.

Using Schemas with Ollama

Ollama added support for JSON schema-based output a while ago, so you can control what output comes back from the model, and with a nice structure.

I found the quality of the answers to be slightly lower, but with the added advantage of a well-formatted response.

Pyjama has full support for creating Ollama functions based on a schema, which means functions are called as plain Clojure functions while transparently making request calls to your running Ollama model.
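As a sketch of what schema-constrained output looks like, reusing the :format key shown in the parallel example above. The schema map follows Ollama's JSON-schema structured-output format; the pyjama.core/ollama entry point, server URL, and model name here are assumptions for illustration:

```clojure
;; A JSON schema describing the structured answer we want back.
(def country-schema
  {:type       "object"
   :properties {:name    {:type "string"}
                :capital {:type "string"}}
   :required   ["name" "capital"]})

;; Hypothetical call: the :format key carries the schema, and the
;; model's response is constrained to match it.
(pyjama.core/ollama
  "http://localhost:11434"
  :generate
  {:model  "llama3.1"
   :format country-schema
   :prompt "Tell me about Canada."})
```

The response body then parses as a map with :name and :capital keys, ready to use from Clojure without any ad-hoc string munging.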

LlamaParse from Clojure

Who is LlamaParse?

Here’s a fun intro to LlamaParse:

Imagine you have a mountain of documents—PDFs, messy scans, or text-heavy reports—and you need to extract useful information fast. Instead of manually copying and pasting like a tired intern, meet LlamaParse, your AI-powered document whisperer!

LlamaParse takes complex, unstructured documents and turns them into clean, structured data, ready for analysis or automation. Whether you’re dealing with legal contracts, research papers, or financial reports, this Llama doesn’t spit—it delivers.