Using Schemas with Ollama

Ollama added support for JSON schema-based output a while ago. This means you can control what output comes back from the model, and get it in a nice structure.

I found the quality of the answers to be slightly lower, but with the added advantage of a well-formatted response.

Pyjama has full support for creating Ollama functions based on a schema. This means functions are called as regular Clojure functions, transparently making request calls to your running Ollama model.
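
All the examples below rely only on the pyjama.functions namespace; assuming Pyjama is already on your classpath, a bare require is enough to follow along:

(require '[pyjama.functions])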

Let’s go through a few examples.

Simple Calculator

This should probably not be used in production, but here is a simple model-based calculator.

The :format key is where we put all the schema details. This is mostly EDN-based, mapped transparently to JSON for the model request.

(def calculator
  (pyjama.functions/ollama-fn
    {:model  "llama3.2"
     :system "answer the computation, only the numerical value of the result"
     :format {:type "number"}}))
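
The :format value above is plain EDN; when the request goes out, it is serialized into the equivalent JSON Schema. As a rough illustration of the mapping (cheshire is used here purely as an example serializer, not necessarily what Pyjama uses internally):

(require '[cheshire.core :as json])

; the EDN schema and the JSON Schema the model sees are equivalent
(json/generate-string {:type "number"})
; "{\"type\":\"number\"}"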

This “simple” calculator can now be called as a regular Clojure function.

(calculator "3 x 7")
; 21

Note that the returned value is directly a Java integer.

; java.lang.Integer

Very useful for further computations.

(calculator "3 + 7")
; 10
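
Since the results are plain numbers, they compose directly with ordinary Clojure arithmetic. A small sketch, assuming the model answers 21 and 10 as above:

; model-backed results feed straight into regular arithmetic
(+ (calculator "3 x 7") (calculator "3 + 7"))
; 31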

The default URL for Ollama is http://localhost:11434. It can be set with the OLLAMA_URL environment variable, or directly (not recommended) in the map using the :url key.
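
For example, to point one function at a specific Ollama instance (the host below is only a placeholder):

(def remote-calculator
  (pyjama.functions/ollama-fn
    {:url    "http://my-ollama-host:11434"
     :model  "llama3.2"
     :system "answer the computation, only the numerical value of the result"
     :format {:type "number"}}))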

City Generator

This second example asks the model to return a random city.

(def simple-city-generator
  (pyjama.functions/ollama-fn
    {:system "generate an existing city name"
     :model  "llama3.2"
     :format {:type "string"}}))
(println
    (simple-city-generator "city is tokyo"))
; Tokyo

And we can also give directions on where we want the city to be located.

(println
    (simple-city-generator "random city in Africa"))
; Kinshasa

Note that the model answers with better accuracy when we give the response objects proper names as hints. So while not obvious, the version below gives better results:

(def city-generator
  (pyjama.functions/ollama-fn
    {
     :system "generate an object according to schema"
     :model  "llama3.2"
     :format {:type "object" :properties {:city {:type "string"}}}}))

Here the property is named :city, which helps the model tremendously.

You can also enforce the length and size of just about everything, so to answer with a city name of at most 5 letters:

(pyjama.functions/ollama-fn
    {
     :system "generate an object according to schema"
     :model  "llama3.2"
     :format {:type "object" :properties {:city {:type "string" :maxLength 5}}}})
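
Bound to a var (short-city-generator here, the name is arbitrary), this constrained generator is used like the previous ones; the returned :city value is capped at 5 characters. A usage sketch, where the actual answer will vary from run to run:

(def short-city-generator
  (pyjama.functions/ollama-fn
    {:system "generate an object according to schema"
     :model  "llama3.2"
     :format {:type "object" :properties {:city {:type "string" :maxLength 5}}}}))

(println
    (short-city-generator "random city in Africa"))
; a map such as {:city Cairo}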

Airport Codes Generator

In this third example we want an array back with cities and their main airport codes. We want the minimum and maximum number of items to be 2, and the airport code to be limited to 3 characters.

(def airport-code-generator
  (pyjama.functions/ollama-fn
    {:system "generate array of object: the original city and the corresponding 3 letters code for the main airport, with no extra text."
     :model  "llama3.2"
     :format
     {:type                 "array"
      :items
      {:type                 "object"
       :required             ["airport" "city"]
       :properties           {:city {:type "string"} :airport {:type "string" :maxLength 3}}
       :additionalProperties false}
      :additionalProperties false
      :minItems             2
      :maxItems             2}}))

And calling it with Paris and New York returns the proper departure and arrival city names and codes.

(let [res (airport-code-generator "Paris and New York")]
  (println "Departure city: " (first res))
  (println "Arrival city: " (second res)))
; Departure city:  {:city Paris, :airport CDG}
; Arrival city:  {:city New York, :airport JFK}
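
And because the result is plain Clojure data, pulling out just the airport codes is a one-liner. Reusing the same call, and assuming the same answer as above:

(let [res (airport-code-generator "Paris and New York")]
  (mapv :airport res))
; ["CDG" "JFK"]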

Keyworder

The keyworder also returns an array of between 2 and 7 keywords, each with a relevance computed over the full text.

(def keyworder
  (pyjama.functions/ollama-fn
    {
     :model "llama3.2"
     :format
     {
      :type     "array"
      :minItems 2
      :maxItems 7
      :items    {:type       "object"
                 :required   [:keyword :relevance]
                 :properties {:keyword {:type "string"} :relevance {:type "integer"}}}}
     :system
     "Find all the main keywords in the each prompt. relevance is how important it is in the sentence betweem 1 and 10"
     }))
(def opening-crawl
  "It is a period of civil wars in the galaxy. A brave alliance of underground freedom fighters has challenged the tyranny and oppression of the awesome GALACTIC EMPIRE.\n\nStriking from a fortress hidden among the billion stars of the galaxy, rebel spaceships have won their first victory in a battle with the powerful Imperial Starfleet. The EMPIRE fears that another defeat could bring a thousand more solar systems into the rebellion, and Imperial control over the galaxy would be lost forever.\n\nTo crush the rebellion once and for all, the EMPIRE is constructing a sinister new battle station. Powerful enough to destroy an entire planet, its completion spells certain doom for the champions of freedom.")

(keyworder opening-crawl)
; [{"keyword": "galaxy", "relevance": 8}, {"keyword": "empire", "relevance": 9}, {"keyword": "rebellion", "relevance": 9}, {"keyword": "battlestation", "relevance": 7}, {"keyword": "freedom", "relevance": 6}, {"keyword": "spaceships", "relevance": 5}, {"keyword": "imperial", "relevance": 8}]

Smurfs

This one started as a way to review question/answer accuracy. The scorer config is rather simple after all we have seen so far. It returns an object with three properties: score, category, and an explanation of the score.


(def scorer-config
  {
   :model   "llama3.1"
   :options {:temperature 0.9}
   :format
   {:type       "object"
    :required   [:score :category :explanation]
    :properties {:category    {:type "string" :enum ["perfect" "good" "bad"]}
                 :explanation {:type "string"}
                 :score       {:type "integer" :minimum 0 :maximum 100}}}
   :pre     "Given the question: %s \n, give a score (bad:1-50 good:50-80 perfect:80-100) based solely on the logical relevance of the answer: %s. \n.
   "
   :stream  false})

The model is rather precise, but I had to set the temperature to 0.9 to get consistent results (0.7 did not do it).


(require '[clojure.string :as str])

(let [questions [["How many smurfs are red?" 1]
                 ["How many lady smurfs in the smurfs world?" 1]
                 ["How many smurfs are red?" -10]
                 ["How many smurfs are blue?" "between 100 and 200"]
                 ["What is the color of smurfs?" "blue"]]]
  (doseq [[question answer] questions]
    (let [{:keys [category explanation score]}
          ((pyjama.functions/ollama-fn scorer-config) [question answer])]
      (println (str/join "," [score category explanation])))))

And the scores of our scorer function:

90,perfect,In every depiction and reference to Smurfs, one Smurf is identified as 'Papa Smurf' or 'The Elder', who wears a red hat.
90,perfect,The question directly asks for a number, and a specific name 'Lady Smurf' is mentioned, implying there's only one. This makes the answer logically relevant.
20,bad,The question is ambiguous and does not provide any context about the population or group of smurfs being referred to. The answer -10 is a score that reflects this lack of clarity.
90,perfect,The question explicitly states that all smurfs are blue, so the answer is indeed between 100 and 200.
90,perfect,The question explicitly states that it is about smurfs, and there is a well-known characteristic associated with smurfs, which makes this an obvious and correct answer.

The whole test suite for those schema-based ollama-fn examples is on GitHub.