From c660c334ddb960e8d2fa0cc59931dccf991916fa Mon Sep 17 00:00:00 2001
From: Matteo Mei <52063123+FattiMei@users.noreply.github.com>
Date: Fri, 12 Apr 2024 21:44:05 +0200
Subject: [PATCH] Update readme.md with benchmark on ubuntu system

Added results from running `benchmark.py` on a ubuntu laptop without docker
---
 examples/readme.md | 54 +++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 20 deletions(-)
diff --git a/examples/readme.md b/examples/readme.md
index 53922b8c..3ea21220 100644
--- a/examples/readme.md
+++ b/examples/readme.md
@@ -1,36 +1,50 @@
-# Benchmark comparison for the models
-- Hardware: Macbook pro 14 inches with m1 pro and 16 GB of ram
+# Benchmark analysis
+# Local models
+The 3 websites benchmark are:
+- Example 1:  https://perinim.github.io/projects
+- Example 2: https://www.wired.com
+- Example 3: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2
 
+The time is measured in seconds
+
+The model runned for this benchmark is Mistral on Ollama with nomic-embed-text
+
+| Hardware                | Example 1 | Example 2 | Example 3 |
+| ----------------------- | --------- | --------- | --------- |
+| Macbook pro 14 inches   | 26.10<br> | 60.915    | 200.77    |
+| Ubuntu with Radeon M260 | 296.98    | 1003.56   | /         |
+**Note**: the examples on Docker are not runned on other devices than the Macbook because the performance are to slow (10 times slower than Ollama). Indeed the results are the following:
+
+| Hardware              | Example 1 | Example 2 | Example 3 |
+| --------------------- | --------- | --------- | --------- |
+| Macbook pro 14 inches | 240.22    | 612.48    | 2008.32   |
+# Performance on APIs services
 ### Example 1: personal portfolio 
 **URL**: https://perinim.github.io/projects
 **Task**: List me all the projects with their description.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 35.98                    | 858          | 512           | 346               | 2                   | 0.00146        |
-| gpt-4-turbo-preview                 | 13.907                   | 866          | 512           | 354               | 2                   | 0.01574        |
-| Ollama with Mistral and embeddings  | 26.10                    | 0            | 0             | 0                 | 0                   | 0              |
-| Docker with  Mistral and embeddings | 240.22                   | 0            | 0             | 0                 | 0                   | 0              |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 35.98                    | 858          | 512           | 346               | 2                   | 0.00146        |
+| gpt-4-turbo-preview | 13.907                   | 866          | 512           | 354               | 2                   | 0.01574        |
+
 ### Example 2: Wired
 **URL**: https://www.wired.com
 **Task**: List me all the articles with their description.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 87.03                    | 3780         | 3760          | 3000              | 2                   | 0.01319        |
-| gpt-4-turbo-preview                 | 74.90                    | 5306         | 3060          | 2246              | 2                   | 0.09798        |
-| Ollama with Mistral and embeddings  | 60.915                   | 0            | 0             | 0                 | 0                   | 0              |
-| Docker with  Mistral and embeddings | 612.48                   | 0<br>        | 0<br>         | 0<br>             | 0<br>               | 0<br>          |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 87.03                    | 3780         | 3760          | 3000              | 2                   | 0.01319        |
+| gpt-4-turbo-preview | 74.90                    | 5306         | 3060          | 2246              | 2                   | 0.09798        |
 
 ### Example 3: Amazon product page
 **URL**: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2
 **Task**: List me all the articles with their the costs and image url.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 145.55                   | 26038        | 18091         | 7947              | 5                   | 0.04303        |
-| gpt-4-turbo-preview                 | 82.38                    | 15640        | 13698         | 1942              | 2                   | 0.19524        |
-| Ollama with Llama2 and embeddings   | 200.77                   | 0<br>        | 0<br>         | 0<br>             | 0<br>               | 0<br>          |
-| Docker with  Mistral and embeddings | 2008.32                  | 0<br>        | 0<br>         | 0<br>             | 0<br>               | 0<br>          |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 145.55                   | 26038        | 18091         | 7947              | 5                   | 0.04303        |
+| gpt-4-turbo-preview | 82.38                    | 15640        | 13698         | 1942              | 2                   | 0.19524        |
+
 ## Hosting services
 [[💻 Provider costs informations]]