From c660c334ddb960e8d2fa0cc59931dccf991916fa Mon Sep 17 00:00:00 2001
From: Matteo Mei <52063123+FattiMei@users.noreply.github.com>
Date: Fri, 12 Apr 2024 21:44:05 +0200
Subject: [PATCH] Update readme.md with benchmark on ubuntu system

Added results from running `benchmark.py` on a ubuntu laptop without docker
---
 examples/readme.md | 54 +++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/examples/readme.md b/examples/readme.md
index 53922b8c..3ea21220 100644
--- a/examples/readme.md
+++ b/examples/readme.md
@@ -1,36 +1,50 @@
-# Benchmark comparison for the models
-- Hardware: Macbook pro 14 inches with m1 pro and 16 GB of ram
+# Benchmark analysis
+# Local models
+The 3 benchmarked websites are:
+- Example 1: https://perinim.github.io/projects
+- Example 2: https://www.wired.com
+- Example 3: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2
+The time is measured in seconds
+
+The model run for this benchmark is Mistral on Ollama with nomic-embed-text
+
+| Hardware                | Example 1 | Example 2 | Example 3 |
+| ----------------------- | --------- | --------- | --------- |
+| Macbook pro 14 inches   | 26.10     | 60.915    | 200.77    |
+| Ubuntu with Radeon M260 | 296.98    | 1003.56   | /         |
+**Note**: the Docker examples were run only on the Macbook because performance is too slow (about 10 times slower than Ollama). The results are the following:
+
+| Hardware              | Example 1 | Example 2 | Example 3 |
+| --------------------- | --------- | --------- | --------- |
+| Macbook pro 14 inches | 240.22    | 612.48    | 2008.32   |
+# Performance on API services
 ### Example 1: personal portfolio
 **URL**: https://perinim.github.io/projects
 **Task**: List me all the projects with their description.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 35.98                    | 858          | 512           | 346               | 2                   | 0.00146        |
-| gpt-4-turbo-preview                 | 13.907                   | 866          | 512           | 354               | 2                   | 0.01574        |
-| Ollama with Mistral and embeddings  | 26.10                    | 0            | 0             | 0                 | 0                   | 0              |
-| Docker with Mistral and embeddings  | 240.22                   | 0            | 0             | 0                 | 0                   | 0              |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 35.98                    | 858          | 512           | 346               | 2                   | 0.00146        |
+| gpt-4-turbo-preview | 13.907                   | 866          | 512           | 354               | 2                   | 0.01574        |
+
 ### Example 2: Wired
 **URL**: https://www.wired.com
 **Task**: List me all the articles with their description.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 87.03                    | 3780         | 3760          | 3000              | 2                   | 0.01319        |
-| gpt-4-turbo-preview                 | 74.90                    | 5306         | 3060          | 2246              | 2                   | 0.09798        |
-| Ollama with Mistral and embeddings  | 60.915                   | 0            | 0             | 0                 | 0                   | 0              |
-| Docker with Mistral and embeddings  | 612.48                   | 0            | 0             | 0                 | 0                   | 0              |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 87.03                    | 3780         | 3760          | 3000              | 2                   | 0.01319        |
+| gpt-4-turbo-preview | 74.90                    | 5306         | 3060          | 2246              | 2                   | 0.09798        |
 
 ### Example 3: Amazon product page
 **URL**: https://www.amazon.it/s?k=alexa&__mk_it_IT=ÅMÅŽÕÑ&crid=1WWVF1RGDBBSB&sprefix=alex%2Caps%2C114&ref=nb_sb_noss_2
 **Task**: List me all the articles with their the costs and image url.
 
-| Name                                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
-| ----------------------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
-| gpt-3.5-turbo                       | 145.55                   | 26038        | 18091         | 7947              | 5                   | 0.04303        |
-| gpt-4-turbo-preview                 | 82.38                    | 15640        | 13698         | 1942              | 2                   | 0.19524        |
-| Ollama with Llama2 and embeddings   | 200.77                   | 0            | 0             | 0                 | 0                   | 0              |
-| Docker with Mistral and embeddings  | 2008.32                  | 0            | 0             | 0                 | 0                   | 0              |
+| Name                | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
+| ------------------- | ------------------------ | ------------ | ------------- | ----------------- | ------------------- | -------------- |
+| gpt-3.5-turbo       | 145.55                   | 26038        | 18091         | 7947              | 5                   | 0.04303        |
+| gpt-4-turbo-preview | 82.38                    | 15640        | 13698         | 1942              | 2                   | 0.19524        |
+
 ## Hosting services
 [[💻 Provider costs informations]]