Scrapegraph-ai/docs/source/scrapers/llm.rst

194 lines
5.4 KiB
ReStructuredText

.. _llm:
LLM
===
We support many known LLM models and providers used to analyze the web pages and extract the information requested by the user. Models can be split in **Chat Models** and **Embedding Models** (the latter are mainly used for Retrieval Augmented Generation RAG).
These models are specified inside the graph configuration dictionary and can be used interchangeably, for example by defining a different model for llm and embeddings.
- **Local Models**: These models are hosted on the local machine and can be used without any API key.
- **API-based Models**: These models are hosted on the cloud and require an API key to access them (eg. OpenAI, Groq, etc).
.. note::
If the emebedding model is not specified, the library will use the default one for that LLM, if available.
Local Models
------------
Currently, local models are supported through Ollama integration. Ollama is a provider of LLM models which can be downloaded from here `Ollama <https://ollama.com/>`_.
Let's say we want to use **llama3** as chat model and **nomic-embed-text** as embedding model. We first need to pull them from ollama using:
.. code-block:: bash
ollama pull llama3
ollama pull nomic-embed-text
Then we can use them in the graph configuration as follows:
.. code-block:: python
graph_config = {
"llm": {
"model": "llama3",
"temperature": 0.0,
"format": "json",
},
"embeddings": {
"model": "nomic-embed-text",
},
}
You can also specify the **base_url** parameter to specify the models endpoint. By default, it is set to http://localhost:11434. This is useful if you are running Ollama on a Docker container or on a different machine.
If you want to host Ollama in a Docker container, you can use the following command:
.. code-block:: bash
docker-compose up -d
docker exec -it ollama ollama pull llama3
API-based Models
----------------
OpenAI
^^^^^^
You can get the API key from `here <https://platform.openai.com/api-keys>`_.
.. code-block:: python
graph_config = {
"llm": {
"api_key": "OPENAI_API_KEY",
"model": "gpt-3.5-turbo",
},
}
If you want to use text to speech models, you can specify the `tts_model` parameter:
.. code-block:: python
graph_config = {
"llm": {
"api_key": "OPENAI_API_KEY",
"model": "gpt-3.5-turbo",
"temperature": 0.7,
},
"tts_model": {
"api_key": "OPENAI_API_KEY",
"model": "tts-1",
"voice": "alloy"
},
}
Gemini
^^^^^^
You can get the API key from `here <https://ai.google.dev/gemini-api/docs/api-key>`_.
**Note**: some countries are not supported and therefore it won't be possible to request an API key. A possible workaround is to use a VPN or run the library on Colab.
.. code-block:: python
graph_config = {
"llm": {
"api_key": "GEMINI_API_KEY",
"model": "gemini-pro"
},
}
Groq
^^^^
You can get the API key from `here <https://console.groq.com/keys>`_. Groq doesn't support embedding models, so in the following example we are using Ollama one.
.. code-block:: python
graph_config = {
"llm": {
"model": "groq/gemma-7b-it",
"api_key": "GROQ_API_KEY",
"temperature": 0
},
"embeddings": {
"model": "ollama/nomic-embed-text",
},
}
Azure
^^^^^
We can also pass a model instance for the chat model and the embedding model. For Azure, a possible configuration would be:
.. code-block:: python
llm_model_instance = AzureChatOpenAI(
openai_api_version="AZURE_OPENAI_API_VERSION",
azure_deployment="AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"
)
embedder_model_instance = AzureOpenAIEmbeddings(
azure_deployment="AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME",
openai_api_version="AZURE_OPENAI_API_VERSION",
)
graph_config = {
"llm": {
"model_instance": llm_model_instance
},
"embeddings": {
"model_instance": embedder_model_instance
}
}
Hugging Face Hub
^^^^^^^^^^^^^^^^
We can also pass a model instance for the chat model and the embedding model. For Hugging Face, a possible configuration would be:
.. code-block:: python
llm_model_instance = HuggingFaceEndpoint(
repo_id="mistralai/Mistral-7B-Instruct-v0.2",
max_length=128,
temperature=0.5,
token="HUGGINGFACEHUB_API_TOKEN"
)
embedder_model_instance = HuggingFaceInferenceAPIEmbeddings(
api_key="HUGGINGFACEHUB_API_TOKEN",
model_name="sentence-transformers/all-MiniLM-l6-v2"
)
graph_config = {
"llm": {
"model_instance": llm_model_instance
},
"embeddings": {
"model_instance": embedder_model_instance
}
}
Anthropic
^^^^^^^^^
We can also pass a model instance for the chat model and the embedding model. For Anthropic, a possible configuration would be:
.. code-block:: python
embedder_model_instance = HuggingFaceInferenceAPIEmbeddings(
api_key="HUGGINGFACEHUB_API_TOKEN",
model_name="sentence-transformers/all-MiniLM-l6-v2"
)
graph_config = {
"llm": {
"api_key": "ANTHROPIC_API_KEY",
"model": "claude-3-haiku-20240307",
"max_tokens": 4000
},
"embeddings": {
"model_instance": embedder_model_instance
}
}