Knowledge Settings

GPT Models and Generative AI Models

Currently, you can choose from multiple generative AI models. In general, we highly recommend Azure GPT-4o Mini for most use cases. If you need a higher-quality response, you can choose Azure GPT-4o.

| Model | Hosting | Costs | Quality of Response |
| --- | --- | --- | --- |
| Azure GPT-4o Mini | EU | $ | High |
| Azure GPT-4o | EU | $$ | High |
| OpenAI GPT-4o Mini | US | $ | High |
| OpenAI GPT-4o | US | $$ | High |
| Anthropic Claude 3.5 Haiku | US | $ | Medium |
| Anthropic Claude 3.5 Sonnet | US | $$ | High |
| Anthropic Claude 3 Opus | US | $$$ | High |
| FireworksAI Llama3.3 70b | US | $$ | Medium |
| TogetherAI Llama3.3 70b | US | $$ | Medium |
| Ollama | Your Hosting | Your Costs | Your Model |
| vLLM | Your Hosting | Your Costs | Your Model |

Custom API Keys

When using a model, you can enter your own API key to route the API calls to your account.

When using an Azure model, you can also add custom endpoint information. In this case you need to enter not only an API key, but also the base URL and deployment names. See the article Azure GPT for further details on how to set up your Azure GPT model endpoints.
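As a rough illustration of how these pieces fit together, an Azure OpenAI endpoint is typically assembled from the base URL of your resource, the deployment name, and an API version. All concrete values below ("my-resource", "gpt-4o-mini", the API version) are placeholders, not actual LoyJoy configuration:

```python
# Sketch of how the Azure endpoint pieces combine into a request URL.
# "my-resource" and "gpt-4o-mini" are placeholder names; use the values
# from your own Azure OpenAI resource and deployment.
base_url = "https://my-resource.openai.azure.com"
deployment = "gpt-4o-mini"
api_version = "2024-06-01"  # example API version

endpoint = (
    f"{base_url}/openai/deployments/{deployment}"
    f"/chat/completions?api-version={api_version}"
)
print(endpoint)
# The API key is sent as an "api-key" HTTP header with each request.
```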

Rate limit

You can set a rate limit per IP address per hour or per tenant ID per minute. The rate limit setting for GPT in LoyJoy controls how often users can send GPT requests within a given time frame. If the GPT prompt module is used or intensive tests are carried out, we recommend increasing the limit to 360 requests per hour. The number of tokens and the costs of each request can be viewed in the β€œMessage” tab in the knowledge area.

Self-Hosted Models

If you want to host your own model, you can select Ollama or vLLM to connect to your model hosting.

Ollama

Ollama is an open source project that serves as a platform for running LLMs on your own infrastructure. You can use Ollama to run and connect to your own model. You can find more information about Ollama on the Ollama website.
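As a minimal sketch of what connecting to a self-hosted Ollama instance involves: Ollama exposes a REST API, by default on port 11434, and generation requests are plain JSON. The model name "llama3" and the prompt are examples only; the model must match one pulled on your server:

```python
import json

# Ollama's REST API listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Example request body for a single (non-streamed) completion.
payload = {
    "model": "llama3",               # placeholder; must match a pulled model
    "prompt": "Why is the sky blue?",
    "stream": False,                 # return one JSON object instead of a stream
}
body = json.dumps(payload)
print(body)

# Sending it requires a running Ollama server, e.g. with urllib:
# import urllib.request
# req = urllib.request.Request(
#     OLLAMA_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```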

vLLM

vLLM is an open source library for LLM inference and serving. vLLM includes an OpenAI-compatible API that can be integrated with LoyJoy. You can find more information about vLLM in the vLLM repository.
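Because vLLM's server speaks the OpenAI API, a request against it looks like a standard chat completion call. A minimal sketch, assuming a vLLM server started locally (e.g. with `vllm serve <model>`) on its default address `http://localhost:8000/v1`; the model name below is a placeholder that must match the model actually being served:

```python
import json

# vLLM's OpenAI-compatible server listens on localhost:8000 by default.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Example chat completion request in the OpenAI wire format.
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)
print(body)

# Sending it requires a running vLLM server:
# import urllib.request
# req = urllib.request.Request(
#     VLLM_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```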

Llama-3 70b: Generative AI 100% Made in Germany πŸ‡©πŸ‡ͺ

The integration of Meta's open-weight model Llama-3 70b into the LoyJoy Platform offers a new LLM hosted by our German partner primeLine AI (part of the primeLine Group) in Limburg an der Lahn, Germany, ensuring that everything is 100% made in Germany.

Key Features

  • Hosted in an ISO 27001 certified data center of partimus GmbH in Limburg an der Lahn (part of the primeLine Group)
  • Privacy focused: Data stays in Germany
  • No Azure OpenAI hosting required
  • 100% Made in Germany
  • As always, full GDPR compliance

Interested? Contact us for activation.