Knowledge Settings

GPT Models and Generative AI Models

Currently, you can choose from multiple generative AI models. In general, we highly recommend Azure GPT-4o Mini for most use cases. If you need a higher-quality response, you can choose Azure GPT-4o.

| Model | Hosting | Costs | Quality of Response |
| --- | --- | --- | --- |
| Azure GPT-4o Mini | EU | $ | High |
| Azure GPT-4o | EU | $$ | High |
| OpenAI GPT-4o Mini | US | $ | High |
| OpenAI GPT-4o | US | $$ | High |
| Anthropic Claude 3.5 Haiku | US | $ | Medium |
| Anthropic Claude 3.5 Sonnet | US | $$ | High |
| Anthropic Claude 3 Opus | US | $$$ | High |
| FireworksAI Llama3.3 70b | US | $$ | Medium |
| TogetherAI Llama3.3 70b | US | $$ | Medium |
| Ollama | Your Hosting | Your Costs | Your Model |
| vLLM | Your Hosting | Your Costs | Your Model |

Custom API Keys

When using a model, you can enter your own API key to route the API calls to your account.

When using an Azure model, you can also add custom endpoint information. In this case you need to enter not only an API key, but also the base URL and deployment names. See the article Azure GPT for further details on how to set up your Azure GPT model endpoints.
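As a rough illustration of how these pieces fit together, an Azure OpenAI endpoint is typically assembled from the base URL of your resource, the deployment name, and an API version. All concrete values below ("my-resource", "gpt-4o-mini", the API version) are placeholders, not actual LoyJoy configuration:

```python
# Sketch of how the Azure endpoint pieces combine into a request URL.
# "my-resource" and "gpt-4o-mini" are placeholder names; use the values
# from your own Azure OpenAI resource and deployment.
base_url = "https://my-resource.openai.azure.com"
deployment = "gpt-4o-mini"
api_version = "2024-06-01"  # example API version

endpoint = (
    f"{base_url}/openai/deployments/{deployment}"
    f"/chat/completions?api-version={api_version}"
)
print(endpoint)
# The API key is sent as an "api-key" HTTP header with each request.
```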

Rate limit

You can set a rate limit per IP address per hour or per tenant ID per minute. The rate limit setting for GPT in LoyJoy controls how often users can send GPT requests within a given time frame. If the GPT prompt module is used or intensive tests are carried out, we recommend increasing the limit to 360 requests per hour. The number of tokens and the costs of each request can be viewed in the β€œMessage” tab in the knowledge area.

Self-Hosted Models

If you want to host your own model, you can select Ollama or vLLM to connect to your model hosting.

Ollama

Ollama is an open source project that serves as a platform for running LLMs on your own infrastructure. You can use Ollama to run and connect to your own model. You can find more information about Ollama on the Ollama website.
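As a minimal sketch of what connecting to a self-hosted Ollama instance involves: Ollama exposes a REST API, by default on port 11434, and generation requests are plain JSON. The model name "llama3" and the prompt are examples only; the model must match one pulled on your server:

```python
import json

# Ollama's REST API listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Example request body for a single (non-streamed) completion.
payload = {
    "model": "llama3",               # placeholder; must match a pulled model
    "prompt": "Why is the sky blue?",
    "stream": False,                 # return one JSON object instead of a stream
}
body = json.dumps(payload)
print(body)

# Sending it requires a running Ollama server, e.g. with urllib:
# import urllib.request
# req = urllib.request.Request(
#     OLLAMA_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```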

vLLM

vLLM is an open source library for LLM inference and serving. vLLM includes an OpenAI-compatible API that can be integrated with LoyJoy. You can find more information about vLLM in the vLLM repository.
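Because vLLM's server speaks the OpenAI API, a request against it looks like a standard chat completion call. A minimal sketch, assuming a vLLM server started locally (e.g. with `vllm serve <model>`) on its default address `http://localhost:8000/v1`; the model name below is a placeholder that must match the model actually being served:

```python
import json

# vLLM's OpenAI-compatible server listens on localhost:8000 by default.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# Example chat completion request in the OpenAI wire format.
payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)
print(body)

# Sending it requires a running vLLM server:
# import urllib.request
# req = urllib.request.Request(
#     VLLM_URL, data=body.encode(),
#     headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```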

Llama-3 70b: Generative AI 100% Made in Germany πŸ‡©πŸ‡ͺ

The integration of Meta's open-weight model Llama-3 70b into the LoyJoy Platform offers a new LLM hosted by our German partner primeLine AI (part of the primeLine Group) in Limburg an der Lahn, Germany, ensuring that everything is 100% made in Germany.

Key Features

  • Hosted in an ISO 27001 certified data center of partimus GmbH in Limburg an der Lahn (part of the primeLine Group)
  • Privacy focused: Data stays in Germany
  • No Azure OpenAI hosting required
  • 100% Made in Germany
  • As always, full GDPR compliance

Interested? Contact us for activation.