GPT Lab has a robust privacy architecture that ensures sensitive and personal information is not sent to third-party LLM providers. Here are the key points:
GPT Lab is designed to be self-hosted, meaning all data processing and storage happens within our own infrastructure. This keeps both our data and yours completely isolated and prevents any leakage to external parties.
GPT Lab supports running open-source LLMs like Llama, Mixtral and Falcon locally on our own hardware (GPU or CPU). This enables a truly air-gapped setup in which no data ever leaves our premises. Local inference is supported through Ollama, GPT4All and FastChat servers.
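To make the air-gapped setup concrete, here is a minimal sketch of querying a locally hosted model through an Ollama server using only the Python standard library. The endpoint shown is Ollama's default; the model name `llama3` is an assumption for illustration, not a GPT Lab default.

```python
# Minimal sketch: send a prompt to a locally running Ollama server, so the
# prompt data never leaves the machine. Host and model name are assumptions.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint


def build_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    """POST the prompt to the local Ollama server and return the completion."""
    payload = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because the request goes to `localhost`, no third-party network hop is involved; the same pattern applies to GPT4All and FastChat, which expose comparable local HTTP APIs.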
GPT Lab can also integrate with cloud LLM providers such as Replicate, TogetherAI and HuggingFace. Under their data-processing terms, these providers do not retain prompt data, which helps keep your information private.
In our self-hosted community version, personally identifiable information (PII) is detected and anonymized before ingestion into the on-premise vector database. We also incorporate an "Anonymize Scanner" (based on LLM Guard and/or Microsoft Presidio) that acts as a digital guardian: it screens user prompts and removes sensitive data before any prompt is sent to a third-party cloud LLM provider. This step is not necessary when the LLM is self-hosted.
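The anonymization step can be pictured as follows. In production this is handled by Microsoft Presidio or LLM Guard; the regex patterns below are a deliberately simplified stand-in to show the idea of replacing detected PII with typed placeholders before the prompt leaves the premises.

```python
# Simplified illustration of prompt anonymization. Real deployments use
# Microsoft Presidio or LLM Guard; these regexes are a toy stand-in, not
# the production scanner.
import re

# Hypothetical detection patterns for two common PII types.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s()-]{7,}\d"),
}


def anonymize_prompt(prompt: str) -> str:
    """Replace detected PII with typed placeholders so the prompt can be
    forwarded to a third-party cloud LLM provider without exposing it."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"<{label}>", prompt)
    return prompt
```

For example, `anonymize_prompt("Mail me at jane.doe@example.com")` yields `"Mail me at <EMAIL>"`. Presidio works the same way in principle, but with NLP-based recognizers for many more entity types and configurable anonymization operators.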
In summary, GPT Lab's self-hosted nature, local LLM support, cloud integration options without data retention, and anonymization of personal information provide a robust privacy architecture to prevent sensitive data exposure. This aligns with the "Privacy by Design" principle of EUDPR/GDPR and IT security best practices.
If you want to learn more about the privacy architecture, please reach out to Antonio De Marinis (DIS1 - GPT-Lab Solution Architect).