Service Online

Paragon DevOps AI Assistant

Internal AI coding assistant - data stays within Altera Health Azure tenant

Model

Qwen 2.5 Coder 14B
AWQ Quantized

Context Window

131,072 tokens
~100,000 words

Infrastructure

Azure A100 GPU
Auto-scales 1-3 nodes

Concurrent Users

~50 users
Tested and verified

Connect via OpenCode

1

Open OpenCode and press M to switch models

2

Select vllm / Qwen 2.5 Coder 14B from the model list

3

Start coding - your requests stay within Altera Health Azure tenant

API Endpoint

1

Base URL: https://ai-vllm.nicemoss-b01fbd49.eastus.azurecontainerapps.io/v1

2

Compatible with OpenAI API - use any OpenAI SDK with the base URL above

Schedule: Always on during IST (8am-6pm) and ET (8am-6pm) weekdays. Outside hours scales to zero - first request triggers ~5 min warm-up.