Duke AI Suite Models Guide
Not sure which model to use? Instead of memorizing model names, think in terms of clusters based on capability, access, and use case. Models can appear in more than one cluster, so pick the cluster that matches your needs. We regularly evaluate new models and retire deprecated ones, so this list is subject to change.
On-Prem Models (Private & Controlled)
What This Means: These models are hosted on Duke-managed GPUs within Duke’s campus infrastructure, so your data never leaves the Duke network or Duke servers.
Use When: You prioritize privacy and data control above all else.
Models: Mistral On-Prem
Best For:
- Research use cases (e.g., pre-publication data)
- Internal-only tools like department-specific chatbots
- Student-related data and academic projects where keeping data on campus is ideal
Note: These models are approved for use with sensitive data under Duke policy, except for PHI or health-related data.
Cloud-Hosted Models (Azure-Backed, Secure)
What This Means: Our cloud models are all hosted via Azure under Duke’s data security agreement with Microsoft.
Use When: You need advanced model capabilities or a variety of models.
Available Models:
- GPT-5-chat, GPT-5, GPT-5-mini, GPT-5-nano, GPT-OSS 120B
- Llama 4 Scout, Llama 4 Maverick, Llama 3.3 70B
- Removed from DukeGPT and deprecated in myGPT Builder and AI Gateway: GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, o3, o4-mini
Best For:
- Sophisticated reasoning, long context, and high-performance tasks
- AI-enhanced learning and faculty tools
- Experimentation across a variety of models: you can run simultaneous requests, compare outputs, and find the best fit for your needs
Note: These models are approved for use with sensitive data under Duke policy, except for PHI or health-related data.
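The side-by-side comparison mentioned under Best For can be sketched as follows. This is a minimal illustration, not Duke's API: `ask_model` is a hypothetical placeholder for whatever client call you actually use, and here it only returns a canned string so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

# Model names as listed in this guide.
MODELS = ["GPT-5-mini", "Llama 4 Scout", "GPT-OSS 120B"]


def ask_model(model: str, prompt: str) -> str:
    """Hypothetical placeholder: replace with a real client call."""
    return f"[{model}] response to: {prompt}"


def compare_models(prompt: str, models: list[str]) -> dict[str, str]:
    """Send the same prompt to several models at once and collect the replies."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # pool.map preserves input order, so replies line up with `models`.
        replies = pool.map(lambda m: ask_model(m, prompt), models)
        return dict(zip(models, replies))


if __name__ == "__main__":
    results = compare_models("Summarize this abstract.", MODELS)
    for model, reply in results.items():
        print(f"{model}: {reply}")
```

Running the requests in threads keeps the total wait close to that of the slowest model rather than the sum of all of them.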
Reasoning Models (Deep Logic & Long-Form Thinking)
What This Means: Typical AI models just produce an answer one word at a time. Reasoning models go a step further by constructing logical chains of thought to solve complex, multi-step problems.
Use When: You’re tackling a complex problem that requires extended logic, memory, or planning.
Available Models: GPT-5
Best For:
- Long-form writing or research synthesis
- Planning, simulations, step-by-step breakdowns
Not Great For: Rapid chat or casual Q&A, because each response takes longer and uses more resources while the model “reasons.”
Rate-Limited Models (High Power, Use Wisely)
What This Means: Some models in DukeGPT have usage limits to prevent runaway costs and ensure fair access. Users have a defined limit on these models, which resets daily.
Use When: You need the best model output, but in limited quantities.
Affected Models: GPT-5-chat, GPT-5
Recommended Backups: GPT-5-mini, GPT-5-nano, or any on-prem model
Best For:
- Research synthesis, paper writing, coding deep dives
- Anything where quality outweighs quantity
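One way to act on the backup recommendation is to fall back automatically once a rate-limited model's daily allowance is used up. The sketch below is illustrative only: `DAILY_LIMIT` is a made-up number (this guide does not state the actual limits), and `used_today` is a hypothetical stand-in for however you track your own usage.

```python
# Rate-limited models and recommended backups, as listed above.
RATE_LIMITED = {"GPT-5-chat", "GPT-5"}
BACKUPS = ["GPT-5-mini", "GPT-5-nano"]

# Assumption: a placeholder daily limit; the real limits are not given here.
DAILY_LIMIT = 50


def pick_model(preferred: str, used_today: dict[str, int]) -> str:
    """Return `preferred` unless it is rate-limited and its allowance is spent."""
    if preferred not in RATE_LIMITED:
        return preferred
    if used_today.get(preferred, 0) < DAILY_LIMIT:
        return preferred
    # Allowance spent: fall back to the first recommended backup.
    return BACKUPS[0]
```

For example, `pick_model("GPT-5", {"GPT-5": 50})` falls back to `"GPT-5-mini"`, while the same call with an empty usage record keeps `"GPT-5"`.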
STEM-Focused Models (Engineering, Math, Code)
What This Means: Some models are better than others at coding, logic, and math-based queries. Try different models to see what works best, but you can start with these.
Use When: You need help with logic-heavy, numerical, or technical problems.
Available Models: Mistral 24B, GPT-5
Best For:
- Coding assignments or Python notebooks
- Solving math problems and generating formulas
- Logic-heavy reasoning tasks and structured responses
Multilingual Models (Language Flexibility)
Use When: You need responses in non-English languages or support across multilingual materials.
Available Models: All models
Best For:
- Translating course content or creating global-facing tools
- Supporting international students in their native language
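Because models can appear in more than one cluster, it can help to treat the clusters as sets and intersect them. The sketch below copies the cluster memberships from the sections above; the key names and the `models_for` helper are illustrative assumptions, and the Multilingual cluster is omitted because it covers all models.

```python
# Cluster memberships copied verbatim from this guide (key names are illustrative).
CLUSTERS = {
    "on_prem": {"Mistral On-Prem"},
    "cloud": {
        "GPT-5-chat", "GPT-5", "GPT-5-mini", "GPT-5-nano", "GPT-OSS 120B",
        "Llama 4 Scout", "Llama 4 Maverick", "Llama 3.3 70B",
    },
    "reasoning": {"GPT-5"},
    "rate_limited": {"GPT-5-chat", "GPT-5"},
    "stem": {"Mistral 24B", "GPT-5"},
}


def models_for(*needs: str) -> list[str]:
    """Return the models that appear in every requested cluster, sorted by name."""
    if not needs:
        # No constraints: every model in any cluster qualifies.
        return sorted(set().union(*CLUSTERS.values()))
    return sorted(set.intersection(*(CLUSTERS[n] for n in needs)))


if __name__ == "__main__":
    print(models_for("reasoning", "stem"))
```

An empty result, such as for `models_for("on_prem", "cloud")`, means no single model satisfies every requested cluster.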
Model Costs
| Model | Company | Cloud vs. On-Prem | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|---|---|
| Llama 3.3 | Meta | Cloud | $0.71 | $0.71 |
| Llama 4 Maverick | Meta | Cloud | $0.35 | $1.41 |
| Llama 4 Scout | Meta | Cloud | $0.20 | $0.78 |
| GPT-5, GPT-5-chat | OpenAI | Cloud | $1.25 | $10.00 |
| GPT-5-mini | OpenAI | Cloud | $0.25 | $2.00 |
| GPT-5-nano | OpenAI | Cloud | $0.05 | $0.40 |
| GPT-OSS 120B | OpenAI | Cloud | $0.15 | $0.06 |
| text-embedding-3-small | OpenAI | Cloud | $0.02 | - |
| Mistral | Mistral | On-premise | no cost | no cost |
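To turn the per-million-token prices into a per-request estimate, multiply each token count by its price and divide by one million. The sketch below copies the prices from the table above; the `estimate_cost` helper and the token counts in the example are illustrative assumptions, and the embedding model is left out because it has no output price.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "Llama 3.3": (0.71, 0.71),
    "Llama 4 Maverick": (0.35, 1.41),
    "Llama 4 Scout": (0.20, 0.78),
    "GPT-5": (1.25, 10.00),
    "GPT-5-chat": (1.25, 10.00),
    "GPT-5-mini": (0.25, 2.00),
    "GPT-5-nano": (0.05, 0.40),
    "GPT-OSS 120B": (0.15, 0.06),
    "Mistral": (0.0, 0.0),  # on-prem, listed as no cost
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    for name in ("GPT-5", "GPT-5-mini", "Mistral"):
        print(f"{name}: ${estimate_cost(name, 2000, 500):.4f}")
```

For that hypothetical request, GPT-5 works out to about $0.0075 and GPT-5-mini to about $0.0015, which shows how quickly output tokens dominate the cost on the larger models.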
Article number: KB0038832
Valid to: October 28, 2026