Duke AI Suite Models Guide
Not sure which model to use? Instead of memorizing model names, think in terms of clusters based on capability, access, and use case. Models can appear in more than one category, so pick the cluster that matches your needs. Additionally, we are always evaluating new models and sunsetting old or deprecated models, so this list is subject to change.
On-Prem Models (Private & Controlled)
What This Means: These models are hosted on Duke-managed GPUs within Duke’s campus infrastructure, so your data never leaves the Duke network or Duke servers.
Use When: You prioritize privacy and data control above all else.
Available Models: Mistral On-Prem
Best For:
- Research use cases (e.g., pre-publication data)
- Internal-only tools like department-specific chatbots
- Student-related data and academic projects where keeping data on campus is ideal
Note: These models are approved for use with sensitive data under Duke policy, except for PHI or health-related data.
Cloud-Hosted Models (Azure and OpenAI Backed, Secure)
What This Means: Our cloud models are all hosted via Azure under Duke’s data security agreement with Microsoft.
Use When: You need advanced model capabilities or a variety of models.
Available Models: See Model tables below
Best For:
- Sophisticated reasoning, long context, high performance tasks
- AI-enhanced learning and faculty tools
- Experimentation: a variety of models lets users run simultaneous requests, compare outputs, and find the best fit for their needs
Note: These models are approved for use with sensitive data under Duke policy, except for PHI or health-related data.
Reasoning Models (Deep Logic & Long-Form Thinking)
What This Means: Most AI models simply generate an answer one token at a time. Reasoning models go a step further, constructing logical chains of thought to solve complex, multi-step problems.
Use When: You’re tackling a complex problem that requires extended logic, memory, or planning.
Available Models: GPT-5.2, o3-deep-research, o4-mini-deep-research
Best For:
- Long-form writing or research synthesis
- Planning, simulations, step-by-step breakdowns
Not Great For: Rapid chat or casual Q&A, since the extended reasoning takes longer and consumes more resources per response.
Rate-Limited Models (High Power, Use Wisely)
What This Means: Some models in DukeGPT have usage limits to prevent runaway costs and ensure fair access. Users have a defined limit on these models, which resets daily.
Use When: You need the best model output, but in limited quantities.
Affected Models: See Model Costs tables below
Recommended Backups: Any on-prem model
Best For:
- Research synthesis, paper writing, coding deep dives
- Anything where quality outweighs quantity
Specialty Models (Code, Engineering, Math)
What This Means: Some models are purpose-built for tasks beyond general chat, such as coding, logic and math-based queries, generating text embeddings, and transcribing speech to text.
Use When: You need help with logic-heavy, numerical, or technical problems.
Available Models:
- Coding Models: GPT-5.2-codex, GPT-5.1-codex, GPT-5.1-codex-mini, GPT-5.1-codex-max
- Speech-to-Text Models: GPT-4o-transcribe, GPT-4o-transcribe-diarize, whisper-1
- Text-Embedding Models: text-embedding-3-small, text-embedding-3-large
Best For:
- Coding assignments or Python notebooks
- Solving math problems, formula generation
- Logic-heavy reasoning tasks and structured responses
General Models & Costs
| Model | Company | Cloud vs. On-Prem | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
| --- | --- | --- | --- | --- |
| GPT-5.2, GPT-5.2-chat | OpenAI | Cloud | $1.75 | $14.00 |
| GPT-5.1, GPT-5.1-chat | OpenAI | Cloud | $1.25 | $10.00 |
| GPT-5, GPT-5-chat | OpenAI | Cloud | $1.25 | $10.00 |
| GPT-5-mini | OpenAI | Cloud | $0.25 | $2.00 |
| GPT-5-nano | OpenAI | Cloud | $0.05 | $0.40 |
| GPT-4.1 | OpenAI | Cloud | $2.00 | $8.00 |
| GPT-4.1-mini | OpenAI | Cloud | $0.40 | $1.60 |
| GPT-4.1-nano | OpenAI | Cloud | $0.10 | $0.40 |
| GPT-OSS 120B | OpenAI | Cloud | $0.15 | $0.60 |
| Llama 3.3 | Meta | Cloud | $0.71 | $0.71 |
| Llama 4 Maverick | Meta | Cloud | $0.35 | $1.41 |
| Llama 4 Scout | Meta | Cloud | $0.20 | $0.78 |
| Mistral | Mistral | On-Prem | No cost | No cost |
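As a rough guide, the per-request cost of a token-priced model can be estimated from the per-1M-token rates in the table above. A minimal sketch (the model, rates, and token counts below are illustrative examples; actual billing is handled by the platform):

```python
def request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Estimate the dollar cost of one request.

    Rates are dollars per 1M tokens, as listed in the cost tables.
    """
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: GPT-5-mini ($0.25 input / $2.00 output per 1M tokens)
# with a 10,000-token prompt and a 2,000-token response.
cost = request_cost(10_000, 2_000, 0.25, 2.00)
print(f"${cost:.4f}")  # → $0.0065
```

Note that output tokens are typically several times more expensive than input tokens, so long generated responses dominate the cost of most requests.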
Specialty Models & Costs
| Model | Company | Cloud vs. On-Prem | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
| --- | --- | --- | --- | --- |
| GPT-5.2-codex | OpenAI | Cloud | $1.75 | $14.00 |
| GPT-5.1-codex | OpenAI | Cloud | $1.25 | $10.00 |
| GPT-5.1-codex-mini | OpenAI | Cloud | $0.25 | $2.00 |
| GPT-5.1-codex-max | OpenAI | Cloud | $1.25 | $10.00 |
| GPT-4o-transcribe | OpenAI | Cloud | $2.50 | $10.00 |
| GPT-4o-transcribe-diarize | OpenAI | Cloud | $2.50 | $10.00 |
| o4-mini | OpenAI | Cloud | $1.10 | $4.40 |
| o4-mini-deep-research | OpenAI | Cloud | $2.00 | $8.00 |
| o3 | OpenAI | Cloud | $10.00 | $40.00 |
| o3-deep-research | OpenAI | Cloud | $10.00 | $40.00 |
| text-embedding-3-small | OpenAI | Cloud | $0.02 | - |
| text-embedding-3-large | OpenAI | Cloud | $0.13 | - |
| whisper-1 | OpenAI | Cloud | $0.006 per minute of audio | - |
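Unlike the token-priced models, whisper-1 is billed per minute of audio rather than per token. A quick sketch of estimating a transcription cost (the recording length below is an illustrative example):

```python
WHISPER_RATE_PER_MINUTE = 0.006  # dollars per minute of audio, from the table above

def transcription_cost(audio_seconds):
    """Estimate the whisper-1 cost for an audio file of the given length."""
    return (audio_seconds / 60) * WHISPER_RATE_PER_MINUTE

# Example: a 45-minute lecture recording.
print(f"${transcription_cost(45 * 60):.2f}")  # → $0.27
```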
Article number: KB0038832
Valid to: February 2, 2027