GPT-4.1 mini

LLM

OpenAI

GPT-4.1 mini matches or exceeds GPT-4o performance while cutting latency by nearly half and cost by 83%. It offers a 1M-token context window and strong general-purpose reasoning.

Context tokens

1,047,576

Output tokens

8,192

Docs

Model documentation

Training cutoff

Jun 1, 2024

Released

Apr 14, 2025

Schema

We handle the model, stream, service_tier, safety_identifier, and user parameters.

Schema documentation
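As a rough illustration of the schema note above, the sketch below drops the five platform-handled parameters from a request draft before it is submitted. It assumes that "handle" means the platform sets these values itself, so caller-supplied values are unnecessary. The helper name strip_managed_params and the sample fields temperature and max_output_tokens are hypothetical and not taken from the schema documentation.

```python
# Parameter names copied from the schema note above.
MANAGED_PARAMS = frozenset(
    {"model", "stream", "service_tier", "safety_identifier", "user"}
)


def strip_managed_params(request: dict) -> dict:
    """Return a copy of `request` without the platform-handled keys.

    Hypothetical helper: it assumes the platform fills these keys in itself.
    """
    return {key: value for key, value in request.items() if key not in MANAGED_PARAMS}


if __name__ == "__main__":
    # `temperature` and `max_output_tokens` are illustrative fields only.
    draft = {
        "model": "gpt-4.1-mini",
        "stream": True,
        "temperature": 0.2,
        "max_output_tokens": 512,
    }
    print(strip_managed_params(draft))
```

The original dictionary is left unchanged, so the same draft can still be logged or reused elsewhere.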

Capabilities

Tool use

Structured output

OpenAI API

Multilingual

Multimodal

Image gen (tool)

Supported languages

Supported input media

image

pdf

text

Supported tools

Image Generation
The image generation tool generates images from a text prompt and, optionally, image inputs. It uses the GPT Image model and automatically optimizes text inputs for improved performance.
Costs ~559 credits per image.
MCP
You can give models new capabilities using connectors and remote MCP servers. These tools let the model connect to and control external services when needed to respond to a user's prompt. Tool calls can either be allowed automatically or require explicit approval from you as the developer.
Seclai Content Tools
Inspect source documents connected to your account. Includes tools for loading full content, reading character ranges, searching within documents, viewing stats, and listing available content sources. When a source_connection_content_version_id is provided in agent run metadata, it is used as the default. Otherwise, the model can discover content via list_content_sources.
Seclai Knowledge Base
Search your knowledge bases using semantic similarity. Includes search_knowledge_base and list_knowledge_bases. When a knowledge_base_id is provided in the prompt or agent run metadata, it is used as the default. Otherwise, the model can discover available knowledge bases at runtime.
Seclai Memory Banks
Manage persistent memory across agent runs. Includes tools for listing memory banks, writing entries, searching memory via semantic similarity, and loading entries in chronological order. Supports two memory types: 'conversation' (speaker-attributed turns) and 'general' (freeform text). Use key to organize entries by topic, session, or user.
Seclai Web Tools
Fetch web pages and search the web from within agent prompt calls. Includes seclai_web_fetch for retrieving page content in markdown, HTML, or plain text, and seclai_web_search for finding relevant pages with content snippets.
Web Search
Web search allows models to access up-to-date information from the internet and provide answers with sourced citations. To enable this, use the web search tool in the Responses API or, in some cases, Chat Completions.

Pricing

Type	Credits	Units
Cache hit	1.33	Credits per 1k tokens

Variants

Tier

Priority and flex tiers trade speed for cost. The options listed for this model are Priority and Standard.

Option	Description	Input credits (per 1k tokens)	Output credits (per 1k tokens)
Priority	Priority processing delivers significantly lower and more consistent latency compared to Standard processing while keeping pay-as-you-go flexibility. Priority processing is ideal for high-value, user-facing applications with regular traffic where latency is paramount. Priority processing should not be used for data processing, evaluations, or other highly erratic traffic.	9.31	37.24
Standard	—	5.32	21.28
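The per-1k-token rates in the tier table can be turned into a rough per-request estimate. The sketch below assumes billing is linear in token count, with no rounding, minimum charge, or cache discount applied. The function name estimate_credits is hypothetical, and the rates are copied from the table above.

```python
# Credits per 1k tokens, copied from the tier table above.
TIER_RATES = {
    "standard": {"input": 5.32, "output": 21.28},
    "priority": {"input": 9.31, "output": 37.24},
}


def estimate_credits(input_tokens: int, output_tokens: int, tier: str = "standard") -> float:
    """Estimate credits for one request.

    Assumes simple linear per-1k-token billing; actual billing may differ.
    """
    rates = TIER_RATES[tier]
    input_cost = (input_tokens / 1000) * rates["input"]
    output_cost = (output_tokens / 1000) * rates["output"]
    return input_cost + output_cost


if __name__ == "__main__":
    # Compare both tiers for a request with 10,000 input and 1,000 output tokens.
    for tier in TIER_RATES:
        print(tier, round(estimate_credits(10_000, 1_000, tier), 2))
```

Under these assumptions, a request with 10,000 input tokens and 1,000 output tokens works out to 74.48 credits on Standard and 130.34 on Priority. Both Priority rates are 1.75 times the corresponding Standard rates.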