GPT-5 nano

LLM
OpenAI

GPT-5 nano is the lightest and fastest GPT-5 variant, built for high-volume, cost-sensitive, and latency-constrained deployments while retaining reasoning capability at a smaller scale.

Context tokens

400,000

Output tokens

8,192

Training cutoff

May 1, 2024

Released

Aug 7, 2025

Schema

We handle the model, stream, service_tier, safety_identifier, and user parameters.

Schema documentation
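
To make this concrete, here is a hypothetical sketch of a request payload in the OpenAI Chat Completions style; the surrounding wrapper and field choices are assumptions, and the point is simply that the platform-managed parameters are not set by you:

```python
# Hypothetical sketch of an OpenAI-style Chat Completions payload submitted
# through Seclai. "model", "stream", "service_tier", "safety_identifier",
# and "user" are intentionally absent -- the platform sets them for you.
payload = {
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize this release note in three bullets."},
    ],
    "max_completion_tokens": 1024,  # an example of a parameter you would still control
}
```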

Capabilities

Tool use
Structured output
Thinking
OpenAI API
Multilingual
Multimodal

Supported languages

ar
cs
da
de
el
en
es
fi
fr
hi
hu
id
it
ja
ko
nl
no
pl
pt
ro
ru
sv
th
tr
vi
zh

Supported input media

image
text

Supported tools

  • MCP

    You can give models new capabilities using connectors and remote MCP servers. These tools let the model connect to and control external services when needed to respond to a user's prompt. Tool calls can either be allowed automatically or gated behind explicit approval from you as the developer.

  • Seclai Content Tools

    Inspect source documents connected to your account. Includes tools for loading full content, reading character ranges, searching within documents, viewing stats, and listing available content sources. When a source_connection_content_version_id is provided in agent run metadata, it is used as the default; otherwise, the model can discover content via list_content_sources.

  • Seclai Knowledge Base

    Search your knowledge bases using semantic similarity. Includes search_knowledge_base and list_knowledge_bases. When a knowledge_base_id is provided in the prompt or agent run metadata, it is used as the default; otherwise, the model can discover available knowledge bases at runtime.

  • Web Search

    Web search allows models to access up-to-date information from the internet and provide answers with sourced citations. To enable this, use the web search tool in the Responses API or, in some cases, Chat Completions. A minimal configuration sketch for MCP and web search follows this list.
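
The Seclai Content Tools and Knowledge Base tools above are invoked by the platform on the model's behalf. For MCP and web search, the sketch below shows how the tool definitions look in the OpenAI Responses API; the MCP server label and URL are placeholders, and the exact way Seclai forwards these definitions may differ:

```python
from openai import OpenAI

client = OpenAI()

# Minimal sketch: attach a remote MCP server and the built-in web search tool.
# The MCP server_label and server_url are placeholders; require_approval shows
# the explicit-approval control described above ("never" auto-allows calls).
response = client.responses.create(
    model="gpt-5-nano",
    tools=[
        {
            "type": "mcp",
            "server_label": "example_mcp",
            "server_url": "https://example.com/mcp",
            "require_approval": "always",
        },
        # Some SDK/model versions expect "web_search_preview" instead.
        {"type": "web_search"},
    ],
    input="Check the latest release notes and cite your sources.",
)

print(response.output_text)
```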

Pricing

Type      | Credits | Units
Cache hit | 0.07    | Credits per 1k tokens

Variants

Tier

Priority and flex tiers trade off speed against cost.

Option   | Input credits (per 1k tokens) | Output credits (per 1k tokens)
Flex     | 0.33                          | 2.66
Standard | 0.67                          | 5.32

Flex processing provides lower costs for Responses or Chat Completions requests in exchange for slower response times and occasional resource unavailability. It's ideal for non-production or lower-priority tasks, such as model evaluations, data enrichment, and asynchronous workloads.
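
To make the credit arithmetic concrete, here is a small worked example with made-up token counts; it assumes the cache-hit rate above applies to cached input tokens regardless of tier:

```python
# Made-up request: 10,000 input tokens (4,000 of them prompt-cache hits)
# and 2,000 output tokens.
CACHE_HIT = 0.07  # credits per 1k cached input tokens (from the table above)
TIERS = {
    "flex":     {"input": 0.33, "output": 2.66},  # credits per 1k tokens
    "standard": {"input": 0.67, "output": 5.32},
}

def request_credits(tier: str, input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Credits for one request, assuming cached input bills at the cache-hit rate."""
    rates = TIERS[tier]
    uncached = input_tokens - cached_tokens
    return (
        uncached / 1000 * rates["input"]
        + cached_tokens / 1000 * CACHE_HIT
        + output_tokens / 1000 * rates["output"]
    )

print(round(request_credits("standard", 10_000, 4_000, 2_000), 2))  # 14.94 credits
print(round(request_credits("flex",     10_000, 4_000, 2_000), 2))  # 7.58 credits
```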