Qwen3.6 35B A3B

LLM
Qwen

Qwen3.6 35B A3B is designed for agentic coding and developer workflows. This mixture-of-experts (MoE) model activates 3B of its 35B parameters, supports a 262K-token context window, and offers a thinking mode with reasoning preservation across sessions.

Context tokens

262,144

Output tokens

8,192

Released

Apr 21, 2026

Schema

- Thinking mode is ON by default; responses include `<think>\n...\n</think>` blocks before the final answer. To disable it, pass `chat_template_kwargs: {enable_thinking: false}` (self-hosted vLLM/SGLang) or `enable_thinking: false` in `extra_body` (Alibaba Cloud DashScope API).
- The `/think` and `/nothink` soft-switch tokens from Qwen3 are NOT supported.
- `top_k` and `min_p` are not standard OpenAI parameters; they must be passed via `extra_body` in the OpenAI SDK.
- Multi-turn conversations: thinking content (`<think>…</think>`) must be stripped from assistant history turns, so that only the final response is included.
- Inference servers must be launched with `--reasoning-parser qwen3`; tool-call parsing requires `--tool-call-parser qwen3_coder` separately.
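The request-side notes above can be sketched in Python. This is a minimal illustration, not an official client: the helper names (`strip_thinking`, `clean_history`, `build_extra_body`) and the `backend` labels are hypothetical, and placing `chat_template_kwargs` inside `extra_body` for self-hosted servers is an assumption about how the OpenAI SDK forwards non-standard fields.

```python
from __future__ import annotations

import re
from typing import Any, Optional

# Matches a complete <think>...</think> block plus any trailing whitespace.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(text: str) -> str:
    """Remove <think>...</think> blocks, keeping only the final answer."""
    return _THINK_BLOCK.sub("", text).strip()


def clean_history(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of the history with thinking removed from assistant turns."""
    cleaned = []
    for message in messages:
        content = message.get("content")
        if message.get("role") == "assistant" and isinstance(content, str):
            # Copy the turn so the caller's original history is not mutated.
            message = {**message, "content": strip_thinking(content)}
        cleaned.append(message)
    return cleaned


def build_extra_body(
    backend: str,
    enable_thinking: bool = True,
    top_k: Optional[int] = None,
    min_p: Optional[float] = None,
) -> dict[str, Any]:
    """Collect the non-standard request fields for the SDK's `extra_body`.

    `backend` is a hypothetical label: "dashscope" for the hosted API,
    anything else is treated as a self-hosted vLLM/SGLang server.
    """
    extra: dict[str, Any] = {}
    if backend == "dashscope":
        extra["enable_thinking"] = enable_thinking
    else:
        # Assumption: self-hosted servers read this key from the request body.
        extra["chat_template_kwargs"] = {"enable_thinking": enable_thinking}
    if top_k is not None:
        extra["top_k"] = top_k
    if min_p is not None:
        extra["min_p"] = min_p
    return extra
```

With the OpenAI SDK, the results would typically be passed as `messages=clean_history(history)` and `extra_body=build_extra_body("vllm", enable_thinking=False, top_k=20)` on a chat completion call; the sampling values shown here are placeholders, not recommended settings.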


Capabilities

Tool use
Structured output
Thinking
Multimodal

Supported input media

image
text

Supported tools

  • Seclai Content Tools

    Inspect source documents connected to your account. Includes tools for loading full content, reading character ranges, searching within documents, viewing stats, and listing available content sources. When a source_connection_content_version_id is provided in agent run metadata it is used as the default. Otherwise the model can discover content via list_content_sources.

  • Seclai Knowledge Base

    Search your knowledge bases using semantic similarity. Includes search_knowledge_base and list_knowledge_bases. When a knowledge_base_id is provided in the prompt or agent run metadata it is used as the default. Otherwise the model can discover available knowledge bases at runtime.

  • Seclai Memory Banks

    Manage persistent memory across agent runs. Includes tools for listing memory banks, writing entries, searching memory via semantic similarity, and loading entries in chronological order. Supports two memory types: 'conversation' (speaker-attributed turns) and 'general' (freeform text). Use key to organize entries by topic, session, or user.

  • Seclai Web Tools

    Fetch web pages and search the web from within agent prompt calls. Includes seclai_web_fetch for retrieving page content in markdown, HTML, or plain text, and seclai_web_search for finding relevant pages with content snippets.

Pricing

Type    Credits  Units
Input   3.33     Credits per 1k tokens
Output  26.60    Credits per 1k tokens
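As a rough illustration of how these rates combine, the sketch below estimates the credits for a single call. It assumes charges are prorated linearly per token with no rounding or minimum charge, which this page does not confirm; `estimate_credits` is a hypothetical helper name.

```python
from decimal import Decimal

# Rates from the pricing table, in credits per 1k tokens.
INPUT_CREDITS_PER_1K = Decimal("3.33")
OUTPUT_CREDITS_PER_1K = Decimal("26.60")


def estimate_credits(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate credits for one call, assuming linear per-token proration."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_CREDITS_PER_1K
        + Decimal(output_tokens) * OUTPUT_CREDITS_PER_1K
    )
    return total / Decimal(1000)


# Example: a 12,000-token prompt with an 800-token response.
print(estimate_credits(12_000, 800))  # 61.24
```

Output tokens cost roughly eight times as much as input tokens, so response length tends to dominate the estimate for short prompts.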

Variants

No variants available for this model.

Try This Model

Write a prompt and experiment with Qwen3.6 35B A3B on the model experiments page, where you can compare it side by side with other models.