Qwen3.6-27B

Overview

Qwen3.6-27B FP8 is a dense 27-billion-parameter multimodal model developed by Alibaba's Qwen Team, released on April 22, 2026, under the Apache 2.0 license. It is optimized for agentic coding and reasoning. According to Qwen's published scores, it outperforms the previous-generation 397B MoE flagship on all major coding benchmarks, and as a much smaller dense model it is significantly more practical to deploy.

Key Features

  • Hybrid Gated DeltaNet Architecture: Uses a novel structure of 64 layers organized as 16 repeating blocks — each with 3 Gated DeltaNet (linear attention) sublayers followed by 1 Gated Attention (full self-attention) sublayer — balancing efficiency with strong long-context performance.
  • Natively Multimodal: Supports text, image, and video inputs in a single unified checkpoint, combining a causal language model with a vision encoder.
  • Hybrid Thinking Mode: Offers a "thinking" mode (step-by-step reasoning with a configurable budget of 512–8,192 tokens) and a "non-thinking" mode (instant responses), selectable at inference time.
  • Thinking Preservation: Retains chain-of-thought reasoning traces across multi-turn conversations, improving coherence in agentic coding workflows.
  • Multi-Token Prediction (MTP): Trained with multi-token prediction for improved throughput during inference.
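
The layer layout described in the first bullet can be checked with a few lines of arithmetic. The sketch below is illustrative only: the block and sublayer counts come from the description above, while the label strings and the function name `build_layer_pattern` are hypothetical and do not correspond to identifiers in the released checkpoint.

```python
from collections import Counter

# Counts taken from the feature list; the label strings are illustrative only.
NUM_BLOCKS = 16
DELTANET_PER_BLOCK = 3
ATTENTION_PER_BLOCK = 1


def build_layer_pattern(num_blocks: int = NUM_BLOCKS) -> list[str]:
    """Return one label per sublayer, ordered from first to last."""
    block = (["gated_deltanet"] * DELTANET_PER_BLOCK
             + ["gated_attention"] * ATTENTION_PER_BLOCK)
    return block * num_blocks


pattern = build_layer_pattern()
counts = Counter(pattern)
print(len(pattern))               # 64
print(counts["gated_deltanet"])   # 48
print(counts["gated_attention"])  # 16
```

Under this reading, every fourth sublayer is full self-attention, so only 16 of the 64 layers carry the full attention cost over long contexts.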

Best Use Cases

  • Agentic Coding: Scores 77.2% on SWE-bench Verified and 59.3% on Terminal-Bench 2.0, making it well-suited for autonomous code generation, debugging, and multi-step software engineering workflows.
  • Complex Reasoning Tasks: Achieves 87.8% on GPQA Diamond and 94.1% on AIME 2026, making it a strong choice for scientific and mathematical problem-solving.

Capabilities and Limitations

Capability       | Description
Reasoning        | 87.8% GPQA Diamond, 94.1% AIME 2026, 86.2% MMLU-Pro
Coding           | 77.2% SWE-bench Verified, 53.5% SWE-bench Pro, 59.3% Terminal-Bench 2.0, 48.2% SkillsBench
Creative Writing | General-purpose text generation; primarily optimized for code and reasoning rather than creative output
Multimodal       | Text, image, and video input; text output only
Context Window   | 128k tokens
Max Output       | 32,768 tokens
Tool Use         | Native function calling and tool use support
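
When sizing a request, the limits listed above (context window, maximum output, and the 512–8,192 thinking budget) can be enforced client-side before calling the model. The following is a minimal sketch, not an official client: the helper names are hypothetical, "128k" is read as 128,000 tokens, and it assumes the prompt and the generated output share the same context window. Neither assumption is stated in this document.

```python
CONTEXT_WINDOW = 128_000  # assumption: "128k" read as 128,000 tokens
MAX_OUTPUT = 32_768
THINKING_BUDGET_RANGE = (512, 8_192)


def clamp_thinking_budget(requested: int) -> int:
    """Force a requested thinking budget into the documented 512-8,192 range."""
    low, high = THINKING_BUDGET_RANGE
    return max(low, min(high, requested))


def max_new_tokens_for(prompt_tokens: int) -> int:
    """Largest output length that fits, assuming prompt and output share the window."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    return min(MAX_OUTPUT, remaining)


print(clamp_thinking_budget(100))    # 512
print(max_new_tokens_for(10_000))    # 32768
print(max_new_tokens_for(120_000))   # 8000
```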

Known Limitations

  • Improvements on knowledge-recall benchmarks (MMLU-Pro, C-Eval) are modest compared to the coding gains; the model is better characterized as an agent/reasoning model than as a knowledge-recall upgrade.
  • Independent third-party benchmark verification remains limited as of the initial release; published coding scores use Qwen's internal agent scaffold.

Pricing

Model       | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token)
Qwen3.6-27B | 0.19                  | 0.19                        | 0.019                      | 2.99
  • Cache writes are billed at the same rate as input tokens; cache reads are billed at 0.019 credits per token.
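
To estimate the cost of a single request, the per-token rates can be multiplied by the token counts in each category. The sketch below reads the pricing row as 0.19 (input), 0.19 (cache write), 0.019 (cache read), and 2.99 (output) credits per token. The function name and the way tokens are split into categories are assumptions for illustration, not a documented billing API.

```python
# Rates in credits per token, as read from the pricing row above.
RATES = {
    "input": 0.19,
    "cache_write": 0.19,
    "cache_read": 0.019,
    "output": 2.99,
}


def estimate_credits(input_tokens: int = 0, cache_write_tokens: int = 0,
                     cache_read_tokens: int = 0, output_tokens: int = 0) -> float:
    """Sum of (token count x rate) over the four billing categories."""
    return (input_tokens * RATES["input"]
            + cache_write_tokens * RATES["cache_write"]
            + cache_read_tokens * RATES["cache_read"]
            + output_tokens * RATES["output"])


# Example: a 1,000-token prompt producing a 100-token reply.
print(estimate_credits(input_tokens=1_000, output_tokens=100))
```

Because output tokens cost roughly 16 times as much as input tokens at these rates, long generations (including thinking tokens, if they are billed as output) are likely to dominate the total.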