ModelHarbor - Model Information

💚 Qwen/Qwen3.6-35B-A3B

Budget GOAT 👑

Looking for max value? This is the ULTIMATE budget king! 💸 Qwen3.6-35B-A3B is a Mixture-of-Experts model with only about 3B active parameters out of 35B total, so each token touches a small slice of the network and inference stays fast and cheap while still delivering solid performance. It supports vision, function calling, AND computer use at the lowest input price on the platform (tied with deepseek-v4-flash at $0.16/M). No cap, this is the move for high-volume tasks that need to stay wallet-friendly!

Insanely Affordable: Starting from just $0.16/M input tokens, tied with deepseek-v4-flash for the lowest input price on ModelHarbor!
MoE Efficiency: 35B total params with only ~3B active per token = fast + cheap
Vision + Tool Use: Understands images and supports function calling and computer use despite the low price
262K Context: Massive context window for long documents and codebases
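
For a rough feel of what the MoE numbers above mean, here is a minimal sketch. The parameter counts come from the listing; the "about 2 FLOPs per active parameter per token" figure is a common rule of thumb and an assumption here, and the function names are purely illustrative (they are not part of any ModelHarbor API).

```python
# Figures taken from the listing; the FLOPs rule of thumb is an assumption.
TOTAL_PARAMS_B = 35.0   # total parameters, in billions
ACTIVE_PARAMS_B = 3.0   # approximate active parameters per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def approx_forward_flops_per_token(params_b: float) -> float:
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per active parameter.

    Ignores attention and routing overhead, so treat it as a rough estimate.
    """
    return 2.0 * params_b * 1e9


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    moe = approx_forward_flops_per_token(ACTIVE_PARAMS_B)
    dense = approx_forward_flops_per_token(TOTAL_PARAMS_B)
    print(f"Active share per token: {frac:.1%}")
    print(f"Estimated compute vs. a dense 35B model: {dense / moe:.1f}x less")
```

Under these assumptions, under a tenth of the network runs for each token, which is where the "fast + cheap" claim comes from.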

🎨 google/gemini-2.5-flash-image

Image Gen Flash ⚡

Need AI-generated images? Gemini 2.5 Flash Image is your go-to! 🎨 This model specializes in creating and understanding images with Google's latest Gemini technology. It supports vision, computer use, and web search, all for just $0.30/M input tokens. Perfect for creative projects, visual analysis, and multimodal applications!

Image Generation: Create images from text descriptions
Vision + Computer Use: Understand images and interact with screens
Web Search: Access real-time information from the internet
Super Affordable: Just $0.30/M input tokens for image-capable AI

⚡ gemini-3.1-flash-lite-preview

Speed Demon 🏎️

Need blazing-fast responses without breaking the bank? Gemini 3.1 Flash Lite is the speed champion! 🏎️ This lightweight model delivers Gemini-quality responses at lightning speed with a massive 1M token context window. It supports vision, computer use, and web search — making it incredibly versatile for its price. The ultimate choice for high-throughput applications!

1M Token Context: Tied for the largest context window in this lineup
Ultra Fast: Lite architecture means rapid responses
Full Multimodal: Vision, Computer Use, and Web Search all included
Great Value: Only $0.40/M input tokens with 1M context
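
Choosing between the budget options mostly comes down to context size versus input price. The sketch below picks the cheapest listed model whose context window fits a prompt. The prices and context sizes are copied from this page (with "262K", "100K", and "1M" rounded to 262,000, 100,000, and 1,000,000 tokens, which is an approximation); `ModelInfo` and `cheapest_fitting` are hypothetical helpers, not a ModelHarbor API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelInfo:
    name: str
    context_tokens: int       # approximate context window, in tokens
    input_price_per_m: float  # USD per million input tokens


# Values copied from the listing; context sizes are rounded approximations.
CATALOG: List[ModelInfo] = [
    ModelInfo("Qwen/Qwen3.6-35B-A3B", 262_000, 0.16),
    ModelInfo("deepseek-v4-flash", 100_000, 0.16),
    ModelInfo("gemini-3.1-flash-lite-preview", 1_000_000, 0.40),
]


def cheapest_fitting(prompt_tokens: int,
                     catalog: List[ModelInfo] = CATALOG) -> Optional[ModelInfo]:
    """Return the lowest-priced model whose context fits the prompt.

    Ties on price go to the model with the larger context window.
    Returns None when no listed model can hold the prompt.
    """
    candidates = [m for m in catalog if m.context_tokens >= prompt_tokens]
    if not candidates:
        return None
    return min(candidates, key=lambda m: (m.input_price_per_m, -m.context_tokens))


if __name__ == "__main__":
    for size in (50_000, 500_000, 2_000_000):
        choice = cheapest_fitting(size)
        print(size, "->", choice.name if choice else "no listed model fits")
```

This only compares input price and context size; capabilities such as web search or image generation would need their own filter.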

🧠 glm (GLM-5)

Reasoning Value 🧩

GLM-5 is the value champion for reasoning tasks! 🧩 This model from Zhipu AI delivers impressive thinking capabilities with built-in reasoning support and prompt caching for even better efficiency. With a massive 202K context window and function calling support, it's perfect for complex analytical tasks that don't require vision. The prompt caching feature means repeated queries cost even less!

Built-in Reasoning: Supports extended thinking for complex problems
Prompt Caching: Cache reads cost near-zero — great for repeated queries
202K Context: Large context window for detailed analysis
Function Calling: Supports parallel function calling for agentic workflows

🖌️ google/gemini-3.1-flash-image-preview

Next-Gen Image ✨

The next generation of image generation is here! ✨ Gemini 3.1 Flash Image Preview brings Google's latest multimodal capabilities with a 65K token context and 65K max output. It supports vision and web search, making it perfect for creative workflows that need both image understanding and generation. This preview model gives you early access to cutting-edge image AI technology!

Next-Gen Image AI: Latest Gemini 3.1 technology for image generation
65K Input + 65K Output: Generous context for complex image tasks
Vision + Web Search: Understand images and search the web
Preview Access: Be the first to try Google's newest image model

🔥 glm-max (GLM-5.1)

Max Performance 💎

When you need maximum performance from the GLM family, GLM-5.1 (glm-max) delivers! 💎 This is Zhipu AI's most capable model with 202K context, prompt caching, and parallel function calling. It's designed for demanding text-based tasks that require deep understanding and precise outputs. The perfect choice when you need top-tier reasoning without multimodal overhead.

Maximum GLM Performance: The most powerful model in the GLM family
202K Context: Massive context for complex document analysis
Prompt Caching: Efficient repeated queries with caching support
Parallel Function Calling: Execute multiple tools simultaneously

🚀 glm-turbo (GLM-5-Turbo)

Turbo Mode ⚡

Need GLM power at turbo speed? GLM-5-Turbo is your answer! ⚡ Same 202K context and function calling capabilities as glm-max, but optimized for faster responses. Perfect for production workloads that need both quality and speed. When you want GLM-level intelligence with minimal latency, this is the one!

Turbo Speed: Optimized for fast response times
202K Context: Same massive context window as glm-max
Prompt Caching: Efficient for repeated queries
Parallel Function Calling: Multi-tool execution support

⚡ deepseek-v4-flash

Flash Value King 👑

DeepSeek V4 Flash is the NEW budget champion! ⚡ Same $0.16/M input price as Qwen3.6-35B-A3B, but with a jaw-dropping 3.84M token max output (that's not a typo!). This model can generate massively long responses, making it perfect for code generation, long-form content, and complex multi-step tasks. Plus it supports vision, computer use, and function calling at the lowest input price on the platform!

3.84M Max Output: Generate up to 3.84 million tokens in a single response — unprecedented!
Rock Bottom Price: Just $0.16/M input, $1.00/M output, tied with Qwen3.6-35B-A3B for the lowest input price
Vision + Computer Use: Full multimodal support at budget pricing
Function Calling: Tool use for building AI agents
100K Context: Solid context window for most use cases
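
With output priced higher than input, long generations are what drive the bill. Here is a minimal cost sketch using the two rates listed above; the function name and the example token counts are illustrative assumptions, and real invoices may include factors this page does not describe.

```python
INPUT_PRICE_PER_M = 0.16   # USD per million input tokens (from the listing)
OUTPUT_PRICE_PER_M = 1.00  # USD per million output tokens (from the listing)


def request_cost_usd(input_tokens: int,
                     output_tokens: int,
                     input_price_per_m: float = INPUT_PRICE_PER_M,
                     output_price_per_m: float = OUTPUT_PRICE_PER_M) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workloads: a typical request and a maximum-length output.
    print(f"80K in / 20K out: ${request_cost_usd(80_000, 20_000):.4f}")
    print(f"3.84M output only: ${request_cost_usd(0, 3_840_000):.2f}")
```

At these rates a single maximum-length 3.84M-token response costs about $3.84 in output tokens alone, so it is worth setting an output cap on high-volume jobs.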

🌟 gemini-3-flash-preview

Balanced Pro ⚖️

Gemini 3 Flash Preview is the balanced powerhouse in Google's lineup! ⚖️ With a massive 1M token context, vision support, PDF processing, web search, and computer use — it's got everything you need for serious multimodal work. The premium pricing reflects its top-tier capabilities across all modalities. Perfect when you need the full Gemini experience with deep document understanding!

1M Token Context: Massive context for processing huge documents
PDF Support: Native PDF input processing built-in
Full Multimodal: Vision, Computer Use, and Web Search
Tiered Pricing: Different rates apply above 200K tokens

🔥 deepseek-v4-pro

Pro Powerhouse 🔋

DeepSeek V4 Pro is the premium DeepSeek model, with the same jaw-dropping 3.84M token max output as DeepSeek V4 Flash! 🔋 This is DeepSeek's most capable model, combining advanced reasoning with vision, computer use, and function calling. When you need DeepSeek's best quality output with massive generation length, this is the one. A serious contender in the pro-tier model space!

3.84M Max Output: Generate up to 3.84 million tokens — unprecedented for a pro model
Pro Quality: DeepSeek's most capable model for demanding tasks
Vision + Computer Use: Full multimodal understanding and screen interaction
Function Calling: Tool use for sophisticated AI agents
100K Context: Solid context window with massive output capacity

👑 gemini-3.1-pro-preview

Ultimate Pro 👑

The king of the hill has arrived! 👑 Gemini 3.1 Pro Preview is Google's most powerful AI model, delivering state-of-the-art performance across all benchmarks. With a 1M token context window, vision, computer use, and web search — this is the ultimate model for the most demanding tasks. When you need the absolute best, no compromises, Gemini 3.1 Pro is the answer!

Maximum Performance: Google's most capable Gemini model
1M Token Context: Process entire codebases and massive documents
Full Multimodal: Vision, Computer Use, and Web Search all at pro level
State-of-the-Art: Top-tier results across all AI benchmarks

🚀 Available Models
