OpenRouter, Anthropic, Volcano Engine, Siliconflow Usage Guide
In AI application development, choosing the right LLM API service is crucial. Whether you are building an intelligent dialogue system, developing an AI Agent, or participating in an AI Hackathon, this article will provide you with a comprehensive API usage guide, covering mainstream services such as OpenRouter, Anthropic API, Volcano Engine, and Siliconflow.
Why Do You Need Multiple API Services?
Different LLM models have their own advantages, especially when developing AI Agents, where you need to choose the right model based on specific scenarios:
- Claude (Anthropic): Excels in complex reasoning, programming, and Agent tasks, particularly suitable for scenarios requiring deep thinking
- Gemini (Google): Performs well in long text processing and multimodal understanding, suitable for handling multimedia content such as images and videos
- GPT (OpenAI): Strong in image understanding and mathematical reasoning, with an excellent everyday conversation experience
- Doubao (ByteDance): Fast access speed in China, good voice dialogue experience, especially suitable for real-time interaction scenarios
- Open Source Models: Low cost, highly customizable, suitable for large-scale deployment
OpenRouter: One-Stop Access to All Models
OpenRouter is the service I most recommend. It provides a unified API interface to access various models without worrying about regional restrictions. For AI Agent developers, this means you can easily switch and combine different models.
Advantages
- No Regional Restrictions: Access various models directly from within China
- Unified Interface: Uses OpenAI format API, simplifying programming
- Rich Models: Supports mainstream models like Claude, Gemini, GPT, Grok, etc.
- Convenient for Agent Development: Flexibly call models with different capabilities within one system
Usage Example
```python
from openai import OpenAI

# OpenRouter exposes an OpenAI-compatible API behind a single base URL
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<YOUR_OPENROUTER_API_KEY>",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
Model Selection Strategy for AI Agent Development
When developing AI Agents, different tasks require different models:
- Programming and core Agent logic (tool calling): `anthropic/claude-sonnet-4`
- Long text analysis, report generation, multimodal: `google/gemini-2.5-pro`
- Real-time response (fast thinking): `google/gemini-2.5-flash`
- Everyday conversation: `openai/gpt-4o`
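The mapping above can be expressed as a simple routing table. The task categories and the helper below are illustrative, not part of any SDK:

```python
# Hypothetical routing table: task category -> OpenRouter model ID
MODEL_ROUTES = {
    "coding": "anthropic/claude-sonnet-4",
    "long_text": "google/gemini-2.5-pro",
    "realtime": "google/gemini-2.5-flash",
    "chat": "openai/gpt-4o",
}

def pick_model(task: str) -> str:
    """Return the model ID for a task, falling back to the everyday chat model."""
    return MODEL_ROUTES.get(task, MODEL_ROUTES["chat"])
```

Because every model goes through the same OpenRouter endpoint, switching models is just a matter of changing the `model` string.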
Anthropic Official API
Although OpenRouter is convenient, there are scenarios where you still need to use the official API, such as using Claude Code or Computer Use features.
Notes
⚠️ Important: Do not call the Anthropic API from mainland China or Hong Kong IP addresses; access it from an overseas IP instead.
Usage Scenarios
- Claude Code
- Computer Use (Allowing Agents to Operate Computers)
⚠️ Special Feature Tip: Tool Use and Thinking Mode in Claude for Agent development require special syntax. Please refer to the Anthropic Official Documentation for the latest usage methods.
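As a rough sketch of what that syntax looks like, the request body below combines Tool Use and extended thinking. The `get_weather` tool and its schema are invented for illustration, and parameter names should be verified against the current official documentation:

```python
import json

# Hypothetical tool definition -- Anthropic Tool Use describes inputs with JSON Schema
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Request body for POST https://api.anthropic.com/v1/messages
payload = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "tools": [weather_tool],
    # Extended thinking is enabled with an explicit token budget
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
}
body = json.dumps(payload)
```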
Volcano Engine: Doubao Model
The Doubao model provided by Volcano Engine has low latency for access within China, making it particularly suitable for real-time Agent applications that require fast response times.
Model Selection for AI Agent Development
Volcano Engine provides three Doubao models with different capabilities, suitable for different AI Agent scenarios:
1. Fast and Slow Thinking Architecture
When developing an Agent with a “Fast and Slow Thinking” architecture:
Fast Thinking Model (Low Latency) - `doubao-seed-1-6-flash-250615`:

```python
# `client` is an OpenAI-compatible client pointed at Volcano Engine's Ark endpoint
response = client.chat.completions.create(
    model="doubao-seed-1-6-flash-250615",
    messages=[{"role": "user", "content": "Summarize this in one sentence."}],
)
print(response.choices[0].message.content)
```

Slow Thinking Model (Deep Reasoning) - `doubao-seed-1-6-thinking-250615`:

```python
response = client.chat.completions.create(
    model="doubao-seed-1-6-thinking-250615",
    messages=[{"role": "user", "content": "Analyze the trade-offs of this plan step by step."}],
)
print(response.choices[0].message.content)
```
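One way to wire the fast and slow models together is a heuristic router. The keyword check below is a toy placeholder for whatever complexity estimate your Agent actually uses:

```python
FAST_MODEL = "doubao-seed-1-6-flash-250615"
SLOW_MODEL = "doubao-seed-1-6-thinking-250615"

def route_model(user_message: str) -> str:
    """Toy heuristic: send multi-step or analytical requests to the slow model."""
    hard_markers = ("why", "plan", "analyze", "prove", "step by step")
    if any(marker in user_message.lower() for marker in hard_markers):
        return SLOW_MODEL
    return FAST_MODEL
```

In practice the routing decision can itself be made by the fast model (a cheap classification call) before the expensive reasoning call is issued.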
2. Multimodal Agent
Suitable for Agents that need to process images - `doubao-seed-1-6-250615`:

```python
import os
from openai import OpenAI

# Volcano Engine's Ark service exposes an OpenAI-compatible endpoint
client = OpenAI(
    base_url="https://ark.cn-beijing.volces.com/api/v3",
    api_key=os.environ.get("ARK_API_KEY"),
)

response = client.chat.completions.create(
    model="doubao-seed-1-6-250615",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            {"type": "text", "text": "Describe what is in this image."},
        ],
    }],
)
print(response.choices[0].message.content)
```
Siliconflow: The Best Choice for Open Source Models
Siliconflow provides a complete model ecosystem for AI Agent development, including LLM, TTS (Text-to-Speech), and ASR (Automatic Speech Recognition).
LLM Models
Recommended models for Agent development:
- Kimi K2 Instruct: `moonshotai/Kimi-K2-Instruct`
- DeepSeek R1 0528: `deepseek-ai/DeepSeek-R1`
- Qwen3 235B: `Qwen/Qwen3-235B-A22B-Instruct-2507`
⚠️ Important Tip: It is not recommended to use DeepSeek for Tool Calling, as DeepSeek’s capabilities in this area are relatively weak. If you need Tool Calling functionality, it is recommended to choose Claude, Gemini series models, or OpenAI o3/GPT-4.1, Grok-4. If you can only use domestic models, it is recommended to use Kimi K2.
```python
import requests

# Siliconflow's chat endpoint follows the OpenAI request format
url = "https://api.siliconflow.cn/v1/chat/completions"
headers = {
    "Authorization": "Bearer <YOUR_SILICONFLOW_API_KEY>",
    "Content-Type": "application/json",
}
payload = {
    "model": "moonshotai/Kimi-K2-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
}
response = requests.post(url, headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])
```
Voice Agent Development Kit
TTS Text-to-Speech
Use Fish Audio or CosyVoice for speech synthesis:
```python
import requests

url = "https://api.siliconflow.cn/v1/audio/speech"
headers = {"Authorization": "Bearer <YOUR_SILICONFLOW_API_KEY>"}
payload = {
    "model": "fishaudio/fish-speech-1.5",
    "input": "Hello, welcome to speech synthesis.",
    # Preset voices are named "<model>:<voice>"; check the docs for available voices
    "voice": "fishaudio/fish-speech-1.5:alex",
    "response_format": "mp3",
}
response = requests.post(url, headers=headers, json=payload)
with open("output.mp3", "wb") as f:
    f.write(response.content)
```
ASR Speech Recognition
Use SenseVoice for speech recognition:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.siliconflow.cn/v1",
    api_key="<YOUR_SILICONFLOW_API_KEY>",
)

with open("speech.wav", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="FunAudioLLM/SenseVoiceSmall",
        file=audio_file,
    )
print(transcription.text)
```
Using AI to Assist in Developing Agents
When developing AI Agents, it is recommended to use AI-assisted programming tools like Cursor to practice “using Agents to develop Agents”:
- Documentation First: Let AI generate design documents first, iterate and optimize before coding
- Choose the Right Model: Use models with thinking capabilities (e.g., Claude 4 Sonnet)
- Test-Driven: Let AI write test cases for the code
⚠️ Important Reminder: Model names generated by an LLM are often outdated and its API usage is frequently wrong, because training data lags well behind the rapid pace of model releases. Therefore, when writing LLM-related code, be sure to:
- Have Cursor write code according to the latest official documentation
- Explicitly specify the model type and version in the prompt
- Do not let Cursor improvise model-calling code from memory
- Actively provide the latest API documentation links for the AI assistant to reference
AI Code Editor Selection
In addition to AI code editors like Cursor, Windsurf, and Trae that require payment, you can also use the OpenRouter API key to get a similar experience in other editors:
- Void AI Editor: Open source, supports direct use of OpenRouter API key, similar functionality to Cursor
- VSCode Cline Plugin: Can configure OpenRouter API key to achieve AI-assisted programming in VSCode
The advantage of using OpenRouter is that you can switch between different models under the same API key to find the AI assistant that best suits your programming style.
AI Agent Debugging Tips
Debugging is one of the most important stages in AI Agent development. Here are some practical debugging suggestions:
Understanding How LLM Works
Think of the LLM as a person who has no background on your task but is smart and broadly knowledgeable (similar to a fresh graduate from Tsinghua's Yao Class). You must therefore provide clear instructions and complete context.
Thoroughly Check Input Content
During the debugging phase, it is best to thoroughly review the context sent to the LLM to ensure:
- Structured Content Format is Correct: JSON, XML, and other structured data are correctly formatted
- System Prompt is Correct: Instructions are clear and unambiguous
- Complete History: The execution history of the Agent includes:
- What the user said
- LLM’s internal thought process
- Replies to the user
- Tool Call and Tool Call Result
- Correct Order: All interaction records are arranged in chronological order, with no omissions or disorder
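A checklist like this is easy to automate. The validator below assumes OpenAI-style message dicts; the specific checks are a starting point, not an exhaustive list:

```python
def validate_history(messages: list[dict]) -> list[str]:
    """Return a list of problems found in an Agent's message history."""
    problems = []
    allowed_roles = {"system", "user", "assistant", "tool"}
    for i, msg in enumerate(messages):
        if msg.get("role") not in allowed_roles:
            problems.append(f"message {i}: unknown role {msg.get('role')!r}")
        # A message should carry content unless it is a pure tool-call turn
        if not msg.get("content") and not msg.get("tool_calls"):
            problems.append(f"message {i}: empty content")
    # Every tool call should be immediately followed by a tool result
    for i, msg in enumerate(messages):
        if msg.get("role") == "assistant" and msg.get("tool_calls"):
            nxt = messages[i + 1] if i + 1 < len(messages) else None
            if nxt is None or nxt.get("role") != "tool":
                problems.append(f"message {i}: tool call without a tool result")
    return problems
```

Running a check like this before every LLM call catches the most common Agent bugs: missing system prompts, dropped tool results, and out-of-order history.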
Debugging Best Practices
- Step-by-Step Verification: Start with simple scenarios and gradually increase complexity
- Detailed Logs: Record complete input and output for problem localization
- Model Comparison: Test on different models to find the most suitable model combination
Summary
Choosing the right LLM API service is key to building an excellent AI Agent:
- Regional Restrictions: Prioritize using OpenRouter to avoid access issues
- Task Requirements: Choose the model that excels based on the specific functions of the Agent
- Latency Requirements: Choose low-latency models for real-time interaction and powerful models for deep thinking
- Cost Considerations: Balance performance and cost, and use open-source models wisely
- Architecture Design: Adopt a hybrid model architecture to leverage the advantages of different models
Quick Reference: Common Model List
Purpose | Recommended Model | Features
---|---|---
Programming/Agent Development | `anthropic/claude-sonnet-4` | Strong coding ability, stable tool calling
Low Latency Response | `google/gemini-2.5-flash` | Extremely low latency, suitable for real-time interaction
Long Text Processing | `google/gemini-2.5-pro` | Large context window, strong understanding ability
Document Writing | `google/gemini-2.5-pro` | Deep thinking, concise and fluent language
Balanced Performance | `openai/gpt-4o` | Balanced capabilities in all aspects
Cost Optimization | `google/gemini-2.5-flash` | High cost-performance ratio
Low-Cost Agent Development | `moonshotai/Kimi-K2-Instruct` | Open-source model, low cost, good results
Low Latency Response in China | `doubao-seed-1-6-flash-250615` | Extremely low in-China latency, suitable for real-time interaction
Chinese Creative Writing | `deepseek-ai/DeepSeek-R1` | Strong Chinese expression
Speech Recognition | `FunAudioLLM/SenseVoiceSmall` | Low latency, low cost
Speech Synthesis | `fishaudio/fish-speech-1.5` | Low latency, low cost
By using these API services in a reasonable combination, you can build a powerful and responsive AI Agent. Remember, an excellent Agent does not rely on a single model but knows how to use the right tool in the right scenario.
Wishing you success in your AI Agent development journey!