Best Practices
Patterns and recommendations for building production-grade AI agent systems. Covers security, prompt engineering, performance, cost optimization, reliability, and testing.
Security
API Key Management
- Store API keys in environment variables, never in source code
- Use separate keys for development, staging, and production
- Apply minimum required scopes to each key
- Rotate keys on a regular schedule (quarterly recommended)
- Monitor key usage for anomalies in the dashboard
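As a minimal sketch of the first two points, the helpers below read a key from the environment and fail fast when it is missing or when a test key reaches production. The function names and the `YA_API_KEY` variable name are assumptions for illustration; the `ya_test_` prefix is the one shown in the Testing section of this guide.

```typescript
// Hypothetical helpers, not part of any SDK.
type Env = Record<string, string | undefined>;

// Reads an API key from the environment and fails fast if it is missing.
// The default variable name YA_API_KEY is an assumption.
function loadApiKey(env: Env, name = "YA_API_KEY"): string {
  const key = env[name]?.trim();
  if (!key) {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return key;
}

// Refuses to run a production process with a test key.
// The "ya_test_" prefix comes from the Testing section of this guide.
function assertKeyMatchesEnvironment(key: string, environment: string): void {
  const isTestKey = key.startsWith("ya_test_");
  if (environment === "production" && isTestKey) {
    throw new Error("Test key used in production");
  }
}
```

In a Node.js service this would typically be called once at startup, for example `loadApiKey(process.env)`, so that a misconfigured deployment stops immediately instead of failing on the first request.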
Input Validation
Validate user input
// Always validate and sanitize user input before passing it to agents
function sanitizeInput(input: string): string {
  // Bound the input size and strip non-printable control characters.
  // This does not by itself prevent prompt injection.
  return input
    .slice(0, 10000) // enforce max length
    .replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, ""); // remove control chars (keeps \t, \n, \r)
}

const result = await client.agents.execute(agent.id, {
  input: sanitizeInput(userInput),
});
Tool Handler Security
Secure tool handlers
// Always verify webhook signatures
app.post("/webhooks/tool", (req, res) => {
  // 1. Verify the signature sent in the x-ya-webhook-signature header
  if (!verifySignature(req.body, req.headers["x-ya-webhook-signature"], SECRET)) {
    return res.status(401).json({ error: "Invalid signature" });
  }

  // 2. Validate the tool call parameters
  const { parameters } = req.body;
  const validation = schema.validate(parameters);
  if (!validation.valid) {
    return res.json({ error: { code: "INVALID_PARAMS", message: validation.error } });
  }

  // 3. Apply authorization checks
  if (!canAccessResource(req.body.agentId, parameters.resourceId)) {
    return res.json({ error: { code: "UNAUTHORIZED", message: "Access denied" } });
  }

  // 4. Process the request
  // ...
});
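The handler above calls `verifySignature` without showing it. The sketch below is one possible implementation, assuming the signature is a hex-encoded HMAC-SHA256 of the raw request body; that scheme is an assumption, so check the platform's webhook documentation for the actual algorithm and encoding.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: the signature is a hex-encoded HMAC-SHA256 of the raw request
// body, keyed with the webhook secret. The real scheme may differ.
function verifySignature(
  rawBody: string,
  signature: string | undefined,
  secret: string
): boolean {
  if (!signature) return false;
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(received, expected);
}
```

An HMAC has to be computed over the exact bytes that were received, not over a re-serialized object. With Express, the `verify` option of `express.json()` is one way to capture the raw buffer before parsing.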
Prompt Engineering
System Prompt Structure
Recommended structure
const systemPrompt = `## Role
[Who is the agent? What company/domain?]
## Capabilities
[What can the agent do? What tools does it have?]
## Rules & Constraints
[What should the agent NOT do? Escalation criteria?]
## Response Format
[Desired output format, length, style]
## Examples (optional)
[2-3 example interactions]`;
Tips
- Be specific: vague prompts produce vague responses
- Include examples: show the agent what good output looks like
- Define boundaries: explicit rules prevent unexpected behavior
- Use structured sections: Markdown headers help the model parse instructions
- Test with edge cases: adversarial inputs, empty inputs, very long inputs
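One way to keep prompts consistently structured is to assemble them from typed sections instead of hand-editing one long string. The helper below is a hypothetical sketch (`PromptSections` and `buildSystemPrompt` are illustrative names, not SDK APIs) that emits the section headers from the recommended structure and skips empty optional sections.

```typescript
// Hypothetical types and helper for illustration only.
interface PromptSections {
  role: string;
  capabilities: string;
  rules: string;
  responseFormat: string;
  examples?: string; // optional, as in the recommended structure
}

function buildSystemPrompt(s: PromptSections): string {
  const sections: Array<[string, string | undefined]> = [
    ["Role", s.role],
    ["Capabilities", s.capabilities],
    ["Rules & Constraints", s.rules],
    ["Response Format", s.responseFormat],
    ["Examples", s.examples],
  ];
  return sections
    // Drop sections that are missing or contain only whitespace.
    .filter((entry): entry is [string, string] => !!entry[1]?.trim())
    .map(([title, body]) => `## ${title}\n${body.trim()}`)
    .join("\n\n");
}
```

Keeping the sections as separate fields also makes it easier to review and test changes to a single section, such as the rules, in isolation.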
Performance
Reduce Latency
Performance tips
// 1. Use streaming for perceived speed
const stream = await client.agents.execute(agent.id, {
  input: "...",
  stream: true, // first token in < 200ms
});

// 2. Use lower maxTokens when possible
const agent = await client.agents.create({
  parameters: {
    maxTokens: 512, // don't set 4096 if you only need short answers
  },
});

// 3. Use Haiku for simple tasks
const classifier = await client.agents.create({
  model: "claude-haiku-4-20250514", // 3x faster than Sonnet
  parameters: { temperature: 0, maxTokens: 10 },
});

// 4. Cache common responses
const cache = new Map<string, string>();

async function executeWithCache(agentId: string, input: string) {
  const key = `${agentId}:${input}`;
  if (cache.has(key)) return cache.get(key)!;
  const result = await client.agents.execute(agentId, { input });
  cache.set(key, result.output);
  return result.output;
}
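The `Map` in the caching tip grows without bound and never expires entries. In a long-running process, a bounded cache with a time-to-live is usually safer. The class below is a hypothetical sketch (`TtlCache` is not part of any SDK) that evicts the oldest inserted entry when full and drops entries once they are older than the configured TTL.

```typescript
// Hypothetical bounded cache with expiry; names are illustrative.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so that expiry can be tested without real delays.
  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: remove and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Re-inserting moves the key to the end of the Map's insertion order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Evict the oldest inserted entry.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A cache such as `new TtlCache<string>(1000, 5 * 60 * 1000)` could stand in for the plain `Map`, keyed by `${agentId}:${input}` as before. The size and TTL here are placeholders to tune against how quickly the underlying answers go stale.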
Optimize Tool Handlers
- Keep tool handlers under 3 seconds response time
- Return only the data the agent needs, not full database records
- Use connection pooling for database queries
- Add caching layers for frequently accessed data
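The first two points can be sketched with two small helpers: a timeout wrapper that enforces a response budget and a projection that returns only selected fields. Both `withTimeout` and `pick` are hypothetical names introduced for illustration.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical helper: copies only the listed fields from a record.
function pick<T extends object, K extends keyof T>(record: T, keys: K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const key of keys) out[key] = record[key];
  return out;
}
```

A handler might wrap its database call as `withTimeout(lookupOrder(id), 3000)` (where `lookupOrder` is a placeholder) and return `pick(order, ["id", "status"])`. Note that the timeout only stops waiting; it does not cancel the underlying work, which needs its own cancellation mechanism if the database driver supports one.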
Cost Optimization
Cost optimization
// 1. Use the right model for the task
// Haiku: classification, routing, simple Q&A ($0.80/M tokens)
// Sonnet: complex reasoning, writing, analysis ($3.00/M tokens)
// Opus: highest quality, creative tasks ($15.00/M tokens)

// 2. Limit conversation length
const agent = await client.agents.create({
  memory: {
    enabled: true,
    maxTurns: 20, // cap conversation length
    summarize: true, // summarize old turns to save tokens
    summaryModel: "claude-haiku-4-20250514", // cheap model for summaries
  },
});

// 3. Use concise system prompts
// Every token in the system prompt is sent with every request
// A 2000-token prompt costs more than a 500-token prompt

// 4. Monitor usage
const executions = await client.executions.list({
  agentId: agent.id,
  startDate: "2026-03-01",
});

let totalTokens = 0;
for (const exec of executions.data) {
  totalTokens += exec.usage.totalTokens;
}
console.log("Total tokens this month:", totalTokens);
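To turn token counts into a rough dollar figure, the sketch below applies the per-million-token prices listed above. It assumes a single flat rate per tier, which is a simplification: actual billing may price input and output tokens differently. The function and constant names are illustrative.

```typescript
// Flat per-million-token prices as listed in this guide (USD).
// Assumption: one blended rate per tier; real billing may differ for
// input and output tokens.
const PRICE_PER_MILLION_USD = {
  haiku: 0.8,
  sonnet: 3.0,
  opus: 15.0,
} as const;

type Tier = keyof typeof PRICE_PER_MILLION_USD;

// Estimated cost of a given number of tokens on a tier.
function estimateCostUsd(totalTokens: number, tier: Tier): number {
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_USD[tier];
}

// Estimated cost of resending the system prompt with every request.
function systemPromptCostUsd(promptTokens: number, requests: number, tier: Tier): number {
  return estimateCostUsd(promptTokens * requests, tier);
}
```

Under these assumptions, a 2000-token system prompt sent with 100,000 requests on the Sonnet tier comes to about $600 for the prompt alone, while a 500-token prompt comes to about $150. That difference is why trimming the system prompt pays off at volume.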
Reliability
Error Handling
Robust error handling
// Small helper used below to pause between retries
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function executeWithFallback(
  agentId: string,
  input: string,
  retries = 3
) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await client.agents.execute(agentId, { input });
    } catch (err) {
      if (err instanceof RateLimitError) {
        // Wait for the interval the API asks for (retryAfter is treated as seconds)
        await sleep(err.retryAfter * 1000);
        continue;
      }
      if (err instanceof APIError && err.status >= 500 && attempt < retries) {
        await sleep(1000 * 2 ** (attempt - 1)); // exponential backoff: 1s, 2s, 4s, ...
        continue;
      }
      throw err; // non-retryable error
    }
  }
  throw new Error("Max retries exceeded");
}
Health Checks
Health monitoring
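// Periodic health check
async function checkAgentHealth(agentId: string) {
  const start = Date.now();
  try {
    const result = await client.agents.execute(agentId, {
      input: "Health check: respond with OK",
    });
    const latency = Date.now() - start;
    return {
      healthy: result.output.includes("OK"),
      latency,
      tokens: result.usage.totalTokens,
    };
  } catch (err) {
    // err is `unknown` in strict TypeScript, so narrow before reading .message
    return { healthy: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Run every 5 minutes and log failures so the result is not silently discarded
setInterval(async () => {
  const health = await checkAgentHealth("agt_abc123");
  if (!health.healthy) console.error("Agent health check failed:", health);
}, 5 * 60 * 1000);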
Note
Set up webhook alerts for execution.failed events to catch issues before your users do. Route alerts to Slack or PagerDuty for immediate response.
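As a rough sketch of routing such an alert to Slack, the code below formats an `execution.failed` event into a Slack incoming-webhook message. The event shape (`ExecutionFailedEvent` and its fields) is an assumption, since the actual payload is not shown here, and `slackWebhookUrl` is a placeholder supplied by the caller.

```typescript
// Hypothetical shape of an execution.failed webhook event; the real payload
// may use different field names.
interface ExecutionFailedEvent {
  type: "execution.failed";
  executionId: string;
  agentId: string;
  error: { code: string; message: string };
}

// Builds a Slack incoming-webhook payload ({ text }) from the event.
function toSlackMessage(event: ExecutionFailedEvent): { text: string } {
  return {
    text:
      `Execution ${event.executionId} failed for agent ${event.agentId}: ` +
      `[${event.error.code}] ${event.error.message}`,
  };
}

// Posts the alert; slackWebhookUrl is a placeholder supplied by the caller.
async function sendAlert(event: ExecutionFailedEvent, slackWebhookUrl: string): Promise<void> {
  const res = await fetch(slackWebhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(toSlackMessage(event)),
  });
  if (!res.ok) throw new Error(`Slack responded with ${res.status}`);
}
```

Keeping the formatting in a pure function makes it easy to unit test and to reuse for other destinations such as PagerDuty.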
Testing
Test patterns
// Use test keys for automated testing
const testClient = new YourAutomation({
  apiKey: process.env.YA_TEST_KEY!, // ya_test_...
});

// Test suite
describe("Support Agent", () => {
  it("handles order lookup", async () => {
    const result = await testClient.agents.execute(agentId, {
      input: "What is the status of order ORD-12345?",
    });
    expect(result.toolCalls).toContainEqual(
      expect.objectContaining({ name: "lookup_order" })
    );
    expect(result.output).toContain("ORD-12345");
  });

  it("escalates for refunds over $500", async () => {
    const result = await testClient.agents.execute(agentId, {
      input: "I need a $750 refund for my order",
    });
    expect(result.toolCalls).toContainEqual(
      expect.objectContaining({ name: "escalate" })
    );
  });

  it("handles unknown orders gracefully", async () => {
    const result = await testClient.agents.execute(agentId, {
      input: "Check order ORD-NONEXISTENT",
    });
    expect(result.output).not.toContain("error");
    expect(result.output).toMatch(/not found|couldn't find/i);
  });
});