# Troubleshooting and Best Practices
This guide helps you resolve common issues with AI features and provides best practices for optimal performance.
## Common Issues and Solutions

### Connection and Authentication Issues

#### "Invalid API Key" Error

**Symptoms:**

- Error message: "Invalid API key provided"
- Cannot connect to the AI provider
- All AI features disabled

**Solutions:**

- **Verify API Key Format**
  - OpenAI keys start with `sk-`
  - Anthropic keys start with `sk-ant-`
- **Check for Extra Spaces**
  - Remove leading/trailing whitespace
  - Ensure the key contains no line breaks
- **Verify Key Permissions**
  - OpenAI: check that the key has Chat and Embedding permissions
  - Anthropic: ensure the key is active and not expired
- **Test Outside Trilium**

  ```bash
  # Test OpenAI
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer YOUR_KEY"

  # Test Anthropic
  curl https://api.anthropic.com/v1/messages \
    -H "x-api-key: YOUR_KEY" \
    -H "anthropic-version: 2023-06-01"
  ```
#### Connection Timeout

**Symptoms:**

- "Connection timeout" errors
- Slow or no response from the AI
- Intermittent failures

**Solutions:**

- **Check Network Configuration**

  ```bash
  # Test connectivity
  ping api.openai.com
  ping api.anthropic.com

  # Check DNS
  nslookup api.openai.com
  ```

- **Configure Proxy Settings**

  ```
  // If behind a corporate proxy
  {
    "proxy": {
      "http": "http://proxy.company.com:8080",
      "https": "http://proxy.company.com:8080"
    }
  }
  ```

- **Increase Timeout Values** (see the sketch after this list)

  ```
  {
    "timeouts": {
      "connection": 60000,  // 60 seconds
      "request": 120000     // 2 minutes
    }
  }
  ```

- **Check Firewall Rules**
  - Ensure port 443 (HTTPS) is open
  - Whitelist the AI provider domains
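If you want to confirm that a longer timeout is actually honored, the pattern below is a minimal sketch of a timeout wrapper using Node's built-in `fetch` and `AbortController`; `fetchWithTimeout` is a hypothetical helper for illustration, not a Trilium API:

```ts
// Minimal sketch: abort a request that exceeds the configured timeout.
// The 120000 ms value mirrors the "request" timeout in the config above.
async function fetchWithTimeout(
  url: string,
  init: RequestInit,
  timeoutMs: number,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

// Usage: fail fast instead of hanging on a stalled connection.
fetchWithTimeout("https://api.openai.com/v1/models", {}, 120000)
  .then(res => console.log("status:", res.status))
  .catch(err => console.error("timed out or failed:", err));
```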
#### Ollama Connection Issues

**Symptoms:**

- "Cannot connect to Ollama" error
- Models not loading
- Empty model list

**Solutions:**

- **Verify Ollama is Running**

  ```bash
  # Check if Ollama is responding
  ollama list

  # Start Ollama if needed
  ollama serve

  # Check the process
  ps aux | grep ollama
  ```

- **Correct Base URL**

  ```
  Default: http://localhost:11434
  Docker:  http://host.docker.internal:11434
  Remote:  http://server-ip:11434
  ```

- **Enable CORS for Remote Access**

  ```bash
  # Set the environment variable
  export OLLAMA_ORIGINS="*"

  # Or in the systemd service file
  Environment="OLLAMA_ORIGINS=*"
  ```

- **Check Model Availability** (a direct API check follows this list)

  ```bash
  # List available models
  ollama list

  # Pull a missing model
  ollama pull llama3
  ```
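To rule out Trilium itself, you can also query Ollama's REST API directly. A minimal sketch (Node 18+ built-in `fetch`; `GET /api/tags` is Ollama's documented endpoint for listing locally installed models):

```ts
// Confirm the base URL is reachable and models are installed.
const OLLAMA_BASE_URL = "http://localhost:11434"; // adjust for Docker/remote

async function checkOllama(): Promise<void> {
  const res = await fetch(`${OLLAMA_BASE_URL}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  const { models } = (await res.json()) as { models: { name: string }[] };
  console.log("Installed models:", models.map(m => m.name));
}

checkOllama().catch(console.error);
```

An empty `models` array here means the model list problem is on the Ollama side, not in Trilium's configuration.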
### Embedding Issues

#### Embeddings Not Generating

**Symptoms:**

- Embedding count stays at 0
- "Failed to generate embeddings" error
- Search not finding relevant notes

**Solutions:**

- **Check Embedding Model Configuration**
  - Ensure an embedding model is selected
  - Verify the model supports embeddings
- **Manually Trigger Regeneration**

  ```
  Settings → AI/LLM → Recreate All Embeddings
  ```

- **Check Note Exclusions** (see the script sketch after this list)

  ```sql
  -- Look for notes with the exclusion label
  SELECT * FROM attributes
  WHERE type = 'label' AND name = 'excludeFromAI';
  ```

- **Verify Resource Availability**
  - Check disk space for embedding storage
  - Monitor memory usage during generation
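If you prefer not to query the database directly, the same check can be done from a backend code note. A minimal sketch (`api.searchForNotes` and `api.log` are part of Trilium's backend script API; the `declare` exists only to keep the TypeScript sketch self-contained):

```ts
// Run as a backend code note: list every note carrying the exclusion label.
declare const api: {
  searchForNotes(query: string): { noteId: string; title: string }[];
  log(message: string): void;
};

for (const note of api.searchForNotes("#excludeFromAI")) {
  api.log(`Excluded from AI: ${note.title} (${note.noteId})`);
}
```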
#### Embedding Quality Issues

**Symptoms:**

- Poor search results
- Irrelevant context in chats
- Missing obvious matches

**Solutions:**

- **Switch to a Better Embedding Model**

  ```
  OpenAI: text-embedding-3-large (higher quality)
  Ollama: mxbai-embed-large (recommended)
  ```

- **Adjust the Similarity Threshold** (see the sketch after this list)

  ```
  {
    "search": {
      "similarity_threshold": 0.6,  // Lower = more results
      "diversity_factor": 0.3       // Balance relevance/variety
    }
  }
  ```

- **Recreate Embeddings After Changes**
  - Required when switching embedding models
  - Recommended after major note updates
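To see what the similarity threshold actually controls, here is a minimal sketch of threshold-based filtering over cosine similarity; the names (`filterByThreshold`, the candidate shape) are illustrative, not Trilium internals:

```ts
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only candidates scoring at or above the threshold, best first.
// Lowering the threshold admits more (but less relevant) results.
function filterByThreshold(
  query: number[],
  candidates: { id: string; embedding: number[] }[],
  threshold = 0.6,
) {
  return candidates
    .map(c => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter(c => c.score >= threshold)
    .sort((a, b) => b.score - a.score);
}
```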
### Chat and Response Issues

#### AI Not Accessing Notes

**Symptoms:**

- The AI says "I don't have access to your notes"
- Generic responses without note references
- Tools not being called

**Solutions:**

- **Enable Tool Calling**

  ```
  {
    "tools": {
      "enabled": true,
      "auto_invoke": true
    }
  }
  ```

- **Check Note Permissions**
  - Ensure the notes aren't encrypted
  - Remove #excludeFromAI labels if present
- **Verify the Context Service**

  Check the logs for:
  - "Context extraction failed"
  - "No relevant notes found"

- **Test Tool Execution**

  Ask explicitly: "Search my notes for [topic]". This should trigger the search_notes tool.
#### Slow Response Times

**Symptoms:**

- Long delays before responses
- Timeouts during conversations
- UI freezing during AI operations

**Solutions:**

- **Optimize Model Selection**

  ```
  Fast:     gpt-3.5-turbo, claude-3-haiku
  Balanced: gpt-4-turbo, claude-3-sonnet
  Quality:  gpt-4, claude-3-opus
  ```

- **Reduce Context Size**

  ```
  {
    "context": {
      "max_notes": 5,     // Reduce from 10
      "max_tokens": 4000  // Reduce from 8000
    }
  }
  ```

- **Enable Caching**

  ```
  {
    "cache": {
      "enabled": true,
      "ttl": 3600000,
      "aggressive": true
    }
  }
  ```

- **Use Streaming Responses** (see the sketch after this list)

  ```
  {
    "streaming": true,
    "stream_delay": 0
  }
  ```
#### Incomplete or Cut-off Responses

**Symptoms:**

- Responses end mid-sentence
- Missing expected information
- "Length limit reached" messages

**Solutions:**

- **Increase Token Limits**

  ```
  {
    "max_tokens": 8000,    // Increase the limit
    "reserve_tokens": 500  // Reserve for completion
  }
  ```

- **Optimize Prompts**
  - Be more specific to reduce response length
  - Request summaries instead of full content
- **Use Continuation Prompts**

  ```
  "Continue from where you left off"
  "Please complete the previous response"
  ```
### Model-Specific Issues

#### OpenAI Rate Limits

**Symptoms:**

- "Rate limit exceeded" errors
- 429 status codes
- Intermittent failures

**Solutions:**

- **Implement Retry Logic** (see the sketch after this list)

  ```
  {
    "retry": {
      "max_attempts": 3,
      "delay": 2000,
      "backoff": 2
    }
  }
  ```

- **Configure Rate Limiting**

  ```
  {
    "rate_limit": {
      "requests_per_minute": 50,
      "tokens_per_minute": 40000
    }
  }
  ```

- **Upgrade Your API Tier**
  - Check your OpenAI usage tier
  - Request a tier increase if needed
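The retry config above corresponds to a standard exponential-backoff loop. A minimal sketch; the `withRetry` helper and the `status` field on the error are assumptions about how your client surfaces 429s:

```ts
// Retry up to maxAttempts times, multiplying the delay by `backoff`
// after each rate-limit error; anything other than a 429 is rethrown.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 2000,
  backoff = 2,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (attempt >= maxAttempts || err?.status !== 429) throw err;
      await new Promise(resolve => setTimeout(resolve, delayMs));
      delayMs *= backoff; // 2s, 4s, 8s, ...
    }
  }
}
```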
#### Anthropic Context Window

**Symptoms:**

- "Context too long" errors
- Inability to process large notes

**Solutions:**

- **Use Larger-Context Models**

  Claude 3 models support a 200K-token context window.

- **Implement Smart Truncation** (see the sketch after this list)

  ```
  {
    "truncation": {
      "strategy": "smart",
      "preserve": ["title", "summary"],
      "max_per_note": 2000
    }
  }
  ```
#### Ollama Memory Issues

**Symptoms:**

- "Out of memory" errors
- Model loading failures
- System slowdown

**Solutions:**

- **Use Quantized Models**

  ```bash
  # Use a smaller quantization (note: llama3 has no 7b tag; 8b is the smallest)
  ollama pull llama3:8b-instruct-q4_0
  ```

- **Limit Context Size**

  ```bash
  # Set the context window from an interactive session
  ollama run llama3
  >>> /set parameter num_ctx 2048
  ```

- **Limit GPU Offloading**

  ```bash
  # Reduce the number of layers offloaded to the GPU
  ollama run llama3
  >>> /set parameter num_gpu 20
  ```
## Best Practices

### Cost Optimization Strategies

#### Monitor Usage

- **Track Token Consumption**

  ```
  // Add to your configuration
  {
    "monitoring": {
      "log_token_usage": true,
      "alert_threshold": 100000
    }
  }
  ```

- **Set Budget Limits** (see the sketch after this list)

  ```
  {
    "budget": {
      "daily_limit": 2.00,
      "monthly_limit": 50.00,
      "auto_stop": true
    }
  }
  ```
#### Optimize Requests

- **Use Appropriate Models**
  - Simple queries: use cheaper/faster models
  - Complex analysis: use advanced models
  - Embeddings: use dedicated embedding models
- **Cache Aggressively** (see the cache sketch after this list)
  - Cache common queries
  - Store processed embeddings
  - Reuse context when possible
- **Batch Operations**

  ```
  // Process multiple notes together (illustrative API)
  await ai.batchEmbed(notes, { batch_size: 100 });
  ```
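The cache sketch referenced above: a minimal in-memory TTL cache so identical text is embedded at most once per hour. The helper names and the injected `embed` function are illustrative, not Trilium's internals:

```ts
// Map from text to its embedding plus an expiry timestamp.
const embeddingCache = new Map<string, { vector: number[]; expires: number }>();
const TTL_MS = 3_600_000; // 1 hour, matching the "ttl" config above

async function getEmbedding(
  text: string,
  embed: (t: string) => Promise<number[]>, // provider call, injected
): Promise<number[]> {
  const hit = embeddingCache.get(text);
  if (hit && hit.expires > Date.now()) return hit.vector; // cache hit
  const vector = await embed(text);
  embeddingCache.set(text, { vector, expires: Date.now() + TTL_MS });
  return vector;
}
```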
### Quality and Accuracy Tips

#### Improve Response Quality

- **Provide Clear Context**

  ```
  Bad:  "Summarize my notes"
  Good: "Summarize my project management notes from Q1 2024"
  ```

- **Use Examples**

  ```
  "Format the response like this example:
  - Topic: [name]
  - Key Points: [list]
  - Action Items: [list]"
  ```

- **Iterate and Refine**
  - Start with broad questions
  - Narrow down based on responses
  - Use follow-up questions
#### Maintain Note Quality

- **Structure Notes Consistently**
  - Use clear titles
  - Add descriptive labels
  - Include summaries for long notes
- **Update Metadata**
  - Add relevant attributes
  - Maintain relationships
  - Keep dates current
- **Perform Regular Maintenance**
  - Remove duplicate notes
  - Update outdated information
  - Fix broken links
### Privacy Considerations

#### Protect Sensitive Data

- **Use Exclusion Labels**

  Add #excludeFromAI to sensitive notes.

- **Configure Privacy Settings** (see the sketch after this list)

  ```
  {
    "privacy": {
      "exclude_patterns": ["password", "ssn", "credit card"],
      "sanitize_logs": true,
      "local_only": false
    }
  }
  ```

- **Choose Appropriate Providers**
  - Sensitive data: use Ollama (local)
  - General content: cloud providers are acceptable
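As an illustration of how `exclude_patterns` could be enforced, the sketch below redacts matching lines before text leaves the machine; the patterns mirror the config above, and none of this is Trilium's actual implementation:

```ts
// Redact any line matching a sensitive pattern before sending it to a
// cloud provider.
const EXCLUDE_PATTERNS = [/password/i, /\bssn\b/i, /credit\s*card/i];

function sanitize(text: string): string {
  return text
    .split("\n")
    .map(line =>
      EXCLUDE_PATTERNS.some(p => p.test(line)) ? "[REDACTED]" : line,
    )
    .join("\n");
}

console.log(sanitize("my password: hunter2\nmeeting notes for Q1"));
// -> "[REDACTED]\nmeeting notes for Q1"
```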
#### Audit AI Access

- **Review AI Logs**

  ```bash
  grep "AI accessed note" trilium-logs.txt
  ```

- **Monitor Tool Usage**

  ```
  // Track which notes are accessed
  {
    "audit": {
      "log_note_access": true,
      "log_tool_calls": true
    }
  }
  ```
### Performance Tuning

#### System Resources

- **Memory Management**
  - Close unused applications
  - Increase the Node.js memory limit (e.g. `NODE_OPTIONS="--max-old-space-size=4096"`)
  - Use swap space if needed
- **CPU Optimization**
  - Limit concurrent requests
  - Use worker threads
  - Raise the process priority
- **Storage Optimization**
  - Perform regular database maintenance
  - Compress old embeddings
  - Archive unused chat logs
#### Network Optimization

- **Reduce Latency**
  - Use CDN endpoints when available
  - Enable HTTP/2
  - Configure keep-alive connections (see the sketch after this list)
- **Handle Failures Gracefully**

  ```
  {
    "resilience": {
      "circuit_breaker": true,
      "fallback_provider": "ollama",
      "offline_mode": true
    }
  }
  ```
## Diagnostic Tools

### Built-in Diagnostics

- **AI Health Check**

  Settings → AI/LLM → Run Diagnostics

- **Connection Test**

  Settings → AI/LLM → Test Connection

- **Embedding Statistics**

  Settings → AI/LLM → View Statistics
### Log Analysis

- **Enable Debug Logging**

  ```
  {
    "logging": {
      "level": "debug",
      "ai_verbose": true
    }
  }
  ```

- **Common Log Patterns**

  ```bash
  # Find errors
  grep "ERROR.*AI" logs/*.log

  # Track performance
  grep "AI response time" logs/*.log | awk '{print $NF}'

  # Total token usage (assumes the count is the last field on each line)
  grep "tokens_used" logs/*.log | awk '{s += $NF} END {print s}'
  ```
### Performance Monitoring

- **Response Time Tracking**

  ```
  // Add to configuration
  {
    "metrics": {
      "track_response_time": true,
      "slow_query_threshold": 5000
    }
  }
  ```

- **Resource Usage**

  ```bash
  # Monitor the Trilium process
  top -p $(pgrep -f trilium)

  # Check memory usage
  ps aux | grep trilium

  # Monitor network connections (Ollama)
  netstat -an | grep 11434
  ```
## Getting Help

### Self-Help Resources

- **Check the Documentation**
  - Review this troubleshooting guide
  - Read the provider-specific docs
  - Check the release notes for known issues
- **Community Resources**
  - Trilium Discord server
  - GitHub Discussions
  - Reddit r/TriliumNotes
- **Debug Information to Collect**
  - Trilium version
  - AI provider and model
  - Error messages and logs
  - System specifications
  - Configuration settings
### Reporting Issues

When reporting AI-related issues:

- **Include Details**

  ```
  Trilium Version: X.X.X
  Provider: OpenAI/Anthropic/Ollama
  Model: gpt-4/claude-3/llama3
  Error: [exact error message]
  Steps to reproduce: [detailed steps]
  ```

- **Attach Logs**
  - Recent error logs
  - Debug output if available
  - Configuration (sanitized)
- **Describe Expected Behavior**
  - What you expected to happen
  - What actually happened
  - Any workarounds tried