The relevance threshold is a filter that sets the minimum relevance a content fragment must have to the user's question before it is considered useful. It is expressed as a decimal value from 0.00 to 1.00; higher values make the filter more selective.
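A minimal sketch of this filtering step, assuming each fragment already carries a score and that a fragment passes when its score is greater than or equal to the threshold (the comparison being inclusive is an assumption). The names `Fragment` and `filter_by_threshold` are hypothetical and only serve the illustration.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    """Hypothetical content fragment with a precomputed relevance score."""
    text: str
    score: float  # expected to lie in [0.00, 1.00]


def filter_by_threshold(fragments: List[Fragment], threshold: float) -> List[Fragment]:
    """Keep fragments whose score meets the threshold (inclusive by assumption)."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.00 and 1.00")
    return [f for f in fragments if f.score >= threshold]
```

Raising the threshold can only shrink the returned list, which is why higher values are described as more selective.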
How the relevance score works
The system assigns a score to each content fragment based on:
- Semantic similarity: How conceptually similar the content is to the question
- Lexical correspondence: Presence of specific keywords
- Thematic context: Belonging to the same knowledge domain
- Content structure: Relevance of titles, subtitles and structure
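How these four factors are combined is not specified here. One plausible reading is a weighted average, sketched below purely as an illustration: the weights, the signal names, and the `combined_score` function are assumptions, not the system's actual formula.

```python
from typing import Dict

# Hypothetical weights (they sum to 1.0); the real system may weigh factors differently.
WEIGHTS: Dict[str, float] = {
    "semantic": 0.5,    # semantic similarity
    "lexical": 0.2,     # lexical correspondence
    "thematic": 0.2,    # thematic context
    "structure": 0.1,   # content structure
}


def combined_score(signals: Dict[str, float]) -> float:
    """Weighted average of per-factor scores, each expected in [0.00, 1.00]."""
    for name in WEIGHTS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"signal {name!r} must be between 0.00 and 1.00")
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
```

Because the weights sum to 1.0 and every signal lies between 0.00 and 1.00, the result stays on the same 0.00 to 1.00 scale as the threshold.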
Interpretation of threshold values
Low threshold (0.30-0.60)
Characteristics:
- Includes even vaguely related content
- Greater information coverage
- Risk of including irrelevant information
When to use:
- Limited knowledge base
- Highly specialized content
- Exploratory and research questions
- Initial configuration for testing
Practical example:
Question: "How to configure WiFi?"
With threshold 0.40: Includes WiFi guides, network configurations, connection troubleshooting, router settings
Medium threshold (0.60-0.80)
Characteristics:
- Balance between coverage and precision
- Filters marginally relevant content
- Good general response quality
When to use:
- Standard configuration for most cases
- Well-organized knowledge base
- General assistant use
Practical example:
Question: "How to configure WiFi?"
With threshold 0.70: Includes specific WiFi guides, configuration procedures, but excludes generic network content
High threshold (0.80-1.00)
Characteristics:
- Only highly pertinent content
- Very focused responses
- Risk of missing useful information
When to use:
- FAQ with precise answers
- Very large knowledge base
- Content with possible overlaps
- When precision is critical
Practical example:
Question: "How to configure WiFi?"
With threshold 0.85: Includes only step-by-step guides specific to WiFi configuration
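The three WiFi examples can be reproduced with a small sketch. The fragment titles follow the examples above, but the scores are invented for illustration, and `titles_at` is a hypothetical helper.

```python
# Hypothetical scores for the question "How to configure WiFi?"
SCORED_FRAGMENTS = [
    ("Step-by-step WiFi configuration guide", 0.91),
    ("WiFi configuration procedure overview", 0.76),
    ("Generic network configuration", 0.62),
    ("Router settings", 0.55),
    ("Connection troubleshooting", 0.48),
    ("Printer installation", 0.20),
]


def titles_at(threshold: float) -> list:
    """Return the titles whose score meets the threshold (inclusive by assumption)."""
    return [title for title, score in SCORED_FRAGMENTS if score >= threshold]


if __name__ == "__main__":
    for threshold in (0.40, 0.70, 0.85):
        print(f"{threshold:.2f}: {titles_at(threshold)}")
```

With these assumed scores, 0.40 keeps five fragments, 0.70 keeps the two WiFi-specific ones, and 0.85 keeps only the step-by-step guide.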
Optimization strategies
Gradual approach
- Start with a medium threshold (0.70): A safe value for most cases
- Monitor responses: Check if they are too generic or too limited
- Adjust incrementally: Change by 0.05-0.10 at a time
- Test with real questions: Use actual user questions
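The gradual approach can be sketched as a single adjustment step. The observation labels (`"too_generic"`, `"too_limited"`, `"ok"`) and the `adjust_threshold` function are hypothetical; the default step of 0.05 is the lower end of the range suggested above.

```python
def adjust_threshold(current: float, observation: str, step: float = 0.05) -> float:
    """Move the threshold one step based on how recent responses looked.

    "too_generic" raises the threshold (more selective),
    "too_limited" lowers it (more coverage), "ok" leaves it unchanged.
    """
    if observation == "too_generic":
        candidate = current + step
    elif observation == "too_limited":
        candidate = current - step
    elif observation == "ok":
        candidate = current
    else:
        raise ValueError(f"unknown observation: {observation!r}")
    # Clamp to the valid range and round to two decimals to avoid float drift.
    return round(min(1.0, max(0.0, candidate)), 2)
```

Starting from 0.70, one "too_generic" observation moves the value to 0.75, and one "too_limited" observation moves it to 0.65.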
A/B testing for optimization
- Prepare a test question set: 20-30 representative questions
- Compare different thresholds: 0.60, 0.70, 0.80
- Evaluate each response: Quality, completeness, relevance
- Choose the winning threshold: The one with the best balance of quality, completeness and relevance
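A minimal sketch of the final selection step, assuming each response has already been rated by a reviewer on a single numeric scale that summarizes quality, completeness and relevance. The rating scale and the `choose_threshold` function are assumptions for illustration.

```python
from statistics import mean
from typing import Dict, List


def choose_threshold(ratings: Dict[float, List[float]]) -> float:
    """Return the threshold whose responses have the highest average rating.

    `ratings` maps each tested threshold to the ratings its responses received,
    one rating per test question. Ties go to the threshold listed first.
    """
    if not ratings or any(not scores for scores in ratings.values()):
        raise ValueError("every threshold needs at least one rating")
    return max(ratings, key=lambda threshold: mean(ratings[threshold]))
```

For example, with ratings `{0.60: [3, 4, 3], 0.70: [4, 5, 4], 0.80: [4, 3, 3]}` the function returns 0.70, since that threshold has the highest mean rating.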
Common problems and solutions
Problem: "Assistant never finds information"
Possible cause: Threshold too high
Solution:
- Reduce threshold by 0.10-0.15
- Verify content quality in knowledge base
- Check that content uses consistent terminology
Problem: "Responses include too much non-pertinent information"
Possible cause: Threshold too low
Solution:
- Increase threshold by 0.10-0.15
- Remove duplicate or very similar content
- Improve content organization
Problem: "Inconsistent responses"
Possible cause: Contradictory content fragments all score above the threshold
Solution:
- Increase threshold to exclude marginal content
- Identify and resolve content contradictions
- Better organize knowledge base
Optimization for specific sectors
Technical support
- Recommended threshold: 0.75-0.85
- Objective: Precise and unambiguous procedures
- Considerations: Avoid contradictory information that confuses users
E-commerce
- Recommended threshold: 0.65-0.75
- Objective: Balance product information and related suggestions
- Considerations: Include similar products and accessories
Professional services
- Recommended threshold: 0.60-0.70
- Objective: Complete coverage of complex topics
- Considerations: Allow connections between related concepts
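The recommended ranges can be kept as a small lookup table. In this sketch the sector keys and the `starting_threshold` function are hypothetical, and using the midpoint of each range as a starting value is an assumption, not a documented rule.

```python
from typing import Dict, Tuple

# Recommended ranges per sector, as listed above.
SECTOR_RANGES: Dict[str, Tuple[float, float]] = {
    "technical_support": (0.75, 0.85),
    "e_commerce": (0.65, 0.75),
    "professional_services": (0.60, 0.70),
}


def starting_threshold(sector: str) -> float:
    """Return the midpoint of the sector's recommended range as a starting value."""
    if sector not in SECTOR_RANGES:
        raise KeyError(f"no recommended range for sector {sector!r}")
    low, high = SECTOR_RANGES[sector]
    return round((low + high) / 2, 2)
```

From the midpoint, the value can then be tuned within the range using real user questions.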
Continuous monitoring
- Quality metrics: Collect feedback on response relevance
- Conversation analysis: Identify patterns of unanswered questions
- Periodic testing: Repeat optimization every 2-3 months
- Content changes: Recalibrate the threshold when new material is added
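One simple metric for conversation analysis is the share of questions for which no fragment passed the threshold. The sketch below assumes a hypothetical log format (`Interaction`) that records how many fragments were used per question; no such log structure is described in this guide.

```python
from typing import Iterable, NamedTuple


class Interaction(NamedTuple):
    """Hypothetical log entry for one user question."""
    question: str
    fragments_used: int  # number of fragments that passed the threshold


def unanswered_rate(log: Iterable[Interaction]) -> float:
    """Fraction of questions for which no fragment passed the threshold."""
    entries = list(log)
    if not entries:
        return 0.0
    unanswered = sum(1 for entry in entries if entry.fragments_used == 0)
    return unanswered / len(entries)
```

A rate that climbs after a threshold increase suggests the threshold has become too high for the current knowledge base.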