Intermediate · 6 min · Updated: 2026-05-16

Precision or Recall? Setting Top-K and Threshold Correctly in the Visualizer

When your bot needs more hits and when it needs fewer. Concrete values for Top-K, minimum similarity and source count in the Zeptix Visualizer.

In the Visualizer you control how your Zeptix bot searches for knowledge. Two switches in the "Search knowledge" node determine whether your bot collects many hits (recall) or only admits the best ones (precision). Anyone who wants to maximize both ends up with neither. This guide shows how to find the right mix.

TL;DR

  • Top-K = how many knowledge snippets flow into the answer.
  • Minimum similarity = the score from which a hit counts at all.
  • More Top-K -> more recall, but blurrier answers.
  • Higher threshold -> more precision, but gaps on rare questions.
  • The default (Top-K 8, minimum similarity 0.5) is a good starting point for most bots.

What Top-K and threshold mean

When an end user asks "How do I cancel my subscription?", Zeptix searches your knowledge base and returns hits with a similarity score. Top-K defines how many of these hits are ultimately built into the answer prompt. Minimum similarity is the lower bound: weaker hits are not passed on at all.

Knob               | Small value                     | Large value
Top-K              | Narrow context, shorter answers | Broad context, longer answers
Minimum similarity | Even weak hits allowed          | Only very similar hits count
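
How the two knobs interact can be sketched in a few lines of Python. This is only an illustration, not the Zeptix implementation: the `Hit` class, the `select_hits` function, and the assumption that a score exactly at the threshold still counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    score: float  # similarity score; assumed here to lie between 0 and 1


def select_hits(hits, top_k=8, min_similarity=0.5):
    """Drop hits below the threshold, then keep the top_k best of the rest."""
    # Assumption: a hit whose score equals the threshold still counts.
    passing = [h for h in hits if h.score >= min_similarity]
    passing.sort(key=lambda h: h.score, reverse=True)
    return passing[:top_k]


hits = [
    Hit("Cancel in account settings", 0.82),
    Hit("Billing cycle overview", 0.55),
    Hit("Password reset", 0.31),
]

# With the threshold at 0.5 the weak "Password reset" hit never reaches the prompt.
selected = select_hits(hits, top_k=2, min_similarity=0.5)
print([h.text for h in selected])
```

Raising `min_similarity` to 0.6 in this toy data would leave only the first hit, which is the precision-versus-recall trade-off in miniature.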

When more recall helps

More recall (higher Top-K, lower threshold) pays off when your bot often replies "I don't know" even though the information is in the knowledge base. Typical symptoms:

  • End users phrase things unusually, and the bot finds nothing.
  • Your knowledge base is large and diverse (technical specifications, FAQ, tutorials mixed together).
  • You have long documents with statements spread throughout.

In these cases, increase Top-K from 8 to 12 or 16 and lower the threshold from 0.5 to 0.40. Then watch the behavior: if answers suddenly become inaccurate, dial the values back.

When more precision helps

More precision (smaller Top-K, higher threshold) makes sense when your bot fabricates too often or mixes sources that do not belong together:

  • The bot answers questions with content from the wrong product.
  • Answers sound coherent but are factually wrong.
  • You have only a single, clear, well-structured body of knowledge.

Here, set Top-K to 4 or 6, raise the threshold to 0.60 or 0.65, and additionally set minimum sources to 2 in the "Anchor sources" node. It is better for the bot to say "I don't know that" than to invent something false.
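
The effect of a minimum source count could look roughly like the sketch below. It assumes that "minimum sources" counts distinct source documents among the hits that survived Top-K and the threshold; the article does not spell this out, and the function name `answer_or_refuse` and the tuple layout are hypothetical.

```python
def answer_or_refuse(hits, min_sources=2):
    """hits: list of (source_id, score) tuples that already passed Top-K and threshold."""
    # Assumption: sources are counted as distinct documents, not as individual snippets.
    distinct_sources = {source_id for source_id, _ in hits}
    if len(distinct_sources) < min_sources:
        return "I don't know that"
    return "ANSWER"  # placeholder for the real answer-generation step


# Two snippets from the same document count as one source in this sketch.
one_source = [("terms.pdf", 0.71), ("terms.pdf", 0.68)]
two_sources = [("terms.pdf", 0.71), ("privacy.pdf", 0.66)]

print(answer_or_refuse(one_source))   # refuses
print(answer_or_refuse(two_sources))  # answers
```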

Concrete recommendations by bot type

Bot type                          | Top-K | Threshold | Minimum sources
FAQ bot with clear answers        | 6     | 0.55      | 1
Knowledge bot over large docs     | 12    | 0.45      | 1
Coaching bot with soft topics     | 8     | 0.50      | 1
Compliance bot, legally sensitive | 4     | 0.65      | 2
Marketing bot, creative           | 8     | 0.40      | 0
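
If you keep your own notes or scripts alongside the Visualizer, the recommendations can be written down as a small lookup table. The values come from the table above; the `Preset` type, the short keys, and the fallback's minimum source count of 1 are assumptions made for this sketch.

```python
from typing import NamedTuple


class Preset(NamedTuple):
    top_k: int
    min_similarity: float
    min_sources: int


# Values taken from the recommendation table; the keys are made-up shorthand.
PRESETS = {
    "faq": Preset(6, 0.55, 1),
    "knowledge": Preset(12, 0.45, 1),
    "coaching": Preset(8, 0.50, 1),
    "compliance": Preset(4, 0.65, 2),
    "marketing": Preset(8, 0.40, 0),
}


def preset_for(bot_type: str) -> Preset:
    """Return the recommended preset, or the default 8 / 0.5 for unknown types."""
    # Assumption: the fallback uses min_sources=1; the article gives no default for it.
    return PRESETS.get(bot_type, Preset(8, 0.5, 1))


print(preset_for("compliance"))
```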

How to set this in the Visualizer

  1. Open https://zeptix.dev/visualizer and select your bot in the navigation bar.
  2. In the canvas, click the "Search knowledge" node. The inspector opens on the right.
  3. Slide Top-K to the desired value (4, 6, 8, 12, 16, 20).
  4. Set minimum similarity between 0.30 and 0.70.
  5. Click Save. The new configuration is live immediately.
  6. Use the live preview to test whether answers change as expected.
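
Before entering values, it can help to check a planned configuration against the ranges from steps 3 and 4. The following is a minimal sketch; `validate_settings` is a hypothetical helper and not part of any Zeptix API.

```python
ALLOWED_TOP_K = (4, 6, 8, 12, 16, 20)  # slider positions listed in step 3
MIN_THRESHOLD, MAX_THRESHOLD = 0.30, 0.70  # range given in step 4


def validate_settings(top_k, min_similarity):
    """Return a list of problems; an empty list means the values fit the ranges."""
    problems = []
    if top_k not in ALLOWED_TOP_K:
        problems.append(f"Top-K {top_k} is not one of {ALLOWED_TOP_K}")
    if not MIN_THRESHOLD <= min_similarity <= MAX_THRESHOLD:
        problems.append(
            f"Minimum similarity {min_similarity} is outside "
            f"{MIN_THRESHOLD}-{MAX_THRESHOLD}"
        )
    return problems


print(validate_settings(8, 0.5))   # no problems
print(validate_settings(10, 0.9))  # both values rejected
```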

For advanced users

If your bot covers a very wide range of topics, combine a higher Top-K (e.g. 16) with the "Sharpen selection" node (reranker). The reranker re-sorts the 16 hits and passes on only the best 8. That gets you both recall and precision, but it costs 150 to 300 ms of latency per question.
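
The retrieve-wide, then narrow idea can be sketched as follows. The word-overlap scorer stands in for a real reranker purely for illustration; how the "Sharpen selection" node scores hits is not described here, and all names in the snippet are hypothetical.

```python
def retrieve_then_rerank(hits, rerank_score, retrieve_k=16, keep_k=8):
    """hits: list of (text, similarity) tuples; rerank_score: callable taking a text."""
    # Stage 1: wide net, take the retrieve_k most similar hits.
    candidates = sorted(hits, key=lambda h: h[1], reverse=True)[:retrieve_k]
    # Stage 2: re-sort the candidates by the second score and keep only the best.
    reranked = sorted(candidates, key=lambda h: rerank_score(h[0]), reverse=True)
    return reranked[:keep_k]


query = "cancel subscription"


def overlap(text):
    # Toy stand-in for a reranker: number of query words that appear in the text.
    return len(set(query.split()) & set(text.lower().split()))


hits = [
    ("Subscription plans overview", 0.80),
    ("How to cancel your subscription", 0.60),
    ("Reset password", 0.55),
]

result = retrieve_then_rerank(hits, overlap, retrieve_k=3, keep_k=2)
print([text for text, _ in result])
```

In this toy data the cancellation article moves ahead of the plans overview even though its similarity score was lower, which is the kind of correction a second sorter is meant to provide.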

More on this in the article "Reranker explained - when the second sorter pays off".

Common mistakes

  • Top-K = 20 + threshold = 0.30: Maximum recall without a filter. Answers become long, vague, and mix topics.
  • Top-K = 4 + threshold = 0.70: Maximum precision. On 70 percent of all questions the bot says "I don't know".
  • Forgetting to test: After saving, always run through 5 real questions. Pipeline tuning without tests almost always leads to worse results.
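
The "5 real questions" check can be made repeatable with a tiny script. The sketch assumes you have some way to send a question to your bot and get the answer back as text; `ask_bot` below is a hypothetical callable you would supply yourself, and the fake bot only exists to make the example runnable.

```python
def run_smoke_test(ask_bot, questions, refusal_marker="I don't know"):
    """Ask every question once and report which ones the bot refused."""
    results = {q: ask_bot(q) for q in questions}
    refused = [q for q, answer in results.items() if refusal_marker in answer]
    return results, refused


def fake_bot(question):
    # Stand-in for a real bot call: only knows about cancellations.
    if "cancel" in question.lower():
        return "Go to account settings and choose Cancel."
    return "I don't know that"


questions = [
    "How do I cancel my subscription?",
    "Can I cancel during the trial?",
    "Where do I change my password?",
    "Which payment methods are supported?",
    "How do I cancel an add-on?",
]

results, refused = run_smoke_test(fake_bot, questions)
print(f"{len(refused)} of {len(questions)} questions refused")
```

Running the same question list before and after a change makes it easier to see whether a new Top-K or threshold actually improved things.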

Next steps

Next article → Reranker explained - when the second sorter in your bot pays off