Querying Apache Solr based on score values

https://stackoverflow.com/questions/22780756

25-06-2023

Question

I am working on an image retrieval task. I have a dataset of Wikipedia images, each with a textual description in an XML file (one XML file per image), and I have indexed those XML files in Solr. When retrieving them, I want to apply a threshold on the score so that low-scoring documents are left out of the results, since they are of little relevance. For example, I want to retrieve all documents with a similarity score greater than or equal to 2.0. I have already tried range queries like score:[2.0 TO *] but can't get them to work. Does anyone know how I can do that?

Solution

What's the motivation for wanting to do this? The reason I ask, is
score is a relative thing determined by Lucene based on your index
statistics. It is only meaningful for comparing the results of a
specific query with a specific instance of the index. In other words,
it isn't useful to filter on b/c there is no way of knowing what a
good cutoff value would be.

http://lucene.472066.n3.nabble.com/score-filter-td493438.html

Also, take a look here - http://wiki.apache.org/lucene-java/ScoresAsPercentages

So, in general, cutting off at a fixed value is a bad idea, because you can never know which threshold is best. For a good query the right cutoff might be score=2, for a bad query score=0.5, and so on. These two links should explain why you DON'T want to do it.

P.S. If you still want to do it, take a look here - https://stackoverflow.com/a/15765203/2663985
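For illustration, one way a score cutoff is commonly expressed in Solr is a function range filter (the `frange` query parser) applied to the score of the main query via `query($q)`. The sketch below only builds the request URL. The host, the core name `images`, the field `description`, and the helper name `build_score_threshold_url` are all hypothetical and should be adapted to your setup; the caveat about scores being relative still applies.

```python
from urllib.parse import urlencode


def build_score_threshold_url(base_url, user_query, min_score):
    """Build a Solr select URL that keeps only docs scoring >= min_score.

    base_url, user_query and min_score are supplied by the caller;
    nothing here contacts a Solr server.
    """
    params = {
        "q": user_query,
        # Return the score alongside the stored fields so it can be inspected.
        "fl": "*,score",
        # frange filters on a function value; query($q) re-evaluates the
        # main query, so its value is the document's score. The lower
        # bound l is inclusive by default.
        "fq": "{!frange l=%s}query($q)" % min_score,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host, core name and field, for illustration only.
url = build_score_threshold_url(
    "http://localhost:8983/solr/images/select",
    "description:castle",
    2.0,
)
print(url)
```

A plain `score:[2.0 TO *]` range query does not work because `score` is computed at query time and is not an indexed field, which is why the cutoff has to go through a function query like this.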

P.P.S. I recommend improving your search queries instead, so that they return results with higher precision (http://en.wikipedia.org/wiki/Precision_and_recall).

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow