What would you like to be added:
Currently, the approximate prefix cache scorer computes its score from the match percentage alone, that is, the fraction of the prompt that matches a cached prefix. The absolute prompt length should also be a factor in the score.
Why is this needed:
The current approach may not be ideal for short prompts. Because the denominator (the prompt length) is small, a short prompt typically gets a high match percentage and therefore a high prefix match score, which could make routing more susceptible to hot spots.
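
To make the concern concrete, here is a minimal sketch, not the scorer's actual code. `ratio_score` models the match percentage described above. `length_aware_score` and its `saturation_len` parameter are hypothetical, shown only as one possible way to factor in absolute length. Lengths are in arbitrary units (tokens or blocks), which is also an assumption.

```python
def ratio_score(matched_len: int, prompt_len: int) -> float:
    """Match percentage: fraction of the prompt covered by a cached prefix."""
    if prompt_len == 0:
        return 0.0
    return matched_len / prompt_len


def length_aware_score(matched_len: int, prompt_len: int,
                       saturation_len: int = 16) -> float:
    """Hypothetical variant: damp the ratio when the matched prefix is short.

    saturation_len is an illustrative knob, not an existing setting. Matches
    at or above this length keep their full ratio; shorter matches are
    scaled down linearly.
    """
    length_weight = min(matched_len, saturation_len) / saturation_len
    return ratio_score(matched_len, prompt_len) * length_weight


# A short prompt that is fully cached vs. a long prompt that is 75% cached.
short_prompt = (2, 2)    # (matched_len, prompt_len)
long_prompt = (48, 64)

print(ratio_score(*short_prompt), ratio_score(*long_prompt))
print(length_aware_score(*short_prompt), length_aware_score(*long_prompt))
```

With the ratio alone, the short prompt outscores the long one even though the long prompt reuses far more cached work. With the hypothetical length weighting, the ordering flips. The exact weighting function is open for discussion.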