Similarity measures

up:: Machine Learning related: loss function (Machine Learning)

Jaccard similarity

The ratio of Intersection over Union.

The literal overlap. Like how many words in one sentece overlap in the other sentence.

This has some problems: Take the options “Go for a leisurely stroll”, “Hike in the mountains” and “Paddle in a kayak”. For the input “I don’t like to hike.”, the option “Hike in the mountains” would be presented.

This way of using similarity ignores semantics.

Cosine similarity

This takes into account the relationship of vectors.