davanstrien 
posted an update Aug 19, 2024

Very useful extension! Thanks


96% similarity to Wikipedia. I love the idea, but the similarity estimate is far off.

original dataset
https://huggingface.co/datasets/kalomaze/Opus_Instruct_25k


At the moment, this relies on the dataset cards, so the similarity does indeed work better for longer dataset cards. I plan to make a version that uses the dataset itself to compute the similarity scores, which should hopefully work better!
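To illustrate why card-based similarity can mislead for short cards, here is a minimal sketch. It assumes a plain bag-of-words cosine similarity, which is not necessarily what the extension uses, and the card texts and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty card carries no signal
    return dot / (norm_a * norm_b)


# Hypothetical short cards: the datasets differ, but the cards barely do.
short_card_a = "A dataset of text."
short_card_b = "A dataset of encyclopedia text."

# With so few words, generic terms dominate and the score comes out high.
print(cosine_similarity(short_card_a, short_card_b))
```

With only a handful of words per card, generic terms such as "dataset" and "text" account for most of the overlap, so two unrelated datasets can score as near duplicates. Longer cards, or features drawn from the data itself, give the comparison more to work with.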

thanks!