Update README.md
README.md CHANGED

@@ -1,6 +1,7 @@
 ---
 base_model:
 - Snowflake/snowflake-arctic-embed-m-long
+library_name: sentence-transformers
 ---
 
 
@@ -60,5 +61,4 @@ print(code_embeddings)
 
 
 ## Training
-We use a bi-encoder architecture for `CodeRankEmbed`, with weights shared between the text and code encoder. The retriever is contrastively fine-tuned with InfoNCE loss on a 21 million example high-quality dataset we curated called [CoRNStack](https://gangiswag.github.io/cornstack/). Our encoder is initialized with [Arctic-Embed-M-Long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long), a 137M parameter text encoder supporting an extended context length of 8,192 tokens.
-
+We use a bi-encoder architecture for `CodeRankEmbed`, with weights shared between the text and code encoder. The retriever is contrastively fine-tuned with InfoNCE loss on a 21 million example high-quality dataset we curated called [CoRNStack](https://gangiswag.github.io/cornstack/). Our encoder is initialized with [Arctic-Embed-M-Long](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-long), a 137M parameter text encoder supporting an extended context length of 8,192 tokens.
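The first hunk adds `library_name: sentence-transformers` to the front matter, which tells the Hub to surface sentence-transformers loading snippets for this model. A minimal sketch of what that enables is below; the repo id `nomic-ai/CodeRankEmbed`, the `trust_remote_code=True` flag (the Arctic-Embed-M-Long backbone ships custom modeling code), and the sample inputs are assumptions, not taken from this diff.

```python
# Minimal sketch: loading the model through sentence-transformers,
# which the new `library_name` tag advertises on the Hub.
from sentence_transformers import SentenceTransformer

# Repo id and trust_remote_code are assumptions for illustration.
model = SentenceTransformer("nomic-ai/CodeRankEmbed", trust_remote_code=True)

# Text and code pass through the same shared encoder (bi-encoder with tied weights).
query_embeddings = model.encode(["how to sort a list in python"])
code_embeddings = model.encode(["def sort_list(xs):\n    return sorted(xs)"])
print(code_embeddings)
```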
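The Training paragraph carried over in the second hunk describes contrastive fine-tuning with the InfoNCE loss. As a point of reference, here is a minimal sketch of that objective using in-batch negatives; the temperature value, batch size, and embedding dimension are illustrative assumptions, not details from the model card.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor, code_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE with in-batch negatives: row i of query_emb pairs with
    row i of code_emb; every other row in the batch acts as a negative.
    The 0.05 temperature is an illustrative choice, not from the card."""
    q = F.normalize(query_emb, dim=-1)
    c = F.normalize(code_emb, dim=-1)
    logits = q @ c.T / temperature                      # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)   # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Example: a batch of 4 query/code pairs with 768-dim embeddings.
loss = info_nce_loss(torch.randn(4, 768), torch.randn(4, 768))
print(loss)
```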