Anthonyg5005/hf-scripts

Anthonyg5005 committed on Jul 28, 2024

Commit 2a943d9

1 Parent(s): b240958

outdated

Files changed (1)

ipynb/EXL2_Private_Quant_V3.ipynb +53 -51

@@ -1,18 +1,36 @@
 {
   "cells": [
     {
       "cell_type": "markdown",
-      "metadata": {
-        "id": "Ku0ezvyD42ng"
-      },
       "source": [
         "#Quantizing huggingface models to exl2\n",
         "This version of my exl2 quantize colab creates a single quantizaion to upload privatly.\\\n",
         "To calculate an estimate for VRAM size use: [NyxKrage/LLM-Model-VRAM-Calculator](https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator)\\\n",
         "Not all models and architectures are compatible with exl2.\\\n",
         "I've only tested with llama-7b and mistral-7b, not sure if higher size models work with free colab.\\\n",
-        "More stuff in [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts)"
-      ]
     },
     {
       "cell_type": "code",
@@ -39,12 +57,6 @@
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "8Hl3fQmRLybp"
-      },
-      "outputs": [],
       "source": [
         "#@title Login to HF (Required to upload files)\n",
         "#@markdown From my Colab/Kaggle login script on [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts/blob/main/HF%20Login%20Snippet%20Kaggle.py)\n",
@@ -98,16 +110,16 @@
         "        login(input(\"Enter your HuggingFace (WRITE) token: \"))\n",
         "        continue\n",
         "    break"
-      ]
-    },
-    {
-      "cell_type": "code",
-      "execution_count": null,
       "metadata": {
         "cellView": "form",
-        "id": "NI1LUMD7H-Zx"
       },
-      "outputs": [],
       "source": [
         "#@title ##Choose HF model to download\n",
         "#@markdown ###Repo should be formatted as user/repo\n",
@@ -127,16 +139,16 @@
         "    print(\"Finished converting\")\n",
         "#@markdown If model files are stored in a pytorch .bin extention then enable convert_safetensors above.\\\n",
         "#@markdown ![Example Image](https://huggingface.co/Anthonyg5005/hf-scripts/resolve/main/ipynb/pytorch-example.jpg \"File extension is .bin\")"
-      ]
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "8anbEbGyNmBI"
-      },
-      "outputs": [],
       "source": [
         "#@title Quantize the model\n",
         "#@markdown ###Quantization time will last based on model size\n",
@@ -193,16 +205,16 @@
         "else:\n",
         "    quant = f\"convert.py -i models/{model} -o {model}-exl2-{BPW}bpw-WD -cf {model}-exl2-{BPW}bpw -b {BPW}\"\n",
         "!python {quant}"
-      ]
     },
     {
       "cell_type": "code",
-      "execution_count": null,
-      "metadata": {
-        "cellView": "form",
-        "id": "XORLS2uPrbma"
-      },
-      "outputs": [],
       "source": [
         "#@title Upload to huggingface privately\n",
         "#@markdown You may also set it to public but I'd recommend waiting for my next ipynb that will create mutliple quants and place them all into individual branches.\n",
@@ -213,23 +225,13 @@
         "create_repo(f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", private=True)\n",
         "HfApi().upload_folder(folder_path=f\"{model}-exl2-{BPW}bpw\", repo_id=f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", repo_type=\"model\", commit_message=\"Upload from Colab automation\")\n",
         "print(f\"uploaded to https://huggingface.co/{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\")"
-      ]
-    }
-  ],
-  "metadata": {
-    "accelerator": "GPU",
-    "colab": {
-      "gpuType": "T4",
-      "provenance": []
-    },
-    "kernelspec": {
-      "display_name": "Python 3",
-      "name": "python3"
-    },
-    "language_info": {
-      "name": "python"
     }
-  },
-  "nbformat": 4,
-  "nbformat_minor": 0
-}

 {
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "gpuType": "T4"
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    },
+    "accelerator": "GPU"
+  },
   "cells": [
     {
       "cell_type": "markdown",
       "source": [
         "#Quantizing huggingface models to exl2\n",
         "This version of my exl2 quantize colab creates a single quantizaion to upload privatly.\\\n",
         "To calculate an estimate for VRAM size use: [NyxKrage/LLM-Model-VRAM-Calculator](https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator)\\\n",
         "Not all models and architectures are compatible with exl2.\\\n",
         "I've only tested with llama-7b and mistral-7b, not sure if higher size models work with free colab.\\\n",
+        "#Outdated\n",
+        "More recent stuff in [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts)\\\n",
+        "If you need to quant a model to exl2 for free, check out the bot from the [Exllama Discord server](https://discord.gg/NSFwVuCjRq)"
+      ],
+      "metadata": {
+        "id": "Ku0ezvyD42ng"
+      }
     },
     {
       "cell_type": "code",
     },
     {
       "cell_type": "code",
       "source": [
         "#@title Login to HF (Required to upload files)\n",
         "#@markdown From my Colab/Kaggle login script on [Anthonyg5005/hf-scripts](https://huggingface.co/Anthonyg5005/hf-scripts/blob/main/HF%20Login%20Snippet%20Kaggle.py)\n",
         "        login(input(\"Enter your HuggingFace (WRITE) token: \"))\n",
         "        continue\n",
         "    break"
+      ],
       "metadata": {
         "cellView": "form",
+        "id": "8Hl3fQmRLybp"
       },
+      "execution_count": null,
+      "outputs": []
+    },
+    {
+      "cell_type": "code",
       "source": [
         "#@title ##Choose HF model to download\n",
         "#@markdown ###Repo should be formatted as user/repo\n",
         "    print(\"Finished converting\")\n",
         "#@markdown If model files are stored in a pytorch .bin extention then enable convert_safetensors above.\\\n",
         "#@markdown ![Example Image](https://huggingface.co/Anthonyg5005/hf-scripts/resolve/main/ipynb/pytorch-example.jpg \"File extension is .bin\")"
+      ],
+      "metadata": {
+        "id": "NI1LUMD7H-Zx",
+        "cellView": "form"
+      },
+      "execution_count": null,
+      "outputs": []
     },
     {
       "cell_type": "code",
       "source": [
         "#@title Quantize the model\n",
         "#@markdown ###Quantization time will last based on model size\n",
         "else:\n",
         "    quant = f\"convert.py -i models/{model} -o {model}-exl2-{BPW}bpw-WD -cf {model}-exl2-{BPW}bpw -b {BPW}\"\n",
         "!python {quant}"
+      ],
+      "metadata": {
+        "id": "8anbEbGyNmBI",
+        "cellView": "form"
+      },
+      "execution_count": null,
+      "outputs": []
     },
     {
       "cell_type": "code",
       "source": [
         "#@title Upload to huggingface privately\n",
         "#@markdown You may also set it to public but I'd recommend waiting for my next ipynb that will create mutliple quants and place them all into individual branches.\n",
         "create_repo(f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", private=True)\n",
         "HfApi().upload_folder(folder_path=f\"{model}-exl2-{BPW}bpw\", repo_id=f\"{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\", repo_type=\"model\", commit_message=\"Upload from Colab automation\")\n",
         "print(f\"uploaded to https://huggingface.co/{whoami().get('name', None)}/{model}-exl2-{BPW}bpw\")"
+      ],
+      "metadata": {
+        "cellView": "form",
+        "id": "XORLS2uPrbma"
+      },
+      "execution_count": null,
+      "outputs": []
     }
+  ]
+}
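
The quantize and upload cells in this notebook derive the working directory, the output directory, and the repo name from the same `{model}-exl2-{BPW}bpw` pattern. The sketch below restates that naming as plain functions so it can be checked without running a quantization. It covers only the `else` branch visible in the diff. The helper names `build_quant_args` and `repo_id_for` are hypothetical and do not appear in the notebook, and the `convert.py` flags are copied from the diff rather than verified against any particular exllamav2 version.

```python
from __future__ import annotations


def build_quant_args(model: str, bpw: float) -> list[str]:
    """Return the convert.py arguments used by the notebook's else branch.

    Hypothetical helper: the notebook builds this as one f-string instead.
    """
    out_dir = f"{model}-exl2-{bpw}bpw"
    return [
        "convert.py",
        "-i", f"models/{model}",   # input model directory
        "-o", f"{out_dir}-WD",     # working directory (the "-WD" suffix is from the notebook)
        "-cf", out_dir,            # folder for the compiled quant
        "-b", str(bpw),            # target bits per weight
    ]


def repo_id_for(user: str, model: str, bpw: float) -> str:
    """Return the repo id the upload cell would target for a given user."""
    return f"{user}/{model}-exl2-{bpw}bpw"


if __name__ == "__main__":
    # Example values only; "some-user" and "mistral-7b" are placeholders.
    print(" ".join(build_quant_args("mistral-7b", 4.0)))
    print(repo_id_for("some-user", "mistral-7b", 4.0))
```

Keeping the name in one place means the folder passed to `-cf` always matches the folder later given to `upload_folder`, which the notebook otherwise relies on two separate f-strings to keep in sync.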