Update README.md

README.md CHANGED

@@ -19,6 +19,24 @@ datasets:
The CrystalChat-7B-based multi-modal large language model (MLLM) mimics the training recipe used for the Vicuna-7B-based [LLaVa-v1.5](https://huggingface.co/docs/transformers/main/model_doc/llava). CrystalChat-7B-based MLLMs are fully transparent: all materials, including code, data, model checkpoints, and intermediate results, are open-sourced, as described in [Web2Code: A Large-scale Webpage-to-Code Dataset
and Evaluation Framework for Multimodal LLMs](https://arxiv.org/pdf/2406.20098). The CrystalChat-7B-Web2Code MLLM is specialized in webpage image-to-HTML code generation.

+
+## CrystalChat-Web2Code Features
+
+**Convert hand-drawn images to a website**
+
+|  |  |
+|:----------------------:|:----------------------:|
+| Hand Drawn Webpage | CrystalChat-Web2Code Rendering |
+
+
+**Recreate a new webpage from an existing webpage**
+Image 1: Original Webpage
+<center><img src="images2/ori.png" alt="original webpage" /></center>
+
+Image 2: CrystalChat-Web2Code Rendering
+<center><img src="images2/crystalchat.png" alt="CrystalChat-Web2Code rendering" /></center>
+
+
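The comparisons above pair a source image with a rendering of the model's generated HTML. As a rough illustration of that last step, the sketch below writes a generated HTML string to disk and opens it in a local browser; it uses only the Python standard library, and the helper name, file name, and placeholder HTML are illustrative rather than anything specified by the README.

```python
import webbrowser
from pathlib import Path

def preview_generated_html(html: str, out_path: str = "crystalchat_output.html") -> None:
    """Write a model-generated HTML string to disk and open it in the default browser."""
    path = Path(out_path)
    path.write_text(html, encoding="utf-8")
    webbrowser.open(path.resolve().as_uri())  # render locally for side-by-side comparison

# Placeholder string standing in for real CrystalChat-Web2Code output.
preview_generated_html("<html><body><h1>Hello from CrystalChat-Web2Code</h1></body></html>")
```

Any HTML-to-image tool can replace the browser step when a static screenshot is needed for figures like those above.
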
## Web2Code Dataset
Our Web2Code instruction tuning dataset construction and instruction generation process
involves four key components:

@@ -142,29 +160,6 @@ The dataset chosen was created by LLaVA with academic-task-oriented VQA data mix

**Table 6:** Distribution of DWU and DWU<sub>R</sub> datasets. Both datasets include high-quality question-answer pairs for webpage understanding.

-
-
-## Examples
-
-
-
-**Example 1: Hand drawn images**
-
-|  |  |
-|:----------------------:|:----------------------:|
-| Hand Drawn Webpage | CrystalChat-Web2Code Rendering |
-
-
-**Example 2: Recreate a webpage from an image**
-Image 1: Original Webpage
-<center><img src="images2/ori.png" alt="k2 eval table" /></center>
-
-Image 2: CrystalChat-Web2Code Rendering
-<center><img src="images2/crystalchat.png" alt="k2 eval table" /></center>
-
-
-**Image 3:** Hand-drawn webpage input to CrystalChat-7B-Web2Code generated output.
-
## Loading Crystal
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
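# The README's snippet is cut off here by the diff context. The lines below are
# a minimal sketch of how such a checkpoint is typically loaded with the two
# classes imported above; the repo id "LLM360/CrystalChat-7B-Web2Code" and the
# trust_remote_code / device_map settings are assumptions, not confirmed by
# this diff.
model_id = "LLM360/CrystalChat-7B-Web2Code"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # custom multi-modal modelling code is assumed
    device_map="auto",       # place weights on available GPU(s)/CPU
)

# Text-only smoke test; webpage-image inputs additionally require the model's
# own image preprocessing (defined in its remote code), which this diff does
# not show.
prompt = "Write a minimal HTML page with a centered <h1> title."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))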