Update README.md
README.md
CHANGED
|
@@ -1,144 +1,45 @@
|
|
| 1 |
---
|
| 2 |
base_model: meta-llama/Llama-3.2-1B-Instruct
|
| 3 |
datasets:
|
| 4 |
-
- fineinstructions/
|
| 5 |
tags:
|
| 6 |
- datadreamer
|
| 7 |
- datadreamer-0.46.0
|
| 8 |
- synthetic
|
| 9 |
- text-generation
|
| 10 |
pipeline_tag: text-generation
|
| 11 |
-
widget:
|
| 12 |
-
- text: "<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December\
|
| 13 |
-
\ 2023\nToday Date: 21 Apr 2025\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\
|
| 14 |
-
\n{\n \"instruction_template\": \"How should we go about <fi>a few word description\
|
| 15 |
-
\ of the desirable outcome</fi> the <fi>a few word description of the undesirable\
|
| 16 |
-
\ situation</fi>? While I think it is important we research ways we can <fi>protect\
|
| 17 |
-
\ ourselves from the undesirable situation</fi>, I think it is equally important\
|
| 18 |
-
\ that we look at some ideas on how we can actually <fi>address the undesirable\
|
| 19 |
-
\ situation</fi> <fi>entities or organizations</fi> like <fi>them</fi> from <fi>their\
|
| 20 |
-
\ actions</fi> on <fi>people or groups</fi>. I have a few ideas of my own, but\
|
| 21 |
-
\ I want to see what other people think is the easiest, most reasonable way to\
|
| 22 |
-
\ <fi>achieve the desirable outcome</fi> or at the very least <fi>minimize the\
|
| 23 |
-
\ undesirable situation</fi>.\",\n \"document\": \"South Asia Pure Water Initiative,\
|
| 24 |
-
\ Inc. (SAPWII) supports two small factories in Kolar and Mysore,Karnataka South\
|
| 25 |
-
\ India to manufacture BioSand Water Filters. For the past 10 years, we have developed\
|
| 26 |
-
\ programs such as our \\u201cAdopt-A-Village Partnership\\u201d and \\u201cErnie\\\
|
| 27 |
-
u2019s Filters for Schools\\u201d that have placed more than 12,000 filters in\
|
| 28 |
-
\ villages and schools in South India. We have brought clean water to more than\
|
| 29 |
-
\ 200,000 people suffering from diseases caused by contaminated water!\\nWith\
|
| 30 |
-
\ the help and support from the Centre for Affordable Water and Sanitation Technologies\
|
| 31 |
-
\ (CAWST), the premier BioSand filter experts worldwide, we have conducted training\
|
| 32 |
-
\ camps in various locations in India to spread the word of the BioSand Water\
|
| 33 |
-
\ Filter technology to all of India. We are training other organizations to manufacture\
|
| 34 |
-
\ and distribute BioSand Water Filters and provide clean water to all locations\
|
| 35 |
-
\ in India where there is a need.\\nOver 500,000 children die every year from\
|
| 36 |
-
\ diarrhea caused by unsafe water and poor sanitation \\u2013 that\\u2019s more\
|
| 37 |
-
\ than 1,400 a day. Achieving universal access to safe water would save 2.5 million\
|
| 38 |
-
\ lives every year. For every $1 invested in water and sanitation, an average\
|
| 39 |
-
\ of $4 is returned in increased productivity and reduced medical costs. Access\
|
| 40 |
-
\ to safe water breaks the cycle of poverty, creates markets where they never\
|
| 41 |
-
\ existed before and uplifts the global community as well as the local community.\\\
|
| 42 |
-
nA BioSand water filter is an adaptation of the traditional slow sand filter which\
|
| 43 |
-
\ has been used for community drinking water treatment for 200 years. The technology\
|
| 44 |
-
\ has been adapted to create a household water treatment filter that can be built\
|
| 45 |
-
\ on a small scale at low cost with materials available locally. The BioSand water\
|
| 46 |
-
\ filter has no replacement parts, requires no electricity, lasts for 30 years\
|
| 47 |
-
\ without ongoing costs and is virtually maintenance free. Found to be very effective\
|
| 48 |
-
\ for reducing water-borne disease and manufactured and used in more than 60 countries\
|
| 49 |
-
\ worldwide.\"\n}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
|
| 50 |
-
example_title: Example 1
|
| 51 |
-
- text: "<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December\
|
| 52 |
-
\ 2023\nToday Date: 21 Apr 2025\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\
|
| 53 |
-
\n{\n \"instruction_template\": \"Can we please use this opportunity to <fi>a\
|
| 54 |
-
\ few word description of a desirable change</fi> and focus more on <fi>a few\
|
| 55 |
-
\ word description of a desirable state</fi>? <fi>Examples of current situations\
|
| 56 |
-
\ or locations where the desirable change is happening</fi> are <fi>a few word\
|
| 57 |
-
\ description of a desirable state</fi> right now. <fi>Examples of locations or\
|
| 58 |
-
\ situations where the desirable change is happening</fi> have <fi>notable examples\
|
| 59 |
-
\ of the desirable change</fi>. The <fi>a few word description of a system or\
|
| 60 |
-
\ environment</fi> is <fi>a few word description of a desirable state</fi>, and\
|
| 61 |
-
\ this all happened in <fi>a short amount of time</fi>. Imagine all the <fi>positive\
|
| 62 |
-
\ outcomes</fi> that could happen if we learned to <fi>coexist with nature</fi>\
|
| 63 |
-
\ and <fi>made improvements</fi>. This is a real opportunity for us all to make\
|
| 64 |
-
\ a <fi>positive change</fi>.\",\n \"document\": \"South Asia Pure Water Initiative,\
|
| 65 |
-
\ Inc. (SAPWII) supports two small factories in Kolar and Mysore,Karnataka South\
|
| 66 |
-
\ India to manufacture BioSand Water Filters. For the past 10 years, we have developed\
|
| 67 |
-
\ programs such as our \\u201cAdopt-A-Village Partnership\\u201d and \\u201cErnie\\\
|
| 68 |
-
u2019s Filters for Schools\\u201d that have placed more than 12,000 filters in\
|
| 69 |
-
\ villages and schools in South India. We have brought clean water to more than\
|
| 70 |
-
\ 200,000 people suffering from diseases caused by contaminated water!\\nWith\
|
| 71 |
-
\ the help and support from the Centre for Affordable Water and Sanitation Technologies\
|
| 72 |
-
\ (CAWST), the premier BioSand filter experts worldwide, we have conducted training\
|
| 73 |
-
\ camps in various locations in India to spread the word of the BioSand Water\
|
| 74 |
-
\ Filter technology to all of India. We are training other organizations to manufacture\
|
| 75 |
-
\ and distribute BioSand Water Filters and provide clean water to all locations\
|
| 76 |
-
\ in India where there is a need.\\nOver 500,000 children die every year from\
|
| 77 |
-
\ diarrhea caused by unsafe water and poor sanitation \\u2013 that\\u2019s more\
|
| 78 |
-
\ than 1,400 a day. Achieving universal access to safe water would save 2.5 million\
|
| 79 |
-
\ lives every year. For every $1 invested in water and sanitation, an average\
|
| 80 |
-
\ of $4 is returned in increased productivity and reduced medical costs. Access\
|
| 81 |
-
\ to safe water breaks the cycle of poverty, creates markets where they never\
|
| 82 |
-
\ existed before and uplifts the global community as well as the local community.\\\
|
| 83 |
-
nA BioSand water filter is an adaptation of the traditional slow sand filter which\
|
| 84 |
-
\ has been used for community drinking water treatment for 200 years. The technology\
|
| 85 |
-
\ has been adapted to create a household water treatment filter that can be built\
|
| 86 |
-
\ on a small scale at low cost with materials available locally. The BioSand water\
|
| 87 |
-
\ filter has no replacement parts, requires no electricity, lasts for 30 years\
|
| 88 |
-
\ without ongoing costs and is virtually maintenance free. Found to be very effective\
|
| 89 |
-
\ for reducing water-borne disease and manufactured and used in more than 60 countries\
|
| 90 |
-
\ worldwide.\"\n}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
|
| 91 |
-
example_title: Example 2
|
| 92 |
-
- text: "<|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December\
|
| 93 |
-
\ 2023\nToday Date: 21 Apr 2025\n\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\
|
| 94 |
-
\n{\n \"instruction_template\": \"what are <fi>a type of item, tool, or technology</fi>\
|
| 95 |
-
\ used for?\",\n \"document\": \"South Asia Pure Water Initiative, Inc. (SAPWII)\
|
| 96 |
-
\ supports two small factories in Kolar and Mysore,Karnataka South India to manufacture\
|
| 97 |
-
\ BioSand Water Filters. For the past 10 years, we have developed programs such\
|
| 98 |
-
\ as our \\u201cAdopt-A-Village Partnership\\u201d and \\u201cErnie\\u2019s Filters\
|
| 99 |
-
\ for Schools\\u201d that have placed more than 12,000 filters in villages and\
|
| 100 |
-
\ schools in South India. We have brought clean water to more than 200,000 people\
|
| 101 |
-
\ suffering from diseases caused by contaminated water!\\nWith the help and support\
|
| 102 |
-
\ from the Centre for Affordable Water and Sanitation Technologies (CAWST), the\
|
| 103 |
-
\ premier BioSand filter experts worldwide, we have conducted training camps in\
|
| 104 |
-
\ various locations in India to spread the word of the BioSand Water Filter technology\
|
| 105 |
-
\ to all of India. We are training other organizations to manufacture and distribute\
|
| 106 |
-
\ BioSand Water Filters and provide clean water to all locations in India where\
|
| 107 |
-
\ there is a need.\\nOver 500,000 children die every year from diarrhea caused\
|
| 108 |
-
\ by unsafe water and poor sanitation \\u2013 that\\u2019s more than 1,400 a day.\
|
| 109 |
-
\ Achieving universal access to safe water would save 2.5 million lives every\
|
| 110 |
-
\ year. For every $1 invested in water and sanitation, an average of $4 is returned\
|
| 111 |
-
\ in increased productivity and reduced medical costs. Access to safe water breaks\
|
| 112 |
-
\ the cycle of poverty, creates markets where they never existed before and uplifts\
|
| 113 |
-
\ the global community as well as the local community.\\nA BioSand water filter\
|
| 114 |
-
\ is an adaptation of the traditional slow sand filter which has been used for\
|
| 115 |
-
\ community drinking water treatment for 200 years. The technology has been adapted\
|
| 116 |
-
\ to create a household water treatment filter that can be built on a small scale\
|
| 117 |
-
\ at low cost with materials available locally. The BioSand water filter has no\
|
| 118 |
-
\ replacement parts, requires no electricity, lasts for 30 years without ongoing\
|
| 119 |
-
\ costs and is virtually maintenance free. Found to be very effective for reducing\
|
| 120 |
-
\ water-borne disease and manufactured and used in more than 60 countries worldwide.\"\
|
| 121 |
-
\n}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
|
| 122 |
-
example_title: Example 3
|
| 123 |
---
|
| 124 |
-
|
| 125 |
|
| 126 |
-
|
| 127 |
|
| 128 |
-
##
|
| 129 |
|
| 130 |
-
```
|
| 131 |
-
|
| 132 |
|
| 133 |
-
|
| 134 |
tokenizer.padding_side = 'left'
|
| 135 |
-
model = AutoModelForCausalLM.from_pretrained('fineinstructions/template_instantiator', revision=None)
|
| 136 |
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id, return_full_text=False)
|
| 137 |
|
| 138 |
-
|
| 139 |
prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
|
| 140 |
-
|
| 141 |
-
|
| 142 |
|
| 143 |
---
|
| 144 |
This model was trained with a synthetic dataset with [DataDreamer 🤖💤](https://datadreamer.dev). The synthetic dataset card and model card can be found [here](datadreamer.json). The training arguments can be found [here](training_args.json).
|
|
|
|
| 1 |
---
|
| 2 |
base_model: meta-llama/Llama-3.2-1B-Instruct
|
| 3 |
datasets:
|
| 4 |
+
- fineinstructions/template_instantiator_training
|
| 5 |
tags:
|
| 6 |
- datadreamer
|
| 7 |
- datadreamer-0.46.0
|
| 8 |
- synthetic
|
| 9 |
- text-generation
|
| 10 |
pipeline_tag: text-generation
|
| 11 |
---
|
| 12 |
+
This model takes an instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document, instantiates the template against the document, and generates an answer.
|
| 13 |
|
| 14 |
+
The output will be a JSON object.
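
Because the output is JSON text, it can be decoded with the standard library before further use. The following is a minimal sketch, assuming the generated text is a single JSON object with no surrounding prose; `parse_generation` is a hypothetical helper name, and the sketch makes no assumption about which keys the object contains.

```python
import json
from typing import Optional


def parse_generation(text: str) -> Optional[dict]:
    """Decode generated text as a JSON object, or return None on failure.

    Hypothetical helper: it only checks that the text is valid JSON and
    that the top-level value is an object (a Python dict).
    """
    try:
        parsed = json.loads(text.strip())
    except json.JSONDecodeError:
        # The generation was truncated or is not valid JSON.
        return None
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` instead of raising lets a batch job skip malformed generations and continue with the rest.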
|
| 15 |
|
| 16 |
+
## Simple Usage Example
|
| 17 |
|
| 18 |
+
```python
|
| 19 |
+
import json
|
| 20 |
+
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
|
| 21 |
|
| 22 |
+
# Load tokenizer and model
|
| 23 |
+
tokenizer = AutoTokenizer.from_pretrained('fineinstructions/template_instantiator', revision=None)
|
| 24 |
tokenizer.padding_side = 'left'
|
| 25 |
+
model = AutoModelForCausalLM.from_pretrained('fineinstructions/template_instantiator', revision=None)
|
| 26 |
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id, return_full_text=False)
|
| 27 |
|
| 28 |
+
# Run inference to instantiate the instruction template and generate an answer
|
| 29 |
+
inputs = [json.dumps({
|
| 30 |
+
"instruction_template": "...",
|
| 31 |
+
"document": "..."
|
| 32 |
+
}, indent=2)]
|
| 33 |
prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
|
| 34 |
+
generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
|
| 35 |
+
output = generations[0][0]['generated_text']
|
| 36 |
+
print(output)
|
| 37 |
|
| 38 |
+
##### Output:
|
| 39 |
+
# {
|
| 40 |
+
# ..
|
| 41 |
+
# }
|
| 42 |
+
#
|
| 43 |
+
```
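
The example above passes a single-element `inputs` list; the same serialization extends to several template and document pairs. This is a sketch only: `build_inputs` is a hypothetical helper, and it assumes each input should be serialized exactly as in the usage example (`json.dumps` with `indent=2` and the keys `instruction_template` and `document`).

```python
import json


def build_inputs(pairs):
    """Serialize (instruction_template, document) pairs into JSON strings.

    Hypothetical helper mirroring the input format of the usage example.
    """
    return [
        json.dumps(
            {"instruction_template": template, "document": document},
            indent=2,
        )
        for template, document in pairs
    ]
```

The returned list can be used in place of `inputs` in the usage example, so that `prompts` and `generations` contain one entry per pair.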
|
| 44 |
---
|
| 45 |
This model was trained with a synthetic dataset with [DataDreamer 🤖💤](https://datadreamer.dev). The synthetic dataset card and model card can be found [here](datadreamer.json). The training arguments can be found [here](training_args.json).
|