See axolotl config
axolotl version: 0.10.0.dev0
base_model: Qwen/Qwen3-32B
# Automatically upload checkpoint and final model to HF
hub_model_id: ctitools/neurocti-qwen3-32b-orion10k-instruct-fb16-r16-lr0.0002-sl2048-e5-v2
plugins:
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
strict: false
chat_template: qwen3
datasets:
  - path: ctitools/orion_10k
    type: chat_template
    field_messages: messages
    split: train[:1%]
    message_property_mappings:
      role: role
      content: content
    roles:
      user:
        - user
      assistant:
        - assistant
val_set_size: 0.01
output_dir: ./outputs/out
dataset_prepared_path: last_run_prepared
sequence_len: 2048
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true
#load_in_4bit: false
#load_in_8bit: true
adapter: lora
lora_r: 16
lora_alpha: 32
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- down_proj
- up_proj
lora_mlp_kernel: true
lora_qkv_kernel: true
lora_o_kernel: true
bf16: auto
tf32: true
wandb_project: neurocti-hunting
wandb_entity: aaronkaplan
wandb_watch:
wandb_name: neurocti-hunting_lora_neurocti-qwen3-32b-orion10k-instruct-fb16-r16-lr0.0002-sl2048-e5-v2
wandb_log_model:
gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 5
#optimizer: adamw_torch_4bit
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.0002
gradient_checkpointing: offload
gradient_checkpointing_kwargs:
  use_reentrant: false
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
weight_decay: 0.0
special_tokens:
# multi-gpu setups:
deepspeed: deepspeed_configs/zero2.json
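A quick way to see what the `datasets` / `chat_template` part of this config consumes is to render one example through the Qwen3 chat template. This is an illustrative sketch only; it assumes the `ctitools/orion_10k` dataset is accessible and stores conversations in a `messages` column with `role`/`content` keys, as the `field_messages` and `message_property_mappings` settings above indicate.

```python
# Illustrative sketch: inspect one training example the way axolotl's
# chat_template dataset type would render it (before packing to sequence_len).
from datasets import load_dataset
from transformers import AutoTokenizer

ds = load_dataset("ctitools/orion_10k", split="train[:1%]")   # same split as the config
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")

# Each row is expected to carry a `messages` list of {role, content} dicts.
text = tok.apply_chat_template(ds[0]["messages"], tokenize=False)
print(text[:500])
```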
neurocti-qwen3-32b-orion10k-instruct-fb16-r16-lr0.0002-sl2048-e5-v2
This model is a fine-tuned version of Qwen/Qwen3-32B on the ctitools/orion_10k dataset. It achieves the following results on the evaluation set:
- Loss: 0.2664
Model description
More information needed
Intended uses & limitations
See our GitHub repo, where you can also find instructions on how to use the model.
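As a minimal loading sketch (assumptions: the adapter weights are published under the `hub_model_id` from the config above, the 32B base model fits on the available GPUs, and the prompt is a placeholder), the LoRA adapter can be applied to the base model with transformers and PEFT:

```python
# Minimal inference sketch: load the Qwen3-32B base model and apply this LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-32B"
adapter_id = "ctitools/neurocti-qwen3-32b-orion10k-instruct-fb16-r16-lr0.0002-sl2048-e5-v2"  # hub_model_id from the config

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# Placeholder prompt; the model was tuned on CTI reports from ctitools/orion_10k.
messages = [{"role": "user", "content": "Extract actionable indicators from this CTI report: ..."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```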
Example output:
<think>
Okay, actionable? Not actionable? Let's see. The report mentions specific file paths and a domain. For actionable, I need indicators like strings, hashes, domains, IPs, or specific file paths.
The file paths `C:\ProgramData\file.txt`, `%LOCALAPPDATA%\Packages\Bay0NsQIzx\p.txt`, and the archive `output.zip` are mentioned. These paths could be used in malware code for reading/writing files. Also, the domain `wetransfers.io` is used for exfiltration. The string `uplo.php` is part of the URL path.
Actionable items:
- File paths (especially the unique ones like Bay0NsQIzx)
- Domain name `wetransfers.io`
- URL path `/uplo.php`
- Archive name `output.zip` (though less unique)
Not actionable items:
- The general concept of stealing browser data (too common)
- Telegram usage (common in malware)
- Compression into ZIP (common technique)
So actionable: Yes. Create rules for the specific file paths, domain, URL path, and archive name.
</think>
rule Bitter_Whale_File_Paths {
    meta:
        author = "cti.tools"
        description = "Detects specific file paths used by Bitter-Whale for storing stolen data and output."
        reference = "AI generated rule from the ORKL10K dataset"
    strings:
        $path1 = "C:\\ProgramData\\file.txt" wide ascii
        $path2 = "%LOCALAPPDATA%\\Packages\\Bay0NsQIzx\\p.txt" wide ascii
        $path3 = "output.zip" wide ascii
        $path4 = "%LOCALAPPDATA%\\Packages\\Bay0NsQIzx" wide ascii
    condition:
        (uint16(0) == 0x5a4d or uint32(0) == 0x464c457f) and (1 of ($path*))
}
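To run a rule like the one above programmatically, something along these lines works with yara-python (illustrative only; `bitter_whale.yar` and `sample.bin` are placeholder file names):

```python
# Illustrative: compile the generated rule and scan a file with yara-python.
import yara

rules = yara.compile(filepath="bitter_whale.yar")   # rule text saved from the model output above
matches = rules.match("sample.bin")                 # placeholder sample to scan

for m in matches:
    print(m.rule)   # e.g. Bitter_Whale_File_Paths when one of the $path* strings is found
```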
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- total_eval_batch_size: 4
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- training_steps: 115
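The reported totals follow directly from the per-device settings; a quick arithmetic check (illustrative only):

```python
# Consistency check for the totals listed above.
micro_batch_size = 1
gradient_accumulation_steps = 2
num_devices = 4

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)        # 8, as reported

num_epochs = 5
training_steps = 115
print(training_steps / num_epochs)   # 23.0, roughly the per-epoch optimizer step count
                                     # visible in the results table below
```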
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.8526 | 0.0426 | 1 | 6.5215 |
1.7964 | 0.2553 | 6 | 4.9077 |
1.1975 | 0.5106 | 12 | 0.6092 |
0.7366 | 0.7660 | 18 | 0.3202 |
0.6753 | 1.0 | 24 | 0.3063 |
0.7945 | 1.2553 | 30 | 0.2814 |
0.8882 | 1.5106 | 36 | 0.2780 |
0.5997 | 1.7660 | 42 | 0.3090 |
0.5288 | 2.0 | 48 | 0.2671 |
0.6918 | 2.2553 | 54 | 0.2669 |
0.8042 | 2.5106 | 60 | 0.2635 |
0.5136 | 2.7660 | 66 | 0.2684 |
0.3987 | 3.0 | 72 | 0.2623 |
0.6158 | 3.2553 | 78 | 0.2665 |
0.7359 | 3.5106 | 84 | 0.2724 |
0.453 | 3.7660 | 90 | 0.2618 |
0.3249 | 4.0 | 96 | 0.2621 |
0.5681 | 4.2553 | 102 | 0.2646 |
0.7036 | 4.5106 | 108 | 0.2661 |
0.4334 | 4.7660 | 114 | 0.2664 |
Framework versions
- PEFT 0.15.2
- Transformers 4.52.3
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1