Thanks for MLX, but having an issue!
#1 · opened by matznerd
Hi, awesome speed on getting this up. Reporting this in case something was overlooked, as I see nothing extra in the instructions etc. This is the full error. Thanks!
🥲 Failed to load the model
Failed to load model
Error when loading model: ValueError: Received 264 parameters not in model: 
model.layers.0.mlp.experts.down_proj.biases,
model.layers.0.mlp.experts.down_proj.weight.scales,
model.layers.0.mlp.experts.down_proj.weight.weight,
model.layers.0.mlp.experts.gate_proj.bias,
model.layers.0.mlp.experts.gate_proj.biases,
model.layers.0.mlp.experts.gate_proj.scales,
model.layers.0.mlp.experts.gate_proj.weight,
model.layers.0.mlp.experts.up_proj.bias,
model.layers.0.mlp.experts.up_proj.biases,
model.layers.0.mlp.experts.up_proj.scales,
model.layers.0.mlp.experts.up_proj.weight,
model.layers.1.mlp.experts.down_proj.biases,
model.layers.1.mlp.experts.down_proj.weight.scales,
model.layers.1.mlp.experts.down_proj.weight.weight,
model.layers.1.mlp.experts.gate_proj.bias,
model.layers.1.mlp.experts.gate_proj.biases,
model.layers.1.mlp.experts.gate_proj.scales,
model.layers.1.mlp.experts.gate_proj.weight,
model.layers.1.mlp.experts.up_proj.bias,
model.layers.1.mlp.experts.up_proj.biases,
model.layers.1.mlp.experts.up_proj.scales,
model.layers.1.mlp.experts.up_proj.weight,
model.layers.10.mlp.experts.down_proj.biases,
model.layers.10.mlp.experts.down_proj.weight.scales,
model.layers.10.mlp.experts.down_proj.weight.weight,
model.layers.10.mlp.experts.gate_proj.bias,
model.layers.10.mlp.experts.gate_proj.biases,
model.layers.10.mlp.experts.gate_proj.scales,
model.layers.10.mlp.experts.gate_proj.weight,
model.layers.10.mlp.experts.up_proj.bias,
model.layers.10.mlp.experts.up_proj.biases,
model.layers.10.mlp.experts.up_proj.scales,
model.layers.10.mlp.experts.up_proj.weight,
model.layers.11.mlp.experts.down_proj.biases,
model.layers.11.mlp.experts.down_proj.weight.scales,
model.layers.11.mlp.experts.down_proj.weight.weight,
model.layers.11.mlp.experts.gate_proj.bias,
model.layers.11.mlp.experts.gate_proj.biases,
model.layers.11.mlp.experts.gate_proj.scales,
model.layers.11.mlp.experts.gate_proj.weight,
model.layers.11.mlp.experts.up_proj.bias,
model.layers.11.mlp.experts.up_proj.biases,
model.layers.11.mlp.experts.up_proj.scales,
model.layers.11.mlp.experts.up_proj.weight,
model.layers.12.mlp.experts.down_proj.biases,
model.layers.12.mlp.experts.down_proj.weight.scales,
model.layers.12.mlp.experts.down_proj.weight.weight,
model.layers.12.mlp.experts.gate_proj.bias,
model.layers.12.mlp.experts.gate_proj.biases,
model.layers.12.mlp.experts.gate_proj.scales,
model.layers.12.mlp.experts.gate_proj.weight,
model.layers.12.mlp.experts.up_proj.bias,
model.layers.12.mlp.experts.up_proj.biases,
model.layers.12.mlp.experts.up_proj.scales,
model.layers.12.mlp.experts.up_proj.weight,
model.layers.13.mlp.experts.down_proj.biases,
model.layers.13.mlp.experts.down_proj.weight.scales,
model.layers.13.mlp.experts.down_proj.weight.weight,
model.layers.13.mlp.experts.gate_proj.bias,
model.layers.13.mlp.experts.gate_proj.biases,
model.layers.13.mlp.experts.gate_proj.scales,
model.layers.13.mlp.experts.gate_proj.weight,
model.layers.13.mlp.experts.up_proj.bias,
model.layers.13.mlp.experts.up_proj.biases,
model.layers.13.mlp.experts.up_proj.scales,
model.layers.13.mlp.experts.up_proj.weight,
model.layers.14.mlp.experts.down_proj.biases,
model.layers.14.mlp.experts.down_proj.weight.scales,
model.layers.14.mlp.experts.down_proj.weight.weight,
model.layers.14.mlp.experts.gate_proj.bias,
model.layers.14.mlp.experts.gate_proj.biases,
model.layers.14.mlp.experts.gate_proj.scales,
model.layers.14.mlp.experts.gate_proj.weight,
model.layers.14.mlp.experts.up_proj.bias,
model.layers.14.mlp.experts.up_proj.biases,
model.layers.14.mlp.experts.up_proj.scales,
model.layers.14.mlp.experts.up_proj.weight,
model.layers.15.mlp.experts.down_proj.biases,
model.layers.15.mlp.experts.down_proj.weight.scales,
model.layers.15.mlp.experts.down_proj.weight.weight,
model.layers.15.mlp.experts.gate_proj.bias,
model.layers.15.mlp.experts.gate_proj.biases,
model.layers.15.mlp.experts.gate_proj.scales,
model.layers.15.mlp.experts.gate_proj.weight,
model.layers.15.mlp.experts.up_proj.bias,
model.layers.15.mlp.experts.up_proj.biases,
model.layers.15.mlp.experts.up_proj.scales,
model.layers.15.mlp.experts.up_proj.weight,
model.layers.16.mlp.experts.down_proj.biases,
model.layers.16.mlp.experts.down_proj.weight.scales,
model.layers.16.mlp.experts.down_proj.weight.weight,
model.layers.16.mlp.experts.gate_proj.bias,
model.layers.16.mlp.experts.gate_proj.biases,
model.layers.16.mlp.experts.gate_proj.scales,
model.layers.16.mlp.experts.gate_proj.weight,
model.layers.16.mlp.experts.up_proj.bias,
model.layers.16.mlp.experts.up_proj.biases,
model.layers.16.mlp.experts.up_proj.scales,
model.layers.16.mlp.experts.up_proj.weight,
model.layers.17.mlp.experts.down_proj.biases,
model.layers.17.mlp.experts.down_proj.weight.scales,
model.layers.17.mlp.experts.down_proj.weight.weight,
model.layers.17.mlp.experts.gate_proj.bias,
model.layers.17.mlp.experts.gate_proj.biases,
model.layers.17.mlp.experts.gate_proj.scales,
model.layers.17.mlp.experts.gate_proj.weight,
model.layers.17.mlp.experts.up_proj.bias,
model.layers.17.mlp.experts.up_proj.biases,
model.layers.17.mlp.experts.up_proj.scales,
model.layers.17.mlp.experts.up_proj.weight,
model.layers.18.mlp.experts.down_proj.biases,
model.layers.18.mlp.experts.down_proj.weight.scales,
model.layers.18.mlp.experts.down_proj.weight.weight,
model.layers.18.mlp.experts.gate_proj.bias,
model.layers.18.mlp.experts.gate_proj.biases,
model.layers.18.mlp.experts.gate_proj.scales,
model.layers.18.mlp.experts.gate_proj.weight,
model.layers.18.mlp.experts.up_proj.bias,
model.layers.18.mlp.experts.up_proj.biases,
model.layers.18.mlp.experts.up_proj.scales,
model.layers.18.mlp.experts.up_proj.weight,
model.layers.19.mlp.experts.down_proj.biases,
model.layers.19.mlp.experts.down_proj.weight.scales,
model.layers.19.mlp.experts.down_proj.weight.weight,
model.layers.19.mlp.experts.gate_proj.bias,
model.layers.19.mlp.experts.gate_proj.biases,
model.layers.19.mlp.experts.gate_proj.scales,
model.layers.19.mlp.experts.gate_proj.weight,
model.layers.19.mlp.experts.up_proj.bias,
model.layers.19.mlp.experts.up_proj.biases,
model.layers.19.mlp.experts.up_proj.scales,
model.layers.19.mlp.experts.up_proj.weight,
model.layers.2.mlp.experts.down_proj.biases,
model.layers.2.mlp.experts.down_proj.weight.scales,
model.layers.2.mlp.experts.down_proj.weight.weight,
model.layers.2.mlp.experts.gate_proj.bias,
model.layers.2.mlp.experts.gate_proj.biases,
model.layers.2.mlp.experts.gate_proj.scales,
model.layers.2.mlp.experts.gate_proj.weight,
model.layers.2.mlp.experts.up_proj.bias,
model.layers.2.mlp.experts.up_proj.biases,
model.layers.2.mlp.experts.up_proj.scales,
model.layers.2.mlp.experts.up_proj.weight,
model.layers.20.mlp.experts.down_proj.biases,
model.layers.20.mlp.experts.down_proj.weight.scales,
model.layers.20.mlp.experts.down_proj.weight.weight,
model.layers.20.mlp.experts.gate_proj.bias,
model.layers.20.mlp.experts.gate_proj.biases,
model.layers.20.mlp.experts.gate_proj.scales,
model.layers.20.mlp.experts.gate_proj.weight,
model.layers.20.mlp.experts.up_proj.bias,
model.layers.20.mlp.experts.up_proj.biases,
model.layers.20.mlp.experts.up_proj.scales,
model.layers.20.mlp.experts.up_proj.weight,
model.layers.21.mlp.experts.down_proj.biases,
model.layers.21.mlp.experts.down_proj.weight.scales,
model.layers.21.mlp.experts.down_proj.weight.weight,
model.layers.21.mlp.experts.gate_proj.bias,
model.layers.21.mlp.experts.gate_proj.biases,
model.layers.21.mlp.experts.gate_proj.scales,
model.layers.21.mlp.experts.gate_proj.weight,
model.layers.21.mlp.experts.up_proj.bias,
model.layers.21.mlp.experts.up_proj.biases,
model.layers.21.mlp.experts.up_proj.scales,
model.layers.21.mlp.experts.up_proj.weight,
model.layers.22.mlp.experts.down_proj.biases,
model.layers.22.mlp.experts.down_proj.weight.scales,
model.layers.22.mlp.experts.down_proj.weight.weight,
model.layers.22.mlp.experts.gate_proj.bias,
model.layers.22.mlp.experts.gate_proj.biases,
model.layers.22.mlp.experts.gate_proj.scales,
model.layers.22.mlp.experts.gate_proj.weight,
model.layers.22.mlp.experts.up_proj.bias,
model.layers.22.mlp.experts.up_proj.biases,
model.layers.22.mlp.experts.up_proj.scales,
model.layers.22.mlp.experts.up_proj.weight,
model.layers.23.mlp.experts.down_proj.biases,
model.layers.23.mlp.experts.down_proj.weight.scales,
model.layers.23.mlp.experts.down_proj.weight.weight,
model.layers.23.mlp.experts.gate_proj.bias,
model.layers.23.mlp.experts.gate_proj.biases,
model.layers.23.mlp.experts.gate_proj.scales,
model.layers.23.mlp.experts.gate_proj.weight,
model.layers.23.mlp.experts.up_proj.bias,
model.layers.23.mlp.experts.up_proj.biases,
model.layers.23.mlp.experts.up_proj.scales,
model.layers.23.mlp.experts.up_proj.weight,
model.layers.3.mlp.experts.down_proj.biases,
model.layers.3.mlp.experts.down_proj.weight.scales,
model.layers.3.mlp.experts.down_proj.weight.weight,
model.layers.3.mlp.experts.gate_proj.bias,
model.layers.3.mlp.experts.gate_proj.biases,
model.layers.3.mlp.experts.gate_proj.scales,
model.layers.3.mlp.experts.gate_proj.weight,
model.layers.3.mlp.experts.up_proj.bias,
model.layers.3.mlp.experts.up_proj.biases,
model.layers.3.mlp.experts.up_proj.scales,
model.layers.3.mlp.experts.up_proj.weight,
model.layers.4.mlp.experts.down_proj.biases,
model.layers.4.mlp.experts.down_proj.weight.scales,
model.layers.4.mlp.experts.down_proj.weight.weight,
model.layers.4.mlp.experts.gate_proj.bias,
model.layers.4.mlp.experts.gate_proj.biases,
model.layers.4.mlp.experts.gate_proj.scales,
model.layers.4.mlp.experts.gate_proj.weight,
model.layers.4.mlp.experts.up_proj.bias,
model.layers.4.mlp.experts.up_proj.biases,
model.layers.4.mlp.experts.up_proj.scales,
model.layers.4.mlp.experts.up_proj.weight,
model.layers.5.mlp.experts.down_proj.biases,
model.layers.5.mlp.experts.down_proj.weight.scales,
model.layers.5.mlp.experts.down_proj.weight.weight,
model.layers.5.mlp.experts.gate_proj.bias,
model.layers.5.mlp.experts.gate_proj.biases,
model.layers.5.mlp.experts.gate_proj.scales,
model.layers.5.mlp.experts.gate_proj.weight,
model.layers.5.mlp.experts.up_proj.bias,
model.layers.5.mlp.experts.up_proj.biases,
model.layers.5.mlp.experts.up_proj.scales,
model.layers.5.mlp.experts.up_proj.weight,
model.layers.6.mlp.experts.down_proj.biases,
model.layers.6.mlp.experts.down_proj.weight.scales,
model.layers.6.mlp.experts.down_proj.weight.weight,
model.layers.6.mlp.experts.gate_proj.bias,
model.layers.6.mlp.experts.gate_proj.biases,
model.layers.6.mlp.experts.gate_proj.scales,
model.layers.6.mlp.experts.gate_proj.weight,
model.layers.6.mlp.experts.up_proj.bias,
model.layers.6.mlp.experts.up_proj.biases,
model.layers.6.mlp.experts.up_proj.scales,
model.layers.6.mlp.experts.up_proj.weight,
model.layers.7.mlp.experts.down_proj.biases,
model.layers.7.mlp.experts.down_proj.weight.scales,
model.layers.7.mlp.experts.down_proj.weight.weight,
model.layers.7.mlp.experts.gate_proj.bias,
model.layers.7.mlp.experts.gate_proj.biases,
model.layers.7.mlp.experts.gate_proj.scales,
model.layers.7.mlp.experts.gate_proj.weight,
model.layers.7.mlp.experts.up_proj.bias,
model.layers.7.mlp.experts.up_proj.biases,
model.layers.7.mlp.experts.up_proj.scales,
model.layers.7.mlp.experts.up_proj.weight,
model.layers.8.mlp.experts.down_proj.biases,
model.layers.8.mlp.experts.down_proj.weight.scales,
model.layers.8.mlp.experts.down_proj.weight.weight,
model.layers.8.mlp.experts.gate_proj.bias,
model.layers.8.mlp.experts.gate_proj.biases,
model.layers.8.mlp.experts.gate_proj.scales,
model.layers.8.mlp.experts.gate_proj.weight,
model.layers.8.mlp.experts.up_proj.bias,
model.layers.8.mlp.experts.up_proj.biases,
model.layers.8.mlp.experts.up_proj.scales,
model.layers.8.mlp.experts.up_proj.weight,
model.layers.9.mlp.experts.down_proj.biases,
model.layers.9.mlp.experts.down_proj.weight.scales,
model.layers.9.mlp.experts.down_proj.weight.weight,
model.layers.9.mlp.experts.gate_proj.bias,
model.layers.9.mlp.experts.gate_proj.biases,
model.layers.9.mlp.experts.gate_proj.scales,
model.layers.9.mlp.experts.gate_proj.weight,
model.layers.9.mlp.experts.up_proj.bias,
model.layers.9.mlp.experts.up_proj.biases,
model.layers.9.mlp.experts.up_proj.scales,
model.layers.9.mlp.experts.up_proj.weight.
Thanks for letting me know about this. Could you double-check that you have the latest files downloaded? I only just finished uploading the model. I'll also download it myself to check as soon as I've finished uploading the 120b edition.
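In case it's useful, here's a minimal sketch of forcing a fresh download with `huggingface_hub`; the repo id below is a placeholder, not the actual repo name:

```python
# Minimal sketch: force a fresh download of the latest snapshot.
# NOTE: "your-org/your-model-mlx" is a placeholder repo id, not the real one.
from huggingface_hub import snapshot_download

local_path = snapshot_download(
    repo_id="your-org/your-model-mlx",  # placeholder: substitute the actual repo id
    force_download=True,                # ignore any stale cached files
)
print("Fresh files downloaded to:", local_path)
```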
Just tested here and it worked. Try double-checking that the files you downloaded are the same size as the ones in the repo.
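If you want to automate that size check, here's a rough sketch using `huggingface_hub` (the repo id and local directory below are placeholders):

```python
# Minimal sketch: compare local file sizes against the sizes reported by the Hub.
# NOTE: the repo id and local directory are placeholders -- substitute your own.
import os
from huggingface_hub import HfApi

repo_id = "your-org/your-model-mlx"      # placeholder repo id
local_dir = "/path/to/your/local/model"  # placeholder: folder you downloaded into

info = HfApi().model_info(repo_id, files_metadata=True)
for sibling in info.siblings:
    local_file = os.path.join(local_dir, sibling.rfilename)
    if not os.path.exists(local_file):
        print(f"missing locally: {sibling.rfilename}")
    elif sibling.size is not None and os.path.getsize(local_file) != sibling.size:
        print(f"size mismatch: {sibling.rfilename} "
              f"(local {os.path.getsize(local_file)}, repo {sibling.size})")
```

Any file that shows up as missing or mismatched is worth re-downloading before trying to load the model again.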