Thanks for MLX, but having an issue!

#1
by matznerd - opened

Hi, awesome speed on getting this up! I'm reporting this in case something was overlooked, since I don't see anything extra in the instructions etc.

This is the full error. Thanks!

🥲 Failed to load the model

Failed to load model

Error when loading model: ValueError: Received 264 parameters not in model: 
model.layers.0.mlp.experts.down_proj.biases,
model.layers.0.mlp.experts.down_proj.weight.scales,
model.layers.0.mlp.experts.down_proj.weight.weight,
model.layers.0.mlp.experts.gate_proj.bias,
model.layers.0.mlp.experts.gate_proj.biases,
model.layers.0.mlp.experts.gate_proj.scales,
model.layers.0.mlp.experts.gate_proj.weight,
model.layers.0.mlp.experts.up_proj.bias,
model.layers.0.mlp.experts.up_proj.biases,
model.layers.0.mlp.experts.up_proj.scales,
model.layers.0.mlp.experts.up_proj.weight,
model.layers.1.mlp.experts.down_proj.biases,
model.layers.1.mlp.experts.down_proj.weight.scales,
model.layers.1.mlp.experts.down_proj.weight.weight,
model.layers.1.mlp.experts.gate_proj.bias,
model.layers.1.mlp.experts.gate_proj.biases,
model.layers.1.mlp.experts.gate_proj.scales,
model.layers.1.mlp.experts.gate_proj.weight,
model.layers.1.mlp.experts.up_proj.bias,
model.layers.1.mlp.experts.up_proj.biases,
model.layers.1.mlp.experts.up_proj.scales,
model.layers.1.mlp.experts.up_proj.weight,
model.layers.10.mlp.experts.down_proj.biases,
model.layers.10.mlp.experts.down_proj.weight.scales,
model.layers.10.mlp.experts.down_proj.weight.weight,
model.layers.10.mlp.experts.gate_proj.bias,
model.layers.10.mlp.experts.gate_proj.biases,
model.layers.10.mlp.experts.gate_proj.scales,
model.layers.10.mlp.experts.gate_proj.weight,
model.layers.10.mlp.experts.up_proj.bias,
model.layers.10.mlp.experts.up_proj.biases,
model.layers.10.mlp.experts.up_proj.scales,
model.layers.10.mlp.experts.up_proj.weight,
model.layers.11.mlp.experts.down_proj.biases,
model.layers.11.mlp.experts.down_proj.weight.scales,
model.layers.11.mlp.experts.down_proj.weight.weight,
model.layers.11.mlp.experts.gate_proj.bias,
model.layers.11.mlp.experts.gate_proj.biases,
model.layers.11.mlp.experts.gate_proj.scales,
model.layers.11.mlp.experts.gate_proj.weight,
model.layers.11.mlp.experts.up_proj.bias,
model.layers.11.mlp.experts.up_proj.biases,
model.layers.11.mlp.experts.up_proj.scales,
model.layers.11.mlp.experts.up_proj.weight,
model.layers.12.mlp.experts.down_proj.biases,
model.layers.12.mlp.experts.down_proj.weight.scales,
model.layers.12.mlp.experts.down_proj.weight.weight,
model.layers.12.mlp.experts.gate_proj.bias,
model.layers.12.mlp.experts.gate_proj.biases,
model.layers.12.mlp.experts.gate_proj.scales,
model.layers.12.mlp.experts.gate_proj.weight,
model.layers.12.mlp.experts.up_proj.bias,
model.layers.12.mlp.experts.up_proj.biases,
model.layers.12.mlp.experts.up_proj.scales,
model.layers.12.mlp.experts.up_proj.weight,
model.layers.13.mlp.experts.down_proj.biases,
model.layers.13.mlp.experts.down_proj.weight.scales,
model.layers.13.mlp.experts.down_proj.weight.weight,
model.layers.13.mlp.experts.gate_proj.bias,
model.layers.13.mlp.experts.gate_proj.biases,
model.layers.13.mlp.experts.gate_proj.scales,
model.layers.13.mlp.experts.gate_proj.weight,
model.layers.13.mlp.experts.up_proj.bias,
model.layers.13.mlp.experts.up_proj.biases,
model.layers.13.mlp.experts.up_proj.scales,
model.layers.13.mlp.experts.up_proj.weight,
model.layers.14.mlp.experts.down_proj.biases,
model.layers.14.mlp.experts.down_proj.weight.scales,
model.layers.14.mlp.experts.down_proj.weight.weight,
model.layers.14.mlp.experts.gate_proj.bias,
model.layers.14.mlp.experts.gate_proj.biases,
model.layers.14.mlp.experts.gate_proj.scales,
model.layers.14.mlp.experts.gate_proj.weight,
model.layers.14.mlp.experts.up_proj.bias,
model.layers.14.mlp.experts.up_proj.biases,
model.layers.14.mlp.experts.up_proj.scales,
model.layers.14.mlp.experts.up_proj.weight,
model.layers.15.mlp.experts.down_proj.biases,
model.layers.15.mlp.experts.down_proj.weight.scales,
model.layers.15.mlp.experts.down_proj.weight.weight,
model.layers.15.mlp.experts.gate_proj.bias,
model.layers.15.mlp.experts.gate_proj.biases,
model.layers.15.mlp.experts.gate_proj.scales,
model.layers.15.mlp.experts.gate_proj.weight,
model.layers.15.mlp.experts.up_proj.bias,
model.layers.15.mlp.experts.up_proj.biases,
model.layers.15.mlp.experts.up_proj.scales,
model.layers.15.mlp.experts.up_proj.weight,
model.layers.16.mlp.experts.down_proj.biases,
model.layers.16.mlp.experts.down_proj.weight.scales,
model.layers.16.mlp.experts.down_proj.weight.weight,
model.layers.16.mlp.experts.gate_proj.bias,
model.layers.16.mlp.experts.gate_proj.biases,
model.layers.16.mlp.experts.gate_proj.scales,
model.layers.16.mlp.experts.gate_proj.weight,
model.layers.16.mlp.experts.up_proj.bias,
model.layers.16.mlp.experts.up_proj.biases,
model.layers.16.mlp.experts.up_proj.scales,
model.layers.16.mlp.experts.up_proj.weight,
model.layers.17.mlp.experts.down_proj.biases,
model.layers.17.mlp.experts.down_proj.weight.scales,
model.layers.17.mlp.experts.down_proj.weight.weight,
model.layers.17.mlp.experts.gate_proj.bias,
model.layers.17.mlp.experts.gate_proj.biases,
model.layers.17.mlp.experts.gate_proj.scales,
model.layers.17.mlp.experts.gate_proj.weight,
model.layers.17.mlp.experts.up_proj.bias,
model.layers.17.mlp.experts.up_proj.biases,
model.layers.17.mlp.experts.up_proj.scales,
model.layers.17.mlp.experts.up_proj.weight,
model.layers.18.mlp.experts.down_proj.biases,
model.layers.18.mlp.experts.down_proj.weight.scales,
model.layers.18.mlp.experts.down_proj.weight.weight,
model.layers.18.mlp.experts.gate_proj.bias,
model.layers.18.mlp.experts.gate_proj.biases,
model.layers.18.mlp.experts.gate_proj.scales,
model.layers.18.mlp.experts.gate_proj.weight,
model.layers.18.mlp.experts.up_proj.bias,
model.layers.18.mlp.experts.up_proj.biases,
model.layers.18.mlp.experts.up_proj.scales,
model.layers.18.mlp.experts.up_proj.weight,
model.layers.19.mlp.experts.down_proj.biases,
model.layers.19.mlp.experts.down_proj.weight.scales,
model.layers.19.mlp.experts.down_proj.weight.weight,
model.layers.19.mlp.experts.gate_proj.bias,
model.layers.19.mlp.experts.gate_proj.biases,
model.layers.19.mlp.experts.gate_proj.scales,
model.layers.19.mlp.experts.gate_proj.weight,
model.layers.19.mlp.experts.up_proj.bias,
model.layers.19.mlp.experts.up_proj.biases,
model.layers.19.mlp.experts.up_proj.scales,
model.layers.19.mlp.experts.up_proj.weight,
model.layers.2.mlp.experts.down_proj.biases,
model.layers.2.mlp.experts.down_proj.weight.scales,
model.layers.2.mlp.experts.down_proj.weight.weight,
model.layers.2.mlp.experts.gate_proj.bias,
model.layers.2.mlp.experts.gate_proj.biases,
model.layers.2.mlp.experts.gate_proj.scales,
model.layers.2.mlp.experts.gate_proj.weight,
model.layers.2.mlp.experts.up_proj.bias,
model.layers.2.mlp.experts.up_proj.biases,
model.layers.2.mlp.experts.up_proj.scales,
model.layers.2.mlp.experts.up_proj.weight,
model.layers.20.mlp.experts.down_proj.biases,
model.layers.20.mlp.experts.down_proj.weight.scales,
model.layers.20.mlp.experts.down_proj.weight.weight,
model.layers.20.mlp.experts.gate_proj.bias,
model.layers.20.mlp.experts.gate_proj.biases,
model.layers.20.mlp.experts.gate_proj.scales,
model.layers.20.mlp.experts.gate_proj.weight,
model.layers.20.mlp.experts.up_proj.bias,
model.layers.20.mlp.experts.up_proj.biases,
model.layers.20.mlp.experts.up_proj.scales,
model.layers.20.mlp.experts.up_proj.weight,
model.layers.21.mlp.experts.down_proj.biases,
model.layers.21.mlp.experts.down_proj.weight.scales,
model.layers.21.mlp.experts.down_proj.weight.weight,
model.layers.21.mlp.experts.gate_proj.bias,
model.layers.21.mlp.experts.gate_proj.biases,
model.layers.21.mlp.experts.gate_proj.scales,
model.layers.21.mlp.experts.gate_proj.weight,
model.layers.21.mlp.experts.up_proj.bias,
model.layers.21.mlp.experts.up_proj.biases,
model.layers.21.mlp.experts.up_proj.scales,
model.layers.21.mlp.experts.up_proj.weight,
model.layers.22.mlp.experts.down_proj.biases,
model.layers.22.mlp.experts.down_proj.weight.scales,
model.layers.22.mlp.experts.down_proj.weight.weight,
model.layers.22.mlp.experts.gate_proj.bias,
model.layers.22.mlp.experts.gate_proj.biases,
model.layers.22.mlp.experts.gate_proj.scales,
model.layers.22.mlp.experts.gate_proj.weight,
model.layers.22.mlp.experts.up_proj.bias,
model.layers.22.mlp.experts.up_proj.biases,
model.layers.22.mlp.experts.up_proj.scales,
model.layers.22.mlp.experts.up_proj.weight,
model.layers.23.mlp.experts.down_proj.biases,
model.layers.23.mlp.experts.down_proj.weight.scales,
model.layers.23.mlp.experts.down_proj.weight.weight,
model.layers.23.mlp.experts.gate_proj.bias,
model.layers.23.mlp.experts.gate_proj.biases,
model.layers.23.mlp.experts.gate_proj.scales,
model.layers.23.mlp.experts.gate_proj.weight,
model.layers.23.mlp.experts.up_proj.bias,
model.layers.23.mlp.experts.up_proj.biases,
model.layers.23.mlp.experts.up_proj.scales,
model.layers.23.mlp.experts.up_proj.weight,
model.layers.3.mlp.experts.down_proj.biases,
model.layers.3.mlp.experts.down_proj.weight.scales,
model.layers.3.mlp.experts.down_proj.weight.weight,
model.layers.3.mlp.experts.gate_proj.bias,
model.layers.3.mlp.experts.gate_proj.biases,
model.layers.3.mlp.experts.gate_proj.scales,
model.layers.3.mlp.experts.gate_proj.weight,
model.layers.3.mlp.experts.up_proj.bias,
model.layers.3.mlp.experts.up_proj.biases,
model.layers.3.mlp.experts.up_proj.scales,
model.layers.3.mlp.experts.up_proj.weight,
model.layers.4.mlp.experts.down_proj.biases,
model.layers.4.mlp.experts.down_proj.weight.scales,
model.layers.4.mlp.experts.down_proj.weight.weight,
model.layers.4.mlp.experts.gate_proj.bias,
model.layers.4.mlp.experts.gate_proj.biases,
model.layers.4.mlp.experts.gate_proj.scales,
model.layers.4.mlp.experts.gate_proj.weight,
model.layers.4.mlp.experts.up_proj.bias,
model.layers.4.mlp.experts.up_proj.biases,
model.layers.4.mlp.experts.up_proj.scales,
model.layers.4.mlp.experts.up_proj.weight,
model.layers.5.mlp.experts.down_proj.biases,
model.layers.5.mlp.experts.down_proj.weight.scales,
model.layers.5.mlp.experts.down_proj.weight.weight,
model.layers.5.mlp.experts.gate_proj.bias,
model.layers.5.mlp.experts.gate_proj.biases,
model.layers.5.mlp.experts.gate_proj.scales,
model.layers.5.mlp.experts.gate_proj.weight,
model.layers.5.mlp.experts.up_proj.bias,
model.layers.5.mlp.experts.up_proj.biases,
model.layers.5.mlp.experts.up_proj.scales,
model.layers.5.mlp.experts.up_proj.weight,
model.layers.6.mlp.experts.down_proj.biases,
model.layers.6.mlp.experts.down_proj.weight.scales,
model.layers.6.mlp.experts.down_proj.weight.weight,
model.layers.6.mlp.experts.gate_proj.bias,
model.layers.6.mlp.experts.gate_proj.biases,
model.layers.6.mlp.experts.gate_proj.scales,
model.layers.6.mlp.experts.gate_proj.weight,
model.layers.6.mlp.experts.up_proj.bias,
model.layers.6.mlp.experts.up_proj.biases,
model.layers.6.mlp.experts.up_proj.scales,
model.layers.6.mlp.experts.up_proj.weight,
model.layers.7.mlp.experts.down_proj.biases,
model.layers.7.mlp.experts.down_proj.weight.scales,
model.layers.7.mlp.experts.down_proj.weight.weight,
model.layers.7.mlp.experts.gate_proj.bias,
model.layers.7.mlp.experts.gate_proj.biases,
model.layers.7.mlp.experts.gate_proj.scales,
model.layers.7.mlp.experts.gate_proj.weight,
model.layers.7.mlp.experts.up_proj.bias,
model.layers.7.mlp.experts.up_proj.biases,
model.layers.7.mlp.experts.up_proj.scales,
model.layers.7.mlp.experts.up_proj.weight,
model.layers.8.mlp.experts.down_proj.biases,
model.layers.8.mlp.experts.down_proj.weight.scales,
model.layers.8.mlp.experts.down_proj.weight.weight,
model.layers.8.mlp.experts.gate_proj.bias,
model.layers.8.mlp.experts.gate_proj.biases,
model.layers.8.mlp.experts.gate_proj.scales,
model.layers.8.mlp.experts.gate_proj.weight,
model.layers.8.mlp.experts.up_proj.bias,
model.layers.8.mlp.experts.up_proj.biases,
model.layers.8.mlp.experts.up_proj.scales,
model.layers.8.mlp.experts.up_proj.weight,
model.layers.9.mlp.experts.down_proj.biases,
model.layers.9.mlp.experts.down_proj.weight.scales,
model.layers.9.mlp.experts.down_proj.weight.weight,
model.layers.9.mlp.experts.gate_proj.bias,
model.layers.9.mlp.experts.gate_proj.biases,
model.layers.9.mlp.experts.gate_proj.scales,
model.layers.9.mlp.experts.gate_proj.weight,
model.layers.9.mlp.experts.up_proj.bias,
model.layers.9.mlp.experts.up_proj.biases,
model.layers.9.mlp.experts.up_proj.scales,
model.layers.9.mlp.experts.up_proj.weight.

Thanks for letting me know about this. Could you double-check that you have the latest files downloaded? I only just finished uploading the model. I'll also download it myself to check as soon as I've finished uploading the 120b edition.
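
If you want to force a clean re-download, here's a minimal sketch, assuming you fetched the files from the Hugging Face Hub with the `huggingface_hub` Python package (the repo id below is a placeholder, not the actual repo name):

```python
# Minimal sketch: force-refresh a possibly stale local copy of a Hub repo.
# Assumes the files were downloaded via huggingface_hub;
# "your-org/your-mlx-model" is a placeholder repo id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="your-org/your-mlx-model",  # placeholder
    force_download=True,                # ignore any partially or previously cached files
)
print("Fresh snapshot at:", local_dir)
```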

Just tested here and it worked. Try double-checking that the files you downloaded are the same size as the ones in the repo.
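
For anyone who wants to automate that size check, here's a minimal sketch using `huggingface_hub`; the repo id and local path are placeholders for your actual download location:

```python
# Minimal sketch: compare local file sizes against the sizes listed in the Hub repo.
# "your-org/your-mlx-model" and the local path are placeholders.
from pathlib import Path
from huggingface_hub import HfApi

repo_id = "your-org/your-mlx-model"                        # placeholder
local_dir = Path("~/models/your-mlx-model").expanduser()   # placeholder

# files_metadata=True populates the size of each file in the repo
info = HfApi().model_info(repo_id, files_metadata=True)
for sibling in info.siblings:
    local_file = local_dir / sibling.rfilename
    if not local_file.exists():
        print(f"MISSING        {sibling.rfilename}")
    elif sibling.size is not None and local_file.stat().st_size != sibling.size:
        print(f"SIZE MISMATCH  {sibling.rfilename}: "
              f"local={local_file.stat().st_size} repo={sibling.size}")
```

Any file flagged as missing or mismatched is worth re-downloading before trying to load the model again.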
