Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

This repository contains low-rank adapters (LoRA weights) for Large Language Models (LLMs) featuring Training-Free Bayesianization (TFB). TFB is a simple yet theoretically grounded framework that efficiently transforms trained low-rank adapters into Bayesian ones without requiring additional training. It systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. Theoretical analysis shows that, under mild conditions, this search is equivalent to KL-regularized variational optimization, a generalized form of variational inference. Comprehensive experiments show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex Bayesianization training procedures.

The model is presented in the paper: Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

Abstract

Estimating the uncertainty of responses from Large Language Models (LLMs) remains a critical challenge. While recent Bayesian methods have demonstrated effectiveness in quantifying uncertainty through low-rank weight updates, they typically require complex fine-tuning or post-training procedures. In this paper, we propose Training-Free Bayesianization (TFB), a simple yet theoretically grounded framework that efficiently transforms trained low-rank adapters into Bayesian ones without additional training. TFB systematically searches for the maximally acceptable level of variance in the weight posterior, constrained within a family of low-rank isotropic Gaussian distributions. Our theoretical analysis shows that under mild conditions, this search process is equivalent to KL-regularized variational optimization, a generalized form of variational inference. Through comprehensive experiments, we show that TFB achieves superior uncertainty estimation and generalization compared to existing methods while eliminating the need for complex Bayesianization training procedures.
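To make the core idea concrete, below is a minimal conceptual sketch of the variance search described above, not the repository's implementation: it assumes the posterior is parameterized by a single noise scale sigma applied to the LoRA A factor, and that a hypothetical evaluate_with_noise callback returns a held-out metric. The function names, the choice of perturbing A, and the bisection schedule are illustrative assumptions.

```python
# Conceptual sketch of the TFB variance search (NOT the official implementation).
# Starting from trained LoRA factors (A, B), treat the weight update as a
# low-rank isotropic Gaussian around B @ A and bisect for the largest noise
# scale sigma whose held-out metric stays within a tolerance `eps`.

import torch

def sample_lora_delta(A: torch.Tensor, B: torch.Tensor, sigma: float) -> torch.Tensor:
    """Draw one weight sample Delta W = B @ (A + sigma * E), with E ~ N(0, I)."""
    noise = torch.randn_like(A) * sigma
    return B @ (A + noise)

def search_max_sigma(A, B, evaluate_with_noise, eps=0.01,
                     sigma_lo=0.0, sigma_hi=1.0, iters=20):
    """Bisection for the largest sigma whose metric drop is at most eps.

    `evaluate_with_noise(A, B, sigma)` is a user-supplied (hypothetical)
    callback that returns a held-out metric for the noised adapter.
    """
    base_metric = evaluate_with_noise(A, B, sigma=0.0)  # deterministic adapter
    for _ in range(iters):
        mid = 0.5 * (sigma_lo + sigma_hi)
        if base_metric - evaluate_with_noise(A, B, sigma=mid) <= eps:
            sigma_lo = mid  # still acceptable -> allow more variance
        else:
            sigma_hi = mid  # too much degradation -> shrink the interval
    return sigma_lo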

For the associated code, installation instructions, and further details, please refer to the official GitHub repository: https://github.com/Wang-ML-Lab/bayesian-peft

Usage

This model provides Bayesian low-rank adapters for the meta-llama/Llama-3.1-8B base model. For full usage, including running experiments, evaluating the TFB method, and loading LoRA weights, please refer to the detailed instructions in the GitHub repository.
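As a hedged sketch (not an official usage snippet from the repository), the adapters can in principle be attached to the base model with the standard transformers and peft APIs. The subfolder name below is only a guess at this adapter repository's layout and should be adjusted; the Bayesianization and uncertainty evaluation themselves are driven by the repository's scripts rather than by plain adapter loading.

```python
# Hedged sketch: attach one of the released LoRA adapters to the base model
# using transformers + peft. Requires access to the gated Llama 3.1 weights.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Attach a trained adapter; the subfolder below is a hypothetical path --
# check FlyLee/bayesian-peft for the actual adapter locations.
model = PeftModel.from_pretrained(
    base_model,
    "FlyLee/bayesian-peft",
    subfolder="blob/llama-3.1-8b",  # assumption: adjust to the repo's layout
)
model.eval()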

The GitHub repository provides scripts to run TFB on LoRA weights obtained from various training methods (BLoB, MLE, MAP).

To evaluate TFB on LoRA weights obtained from <method_name> training (one of BLoB, MLE, or MAP), use the following script:

bash scripts/tfblora/<method_name>.sh
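
For example, assuming the scripts are named after the lowercase method identifiers (check the repository for the exact file names), evaluating TFB on BLoB-trained LoRA weights would be:

bash scripts/tfblora/blob.sh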

Pre-trained BLoB, MLE, and MAP LoRA weights on various datasets are available at FlyLee/bayesian-peft. These are loaded by default in the scripts; to load local LoRA weights instead, specify:

--load-lora-huggingface-repo None \
--load-lora-path <path_to_your_lora_weights>
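
For instance, assuming the wrapper scripts forward extra command-line arguments to the underlying entry point (an assumption; check the scripts), evaluating TFB on locally stored MLE adapters at a hypothetical path might look like:

bash scripts/tfblora/mle.sh \
    --load-lora-huggingface-repo None \
    --load-lora-path ./checkpoints/llama3.1-8b-mle-lora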