Can a GPU with 48 GB of VRAM run this normally?

#14
by hiSandog - opened

Yes — a 32 GB card is already enough to run the diffusers pipeline normally.


May I ask why my 48 GB card still reports insufficient memory:
Traceback (most recent call last):
File "/data/lsc/IClight/qianw/test.py", line 13, in
pipe = pipe.to(device)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 541, in to
module.to(device, dtype)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 1446, in to
return super().to(*args, **kwargs)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1355, in to
return self._apply(convert)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 915, in _apply
module._apply(fn)
[Previous line repeated 3 more times]
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 942, in _apply
param_applied = fn(param)
File "/opt/conda/envs/base_lbm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1341, in convert
return t.to(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 72.00 MiB. GPU 1 has a total capacity of 44.32 GiB of which 63.81 MiB is free. Process 1688744 has 44.24 GiB memory in use. Of the allocated memory 43.63 GiB is allocated by PyTorch, and 196.65 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
(base_lbm) root@tgys01-desktop:/data/lsc/IClight/qianw#
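A note on the traceback above: the OOM message says GPU 1 already holds 44.24 GiB from process 1688744, so the card was effectively full before the pipeline even loaded. A diagnostic sketch (the GPU index 0 below is an assumption, not from the original script):

```shell
# 1. See which processes occupy each GPU (skipped if nvidia-smi is absent):
command -v nvidia-smi >/dev/null && nvidia-smi

# 2. Restrict PyTorch to an idle GPU (index 0 here is hypothetical):
export CUDA_VISIBLE_DEVICES=0

# 3. As the error message itself suggests, enable expandable segments to
#    reduce allocator fragmentation:
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
```

If another process owns the card, no allocator setting will help; freeing that process or moving to an idle GPU is the actual fix.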


Try the code below. It automatically offloads part of the weights to system RAM — somewhat slower, but it runs on a 32 GB GPU. The key changes are enabling diffusers' automatic device map and removing the statement that forced the pipeline onto CUDA.

import torch
from diffusers import DiffusionPipeline

# Use bfloat16 on GPU, float32 on CPU
torch_dtype = torch.bfloat16 if torch.cuda.is_available() else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_name,
    torch_dtype=torch_dtype,
    device_map="balanced",  # let diffusers spread weights across devices
)

positive_magic = {
    "en": "Ultra HD, 4K, cinematic composition.",  # for English prompts
    "zh": "超清,4K,电影级构图",  # for Chinese prompts
}

Under ComfyUI it runs on a 24 GB 4090D.
