[AI Expert Training Camp, Phase 3] Training a Guochao-Style Model with DreamBooth and LoRA

This article comes from a featured project in the AI Studio community.

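Introduction to DreamBooth

DreamBooth (from the paper "DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation") is a method for "personalizing" a text-to-image diffusion model, that is, adapting it to a user's specific image-generation needs. Although DreamBooth was built on Imagen, the researchers note in the paper that the approach also applies to other diffusion models. Given only a few photos (usually 3 to 5) of a subject and its class name (such as "dog"), plus a unique identifier embedded in different text prompts, DreamBooth can place that subject "perfectly" in whatever scene the user wants to generate.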
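The prompt convention described above can be sketched in a few lines. This is only an illustration: `build_prompts` and its `template` argument are hypothetical names, not part of DreamBooth or ppdiffusers, and the identifier `sks` is borrowed from the example prompt that appears in the parameter list of this article.

```python
def build_prompts(identifier, class_name, template="a photo of {subject}"):
    """Return (instance_prompt, class_prompt) for a DreamBooth-style run.

    The instance prompt pairs a rare identifier with the class name;
    the class prompt names only the class and is used for prior images.
    """
    instance_prompt = template.format(subject=f"{identifier} {class_name}")
    class_prompt = template.format(subject=class_name)
    return instance_prompt, class_prompt


instance_prompt, class_prompt = build_prompts("sks", "dog")
print(instance_prompt)  # a photo of sks dog
print(class_prompt)     # a photo of dog
```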

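Introduction to LoRA

LoRA: Low-Rank Adaptation of Large Language Models is a technique introduced by Microsoft researchers, mainly to address the cost of fine-tuning large models. Powerful models with billions of parameters or more (for example GPT-3) are typically very expensive to fine-tune for downstream tasks. LoRA proposes freezing the pretrained weights and injecting trainable layers (rank-decomposition matrices) into each Transformer block. Because gradients are not needed for most of the model weights, the number of trainable parameters drops sharply and GPU memory requirements fall. The researchers found that by focusing on the Transformer attention blocks of a large model, fine-tuning with LoRA matches the quality of full fine-tuning while being faster and requiring less compute.

In short, LoRA adapts a pretrained model by adding a pair of rank-decomposition matrices to existing weights and training only those newly added weights. This has several advantages:

  • The pretrained weights stay unchanged, so the model is less prone to catastrophic forgetting;
  • The rank-decomposition matrices have far fewer parameters than the original model, so trained LoRA weights are easy to port;
  • The LoRA attention layers expose a scale parameter that controls how strongly the model adapts to the new training images.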
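The idea can be made concrete with a tiny pure-Python sketch. It is not the ppdiffusers implementation: the helper names are hypothetical, the layer sizes are toy values, and the initialisation (small random A, zero B) follows the LoRA paper. The adapted layer computes W x + scale * B (A x), where W is frozen and only A and B would be trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Return W x + scale * B (A x); W is frozen, A and B are the LoRA pair."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_in, d_out, rank):
    """Return (parameters in the full weight, parameters in the LoRA pair)."""
    return d_in * d_out, rank * (d_in + d_out)


random.seed(0)
d_in, d_out, rank = 8, 8, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen weight
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # rank x d_in
B = [[0.0] * rank for _ in range(d_out)]                                 # d_out x rank, zeros
x = [1.0] * d_in

# With B initialised to zero, the adapted layer starts out identical to the frozen one.
print(lora_forward(W, A, B, x) == matvec(W, x))  # True
# A 768x768 projection versus a rank-4 LoRA pair for the same projection.
print(lora_param_counts(768, 768, 4))  # (589824, 6144)
```

Setting `scale` to 0 switches the adaptation off entirely, and larger values weight the learned update more heavily.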

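1. Install dependencies

  • Run the cell below to install the dependencies. To make sure the installation takes effect, restart the kernel once it finishes. (Note: this only needs to be run once.)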
!python -m pip install -U paddlenlp ppdiffusers visualdl --user

2. Prepare the training images

  • Run the cell below to unzip the image resources. (Note: this only needs to be run once.)
!unzip data/data190562/国潮2.zip -d data/paints
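  • The image resources shown below have already been prepared in the data/paints folder.

  • The folder contains more images like these.

  • Image cleanup: remove the low-quality images.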
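The cleanup step above is a manual review. As a rough first pass, one could also drop images that are too small to fill the 512-pixel training resolution used later. The sketch below assumes Pillow is installed; `image_size`, `is_large_enough`, `select_images`, and the 512 threshold are illustrative choices, not part of the original project.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def image_size(path):
    """Return (width, height) of an image file; requires Pillow."""
    from PIL import Image  # imported lazily so the other helpers work without Pillow

    with Image.open(path) as img:
        return img.size


def is_large_enough(width, height, min_side=512):
    """Keep an image only if its shorter side reaches min_side pixels."""
    return min(width, height) >= min_side


def select_images(folder, min_side=512, size_fn=image_size):
    """List image files in folder whose shorter side is at least min_side."""
    kept = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        width, height = size_fn(path)
        if is_large_enough(width, height, min_side):
            kept.append(path)
    return kept
```

Calling `select_images("data/paints")` returns the candidate files; blurry or off-style images still need to be removed by eye.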

3. Start training

  • Download the training script
!wget https://raw.githubusercontent.com/PaddlePaddle/PaddleNLP/develop/ppdiffusers/examples/dreambooth/train_dreambooth_lora.py

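DreamBooth LoRA

Parameter reference:

Parameters you will mainly change

  • --pretrained_model_name_or_path: Name of the Stable Diffusion model weights to use, or the path to a locally downloaded model. Eight model weights are currently supported (the table listing them is not included in this article) and can be swapped in directly.
  • --instance_data_dir: Folder containing the instance (subject) images.
  • --instance_prompt: Prompt text describing the specific instance (subject), for example a photo of sks dog, where dog denotes the instance (subject) and sks is the unique identifier.
  • --class_data_dir: Folder containing the class images, used mainly as prior knowledge.
  • --class_prompt: Class prompt text. It must describe the same class as the instance (subject), for example a photo of dog. Used mainly as prior knowledge.
  • --num_class_images: Number of images to generate in advance from class_prompt, used mainly as prior knowledge.
  • --prior_loss_weight: Weight of the prior-preservation loss.
  • --sample_batch_size: Batch size used when generating the images for class_prompt. If GPU memory is limited, reduce this default to 1.
  • --with_prior_preservation: Whether to add the generated same-class images (prior knowledge) to training. Only when this is True do class_prompt, class_data_dir, num_class_images, sample_batch_size, and prior_loss_weight take effect.
  • --num_train_epochs: Number of training epochs. Default: 1.
  • --max_train_steps: Maximum number of training steps. When set, the required num_train_epochs is recomputed from it.
  • --checkpointing_steps: Save the model weights every this many global steps.
  • --gradient_accumulation_steps: Number of steps over which gradients are accumulated before each parameter update. Accumulation reduces gradient communication between GPUs and the number of updates, and effectively enlarges the training batch size.
  • --train_text_encoder: Whether to also train the text encoder. Default: False.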
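How --max_train_steps, --gradient_accumulation_steps, and --prior_loss_weight interact is easier to see with numbers. The helpers below are a minimal sketch with hypothetical names, assuming a single GPU and the usual DreamBooth formulation in which the prior-preservation loss is added to the instance loss with a weight. The example values (15 images, 8000 steps) are the ones reported in the training log of this article.

```python
import math


def total_batch_size(train_batch_size, num_processes, gradient_accumulation_steps):
    """Effective number of samples contributing to one optimizer update."""
    return train_batch_size * num_processes * gradient_accumulation_steps


def epochs_for_steps(max_train_steps, num_examples,
                     train_batch_size=1, gradient_accumulation_steps=1):
    """Number of epochs needed to reach max_train_steps on one device."""
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    return math.ceil(max_train_steps / updates_per_epoch)


def combined_loss(instance_loss, prior_loss, prior_loss_weight=1.0):
    """Instance loss plus the weighted prior-preservation loss."""
    return instance_loss + prior_loss_weight * prior_loss


# 15 training images, batch size 1, no accumulation, 8000 steps.
print(epochs_for_steps(8000, 15))  # 534
print(total_batch_size(1, 1, 1))   # 1
```

Both printed values agree with the "Num Epochs = 534" and "Total train batch size = 1" lines in the log.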

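Parameters you may change

  • --height: Height of the images fed to the model. Because user images are not a fixed size, the code resizes each original image to this height. Default: None.
  • --width: Width of the images fed to the model. Because user images are not a fixed size, the code resizes each original image to this width. Default: None.
  • --resolution: Resolution of the images fed to the model. Used when height and width are None. Default: 512.
  • --learning_rate: Learning rate.
  • --scale_lr: Whether to scale the learning rate by the number of GPUs, the gradient accumulation steps, and the batch size. Formula: learning_rate * gradient_accumulation_steps * train_batch_size * num_processes.
  • --lr_scheduler: Learning-rate schedule to use. Default: constant.
  • --lr_warmup_steps: Number of steps for the linear warmup from 0 to learning_rate.
  • --train_batch_size: Batch size per GPU during training. Set it lower when GPU memory is limited.
  • --center_crop: Whether to center-crop the image before resizing its width and height. Default: False.
  • --random_flip: Whether to randomly flip images horizontally. Default: False.
  • --gradient_checkpointing: Whether to enable gradient checkpointing, which saves some GPU memory at the cost of slower training.
  • --output_dir: Directory where the trained model is saved. Default: the dreambooth-model folder. Change the output path for each model you train so that earlier models are not overwritten.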
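The --scale_lr formula and the linear warmup described for --lr_warmup_steps can be written out directly. This is a sketch with hypothetical helper names, not code from the training script; whether a given --lr_scheduler actually applies the warmup depends on the scheduler chosen.

```python
def scaled_learning_rate(learning_rate, gradient_accumulation_steps,
                         train_batch_size, num_processes):
    """Apply the --scale_lr formula quoted in the parameter list."""
    return learning_rate * gradient_accumulation_steps * train_batch_size * num_processes


def warmup_learning_rate(step, learning_rate, lr_warmup_steps):
    """Ramp linearly from 0 to learning_rate, then stay constant."""
    if lr_warmup_steps > 0 and step < lr_warmup_steps:
        return learning_rate * step / lr_warmup_steps
    return learning_rate


# Settings of the run in this article: one GPU, batch size 1, no accumulation.
print(scaled_learning_rate(1e-4, 1, 1, 1))  # 0.0001
# With lr_warmup_steps=0 the rate is constant from the first step.
print(warmup_learning_rate(0, 1e-4, 0))     # 0.0001
```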

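Parameters that rarely need changing

  • --seed: Random seed for reproducing training results. Tip: even with this seed set, Paddle currently cannot reproduce results exactly.
  • --adam_beta1: beta1 hyperparameter of the AdamW optimizer. Default: 0.9.
  • --adam_beta2: beta2 hyperparameter of the AdamW optimizer. Default: 0.999.
  • --adam_weight_decay: weight_decay hyperparameter of the AdamW optimizer. Default: 0.01, the value shown in the logged configuration below.
  • --adam_epsilon: epsilon hyperparameter of the AdamW optimizer. Default: 1e-8.
  • --max_grad_norm: Maximum gradient norm (for gradient clipping). A value of -1 disables clipping. The logged configuration below shows 1.0 for this run.
  • --logging_dir: Directory for TensorBoard or VisualDL logs. Note: this path is joined with the output directory, so the final log path is <output_dir>/<logging_dir>.
  • --report_to: Logging tool, one of ["tensorboard", "visualdl"]. Default: visualdl. To use tensorboard, install it with pip install tensorboardX.
  • --push_to_hub: Whether to upload the model to the Hugging Face Hub. Default: False.
  • --hub_token: Token used for uploading to the Hugging Face Hub. Not needed if you are already logged in.
  • --hub_model_id: Repository name on the Hugging Face Hub. If None, the name of output_dir is used as the repository name.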
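--max_grad_norm refers to gradient clipping by norm. As a minimal pure-Python sketch (not the Paddle implementation; `clip_by_global_norm` is a hypothetical helper), clipping rescales the whole gradient vector when its L2 norm exceeds the limit, and a non-positive limit such as -1 is treated as "disabled":

```python
import math


def clip_by_global_norm(grads, max_grad_norm):
    """Rescale grads so their L2 norm does not exceed max_grad_norm."""
    if max_grad_norm <= 0:
        return list(grads)  # clipping disabled, e.g. max_grad_norm = -1
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_grad_norm:
        return list(grads)  # already within the limit
    factor = max_grad_norm / norm
    return [g * factor for g in grads]


grads = [3.0, 4.0]  # L2 norm is 5.0
clipped = clip_by_global_norm(grads, 1.0)
print(math.isclose(math.sqrt(sum(g * g for g in clipped)), 1.0))  # True
print(clip_by_global_norm(grads, -1))  # [3.0, 4.0]
```

The direction of the gradient is preserved; only its length is capped.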
!python train_dreambooth_lora.py \
  --pretrained_model_name_or_path="Linaqruf/anything-v3.0" \
  --instance_data_dir="./data/paints" \
  --output_dir="./dream_booth_lora_outputs" \
  --instance_prompt="<Guochao>" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=1000 \
  --learning_rate=1e-4 \
  --report_to="visualdl" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=5000 \
  --lora_rank=256 \
  --validation_prompt="pretty girl,<Guochao>" \
  --validation_epochs=500 \
  --seed=0
[2023-03-21 11:56:42,999] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:56:43,000] [    INFO] - We are using <class 'paddlenlp.transformers.clip.tokenizer.CLIPTokenizer'> to load 'Linaqruf/anything-v3.0/tokenizer'.
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/vocab.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/merges.txt
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/added_tokens.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/special_tokens_map.json
[2023-03-21 11:56:43,001] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:56:43,247] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:43,248] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:43,249] [    INFO] - Model config PretrainedConfig {"architectures": ["CLIPTextModel"],"initializer_factor": 1.0,"initializer_range": 0.02,"max_text_length": 77,"paddlenlp_version": null,"projection_dim": 768,"text_embed_dim": 768,"text_heads": 12,"text_hidden_act": "quick_gelu","text_layers": 12,"vocab_size": 49408
}
[2023-03-21 11:56:43,249] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/scheduler/scheduler_config.json
W0321 11:56:43.251735  2942 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0321 11:56:43.255620  2942 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2023-03-21 11:56:44,836] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:44,836] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:56:44,837] [    INFO] - Model config CLIPTextConfig {"architectures": ["CLIPTextModel"],"attention_dropout": 0.0,"bos_token_id": 0,"dropout": 0.0,"eos_token_id": 2,"hidden_act": "quick_gelu","hidden_size": 768,"initializer_factor": 1.0,"initializer_range": 0.02,"intermediate_size": 3072,"layer_norm_eps": 1e-05,"max_position_embeddings": 77,"model_type": "clip_text_model","num_attention_heads": 12,"num_hidden_layers": 12,"pad_token_id": 1,"paddlenlp_version": null,"projection_dim": 768,"return_dict": true,"vocab_size": 49408
}
[2023-03-21 11:56:44,908] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_state.pdparams
[2023-03-21 11:56:46,375] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-21 11:56:46,375] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at Linaqruf/anything-v3.0/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-21 11:56:46,376] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/model_state.pdparams
[2023-03-21 11:56:46,376] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/config.json
[2023-03-21 11:56:47,056] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/unet/model_state.pdparams
[2023-03-21 11:56:47,057] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/unet/config.json
[2023-03-21 11:56:56,555] [    INFO] - -----------  Configuration Arguments -----------
[2023-03-21 11:56:56,555] [    INFO] - adam_beta1: 0.9
[2023-03-21 11:56:56,555] [    INFO] - adam_beta2: 0.999
[2023-03-21 11:56:56,555] [    INFO] - adam_epsilon: 1e-08
[2023-03-21 11:56:56,555] [    INFO] - adam_weight_decay: 0.01
[2023-03-21 11:56:56,555] [    INFO] - center_crop: False
[2023-03-21 11:56:56,555] [    INFO] - checkpointing_steps: 1000
[2023-03-21 11:56:56,555] [    INFO] - class_data_dir: None
[2023-03-21 11:56:56,555] [    INFO] - class_prompt: None
[2023-03-21 11:56:56,555] [    INFO] - dataloader_num_workers: 0
[2023-03-21 11:56:56,555] [    INFO] - gradient_accumulation_steps: 1
[2023-03-21 11:56:56,555] [    INFO] - gradient_checkpointing: False
[2023-03-21 11:56:56,555] [    INFO] - height: 512
[2023-03-21 11:56:56,556] [    INFO] - hub_model_id: None
[2023-03-21 11:56:56,556] [    INFO] - hub_token: None
[2023-03-21 11:56:56,556] [    INFO] - instance_data_dir: ./paints
[2023-03-21 11:56:56,556] [    INFO] - instance_prompt: <Guochao>
[2023-03-21 11:56:56,556] [    INFO] - learning_rate: 0.0001
[2023-03-21 11:56:56,556] [    INFO] - logging_dir: ./dream_booth_lora_outputs/logs
[2023-03-21 11:56:56,556] [    INFO] - lora_rank: 256
[2023-03-21 11:56:56,556] [    INFO] - lr_num_cycles: 1
[2023-03-21 11:56:56,556] [    INFO] - lr_power: 1.0
[2023-03-21 11:56:56,556] [    INFO] - lr_scheduler: constant
[2023-03-21 11:56:56,556] [    INFO] - lr_warmup_steps: 0
[2023-03-21 11:56:56,556] [    INFO] - max_grad_norm: 1.0
[2023-03-21 11:56:56,556] [    INFO] - max_train_steps: 8000
[2023-03-21 11:56:56,556] [    INFO] - num_class_images: 100
[2023-03-21 11:56:56,556] [    INFO] - num_train_epochs: 534
[2023-03-21 11:56:56,556] [    INFO] - num_validation_images: 4
[2023-03-21 11:56:56,556] [    INFO] - output_dir: ./dream_booth_lora_outputs
[2023-03-21 11:56:56,556] [    INFO] - pretrained_model_name_or_path: Linaqruf/anything-v3.0
[2023-03-21 11:56:56,556] [    INFO] - prior_loss_weight: 1.0
[2023-03-21 11:56:56,557] [    INFO] - push_to_hub: False
[2023-03-21 11:56:56,557] [    INFO] - random_flip: False
[2023-03-21 11:56:56,557] [    INFO] - report_to: visualdl
[2023-03-21 11:56:56,557] [    INFO] - resolution: 512
[2023-03-21 11:56:56,557] [    INFO] - sample_batch_size: 4
[2023-03-21 11:56:56,557] [    INFO] - scale_lr: False
[2023-03-21 11:56:56,557] [    INFO] - seed: 0
[2023-03-21 11:56:56,557] [    INFO] - tokenizer_name: None
[2023-03-21 11:56:56,557] [    INFO] - train_batch_size: 1
[2023-03-21 11:56:56,557] [    INFO] - validation_epochs: 500
[2023-03-21 11:56:56,557] [    INFO] - validation_prompt: pretty girl,<Guochao>
[2023-03-21 11:56:56,557] [    INFO] - width: 512
[2023-03-21 11:56:56,557] [    INFO] - with_prior_preservation: False
[2023-03-21 11:56:56,557] [    INFO] - ------------------------------------------------
[2023-03-21 11:56:56,658] [    INFO] - ***** Running training *****
[2023-03-21 11:56:56,659] [    INFO] -   Num examples = 15
[2023-03-21 11:56:56,659] [    INFO] -   Num batches each epoch = 15
[2023-03-21 11:56:56,659] [    INFO] -   Num Epochs = 534
[2023-03-21 11:56:56,659] [    INFO] -   Instantaneous batch size per device = 1
[2023-03-21 11:56:56,659] [    INFO] -   Total train batch size (w. parallel, distributed & accumulation) = 1
[2023-03-21 11:56:56,659] [    INFO] -   Gradient Accumulation steps = 1
[2023-03-21 11:56:56,659] [    INFO] -   Total optimization steps = 8000
Train Steps:   0%| | 15/8000 [00:06<45:48,  2.91it/s, epoch=0000, lr=0.0001, ste
[2023-03-21 11:57:03,236] [    INFO] - Running validation... Generating 4 images with prompt: pretty girl,<Guochao>.
[2023-03-21 11:57:03,237] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/model_index.json
[2023-03-21 11:57:03,238] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/model_state.pdparams
[2023-03-21 11:57:03,238] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/vae/config.json
[2023-03-21 11:57:04,064] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:57:04,066] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_config.json
[2023-03-21 11:57:04,067] [    INFO] - Model config CLIPTextConfig {"architectures": ["CLIPTextModel"],"attention_dropout": 0.0,"bos_token_id": 0,"dropout": 0.0,"eos_token_id": 2,"hidden_act": "quick_gelu","hidden_size": 768,"initializer_factor": 1.0,"initializer_range": 0.02,"intermediate_size": 3072,"layer_norm_eps": 1e-05,"max_position_embeddings": 77,"model_type": "clip_text_model","num_attention_heads": 12,"num_hidden_layers": 12,"pad_token_id": 1,"paddlenlp_version": null,"projection_dim": 768,"return_dict": true,"vocab_size": 49408
}
[2023-03-21 11:57:04,130] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/text_encoder/model_state.pdparams
[2023-03-21 11:57:05,460] [    INFO] - All model checkpoint weights were used when initializing CLIPTextModel.
[2023-03-21 11:57:05,460] [    INFO] - All the weights of CLIPTextModel were initialized from the model checkpoint at Linaqruf/anything-v3.0/text_encoder.
If your task is similar to the task the model of the checkpoint was trained on, you can already use CLIPTextModel for predictions without further training.
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/vocab.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/merges.txt
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/added_tokens.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/special_tokens_map.json
[2023-03-21 11:57:05,462] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/tokenizer/tokenizer_config.json
[2023-03-21 11:57:05,524] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/scheduler/scheduler_config.json
[2023-03-21 11:57:05,527] [    INFO] - Found /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json
[2023-03-21 11:57:05,528] [    INFO] - loading configuration file https://bj.bcebos.com/paddlenlp/models/community/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json from cache at /home/aistudio/.paddlenlp/models/Linaqruf/anything-v3.0/feature_extractor/preprocessor_config.json
[2023-03-21 11:57:05,529] [    INFO] - size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'shortest_edge': 224}.
[2023-03-21 11:57:05,529] [    INFO] - crop_size should be a dictionary on of the following set of keys: ({'width', 'height'}, {'shortest_edge'}, {'shortest_edge', 'longest_edge'}), got 224. Converted to {'height': 224, 'width': 224}.
[2023-03-21 11:57:05,529] [    INFO] - Image processor CLIPFeatureExtractor {"crop_size": {"height": 224,"width": 224},"do_center_crop": true,"do_convert_rgb": true,"do_normalize": true,"do_rescale": true,"do_resize": true,"feature_extractor_type": "CLIPFeatureExtractor","image_mean": [0.48145466,0.4578275,0.40821073],"image_processor_type": "CLIPFeatureExtractor","image_std": [0.26862954,0.26130258,0.27577711],"resample": 3,"rescale_factor": 0.00392156862745098,"size": {"shortest_edge": 224}
}
You have disabled the safety checker for <class 'ppdiffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. PaddleNLP team, diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
Train Steps:  12%|▏| 1000/8000 [06:16<41:28,  2.81it/s, epoch=0066, lr=0.0001, s
[2023-03-21 12:03:13,837] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-1000
Train Steps:  25%|▎| 2000/8000 [12:17<35:40,  2.80it/s, epoch=0133, lr=0.0001, s
[2023-03-21 12:09:14,978] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-2000
Train Steps:  38%|▍| 3000/8000 [18:16<29:04,  2.87it/s, epoch=0199, lr=0.0001, s
[2023-03-21 12:15:13,741] [    INFO] - Saved lora weights to ./dream_booth_lora_outputs/checkpoint-3000
Train Steps: 49%|▍| 3954/8000 [24:09<23:38, 2.85it/s, epoch=0263, lr=0.0001, s

4. Launch VisualDL to inspect the training process

5. Load the trained weights for inference

  • Load the model
from ppdiffusers import StableDiffusionPipeline
from ppdiffusers import DPMSolverMultistepScheduler
import paddle
from IPython.display import clear_output

# Base model and the directory that holds the trained LoRA weights
pretrained_model_name_or_path = "Linaqruf/anything-v3.0"
unet_model_path = "./dream_booth_lora_outputs"

# Load the original model
pipe = StableDiffusionPipeline.from_pretrained(pretrained_model_name_or_path, safety_checker=None)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Add the adapter layers to the UNet model
pipe.unet.load_attn_procs(unet_model_path, from_hf_hub=False)
clear_output()
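The training log shows intermediate LoRA weights being saved to checkpoint-<step> folders under the output directory. To try one of them, a small standard-library helper such as the hypothetical `latest_checkpoint` below can pick the folder with the highest step. Whether `load_attn_procs` accepts such a folder directly is an assumption here, so check the files it contains before relying on it.

```python
import re
from pathlib import Path

CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subfolder with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for child in root.iterdir():
        match = CHECKPOINT_PATTERN.match(child.name)
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path


checkpoint = latest_checkpoint("./dream_booth_lora_outputs")
# Fall back to the output directory itself when no checkpoint folder exists.
unet_model_path = str(checkpoint) if checkpoint else "./dream_booth_lora_outputs"
print(unet_model_path)
```

Steps are compared as integers, so checkpoint-10000 ranks above checkpoint-9000 even though it sorts earlier as a string.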
  • Run inference with the model
from IPython.display import display

prompt = "pretty girl,full body,pretty face,Perfect face,clear face,fine details,<Guochao>"
negative_prompt = "lowres, error_face, bad_face, bad_anatomy, error_body, error_hair, error_arm, (error_hands, bad_hands, error_fingers, bad_fingers, missing_fingers) error_legs, bad_legs, multiple_legs, missing_legs, error_lighting, error_shadow, error_reflection, text, error, extra_digit, fewer_digits, cropped, worst_quality, low_quality, normal_quality, jpeg_artifacts, signature, watermark, username, blurry"
guidance_scale = 8
num_inference_steps = 50
height = 512
width = 512

# Generate one image with the LoRA-adapted pipeline and show it in the notebook
img = pipe(prompt, negative_prompt=negative_prompt, guidance_scale=guidance_scale, height=height, width=width, num_inference_steps=num_inference_steps).images[0]
display(img)
  0%|          | 0/50 [00:00<?, ?it/s]


References

  • https://github.com/huggingface/diffusers/tree/main/examples/dreambooth
  • https://github.com/CompVis/stable-diffusion
  • https://github.com/PaddlePaddle/PaddleNLP/tree/develop/ppdiffusers/examples/dreambooth
  • https://aistudio.baidu.com/aistudio/projectdetail/5481677?channelType=0&channel=0

This article is a repost of the original project.
