{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "IqM-T1RTzY6C" }, "source": [ "[Unsloth](https://github.com/unslothai/unsloth) is an efficient fine-tuning tool from Unsloth AI. It comes in a closed-source and an open-source edition; here we cover only the open-source one. Unsloth rewrites the model kernels, in particular reimplementing operators such as the loss and normalization layers in Triton, and hand-writes the backward pass, which makes training faster. It also fine-tunes with low-precision quantization combined with LoRA, which further reduces GPU memory usage.\n", "\n", "* Unsloth supports Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, Qwen, TinyLlama, Vicuna, Open Hermes, and more.\n", "* Unsloth supports 16-bit LoRA and 4-bit QLoRA. Both are 2x faster.\n", "* `max_seq_length` can be set to any value, because RoPE scaling is applied automatically using [kaiokendev's method](https://kaiokendev.github.io/til).\n", "* Unsloth runs Gemma-2 9b / 27b **2x faster**.\n", "* Automatic export to Ollama is supported.\n", "\n", "To install Unsloth on your own machine, follow the [installation instructions](https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions) on the GitHub page.\n", "\n", "What this notebook covers:\n", "1. Training on the [FineTome 100K](https://www.modelscope.cn/datasets/AI-ModelScope/FineTome-100k) dataset.\n", "2. Converting ShareGPT data to the standard format with `standardize_sharegpt`.\n", "3. Training only on the completion / assistant turns with `train_on_responses_only`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecutionIndicator": { "show": true }, "execution": { "iopub.execute_input": "2024-12-30T08:36:19.428819Z", "iopub.status.busy": "2024-12-30T08:36:19.428314Z", "iopub.status.idle": "2024-12-30T08:36:33.712505Z", "shell.execute_reply": "2024-12-30T08:36:33.711914Z", "shell.execute_reply.started": "2024-12-30T08:36:19.428798Z" }, "id": "2eSvM9zX_2d3", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking in indexes: https://mirrors.aliyun.com/pypi/simple\n", "Collecting git+https://github.com/tastelikefeet/unsloth.git@feat/modelscope\n", " Cloning https://github.com/tastelikefeet/unsloth.git (to revision feat/modelscope) to /tmp/pip-req-build-ylwo66va\n", " Running command git clone --filter=blob:none --quiet https://github.com/tastelikefeet/unsloth.git /tmp/pip-req-build-ylwo66va\n", " Running command git checkout -b feat/modelscope --track origin/feat/modelscope\n", " 切换到一个新分支 'feat/modelscope'\n", " 分支 'feat/modelscope' 
设置为跟踪来自 'origin' 的远程分支 'feat/modelscope'。\n", " Resolved https://github.com/tastelikefeet/unsloth.git to commit 7146bc2f759a9a2f300f4009b1d3ab671bb8dbb5\n", " Installing build dependencies ... \u001b[?25ldone\n", "\u001b[?25h Getting requirements to build wheel ... \u001b[?25ldone\n", "\u001b[?25h Preparing metadata (pyproject.toml) ... \u001b[?25ldone\n", "\u001b[?25h\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n", "\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.3.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.3.1\u001b[0m\n", "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n" ] } ], "source": [ "# !pip install unsloth\n", "# !pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git\n", "# At the time of writing, the ModelScope PR has not been merged into the main branch, so install unsloth with this command:\n", "!pip install git+https://github.com/tastelikefeet/unsloth.git@feat/modelscope" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we load the model. Note that the Unsloth team provides many models on the ModelScope community; if the model you need is not available on ModelScope, consider letting us know or uploading it yourself." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecutionIndicator": { "show": true }, "colab": { "base_uri": "https://localhost:8080/", "height": 301, "referenced_widgets": [ "70b0793d26c14afbbe8af319d38dbb07", "20e75852859845f697760003945b952f", "f0c7c3402e6549148ac293dcf01ae8da", "b1c8acb40b924d8f854408252c1e061c", "aa28da40d8074e89ad32b5cb5ad04fa6", "4a91a45aba634535a53a49a873b25dfe", "d773266178564b6cb739d756ff8de459", "cff4f20460a442839c26bfa468b35dee", "683dd21b9eef4bb4a0c9bc15c984703a", 
"f0377d014b2e4e89b9932750f04a6a3c", "0d384a63bb5f4b7384ac5f20d1a399b0", "4606a430f0214a0bb524a06ab78af1a9", "700b2fc6799c41f4984e07c36502f2c9", "be744decce12471388b689a085576c7f", "e8efa5951c4b4f3e9922e029c57a5ab3", "afa86659651d48a7b4a3043a0909322f", "9edc5f3740e54d7ab6fd38f270f4a40e", "2aea4e74b6c4449ea24812d11675daee", "252c3d25fe3341379e5fc31d30dee7bc", "672262ede93c413cbc219ca4fec5b98a", "ea0f4a924ce14b97a53ba2b045ba1b1d", "757a4328afef4293b6d31ce9b90eec79", "18c4cb423f8744afa2b85ee361dcc121", "76fb012b818f4bfb88c103262fb9c20e", "8158ce20191d40fba140ed2c2f4ed859", "595694711ab148e8bbf9824896c10468", "4b355eddebb548a7b60d28c12403ebfd", "0faf73ed502649c4855adca930e3c8ff", "b0670b92e79744c5ba11501bafdc101b", "3ea24021851a4341b3cb30d4ff0d495f", "7028f81557e54d0d8df4cc1a0f85b12c", "0b8cf23562ae4e428ab4f00979e7ce07", "b774631435d4450da64f0a2b40cbf3aa", "8578be7c3f4044d3bfe0eae94fe2c21a", "4f0aaa3ce8ca43fdaea4cd4f7b595218", "8611f4c6d12641dd960377346daba5d1", "3601b111b63f401cb5c75ba08ae74b61", "a5bde1dcf7f745faa140cb6728c415fc", "6d163ef8e07b4a1cba351ec44175efd5", "9a4ea589183346dab240125cd38ba0f9", "d8f739452aa5410aa5eba87b2760c2af", "dc098c4a448243ddb82ab769dc261171", "d8c3cd1d08b747b690a59b7793aaf742", "95a6cb0e04984f3c96bbe9ee9c5fa287", "254473850de1471e9584a9c03962bd6f", "56e59a69f84e487f91e263313f5d2b72", "524b13439d53402eb35388606058c288", "fc7d250163bb470fa19503356dfe9e81", "9dc2697fe6984649b0ac36018f6904c9", "3ebff50e3c0b4cb4b99c93adde35b712", "35aee367658940b9b2f6884ea87b3e4b", "f7f53b3a53eb4b8e9fa47cd233f4df7f", "73efb213f4a84996a353495b912393d9", "8b09015cc22240cc8e736f7050c0f37c", "c7907c51d16a4aeb87974c36b3b9e7c8" ] }, "execution": { "iopub.execute_input": "2024-12-30T09:01:40.310562Z", "iopub.status.busy": "2024-12-30T09:01:40.310304Z", "iopub.status.idle": "2024-12-30T09:01:48.648813Z", "shell.execute_reply": "2024-12-30T09:01:48.648168Z", "shell.execute_reply.started": "2024-12-30T09:01:40.310540Z" }, "id": "QmUBVEnvCDJv", "outputId": 
"cb455c4b-2327-4a15-9078-de2fef37d9bd", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n", " from .autonotebook import tqdm as notebook_tqdm\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Unsloth: Your Flash Attention 2 installation seems to be broken?\n", "A possible explanation is you have a new CUDA version which isn't\n", "yet compatible with FA2? Please file a ticket to Unsloth or FA2.\n", "We shall now use Xformers instead, which does not have any performance hits!\n", "We found this negligible impact by benchmarking on 1x A100.\n", "🦥 Unsloth Zoo will now patch everything to make training faster!\n", "[2024-12-30 17:01:45,179] [INFO] [real_accelerator.py:219:get_accelerator] Setting ds_accelerator to cuda (auto detect)\n", "use modelscope\n", "Downloading Model to directory: /mnt/workspace/.cache/modelscope/hub/unsloth/llama-3.2-3b-instruct-bnb-4bit\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2024-12-30 17:01:46,115 - modelscope - INFO - Target directory already exists, skipping creation.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "==((====))== Unsloth 2024.12.9: Fast Llama patching. Transformers: 4.47.1.\n", " \\\\ /| GPU: NVIDIA A100-SXM4-80GB. Max memory: 79.347 GB. Platform: Linux.\n", "O^O/ \\_/ \\ Torch: 2.4.0+cu121. CUDA: 8.0. CUDA Toolkit: 12.1. Triton: 3.0.0\n", "\\ / Bfloat16 = TRUE. FA [Xformers = 0.0.27.post2. 
FA2 = False]\n", " \"-____-\" Free Apache license: http://github.com/unslothai/unsloth\n", "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n" ] } ], "source": [ "import os\n", "# For users in mainland China, ModelScope offers fast model and dataset downloads; just set one environment variable:\n", "os.environ['UNSLOTH_USE_MODELSCOPE'] = 'true'\n", "from unsloth import FastLanguageModel\n", "import torch\n", "max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!\n", "dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+\n", "load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.\n", "\n", "# 4bit pre quantized models we support for 4x faster downloading + no OOMs.\n", "fourbit_models = [\n", " \"unsloth/Meta-Llama-3.1-8B-bnb-4bit\", # Llama-3.1 2x faster\n", " \"unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit\",\n", " \"unsloth/Meta-Llama-3.1-70B-bnb-4bit\",\n", " \"unsloth/Meta-Llama-3.1-405B-bnb-4bit\", # 4bit for 405b!\n", " \"unsloth/Mistral-Small-Instruct-2409\", # Mistral 22b 2x faster!\n", " \"unsloth/mistral-7b-instruct-v0.3-bnb-4bit\",\n", " \"unsloth/Phi-3.5-mini-instruct\", # Phi-3.5 2x faster!\n", " \"unsloth/Phi-3-medium-4k-instruct\",\n", " \"unsloth/gemma-2-9b-bnb-4bit\",\n", " \"unsloth/gemma-2-27b-bnb-4bit\", # Gemma 2x faster!\n", "\n", " \"unsloth/Llama-3.2-1B-bnb-4bit\", # NEW! Llama 3.2 models\n", " \"unsloth/Llama-3.2-1B-Instruct-bnb-4bit\",\n", " \"unsloth/Llama-3.2-3B-bnb-4bit\",\n", " \"unsloth/Llama-3.2-3B-Instruct-bnb-4bit\",\n", "\n", " \"unsloth/Llama-3.3-70B-Instruct-bnb-4bit\" # NEW! 
Llama 3.3 70B!\n", "]\n", "\n", "model, tokenizer = FastLanguageModel.from_pretrained(\n", " model_name = \"unsloth/Llama-3.2-3B-Instruct\", # or choose \"unsloth/Llama-3.2-1B-Instruct\"\n", " max_seq_length = max_seq_length,\n", " dtype = dtype,\n", " load_in_4bit = load_in_4bit,\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "SXd9bTZd1aaL" }, "source": [ "Now add the LoRA adapters." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "execution": { "iopub.execute_input": "2024-12-30T09:02:18.475150Z", "iopub.status.busy": "2024-12-30T09:02:18.474539Z", "iopub.status.idle": "2024-12-30T09:02:22.128940Z", "shell.execute_reply": "2024-12-30T09:02:22.128310Z", "shell.execute_reply.started": "2024-12-30T09:02:18.475119Z" }, "id": "6bZsfBuZDeCL", "outputId": "acc0f9f5-59a6-46fe-d5bb-cd09965bb8c9", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Unsloth 2024.12.9 patched 28 layers with 28 QKV layers, 28 O layers and 28 MLP layers.\n" ] } ], "source": [ "model = FastLanguageModel.get_peft_model(\n", " model,\n", " r = 16, # Choose any number > 0! Suggested: 8, 16, 32, 64, 128\n", " target_modules = [\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n", " \"gate_proj\", \"up_proj\", \"down_proj\",],\n", " lora_alpha = 16,\n", " lora_dropout = 0, # Supports any value, but = 0 is optimized\n", " bias = \"none\", # Supports any value, but = \"none\" is optimized\n", " use_gradient_checkpointing = \"unsloth\", # True or \"unsloth\" for very long context\n", " random_state = 3407,\n", " use_rslora = False, # We support rank-stabilized LoRA\n", " loftq_config = None, # And LoftQ\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "vITh0KVJ10qX" }, "source": [ "\n", "### Data preparation\n", "We now use the `Llama-3.1` format for conversation-style fine-tuning. We use the [FineTome-100k](https://www.modelscope.cn/datasets/AI-ModelScope/FineTome-100k) dataset, which is in ShareGPT style, and convert it from `(\"from\", \"value\")` pairs to the multi-turn `(\"role\", \"content\")` format. Llama-3 renders multi-turn conversations as follows:\n", "\n", "```\n", "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n", 
"\n", "Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", "\n", "Hey there! How are you?<|eot_id|><|start_header_id|>user<|end_header_id|>\n", "\n", "I'm great thanks!<|eot_id|>\n", "```\n", "\n", "我们使用 `get_chat_template` 函数来获取正确的聊天模板。我们支持 `zephyr, chatml, mistral, llama, alpaca, vicuna, vicuna_old, phi3, llama3` 等。" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecutionIndicator": { "show": true }, "colab": { "base_uri": "https://localhost:8080/", "height": 113, "referenced_widgets": [ "39bf1c29894f43acb6d2919e64a4fd28", "007a35a241b346ec9a5cdd6f3e4ddd27", "969a119573f942b29951ae2933e61cde", "b8c4d378ea0e4bcd9f572a191a7c136f", "7d37dd0e06724b53b4f31cc0a4321b0d", "4083b2ef8e6348e18b69d116508b46ff", "9555be409a2c4a97b18d4978ed13d35f", "5628ed38f304438faf5442b29a9511d6", "6e0fe945001140b3959e617a2f55c353", "0c30ded692064dc7bf36a93897f2b68f", "8c5ad85b4da14b239340ac95244d8ed4", "39684b70f2ff48cab454617c721f7777", "e8445e90b1054aacbecf198c7979a0b6", "d1cc50fb6d5849888af5d765dc51ab62", "2b359412d4914aa38a6e21284c12ecbc", "a4ceb6dbc8de4fa798ee39d28e5ebc40", "d6ab4d4143ff49bcae30be1bc2d76762", "904e7bac43bd4333b321cacfed5dcb60", "2bb75539976c49ed805c4ff6c58fb1d2", "45bc9d882a8f4a7e813245b1590d4427", "ddee625828cb4c22927aa73a02cd2dd9", "fd46f381983f49179de05497c171c805", "785d9147f4a341afafc5c5743892df16", "5e9825466cd2481b92cfe89f33b11fe3", "bfbb37b6f4b247b5bf5aaf7e1d80bcf9", "2a6ca29a76ff430d86213f910858db5b", "92d981a21b204f6c8b52e3caa16d1784", "c685f29a5d2c461ca3dda867bab6df50", "e2f16d56b21c4ff2918872d70e5ca847", "0bfbfe620ff446a0a47f7d5de7c88975", "5c9ee920068a47d89dbf5cbdd9e848a3", "95249b8fb6a84054a01f22c5f73f207b", "2ed2b017b9a24f36a4222c5c27753991" ] }, "execution": { "iopub.execute_input": "2024-12-30T09:03:34.106887Z", "iopub.status.busy": "2024-12-30T09:03:34.106690Z", "iopub.status.idle": "2024-12-30T09:03:50.893660Z", "shell.execute_reply": "2024-12-30T09:03:50.892954Z", "shell.execute_reply.started": 
"2024-12-30T09:03:34.106870Z" }, "id": "LjY75GoYUCB8", "outputId": "94095b01-dac6-4f9c-cbc3-ca78e007ba12", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Downloading [README.md]: 100%|██████████| 982/982 [00:00<00:00, 6.22MB/s]\n", "2024-12-30 17:03:36,084 - modelscope - INFO - storing https://www.modelscope.cn/api/v1/datasets/AI-ModelScope/FineTome-100k/repo?Source=SDK&Revision=master&FilePath=README.md&View=False in cache at /mnt/workspace/.cache/modelscope/hub/datasets/ad10352f0b6842676e6d5943f385c7bb2b07c1abd592c2aa260e9af638ecea41\n", "2024-12-30 17:03:36,091 - modelscope - INFO - creating metadata file for /mnt/workspace/.cache/modelscope/hub/datasets/ad10352f0b6842676e6d5943f385c7bb2b07c1abd592c2aa260e9af638ecea41\n", "Downloading data: 117MB [00:09, 11.8MB/s] \n", "2024-12-30 17:03:48,427 - modelscope - INFO - storing https://www.modelscope.cn/api/v1/datasets/AI-ModelScope/FineTome-100k/repo?Source=SDK&Revision=master&FilePath=data%2Ftrain-00000-of-00001.parquet in cache at /mnt/workspace/.cache/modelscope/hub/datasets/downloads/27484246eca7151cf2513ef6f4a6be4a7e7520bbcdd1b99b2cfd61c4164a7cd6\n", "2024-12-30 17:03:48,436 - modelscope - INFO - creating metadata file for /mnt/workspace/.cache/modelscope/hub/datasets/downloads/27484246eca7151cf2513ef6f4a6be4a7e7520bbcdd1b99b2cfd61c4164a7cd6\n", "Generating train split: 100%|██████████| 100000/100000 [00:01<00:00, 54443.18 examples/s]\n" ] } ], "source": [ "from unsloth.chat_templates import get_chat_template\n", "\n", "tokenizer = get_chat_template(\n", " tokenizer,\n", " chat_template = \"llama-3.1\",\n", ")\n", "\n", "def formatting_prompts_func(examples):\n", " convos = examples[\"conversations\"]\n", " texts = [tokenizer.apply_chat_template(convo, tokenize = False, add_generation_prompt = False) for convo in convos]\n", " return { \"text\" : texts, }\n", "pass\n", "\n", "from modelscope import MsDataset\n", "dataset = MsDataset.load(\"AI-ModelScope/FineTome-100k\", split 
= \"train\")" ] }, { "cell_type": "markdown", "metadata": { "id": "K9CBpiISFa6C" }, "source": [ "We now use `standardize_sharegpt` to convert the ShareGPT-style dataset into the generic conversations format. This changes the dataset from the following format:\n", "```\n", "{\"from\": \"system\", \"value\": \"You are an assistant\"}\n", "{\"from\": \"human\", \"value\": \"What is 2+2?\"}\n", "{\"from\": \"gpt\", \"value\": \"It's 4.\"}\n", "```\n", "to\n", "```\n", "{\"role\": \"system\", \"content\": \"You are an assistant\"}\n", "{\"role\": \"user\", \"content\": \"What is 2+2?\"}\n", "{\"role\": \"assistant\", \"content\": \"It's 4.\"}\n", "```" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 81, "referenced_widgets": [ "dd9e90f2c16541e8a72c6771c4685b9a", "a326b2e89f1c46f28cd166afc7490e2b", "eb855a0fcb554a8eb245351b3593623d", "bd71b6cb29e147ab9b10d1b85908c413", "b1b0a4e3f00043b0a0eb7a053815a4a5", "58ce4633471c438db6e103a1ca3806a0", "cf1b769b7a744b5f8bccf6798566582f", "1c0c2835705f41089de4caea98127c04", "e2d886444f0047fa9e2245b9773ced9e", "c03b9410af384397849ef63b62f2c689", "098bd8ace574423da763eb0eae1d3bb6", "d08e764aa8b94e7f9e1c727b53980abe", "e62f6eb58a744d38b837e47d8a16db67", "bcf8e36d938a4d959c31ea4ff3c8d4cf", "ae2464c1cbc442a383de7577d2986116", "9a8f1b8079fe478ebf0b16096cb224f5", "e4bf3f8e63bb4c01bbe821d438445d91", "d7e0024b98a94a9fa12dc4154ff2b2fc", "cc0bd79ca9e847fba88aafe2d612ffe4", "76e2e47c93e541ff820bcbab9264381d", "4b41aa65c6894e918b04709f8e9270d2", "cdae06929214464ea25e343f17b4a843" ] }, "execution": { "iopub.execute_input": "2024-12-30T09:04:23.348211Z", "iopub.status.busy": "2024-12-30T09:04:23.347241Z", "iopub.status.idle": "2024-12-30T09:04:38.154900Z", "shell.execute_reply": "2024-12-30T09:04:38.154285Z", "shell.execute_reply.started": "2024-12-30T09:04:23.348182Z" }, "id": "oPXzJZzHEgXe", "outputId": "dd1c72fa-39ea-48a2-9ed2-c263a4549b91", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Standardizing format: 
100%|██████████| 100000/100000 [00:04<00:00, 20475.55 examples/s]\n", "Map: 100%|██████████| 100000/100000 [00:09<00:00, 10242.12 examples/s]\n" ] } ], "source": [ "from unsloth.chat_templates import standardize_sharegpt\n", "dataset = standardize_sharegpt(dataset)\n", "dataset = dataset.map(formatting_prompts_func, batched = True,)" ] }, { "cell_type": "markdown", "metadata": { "id": "ndDUB23CGAC5" }, "source": [ "Let's look at the conversation structure of item 0:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "execution": { "iopub.execute_input": "2024-12-30T09:04:55.928686Z", "iopub.status.busy": "2024-12-30T09:04:55.928172Z", "iopub.status.idle": "2024-12-30T09:04:55.940707Z", "shell.execute_reply": "2024-12-30T09:04:55.940258Z", "shell.execute_reply.started": "2024-12-30T09:04:55.928662Z" }, "id": "gGFzmplrEy9I", "outputId": "9f3f66fc-8649-40c8-829c-db3f11f88728", "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[{'content': 'Explain what boolean operators are, what they do, and provide examples of how they can be used in programming. Additionally, describe the concept of operator precedence and provide examples of how it affects the evaluation of boolean expressions. Discuss the difference between short-circuit evaluation and normal evaluation in boolean expressions and demonstrate their usage in code. \\n\\nFurthermore, add the requirement that the code must be written in a language that does not support short-circuit evaluation natively, forcing the test taker to implement their own logic for short-circuit evaluation.\\n\\nFinally, delve into the concept of truthiness and falsiness in programming languages, explaining how it affects the evaluation of boolean expressions. 
Add the constraint that the test taker must write code that handles cases where truthiness and falsiness are implemented differently across different programming languages.',\n", " 'role': 'user'},\n", " {'content': 'Boolean operators are logical operators used in programming to manipulate boolean values. They operate on one or more boolean operands and return a boolean result. The three main boolean operators are \"AND\" (&&), \"OR\" (||), and \"NOT\" (!).\\n\\nThe \"AND\" operator returns true if both of its operands are true, and false otherwise. For example:\\n\\n```python\\nx = 5\\ny = 10\\nresult = (x > 0) and (y < 20) # This expression evaluates to True\\n```\\n\\nThe \"OR\" operator returns true if at least one of its operands is true, and false otherwise. For example:\\n\\n```python\\nx = 5\\ny = 10\\nresult = (x > 0) or (y < 20) # This expression evaluates to True\\n```\\n\\nThe \"NOT\" operator negates the boolean value of its operand. It returns true if the operand is false, and false if the operand is true. For example:\\n\\n```python\\nx = 5\\nresult = not (x > 10) # This expression evaluates to True\\n```\\n\\nOperator precedence refers to the order in which operators are evaluated in an expression. It ensures that expressions are evaluated correctly. In most programming languages, logical AND has higher precedence than logical OR. For example:\\n\\n```python\\nresult = True or False and False # This expression is evaluated as (True or (False and False)), which is True\\n```\\n\\nShort-circuit evaluation is a behavior where the second operand of a logical operator is not evaluated if the result can be determined based on the value of the first operand. In short-circuit evaluation, if the first operand of an \"AND\" operator is false, the second operand is not evaluated because the result will always be false. 
Similarly, if the first operand of an \"OR\" operator is true, the second operand is not evaluated because the result will always be true.\\n\\nIn programming languages that support short-circuit evaluation natively, you can use it to improve performance or avoid errors. For example:\\n\\n```python\\nif x != 0 and (y / x) > 10:\\n # Perform some operation\\n```\\n\\nIn languages without native short-circuit evaluation, you can implement your own logic to achieve the same behavior. Here\\'s an example in pseudocode:\\n\\n```\\nif x != 0 {\\n if (y / x) > 10 {\\n // Perform some operation\\n }\\n}\\n```\\n\\nTruthiness and falsiness refer to how non-boolean values are evaluated in boolean contexts. In many programming languages, non-zero numbers and non-empty strings are considered truthy, while zero, empty strings, and null/None values are considered falsy.\\n\\nWhen evaluating boolean expressions, truthiness and falsiness come into play. For example:\\n\\n```python\\nx = 5\\nresult = x # The value of x is truthy, so result is also truthy\\n```\\n\\nTo handle cases where truthiness and falsiness are implemented differently across programming languages, you can explicitly check the desired condition. 
For example:\\n\\n```python\\nx = 5\\nresult = bool(x) # Explicitly converting x to a boolean value\\n```\\n\\nThis ensures that the result is always a boolean value, regardless of the language\\'s truthiness and falsiness rules.',\n", " 'role': 'assistant'}]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dataset[0][\"conversations\"]" ] }, { "cell_type": "markdown", "metadata": { "id": "idAEIeSQ3xdS" }, "source": [ "\n", "### Train the model\n", "As an example, we train with Hugging Face TRL's `SFTTrainer`:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 67, "referenced_widgets": [ "3ffe42931dcf4a69972f4d50ee4dd3dd", "ee9dcec2d5c44fd883f16c06b9f76264", "982b6b94642d49fa85fab6ad621392fe", "42990f347a8c42f7b510e2d17c7d3c6e", "3cd95b7c5e2f4c6883333045db11c6d6", "5b34a4e8fc7747e78b49ad5bf67a6580", "23907906314743938db4e484c15480cc", "378176d2f0c9466d8762a584edf4217d", "e221482cbe95465191212d85d539938c", "74dc78a38e30465a96d2c8a22a27b127", "c6b4759ce826421081508270cb30334b" ] }, "execution": { "iopub.execute_input": "2024-12-30T09:05:04.680858Z", "iopub.status.busy": "2024-12-30T09:05:04.680307Z", "iopub.status.idle": "2024-12-30T09:06:09.429020Z", "shell.execute_reply": "2024-12-30T09:06:09.428350Z", "shell.execute_reply.started": "2024-12-30T09:05:04.680835Z" }, "id": "95_Nn-89DhsL", "outputId": "97211c96-b8e2-4b35-8691-892550ee0e7a", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Map (num_proc=2): 100%|██████████| 100000/100000 [01:04<00:00, 1551.82 examples/s]\n" ] } ], "source": [ "from trl import SFTTrainer\n", "from transformers import TrainingArguments, DataCollatorForSeq2Seq\n", "from unsloth import is_bfloat16_supported\n", "\n", "trainer = SFTTrainer(\n", " model = model,\n", " tokenizer = tokenizer,\n", " train_dataset = dataset,\n", " dataset_text_field = \"text\",\n", " max_seq_length = max_seq_length,\n", " data_collator = 
DataCollatorForSeq2Seq(tokenizer = tokenizer),\n", " dataset_num_proc = 2,\n", " packing = False, # Can make training 5x faster for short sequences.\n", " args = TrainingArguments(\n", " per_device_train_batch_size = 2,\n", " gradient_accumulation_steps = 4,\n", " warmup_steps = 5,\n", " # num_train_epochs = 1, # Set this for 1 full training run.\n", " max_steps = 60,\n", " learning_rate = 2e-4,\n", " fp16 = not is_bfloat16_supported(),\n", " bf16 = is_bfloat16_supported(),\n", " logging_steps = 1,\n", " optim = \"adamw_8bit\",\n", " weight_decay = 0.01,\n", " lr_scheduler_type = \"linear\",\n", " seed = 3407,\n", " output_dir = \"outputs\",\n", " report_to = \"none\", # Use this for WandB etc.\n", " ),\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "C_sGp5XlG6dq" }, "source": [ "Use Unsloth's `train_on_responses_only` method to train only on the assistant's outputs and ignore the loss on the user's inputs." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecutionIndicator": { "show": true }, "colab": { "base_uri": "https://localhost:8080/", "height": 49, "referenced_widgets": [ "6064feeea79040409e18a1e2a289b09a", "bb241a26ca4d4d7186ba46cda1f8a802", "c9abb42da1734388a7d2f1a06832ecc6", "7c3a37494e5848b9994b37a4c8bac132", "c668ae4c7d174f2dad3fb837ff873e57", "dd30f3ead6394317be5a72aa890adfb9", "1e4ea03959b3496f8e75cc3588cf347c", "d356b597dda14c7ab023403ee6959cf8", "870ff8f17c7b47ec8d49cac84216b04c", "d5cfa138483f4007b2a95be833043235", "6d52daf29c90402a9762acdde765713f" ] }, "execution": { "iopub.execute_input": "2024-12-30T09:07:56.880355Z", "iopub.status.busy": "2024-12-30T09:07:56.879474Z", "iopub.status.idle": "2024-12-30T09:08:30.611780Z", "shell.execute_reply": "2024-12-30T09:08:30.611262Z", "shell.execute_reply.started": "2024-12-30T09:07:56.880326Z" }, "id": "juQiExuBG5Bt", "outputId": "dca88e73-ac69-4199-9c83-cb6300e8ce9a", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Map: 100%|██████████| 100000/100000 [00:33<00:00, 2965.92 examples/s]\n" ] } ], "source": [ "from unsloth.chat_templates import train_on_responses_only\n", 
"trainer = train_on_responses_only(\n", " trainer,\n", " instruction_part = \"<|start_header_id|>user<|end_header_id|>\\n\\n\",\n", " response_part = \"<|start_header_id|>assistant<|end_header_id|>\\n\\n\",\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "Dv1NBUozV78l" }, "source": [ "验证掩码是否实际完成:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 159 }, "execution": { "iopub.execute_input": "2024-12-30T09:09:07.324860Z", "iopub.status.busy": "2024-12-30T09:09:07.324551Z", "iopub.status.idle": "2024-12-30T09:09:07.340782Z", "shell.execute_reply": "2024-12-30T09:09:07.340222Z", "shell.execute_reply.started": "2024-12-30T09:09:07.324839Z" }, "id": "LtsMVtlkUhja", "outputId": "84735ea5-8489-4a34-f501-afe91901d542", "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\nCutting Knowledge Date: December 2023\\nToday Date: 26 July 2024\\n\\n<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nExplain what boolean operators are, what they do, and provide examples of how they can be used in programming. Additionally, describe the concept of operator precedence and provide examples of how it affects the evaluation of boolean expressions. Discuss the difference between short-circuit evaluation and normal evaluation in boolean expressions and demonstrate their usage in code. \\n\\nFurthermore, add the requirement that the code must be written in a language that does not support short-circuit evaluation natively, forcing the test taker to implement their own logic for short-circuit evaluation.\\n\\nFinally, delve into the concept of truthiness and falsiness in programming languages, explaining how it affects the evaluation of boolean expressions. 
Add the constraint that the test taker must write code that handles cases where truthiness and falsiness are implemented differently across different programming languages.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\nBoolean operators are logical operators used in programming to manipulate boolean values. They operate on one or more boolean operands and return a boolean result. The three main boolean operators are \"AND\" (&&), \"OR\" (||), and \"NOT\" (!).\\n\\nThe \"AND\" operator returns true if both of its operands are true, and false otherwise. For example:\\n\\n```python\\nx = 5\\ny = 10\\nresult = (x > 0) and (y < 20) # This expression evaluates to True\\n```\\n\\nThe \"OR\" operator returns true if at least one of its operands is true, and false otherwise. For example:\\n\\n```python\\nx = 5\\ny = 10\\nresult = (x > 0) or (y < 20) # This expression evaluates to True\\n```\\n\\nThe \"NOT\" operator negates the boolean value of its operand. It returns true if the operand is false, and false if the operand is true. For example:\\n\\n```python\\nx = 5\\nresult = not (x > 10) # This expression evaluates to True\\n```\\n\\nOperator precedence refers to the order in which operators are evaluated in an expression. It ensures that expressions are evaluated correctly. In most programming languages, logical AND has higher precedence than logical OR. For example:\\n\\n```python\\nresult = True or False and False # This expression is evaluated as (True or (False and False)), which is True\\n```\\n\\nShort-circuit evaluation is a behavior where the second operand of a logical operator is not evaluated if the result can be determined based on the value of the first operand. In short-circuit evaluation, if the first operand of an \"AND\" operator is false, the second operand is not evaluated because the result will always be false. 
Similarly, if the first operand of an \"OR\" operator is true, the second operand is not evaluated because the result will always be true.\\n\\nIn programming languages that support short-circuit evaluation natively, you can use it to improve performance or avoid errors. For example:\\n\\n```python\\nif x!= 0 and (y / x) > 10:\\n # Perform some operation\\n```\\n\\nIn languages without native short-circuit evaluation, you can implement your own logic to achieve the same behavior. Here\\'s an example in pseudocode:\\n\\n```\\nif x!= 0 {\\n if (y / x) > 10 {\\n // Perform some operation\\n }\\n}\\n```\\n\\nTruthiness and falsiness refer to how non-boolean values are evaluated in boolean contexts. In many programming languages, non-zero numbers and non-empty strings are considered truthy, while zero, empty strings, and null/None values are considered falsy.\\n\\nWhen evaluating boolean expressions, truthiness and falsiness come into play. For example:\\n\\n```python\\nx = 5\\nresult = x # The value of x is truthy, so result is also truthy\\n```\\n\\nTo handle cases where truthiness and falsiness are implemented differently across programming languages, you can explicitly check the desired condition. 
For example:\\n\\n```python\\nx = 5\\nresult = bool(x) # Explicitly converting x to a boolean value\\n```\\n\\nThis ensures that the result is always a boolean value, regardless of the language\\'s truthiness and falsiness rules.<|eot_id|>'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tokenizer.decode(trainer.train_dataset[0][\"input_ids\"])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecutionIndicator": { "show": true }, "colab": { "base_uri": "https://localhost:8080/", "height": 106 }, "execution": { "iopub.execute_input": "2024-12-30T09:09:09.841937Z", "iopub.status.busy": "2024-12-30T09:09:09.841090Z", "iopub.status.idle": "2024-12-30T09:09:09.848999Z", "shell.execute_reply": "2024-12-30T09:09:09.848248Z", "shell.execute_reply.started": "2024-12-30T09:09:09.841909Z" }, "id": "_rD6fl8EUxnG", "outputId": "7b0d0ab4-06c3-4f2c-bb94-0ec853a4d0cc", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 
-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 271, 7035, 20197, 527, 20406, 20197, 1511, 304, 15840, 311, 37735, 2777, 2819, 13, 2435, 14816, 389, 832, 477, 810, 2777, 55610, 323, 471, 264, 2777, 1121, 13, 578, 2380, 1925, 2777, 20197, 527, 330, 4064, 1, 320, 7827, 705, 330, 878, 1, 320, 8651, 705, 323, 330, 14394, 1, 1533, 3677, 791, 330, 4064, 1, 5793, 4780, 837, 422, 2225, 315, 1202, 55610, 527, 837, 11, 323, 905, 6062, 13, 1789, 3187, 1473, 74694, 12958, 198, 87, 284, 220, 20, 198, 88, 284, 220, 605, 198, 1407, 284, 320, 87, 871, 220, 15, 8, 323, 320, 88, 366, 220, 508, 8, 220, 674, 1115, 7645, 67349, 311, 3082, 198, 14196, 19884, 791, 330, 878, 1, 5793, 4780, 837, 422, 520, 3325, 832, 315, 1202, 55610, 374, 837, 11, 323, 905, 6062, 13, 1789, 3187, 1473, 74694, 12958, 198, 87, 284, 220, 20, 198, 88, 284, 220, 605, 198, 1407, 284, 320, 87, 871, 220, 15, 8, 477, 320, 88, 366, 220, 508, 8, 220, 674, 1115, 7645, 67349, 311, 3082, 198, 14196, 19884, 791, 330, 14394, 1, 5793, 4277, 988, 279, 2777, 907, 315, 1202, 28312, 13, 1102, 4780, 837, 422, 279, 28312, 374, 905, 11, 323, 905, 422, 279, 28312, 374, 837, 13, 1789, 3187, 1473, 74694, 12958, 198, 87, 284, 220, 20, 198, 1407, 284, 539, 320, 87, 871, 220, 605, 8, 220, 674, 1115, 7645, 67349, 311, 3082, 198, 14196, 19884, 18968, 54156, 19813, 311, 279, 2015, 304, 902, 20197, 527, 26126, 304, 459, 7645, 13, 1102, 26420, 430, 24282, 527, 26126, 12722, 13, 763, 1455, 15840, 15823, 11, 20406, 3651, 706, 5190, 54156, 1109, 20406, 2794, 13, 1789, 3187, 1473, 74694, 12958, 198, 1407, 284, 3082, 477, 3641, 323, 3641, 220, 674, 1115, 7645, 374, 26126, 439, 320, 2575, 477, 320, 4139, 323, 3641, 5850, 902, 374, 3082, 198, 14196, 19884, 12755, 1824, 38368, 16865, 374, 264, 7865, 1405, 279, 2132, 28312, 315, 264, 20406, 5793, 374, 539, 26126, 422, 279, 1121, 649, 387, 11075, 3196, 389, 279, 
907, 315, 279, 1176, 28312, 13, 763, 2875, 1824, 38368, 16865, 11, 422, 279, 1176, 28312, 315, 459, 330, 4064, 1, 5793, 374, 905, 11, 279, 2132, 28312, 374, 539, 26126, 1606, 279, 1121, 690, 2744, 387, 905, 13, 35339, 11, 422, 279, 1176, 28312, 315, 459, 330, 878, 1, 5793, 374, 837, 11, 279, 2132, 28312, 374, 539, 26126, 1606, 279, 1121, 690, 2744, 387, 837, 382, 644, 15840, 15823, 430, 1862, 2875, 1824, 38368, 16865, 308, 8046, 11, 499, 649, 1005, 433, 311, 7417, 5178, 477, 5766, 6103, 13, 1789, 3187, 1473, 74694, 12958, 198, 333, 865, 976, 220, 15, 323, 320, 88, 611, 865, 8, 871, 220, 605, 512, 262, 674, 26050, 1063, 5784, 198, 14196, 19884, 644, 15823, 2085, 10068, 2875, 1824, 38368, 16865, 11, 499, 649, 4305, 701, 1866, 12496, 311, 11322, 279, 1890, 7865, 13, 5810, 596, 459, 3187, 304, 51743, 44788, 1473, 14196, 4077, 333, 865, 976, 220, 15, 341, 262, 422, 320, 88, 611, 865, 8, 871, 220, 605, 341, 286, 443, 26050, 1063, 5784, 198, 262, 457, 534, 14196, 19884, 25025, 1918, 323, 33032, 1918, 8464, 311, 1268, 2536, 12, 6245, 2819, 527, 26126, 304, 2777, 38697, 13, 763, 1690, 15840, 15823, 11, 2536, 38029, 5219, 323, 2536, 40533, 9246, 527, 6646, 8206, 88, 11, 1418, 7315, 11, 4384, 9246, 11, 323, 854, 14, 4155, 2819, 527, 6646, 33032, 88, 382, 4599, 38663, 2777, 24282, 11, 8206, 1918, 323, 33032, 1918, 2586, 1139, 1514, 13, 1789, 3187, 1473, 74694, 12958, 198, 87, 284, 220, 20, 198, 1407, 284, 865, 220, 674, 578, 907, 315, 865, 374, 8206, 88, 11, 779, 1121, 374, 1101, 8206, 88, 198, 14196, 19884, 1271, 3790, 5157, 1405, 8206, 1918, 323, 33032, 1918, 527, 11798, 22009, 4028, 15840, 15823, 11, 499, 649, 21650, 1817, 279, 12974, 3044, 13, 1789, 3187, 1473, 74694, 12958, 198, 87, 284, 220, 20, 198, 1407, 284, 1845, 2120, 8, 220, 674, 32430, 398, 34537, 865, 311, 264, 2777, 907, 198, 14196, 19884, 2028, 26420, 430, 279, 1121, 374, 2744, 264, 2777, 907, 11, 15851, 315, 279, 4221, 596, 8206, 1918, 323, 33032, 1918, 5718, 13, 128009]\n" ] } ], "source": [ "space = 
tokenizer(\" \", add_special_tokens = False).input_ids[0]\n", "tokenizer.decode([space if x == -100 else x for x in trainer.train_dataset[5][\"labels\"]])\n", "print(trainer.train_dataset[0][\"labels\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "3enWUM0jV-jV" }, "source": [ "We can see that the system and instruction prompts have been successfully masked." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "cellView": "form", "colab": { "base_uri": "https://localhost:8080/" }, "execution": { "iopub.execute_input": "2024-12-30T09:09:13.136253Z", "iopub.status.busy": "2024-12-30T09:09:13.135684Z", "iopub.status.idle": "2024-12-30T09:09:13.141100Z", "shell.execute_reply": "2024-12-30T09:09:13.140345Z", "shell.execute_reply.started": "2024-12-30T09:09:13.136229Z" }, "id": "2ejIt2xSNKKp", "outputId": "ac07343f-67db-44e4-f9d3-83539724e6af", "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "GPU = NVIDIA A100-SXM4-80GB. Max memory = 79.347 GB.\n", "2.635 GB of memory reserved.\n" ] } ], "source": [ "#@title Show current memory stats\n", "gpu_stats = torch.cuda.get_device_properties(0)\n", "start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n", "max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n", "print(f\"GPU = {gpu_stats.name}. 
Max memory = {max_memory} GB.\")\n", "print(f\"{start_gpu_memory} GB of memory reserved.\")" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "execution": { "iopub.execute_input": "2024-12-30T09:09:15.340607Z", "iopub.status.busy": "2024-12-30T09:09:15.340272Z", "iopub.status.idle": "2024-12-30T09:10:56.381821Z", "shell.execute_reply": "2024-12-30T09:10:56.381226Z", "shell.execute_reply.started": "2024-12-30T09:09:15.340583Z" }, "id": "yqxqAZ7KJ4oL", "outputId": "fb3dc2a2-5cd6-4aa0-dfc5-ad734359f397", "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "==((====))== Unsloth - 2x faster free finetuning | Num GPUs = 1\n", " \\\\ /| Num examples = 100,000 | Num Epochs = 1\n", "O^O/ \\_/ \\ Batch size per device = 2 | Gradient Accumulation steps = 4\n", "\\ / Total batch size = 8 | Total steps = 60\n", " \"-____-\" Number of trainable parameters = 24,313,856\n" ] }, { "data": { "text/html": [ "\n", "
| Step | Training Loss |
|---|---|
| 1 | 0.793500 |
| 2 | 0.839500 |
| 3 | 1.091000 |
| 4 | 0.904500 |
| 5 | 0.779700 |
| 6 | 0.924900 |
| 7 | 0.624100 |
| 8 | 1.003900 |
| 9 | 0.856700 |
| 10 | 0.752200 |
| 11 | 0.885600 |
| 12 | 1.091500 |
| 13 | 0.936600 |
| 14 | 0.640300 |
| 15 | 0.870000 |
| 16 | 0.632300 |
| 17 | 1.001600 |
| 18 | 0.829400 |
| 19 | 0.765800 |
| 20 | 0.923300 |
| 21 | 0.893800 |
| 22 | 0.849500 |
| 23 | 1.028500 |
| 24 | 0.871800 |
| 25 | 0.637500 |
| 26 | 0.829400 |
| 27 | 0.830200 |
| 28 | 0.783600 |
| 29 | 1.083400 |
| 30 | 1.030400 |
| 31 | 0.711100 |
| 32 | 0.540800 |
| 33 | 0.657700 |
| 34 | 0.574700 |
| 35 | 0.763000 |
| 36 | 0.995500 |
| 37 | 0.897600 |
| 38 | 0.713900 |
| 39 | 0.779500 |
| 40 | 0.998500 |
| 41 | 0.740600 |
| 42 | 0.998300 |
| 43 | 0.771500 |
| 44 | 0.810500 |
| 45 | 0.762400 |
| 46 | 0.862600 |
| 47 | 0.788200 |
| 48 | 0.647500 |
| 49 | 1.020300 |
| 50 | 1.028800 |
| 51 | 0.456000 |
| 52 | 0.908400 |
| 53 | 1.298500 |
| 54 | 0.690500 |
| 55 | 1.059200 |
| 56 | 1.115600 |
| 57 | 0.721900 |
| 58 | 0.828800 |
| 59 | 0.753700 |
| 60 | 0.915500 |
"
],
"text/plain": [
"