ORPO is a recent fine-tuning technique that merges the traditionally separate stages of instruction tuning and preference alignment into a single stage, significantly reducing training time as well as the required resources. Learn more about ORPO in this article from HuggingFace.
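For reference, the objective introduced in the ORPO paper combines the standard supervised fine-tuning loss with an odds-ratio term that favors the chosen response y_w over the rejected one y_l (λ is a weighting hyperparameter):

```latex
\mathcal{L}_{ORPO} = \mathbb{E}_{(x, y_w, y_l)}\left[\, \mathcal{L}_{SFT} + \lambda \cdot \mathcal{L}_{OR} \,\right],
\qquad
\mathcal{L}_{OR} = -\log \sigma\!\left( \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)} \right),
\qquad
\mathrm{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
```

Because the preference signal is folded directly into the fine-tuning loss, no separate reference model or second alignment stage is needed.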

In this article, we will fine-tune the Mistral 7B model using ORPO with an Axolotl Docker container. Check this other article to learn more about fine-tuning with Axolotl. We will provision the necessary compute resources (mainly a VM with a GPU) on Google Cloud using the SkyPilot library.

First, install SkyPilot with the GCP driver

$ pip install "skypilot-nightly[gcp]"

Then, configure your GCP project and log in to the console

$ gcloud init
$ gcloud auth application-default login

Note: make sure the Compute Engine API is enabled on your GCP project, as SkyPilot uses this API to provision resources on GCP.
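If the API is not yet enabled, it can be turned on from the command line (assuming the gcloud CLI has already been initialized for your project):

```shell
$ gcloud services enable compute.googleapis.com
```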

After that, we can verify that SkyPilot has proper access to GCP

$ sky check

🎉 Enabled clouds 🎉
  ✔ GCP

Now, we need to define the SkyPilot task, along with the resources it needs, in the following mistral-orpo.yaml configuration file

name: fine-tuning

resources:
  accelerators: L4:1
  cloud: gcp

workdir: configs

setup: |
  docker pull winglian/axolotl:main-py3.11-cu121-2.2.1

run: |
  docker run --gpus all \
    -v ~/sky_workdir:/sky_workdir \
    -v /root/.cache:/root/.cache \
    winglian/axolotl:main-py3.11-cu121-2.2.1 \
    huggingface-cli login --token ${HF_TOKEN} 

  docker run --gpus all \
    -v ~/sky_workdir:/sky_workdir \
    -v /root/.cache:/root/.cache \
    -e WANDB_API_KEY=${WANDB_API_KEY} \
    winglian/axolotl:main-py3.11-cu121-2.2.1 \
    accelerate launch -m axolotl.cli.train /sky_workdir/mistral-qlora-orpo.yml

envs:
  HF_TOKEN:  <HuggingFace Token to access gated models>
  WANDB_API_KEY: <optional WANDB API key for training tracking>

The following is a detailed explanation of the above configuration file:

  • name: the name of the SkyPilot task, here “fine-tuning”.
  • resources: specifies the resources required for the fine-tuning. In our case, a single L4 GPU, with GCP as the cloud provider.
  • workdir: specifies the local directory where the Axolotl configuration files are located; it is synced to the VM.
  • setup: contains the setup commands to execute before running the fine-tuning. In this case, it pulls the Axolotl Docker image. You may want to make sure you’re using the latest image; you can check the available tags here.
  • run: contains the commands that run the fine-tuning. In our case, we first log in to HuggingFace so we can download the Mistral model, then start the fine-tuning using the Axolotl Docker image.
  • envs: defines environment variables used in the configuration file: HF_TOKEN for the HuggingFace token (needed to access gated models) and WANDB_API_KEY for Weights & Biases training tracking (optional).

Next, we need to define the Axolotl configuration under the configs folder. We can use the example mistral-qlora-orpo.yml configuration file.
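As a rough sketch, such a configuration combines a QLoRA adapter with Axolotl’s ORPO support. The keys below are trimmed for illustration, and the dataset, rl, and orpo settings in particular should be checked against the up-to-date example in the Axolotl repository:

```yaml
base_model: mistralai/Mistral-7B-v0.1

# QLoRA: 4-bit base model with low-rank adapters
load_in_4bit: true
adapter: qlora
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

# ORPO-specific settings (verify names against the Axolotl example)
rl: orpo
orpo_alpha: 0.1
remove_unused_columns: false

datasets:
  - path: argilla/ultrafeedback-binarized-preferences-cleaned
    type: chat_template.argilla

sequence_len: 2048
micro_batch_size: 1
gradient_accumulation_steps: 4
num_epochs: 1
learning_rate: 0.00005
optimizer: adamw_bnb_8bit
output_dir: ./outputs/mistral-qlora-orpo
```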

Finally, we can start the fine-tuning process

$ sky launch -c finetune mistral-orpo.yaml

Task from YAML spec: mistral-orpo.yaml
I 05-10 16:30:28 optimizer.py:694] == Optimizer ==
I 05-10 16:30:28 optimizer.py:705] Target: minimizing cost
I 05-10 16:30:28 optimizer.py:717] Estimated cost: $0.7 / hour
I 05-10 16:30:28 optimizer.py:717] 
I 05-10 16:30:28 optimizer.py:842] Considered resources (1 node):
I 05-10 16:30:28 optimizer.py:912] --------------------------------------------------------------------------------------------
I 05-10 16:30:28 optimizer.py:912]  CLOUD   INSTANCE        vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE   COST ($)   CHOSEN   
I 05-10 16:30:28 optimizer.py:912] --------------------------------------------------------------------------------------------
I 05-10 16:30:28 optimizer.py:912]  GCP     g2-standard-4   4       16        L4:1           us-east4-a    0.70          ✔     
I 05-10 16:30:28 optimizer.py:912] --------------------------------------------------------------------------------------------
I 05-10 16:30:28 optimizer.py:912] 
Launching a new cluster 'finetune'. Proceed? [Y/n]: Y
I 05-10 16:30:31 cloud_vm_ray_backend.py:4378] Creating a new cluster: 'finetune' [1x GCP(g2-standard-4, {'L4': 1})].
I 05-10 16:30:31 cloud_vm_ray_backend.py:4378] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 05-10 16:30:31 cloud_vm_ray_backend.py:1372] To view detailed progress: tail -n100 -f /Users/dzlab/sky_logs/sky-2024-05-10-16-30-27-924742/provision.log
I 05-10 16:30:34 provisioner.py:76] Launching on GCP us-east4 (us-east4-a)
I 05-10 16:33:21 provisioner.py:458] Successfully provisioned or found existing instance.
I 05-10 16:34:10 provisioner.py:560] Successfully provisioned cluster: finetune
I 05-10 16:34:10 cloud_vm_ray_backend.py:3089] Syncing workdir (to 1 node): mistral -> ~/sky_workdir
I 05-10 16:34:10 cloud_vm_ray_backend.py:3097] To view detailed progress: tail -n100 -f ~/sky_logs/sky-2024-05-10-16-30-27-924742/workdir_sync.log
I 05-10 16:34:11 cloud_vm_ray_backend.py:3188] Running setup on 1 node.
main-py3.11-cu121-2.2.1: Pulling from winglian/axolotl
. . .
Status: Downloaded newer image for winglian/axolotl:main-py3.11-cu121-2.2.1
docker.io/winglian/axolotl:main-py3.11-cu121-2.2.1
I 05-10 16:38:12 cloud_vm_ray_backend.py:3201] Setup completed.
I 05-10 16:38:21 cloud_vm_ray_backend.py:3300] Job submitted with Job ID: 1
I 05-10 15:38:24 log_lib.py:408] Start streaming logs for job 1.
INFO: Tip: use Ctrl-C to exit log streaming (task will not be killed).
INFO: Waiting for task resources on 1 node. This will block if the cluster is full.
INFO: All task resources reserved.
INFO: Reserved IPs: ['10.150.0.4']
. . .
Shared connection to 35.245.48.114 closed.
I 05-10 22:45:10 cloud_vm_ray_backend.py:3335] Job ID: 1
I 05-10 22:45:10 cloud_vm_ray_backend.py:3335] To cancel the job:	sky cancel finetune 1
I 05-10 22:45:10 cloud_vm_ray_backend.py:3335] To stream job logs:	sky logs finetune 1
I 05-10 22:45:10 cloud_vm_ray_backend.py:3335] To view the job queue:	sky queue finetune
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] 
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] Cluster name: finetune
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] To log into the head VM:	ssh finetune
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] To submit a job:		sky exec finetune yaml_file
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] To stop the cluster:	sky stop finetune
I 05-10 22:45:10 cloud_vm_ray_backend.py:3431] To teardown the cluster:	sky down finetune

We can then list existing clusters with

$ sky status
Clusters
NAME                     LAUNCHED    RESOURCES                          STATUS  AUTOSTOP  COMMAND                        
finetune                 5 hrs ago   1x GCP(g2-standard-4, {'L4': 1})   UP      -         sky launch axolotl.yaml -c...  

We can list the status of all jobs with

$ sky queue finetune
Fetching and parsing job queue...

Job queue of cluster finetune
ID  NAME     SUBMITTED  STARTED    DURATION   RESOURCES  STATUS   LOG                                        
1   axolotl  5 hrs ago  5 hrs ago  5h 44m 2s  1x[L4:1]   RUNNING  ~/sky_logs/sky-2024-05-10-16-59-38-614417

We can log into the cluster’s head VM over plain SSH, as SkyPilot adds an entry for the cluster to the local SSH config

$ ssh finetune

To stream the logs of a job given its ID

$ sky logs finetune <ID>
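Once the job finishes, remember to stop or tear down the cluster so you are not billed for idle GPU time (these commands also appear in the launch output above):

```shell
$ sky stop finetune   # stop the VM but keep its disk, so it can be restarted
$ sky down finetune   # terminate the cluster entirely
```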