Finetuning scheduler
For some of our experiments, we leverage a gradual unfreezing finetuning scheduler, adapted from RiNALMo's code repository.
- To use a finetuning (FT) scheduler, we first have to create a schedule and save it as a `.yaml` file. An example schedule is shown below:

  ```yaml
  0:
    - adapter.*
  3:
    - backbone.encoder.encoder.ln.*
    - backbone.encoder.encoder.layer.32.*
  ```
  In this example, when the model is set up, all layers are frozen first. Then, before the 0-th epoch starts, all parameters in the `adapter` module are unfrozen, and they remain unfrozen (trainable) for the rest of the training run. Similarly, before the 3-rd epoch starts, the parameters in the `backbone.encoder.encoder.ln` module (i.e., the last layer norm module of the backbone's encoder) and in the `backbone.encoder.encoder.layer.32` module are unfrozen, and they remain unfrozen until training ends. We can add any other layer or module here if we want it to be unfrozen before the start of a specific epoch (see the sketch after this list).
- In order to use this schedule for finetuning, we can simply set it as the CLI argument `--trainer.callbacks.ft_schedule_path` when calling `mgen fit`. Following is an example of finetuning the AIDO.RNA-1.6B model for RNA secondary structure prediction, with a schedule named `layers_0_32.yaml`. (NOTE: Please refer to the corresponding experiment folder for details of this experiment):

  ```bash
  cd experiments/AIDO.RNA/rna_secondary_structure_prediction
  MGEN_DATA_DIR=~/mgen_data
  DATASET_NAME=bpRNA
  CKPT_SAVE_DIR=logs/rna_ss/${DATASET_NAME}
  mgen fit --config rna_ss_prediction.yaml \
      --data.path ${MGEN_DATA_DIR}/modelgenerator/datasets/rna_ss_data/ \
      --data.dataset ${DATASET_NAME} \
      --trainer.default_root_dir ${CKPT_SAVE_DIR} \
      --trainer.callbacks.ft_schedule_path ft_schedules/layers_0_32.yaml \
      --trainer.devices 0,1,2,3
  ```
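As a sketch of how such a schedule could be extended (referenced in the explanation above), the variant below keeps the two phases from the example and additionally unfreezes one more encoder layer before a later epoch. The epoch number `6` and the `backbone.encoder.encoder.layer.31.*` pattern are illustrative assumptions; any pattern added to a schedule must match the named parameters of the finetuning task being trained.

```yaml
0:
  - adapter.*
3:
  - backbone.encoder.encoder.ln.*
  - backbone.encoder.encoder.layer.32.*
# Hypothetical extra phase: unfreeze one more encoder layer before epoch 6.
6:
  - backbone.encoder.encoder.layer.31.*
```

Saving such a variant under its own name (e.g., `ft_schedules/my_schedule.yaml`, a hypothetical filename) and passing its path to `--trainer.callbacks.ft_schedule_path` would apply it in the same way as the example above.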