Quantizing is not enough when fine-tuning a model! Even in the lowest precisions, most of the memory is going to be taken by the optimizer state when training…