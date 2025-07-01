Abstract and 1. Introduction





QDyLoRA offers an efficient and effective technique for LoRA-based fine-tuning of LLMs on downstream tasks. Its two main advantages are that it eliminates the need to fine-tune multiple models to find the optimal LoRA rank, and that it makes fine-tuning larger LLMs possible. The experimental results show that the optimal rank for QDyLoRA can be surprisingly low, yet it consistently outperforms QLoRA. QDyLoRA provides greater flexibility for deploying LLMs in various contexts and represents a promising step towards making the fine-tuning of large language models more accessible and efficient.
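The claim that a single fine-tuning run can serve a whole range of ranks may be easier to follow with a small sketch. The code below is an illustration only, not the authors' implementation: it assumes the DyLoRA-style nested truncation (keep the first `rank` rows of the down-projection and the first `rank` columns of the up-projection) and an `alpha / rank` scaling. The names `lora_delta` and `sample_rank`, and all numeric values, are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_delta(up, down, x, rank, alpha):
    """Low-rank update that uses only the first `rank` components.

    down: max_rank x d_in  (each row is one rank component)
    up:   d_out x max_rank (each column is one rank component)
    Assumption: the update is scaled by alpha / rank.
    """
    hidden = matvec(down[:rank], x)           # first `rank` rows of down
    up_truncated = [row[:rank] for row in up]  # first `rank` columns of up
    return [alpha / rank * v for v in matvec(up_truncated, hidden)]


def sample_rank(ranks, rng):
    """Pick the rank to train at this step (assumed uniform sampling)."""
    return rng.choice(ranks)


# Toy adapter with max_rank = 4, d_in = 3, d_out = 2 (made-up values).
down = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
up = [[1, 2, 3, 4], [0, 1, 0, 1]]
x = [1, 2, 3]

# The same weights evaluated at two different ranks, without retraining.
delta_rank2 = lora_delta(up, down, x, rank=2, alpha=2)
delta_rank4 = lora_delta(up, down, x, rank=4, alpha=4)

# During training, a rank would be drawn per step from the allowed range.
step_rank = sample_rank([1, 2, 3, 4], random.Random(0))
```

Because lower ranks are prefixes of higher ones, choosing a rank at inference time amounts to slicing the trained matrices rather than training a new adapter.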

Limitations

While 4-bit QDyLoRA exhibits notable performance, it falls short of the performance of full-precision fine-tuning. One possible solution is dynamic quantized DyLoRA (DyQDyLoRA), in which the quantization level can also vary during fine-tuning. In particular, the fine-tuning strategy could switch dynamically between quantization levels based on a predefined learning feedback signal. Additionally, further research is required to investigate the impact of LoRA's scaling factor and of the range of underlying ranks in QDyLoRA.





A.1 Hyperparameters

Table 4 provides an overview of the hyperparameters and experimental configurations employed in this study, which determine various aspects of the training process and model behavior. Key parameters common to all experiments include the choice of optimizer, the Adam-Beta2 value, the maximum gradient norm, and the warmup ratio, which together influence how the model adjusts its weights during training. LoRA-specific parameters such as the LoRA dropout probability, the maximum LoRA rank, and the alpha value control the behavior of the LoRA layers. Double quantization and the quantization type affect the precision of numerical representations within the model and are kept the same as in the baselines. Learning rate scheduling and weight decay contribute to the optimization process, helping to stabilize training and prevent overfitting. Random seeds support reproducibility, and the specified GPU identifies the hardware used for training. Each model configuration, whether for Web-GLM, GSM8k, or the experiment outlined in Table 1, uses parameters tailored to the characteristics of the dataset and the available computational resources.









A.2 Generated Text Quality

To illustrate the quality of the text generated by QDyLoRA, a sample query taken from GSM8k is fed to Falcon-40b fine-tuned with QDyLoRA. Below, we report the generated answers for different LoRA ranks.





Query: Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers’ market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers’ market?





Rank=64: Janet has 16 - 3 - 4 = «16-3-4=7»7 eggs left. She sells 7 - 16 = «7-16=9»9 eggs at the farmers’ market. She makes 9 * 2 =«9*2=18»18 at the farmers’ market every day.





Rank=32: She eats 3 eggs for breakfast and bakes 4 muffins per day, so she uses 3 + 4 = «3+4=7»7 eggs per day. She has 16 - 7 = «16- 7=9»9 eggs left over each day. She sells these eggs at the farmers’ market for $2 per egg, so she makes 2 * 9 = «2*9=18»18 per day at the market.





Rank=28: She eats 3 eggs for breakfast and bakes 4 muffins per day, so she uses 3+4 = «3+4=7»7 eggs per day. She sells 16-7 = «16-7=9»9 eggs per day for $2 per egg. She makes 9*2 = «9*2=18»18 per day at the market.





Rank=16: Janet’s ducks lay 16 eggs per day and she eats 3 for breakfast every morning so she has 16-3=13 eggs left. Janet bakes 4 muffins every day and she uses 4 eggs so she has 13-4=9 eggs left. Janet sells the remaining eggs at the farmers’ market for $2 each so she makes 9*2=«9*2=18»18 dollars a day. Janet makes 18 dollars a day at the farmers’ market.
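The generated answers above contain inline annotations of the form «expression=result», which appear to follow the calculator-annotation convention of GSM8k. As a minimal sketch (not part of the original evaluation), the snippet below parses those annotations and checks each one arithmetically. The helper name `check_annotations` is hypothetical, and only the four basic operators are supported. The reference answer for the query is (16 - 3 - 4) * 2 = 18.

```python
import ast
import operator
import re

# Only basic arithmetic is allowed; anything else raises ValueError.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Safely evaluate a parsed arithmetic expression tree."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def check_annotations(text):
    """Return (expression, claimed_value, is_correct) for each «expr=value»."""
    results = []
    for expr, claimed in re.findall(r"«([^=»]+)=([^»]+)»", text):
        actual = _evaluate(ast.parse(expr.strip(), mode="eval").body)
        claimed_value = float(claimed)
        results.append((expr.strip(), claimed_value,
                        abs(actual - claimed_value) < 1e-9))
    return results


rank64 = ("Janet has 16 - 3 - 4 = «16-3-4=7»7 eggs left. She sells 7 - 16 = "
          "«7-16=9»9 eggs at the farmers’ market. She makes 9 * 2 =«9*2=18»18 "
          "at the farmers’ market every day.")
rank32 = ("She eats 3 eggs for breakfast and bakes 4 muffins per day, so she "
          "uses 3 + 4 = «3+4=7»7 eggs per day. She has 16 - 7 = «16- 7=9»9 "
          "eggs left over each day. She sells these eggs at the farmers’ "
          "market for $2 per egg, so she makes 2 * 9 = «2*9=18»18 per day.")

rank64_checks = check_annotations(rank64)
rank32_checks = check_annotations(rank32)
reference_answer = (16 - 3 - 4) * 2
```

Run on the samples above, this check flags the first two intermediate steps of the Rank=64 answer as arithmetically inconsistent even though its final value matches the reference, while every step of the Rank=32 answer is consistent.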





