Implement LoRA in a Linear Layer
In this exercise, you will implement LoRA (Low-Rank Adaptation) in a linear (fully connected) layer of a neural network (paper: https://arxiv.org/pdf/2106.09685.pdf). LoRA introduces a pair of low-rank matrices that adapt the weights of a pre-trained model efficiently, allowing fine-tuning while keeping the pre-trained weights frozen. Your task is to modify the forward pass to include the LoRA adaptation. Follow these steps:
1. Note that two low-rank matrices A and B have already been initialized. Matrix A has dimensions (r, in_features) and B has dimensions (out_features, r), where r is the rank.
2. In forward(), apply dropout to the input tensor x.
3. Compute the LoRA update: multiply the dropout-applied input by the transpose of A, then multiply that result by the transpose of B.
4. Scale the LoRA update by a factor (lora_alpha / r).
5. Add the scaled LoRA update to the result of the original linear transformation. A reference sketch covering all five steps is shown after this list.
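Below is a minimal sketch of how these steps fit together in PyTorch. It is written as a self-contained module with hypothetical attribute names (linear, lora_A, lora_B, lora_dropout) and example hyperparameter defaults; the starter code for this exercise may name and initialize things differently.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Linear layer with a LoRA branch, following the five steps above."""

    def __init__(self, in_features, out_features, r=8, lora_alpha=16, dropout=0.0):
        super().__init__()
        # Frozen pre-trained projection; only the LoRA matrices are trained.
        self.linear = nn.Linear(in_features, out_features)
        for p in self.linear.parameters():
            p.requires_grad_(False)
        # Step 1: low-rank matrices A (r x in_features) and B (out_features x r).
        # A gets a small random init, B starts at zero so the update is zero at first.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.lora_dropout = nn.Dropout(dropout)
        self.scaling = lora_alpha / r  # step 4

    def forward(self, x):
        # Original (frozen) linear transformation.
        result = self.linear(x)
        # Step 2: apply dropout to the input for the LoRA branch.
        x = self.lora_dropout(x)
        # Step 3: x @ A^T -> (..., r), then @ B^T -> (..., out_features).
        lora_update = x @ self.lora_A.T @ self.lora_B.T
        # Steps 4-5: scale by lora_alpha / r and add to the frozen output.
        return result + self.scaling * lora_update


# Example usage with hypothetical sizes:
layer = LoRALinear(in_features=128, out_features=64, r=4, lora_alpha=8, dropout=0.1)
y = layer(torch.randn(2, 128))  # y has shape (2, 64)
```

Initializing B to zeros (and A randomly) follows the convention from the paper: the LoRA update is zero at the start of fine-tuning, so the adapted layer initially behaves exactly like the pre-trained one.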