Fine-Tuning

This guide explains how to fine-tune models using LoRA (Low-Rank Adaptation) in FinLoRA.

Fine-Tuning Process

FinLoRA fine-tunes models with the Axolotl library, which is wrapped in a single script (finetune.py). The process consists of the steps outlined below.

Step-by-Step Fine-Tuning

Navigate to the LoRA directory and fetch deepspeed configs

First, navigate to the lora directory and fetch the DeepSpeed configs, which let the fine-tuning framework parallelize training across GPUs:
```
cd lora
axolotl fetch deepspeed_configs
```
Add your fine-tuning dataset

Add your fine-tuning dataset (e.g., your_dataset_train.jsonl) in the ../data/train/ folder.
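
Before starting a run, it can help to confirm that the file really is valid JSONL (one JSON object per line). The following is a minimal sketch; the helper name check_jsonl is hypothetical and not part of FinLoRA, and it deliberately does not check field names, because the expected record schema is not specified here.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return (number of valid records, list of problems) for a JSONL file."""
    problems = []
    count = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                problems.append(f"line {line_no}: blank line")
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {line_no}: invalid JSON ({exc.msg})")
                continue
            # Each record is expected to be a JSON object, not a list or scalar.
            if not isinstance(record, dict):
                problems.append(f"line {line_no}: expected a JSON object")
                continue
            count += 1
    return count, problems
```

Calling check_jsonl("../data/train/your_dataset_train.jsonl") from the lora directory returns the record count and any lines that need fixing.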

Configure your LoRA adapter

Open finetune_configs.json and add an entry for the LoRA adapter you want to create, with its hyperparameters defined. The file already contains examples you can use as a reference. A new entry looks like this:

"your_config_name": {
  "base_model": "meta-llama/Llama-3.1-8B-Instruct",
  "dataset_path": "../data/train/your_dataset_train.jsonl",
  "lora_r": 8,
  "quant_bits": 8,
  "learning_rate": 0.0001,
  "num_epochs": 1,
  "batch_size": 4,
  "gradient_accumulation_steps": 2
}
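
If you prefer to add entries from a script rather than by hand, the sketch below shows one way to do it. It assumes that finetune_configs.json is a single JSON object keyed by configuration name, as the examples in this guide suggest; add_config is a hypothetical helper, not part of FinLoRA.

```python
import json
from pathlib import Path


def add_config(config_file, name, settings):
    """Add one named entry to a JSON file of fine-tuning configurations."""
    path = Path(config_file)
    configs = json.loads(path.read_text(encoding="utf-8"))
    # Refuse to overwrite an existing entry silently.
    if name in configs:
        raise KeyError(f"configuration {name!r} already exists")
    configs[name] = settings
    path.write_text(json.dumps(configs, indent=2) + "\n", encoding="utf-8")
    return configs
```

Note that rewriting the file this way normalizes its indentation to two spaces.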

Run fine-tuning

Run fine-tuning with your configuration by executing the following command:
```
python finetune.py your_config_name
```
For example, to use the existing formula configuration:

python finetune.py formula_llama_3_1_8b_8bits_r8

Retrieve your adapter

After fine-tuning completes, the adapter is saved in the axolotl-output subfolder of the lora folder. Download the adapter files from this directory. Intermediate checkpoints can be removed to save space.
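
Checkpoint cleanup can be scripted. The sketch below assumes that checkpoints are stored as checkpoint-<step> subdirectories directly inside the folder you pass in, which is the usual Hugging Face Trainer layout; the exact layout of axolotl-output is not described here, so inspect it first. remove_checkpoints is a hypothetical helper and defaults to a dry run.

```python
import shutil
from pathlib import Path


def remove_checkpoints(output_dir, dry_run=True):
    """Delete `checkpoint-*` subdirectories of output_dir; return their names."""
    removed = []
    for child in sorted(Path(output_dir).iterdir()):
        # Only directories are treated as checkpoints; files are left alone.
        if child.is_dir() and child.name.startswith("checkpoint-"):
            removed.append(child.name)
            if not dry_run:
                shutil.rmtree(child)
    return removed
```

Run it once with dry_run=True to review the list, then again with dry_run=False to delete the directories.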

Note

If you don't have your own compute resources, you can rent four A5000 GPUs at low cost from RunPod.

Configuration File Structure

The configuration file (finetune_configs.json) contains settings for different fine-tuning runs. Each configuration includes:

base_model: The base model to fine-tune (e.g., "meta-llama/Llama-3.1-8B-Instruct")
dataset_path: Path to the training dataset
lora_r: Rank of the LoRA adapters
quant_bits: Quantization bits (4 or 8)
learning_rate: Learning rate for training
num_epochs: Number of training epochs
batch_size: Batch size for training
gradient_accumulation_steps: Number of gradient accumulation steps
Additional parameters for specific LoRA variants (e.g., peft_use_rslora, peft_use_dora)
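
The fields listed above can be checked before launching a run. The following sketch treats all eight listed fields as required and expects quant_bits to be 4 or 8, as stated above; how finetune.py itself validates entries (and how FP16 runs are configured) is not covered here, so treat validate_config as a hypothetical pre-flight check rather than FinLoRA's own logic.

```python
# Required fields and their expected JSON types, taken from the list above.
REQUIRED_FIELDS = {
    "base_model": str,
    "dataset_path": str,
    "lora_r": int,
    "quant_bits": int,
    "learning_rate": (int, float),
    "num_epochs": int,
    "batch_size": int,
    "gradient_accumulation_steps": int,
}


def validate_config(name, config):
    """Return a list of problems found in one configuration entry."""
    problems = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in config:
            problems.append(f"{name}: missing '{key}'")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{name}: '{key}' has type {type(value).__name__}")
        elif expected is not str and value <= 0:
            problems.append(f"{name}: '{key}' must be positive")
    if "quant_bits" in config and config["quant_bits"] not in (4, 8):
        problems.append(f"{name}: 'quant_bits' should be 4 or 8")
    return problems
```

An empty list means no problems were found; extra keys such as peft_use_dora are ignored.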

Example configurations for different LoRA methods:

Vanilla LoRA:

{
  "sentiment_llama_3_1_8b_8bits_r8": {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/finlora_sentiment_train.jsonl",
    "lora_r": 8,
    "quant_bits": 8,
    "learning_rate": 0.0001,
    "num_epochs": 4,
    "batch_size": 8,
    "gradient_accumulation_steps": 2
  }
}

QLoRA (Quantized LoRA):

{
  "xbrl_term_llama_3_1_8b_4bits_r4": {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/xbrl_term_train.jsonl",
    "lora_r": 4,
    "quant_bits": 4,
    "learning_rate": 0.0001,
    "num_epochs": 1,
    "batch_size": 4,
    "gradient_accumulation_steps": 2
  }
}

DoRA (Weight-Decomposed Low-Rank Adaptation):

{
  "sentiment_llama_3_1_8b_8bits_r8_dora": {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/finlora_sentiment_train.jsonl",
    "lora_r": 8,
    "quant_bits": 8,
    "learning_rate": 0.0001,
    "num_epochs": 4,
    "batch_size": 8,
    "gradient_accumulation_steps": 2,
    "peft_use_dora": true
  }
}

RSLoRA (Rank-Stabilized LoRA):

{
  "sentiment_llama_3_1_8b_8bits_r8_rslora": {
    "base_model": "meta-llama/Llama-3.1-8B-Instruct",
    "dataset_path": "../data/train/finlora_sentiment_train.jsonl",
    "lora_r": 8,
    "quant_bits": 8,
    "learning_rate": 0.0001,
    "num_epochs": 4,
    "batch_size": 8,
    "gradient_accumulation_steps": 2,
    "peft_use_rslora": true
  }
}
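
The DoRA and RSLoRA examples above differ from the vanilla LoRA entry only by one extra flag and a name suffix. As a rough sketch, a variant entry can be derived from an existing one as follows; make_variant and VARIANT_FLAGS are hypothetical names, and the suffix convention is inferred from the example configuration names.

```python
import copy

# Extra flag that each method adds to a vanilla LoRA configuration.
VARIANT_FLAGS = {
    "dora": {"peft_use_dora": True},
    "rslora": {"peft_use_rslora": True},
}


def make_variant(name, config, method):
    """Return (new_name, new_config) for a DoRA or RSLoRA variant of config."""
    if method not in VARIANT_FLAGS:
        raise ValueError(f"unknown method: {method!r}")
    # Copy so the original entry is left untouched.
    variant = copy.deepcopy(config)
    variant.update(VARIANT_FLAGS[method])
    return f"{name}_{method}", variant
```

The returned name and dictionary can then be added to finetune_configs.json as a new entry.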

Using Your LoRA Adapter

Once you have trained a LoRA adapter, you can run inference with it using the following code:

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Load and apply the LoRA adapter
adapter_path = "./path/to/your/adapter"  # Path to your adapter
model = PeftModel.from_pretrained(base_model, adapter_path)

# Generate text
prompt = "What is the formula for the Black-Scholes model?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        # Greedy decoding makes the response reproducible.
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

LoRA Adapters Directory Structure

The fine-tuned LoRA adapters are saved in the lora_adapters directory. This directory contains subdirectories for different quantization and rank configurations:

lora_adapters/4bits_r4: 4-bit quantization with rank 4
lora_adapters/8bits_r8: 8-bit quantization with rank 8
lora_adapters/8bits_r8_dora: 8-bit quantization with rank 8 using DoRA
lora_adapters/8bits_r8_rslora: 8-bit quantization with rank 8 using RSLoRA
lora_adapters/fp16_r8: FP16 precision with rank 8

Each subdirectory contains the fine-tuned adapters for different tasks, such as sentiment analysis, headline analysis, named entity recognition, etc.

These adapters can be loaded during evaluation to test the fine-tuned models.
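
When selecting adapters in an evaluation script, the subdirectory names listed above can be decoded into their settings. The sketch below assumes that every subdirectory follows the <precision>_r<rank>[_<method>] pattern seen in that list; parse_adapter_dir is a hypothetical helper, not part of FinLoRA.

```python
import re

# Matches names such as "8bits_r8", "4bits_r4", "fp16_r8", "8bits_r8_dora".
_ADAPTER_DIR = re.compile(
    r"^(?P<precision>\d+bits|fp16)_r(?P<rank>\d+)(?:_(?P<method>\w+))?$"
)


def parse_adapter_dir(name):
    """Split an adapter subdirectory name into precision, rank, and method."""
    match = _ADAPTER_DIR.match(name)
    if match is None:
        raise ValueError(f"unrecognized adapter directory: {name!r}")
    precision = match["precision"]
    # "8bits" -> 8; "fp16" is not quantized, so quant_bits is None.
    quant_bits = int(precision[:-4]) if precision.endswith("bits") else None
    return {
        "precision": precision,
        "quant_bits": quant_bits,
        "rank": int(match["rank"]),
        # No suffix means vanilla LoRA.
        "method": match["method"] or "lora",
    }
```

For example, parse_adapter_dir("8bits_r8_dora") reports 8-bit quantization, rank 8, and the dora method.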

Available Datasets and LoRA Methods

The following table lists the available datasets and LoRA methods in FinLoRA, along with links to their documentation:

Table 1 Datasets and LoRA Methods

| Dataset | Description | Configuration Name | Documentation |
| --- | --- | --- | --- |
| Sentiment Analysis | Financial sentiment analysis datasets (FPB, FiQA SA, TFNS, NWGI) | `sentiment_llama_3_1_8b_8bits_r8` | General Financial Tasks |
| Headline Analysis | Financial headline classification | `headline_llama_3_1_8b_8bits_r8` | General Financial Tasks |
| Named Entity Recognition | Financial named entity recognition | `ner_llama_3_1_8b_8bits_r8` | General Financial Tasks |
| FiNER-139 | XBRL tagging with 139 common US GAAP tags | `finer_llama_3_1_8b_8bits_r8` | Financial Data Reporting |
| XBRL Term | XBRL terminology explanation | `xbrl_term_llama_3_1_8b_8bits_r8` | Financial Data Reporting |
| XBRL Extraction | Tag and value extraction from XBRL documents | `xbrl_extract_llama_3_1_8b_8bits_r8` | Financial Statement Analysis |
| Financial Math | Financial mathematics problems | `formula_llama_3_1_8b_8bits_r8` | Financial Statement Analysis |
| FinanceBench | Financial benchmarking and analysis | `financebench_llama_3_1_8b_8bits_r8` | Financial Statement Analysis |

Table 2 LoRA Methods

| Method | Description | Configuration Parameter | Documentation |
| --- | --- | --- | --- |
| Vanilla LoRA | Low-Rank Adaptation | No special parameter needed | ../lora_methods/lora_methods |
| QLoRA | Quantized LoRA (4-bit) | `quant_bits: 4` | QLoRA |
| DoRA | Weight-Decomposed Low-Rank Adaptation | `peft_use_dora: true` | DoRA |
| RSLoRA | Rank-Stabilized LoRA | `peft_use_rslora: true` | rsLoRA |
| FedLoRA | Federated Learning with LoRA | Uses Flower framework | FedLoRA |