Fine-Tuning

Fine-tuning allows you to adapt powerful pre-trained models to your own data and domain. This improves output quality, accuracy, and relevance for your specific use case.

AI Studio simplifies the fine-tuning process using efficient adapter-based methods, so you can achieve high-quality customization without the infrastructure complexity.


1. How Fine-Tuning Works on AI Studio

AI Studio supports fine-tuning via LoRA (Low-Rank Adaptation) adapters. These adapters:

  • Do not alter the base model weights

  • Are faster and cheaper to train than full fine-tuning

  • Enable task-specific customization

  • Allow easy rollback or switching between fine-tuned variants

All fine-tuning jobs are run on managed infrastructure with checkpointing and easy deployment built in.
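
For a sense of what an adapter does mathematically, the sketch below adds a trainable low-rank update on top of a frozen weight matrix. It is a conceptual illustration in NumPy, not AI Studio's training code; the dimensions and rank are arbitrary examples.

  import numpy as np

  # Conceptual LoRA sketch (not AI Studio's training code).
  # The base weight stays frozen; only the low-rank factors A and B would train.
  d_out, d_in, rank = 512, 512, 8           # arbitrary example dimensions

  W = np.random.randn(d_out, d_in)          # frozen base model weight
  A = np.random.randn(rank, d_in) * 0.01    # trainable low-rank factor
  B = np.zeros((d_out, rank))               # trainable, initialised to zero

  def forward(x):
      # Output = base projection + low-rank adapter update.
      # Because B starts at zero, the adapter initially changes nothing.
      return W @ x + B @ (A @ x)

  x = np.random.randn(d_in)
  print(forward(x).shape)   # (512,)

Because the base weights are untouched, switching between adapters (or rolling one back) is just a matter of swapping the small A and B factors.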


2. Creating a Fine-Tuning Job

You can create a fine-tuning job from the Fine-Tuning section of the console.

Step-by-Step

  1. Select a Model: Choose from supported models such as Llama-3 or Mistral. Only supported models appear in the dropdown.

  2. Upload a Dataset: The format must follow the structure below, either as JSONL (one object per line) or as a JSON array of such objects (a small preparation sketch appears after this list):

    {
      "instruction": "Summarize the following text:",
      "input": "Artificial Intelligence is transforming industries...",
      "output": "AI is revolutionizing industries by automating tasks..."
    }
  3. Configure Training Parameters

    • LoRA Rank: Controls the capacity of the adapter; a higher rank allows more detailed adaptation.

    • Learning Rate: Default is 1. Lower values are more conservative.

    • Batch Size: Default is 8. Tune based on GPU resources.

    • Checkpointing: Enable to resume from intermediate points or track progress.

  4. Launch Job: Once configured, submit the job. Resources are provisioned automatically.
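
If your data lives in Python objects, the sketch below writes them out in the JSONL layout shown in step 2. It uses only the standard library; the example records are placeholders.

  import json

  # Minimal JSONL preparation sketch. The three field names
  # ("instruction", "input", "output") follow the format shown above;
  # the example records themselves are placeholders.
  records = [
      {
          "instruction": "Summarize the following text:",
          "input": "Artificial Intelligence is transforming industries...",
          "output": "AI is revolutionizing industries by automating tasks...",
      },
      # ... add the rest of your examples here
  ]

  with open("train.jsonl", "w", encoding="utf-8") as f:
      for record in records:
          # One JSON object per line, no trailing commas.
          f.write(json.dumps(record, ensure_ascii=False) + "\n")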


3. Monitoring Progress

After launching, the console displays real-time metrics:

  • Training loss

  • Tokens processed

  • Checkpoint status

  • Remaining time

Checkpoints are saved periodically and can be used to resume or deploy at any stage.
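
If you prefer to track a job programmatically rather than in the console, a polling loop might look like the sketch below. The base URL, endpoint path, and response fields here are illustrative assumptions, not documented AI Studio endpoints; consult the API Reference for the actual contract.

  import time
  import requests

  # Hypothetical polling sketch: the URL, path, and response fields below are
  # assumptions for illustration, not documented AI Studio endpoints.
  API_BASE = "https://api.example-ai-studio.com/v1"   # placeholder base URL
  JOB_ID = "ft-job-123"                               # placeholder job id
  HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # placeholder credential

  while True:
      resp = requests.get(f"{API_BASE}/fine-tuning/jobs/{JOB_ID}", headers=HEADERS)
      resp.raise_for_status()
      job = resp.json()

      # Assumed response fields mirroring the console metrics listed above.
      print(job.get("status"), job.get("training_loss"), job.get("tokens_processed"))

      if job.get("status") in ("succeeded", "failed", "cancelled"):
          break
      time.sleep(60)  # poll once a minute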


4. Deploying the Fine-Tuned Model

Once training is complete:

  1. Click Deploy from the job page

  2. Choose a unique deployment name

  3. The model will be deployed to a dedicated NVIDIA H100 GPU instance

  4. Once deployed, it is accessible via the same API structure used for base models

You can manage or terminate deployments to control costs.
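
As a rough illustration of what calling the deployed model could look like, the sketch below sends a completion-style request over HTTP. The endpoint path and payload shape are assumptions for this example; the real request format for your deployment is defined in the API Reference.

  import requests

  # Hypothetical inference sketch: endpoint path and payload shape are
  # assumptions for illustration; see the API Reference for the real contract.
  API_BASE = "https://api.example-ai-studio.com/v1"   # placeholder base URL
  HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}  # placeholder credential

  payload = {
      "model": "my-finetuned-summarizer",  # the unique deployment name chosen above
      "prompt": "Summarize the following text: Artificial Intelligence is transforming industries...",
      "max_tokens": 128,
  }

  resp = requests.post(f"{API_BASE}/completions", headers=HEADERS, json=payload)
  resp.raise_for_status()
  print(resp.json())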


5. Best Practices for Dataset Preparation

  • Use high-quality, representative examples

  • Avoid noisy or inconsistent entries

  • Ensure input-output alignment for each task

  • Keep instruction phrasing consistent if possible

  • Aim for at least a few hundred examples; more if doing complex generation
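
A quick validation pass before uploading catches many of these issues automatically. The sketch below uses only the standard library to flag missing fields, empty outputs, and exact duplicates; extend the checks for your own task.

  import json
  from collections import Counter

  REQUIRED_KEYS = {"instruction", "input", "output"}

  def validate_jsonl(path):
      """Flag records with missing fields, empty outputs, or exact duplicates."""
      seen = Counter()
      with open(path, encoding="utf-8") as f:
          for line_no, line in enumerate(f, start=1):
              record = json.loads(line)
              missing = REQUIRED_KEYS - record.keys()
              if missing:
                  print(f"line {line_no}: missing fields {sorted(missing)}")
              elif not str(record["output"]).strip():
                  print(f"line {line_no}: empty output")
              seen[line.strip()] += 1
      duplicates = sum(count - 1 for count in seen.values())
      print(f"{sum(seen.values())} records checked, {duplicates} exact duplicates")

  validate_jsonl("train.jsonl")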


6. Supported Models

Currently supported for fine-tuning:

  • Llama-3-8B-Instruct

  • Llama-3-8B

  • Mistral-7B

More models will be added over time.


7. Pricing

Fine-tuning is billed per 1 million tokens processed during training.

Model                    Cost per 1M Tokens
Llama-3-8B-Instruct      ₹33

Deployment is billed per GPU-hour:

Deployment Type        GPU            Cost per Hour
Fine-tuned Model       NVIDIA H100    ₹215

For cost efficiency:

  • Review your dataset before launch

  • Use checkpointing to avoid reruns

  • Shut down deployments when not in use
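
For a rough cost estimate before launching, the rates above can be combined in a few lines. The dataset size, epoch count, and deployment hours in the sketch are illustrative numbers only.

  # Rough cost estimate using the rates listed above (₹33 per 1M training
  # tokens for Llama-3-8B-Instruct, ₹215 per H100 GPU-hour). The dataset
  # size and epoch count below are illustrative, not recommendations.
  TRAINING_RATE_PER_M_TOKENS = 33    # ₹ per 1M training tokens
  DEPLOYMENT_RATE_PER_HOUR = 215     # ₹ per H100 GPU-hour

  dataset_tokens = 2_000_000   # tokens in your training set (example)
  epochs = 3                   # passes over the data (example)
  deployment_hours = 24        # planned deployment time (example)

  training_cost = (dataset_tokens * epochs / 1_000_000) * TRAINING_RATE_PER_M_TOKENS
  deployment_cost = deployment_hours * DEPLOYMENT_RATE_PER_HOUR

  print(f"Training:   ₹{training_cost:,.0f}")    # ₹198 for this example
  print(f"Deployment: ₹{deployment_cost:,.0f}")  # ₹5,160 for this example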


8. Next Steps

  • Evaluate your fine-tuned model on benchmark tasks

  • Deploy the model for production use

  • Review Billing for usage-based pricing and limits

  • Consult the API Reference to integrate your fine-tuned model
