Billing for AI Studio

Krutrim Cloud's AI Studio offers access to a diverse catalogue of open-source and in-house AI models for text generation, embeddings, speech, and multimodal use cases. All models are billed based on usage, with costs varying depending on the type of input/output and the model in use.


Model Catalog Billing

Each model in AI Studio defines:

  • Billing mode (Input tokens, Output tokens, Audio minutes, etc.)

  • Unit of pricing (Per 1M tokens, Per minute, Per hour, etc.)

  • Rate (in ₹)

You are billed only for what you use. Charges are calculated based on usage metrics collected during API calls or console interactions with the models.
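For example, a request to DeepSeek-R1 that consumes 1,200 input tokens and 350 output tokens costs (1,200 / 1,000,000 × ₹11) + (350 / 1,000,000 × ₹16) ≈ ₹0.019. The sketch below shows the same calculation in Python; the rate table and helper function are illustrative, not part of an official SDK, so verify the numbers against the catalog and your invoices.

```python
# Illustrative cost estimate for a token-billed model.
# Rates (₹ per 1M tokens) are copied from the catalog below; confirm
# them in the console before relying on the numbers.
RATES_INR_PER_1M = {
    "DeepSeek-R1": {"input": 11.00, "output": 16.00},
    "Krutrim-2": {"input": 6.60, "output": 6.60},
}

def estimate_cost_inr(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost = tokens / 1,000,000 * rate, summed over input and output."""
    rates = RATES_INR_PER_1M[model]
    return (input_tokens / 1_000_000) * rates["input"] + (
        output_tokens / 1_000_000
    ) * rates["output"]

# 1,200 prompt tokens and 350 completion tokens on DeepSeek-R1 ≈ ₹0.0188
print(round(estimate_cost_inr("DeepSeek-R1", 1_200, 350), 4))
```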


Model Catalog Pricing

| Model Name | Input Rate (₹) | Output Rate (₹) | Unit Type | Task Type |
| --- | --- | --- | --- | --- |
| DeepSeek-R1-Distill-Llama-70B | ₹10.00 / 1M tokens | ₹10.00 / 1M tokens | Tokens | Text Generation |
| DeepSeek-R1-Distill-Llama-8B | ₹3.00 / 1M tokens | ₹3.00 / 1M tokens | Tokens | Text Generation |
| DeepSeek-R1 | ₹11.00 / 1M tokens | ₹16.00 / 1M tokens | Tokens | Text Generation |
| Google / Gemma-3-27b-it | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Multimodal |
| Krutrim / Krutrim-1 | ₹16.60 / 1M tokens | ₹16.60 / 1M tokens | Tokens | Text Generation |
| Krutrim / Krutrim-2 | ₹6.60 / 1M tokens | ₹6.60 / 1M tokens | Tokens | Text Generation |
| Krutrim / Krutrim-TTS | N/A | ₹4.42 / min | Minutes (output) | Text-to-Speech |
| Krutrim / Krutrim-Dhwani | ₹24.00 / hour | N/A | Hours (input) | Speech-to-Text |
| Krutrim / Bhasantarit | ₹6.26 / 1M tokens | N/A | Tokens (input) | Text-to-Embedding |
| Krutrim / Vyakyarth | ₹6.06 / 1M tokens | N/A | Tokens (input) | Text-to-Embedding |
| Krutrim / Chitrapathak | ₹83.60 / 1M tokens | ₹34.53 / 1M tokens | Tokens | Image-to-Text |
| Krutrim / Tokenizer | ₹0.00 | ₹0.00 | Tokens | Tokenization Utility |
| Meta / LLaMA-3.2-11B-Vision-Instruct | ₹14.94 / 1M tokens | ₹14.94 / 1M tokens | Tokens | Image-to-Text |
| Meta / LLaMA-3.3-70B-Instruct | ₹73.04 / 1M tokens | ₹73.04 / 1M tokens | Tokens | Text Generation |
| Meta / LLaMA-4-Maverick-17B-128E-Instruct | ₹17.00 / 1M tokens | ₹50.00 / 1M tokens | Tokens | Multimodal |
| Microsoft / Phi-4 Reasoning Plus | ₹5.00 / 1M tokens | ₹29.00 / 1M tokens | Tokens | Text Generation |
| Mistral / Mistral-7B-v0.2 | ₹16.60 / 1M tokens | ₹16.60 / 1M tokens | Tokens | Text Generation |
| Qwen / Qwen3-32B | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Text Generation |
| Qwen / Qwen3-30B-A3B | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Text Generation |


Additional Notes

  • Tokenization logic varies by model and is handled automatically using the model’s default tokenizer.

  • Speech models are priced per minute or hour of audio, depending on the type (TTS or STT); see the sketch after this list.

  • Multimodal models may count both text and vision tokens.
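
As a rough illustration of duration-based billing, the sketch below converts audio length into a charge using the Krutrim-TTS and Krutrim-Dhwani rates above. Whether partial minutes or hours are rounded up or billed fractionally is an assumption; fractional billing is shown here.

```python
# Sketch of duration-based billing for speech models, using the catalog
# rates above. Fractional minutes/hours are billed proportionally in this
# sketch; the platform's actual rounding behaviour is not documented here.
TTS_RATE_INR_PER_MIN = 4.42    # Krutrim-TTS, output audio
STT_RATE_INR_PER_HOUR = 24.00  # Krutrim-Dhwani, input audio

def tts_cost_inr(output_audio_seconds: float) -> float:
    return (output_audio_seconds / 60) * TTS_RATE_INR_PER_MIN

def stt_cost_inr(input_audio_seconds: float) -> float:
    return (input_audio_seconds / 3600) * STT_RATE_INR_PER_HOUR

# 90 seconds of synthesized speech ≈ ₹6.63; a 30-minute recording ≈ ₹12.00
print(round(tts_cost_inr(90), 2), round(stt_cost_inr(1800), 2))
```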


Evaluation Billing

Model Evaluations and Performance Evaluations on the Krutrim platform are billed at the same rates as inference, based on the number of tokens processed.

Token-Based Pricing

  • You are billed per token used during evaluation; a worked example follows this list.

  • There are no additional fees for launching or running evaluation jobs.
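
A worked example: evaluation cost is simply inference cost summed over the evaluation set. The average token counts per sample used below are illustrative, not platform defaults.

```python
# Sketch: evaluation cost = inference cost summed over the eval set.
def evaluation_cost_inr(num_samples: int,
                        avg_input_tokens: float,
                        avg_output_tokens: float,
                        input_rate_per_1m: float,
                        output_rate_per_1m: float) -> float:
    total_in = num_samples * avg_input_tokens
    total_out = num_samples * avg_output_tokens
    return (total_in / 1_000_000) * input_rate_per_1m + (
        total_out / 1_000_000
    ) * output_rate_per_1m

# 1,000 samples on DeepSeek-R1 (₹11 in / ₹16 out per 1M tokens),
# ~500 input and ~200 output tokens each ≈ ₹8.70
print(round(evaluation_cost_inr(1_000, 500, 200, 11.00, 16.00), 2))
```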

Low Balance Handling

  • If your account balance drops below a certain threshold, the platform will automatically pause evaluation services until funds are added.

Fine-Tuning Billing

Krutrim offers transparent and flexible pricing for both fine-tuning and deployment of custom models. Below is a detailed breakdown of the billing model:

Fine-Tuning Pricing

Fine-tuning costs are calculated per 1 million tokens, and pricing varies based on the model used.

| Model | Price (INR) | Unit |
| --- | --- | --- |
| Llama-3-8b-instruct | ₹33 | per 1M tokens |

Note: Pricing is token-based, so you pay only for the tokens actually processed during fine-tuning.
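
As an illustration, the sketch below converts a dataset size into an estimated fine-tuning bill at the ₹33 / 1M token rate. Treating each training epoch as a separate pass over the dataset's tokens is an assumption about how tokens are metered, so confirm against your actual usage.

```python
# Sketch: fine-tuning cost at ₹33 per 1M tokens (Llama-3-8b-instruct).
# Counting each epoch as another pass over the tokens is an assumption.
FT_RATE_INR_PER_1M = 33.0

def fine_tuning_cost_inr(dataset_tokens: int, epochs: int = 1) -> float:
    return (dataset_tokens * epochs / 1_000_000) * FT_RATE_INR_PER_1M

# A 5M-token dataset trained for 3 epochs ≈ ₹495
print(fine_tuning_cost_inr(5_000_000, epochs=3))
```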


Deployment Pricing

Once your model is fine-tuned, you can deploy it on an NVIDIA H100 GPU. Pricing is based on actual GPU usage.

| GPU Resource | Price (INR) | Unit |
| --- | --- | --- |
| NVIDIA H100 | ₹213 | per GPU-hour |

Deployment pricing includes inference usage — you are charged only for the active duration of your deployment.
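
For planning purposes, here is a minimal sketch that converts active deployment time into a GPU-hour charge at the ₹213 / GPU-hour rate. The billing granularity (per second versus per whole hour) and support for multi-GPU deployments are assumptions in this sketch.

```python
# Sketch: deployment cost on an NVIDIA H100 at ₹213 per GPU-hour.
# Fractional hours are billed proportionally here; actual granularity
# may differ, so treat this as a planning estimate only.
H100_RATE_INR_PER_HOUR = 213.0

def deployment_cost_inr(active_hours: float, gpu_count: int = 1) -> float:
    return active_hours * gpu_count * H100_RATE_INR_PER_HOUR

# One H100 kept active 8 hours a day for 5 days ≈ ₹8,520
print(deployment_cost_inr(8 * 5))
```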


Cost Management Best Practices

  • Optimize Your Dataset: High-quality, concise datasets reduce token usage and improve training efficiency.

  • Monitor Active Deployments: Shut down unused deployments to avoid incurring extra GPU charges.

  • Use Checkpoints: Save checkpoints during training to avoid repeating the entire process in case of interruptions.


Low Balance Handling

If your account balance falls below a defined threshold:

  • Fine-tuning and deployment services will be automatically paused.

  • Services resume once sufficient balance is added to the account.

Where to See This in Console

To see real-time usage and billing:

  1. Go to Billing → Usage → AI Studio

  2. Select the relevant sub-tab:

    • Model Catalogue

    • Fine Tuning

    • Deployment

    • Evaluation

  3. Use the date filter to narrow your view

  4. Click Export to download usage data
