Billing for AI Studio
Krutrim Cloud's AI Studio offers access to a diverse catalog of open-source and in-house AI models for text generation, embeddings, speech, and multimodal use cases. All models are billed on usage; the cost depends on the model and on the type of input and output.
Model Catalog Billing
Each model in AI Studio defines:
Billing mode (Input tokens, Output tokens, Audio minutes, etc.)
Unit of pricing (Per 1M tokens, Per minute, Per hour, etc.)
Rate (in ₹)
You are billed only for what you use. Charges are calculated based on usage metrics collected during API calls or console interactions with the models.
Model Catalog Pricing
| Model | Input rate | Output rate | Billing unit | Category |
| --- | --- | --- | --- | --- |
| DeepSeek-R1-Distill-Llama-70B | ₹10.00 / 1M tokens | ₹10.00 / 1M tokens | Tokens | Text Generation |
| DeepSeek-R1-Distill-Llama-8B | ₹3.00 / 1M tokens | ₹3.00 / 1M tokens | Tokens | Text Generation |
| DeepSeek-R1 | ₹11.00 / 1M tokens | ₹16.00 / 1M tokens | Tokens | Text Generation |
| Google / Gemma-3-27b-it | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Multimodal |
| Krutrim / Krutrim-1 | ₹16.6 / 1M tokens | ₹16.6 / 1M tokens | Tokens | Text Generation |
| Krutrim / Krutrim-2 | ₹6.6 / 1M tokens | ₹6.6 / 1M tokens | Tokens | Text Generation |
| Krutrim / Krutrim-TTS | N/A | ₹4.42 / min | Minutes (output) | Text-to-Speech |
| Krutrim / Krutrim-Dhwani | ₹24.00 / hour | N/A | Hours (input) | Speech-to-Text |
| Krutrim / Bhasantarit | ₹6.26 / 1M tokens | N/A | Tokens (input) | Text-to-Embedding |
| Krutrim / Vyakyarth | ₹6.06 / 1M tokens | N/A | Tokens (input) | Text-to-Embedding |
| Krutrim / Chitrapathak | ₹83.6 / 1M tokens | ₹34.53 / 1M tokens | Tokens | Image-to-Text |
| Krutrim / Tokenizer | ₹0.00 | ₹0.00 | Tokens | Tokenization Utility |
| Meta / LLaMA-3.2-11B-Vision-Instruct | ₹14.94 / 1M tokens | ₹14.94 / 1M tokens | Tokens | Image-to-Text |
| Meta / LLaMA-3.3-70B-Instruct | ₹73.04 / 1M tokens | ₹73.04 / 1M tokens | Tokens | Text Generation |
| Meta / LLaMA-4-Maverick-17B-128E-Instruct | ₹17.00 / 1M tokens | ₹50.00 / 1M tokens | Tokens | Multimodal |
| Microsoft / Phi-4 Reasoning Plus | ₹5.00 / 1M tokens | ₹29.00 / 1M tokens | Tokens | Text Generation |
| Mistral / Mistral-7B-v0.2 | ₹16.6 / 1M tokens | ₹16.6 / 1M tokens | Tokens | Text Generation |
| Qwen / Qwen3-32B | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Text Generation |
| Qwen / Qwen3-30B-A3B | ₹8.00 / 1M tokens | ₹25.00 / 1M tokens | Tokens | Text Generation |
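To make the token rates above concrete, here is a minimal Python sketch of how the charge for a single request could be estimated when input and output tokens carry different rates. The dictionary keys are the catalog labels from this page, not necessarily the model identifiers used in API calls, and the function name `request_cost` is hypothetical. Treat the result as an estimate; the authoritative figure is the one shown in the console.

```python
from decimal import Decimal

TOKENS_PER_UNIT = Decimal(1_000_000)  # rates are quoted per 1M tokens

# (input rate, output rate) in rupees per 1M tokens, copied from the catalog above.
# Keys are catalog labels, which may differ from API model identifiers.
RATES = {
    "DeepSeek-R1": (Decimal("11.00"), Decimal("16.00")),
    "Qwen / Qwen3-32B": (Decimal("8.00"), Decimal("25.00")),
    "Krutrim / Krutrim-2": (Decimal("6.6"), Decimal("6.6")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the charge in rupees for one request (hypothetical helper)."""
    input_rate, output_rate = RATES[model]
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200,000 input tokens and 50,000 output tokens on DeepSeek-R1
    print(request_cost("DeepSeek-R1", 200_000, 50_000))
```

For the example call, the estimate is ₹3.00: ₹2.20 for input (200,000 tokens at ₹11.00 per 1M) plus ₹0.80 for output (50,000 tokens at ₹16.00 per 1M).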
Additional Notes
Tokenization logic varies by model and is handled automatically using the model’s default tokenizer.
Speech models are priced by audio duration: text-to-speech (TTS) per minute of output audio, and speech-to-text (STT) per hour of input audio.
Multimodal models may count both text and vision tokens.
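Duration-based pricing for the speech models can be sketched in the same way. The example below uses the Krutrim-TTS and Krutrim-Dhwani rates from the catalog and assumes that partial minutes and hours are billed proportionally; this page does not state how durations are rounded, so that is an assumption, and both function names are hypothetical.

```python
from decimal import Decimal

TTS_RATE_PER_MINUTE = Decimal("4.42")  # Krutrim-TTS, billed on output audio
STT_RATE_PER_HOUR = Decimal("24.00")   # Krutrim-Dhwani, billed on input audio


def tts_cost(output_seconds: int) -> Decimal:
    """Estimated charge for generated speech.

    Assumes proportional billing of partial minutes (not confirmed by the docs).
    """
    return Decimal(output_seconds) / Decimal(60) * TTS_RATE_PER_MINUTE


def stt_cost(input_seconds: int) -> Decimal:
    """Estimated charge for transcribed audio.

    Assumes proportional billing of partial hours (not confirmed by the docs).
    """
    return Decimal(input_seconds) / Decimal(3600) * STT_RATE_PER_HOUR


if __name__ == "__main__":
    print(tts_cost(90))    # 1.5 minutes of generated audio
    print(stt_cost(1800))  # half an hour of input audio
```

Under these assumptions, 90 seconds of generated speech comes to ₹6.63 and 30 minutes of transcription to ₹12.00.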
Evaluation Billing
Model Evaluations and Performance Evaluations on the Krutrim platform are billed the same way as inference: by the number of tokens processed.
Token-Based Pricing
You are billed per token used during evaluation.
There are no additional fees for launching or running evaluation jobs.
Low Balance Handling
If your account balance drops below a certain threshold, the platform will automatically pause evaluation services until funds are added.
Fine-Tuning Billing
Krutrim offers transparent and flexible pricing for both fine-tuning and deployment of custom models. Below is a detailed breakdown of the billing model:
Fine-Tuning Pricing
Fine-tuning costs are calculated per 1 million tokens, and pricing varies based on the model used.
| Model | Price | Unit |
| --- | --- | --- |
| Llama-3-8b-instruct | ₹33 | per 1M tokens |
Note: Pricing is token-based, so you pay only for the tokens processed during fine-tuning.
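As a rough illustration of the fine-tuning rate, the sketch below estimates the cost of a job from its billed token count. This page does not say how billed tokens are counted across training epochs, so `estimate_billed_tokens` (dataset tokens multiplied by epochs) is an assumption included only for planning purposes; both function names are hypothetical.

```python
from decimal import Decimal

FINE_TUNE_RATE_PER_MILLION = Decimal("33")  # Llama-3-8b-instruct, rupees per 1M tokens


def estimate_billed_tokens(dataset_tokens: int, epochs: int = 1) -> int:
    """ASSUMPTION: every epoch re-processes the full dataset and is billed.

    The documentation does not confirm this; check the console for actual usage.
    """
    return dataset_tokens * epochs


def fine_tuning_cost(billed_tokens: int) -> Decimal:
    """Estimated fine-tuning charge in rupees for a given billed token count."""
    return Decimal(billed_tokens) / Decimal(1_000_000) * FINE_TUNE_RATE_PER_MILLION


if __name__ == "__main__":
    tokens = estimate_billed_tokens(dataset_tokens=1_000_000, epochs=3)
    print(fine_tuning_cost(tokens))
```

With a 1M-token dataset and three epochs, the estimate under this assumption is ₹99.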
Deployment Pricing
Once your model is fine-tuned, you can deploy it on an NVIDIA H100 GPU. Pricing is based on actual GPU usage.
| GPU | Price | Unit |
| --- | --- | --- |
| NVIDIA H100 | ₹213 | per GPU-hour |
Deployment pricing includes inference usage; you are charged only for the time your deployment is active.
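Because deployment is charged per GPU-hour of active time, the cost of a deployment can be estimated from its start and stop times. The sketch below assumes a single H100 and proportional billing of partial hours; the billing granularity is not stated on this page, so that is an assumption, and `deployment_cost` is a hypothetical helper.

```python
from datetime import datetime
from decimal import Decimal

H100_RATE_PER_GPU_HOUR = Decimal("213")  # rupees per GPU-hour


def deployment_cost(started_at: datetime, stopped_at: datetime) -> Decimal:
    """Estimated charge for one H100 deployment between two timestamps.

    Assumes partial hours are billed proportionally (not confirmed by the docs).
    """
    if stopped_at < started_at:
        raise ValueError("stopped_at must not be earlier than started_at")
    active_seconds = Decimal(int((stopped_at - started_at).total_seconds()))
    return active_seconds / Decimal(3600) * H100_RATE_PER_GPU_HOUR


if __name__ == "__main__":
    start = datetime(2025, 1, 1, 9, 0)
    stop = datetime(2025, 1, 1, 17, 30)
    print(deployment_cost(start, stop))  # 8.5 active hours
```

A deployment left running for 8.5 hours would cost about ₹1,810.50 under these assumptions, which is why shutting down idle deployments matters.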
Cost Management Best Practices
Optimize Your Dataset: High-quality, concise datasets reduce token usage and improve training efficiency.
Monitor Active Deployments: Shut down unused deployments to avoid incurring extra GPU charges.
Use Checkpoints: Save checkpoints during training to avoid repeating the entire process in case of interruptions.
Low Balance Handling
If your account balance falls below a defined threshold:
Fine-tuning and deployment services will be automatically paused.
Services can be resumed once sufficient balance is added to the account.
Where to See This in Console
To see real-time usage and billing:
Go to Billing → Usage → AI Studio
Select the relevant sub-tab:
Model Catalogue
Fine Tuning
Deployment
Evaluation
Use the date filter to narrow your view
Click Export to download usage data