Catalog Overview

Browse and explore AI models from various providers in the Kloud Team catalog

The AI Models Catalog provides access to a wide range of pre-configured AI models from leading providers. Browse, compare, and use them on demand.

What's Available

Kloud Team's AI Models Catalog offers:

  • Self-Hosted Models – Deploy open-source models on your own infrastructure
  • Third-Party Models – Access models from Anthropic, OpenAI, xAI, and other providers
  • Multiple Model Types – Chat completion, code generation, vision, video, and more
  • Transparent Pricing – See token costs for prompts and completions upfront

Accessing the Catalog

Navigate to AI [Machine Learning] > Models > Catalog in the Kloud Team dashboard to browse available models.

Catalog Sections

The Models section includes:

  • Catalog – Browse all available AI models
  • My Library – View your deployed and saved models
  • Playground – Test models before deployment

Available Models

Self-Hosted Models

Self-hosted models run on your own Kloud Team infrastructure, giving you full control over data and deployment:

openai/gpt-oss-120b

  • Large open-source language model with 120B parameters
  • Context Length: 128K tokens
  • Prompts: 16,050 Tomans / 1,000,000 Tokens
  • Completions: 64,200 Tomans / 1,000,000 Tokens
  • Use cases: Chat completion, general AI tasks
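
Self-hosted deployments of open models such as gpt-oss-120b are often served behind an OpenAI-compatible chat completions endpoint. The sketch below assumes that is the case for your deployment; the base URL and environment variable names are placeholders, so substitute the values from your own deployment details.

  # Minimal sketch: querying a self-hosted gpt-oss-120b deployment.
  # Assumes an OpenAI-compatible endpoint; the URL and variable names
  # below are placeholders, not documented Kloud Team values.
  import os
  from openai import OpenAI

  client = OpenAI(
      base_url=os.environ["SELF_HOSTED_MODEL_URL"],          # your deployment's endpoint
      api_key=os.environ.get("SELF_HOSTED_API_KEY", "none"),  # if your deployment requires one
  )

  response = client.chat.completions.create(
      model="openai/gpt-oss-120b",
      messages=[{"role": "user", "content": "Give a one-paragraph summary of vector databases."}],
      max_tokens=256,
  )
  print(response.choices[0].message.content)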

Third-Party Models

Third-party models are hosted by the provider and accessed via API:
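
What such an API call looks like depends on the gateway in front of the provider. As a rough illustration only, the sketch below posts a chat-style request with the requests library; the endpoint URL, authorization header, and payload shape are assumptions, so consult the Kloud Team API reference for the actual contract.

  # Illustrative sketch only: the endpoint URL, auth header, and payload
  # shape are assumptions, not a documented Kloud Team contract.
  import os
  import requests

  API_URL = "https://api.example.com/v1/chat/completions"               # placeholder endpoint
  headers = {"Authorization": f"Bearer {os.environ['KLOUD_API_KEY']}"}  # hypothetical key variable

  payload = {
      "model": "anthropic/claude-sonnet-4-5",
      "messages": [{"role": "user", "content": "Summarize this meeting transcript in three bullets."}],
      "max_tokens": 512,
  }

  resp = requests.post(API_URL, headers=headers, json=payload, timeout=60)
  resp.raise_for_status()
  print(resp.json())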

anthropic/claude-sonnet-4-5

  • Anthropic's balanced flagship model
  • Context Length: 200K tokens
  • Prompts: 321,000 Tomans / 1,000,000 Tokens
  • Completions: 1,605,000 Tomans / 1,000,000 Tokens
  • Capabilities: Chat completion, vision
  • Best for: Advanced reasoning with multimodal inputs

anthropic/claude-opus-4-1

  • Advanced reasoning and long-context understanding
  • Context Length: 200K tokens
  • Prompts: 58,850 Tomans / 1,000,000 Tokens
  • Completions: 234,330 Tomans / 1,000,000 Tokens
  • Capabilities: Code generation, messages
  • Best for: Complex coding tasks and analysis

anthropic/claude-haiku-4-5

  • Anthropic's fastest and most efficient model
  • Context Length: 200K tokens
  • Prompts: 321,000 Tomans / 1,000,000 Tokens
  • Completions: 1,605,000 Tomans / 1,000,000 Tokens
  • Capabilities: Code generation, messages
  • Best for: Quick reasoning and high-throughput tasks

openai/gpt-5-chat

  • OpenAI's next-generation conversational model
  • Context Length: 128K tokens
  • Prompts: 321,000 Tomans / 1,000,000 Tokens
  • Completions: 1,605,000 Tomans / 1,000,000 Tokens
  • Capabilities: Chat completion, messages
  • Best for: Advanced conversational AI

openai/sora

  • Advanced text-to-video generation model
  • Context Length: 128K tokens
  • Capabilities: Video generation
  • Best for: Creating videos from text descriptions

x-ai/grok-3

  • xAI's latest conversational reasoning model
  • Context Length: 128K tokens
  • Prompts: 321,000 Tomans / 1,000,000 Tokens
  • Completions: 1,605,000 Tomans / 1,000,000 Tokens
  • Capabilities: Chat completion, messages
  • Best for: Fast, intelligent interactions

Browsing Models

1. Search and Filter

Use the search bar to find models by name.

Models are organized by provider and capabilities (Chat completion, Code, Vision, Video, Messages).

2. Review Model Details

Each model card displays:

  • Model Name – Provider and model identifier
  • Hosting Type – Self Hosted or Third Party
  • Capabilities – Supported features (chat, code, vision, etc.)
  • Context Length – Maximum token context window
  • Prompts Pricing – Cost per 1M input tokens
  • Completions Pricing – Cost per 1M output tokens

3. Compare Models

Compare models based on:

  • Context Length – Larger contexts for longer documents
  • Pricing – Balance cost vs. performance
  • Capabilities – Match features to your use case
  • Hosting Type – Self-hosted for data control, third-party for convenience
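
As a rough aid for this kind of comparison, the sketch below ranks a few catalog entries by estimated cost per request for a required capability. The per-1M-token prices are the ones listed on this page; the average token counts and capability tags are simplified assumptions you would replace with your own workload numbers.

  # Rough comparison helper. Prices (Tomans per 1M tokens) are taken from
  # this catalog page; per-request token counts are assumed, not measured.
  CATALOG = {
      "openai/gpt-oss-120b":        {"prompt": 16_050,  "completion": 64_200,    "caps": {"chat"}},
      "anthropic/claude-opus-4-1":  {"prompt": 58_850,  "completion": 234_330,   "caps": {"code", "messages"}},
      "anthropic/claude-haiku-4-5": {"prompt": 321_000, "completion": 1_605_000, "caps": {"code", "messages"}},
  }

  def cost_per_request(prices, prompt_tokens, completion_tokens):
      """Estimated Tomans per request at the listed per-1M-token rates."""
      return (prompt_tokens * prices["prompt"] + completion_tokens * prices["completion"]) / 1_000_000

  needed_cap = "code"
  avg_prompt, avg_completion = 1_500, 500  # assumed averages per request

  candidates = [name for name, entry in CATALOG.items() if needed_cap in entry["caps"]]
  for name in sorted(candidates, key=lambda n: cost_per_request(CATALOG[n], avg_prompt, avg_completion)):
      estimate = cost_per_request(CATALOG[name], avg_prompt, avg_completion)
      print(f"{name}: ~{estimate:,.0f} Tomans per request")

With these assumed token counts, claude-opus-4-1 comes out around 205 Tomans per request and claude-haiku-4-5 around 1,284 Tomans, which is the kind of trade-off the Pricing and Capabilities criteria above are meant to surface.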

Model Types and Use Cases

Chat Completion Models

Best for conversational AI, customer support, and general-purpose text generation:

  • claude-sonnet-4-5
  • claude-haiku-4-5
  • gpt-5-chat
  • gpt-oss-120b
  • grok-3

Code Generation Models

Optimized for programming tasks, code review, and technical documentation:

  • claude-opus-4-1
  • claude-haiku-4-5

Vision Models

Process and understand images alongside text:

  • claude-sonnet-4-5 (multimodal)

Video Models

Generate videos from text descriptions:

  • openai/sora

Playground

Test models in the Playground before deploying them or building them into your applications.

Pricing Structure

Model pricing is based on token usage:

  • Prompts (Input) – Tokens in your request
  • Completions (Output) – Tokens in the model's response
  • Pricing Unit – Tomans per 1,000,000 tokens
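
Put differently, a request's cost is each token count divided by 1,000,000 and multiplied by the matching rate. A quick worked example using the openai/gpt-oss-120b rates listed above (the token counts themselves are made up for illustration):

  # Worked example with the openai/gpt-oss-120b rates from this page.
  # The token counts below are illustrative, not measured usage.
  PROMPT_RATE = 16_050        # Tomans per 1,000,000 prompt tokens
  COMPLETION_RATE = 64_200    # Tomans per 1,000,000 completion tokens

  prompt_tokens = 2_000
  completion_tokens = 800

  cost = (prompt_tokens / 1_000_000) * PROMPT_RATE \
       + (completion_tokens / 1_000_000) * COMPLETION_RATE
  print(f"Estimated cost: {cost:.2f} Tomans")  # 32.10 + 51.36 = 83.46 Tomans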

Cost Optimization:

To minimize costs:

  • Use smaller context windows when possible
  • Choose efficient models like Haiku for simple tasks
  • Monitor LLM token usage on the dashboard overview page

Monitoring LLM Token Usage

Track LLM token consumption on the dashboard overview page to keep prompt and completion costs visible as your usage grows.