> For the complete documentation index, see [llms.txt](https://docs.ai.neevcloud.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.ai.neevcloud.com/api-reference/gpu-instance/airuntime.md).

# AIRuntime

Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.

## List GPU Instances

> Retrieves a paginated list of all GPU Instances (AI Runtimes) in the project.

```json
{"openapi":"3.0.3","info":{"title":"GPU Instance API","version":"0.1.0"},"tags":[{"name":"AIRuntime","description":"Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/gpu","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"AIRuntimeListResponse":{"type":"object","description":"A response object containing a list of AI Runtimes and pagination details.","properties":{"data":{"type":"array","items":{"$ref":"#/components/schemas/AIRuntimeListItem"}},"pagination":{"$ref":"#/components/schemas/PaginationResponse"}}},"AIRuntimeListItem":{"type":"object","description":"Summary representation of an AI Runtime for list responses.","required":["id","name","region","project","status","gpuConfigID","planId","createdAt"],"properties":{"id":{"type":"string","format":"uuid","readOnly":true,"description":"The unique system-generated identifier for the AI Runtime."},"name":{"type":"string","description":"The user-defined name of the AI Runtime."},"region":{"type":"string","description":"The region where the AI Runtime is deployed."},"project":{"type":"string","description":"The project this AI Runtime belongs to."},"gpuConfigID":{"type":"string","description":"The ID of the GPU configuration to use for the AI Runtime."},"planId":{"type":"string","description":"The pricing plan ID for billing the AI 
Runtime."},"templateID":{"type":"string","readOnly":true,"description":"Template identifier used during deployment."},"gpuCount":{"type":"integer","format":"int32","readOnly":true,"description":"Number of GPUs allocated."},"sshKeys":{"type":"array","readOnly":true,"description":"SSH public keys for accessing the AI Runtime instance.","items":{"type":"string"}},"status":{"type":"string","enum":["Pending","Running","Succeeded","Failed","Terminating"],"readOnly":true,"description":"The current operational status of the AI Runtime."},"reason":{"type":"string","readOnly":true,"nullable":true,"description":"The reason for the current status. Populated when the AI Runtime is in a Failed state, containing the exact Kubernetes container error (e.g. CrashLoopBackOff, ImagePullBackOff)."},"storage":{"$ref":"#/components/schemas/StorageResponse"},"createdAt":{"type":"string","format":"date-time","readOnly":true,"description":"The timestamp when the AI Runtime was created."}}},"StorageResponse":{"type":"object","description":"Represents a storage volume.","properties":{"id":{"type":"string","format":"uuid","description":"The unique identifier for the storage volume."},"type":{"type":"string","description":"The type of the storage volume (ephemeral or persistent).","enum":["ephemeral","persistent"]},"subType":{"type":"string","description":"The name of the storage volume.","enum":["local","network"]},"name":{"type":"string","description":"The name of the storage volume."},"sizeGB":{"type":"integer","format":"int32","description":"The size of the storage volume in gigabytes."},"planId":{"type":"string","description":"The pricing plan ID for the persistent storage."}}},"PaginationResponse":{"type":"object","properties":{"total_items":{"type":"integer","description":"The total number of items available across all pages."},"total_pages":{"type":"integer","description":"The total number of pages."},"current_page":{"type":"integer","description":"The current page 
number."},"items_per_page":{"type":"integer","description":"The number of items returned per page."}}},"ErrorResponse":{"type":"object","description":"A standard format for error responses.","properties":{"code":{"type":"string","description":"A machine-readable error code."},"message":{"type":"string","description":"A human-readable message providing details about the error."}},"required":["code","message"]}}},"paths":{"/api/v1beta1/orgs/{org_id}/projects/{project_id}/airuntimes":{"get":{"tags":["AIRuntime"],"summary":"List GPU Instances","description":"Retrieves a paginated list of all GPU Instances (AI Runtimes) in the project.","operationId":"listAIRuntimes","parameters":[{"in":"path","name":"org_id","required":true,"schema":{"type":"string"},"description":"The organization identifier."},{"in":"path","name":"project_id","required":true,"schema":{"type":"string"},"description":"The project identifier."},{"in":"query","name":"page","schema":{"type":"integer","minimum":1,"default":1},"description":"Page number for pagination (starts from 1)."},{"in":"query","name":"limit","schema":{"type":"integer","minimum":1,"maximum":100,"default":20},"description":"Number of instances to return per page (max 100)."}],"responses":{"200":{"description":"A paginated list of GPU Instances.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AIRuntimeListResponse"}}}},"400":{"description":"The request is invalid.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"401":{"description":"Unauthorized access.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"404":{"description":"The project was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"500":{"description":"An unexpected internal server error occurred.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}}}}}
```
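The pagination contract above (`page` starting at 1, `limit` capped at 100, and a `pagination.total_pages` field in the response) can be walked with a small loop. The following is a minimal sketch using only the Python standard library. The helper names (`list_url`, `fetch_page`, `iter_instances`) are hypothetical, and it assumes the token is sent as a standard `Authorization: Bearer <token>` header, which is what the `http`/`bearer` security scheme implies.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.ai.neevcloud.com/gpu"  # taken from the spec's `servers` entry


def list_url(org_id, project_id, page=1, limit=20):
    """Build the list URL; `page` starts at 1 and `limit` must be 1..100."""
    if page < 1 or not 1 <= limit <= 100:
        raise ValueError("page must be >= 1 and limit between 1 and 100")
    path = "/api/v1beta1/orgs/{}/projects/{}/airuntimes".format(
        urllib.parse.quote(org_id, safe=""),
        urllib.parse.quote(project_id, safe=""),
    )
    query = urllib.parse.urlencode({"page": page, "limit": limit})
    return BASE_URL + path + "?" + query


def fetch_page(url, token):
    """GET one page and decode the JSON body (performs a real HTTP call)."""
    # Assumption: the access token is sent as a standard bearer header.
    req = urllib.request.Request(url, headers={"Authorization": "Bearer " + token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_instances(org_id, project_id, token, limit=20, fetch=fetch_page):
    """Yield every list item, following `pagination.total_pages`."""
    page = 1
    while True:
        body = fetch(list_url(org_id, project_id, page, limit), token)
        yield from body.get("data", [])
        # If pagination details are missing, stop after the current page.
        total_pages = body.get("pagination", {}).get("total_pages", page)
        if page >= total_pages:
            break
        page += 1
```

Passing `fetch` as a parameter keeps the paging logic separate from the transport, so it can be exercised against canned responses before pointing it at a real project.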

## Create a GPU Instance

> Creates a new GPU Instance from a platform template or your organization's custom template.
>
> **Steps before calling this endpoint:**
>
> 1. Get a `templateID` from `GET /api/v1beta1/airuntime-templates` (platform templates) or `GET /api/v1beta1/orgs/{org_id}/airuntime-templates` (custom templates).
> 2. Get a `gpuConfigID` from `GET /api/v1beta1/inventory`; use the `config_id` field.
> 3. Get a `planId` from the inventory response's `price_per_gpu_per_hour` context or from the NeevAI console pricing page.
>
> The instance starts asynchronously. Poll `GET .../airuntimes/{airuntime_id}` until `status` is `Running`, then access it via `uiAccess.proxyUrl` or `sshAccess.proxyCommand`.

```json
{"openapi":"3.0.3","info":{"title":"GPU Instance API","version":"0.1.0"},"tags":[{"name":"AIRuntime","description":"Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/gpu","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"AIRuntimeRequest":{"type":"object","description":"Configuration for creating a GPU Instance (AI Runtime).","required":["name","region","templateID","gpuConfigID","gpuCount","planId"],"properties":{"name":{"type":"string","description":"A unique name for this GPU Instance within the project. Must be lowercase and may only contain alphanumeric characters and hyphens (DNS-1123 label format).\n","pattern":"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$","maxLength":63},"region":{"type":"string","description":"The region where the GPU Instance will run. Must be a region where the requested GPU type has availability (see `GET /api/v1beta1/inventory`).\n"},"templateID":{"type":"string","description":"The identifier of the Template to use for this instance. Templates define the container image and pre-installed applications (e.g. JupyterLab, SSH).\n- Get platform template IDs from `GET /api/v1beta1/airuntime-templates`. 
- Get custom template IDs from `GET /api/v1beta1/orgs/{org_id}/airuntime-templates`.\nTemplate IDs follow the format `tpl-<name>`.\n"},"gpuConfigID":{"type":"string","description":"The GPU hardware configuration ID that determines the GPU type, VRAM, and associated CPU/memory allocation. Obtain from the `config_id` field in `GET /api/v1beta1/inventory`.\n"},"gpuCount":{"type":"integer","format":"int32","description":"Number of GPUs to allocate. Must be between `min_gpu_count` and `max_gpu_count` as returned by the inventory API for the selected `gpuConfigID`.\n","minimum":1},"sshKeys":{"type":"array","description":"Optional list of SSH public keys to install on the instance. Required if you want SSH access. Supported formats: `ssh-rsa` (RSA ≥2048 bits) and `ssh-ed25519`.\n","items":{"type":"string","pattern":"^(ssh-rsa|ssh-ed25519)\\s+[A-Za-z0-9+/]+[=]{0,2}(\\s+.*)?$","minLength":50,"maxLength":2000,"description":"SSH public key in OpenSSH format."},"minItems":0,"maxItems":10,"uniqueItems":true},"planId":{"type":"string","description":"The billing plan identifier that determines the hourly rate for this GPU Instance. Plan IDs are visible in the NeevAI console pricing page and follow the format `gpu-<model>-<commitment>` (e.g. `gpu-h200-on-demand-1h`, `gpu-a100-reserved-1m`).\n"},"customPorts":{"type":"array","description":"Optional list of additional container ports to expose externally. Useful when your container runs a custom web service on a non-standard port. Maximum 10 custom ports. Custom ports are TCP and are managed independently of `udpPortRanges`.\n","items":{"$ref":"#/components/schemas/CustomPortSpec"},"maxItems":10},"udpPortRanges":{"type":"array","description":"Optional list of UDP port ranges to expose externally for real-time, low-latency traffic (e.g. call-center voice / RTP). Each entry is an inclusive range; a single port is expressed as `startPort == endPort`. At most 5 ranges, 20 ports per range, and 64 UDP ports in total across all ranges. 
UDP ranges are managed independently of TCP `customPorts`, so the same port number may appear in both.\n","items":{"$ref":"#/components/schemas/UdpPortRangeSpec"},"maxItems":5},"storage":{"$ref":"#/components/schemas/AIRuntimeStorageSpec"}}},"CustomPortSpec":{"type":"object","description":"Defines a custom port to be exposed by the AI Runtime.","required":["name","containerPort"],"properties":{"name":{"type":"string","description":"Unique name for the port. Must be a valid DNS label.","pattern":"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$","maxLength":15},"containerPort":{"type":"integer","format":"int32","description":"Port number exposed by the container.","minimum":1,"maximum":65535},"protocol":{"type":"string","enum":["TCP"],"default":"TCP","description":"Network protocol. Only TCP is supported."},"serviceType":{"type":"string","enum":["External"],"default":"External","description":"Service type. 'External' exposes the port via a LoadBalancer."}}},"UdpPortRangeSpec":{"type":"object","description":"Defines an inclusive range of UDP ports to expose externally. A single port is represented by setting `endPort` equal to `startPort` (or omitting `endPort`). Intended for real-time, low-latency workloads such as voice (RTP) traffic.\n","required":["startPort"],"properties":{"name":{"type":"string","description":"Optional unique name for the range. Must be a valid DNS label. Auto-generated from the port range when omitted.\n","pattern":"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$","maxLength":15},"startPort":{"type":"integer","format":"int32","description":"First port in the inclusive range. Must be between 1024 and 65535.","minimum":1024,"maximum":65535},"endPort":{"type":"integer","format":"int32","description":"Last port in the inclusive range. Must be >= `startPort`, and the range may span at most 20 ports. 
Defaults to `startPort` (a single port) when omitted.\n","minimum":1024,"maximum":65535}}},"AIRuntimeStorageSpec":{"description":"Optional storage configuration for persistent storage (local or network).","oneOf":[{"$ref":"#/components/schemas/EphemeralStorageSpec"},{"$ref":"#/components/schemas/PersistentStorageSpec"}],"discriminator":{"propertyName":"type","mapping":{"ephemeral":"#/components/schemas/EphemeralStorageSpec","persistent":"#/components/schemas/PersistentStorageSpec"}}},"EphemeralStorageSpec":{"type":"object","description":"Ephemeral storage backed by node disk. Data is lost when pod terminates.","required":["type"],"properties":{"type":{"type":"string","enum":["ephemeral"]},"sizeGb":{"type":"integer","minimum":1,"maximum":20,"default":20}}},"PersistentStorageSpec":{"type":"object","description":"Persistent storage configuration.","required":["type","mode"],"properties":{"type":{"type":"string","enum":["persistent"]},"mode":{"type":"string","enum":["local","network"]}},"oneOf":[{"$ref":"#/components/schemas/LocalStorageConfig"},{"$ref":"#/components/schemas/NetworkStorageConfig"}],"discriminator":{"propertyName":"mode","mapping":{"local":"#/components/schemas/LocalStorageConfig","network":"#/components/schemas/NetworkStorageConfig"}}},"LocalStorageConfig":{"type":"object","description":"Configuration for local storage volume.","required":["mode","sizeGB"],"properties":{"mode":{"type":"string","enum":["local"]},"sizeGB":{"type":"integer","format":"int32","description":"Size of the local storage volume in gigabytes."},"mountPath":{"type":"string","description":"The path where the local storage volume will be mounted."}}},"NetworkStorageConfig":{"type":"object","description":"Configuration for network storage volume. Either create a new volume (provide sizeGB) or reuse an existing volume (provide existing_volume_id). 
Network storage can be shared across multiple GPU instances in the same region.","required":["mode"],"properties":{"mode":{"type":"string","enum":["network"]},"name":{"type":"string","description":"Optional name for a new volume. If not provided, a name will be auto-generated."},"sizeGB":{"type":"integer","format":"int32","description":"Size of the network storage volume in gigabytes (required when creating new volume)."},"mountPath":{"type":"string","description":"The path where the network storage volume will be mounted."},"existing_volume_id":{"type":"string","format":"uuid","description":"UUID of an existing network volume to reuse. The volume must belong to the same org/project/region and be in 'bound' status. Network storage volumes can be shared across multiple GPU instances in the same region. When provided, the volume name will be automatically resolved from the volume ID. Mutually exclusive with sizeGB (for new volumes)."}}},"AIRuntimeResponse":{"type":"object","description":"Represents the state and configuration of an AI Runtime instance.","required":["id","name","region","project","status","gpuConfigID","planId","createdAt"],"properties":{"id":{"type":"string","format":"uuid","readOnly":true,"description":"The unique system-generated identifier for the AI Runtime."},"name":{"type":"string","description":"The user-defined name of the AI Runtime."},"region":{"type":"string","description":"The region where the AI Runtime is deployed."},"project":{"type":"string","description":"The project this AI Runtime belongs to."},"gpuConfigID":{"type":"string","description":"The ID of the GPU configuration to use for the AI Runtime."},"planId":{"type":"string","description":"The pricing plan ID for billing the AI Runtime."},"templateID":{"type":"string","readOnly":true,"description":"Template identifier used during deployment."},"templateName":{"type":"string","readOnly":true,"nullable":true,"description":"Human-readable name of the template used during 
deployment."},"gpuCount":{"type":"integer","format":"int32","readOnly":true,"description":"Number of GPUs allocated."},"sshKeys":{"type":"array","readOnly":true,"description":"SSH public keys for accessing the AI Runtime instance.","items":{"type":"string"}},"customPorts":{"type":"array","description":"List of custom (TCP) ports exposed.","items":{"$ref":"#/components/schemas/CustomPortSpec"}},"udpPortRanges":{"type":"array","description":"List of UDP port ranges exposed.","items":{"$ref":"#/components/schemas/UdpPortRangeSpec"}},"status":{"type":"string","enum":["Pending","Running","Succeeded","Failed","Terminating"],"readOnly":true,"description":"The current operational status of the AI Runtime."},"reason":{"type":"string","readOnly":true,"nullable":true,"description":"The reason for the current status. Populated when the AI Runtime is in a Failed state, containing the exact Kubernetes container error (e.g. CrashLoopBackOff, ImagePullBackOff)."},"createdAt":{"type":"string","format":"date-time","readOnly":true,"description":"The timestamp when the AI Runtime was created."},"storage":{"$ref":"#/components/schemas/StorageResponse"},"uiAccess":{"type":"object","nullable":true,"readOnly":true,"description":"Primary UI application access information","properties":{"name":{"type":"string","description":"Name of the primary UI application"},"directUrl":{"type":"string","description":"Direct NodePort access URL"},"proxyUrl":{"type":"string","description":"Proxy access URL via hostname (no port needed)"},"password":{"type":"string","nullable":true,"description":"Password or token for accessing the UI (if required)"}}},"sshAccess":{"type":"object","nullable":true,"readOnly":true,"description":"SSH access information","properties":{"directCommand":{"type":"string","description":"Direct SSH connection command"},"proxyCommand":{"type":"string","description":"SSH proxy connection command with identity file"}}}}},"StorageResponse":{"type":"object","description":"Represents a 
storage volume.","properties":{"id":{"type":"string","format":"uuid","description":"The unique identifier for the storage volume."},"type":{"type":"string","description":"The type of the storage volume (ephemeral or persistent).","enum":["ephemeral","persistent"]},"subType":{"type":"string","description":"The name of the storage volume.","enum":["local","network"]},"name":{"type":"string","description":"The name of the storage volume."},"sizeGB":{"type":"integer","format":"int32","description":"The size of the storage volume in gigabytes."},"planId":{"type":"string","description":"The pricing plan ID for the persistent storage."}}},"ErrorResponse":{"type":"object","description":"A standard format for error responses.","properties":{"code":{"type":"string","description":"A machine-readable error code."},"message":{"type":"string","description":"A human-readable message providing details about the error."}},"required":["code","message"]}}},"paths":{"/api/v1beta1/orgs/{org_id}/projects/{project_id}/airuntimes":{"post":{"tags":["AIRuntime"],"summary":"Create a GPU Instance","description":"Creates a new GPU Instance from a platform template or your organization's custom template.\n\n**Steps before calling this endpoint:**\n\n1. Get a `templateID` from `GET /api/v1beta1/airuntime-templates` (platform templates) or `GET /api/v1beta1/orgs/{org_id}/airuntime-templates` (custom templates).\n2. Get a `gpuConfigID` from `GET /api/v1beta1/inventory` — use the `config_id` field.\n3. Get a `planId` from the inventory response's `price_per_gpu_per_hour` context or from the NeevAI console pricing page.\n\nThe instance starts asynchronously. 
Poll `GET .../airuntimes/{airuntime_id}` until `status` is `Running`, then access it via `uiAccess.proxyUrl` or `sshAccess.proxyCommand`.\n","operationId":"createAIRuntime","parameters":[{"in":"path","name":"org_id","required":true,"schema":{"type":"string"},"description":"The organization identifier."},{"in":"path","name":"project_id","required":true,"schema":{"type":"string"},"description":"The project identifier."}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/AIRuntimeRequest"}}}},"responses":{"201":{"description":"The GPU Instance was created successfully and is initializing.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AIRuntimeResponse"}}}},"400":{"description":"The request payload is invalid.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"401":{"description":"Unauthorized access.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"404":{"description":"The project or template was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"409":{"description":"A GPU Instance with the same name already exists in the project.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"500":{"description":"An unexpected internal server error occurred.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}}}}}
```
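Several constraints in the `AIRuntimeRequest` schema can be checked on the client before the request is sent: the DNS-1123 name pattern, the SSH key format and limits, and the UDP range limits (at most 5 ranges, 20 ports per range, 64 ports in total). The sketch below builds a request body under those rules. The function names are hypothetical, and the server remains the authority on validation, so treat this as a convenience check rather than a mirror of the backend logic.

```python
import re

# Patterns copied from the AIRuntimeRequest schema.
NAME_RE = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")
SSH_KEY_RE = re.compile(r"^(ssh-rsa|ssh-ed25519)\s+[A-Za-z0-9+/]+[=]{0,2}(\s+.*)?$")


def check_udp_ranges(ranges):
    """Enforce the documented limits: 5 ranges, 20 ports per range, 64 ports total."""
    if len(ranges) > 5:
        raise ValueError("at most 5 UDP ranges are allowed")
    total = 0
    for r in ranges:
        start = r["startPort"]
        end = r.get("endPort", start)  # endPort defaults to startPort
        if not 1024 <= start <= end <= 65535:
            raise ValueError("need 1024 <= startPort <= endPort <= 65535")
        size = end - start + 1  # the range is inclusive
        if size > 20:
            raise ValueError("a UDP range may span at most 20 ports")
        total += size
    if total > 64:
        raise ValueError("at most 64 UDP ports in total")


def build_create_body(name, region, template_id, gpu_config_id, gpu_count, plan_id,
                      ssh_keys=(), udp_port_ranges=()):
    """Return a dict suitable for JSON-encoding as the POST body."""
    if len(name) > 63 or not NAME_RE.fullmatch(name):
        raise ValueError("name must be a DNS-1123 label of at most 63 characters")
    if gpu_count < 1:
        raise ValueError("gpuCount must be at least 1")
    keys = list(ssh_keys)
    if len(keys) > 10 or len(set(keys)) != len(keys):
        raise ValueError("sshKeys allows at most 10 unique keys")
    for key in keys:
        if not 50 <= len(key) <= 2000 or not SSH_KEY_RE.fullmatch(key):
            raise ValueError("unsupported SSH public key format")
    ranges = list(udp_port_ranges)
    check_udp_ranges(ranges)

    body = {
        "name": name,
        "region": region,
        "templateID": template_id,
        "gpuConfigID": gpu_config_id,
        "gpuCount": gpu_count,
        "planId": plan_id,
    }
    # Optional fields are only included when provided.
    if keys:
        body["sshKeys"] = keys
    if ranges:
        body["udpPortRanges"] = ranges
    return body
```

A call such as `build_create_body("train-01", "region-x", "tpl-example", "cfg-example", 1, "gpu-h200-on-demand-1h")` uses placeholder values for the region, template, and GPU configuration; real values come from the templates and inventory endpoints listed in the steps above. The check does not cover `gpuCount` against `min_gpu_count` and `max_gpu_count`, since those bounds come from the inventory response.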

## Get GPU Instance details

> Fetches the current state of a specific GPU Instance. Use this to poll the `status` field after creation. Once `status` is `Running`, the `uiAccess` and `sshAccess` fields are populated with connection details.

```json
{"openapi":"3.0.3","info":{"title":"GPU Instance API","version":"0.1.0"},"tags":[{"name":"AIRuntime","description":"Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/gpu","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"AIRuntimeResponse":{"type":"object","description":"Represents the state and configuration of an AI Runtime instance.","required":["id","name","region","project","status","gpuConfigID","planId","createdAt"],"properties":{"id":{"type":"string","format":"uuid","readOnly":true,"description":"The unique system-generated identifier for the AI Runtime."},"name":{"type":"string","description":"The user-defined name of the AI Runtime."},"region":{"type":"string","description":"The region where the AI Runtime is deployed."},"project":{"type":"string","description":"The project this AI Runtime belongs to."},"gpuConfigID":{"type":"string","description":"The ID of the GPU configuration to use for the AI Runtime."},"planId":{"type":"string","description":"The pricing plan ID for billing the AI Runtime."},"templateID":{"type":"string","readOnly":true,"description":"Template identifier used during deployment."},"templateName":{"type":"string","readOnly":true,"nullable":true,"description":"Human-readable name of the template used during deployment."},"gpuCount":{"type":"integer","format":"int32","readOnly":true,"description":"Number of GPUs 
allocated."},"sshKeys":{"type":"array","readOnly":true,"description":"SSH public keys for accessing the AI Runtime instance.","items":{"type":"string"}},"customPorts":{"type":"array","description":"List of custom (TCP) ports exposed.","items":{"$ref":"#/components/schemas/CustomPortSpec"}},"udpPortRanges":{"type":"array","description":"List of UDP port ranges exposed.","items":{"$ref":"#/components/schemas/UdpPortRangeSpec"}},"status":{"type":"string","enum":["Pending","Running","Succeeded","Failed","Terminating"],"readOnly":true,"description":"The current operational status of the AI Runtime."},"reason":{"type":"string","readOnly":true,"nullable":true,"description":"The reason for the current status. Populated when the AI Runtime is in a Failed state, containing the exact Kubernetes container error (e.g. CrashLoopBackOff, ImagePullBackOff)."},"createdAt":{"type":"string","format":"date-time","readOnly":true,"description":"The timestamp when the AI Runtime was created."},"storage":{"$ref":"#/components/schemas/StorageResponse"},"uiAccess":{"type":"object","nullable":true,"readOnly":true,"description":"Primary UI application access information","properties":{"name":{"type":"string","description":"Name of the primary UI application"},"directUrl":{"type":"string","description":"Direct NodePort access URL"},"proxyUrl":{"type":"string","description":"Proxy access URL via hostname (no port needed)"},"password":{"type":"string","nullable":true,"description":"Password or token for accessing the UI (if required)"}}},"sshAccess":{"type":"object","nullable":true,"readOnly":true,"description":"SSH access information","properties":{"directCommand":{"type":"string","description":"Direct SSH connection command"},"proxyCommand":{"type":"string","description":"SSH proxy connection command with identity file"}}}}},"CustomPortSpec":{"type":"object","description":"Defines a custom port to be exposed by the AI 
Runtime.","required":["name","containerPort"],"properties":{"name":{"type":"string","description":"Unique name for the port. Must be a valid DNS label.","pattern":"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$","maxLength":15},"containerPort":{"type":"integer","format":"int32","description":"Port number exposed by the container.","minimum":1,"maximum":65535},"protocol":{"type":"string","enum":["TCP"],"default":"TCP","description":"Network protocol. Only TCP is supported."},"serviceType":{"type":"string","enum":["External"],"default":"External","description":"Service type. 'External' exposes the port via a LoadBalancer."}}},"UdpPortRangeSpec":{"type":"object","description":"Defines an inclusive range of UDP ports to expose externally. A single port is represented by setting `endPort` equal to `startPort` (or omitting `endPort`). Intended for real-time, low-latency workloads such as voice (RTP) traffic.\n","required":["startPort"],"properties":{"name":{"type":"string","description":"Optional unique name for the range. Must be a valid DNS label. Auto-generated from the port range when omitted.\n","pattern":"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$","maxLength":15},"startPort":{"type":"integer","format":"int32","description":"First port in the inclusive range. Must be between 1024 and 65535.","minimum":1024,"maximum":65535},"endPort":{"type":"integer","format":"int32","description":"Last port in the inclusive range. Must be >= `startPort`, and the range may span at most 20 ports. 
Defaults to `startPort` (a single port) when omitted.\n","minimum":1024,"maximum":65535}}},"StorageResponse":{"type":"object","description":"Represents a storage volume.","properties":{"id":{"type":"string","format":"uuid","description":"The unique identifier for the storage volume."},"type":{"type":"string","description":"The type of the storage volume (ephemeral or persistent).","enum":["ephemeral","persistent"]},"subType":{"type":"string","description":"The name of the storage volume.","enum":["local","network"]},"name":{"type":"string","description":"The name of the storage volume."},"sizeGB":{"type":"integer","format":"int32","description":"The size of the storage volume in gigabytes."},"planId":{"type":"string","description":"The pricing plan ID for the persistent storage."}}},"ErrorResponse":{"type":"object","description":"A standard format for error responses.","properties":{"code":{"type":"string","description":"A machine-readable error code."},"message":{"type":"string","description":"A human-readable message providing details about the error."}},"required":["code","message"]}}},"paths":{"/api/v1beta1/orgs/{org_id}/projects/{project_id}/airuntimes/{airuntime_id}":{"get":{"tags":["AIRuntime"],"summary":"Get GPU Instance details","description":"Fetches the current state of a specific GPU Instance. Use this to poll the `status` field after creation. 
Once `status` is `Running`, the `uiAccess` and `sshAccess` fields are populated with connection details.\n","operationId":"getAIRuntime","parameters":[{"in":"path","name":"org_id","required":true,"schema":{"type":"string"},"description":"The organization identifier."},{"in":"path","name":"project_id","required":true,"schema":{"type":"string"},"description":"The project identifier."},{"in":"path","name":"airuntime_id","required":true,"schema":{"type":"string","format":"uuid"},"description":"The unique identifier of the GPU Instance."}],"responses":{"200":{"description":"Current state and connection details of the GPU Instance.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AIRuntimeResponse"}}}},"400":{"description":"The request is invalid.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"401":{"description":"Unauthorized access.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"404":{"description":"An AI Runtime with the specified ID was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"500":{"description":"An unexpected internal server error occurred.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}}}}}
```
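Because creation is asynchronous, a client typically polls this endpoint until `status` becomes `Running`. The sketch below shows one way to structure that loop. `wait_until_running` and its `get_instance` callback are hypothetical names: the callback is expected to perform the `GET` and return the decoded `AIRuntimeResponse` as a dict. The sketch assumes that any status other than `Pending` or `Running` (that is, `Succeeded`, `Failed`, or `Terminating`) means the instance will not become reachable, which the spec does not state explicitly.

```python
import time


def wait_until_running(get_instance, timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Poll `get_instance()` until the instance reports status `Running`.

    Returns the final response dict, raises RuntimeError for any status
    other than Pending/Running, and TimeoutError when the deadline passes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        instance = get_instance()
        status = instance.get("status")
        if status == "Running":
            return instance
        if status != "Pending":
            # `reason` is populated for Failed instances (e.g. ImagePullBackOff).
            raise RuntimeError(
                "instance is {}: {}".format(status, instance.get("reason"))
            )
        if time.monotonic() >= deadline:
            raise TimeoutError("instance still Pending after {}s".format(timeout_s))
        sleep(interval_s)
```

Once the function returns, `uiAccess.proxyUrl` and `sshAccess.proxyCommand` can be read from the result; both objects are nullable in the schema, so checking for `None` before use is prudent. The default timeout and interval are arbitrary illustration values, not documented recommendations.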

## Delete an AI Runtime

> Permanently deletes a specific AI Runtime by its unique ID.

```json
{"openapi":"3.0.3","info":{"title":"GPU Instance API","version":"0.1.0"},"tags":[{"name":"AIRuntime","description":"Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/gpu","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"ErrorResponse":{"type":"object","description":"A standard format for error responses.","properties":{"code":{"type":"string","description":"A machine-readable error code."},"message":{"type":"string","description":"A human-readable message providing details about the error."}},"required":["code","message"]}}},"paths":{"/api/v1beta1/orgs/{org_id}/projects/{project_id}/airuntimes/{airuntime_id}":{"delete":{"tags":["AIRuntime"],"summary":"Delete an AI Runtime","description":"Permanently deletes a specific AI Runtime by its unique ID.","operationId":"deleteAIRuntime","parameters":[{"in":"path","name":"org_id","required":true,"schema":{"type":"string"},"description":"The organization identifier."},{"in":"path","name":"project_id","required":true,"schema":{"type":"string"},"description":"The project identifier."},{"in":"path","name":"airuntime_id","required":true,"schema":{"type":"string","format":"uuid"},"description":"The unique identifier of the AI Runtime to delete."}],"responses":{"204":{"description":"The AI Runtime was deleted successfully."},"400":{"description":"The request is invalid.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"401":{"description":"Unauthorized access.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"404":{"description":"An AI Runtime with the specified ID was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"500":{"description":"An unexpected internal server error occurred.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}}}}}
```
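As a rough illustration of the delete operation described in the spec above, the following Python sketch builds the `DELETE` request with the standard library and maps error responses onto the `ErrorResponse` shape (`code`, `message`). The function names (`build_delete_request`, `parse_error`, `delete_airuntime`) are hypothetical helpers, not part of any official SDK, and the `Authorization: Bearer <token>` header is assumed from the spec's HTTP bearer security scheme.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote

# Base URL taken from the "servers" entry of the spec above.
BASE_URL = "https://api.ai.neevcloud.com/gpu"


def build_delete_request(org_id: str, project_id: str, airuntime_id: str,
                         access_token: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for one AI Runtime."""
    path = (
        f"/api/v1beta1/orgs/{quote(org_id, safe='')}"
        f"/projects/{quote(project_id, safe='')}"
        f"/airuntimes/{quote(airuntime_id, safe='')}"
    )
    return urllib.request.Request(
        BASE_URL + path,
        method="DELETE",
        # Assumption: the bearer scheme is sent as a standard Authorization header.
        headers={"Authorization": f"Bearer {access_token}"},
    )


def parse_error(body: bytes) -> tuple[str, str]:
    """Extract the required `code` and `message` fields of an ErrorResponse."""
    payload = json.loads(body)
    return payload["code"], payload["message"]


def delete_airuntime(org_id: str, project_id: str, airuntime_id: str,
                     access_token: str) -> None:
    """Send the request; the spec documents 204 as the only success status."""
    req = build_delete_request(org_id, project_id, airuntime_id, access_token)
    try:
        with urllib.request.urlopen(req) as resp:
            if resp.status != 204:
                raise RuntimeError(f"unexpected status {resp.status}")
    except urllib.error.HTTPError as err:
        # 400, 401, 404, and 500 are documented to carry an ErrorResponse body.
        code, message = parse_error(err.read())
        raise RuntimeError(f"delete failed ({err.code}): {code}: {message}") from err
```

Because the spec describes the deletion as permanent, a caller would probably want to confirm the `airuntime_id` before invoking `delete_airuntime`.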

## Get AI Runtime utilization metrics

> Fetches real-time resource utilization metrics (CPU, RAM, GPU) for a specific AI Runtime.

```json
{"openapi":"3.0.3","info":{"title":"GPU Instance API","version":"0.1.0"},"tags":[{"name":"AIRuntime","description":"Create, list, inspect, and delete **GPU Instances** (referred to internally as AI Runtimes). A GPU Instance is a running container on dedicated GPU hardware within your project. Access it via the browser (Jupyter, custom UIs) or SSH once `status` is `Running`.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/gpu","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"AIRuntimeMetricsResponse":{"type":"object","description":"Resource utilization metrics for an AI Runtime.","required":["cpuUtilizationPercent","ramUtilizationPercent","gpuUtilizationPercent"],"properties":{"cpuUtilizationPercent":{"type":"number","format":"float","description":"CPU utilization percentage (0-100)"},"ramUtilizationPercent":{"type":"number","format":"float","description":"RAM utilization percentage (0-100)"},"gpuUtilizationPercent":{"type":"number","format":"float","description":"GPU utilization percentage (0-100)"}}},"ErrorResponse":{"type":"object","description":"A standard format for error responses.","properties":{"code":{"type":"string","description":"A machine-readable error code."},"message":{"type":"string","description":"A human-readable message providing details about the error."}},"required":["code","message"]}}},"paths":{"/api/v1beta1/orgs/{org_id}/projects/{project_id}/airuntimes/{airuntime_id}/metrics":{"get":{"tags":["AIRuntime"],"summary":"Get AI Runtime utilization metrics","description":"Fetches real-time resource utilization metrics (CPU, RAM, GPU) for a specific AI Runtime.","operationId":"getAIRuntimeMetrics","parameters":[{"in":"path","name":"org_id","required":true,"schema":{"type":"string"},"description":"The organization identifier."},{"in":"path","name":"project_id","required":true,"schema":{"type":"string"},"description":"The project identifier."},{"in":"path","name":"airuntime_id","required":true,"schema":{"type":"string","format":"uuid"},"description":"The unique identifier of the AI Runtime."}],"responses":{"200":{"description":"Resource utilization metrics for the AI Runtime.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/AIRuntimeMetricsResponse"}}}},"400":{"description":"The request is invalid.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"401":{"description":"Unauthorized access.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"404":{"description":"An AI Runtime with the specified ID was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"500":{"description":"An unexpected internal server error occurred.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}}}}}
```
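The metrics operation above could be called along the following lines. This is a minimal Python sketch using only the standard library; the `AIRuntimeMetrics` dataclass and the `build_metrics_request`, `parse_metrics`, and `fetch_metrics` helpers are hypothetical names introduced for illustration, and the `Authorization: Bearer <token>` header is assumed from the spec's HTTP bearer security scheme.

```python
import json
import urllib.request
from dataclasses import dataclass
from urllib.parse import quote

# Base URL taken from the "servers" entry of the spec above.
BASE_URL = "https://api.ai.neevcloud.com/gpu"


@dataclass
class AIRuntimeMetrics:
    """Mirrors the three required fields of AIRuntimeMetricsResponse (0-100)."""
    cpu_utilization_percent: float
    ram_utilization_percent: float
    gpu_utilization_percent: float


def build_metrics_request(org_id: str, project_id: str, airuntime_id: str,
                          access_token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the metrics endpoint."""
    path = (
        f"/api/v1beta1/orgs/{quote(org_id, safe='')}"
        f"/projects/{quote(project_id, safe='')}"
        f"/airuntimes/{quote(airuntime_id, safe='')}/metrics"
    )
    return urllib.request.Request(
        BASE_URL + path,
        method="GET",
        # Assumption: the bearer scheme is sent as a standard Authorization header.
        headers={"Authorization": f"Bearer {access_token}"},
    )


def parse_metrics(body: bytes) -> AIRuntimeMetrics:
    """Decode a 200 response body; raises KeyError if a required field is absent."""
    payload = json.loads(body)
    return AIRuntimeMetrics(
        cpu_utilization_percent=float(payload["cpuUtilizationPercent"]),
        ram_utilization_percent=float(payload["ramUtilizationPercent"]),
        gpu_utilization_percent=float(payload["gpuUtilizationPercent"]),
    )


def fetch_metrics(org_id: str, project_id: str, airuntime_id: str,
                  access_token: str) -> AIRuntimeMetrics:
    """Send the request and parse the result; HTTP errors propagate as HTTPError."""
    req = build_metrics_request(org_id, project_id, airuntime_id, access_token)
    with urllib.request.urlopen(req) as resp:
        return parse_metrics(resp.read())
```

Keeping request construction and response parsing separate from the network call makes both pieces easy to check without reaching the live API.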

