# Catalog

Browse the NeevAI model catalog to discover available models, their recommended GPU configurations, and per-hour pricing. Use the `id` field from a catalog item as the `model_id` when creating a deployment.
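
Because a catalog item carries two similar-looking identifiers, a small sketch may help keep them apart. The item below is a hand-written sample shaped like `CatalogItem` (the `gpu_config_id` and `gpu_count` values are invented for illustration), and `deployment_fields` and `inference_model` are hypothetical helper names. The deploy request may well require fields beyond the two shown here.

```python
# Hand-written sample shaped like a CatalogItem; values are illustrative only.
sample_item = {
    "id": "meta-llama-Llama-3.1-405B",
    "model_id": "meta-llama/Llama-3.1-405B",
    "recommended_gpu": {"gpu_config_id": "example-gpu-config", "gpu_count": 8},
}


def deployment_fields(item: dict) -> dict:
    """Pick the catalog values that seed a deployment request (hypothetical helper)."""
    return {
        # The deploy request's model_id is the catalog item's URL-safe id.
        "model_id": item["id"],
        # Starting point only; a different GPU config may be chosen instead.
        "gpu_config_id": item["recommended_gpu"]["gpu_config_id"],
    }


def inference_model(item: dict) -> str:
    """Return the value to pass as `model` in inference requests."""
    # The catalog item's own model_id is the slash-separated HuggingFace path.
    return item["model_id"]


print(deployment_fields(sample_item))
print(inference_model(sample_item))
```

Reading `id` from the catalog response, rather than deriving it from the slash path, avoids depending on the exact replacement rule.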

## Browse model catalog

> Lists all AI models available for deployment on NeevAI, along with recommended GPU hardware and estimated per-hour pricing.
>
> Use the URL-safe `id` field from a catalog item as `model_id` when creating a deployment, and as the `{id}` path parameter on `GET /api/v1beta1/aimodels-catalog/{id}`. The catalog item's own `model_id` field holds the slash-separated HuggingFace path used for inference and the Model Routing API. Use `recommended_gpu.gpu_config_id` as a starting point for `gpu_config_id` in the deployment request (you can also pick a different GPU from the inventory API).

```json
{"openapi":"3.0.3","info":{"title":"Dedicated Model Deployment API","version":"0.1.0"},"tags":[{"name":"Catalog","description":"Browse the NeevAI model catalog to discover available models, their recommended GPU configurations, and per-hour pricing. Use the `id` field from a catalog item as the `model_id` when creating a deployment.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/aimodels","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"ModelCatalogResponse":{"type":"object","properties":{"items":{"type":"array","items":{"$ref":"#/components/schemas/CatalogItem"}}}},"CatalogItem":{"type":"object","properties":{"id":{"type":"string","description":"Url-safe catalog id (slashes replaced with hyphens, e.g. \"meta-llama-Llama-3.1-405B\"). Use this as the `{model_id}` path parameter on the catalog detail endpoint and as `model_id` when creating a deployment.\n"},"model_id":{"type":"string","description":"The model's slash HuggingFace path (e.g. \"meta-llama/Llama-3.1-405B\"). This is the identity used for inference / the Model Routing API; pass it as the `model` field in inference requests. 
It is NOT the deploy `model_id` (use `id` for that).\n"},"name":{"type":"string","description":"Display name (e.g., \"Llama 3.1 405B\")"},"description":{"type":"string"},"model_url":{"type":"string","description":"URL to the model page (e.g., HuggingFace model page)"},"task_type":{"type":"string","description":"The model's primary task (e.g., text-generation, text-to-image)"},"parameters_size":{"type":"string","description":"Human-readable parameter count (e.g., \"405B\", \"20B\", \"335M\")"},"license":{"type":"string","description":"License type (e.g., \"Apache 2.0\", \"MIT\")"},"owner_org_id":{"type":"string","description":"Organization that owns/manages this catalog entry (default \"neevai\")"},"icon":{"type":"string","format":"byte","description":"Base64-encoded icon image"},"recommended_gpu":{"$ref":"#/components/schemas/RecommendedGPU"},"unit_price":{"type":"number","description":"Price per hour (synced from billing service)"},"currency":{"type":"string","description":"Currency code (e.g., \"USD\")"},"framework":{"type":"string","description":"Serving framework (e.g., \"vllm\", \"tgi\")"},"configurations":{"type":"array","description":"Tunable serving parameters for this model, each with its default, minimum, and maximum allowed value. 
Submit chosen values via the `config` object in the deploy request (`POST .../aimodels`); values outside the advertised range are rejected.\n","items":{"$ref":"#/components/schemas/ConfigParameter"}}}},"RecommendedGPU":{"type":"object","properties":{"gpu_config_id":{"type":"string","description":"GPU config ID — used to look up GPU details and pricing from billing service"},"gpu_count":{"type":"integer","description":"Recommended number of GPUs"},"vram_gb":{"type":"number","description":"VRAM per GPU in GB"},"cpu_cores":{"type":"integer","description":"Total vCPU cores allocated for the recommended GPU configuration (per-GPU default × gpu_count)"},"memory_gib":{"type":"integer","format":"int64","description":"Total system RAM in GiB allocated for the recommended GPU configuration (per-GPU default × gpu_count)"}}},"ConfigParameter":{"type":"object","description":"A single tunable serving parameter and its allowed values. The `type` field tells clients how to interpret the constraint fields: for `integer` parameters use `default`/`min`/`max`; for `enum` parameters use `allowed_values`. Future types (e.g. `float`, `boolean`) may add their own constraint fields — clients should branch on `type` and ignore fields they don't recognize.\n","properties":{"name":{"type":"string","description":"The config key, as accepted in the deploy request `config` object (e.g., \"max_model_len\")."},"type":{"type":"string","description":"Value type of this parameter, governing which constraint fields apply. Currently always `integer`; `enum`, `float`, and `boolean` are reserved for future parameters.\n"},"default":{"type":"integer","description":"Value applied when the user does not specify this parameter (for `integer` parameters)."},"min":{"type":"integer","description":"Minimum accepted value, inclusive (for `integer` parameters). Omitted when there is no lower bound."},"max":{"type":"integer","description":"Maximum accepted value, inclusive (for `integer` parameters). 
Omitted when there is no upper bound."},"allowed_values":{"type":"array","items":{"type":"string"},"description":"The permitted values, for `enum` parameters. Omitted for non-enum types."}}},"ErrorResponse":{"type":"object","required":["code","message"],"properties":{"code":{"type":"string","description":"A machine-readable error code. Common values: `invalid_request`, `unauthorized`, `forbidden`, `not_found`, `internal_error`.\n"},"message":{"type":"string","description":"A human-readable description of what went wrong."}}}},"responses":{"Unauthorized":{"description":"The request is missing a valid Bearer token.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"Forbidden":{"description":"The authenticated user does not have permission to perform this action.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"InternalServerError":{"description":"An unexpected error occurred on the server. Please retry or contact support.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}},"paths":{"/api/v1beta1/aimodels-catalog":{"get":{"tags":["Catalog"],"summary":"Browse model catalog","description":"Lists all AI models available for deployment on NeevAI, along with recommended GPU hardware and estimated per-hour pricing.\n\nUse the url-safe `id` field from a catalog item as `model_id` when creating a deployment, and as the `{model_id}` path parameter on `GET /api/v1beta1/aimodels-catalog/{model_id}`. The `model_id` field holds the slash HuggingFace path used for inference / the Model Routing API. Use the `recommended_gpu.gpu_config_id` as a starting point for `gpu_config_id` in the deployment request (you can also pick a different GPU from the inventory API).\n","operationId":"listModelCatalog","parameters":[{"name":"task_type","in":"query","schema":{"type":"string"},"description":"Filter by the model's primary task type. 
Common values: `text-generation`, `text-to-image`, `text-to-speech`, `embeddings`.\n"},{"name":"search","in":"query","schema":{"type":"string"},"description":"Case-insensitive search filter applied to the model name and description."}],"responses":{"200":{"description":"List of available models with pricing and GPU recommendations.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ModelCatalogResponse"}}}},"401":{"$ref":"#/components/responses/Unauthorized"},"403":{"$ref":"#/components/responses/Forbidden"},"500":{"$ref":"#/components/responses/InternalServerError"}}}}}}
```
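
As a rough illustration of the list endpoint, the following Python sketch builds the request without sending it. It assumes the full URL is the `servers` URL joined with the path, and that the access token travels in a standard `Authorization: Bearer <token>` header, as the HTTP bearer scheme implies. `build_catalog_request` is a hypothetical helper name and `ACCESS_TOKEN` is a placeholder.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Server URL and path as given in the OpenAPI document above.
BASE_URL = "https://api.ai.neevcloud.com/aimodels"
CATALOG_PATH = "/api/v1beta1/aimodels-catalog"


def build_catalog_request(
    access_token: str,
    task_type: Optional[str] = None,
    search: Optional[str] = None,
) -> Request:
    """Build (but do not send) a GET request for the catalog list."""
    params = {}
    if task_type:
        params["task_type"] = task_type
    if search:
        params["search"] = search
    url = BASE_URL + CATALOG_PATH
    if params:
        # Both filters are optional query parameters.
        url += "?" + urlencode(params)
    # HTTP bearer scheme: the header carries "Bearer <access_token>".
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(url, headers=headers, method="GET")


req = build_catalog_request("ACCESS_TOKEN", task_type="text-generation", search="llama")
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` and decoding the JSON body should yield an object whose `items` array holds the catalog entries, per `ModelCatalogResponse`.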

## Get a single catalog model

> Returns the full details for one catalog model (configurations, i.e. tunable serving parameters, plus recommended GPU, license, framework, and estimated per-hour pricing), enriched the same way as the list endpoint but for a single model.
>
> The `{id}` path parameter is the catalog item's URL-safe `id` (slashes replaced with hyphens, e.g. `openai-gpt-oss-120b`), as returned in the `id` field of `GET /api/v1beta1/aimodels-catalog`. It is NOT the slash-separated HuggingFace path (that is exposed as the `model_id` field on each catalog item).

```json
{"openapi":"3.0.3","info":{"title":"Dedicated Model Deployment API","version":"0.1.0"},"tags":[{"name":"Catalog","description":"Browse the NeevAI model catalog to discover available models, their recommended GPU configurations, and per-hour pricing. Use the `id` field from a catalog item as the `model_id` when creating a deployment.\n"}],"servers":[{"url":"https://api.ai.neevcloud.com/aimodels","description":"Consolidated public API gateway"}],"security":[{"BearerAuth":[]}],"components":{"securitySchemes":{"BearerAuth":{"type":"http","scheme":"bearer","description":"Obtain an **`access_token`** from `POST /api/v1/auth/login` on the tenant API (same credentials as the console). In Authorize, paste **only that token** — do not prepend `Bearer`, and do not use inference keys (`sk-nc-*`).\n"}},"schemas":{"CatalogItem":{"type":"object","properties":{"id":{"type":"string","description":"Url-safe catalog id (slashes replaced with hyphens, e.g. \"meta-llama-Llama-3.1-405B\"). Use this as the `{model_id}` path parameter on the catalog detail endpoint and as `model_id` when creating a deployment.\n"},"model_id":{"type":"string","description":"The model's slash HuggingFace path (e.g. \"meta-llama/Llama-3.1-405B\"). This is the identity used for inference / the Model Routing API; pass it as the `model` field in inference requests. 
It is NOT the deploy `model_id` (use `id` for that).\n"},"name":{"type":"string","description":"Display name (e.g., \"Llama 3.1 405B\")"},"description":{"type":"string"},"model_url":{"type":"string","description":"URL to the model page (e.g., HuggingFace model page)"},"task_type":{"type":"string","description":"The model's primary task (e.g., text-generation, text-to-image)"},"parameters_size":{"type":"string","description":"Human-readable parameter count (e.g., \"405B\", \"20B\", \"335M\")"},"license":{"type":"string","description":"License type (e.g., \"Apache 2.0\", \"MIT\")"},"owner_org_id":{"type":"string","description":"Organization that owns/manages this catalog entry (default \"neevai\")"},"icon":{"type":"string","format":"byte","description":"Base64-encoded icon image"},"recommended_gpu":{"$ref":"#/components/schemas/RecommendedGPU"},"unit_price":{"type":"number","description":"Price per hour (synced from billing service)"},"currency":{"type":"string","description":"Currency code (e.g., \"USD\")"},"framework":{"type":"string","description":"Serving framework (e.g., \"vllm\", \"tgi\")"},"configurations":{"type":"array","description":"Tunable serving parameters for this model, each with its default, minimum, and maximum allowed value. 
Submit chosen values via the `config` object in the deploy request (`POST .../aimodels`); values outside the advertised range are rejected.\n","items":{"$ref":"#/components/schemas/ConfigParameter"}}}},"RecommendedGPU":{"type":"object","properties":{"gpu_config_id":{"type":"string","description":"GPU config ID — used to look up GPU details and pricing from billing service"},"gpu_count":{"type":"integer","description":"Recommended number of GPUs"},"vram_gb":{"type":"number","description":"VRAM per GPU in GB"},"cpu_cores":{"type":"integer","description":"Total vCPU cores allocated for the recommended GPU configuration (per-GPU default × gpu_count)"},"memory_gib":{"type":"integer","format":"int64","description":"Total system RAM in GiB allocated for the recommended GPU configuration (per-GPU default × gpu_count)"}}},"ConfigParameter":{"type":"object","description":"A single tunable serving parameter and its allowed values. The `type` field tells clients how to interpret the constraint fields: for `integer` parameters use `default`/`min`/`max`; for `enum` parameters use `allowed_values`. Future types (e.g. `float`, `boolean`) may add their own constraint fields — clients should branch on `type` and ignore fields they don't recognize.\n","properties":{"name":{"type":"string","description":"The config key, as accepted in the deploy request `config` object (e.g., \"max_model_len\")."},"type":{"type":"string","description":"Value type of this parameter, governing which constraint fields apply. Currently always `integer`; `enum`, `float`, and `boolean` are reserved for future parameters.\n"},"default":{"type":"integer","description":"Value applied when the user does not specify this parameter (for `integer` parameters)."},"min":{"type":"integer","description":"Minimum accepted value, inclusive (for `integer` parameters). Omitted when there is no lower bound."},"max":{"type":"integer","description":"Maximum accepted value, inclusive (for `integer` parameters). 
Omitted when there is no upper bound."},"allowed_values":{"type":"array","items":{"type":"string"},"description":"The permitted values, for `enum` parameters. Omitted for non-enum types."}}},"ErrorResponse":{"type":"object","required":["code","message"],"properties":{"code":{"type":"string","description":"A machine-readable error code. Common values: `invalid_request`, `unauthorized`, `forbidden`, `not_found`, `internal_error`.\n"},"message":{"type":"string","description":"A human-readable description of what went wrong."}}}},"responses":{"Unauthorized":{"description":"The request is missing a valid Bearer token.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"Forbidden":{"description":"The authenticated user does not have permission to perform this action.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"NotFound":{"description":"The requested resource was not found.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}},"InternalServerError":{"description":"An unexpected error occurred on the server. Please retry or contact support.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/ErrorResponse"}}}}}},"paths":{"/api/v1beta1/aimodels-catalog/{id}":{"get":{"tags":["Catalog"],"summary":"Get a single catalog model","description":"Returns the full details for one catalog model — configurations (tunable serving parameters), recommended GPU, license, framework, and estimated per-hour pricing — enriched the same way as the list endpoint but for a single model.\n\nThe `{id}` path parameter is the catalog item's url-safe `id` (slashes replaced with hyphens, e.g. `openai-gpt-oss-120b`), as returned in the `id` field of `GET /api/v1beta1/aimodels-catalog`. 
It is NOT the slash HuggingFace path (that is exposed as the `model_id` field on each catalog item).\n","operationId":"getModelCatalogItem","parameters":[{"name":"id","in":"path","required":true,"schema":{"type":"string"},"description":"The catalog item's url-safe `id` (e.g. `openai-gpt-oss-120b`), as returned in the `id` field of `GET /api/v1beta1/aimodels-catalog`.\n"}],"responses":{"200":{"description":"The requested catalog model with pricing and GPU recommendation.","content":{"application/json":{"schema":{"$ref":"#/components/schemas/CatalogItem"}}}},"401":{"$ref":"#/components/responses/Unauthorized"},"403":{"$ref":"#/components/responses/Forbidden"},"404":{"$ref":"#/components/responses/NotFound"},"500":{"$ref":"#/components/responses/InternalServerError"}}}}}}
```
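
The sketch below shows one way a client might use this endpoint: build the detail URL from a URL-safe `id`, then pre-check a `config` object against the item's `configurations` before deploying. It follows the schema's description of `integer` bounds as inclusive and optional, and handles `enum` even though the spec marks it as reserved for future parameters. `catalog_item_url` and `validate_config` are hypothetical names, the sample bounds are invented, and flagging unknown keys is this sketch's own choice rather than documented server behavior.

```python
from urllib.parse import quote

BASE_URL = "https://api.ai.neevcloud.com/aimodels"


def catalog_item_url(catalog_id: str) -> str:
    """Return the detail URL for a URL-safe catalog id."""
    if "/" in catalog_id:
        # A slash suggests the HuggingFace path (`model_id`) was passed by mistake.
        raise ValueError("expected the url-safe `id`, not the slash path")
    # Percent-encode anything unusual so the id stays a single path segment.
    return f"{BASE_URL}/api/v1beta1/aimodels-catalog/{quote(catalog_id, safe='')}"


def validate_config(config: dict, configurations: list) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    known = {p["name"]: p for p in configurations}
    for key, value in config.items():
        param = known.get(key)
        if param is None:
            problems.append(f"{key}: not a tunable parameter for this model")
            continue
        kind = param.get("type")
        if kind == "integer":
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{key}: expected an integer")
            elif "min" in param and value < param["min"]:
                problems.append(f"{key}: below minimum {param['min']}")
            elif "max" in param and value > param["max"]:
                problems.append(f"{key}: above maximum {param['max']}")
        elif kind == "enum":
            if str(value) not in param.get("allowed_values", []):
                problems.append(f"{key}: not one of the allowed values")
        # Unrecognized types are skipped and left for the server to validate.
    return problems


# Invented sample shaped like the `configurations` array of a CatalogItem.
sample_params = [
    {"name": "max_model_len", "type": "integer", "default": 8192, "min": 1024, "max": 131072}
]
print(catalog_item_url("openai-gpt-oss-120b"))
print(validate_config({"max_model_len": 512}, sample_params))
```

A client-side check like this only catches obvious mistakes early; the server remains the authority on which values are accepted.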

