# Overview

## AI Inference

AI Inference lets you run trained models to generate predictions and responses without managing inference servers, scaling, or availability yourself.

AI Inference is offered through two main experiences:

* Model API
* Model Playground

***

## Model API

### Overview

Model API provides production-ready inference endpoints for AI models. You can integrate these APIs directly into your applications.

***

### Why use Model API

Building and managing inference infrastructure is complex. You need to handle scaling, failures, and performance.

Model API handles these challenges by:

* Automatically scaling based on traffic
* Providing stable and secure endpoints
* Reducing operational overhead
* Ensuring consistent performance

***

### What Model API provides

* Hosted inference endpoints
* Support for popular and validated models
* Configuration for performance and scaling
* Optional streaming responses
* Usage-based billing

You do not need to manage servers or containers.
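Streaming lets a client display partial output as it arrives instead of waiting for the full response. This page does not specify the wire format, so the sketch below rests on assumptions made purely for illustration: Server-Sent-Events-style lines (`data: <json>`), a hypothetical `text` field in each chunk, and a hypothetical `[DONE]` sentinel. Check the details of your deployed endpoint before relying on any of these.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of streamed lines.

    Assumed format (hypothetical): each event is a line 'data: <json>' whose
    JSON object has a 'text' field; 'data: [DONE]' marks the end of the stream.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data).get("text", "")


# Simulated stream; a real client would iterate over the HTTP response body instead.
sample_stream = [
    'data: {"text": "Hello"}',
    "",
    'data: {"text": ", world"}',
    "data: [DONE]",
]
full_text = "".join(iter_stream_text(sample_stream))
```

Writing the parser as a generator means each fragment can be rendered as soon as it is parsed, which is the main reason to enable streaming in the first place.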

***

### Who should use Model API

* Developers building AI-powered applications
* Teams adding AI features to existing products
* Startups needing fast time to market
* Enterprises running production inference workloads

***

### How Model API works

1. Select a model from the model catalog
2. Deploy the model as an API endpoint
3. Configure scaling and compute settings
4. Send inference requests over HTTP or through an SDK
5. Receive responses in real time

The platform monitors usage and performance automatically.
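Step 4 can be sketched with Python's standard library. This is a minimal illustration, not the platform's documented API: the endpoint URL, the bearer-token header, and the `prompt`, `max_tokens`, and `stream` fields are all hypothetical placeholders. The function only builds the request; the sending code is left commented out because it needs a real endpoint and key.

```python
import json
import urllib.request

# Hypothetical values: replace with the URL and key shown for your deployed endpoint.
ENDPOINT_URL = "https://example.invalid/v1/my-endpoint/infer"
API_KEY = "YOUR_API_KEY"


def build_inference_request(prompt, max_tokens=128, stream=False):
    """Build (but do not send) an HTTP POST request for a deployed model endpoint.

    The JSON field names are assumptions made for illustration only.
    """
    payload = {"prompt": prompt, "max_tokens": max_tokens, "stream": stream}
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_inference_request("Summarize this ticket in one sentence.")

# To actually send the request (requires a real endpoint and key):
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect and unit-test without network access.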

***

## Model Playground

### Overview

Model Playground is an interactive interface for testing and experimenting with AI models. It is designed for quick validation before production deployment.

***

### Why use Model Playground

Testing models only through code can slow down experimentation. Model Playground allows faster iteration and easier collaboration.

It helps you:

* Validate model behavior
* Test prompts and parameters
* Compare outputs across models
* Reduce trial-and-error during development

***

### What Model Playground provides

* Web-based UI for model testing
* Prompt input and output visualization
* Adjustable inference parameters
* Support for text-based inputs
* Easy transition to Model API deployment

***

### Who should use Model Playground

* ML engineers evaluating models
* Developers testing prompts and responses
* Product managers validating AI outputs
* Teams collaborating on prompt design

***

### How Model Playground works

1. Select a model in the Playground
2. Enter a prompt or input text
3. Adjust parameters such as the maximum number of output tokens or the temperature
4. Run inference and view results
5. Deploy the same model using Model API when ready

No infrastructure setup is required.
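Of the parameters in step 3, temperature is often the least intuitive. In the common formulation, a model's raw scores (logits) are divided by the temperature before the softmax, so low values concentrate probability on the top candidate and high values flatten the distribution. The sketch below uses made-up logits to show the effect; it is a general illustration and is not specific to any model in the catalog.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
sharp = softmax_with_temperature(logits, 0.5)    # lower temperature: more peaked
neutral = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)     # higher temperature: flatter
```

Comparing `sharp`, `neutral`, and `flat` shows the top candidate's probability falling as the temperature rises, which is why higher settings tend to produce more varied output.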

