Chat Completions
Given a list of messages comprising a conversation, the model will return a generated response. This endpoint utilizes our proprietary TensorRT-LLM optimized infrastructure for sub-millisecond TTFT (Time to First Token).
POST https://a3gate.in/v1/chat/completions
Request Parameters
| Parameter |
Description |
|
model
string
Required
|
ID of the model to use. See the model endpoint compatibility table for details on which models work with the Chat API.
Options: llama-3-70b, mistral-8x7b, custom-ft-id
|
|
messages
array
Required
|
A list of messages comprising the conversation so far.
Each object requires a role (system, user, assistant) and content.
|
|
temperature
number
|
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Defaults to 1.
|
Response Format
The response is a JSON object containing the model's output along with exact token usage metrics for billing purposes.
Embeddings
Get a vector representation of a given input that can be easily consumed by machine learning models and algorithms.
POST https://a3gate.in/v1/embeddings
Request Parameters
| Parameter |
Description |
|
model
string
Required
|
ID of the model to use. You can use the List models API to see all of your available models.
|
|
input
string or array
Required
|
Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or array of token arrays.
|
Introduction
Welcome to the A3Gate DeepTech API. Our REST API provides programmatic access to our enterprise-grade GPU clusters, allowing you to integrate state-of-the-art language models, embedding pipelines, and fine-tuning workloads directly into your applications.
We provide official SDKs for Python and Node.js, or you can interact directly via REST using cURL or your preferred HTTP client.
Base URL
All API requests must be made over HTTPS. Calls made over plain HTTP will fail. API requests without authentication will also fail.
https://a3gate.in/v1
Authentication
The A3Gate API uses API keys for authentication. Visit your API Keys page in the Dashboard to retrieve the API key you'll use in your requests.
Remember that your API key is a secret! Do not share it with others or expose it in any client-side code (browsers, apps). Production requests must be routed through your own backend server.
Authorization Header
All API requests should include your API key in an Authorization HTTP header as follows:
Authorization: Bearer A3GATE_API_KEY
Error Codes
A3Gate uses conventional HTTP response codes to indicate the success or failure of an API request. In general:
- Codes in the
2xx range indicate success.
- Codes in the
4xx range indicate an error that failed given the information provided (e.g., a required parameter was omitted).
- Codes in the
5xx range indicate an error with A3Gate's servers.
Common Status Codes
| Code | Description |
| 400 - Bad Request | The request was unacceptable, often due to missing a required parameter. |
| 401 - Unauthorized | No valid API key provided. |
| 403 - Forbidden | The API key doesn't have permissions to perform the request. |
| 404 - Not Found | The requested resource doesn't exist. |
| 429 - Too Many Requests | Too many requests hit the API too quickly. We recommend an exponential backoff. |
| 500, 502, 503, 504 | Server Errors. Something went wrong on A3Gate's end. |
Vision Analysis
The Vision API allows our multimodal models to take in images and answer questions about them. This is powered by our custom LLaVA-based architectures running on optimized TensorRT engines.
POST https://a3gate.in/v1/chat/completions
Vision uses the exact same endpoint as Chat Completions, but allows passing an array of content objects containing an image URL or base64 data.
Create Fine-Tuning Job
Creates a fine-tuning job which begins the process of creating a new model from a given dataset. Our distributed compute handles LoRA and full-parameter tuning.
POST https://a3gate.in/v1/fine_tuning/jobs
| Parameter | Description |
| training_filestringRequired |
The ID of an uploaded file that contains training data (JSONL format). |
| modelstringRequired |
The name of the base model to fine-tune. You can select "llama-3-8b" or "mistral-7b". |
| hyperparametersobject |
Optional hyperparameters used for fine-tuning. Defaults to automatic. |
List Fine-Tuning Jobs
List your organization's fine-tuning jobs, including their status (running, succeeded, failed) and the ID of the resulting fine-tuned model if complete.
GET https://a3gate.in/v1/fine_tuning/jobs
| Parameter | Description |
| afterstring |
Identifier for the last job from the previous pagination request. |
| limitinteger |
Number of fine-tuning jobs to retrieve. Defaults to 20. |