This playground runs on dedicated NVIDIA H100 clusters using TensorRT-LLM for sub-millisecond TTFT (time to first token) and throughput of 100+ tokens/sec.
Type a prompt below to see the streaming response powered by A3Gate's optimized infrastructure.
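
For readers who want to check the two metrics quoted above against their own sessions, here is a minimal sketch of how TTFT and tokens/sec could be computed from any token stream. It is not A3Gate's measurement code: `measure_stream` and its `clock` parameter are hypothetical names, and it assumes throughput means total tokens divided by end-to-end time (including the wait for the first token), which may differ from how the figures above were obtained.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Return (ttft_seconds, tokens_per_second, token_count) for one stream.

    Hypothetical helper: `tokens` is any iterable that yields tokens as they
    arrive, and `clock` returns the current time in seconds.
    """
    start = clock()            # timer starts just before the stream is consumed
    first_arrival = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_arrival is None:
            first_arrival = now   # arrival time of the first token
        count += 1
    end = clock()

    # TTFT is undefined if the stream produced no tokens.
    ttft = None if first_arrival is None else first_arrival - start
    elapsed = end - start
    # Assumption: throughput is measured end to end, TTFT included.
    tokens_per_second = count / elapsed if elapsed > 0 else 0.0
    return ttft, tokens_per_second, count
```

In practice `tokens` would be the iterator returned by a streaming client; the injectable `clock` exists only so the function can be checked with deterministic timestamps.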