AI Platform Engineer

T-Pro

🌎 Remote

Posted on: 14 September 2025

We are seeking an AI Platform Engineer to build and scale the infrastructure that powers our production AI services. You will take cutting-edge models, ranging from speech recognition (ASR) to large language models (LLMs), and deploy them as highly available, developer-friendly APIs.

You will be responsible for creating the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimising models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference.

Responsibilities

API Development

  • Design, build, and maintain production-ready APIs for speech, language, and other AI models.
  • Provide SDKs and documentation to enable easy developer adoption.
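To make the API bullets above concrete, here is a minimal, stdlib-only sketch of the kind of request/response contract a transcription endpoint might enforce. The field names (`audio_url`, `language`) and the `fake_asr` stand-in are purely illustrative, not T-Pro's actual schema.

```python
import json

def fake_asr(audio_url: str, language: str) -> str:
    # Stand-in for a real ASR model call behind the endpoint.
    return f"[{language}] transcript of {audio_url}"

def handle_transcribe(raw_body: str) -> dict:
    """Validate a JSON transcription request and return an API response."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return {"status": 400, "error": "body must be valid JSON"}
    if "audio_url" not in payload:
        return {"status": 400, "error": "missing required field: audio_url"}
    transcript = fake_asr(payload["audio_url"], payload.get("language", "en"))
    # Versioning the model in the response keeps SDK clients debuggable.
    return {"status": 200, "transcript": transcript, "model_version": "v1"}
```

In a real service this handler would sit behind a framework such as FastAPI, but the contract itself (explicit validation, stable error shapes, a model version in every response) is what makes the API easy for SDK consumers to adopt.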

Model Deployment

  • Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems.
  • Optimise inference pipelines for low-latency, high-throughput workloads.
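A core technique behind the low-latency/high-throughput trade-off mentioned above is micro-batching: queue incoming requests briefly and run the model once per batch instead of once per request. The sketch below is a hedged, framework-free illustration (the names `MAX_BATCH`, `MAX_WAIT_S`, and `run_model` are hypothetical); Triton's dynamic batcher provides this same behaviour in production.

```python
import asyncio

MAX_BATCH = 8      # flush as soon as this many requests are queued...
MAX_WAIT_S = 0.01  # ...or after this long, whichever comes first

def run_model(batch: list[str]) -> list[str]:
    # Stand-in for a single batched forward pass on the GPU.
    return [text.upper() for text in batch]

class MicroBatcher:
    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self.worker = asyncio.create_task(self._loop())

    async def infer(self, text: str) -> str:
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((text, fut))
        return await fut

    async def _loop(self) -> None:
        while True:
            batch = [await self.queue.get()]  # block for the first request
            deadline = asyncio.get_running_loop().time() + MAX_WAIT_S
            # Gather more requests until the batch fills or the deadline hits.
            while len(batch) < MAX_BATCH:
                timeout = deadline - asyncio.get_running_loop().time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            outputs = run_model([text for text, _ in batch])
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)

async def main() -> list[str]:
    batcher = MicroBatcher()
    results = await asyncio.gather(*(batcher.infer(w) for w in ["hello", "world"]))
    batcher.worker.cancel()
    return results
```

The key design point is the bounded wait: each request pays at most `MAX_WAIT_S` of extra latency in exchange for amortising the model call across the whole batch.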

Scalability & Reliability

  • Architect infrastructure for handling large-scale, concurrent inference requests.
  • Implement monitoring, logging, and auto-scaling for deployed services.
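The monitoring bullet usually starts with latency percentiles, since autoscaling and alerting rules act on p95-style signals rather than averages. A minimal sketch of such a rolling window, with an illustrative (not prescribed) percentile method:

```python
from collections import deque

class LatencyWindow:
    """Rolling window of request latencies with percentile summaries."""

    def __init__(self, size: int = 1000) -> None:
        self.samples: deque = deque(maxlen=size)  # oldest samples fall off

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def percentile(self, p: float) -> float:
        # Nearest-rank percentile over the current window.
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
        return ordered[idx]
```

In production this role would be exported through Prometheus-style metrics rather than computed in-process, but the shape of the signal (a windowed p95 feeding a scaling decision) is the same.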

Collaboration

  • Work with research teams to productionize new models.
  • Partner with application teams to deliver AI functionality seamlessly through APIs.

DevOps & Infrastructure

  • Automate CI/CD pipelines for models and APIs.
  • Manage GPU-based infrastructure in cloud or hybrid environments.

Requirements

Core Skills

  • Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services.
  • Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar.
  • Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.).

Infrastructure

  • Experience with Docker, Kubernetes, and managing GPU-accelerated workloads.
  • Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming).
  • Cloud experience (AWS, GCP, Azure).

Bonus

  • Experience with model optimisation (quantisation, distillation, TensorRT, ONNX).
  • Exposure to MLOps tools for deployment and monitoring.

Tags: ai, ml