AI Platform Engineer

We are seeking an AI Platform Engineer to build and scale the infrastructure that powers our production AI services. You will take cutting-edge models, ranging from speech recognition (ASR) to large language models (LLMs), and deploy them as highly available, developer-friendly APIs. You will be responsible for building the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimising models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference.

Responsibilities

API Development
- Design, build, and maintain production-ready APIs for speech, language, and other AI models.
- Provide SDKs and documentation to enable easy developer adoption.

Model Deployment
- Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems.
- Optimise inference pipelines for low-latency, high-throughput workloads.

Scalability & Reliability
- Architect infrastructure to handle large-scale, concurrent inference requests.
- Implement monitoring, logging, and auto-scaling for deployed services.

Collaboration
- Work with research teams to productionise new models.
- Partner with application teams to deliver AI functionality seamlessly through APIs.

DevOps & Infrastructure
- Automate CI/CD pipelines for models and APIs.
- Manage GPU-based infrastructure in cloud or hybrid environments.

Requirements

Core Skills
- Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services.
- Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar.
- Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.).

Infrastructure
- Experience with Docker, Kubernetes, and managing GPU-accelerated workloads.
- Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming).
- Cloud experience (AWS, GCP, Azure).

Bonus
- Experience with model optimisation (quantisation, distillation, TensorRT, ONNX).
- Exposure to MLOps tools for deployment and monitoring.



