Job Summary:
This position involves developing a software pipeline for end-to-end ML model inference, optimized for specific hardware accelerators to achieve maximum performance and accuracy. The role requires expertise in implementing cutting-edge deep learning layers for models such as CNNs, RNNs, LSTMs, and GANs, and in optimizing LLM performance on customized hardware.
Job Description:
- Develop a software pipeline tailored for neural network processors to enable efficient ML inference execution.
- Implement deep learning layers across various model categories using custom inference pipelines.
- Optimize on-hardware inference performance for large language models, including transformer-based architectures.
- Ensure hardware-aware, computation-conscious implementations on embedded devices to maximize throughput.
- Write clean, efficient code for tools and applications.
- Identify, prioritize, and execute tasks based on project requirements.
- Manage code implementation, review, debugging, and product delivery with fast turnaround times.
- Collaborate with the team to brainstorm and develop new products.
- Mentor new team members and promote a healthy team culture.
Profile Summary:
- BE/BTech/MS/MTech graduate in Computer Science Engineering with 4+ years of experience.
- Solid programming experience in C/C++, with proven experience as a Senior Software Engineer.
- Expertise in implementing optimization-focused kernel intrinsics for machine learning or computer vision algorithms.
- Extensive experience in software development and project management.
- Strong analytical and problem-solving skills.
- Able to adapt and execute complex tasks under tight schedules and dynamic conditions.
- Familiarity with operating systems such as Linux, macOS, and Windows.
- Ability to work independently and manage a team effectively.
- Excellent organizational and leadership skills.
- Working knowledge of deep learning frameworks (ONNX, TensorFlow, PyTorch) or experience with any hardware accelerator software pipeline.
About the Company:
The client is a global deep tech software and engineering services provider specializing in high-performance solutions across compilers, machine learning, vision analytics, and video codec technologies. Its offerings power applications in emerging domains such as autonomous mobility, edge AI, multimedia streaming, and sensor fusion on heterogeneous platforms (CPU/GPU/AI accelerators). Privately backed and profitable since its inception in 2009, the organization maintains engineering centers in North America, China, India, and Asia Pacific, and has been repeatedly certified as a Great Place to Work.