
Senior Site Reliability Engineer

$140,000-180,000/year

TP-Link Systems Inc.

Irvine, CA, USA


Description

At the forefront of the future of connected living, TP-Link Systems Inc.'s R&D Center in Irvine, Southern California's innovation hub, spearheads research and development of next-generation networking, IoT smart home products, and software services. Our team of passionate engineers is constantly innovating, engineering solutions that transform the end-user experience with simpler, smarter, and more reliable connectivity. We're looking for a passionate and experienced Senior Site Reliability Engineer to join our team and play a crucial role in ensuring our cloud platform's security, reliability, scalability, and operational excellence.

About Us:
Headquartered in the United States, TP-Link Systems Inc. is a global provider of reliable networking devices and smart home products, consistently ranked as the world’s top provider of Wi-Fi devices. The company is committed to delivering innovative products that enhance people’s lives through faster, more reliable connectivity. With a commitment to excellence, TP-Link serves customers in over 170 countries and continues to grow its global footprint.

We believe technology changes the world for the better! At TP-Link Systems Inc., we are committed to crafting dependable, high-performance products to connect users worldwide with the wonders of technology. Embracing professionalism, innovation, excellence, and simplicity, we aim to assist our clients in achieving remarkable global performance and enable consumers to enjoy a seamless, effortless lifestyle.

Responsibilities:
• Serve as the technical SME for implementing and operating microservices on cloud-based Kubernetes platforms.
• Collaborate with the Cloud Technical Development and DevOps teams to deploy services to the multi-cloud platform.
• Perform load tests and chaos tests to ensure the scalability and reliability of microservices.
• Build observability for microservices and cloud platforms such as AWS, OCI, Azure, and GCP.
• Write and execute disaster recovery plans in collaboration with the Development and DevOps teams.
• Analyze and resolve production risks caused by insufficient resources, such as node groups, CPU, memory, HPA scheduling, and JVM pre-warming.
• Write and maintain automation scripts in languages such as Python, Go, or Bash.
• Define and maintain KPIs (SLA/SLO/SLI) for all cloud microservices together with the development teams to better understand the business.
• Create and maintain technical documentation, including architecture diagrams, design documents, and standard operating procedures.
• Ensure adherence to security and compliance standards, including ISO 27001, SOC 2, and GDPR.
• Lead incident response efforts to troubleshoot and resolve production issues quickly.
• Perform post-incident analysis to identify root causes and potential workarounds or solutions.
• Assist with product and technology selection, including implementation of POCs.
• Be adaptable and open to change and to evolving processes and tools.
• Help mentor and train less senior members of the team.
• Participate in the on-call rotation and provide support after work hours and on weekends.
• Other duties as assigned.

Requirements:
• Bachelor's degree in Computer Science, Information Technology, or a related field.
• 5+ years of experience as a Site Reliability Engineer.
• Proficiency in programming and scripting languages such as Java, Python, Bash, or PowerShell.
• Hands-on experience in SRE, DevOps, cloud operations, and cloud security best practices.
• Strong knowledge of security technologies, including identity and access management, network security, application security, and data protection.
• Strong problem-solving and analytical skills, with the ability to work independently and as part of a team.
• Experience in developing and maintaining technical documentation and implementing compliance requirements.

Additional Skills (Preferred):
• Expert-level cloud certifications, such as AWS Solutions Architect Professional, Azure Solutions Architect Expert, and GCP Professional Cloud Architect.
• Experience with container orchestration technologies (e.g., Kubernetes).

Benefits:
• Salary range: $140,000 - $180,000
• Free snacks and drinks, and provided lunch on Fridays
• Fully paid medical, dental, and vision insurance (partial coverage for dependents)
• Contributions to 401k funds
• Bi-annual reviews and annual pay increases
• Health and wellness benefits, including free gym membership
• Quarterly team-building events

At TP-Link Systems Inc., we are continually searching for ambitious individuals who are passionate about their work. We believe that diversity fuels innovation and collaboration and drives our entrepreneurial spirit. As a global company, we highly value diverse perspectives and are committed to cultivating an environment where all voices are heard, respected, and valued. We are dedicated to providing equal employment opportunities to all employees and applicants, and we prohibit discrimination and harassment of any kind based on race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws. Beyond compliance, we strive to create a supportive and growth-oriented workplace for everyone. If you share our passion and connection to this mission, we welcome you to apply and join us in building a vibrant and inclusive team at TP-Link Systems Inc.

Please note: we do not accept third-party agency inquiries, and we are unable to offer visa sponsorship at this time.

Source: Workable


© 2025 Servanan International Pte. Ltd.