
DFI Retail Group

Site Reliability Engineer


Job Description

DFI Team Brief

As a Site Reliability Engineer (SRE) at DFI Retail Group, you will be the bridge between development and operations, ensuring our systems are designed, implemented, and maintained for maximum reliability, scalability, and performance. You will leverage your software engineering expertise to automate operations, optimize system performance, and develop solutions that prevent recurring issues. Your work will be essential in guaranteeing a seamless experience for our users by maintaining the high availability and efficiency of our services.

Is this your next challenge in Site Reliability Engineering?

Responsibilities:

  • Design and Implement Solutions for Reliability and Scalability: Develop and implement highly scalable and available system architectures to meet growing user demands without compromising performance.
  • Automate Operations: Design, build, and integrate software tools to automate operational processes, including system monitoring, incident response, and deployment procedures.
  • Optimize System Performance: Proactively monitor system performance, identify bottlenecks, and implement optimization strategies to ensure efficient resource utilization and service delivery.
  • Implement and Manage Monitoring and Observability: Establish comprehensive service metrics and implement robust monitoring systems to track, analyze, and report on system reliability, performance, and efficiency, using tools including, but not limited to, New Relic, Azure Monitor, and Google Cloud Monitoring. Use observability tools to gain deeper insight into system behavior and identify potential issues proactively.
  • Incident Response and Resolution: Develop and implement strategies for rapid incident detection and response. Troubleshoot and resolve complex system issues, minimizing downtime and mitigating service disruptions.
  • Capacity Planning and Performance Tuning: Conduct capacity planning analyses to anticipate future resource needs and ensure system scalability. Proactively tune system performance to optimize resource utilization and maintain service level agreements (SLAs).
  • Collaboration with Development Teams: Work closely with software development teams to integrate reliability considerations throughout the software development lifecycle. Participate in code reviews, design discussions, and post-incident reviews to enhance system reliability and prevent recurring issues.
  • Drive Continuous Improvement: Continuously evaluate existing processes and tools, identifying areas for improvement and automation. Research and implement new technologies and best practices to enhance system reliability and operational efficiency.
  • Documentation and Knowledge Sharing: Create and maintain comprehensive documentation for systems, processes, and incident responses. Actively share knowledge and best practices with the team and organization.
  • Administer Atlassian Product Suite: Manage and maintain the Atlassian product suite, including Jira, Confluence, and Bitbucket, ensuring seamless operation and integration with existing workflows. Provide user support and training as needed.

Do you have experience as a Site Reliability Engineer?

Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • Proven experience (at least 3 years) as an SRE or DevOps Engineer, or in a similar role, demonstrating a strong understanding of software engineering principles and IT operations.
  • Hands-on experience in the administration of the Atlassian product suite (Jira, Confluence, and Bitbucket).
  • In-depth knowledge of cloud platforms such as AWS, Azure, or GCP, including services related to compute, storage, networking, and databases.
  • Proficiency in scripting languages like Python or PowerShell and experience with automation tools such as Terraform or Ansible.
  • Familiarity with monitoring and logging systems (Prometheus, Zabbix, Grafana, ELK, Azure Monitor, Google Cloud Monitoring).
  • Hands-on experience with containerization technologies like Docker and container orchestration tools like Kubernetes.
  • Strong understanding of networking concepts and protocols.
  • Experience with CI/CD pipelines and tools for continuous integration, continuous delivery, and infrastructure automation.
  • Solid understanding of security best practices for cloud environments.
  • Strong analytical and problem-solving skills, with the ability to identify root causes and implement effective solutions.
  • Excellent communication and collaboration skills, with the ability to work effectively within a team and communicate technical details to both technical and non-technical audiences.

If you have the right skills and experience, this is an opportunity to build your career with Pan Asia's leading retailer.

DFI Retail Group is an equal opportunity employer. We are responsible for ensuring that all personal information collected from each candidate presented to DFI Retail Group is used for recruitment purposes only and that personal data is kept and handled confidentially. We will retain the applications of candidates not selected for a period of no more than 24 months. The data collection process is in accordance with all applicable laws and compliant with the Code of Practice on Human Resource Management.

To find out more about Our Businesses and Our People, please visit our website: https://www.DFIretailgroup.com

Issued by The Dairy Farm Company, Limited

More Info

Industry: Other

Function: Retail

Job Type: Permanent Job


Date Posted: 12/11/2024

Job ID: 99980719

Last Updated: 23-11-2024 07:42:13 PM