Job Summary:
An Infrastructure and Network Engineer designs, implements, and maintains an organization's IT infrastructure to ensure high availability, disaster recovery, and uninterrupted service delivery. The role combines cloud computing and on-premises data center management, with a focus on building a resilient network that can withstand failures and disruptions.
An Infrastructure and Network Engineer is essential for organizations that rely on cloud, on-premises, or hybrid IT infrastructure. This role requires a blend of technical skills, strategic planning, and a commitment to ensuring that IT services remain reliable and available.
Here is a summary of the qualifications, skills, and job responsibilities for this role:
Qualifications:
- A bachelor's degree in computer science, information technology, engineering, or a related field.
- Several years of experience in IT infrastructure, cloud services, and network engineering.
- Hands-on experience managing both cloud-based and on-premises IT infrastructure.
- Proven track record of implementing and maintaining service resiliency in a complex, hybrid IT environment.
- Flexibility to adapt to changing business needs and technology landscapes.
Skills:
- In-depth knowledge of cloud computing platforms (e.g., AWS, Azure, Google Cloud) and on-premises infrastructure.
- Expertise in network design, implementation, and troubleshooting.
- Experience with virtualization technologies and container orchestration (e.g., VMware, Docker, Kubernetes).
- Proficiency in network architecture, protocols, and security.
- Strong understanding of disaster recovery and business continuity planning principles.
- Ability to analyze and troubleshoot complex systems and resolve issues under pressure.
- Familiarity with monitoring and logging tools.
- Proficiency in infrastructure automation and configuration management tools.
- Scripting skills for automation purposes (e.g., Python, PowerShell).
- Excellent verbal and written communication skills, with the ability to work effectively with both technical and non-technical stakeholders.
Responsibilities:
- Design and Implementation of Resilient Infrastructure:
  - Design and deploy highly available, scalable, and secure infrastructure solutions across cloud and on-premises environments.
  - Implement redundancy, load balancing, and failover mechanisms to ensure service continuity.
  - Develop and maintain network architecture that supports both cloud-based and on-premises systems, optimizing for performance and resiliency.
- Service Resiliency and Disaster Recovery Planning:
  - Develop, implement, and test disaster recovery plans to ensure minimal downtime and data loss during incidents.
  - Conduct regular assessments of infrastructure and network vulnerabilities and develop strategies to mitigate risks.
  - Collaborate with business continuity teams to align IT infrastructure with organizational resiliency goals.
- Monitoring and Incident Management:
  - Implement and manage monitoring tools to proactively detect and respond to infrastructure and network issues.
  - Lead incident response efforts, including root cause analysis, remediation, and post-incident reviews.
  - Continuously improve monitoring and incident management processes to reduce response times and enhance service resiliency.
- Security and Compliance:
  - Ensure that all infrastructure and network configurations comply with security policies and industry regulations.
  - Implement security best practices, including firewalls, encryption, and access controls, to protect critical systems.
  - Conduct regular security audits and work with security teams to address vulnerabilities.
- Automation and Continuous Improvement:
  - Develop and implement automation scripts and tools to streamline infrastructure management and reduce manual intervention.
  - Identify and implement continuous improvement initiatives to enhance infrastructure performance, reliability, and efficiency.
  - Stay updated on industry trends and emerging technologies and recommend upgrades or new solutions to enhance service resiliency.
- Collaboration and Documentation:
  - Work closely with cross-functional teams, including DevOps, IT operations, and application teams, to ensure seamless integration and operation of infrastructure components.
  - Maintain comprehensive documentation of infrastructure designs, configurations, disaster recovery plans, and incident response procedures.
  - Provide training and guidance to other IT staff on best practices for maintaining service resiliency.
- Technology Research:
  - Stay informed about new technologies and industry best practices to enhance the resiliency and efficiency of IT services.