Job Description
General Summary:
The Data, Technology and Engineering (DTE) Core Technology team is expanding its Generative AI and large language model (LLM) capabilities and is looking for an Azure OpenAI Infrastructure Engineer, who will be responsible for building and maintaining the Azure cloud infrastructure that supports the OpenAI platform, emphasizing scalability, security, automation, and cost-effectiveness.
Key Duties and Responsibilities:
1. Infrastructure Design and Deployment
- Design Infrastructure Solutions:
- Design scalable, robust, and secure Azure infrastructures for hosting OpenAI services.
- Deploy appropriate Azure services for compute and storage needs.
- Deploying Infrastructure:
- Deploy and configure OpenAI resources for the different use cases.
- Configure networking components such as Virtual Networks, Virtual Firewalls, Load Balancers, and Application Gateways.
- Integration with OpenAI Services:
- Set up and manage integrations with OpenAI APIs and services.
- Ensure compatibility and optimal configurations for OpenAI models.
2. System Monitoring and Maintenance
- Monitoring Infrastructure Performance:
- Implement monitoring solutions using Azure Monitor, Log Analytics, and Application Insights.
- Set up alerts and dashboards to track the health and performance of infrastructure components.
- Maintenance Tasks:
- Schedule and perform updates, patches, and backups of infrastructure components.
- Manage the lifecycle of infrastructure resources.
- Resource Optimization:
- Analyze resource utilization and optimize for cost and performance.
- Implement autoscaling policies to adjust to workload changes.
3. Security and Compliance
- Implementing Security Measures:
- Configure Microsoft Defender for Cloud (formerly Azure Security Center), Network Security Groups, and firewalls to protect resources.
- Use Microsoft Entra ID (formerly Azure Active Directory) for identity and access management.
- Compliance Checks:
- Ensure compliance with organizational policies, industry standards, and regulatory requirements.
- Conduct regular security assessments and audits.
- Managing Credentials and Keys:
- Use Azure Key Vault to securely store and manage API keys, secrets, and certificates.
4. Automation and DevOps Practices
- CI/CD Pipeline Management:
- Develop pipelines in Azure DevOps for infrastructure deployments and updates.
- Automate testing and validation of infrastructure changes.
- Automation Scripting:
- Create scripts using PowerShell or Azure CLI to automate tasks.
- Configuration Management:
- Use automation tools for consistent configuration across resources and environments.
5. Troubleshooting and Incident Response
- Resolving Technical Issues:
- Diagnose and resolve issues related to infrastructure components.
- Use Azure diagnostic tools to analyze problems.
- Incident Management:
- Lead response efforts for infrastructure-related incidents.
- Document incidents and implement strategies to prevent future occurrences.
- Disaster Recovery Planning:
- Develop and maintain disaster recovery plans using Azure Site Recovery and backups.
- Test recovery procedures regularly.
6. Collaboration with Cross-Functional Teams
- Supporting Developers and Engineers:
- Assist teams in provisioning and configuring Azure environments for development and testing.
- Provide guidance on best practices for using Azure services.
- Stakeholder Communication:
- Communicate infrastructure status, changes, and upgrades to stakeholders.
- Coordinate with vendors and service providers when necessary.
7. Performance Optimization
- Hardware Acceleration:
- Utilize Azure's specialized hardware for performance-intensive workloads.
- Cost Management:
- Monitor compute/storage spending using Azure Cost Management tools.
- Optimize resource allocation to reduce costs.
- Scaling Strategies:
- Implement horizontal and vertical scaling strategies to handle load effectively.
8. Documentation and Knowledge Sharing
- Creating Documentation:
- Document infrastructure architecture, configurations, and operational procedures.
- Maintain a repository of scripts and templates.
- Training and Mentoring:
- Educate team members on Azure best practices and new features.
- Building Runbooks:
- Develop detailed runbooks for common tasks and incident responses.
9. Continuous Improvement and Learning
- Staying Current with Azure Technologies:
- Keep abreast of new Azure services, updates, and best practices.
- Participate in Azure communities and forums.
- Experimentation:
- Test new Azure features that could benefit the organization.
- Feedback Integration:
- Incorporate feedback from monitoring and teams to improve infrastructure.
Knowledge and Skills:
- Strong understanding of Microsoft Azure and emerging technologies in cloud and generative AI/LLM infrastructure
- Strong problem-solving and troubleshooting skills
- Ability to work on multiple concurrent projects and activities as both a lead and team member
- Able to reliably estimate level of effort needed for assignments and work within those parameters
- Able to work independently with minimal guidance
- Strong verbal and written communication skills, organizational skills, and attention to detail
- Demonstrated ability to collaborate in cross-functional teams
Education and Experience:
- Bachelor's degree in a relevant field (technology discipline preferred) or relevant experience
- 8+ years of experience working in Azure cloud technology roles, with at least 5 years in technology lead positions
- Experience with generative AI and large language models preferred
- Microsoft Azure or Cloud certifications
- Exposure to working in Agile environments
Flex Designation: Hybrid-Eligible or On-Site Eligible
Flex Eligibility Status:
In this Hybrid-Eligible role, you can choose to be designated as:
1. Hybrid: work remotely up to two days per week; or
2. On-Site: work five days per week on-site with ad hoc flexibility.
Note: The Flex status for this position is subject to Vertex's Policy on Flex @ Vertex Program and may be changed at any time.
Company Information
Vertex is a global biotechnology company that invests in scientific innovation.
Vertex is committed to equal employment opportunity and non-discrimination for all employees and qualified applicants without regard to a person's race, color, sex, gender identity or expression, age, religion, national origin, ancestry, ethnicity, disability, veteran status, genetic information, sexual orientation, marital status, or any characteristic protected under applicable law. Vertex is an E-Verify Employer in the United States. Vertex will make reasonable accommodations for qualified individuals with known disabilities, in accordance with applicable law.
Any applicant requiring an accommodation in connection with the hiring process and/or to perform the essential functions of the position for which the applicant has applied should make a request to the recruiter or hiring manager, or contact Talent Acquisition at ApplicationAssistance@vrtx.com.