Data Centre Operations Lead

Salary undisclosed

Title: Data Centre Operations Lead

Location: Rockville, MD

Type: Contract

Note: Onsite

Job Description:

  • Lead the data center operations team, providing guidance, training, and support to ensure high performance and operational excellence. Act as the primary point of contact for all data center-related issues and escalations.
  • Oversee the daily operations of data center facilities, ensuring high availability and reliability of all systems.
  • Manage the data center infrastructure technology stack end to end: VMware, VxRail, Citrix, LogicMonitor, Moogsoft, AD, Azure AD SSO, Azure Security Policy, PKI, Windows and Linux servers, vulnerability management, BeyondTrust Password Safe and AD Bridge, storage and backup tools, etc.
  • Ensure adherence to operational standards and best practices.
  • Drive major incidents and potential incidents end to end, providing periodic updates to client stakeholders for approvals and recommendations.
  • Lead, mentor, and manage a team of data center operation engineers.
  • Provide guidance and support for professional development and performance improvement.
  • Coordinate and manage the team's daily activities, ensuring alignment with organizational goals and priorities.
  • Lead the response to data center incidents, ensuring timely resolution and minimal impact on business operations.
  • Perform root cause analysis and implement preventive measures to avoid recurrence of issues.
  • Develop and maintain incident management processes and procedures.
  • Plan and oversee scheduled maintenance and upgrades of data center infrastructure.
  • Ensure that all hardware and software components are up-to-date and functioning optimally.
  • Coordinate with vendors and service providers for maintenance and support activities.
  • Monitor and analyze data center resource usage, ensuring efficient utilization and avoiding over-provisioning.
  • Conduct capacity planning to support future growth and demand.
  • Implement optimization strategies to enhance performance and reduce operational costs.
  • Ensure data center infrastructure adheres to security policies, standards, and best practices.
  • Implement and maintain security controls to protect data and systems.
  • Ensure compliance with regulatory requirements and industry standards (e.g., ISO 27001, HIPAA).
  • Develop and implement disaster recovery and business continuity plans for data center operations.
  • Ensure regular testing and validation of disaster recovery procedures.
  • Ensure data center infrastructure is resilient and can recover quickly from failures or disruptions.
  • Work closely with other IT teams, business units, and stakeholders to understand requirements and deliver solutions that meet their needs.
  • Collaborate with vendors and service providers to evaluate and integrate new technologies and services.
  • Communicate effectively with stakeholders, providing regular updates on data center operations and performance.
  • Maintain comprehensive documentation of data center infrastructure, configurations, processes, and procedures.
  • Generate regular reports on data center performance, incidents, and operational metrics.
  • Ensure documentation is up-to-date and accessible to relevant stakeholders.

Here are some technical responsibilities in detail.

Active Directory and Cloud Services

  • Administer Azure AD, manage security groups, GPO, SSO, and application configurations.
  • Handle public cloud directory services, Oracle IDCS, network/file shares, SCP policies, privileged user management, and service account passwords.
  • Conduct AD audits, schema updates, backup/restore services, and assist with JSOX, FDA, and GQS audits.
  • Manage ticket queues and follow up on aging tickets.
  • Provide end-to-end support for Active Directory domains (Azure AD, AD security groups, GPO, SSO, application configurations, etc.).

IT Environment Monitoring

  • 24x7 ITSM queue-based monitoring.
  • Triage and first-level troubleshooting based on alert severity.
  • Incident resolution using Standard Operating Procedures.

Vendor Coordination

  • Coordinate with vendors for infrastructure on public/private Cloud.
  • Provide vendor contact details and escalation matrix.

Citrix Architecture and Optimization

  • Maintain Citrix architecture and seek continuous optimization.
  • Participate in architecture design and planning with the steering committee.
  • Recommend system and end-user performance improvements.
  • Implement approved performance improvements.

Citrix Environment Support

  • Support Citrix environment and integrate with Otsuka-specific technologies.
  • Order, install, update, and maintain Citrix servers and tools.
  • Assess, consolidate, upgrade, and manage Citrix infrastructure, including SDX appliances.
  • Manage NetScaler infrastructure and upgrades.

IT Service Continuity and Disaster Recovery (DR) Services

  • Strategy and Policy Definition
  • Coordination and Execution
  • Data Management
  • Testing and Reporting
  • DR Activation and Coordination
  • Review and Enhancement

Onsite and Remote Support

  • Onsite server support, IMAC services, and remote software installation.
  • Decommissioning, proactive evaluation, and datacenter assessment.

Windows Server Management & Projects

  • Administer and monitor Windows servers, including health checks and problem management.
  • Manage local users, groups, shares, and server disk/storage.
  • Handle event logs, vendor coordination, and performance issues.
  • Install and manage IIS, apply security patches, and troubleshoot clusters.
  • Oversee DNS, SCOM, certificate management, migrations, and server deployments.

Linux Server Administration and Projects

  • User Administration - Manage user accounts, environments, and home directories.
  • OS Package Administration - Add/remove OS packages and troubleshoot issues.
  • Storage Management - Create/manage file systems, logical volumes, and clean up disk space.
  • NIS and NFS Management - Administer NIS tables and services, install/configure NFS servers.
  • Network and Security - Configure/manage NTP, DNS, and implement security standards.
  • OS Upgrade and Patching - Upgrade/patch Linux OS, configure SSSD and AD, manage disk and security.
  • High Availability and Compliance - Build/configure HA environments, enforce security, and ensure regulatory compliance.
  • Server Builds and Management - Install/configure NIS, mail, DNS servers, and centralized syslog servers.

DC Power Tools

  • Manage and support the tool stack: LogicMonitor, Moogsoft, ManageEngine, BeyondTrust Password Safe, BeyondTrust AD Bridge, Commvault Compliance Search, Veritas HubStor, etc.

LogicMonitor Administration

  • Installation and Configuration - Install and configure LogicMonitor Collectors and group servers for monitoring.
  • Monitoring and Reporting - Configure monitoring settings, create HLD/Templates/SOPs, and integrate with Moogsoft.
  • Maintenance and Troubleshooting - Backup/restore LogicMonitor Collectors, troubleshoot devices, and modify LogicModules.
  • Consultancy and Coordination - Provide consultancy, manage stakeholders, oversee platform support, and monitor infrastructure services.

Moogsoft Administration and Issues

  • Integration and Event Management - Resolve Element Layer Tool integration issues and missing events/alarms at the Moogsoft layer.
  • Ticketing and Situation Formulation - Address ticketing problems with ITSM tools and inconsistencies in situation formulation/Cookbook.
  • Maintenance and Upgrades - Fix maintenance window malfunctions and perform Moogsoft module upgrades.
  • Configuration Management - Manage Moogsoft Recipe additions/deletions/modifications and Cookbook enablement/disablement.
  • TeamRooms and API Integration - Create/modify/delete Moogsoft TeamRooms and integrate Moogsoft AI Operations with vendor APIs to automate ticketing.
  • Updates and Enhancements - Manage Moogsoft updates and enhancements.

Storage Backup & Data Management

  • Define performance, data segregation, backup, restore, archival, retention, reliability, encryption, security, scheduling, and access control needs.
  • Recommend hierarchical storage solutions (shared/dedicated, tiered storage, platforms) and procedures to meet requirements and SLRs.
  • Review and approve storage and backup solutions and procedures.
  • Procure and manage data storage infrastructure (SAN, NAS, tape, optical).
  • Provide and manage backup and archival consumables for Otsuka facilities.
  • Maintain data set placement, manage data catalogs, and configure Nimble SAN and NAS switches.
  • Notify Otsuka of any data losses or risks.
  • Perform data and file backups/restores per procedures and SLRs.
  • Manage file transfers, data movement, and input processing for third-party media.
  • Decommission storage and backup environments per policies.
  • Develop and maintain backup schedules, manage backup media, and ensure data retention.
  • Work with third-party vendors to archive data at secure offsite locations.
  • Conduct media testing to ensure data recovery capability and integrity.
  • Test end-to-end system recovery, remediate flaws, and coordinate with vendors.
  • Recover files/data as required, provide recovery updates, and manage data replication to DR sites.

Qualifications we seek in you!

Minimum Qualifications / Skills

  • Bachelor's degree in Computer Science, Information Technology, Electrical Engineering, or a related field. Advanced degrees or relevant professional training are a plus.
  • Minimum 10 years of experience in data center operations, with at least 5 years in a leadership or senior technical role.
  • Extensive experience in data center operations, with a proven track record of managing large-scale data center environments.
  • Strong leadership and team management skills, with the ability to motivate and develop a high-performing operations team.
  • In-depth knowledge of data center infrastructure, including servers, storage, networking, power, and cooling systems.
  • Excellent problem-solving and analytical skills, with the ability to diagnose and resolve complex technical issues.
  • Experience with incident and problem management, change management, and capacity planning.
  • Strong understanding of compliance, security, and regulatory requirements related to data center operations.
  • Effective communication and interpersonal skills, with the ability to interact with stakeholders at all levels.
  • Experience in vendor management and contract negotiations.
  • A proactive approach to continuous improvement and innovation in data center operations.

Preferred Qualifications / Skills

  • Relevant certifications from Microsoft, VMware, Citrix, and storage vendors are highly desirable.
  • Experience with ITIL or other IT service management frameworks.
  • Familiarity with cloud computing and hybrid data center environments.
  • Excellent communication and collaboration skills, with the ability to effectively interact with technical and non-technical stakeholders at all levels of the organization.
  • Strong analytical and problem-solving skills, with the ability to identify root causes of issues and implement effective solutions in a timely manner.
  • Proven ability to work independently as well as part of a team, with a proactive and self-motivated attitude towards achieving project goals.

Thanks & Regards,

Burra Teja | Recruiter
