Sr. Linux Engineer
- Full Time, onsite
- Columbia University
- Hybrid: onsite Mondays & Wednesdays as needed, United States of America
Reporting to the Manager, High Performance Computing, the Sr. Research Systems Engineer participates in the design, development, implementation, and operations of Columbia's portfolio of high-performance computing services. The position collaborates with other CUIT technical teams and Columbia researchers to support research computing resources, including but not limited to high-performance computing (HPC) clusters, ensuring user requirements are met and planning ongoing improvements and modifications to these systems and services.
Responsibilities:
- Takes a primary role in the planning and design of research computing services.
- Investigates new and emerging technologies, evaluating usefulness to Columbia researchers and making recommendations for future services.
- Interacts with Columbia researchers on various topics, including (but not limited to) the use of existing services, service policies, and research requirements.
- Takes a primary role in HPC system troubleshooting, including coordinating with users, vendors, and other CUIT departments to resolve system problems.
- Manages storage systems.
- Resolves incidents and service requests.
- Administers systems in the research computing infrastructure, including the installation and management of configuration, monitoring, and notification tools, as well as basic network administration.
- Assists in the creation and maintenance of user documentation.
- Interacts with vendors, assessing products and making purchasing recommendations.
- All other duties as assigned.
Minimum Qualifications:
- Bachelor's degree or equivalent experience required.
- Minimum of 4-6 years of related experience.
- 4 years of Linux/Unix experience.
- Prior experience in programming, software development, or system administration.
- Excellent written and verbal communication skills.
- Demonstrated ability to work in a fast-paced, deadline-driven environment.
- Demonstrated excellence in a variety of competencies including teamwork/collaboration, analytical thinking, communication and influencing skills, and technical expertise.
- Ability to work with changing priorities and with multiple projects.
- Ability to be precise and attentive to detail is essential.
- Ability to work with minimal supervision.
- Ability to work weekends and off-hours on occasion.
Preferred Qualifications:
The following qualifications are not requirements but are highly advantageous. We will provide on-the-job training in High Performance Computing (HPC) and related topics to individuals who are otherwise skilled and motivated.
- Experience with Linux system administration, particularly Red Hat (7, 8).
- Experience with SLURM or other workload management services.
- Experience with Bright (Base Command Manager), OpenHPC, SGE, Confluent, or other clusterware.
- Knowledge of GPFS, Lustre, ZFS, NFS, or other network or parallel file systems.
- Familiarity with Ansible, Puppet, or other Linux configuration management tools.
- Familiarity with other HPC components, such as InfiniBand networking and GPUs.
- Experience with Shell scripting and Python.
- Experience with version control systems, such as Git, and monitoring tools like Grafana or Nagios.
- Familiarity with HPC programming technologies (such as MPI, OpenMP, or CUDA).
- Familiarity with other HPC technologies (such as InfiniBand, GPUs, and DDN appliances).
- Familiarity with JupyterHub.
- Familiarity with standard programming languages (such as C, C++, Fortran, or Java).
- Knowledge of TCP/IP.
- Familiarity with statistical tools (such as R) or mathematical tools (such as MATLAB).
- Knowledge of technology, applications, and interfaces designed to support research, such as Globus.
Equal Opportunity Employer / Disability / Veteran
Columbia University is committed to the hiring of qualified local residents.