Sr. Linux Engineer

  • Full Time, onsite
  • Columbia University
  • Hybrid: onsite Mondays & Wednesdays as needed, United States of America
Salary undisclosed

Reporting to the Manager, High Performance Computing, the Sr. Research Systems Engineer participates in the design, development, implementation, and operations of Columbia's portfolio of high-performance computing services. The position collaborates with other CUIT technical teams and Columbia researchers to support research computing resources, including but not limited to high-performance computing (HPC) clusters, ensuring user requirements are met and planning ongoing improvements and modifications to these systems and services.

Responsibilities:

  • Takes a primary role in the planning and design of research computing services.
  • Investigates new and emerging technologies, evaluating usefulness to Columbia researchers and making recommendations for future services.
  • Interacts with Columbia researchers on various topics, including (but not limited to) the use of existing services, service policies, and research requirements.
  • Takes a primary role in HPC system troubleshooting including coordinating with users, vendors, and other CUIT departments to resolve system problems.
  • Manages storage systems.
  • Resolves incidents and service requests.
  • Administers systems in the research computing infrastructure, including the installation and management of configuration, monitoring, and notification tools, as well as basic network administration.
  • Assists in the creation and maintenance of user documentation.
  • Interacts with vendors, assessing products and making purchasing recommendations.
  • All other duties as assigned.

Minimum Qualifications:

  • Bachelor's degree or equivalent experience required.
  • Minimum of 4-6 years of related experience.
  • 4 years of Linux/Unix experience.
  • Prior experience in programming, software development, or system administration.
  • Excellent written and verbal communication skills.
  • Demonstrated ability to work in a fast-paced, deadline-driven environment.
  • Demonstrated excellence in a variety of competencies including teamwork/collaboration, analytical thinking, communication and influencing skills, and technical expertise.
  • Ability to work with changing priorities and with multiple projects.
  • Ability to be precise and attentive to detail is essential.
  • Ability to work with minimal supervision.
  • Ability to work weekends and off-hours on occasion.

Preferred Qualifications:

The following qualifications are not requirements but are highly advantageous. We will provide on-the-job training in High Performance Computing (HPC) and related topics to individuals who are otherwise skilled and motivated.

  • Experience with Linux system administration, particularly Red Hat (7, 8).
  • Experience with SLURM or other workload management services.
  • Experience with Bright (Base Command Manager), OpenHPC, SGE, Confluent, or other clusterware.
  • Knowledge of GPFS, Lustre, ZFS, NFS, or other network or parallel file systems.
  • Familiarity with Ansible, Puppet, or other Linux configuration management tools.
  • Familiarity with other HPC components, such as InfiniBand networking and GPUs.
  • Experience with Shell scripting and Python.
  • Experience with version control systems, such as Git, and monitoring tools like Grafana or Nagios.
  • Familiarity with HPC programming technologies (such as MPI, OpenMP, or CUDA).
  • Familiarity with other HPC technologies (such as InfiniBand, GPUs, and DDN appliances).
  • Familiarity with JupyterHub.
  • Familiarity with standard programming languages (such as C, C++, Fortran, or Java).
  • Knowledge of TCP/IP.
  • Familiarity with statistical tools (such as R) or mathematical tools (such as MATLAB).
  • Knowledge of technology, applications, and interfaces designed to support research, such as Globus.

Equal Opportunity Employer / Disability / Veteran

Columbia University is committed to the hiring of qualified local residents.
