The Systems-Facing track of the People Network focuses on topics relevant to the operation of research computing and data systems.
Topics include:
- Architecture (hardware solutions, cloud integration, VMs, container integration, …)
- Storage systems (file systems, SAN, NAS, NVMe, burst buffers, …)
- Networking (data transfer solutions, Infiniband, campus-wide, …)
- Cluster management and configuration (scheduling, accounting and reporting, …)
- Accelerators (GPU, FPGA, …)
- Security (auditing, compliance, policies, procedures, …)
- User environment (Modules, CUDA, …)
We connect via monthly calls, Slack, and an email list.
Join Us!
Fill out our membership form to let us know who you are and what you’re interested in, and we’ll add you to the email list. Visit the People Network page to join other tracks.
Monthly Calls
Monthly calls are on the third Thursday of the month, 1pm ET. Connection information and links to any materials are distributed via email.
Upcoming Plans & Calls
May 16, 2024 @ 1pm ET / 12pm CT / 11am MT / 10am PT
Energy Sciences Network (ESnet) presenting on perfSONAR
perfSONAR is an infrastructure for network performance monitoring, making it easier to solve end-to-end performance problems on paths crossing several networks. It contains a set of services delivering performance measurements in a federated environment. perfSONAR has been developed through an international collaboration led by ESnet, Indiana University, Internet2, the University of Michigan, and GEANT.
This presentation will provide an overview of how perfSONAR is used to support the work of research system administrators, as well as updates and features that may be relevant to this community.
Speaker: Ken Miller, Computer Systems Engineer, ESnet
Previous Call Topics
- April 2024 – Supply Chain Challenges & Strategy Discussion
- March 2024 – AI Support Community Discussion
- February 2024 – Overview of the Georgia Tech Rogues Gallery Testbed
- January 2024 – Navigating PBS to Slurm Migrations in Production Environments
- October 2023 – Containerizing HPC
- September 2023 – CaRCC Systems Facing Roundup Session
- August 2023 – Cluster Resource Management software and solutions – Part 2
- June 2023 – Cluster Resource Management software and solutions – Part 1
- May 2023 – CaRCC Accessibility Month Discussion – Introduction to Universal Design
- April 2023 – Dealing with New Cloud Storage Limits Joint Call Follow-on Discussion with Data-Facing, Researcher-Facing, and Systems-Facing
- March 2023 – Chilling out: Building, operating, and updating a modern data center for today and tomorrow’s HPC and enterprise needs
- February 2023 – Towards a Data Commons at Harvard (Joint call with Data-Facing Track)
- January 2023 – Deploying web-based software to enable self-service HPC for researchers on the commercial cloud
- October 2022 – Identifying “core/common skills” that a Novice RCD Systems Focused IT professional should prioritize developing as they start their career. – Part 2
- September 2022 – Open XDMoD/ColdFront/Open OnDemand HPC Toolset Overview
- August 2022 – Identifying Requirements for Early Career PT/FT RCD Systems Focused IT – Part 1
- June 2022 – JupyterHub @ NCAR: A Case for a Notebook-Centric Analysis Platform Around the Jupyter Ecosystem
- May 2022 – Operating a research computing environment for a large university
- April 2022 – Cloud Solutions Panel Discussion
- March 2022 – Integrating Cloud Storage into Research Workflows
- February 2022 – Post CentOS 8 Era: Community Discussion
- January 2022 – Real-world RoCE: Architecture, Implementation and Upkeep
- October 2021 – Network Troubleshooting: Techniques and Approaches
- September 2021 – Managing an NVIDIA DGX SuperPod at UF
- August 2021 – PEARC21 recap and community discussion
- June 2021 – Two unique approaches to developing CUI Compliant Systems
- May 2021 – Clusters in the Sky: How Canada is Building Beyond IaaS Cloud with Magic Castle
- April 2021 – Experiences and advice for large and small data centers – cooling
- March 2021 – Enabling Science Collaborations with Secure and Flexible Service Deployment
- February 2021 – Geddes Composable Platform – Purdue’s Kubernetes-based private cloud solution
- January 2021 – HPC Cluster Operating Systems Options
- October 2020 – Basic Cloud Bursting with Azure & VMware
- September 2020 – Overview of HPC File Systems and One Site’s Experience
- August 2020 – Multi-Track Call: Service Models for Researcher-Purchased Computing and Data
- July 2020 – OURRstore: Big Data on a Small Budget, discussion facilitated by Henry Neeman and Patrick Calhoun
- June 2020 – An intro to metrics collection with Prometheus and visualization with Grafana at the Colorado School of Mines
- May 2020 – Multi-Track: Student Workers
- April 2020 – Direct to the chip warm water cooling HPC system at OSC
- March 2020 – Visualize And Analyze Your Network Activities Using OSU INAM
- February 2020 – UC San Diego Health’s approach to meeting HIPAA compliance in AWS
- January 2020 – HIPAA-aligned data and Public Cloud Platforms
- December 2019 – Cornell’s Data Finder Tool
- October 2019 – HTCondor’s Philosophy of High-Throughput Computing
- September 2019 – Providing a Unified Software Environment for Canada’s National Advanced Computing Centers
- August 2019 – Storage and Data Management for a Mid-Scale Research Data Set
- July 2019 – Racktables for graphical HPC cluster management
- June 2019 – Open Discussion
- May 2019 – Globus Installation and Configuration
- April 2019 – Charliecloud Unprivileged Containers for User-Defined Software Stacks
- March 2019 – Ceph as a campus storage solution (block and objects)
CaRCC Systems-Facing YouTube playlist of past calls
Google Drive folder (includes a subfolder for each month’s call, with the call doc containing notes and Q&A as well as the Zoom recording files)
Track Coordinators
Systems-Facing Coordinators email contact link
Brian Haymore, Sr IT Architect – HPC, Center for High Performance Computing, University of Utah
Matthew Smith, Manager – Research Systems, Advanced Research Computing, University of British Columbia
Betsy Hillery, Director, Research Computing, Purdue University
Steering Committee
- John Blaas, HPC Operations Manager at Lambda and Chair for ACM SIGHPC Syspros
- Jim Leous, Research Computing Advisor, Advanced Research Computing, Division of Information Technology, Virginia Tech
- Sai Pinnepalli, Chief Technologist, Center for Computation and Technology, Louisiana State University
- Dori Sajdak, Senior Systems Administrator at University at Buffalo
- Jason Wells, Founder, Builder, Admin, Research Computing Architect and Data Management Lead, Harvard John A. Paulson School of Engineering and Applied Sciences
- <Open Spot Seeking Volunteer> – (Contact S-F Coordinators for more info)
- <Open Spot Seeking Volunteer> – (Contact S-F Coordinators for more info)
Wall of Thanks – (Former Track Leadership)
Former Track Co-Coordinators:
- Alan Silver, IT Professional, UW-Madison
- Troy Baer, Senior HPC Systems Engineer, Ohio Supercomputer Center
Former Track Steering Committee Members:
- Betsy Hillery, Director, Research Computing, Purdue University
- Matthew Smith, Manager – Research Systems, Advanced Research Computing, University of British Columbia