Weekly Colloquia Series

Sponsored by Compute Ontario and presented by CAC, SciNet and SHARCNET

The Compute Ontario Colloquia is a weekly informational series hosted via Zoom. The presentations cover a wide range of Digital Research Infrastructure (DRI) topics, such as advanced research computing (ARC), research data management (RDM), and research software (RS). They are delivered by Compute Ontario and consortium staff and featured speakers. The series commenced in January 2023 and supersedes similar webinar series previously delivered by Compute Ontario’s consortia (e.g. SHARCNET’s General Interest Webinars or SciNet’s User Group Meeting TechTalks). The Colloquia are each 1 hour in length and include time for questions. No registration is required. Click on the button below to access the Zoom link for each webinar. Past events and links to recordings can be found at the bottom of this page. Presentations are also uploaded to the hosting consortium’s video channel.

May
15

Bioinformatics: Advancements and challenges in the era of big data analysis

Presenter: Sridhar Ravichandran, CAC

Advancements in sequencing technologies have revolutionized biological sciences. Next Generation Sequencing (NGS) approaches are now a routine part of biological research, generating staggering amounts of data. This rapid growth poses significant challenges in data acquisition, storage, and distribution. In addition, large-scale data analysis among national and international collaborations requires portable, scalable, and reproducible computational analysis. This webinar will also cover some best practices, including workflow management for pipeline development and the handling of software installation and databases so that pipelines can run on different compute platforms, enabling workflow portability and sharing.

View Event →

Apr
24

Data Wrangling with Tidyverse (part 2)

Presenter: Tyson Whitehead, SHARCNET

Tidyverse is a cohesive set of packages for doing data science in R. In an earlier talk, we began reviewing the data munging portions of tidyverse (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by using them to reconstruct the data hierarchy in a 500-page reference PDF given only the words on each page and their bounding boxes. This talk will complete that review. If you have not seen the first part, or wish to review it, you can find it here: https://www.youtube.com/watch?v=8_Q-WwqY_Og For completeness, we also covered the graphical portion of tidyverse (ggplot) here: https://www.youtube.com/watch?v=PR2Rs0W4zYg

View Event →
Apr
10

PAST: Accelerating data analytics with RAPIDS cuDF

WATCH HERE

Presenter: Nastaran Shahparian, SHARCNET

Pandas is renowned as the go-to library for data manipulation and analysis in Python and is widely adopted in machine learning. However, Pandas is slow. With the introduction of NVIDIA cuDF.pandas, the accelerated power of GPUs is integrated into Pandas, enabling faster processing without the need for any code changes. A live demo will showcase this enhancement on clusters.

View Event →
Mar
27

PAST: Accelerating graph analysis on GPUs

WATCH HERE

Presenter: Jinhui Qin, SHARCNET

Graph analysis plays a critical role in many applications across various domains, including social network analysis, bioinformatics, fraud detection, cybersecurity, and recommendation systems. NetworkX is the go-to library for graph analysis in Python. However, when dataset and graph sizes grow, the performance of NetworkX becomes a significant concern. This webinar introduces NVIDIA cuGraph for accelerating graph analysis on GPUs. Moreover, a recent integration of NetworkX with cuGraph, named nx-cugraph, allows accelerating workflows in NetworkX on GPUs with zero code changes. A live demo will be done on the clusters.

View Event →
Mar
13

PAST: Make: obsolete or elegant?

WATCH HERE

Presenter: Mark Hahn, SHARCNET

Make is a classic Unix development tool, which may seem archaic and narrow-purpose. But if you think of it as a declarative, parallelized workflow automation tool, it sounds more relevant. We'll consider stereotypical use of make, then its general properties, and show some interesting examples of make applied to unusual uses.
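The "declarative, parallelized workflow" view can be illustrated with a minimal Python sketch. It is not how make itself is implemented: the rule set and file names below are hypothetical, and make's timestamp checks for skipping up-to-date targets are not modelled. It only shows how declared prerequisites determine which targets may be built at the same time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical rules in the spirit of a Makefile: target -> prerequisites.
rules = {
    "report.pdf": {"plot.png", "table.csv"},
    "plot.png": {"clean.csv"},
    "table.csv": {"clean.csv"},
    "clean.csv": {"raw.csv"},
    "raw.csv": set(),
}


def parallel_batches(rules):
    """Group targets into waves; targets in one wave do not depend on each other."""
    sorter = TopologicalSorter(rules)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything whose prerequisites are already done is ready now.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(rules)
for number, batch in enumerate(batches, 1):
    print(f"wave {number}: {', '.join(batch)}")
```

Targets that land in the same wave (here plot.png and table.csv) are the ones a parallel build could run concurrently.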

View Event →
Feb
28

PAST: Debugging your code with DDT

WATCH HERE

Presenter: Sergey Mashchenko, SHARCNET

One of the important steps of developing or maintaining a code is debugging: checking the code for errors. Simple toy codes can be debugged using print statements, but realistic codes need specialized debugging tools. We have a powerful debugger, "DDT", installed on the Graham and Niagara clusters. This presentation will walk you through the steps required to start debugging your codes using DDT, and will present the main features of the software. It will cover a wide range of situations: from debugging serial codes (Python, C/C++, Fortran) to debugging parallel CPU codes (MPI, OpenMP) to debugging GPU codes (CUDA, ROCm/HIP) to debugging hybrid codes (combining MPI, CUDA etc.). No familiarity with DDT or debugging in general is required.

View Event →
Feb
21

Multi-factor authentication on Alliance Clusters

Presenter: Marco Saldarriaga, SciNet

Multi-factor authentication (MFA), also known as two-factor authentication (2FA) among similar terms, is an electronic authentication method in which a user is granted access to a device, website or application only after successfully presenting two or more pieces of evidence (or factors) to an authentication mechanism. MFA protects user data, which may include personal identification or financial assets, from being accessed by an unauthorized third party that may have been able to discover, for example, a single password. We will explain the most common uses of MFA today, and how MFA is being implemented in our environment.
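As an illustration of what a second factor can look like, here is a minimal sketch of a time-based one-time password (TOTP, RFC 6238), the scheme used by many authenticator apps. The abstract does not say which factors are used on the clusters, so the choice of TOTP is an assumption made purely for illustration. The secret is the published RFC test secret, not a real credential.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    message = struct.pack(">Q", counter)  # counter as 8-byte big-endian
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    return hotp(secret, unix_time // step, digits)


secret = b"12345678901234567890"  # test secret from the RFCs, illustration only
print(totp(secret, 59))  # prints 287082
```

The server and the user's device share the secret. Both compute the code for the current 30-second window, so a stolen password alone is not enough to log in.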

View Event →
Feb
14

PAST: MySQL Part 3: Constraints and Joins

WATCH HERE

Presenter: Ed Armstrong, SHARCNET

In MySQL, constraints and joins are fundamental concepts used to ensure data integrity in a database and to query data from multiple tables. Constraints are rules enforced on the data columns of a table; they ensure the accuracy and reliability of the data within a database. Joins in MySQL combine rows from two or more tables based on a related column. Previous parts in the series: Part 1, Part 2.
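Both ideas can be sketched in a few lines. To keep the example self-contained it uses SQLite through Python's standard library as a stand-in for MySQL. The table and column names are hypothetical, and MySQL syntax and enforcement details differ in places (for example, SQLite needs foreign keys switched on per connection).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

# Constraints: PRIMARY KEY, NOT NULL, UNIQUE, CHECK and a foreign key.
conn.executescript("""
CREATE TABLE researcher (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE job (
    id            INTEGER PRIMARY KEY,
    researcher_id INTEGER NOT NULL REFERENCES researcher(id),
    cores         INTEGER NOT NULL CHECK (cores > 0)
);
INSERT INTO researcher (id, name) VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO job (id, researcher_id, cores) VALUES (10, 1, 4), (11, 1, 8), (12, 2, 1);
""")

# A row pointing at a non-existent researcher violates the foreign key.
try:
    conn.execute("INSERT INTO job (id, researcher_id, cores) VALUES (13, 99, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Join: combine rows from both tables on the related column.
rows = conn.execute("""
    SELECT r.name, SUM(j.cores)
    FROM researcher AS r
    JOIN job AS j ON j.researcher_id = r.id
    GROUP BY r.name
    ORDER BY r.name
""").fetchall()

print(rejected, rows)
```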

View Event →
Feb
7

PAST: Research Data Management for Reproducibility

WATCH HERE

Presenter: Jeff Moon, Compute Ontario

This presentation will take a high-level look at what we mean by reproducible research and how research data management (RDM) plays a key role in supporting reproducibility. We will draw from the literature to define and discuss reproducibility across disciplines, and various approaches to ensuring research data, metadata, and code are managed in ways that lend themselves to this key aspect of scientific advancement and integrity.

View Event →
Jan
31

PAST: Introduction to GPU programming with OpenMP

WATCH HERE

Presenter: Jemmy Hu, SHARCNET

OpenMP is a popular, portable and widely supported shared-memory parallel programming model in HPC. The OpenMP API supports multi-platform parallel programming in C/C++ and Fortran. As computer hardware has grown to include GPUs and other specialized accelerators, OpenMP has grown as well, adding device support for parallel programming on GPUs and accelerators. This seminar will give an introduction to GPU programming with OpenMP, covering the OpenMP device and execution model used to offload tasks (map loops and data) to GPUs.

View Event →
Jan
24

PAST: C++ libraries for Computational Mechanics

Presenter: Rakesh Raghavaraju, CAC

In addition to commercial software for computational mechanics, there are several open-source C++ libraries available for developing custom solvers, e.g., deal.II for FEM and OpenFOAM for CFD (FVM-based). Although one doesn’t require advanced-level knowledge of C++, some fundamentals of object-oriented programming (OOP) and compilation of C++ code are required in order to use these libraries. This seminar will introduce some of the concepts of OOP required to start using the deal.II libraries for writing FEM solvers. Some of the basic concepts of compilation needed to develop new CFD solvers using OpenFOAM will be discussed. A Laplacian solver for the transient heat equation will be developed using deal.II, and an existing solver from OpenFOAM will be recompiled in the user’s $HOME directory to explore the concepts of compiling C++ code.

View Event →
Jan
17

PAST: False sharing and contention in parallel codes

WATCH HERE

Presenter: Paul Preney, SHARCNET

Sequential programs can repeatedly read from and write to memory locations seemingly without issues. On the other hand, parallel programs can easily fall prey to weird behaviours resulting in small to very significant issues and/or performance loss that are not always easily attributable to specific pieces of code one has written. Such behaviours can be seen in multithreaded C, C++, Fortran, OpenMP, etc. parallel codes running on shared memory systems. This presentation will discuss false sharing and contention, the issues resulting from them, and how one can address them so as to minimize, if not eliminate, their negative effects.

View Event →
Dec
13

CANCELLED: Process Interrogation on Running Jobs

WILL BE RESCHEDULED IN 2024

Presenter: Doug Roberts, SHARCNET

For a job to run efficiently on a cluster, one of the most fundamental requirements is that a single process runs on a single core. To achieve this, Slurm scripts are used to request resources for a specific number of tasks per server, based on some a priori knowledge of how a program is expected to run, either directly according to predefined input file values or indirectly through Slurm environment variables that are set at runtime. To make this procedure as reliable as possible, the Alliance wiki provides many Slurm template scripts which researchers can use with minimal changes. In some cases, however, a program may not behave as expected and may start more (or fewer) processes or threads within a given core reservation on a given compute node than expected. As a result, the compute node can become overloaded, causing its overall performance to deteriorate rapidly and potentially impacting other researchers' jobs running on the node (including entire parallel jobs that have processes on the node) through excessive CPU load, memory bandwidth or file system calls. Eventually the node(s) may become unstable and unresponsive. If many such jobs are launched simultaneously on a cluster, the operation of the entire cluster can be adversely impacted. Once aware of such a situation, the system administrator will set out to track down the problematic job(s) and owner. A decision will then need to be made, depending on the severity of the problem, to either suspend or terminate the jobs immediately or contact the researcher to request they fix it themselves in due course. In most cases the problem can be quickly resolved by correcting a parameter setting in the Slurm script. In other cases, however, the source of the problem may not be obvious, making it impossible to correct the script without doing further work. These types of scenarios may occur regardless of whether a researcher has written a custom code, downloaded a third-party code or is running a commercial code.
The purpose of this presentation is to provide researchers with some basic strategies and tools that can be used to interrogate running programs, with the goal of understanding the running process and thread structure so that it may be fixed.

View Event →
Dec
6

PAST: Block Internet Advertisements by Setting up Pi-Hole on an Older PC or Laptop

WATCH HERE

Presenter: Norbert Krawiec, SciNet

Pi-Hole, a lightweight network-wide ad blocker, stands as a powerful solution for enhancing online privacy, security, and the overall user experience. We will explore Pi-Hole using a Docker container implementation, which can be easily deployed on older hardware. We'll guide you through the process of setting up Docker networking to allow Pi-Hole to have its own IP address on your network. Additionally, we'll demonstrate how to use Pi-Hole as a local DNS server. Furthermore, we'll examine other resources that complement Pi-Hole, enhancing its capabilities in blocking ads and malware domains.

View Event →
Nov
29

PAST: Skorch: Training PyTorch models with scikit-learn

WATCH HERE

Presenter: Collin Wilson, SHARCNET

PyTorch is an enormously popular framework for developing deep learning models in Python, while scikit-learn is one of the most popular libraries for general machine learning. Skorch is a wrapper for PyTorch that allows models written with PyTorch to be used with the scikit-learn library. In this talk, we will explore how skorch allows PyTorch models to be easily incorporated into scikit-learn data/training pipelines, cross-validation and hyperparameter search schemes, and how it eliminates the need for boilerplate training code.

View Event →
Nov
22

PAST: Web scraping in Python

WATCH HERE

Presenter: Yohai Meiron, SciNet

Web scraping is a method used to extract data from websites. It involves programmatically downloading web pages and parsing their HTML to extract the necessary information. It can be used to harvest data for the purpose of statistical analysis, training machine learning models, and creating alerts. In this talk, we'll discuss how to use basic programming skills in Python to scrape the web. We'll examine the technical and ethical aspects of the method, as well as practical applications.
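The parsing half of that process can be sketched with Python's standard library alone. The HTML below is a hypothetical inline page, so the example makes no network request. In practice the page would first be downloaded (for example with urllib.request), subject to the site's terms and robots.txt. The talk itself may well use other libraries.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (link text, href) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical page content standing in for a downloaded document.
page = """
<html><body>
  <h1>Events</h1>
  <ul>
    <li><a href="/events/make">Make: obsolete or elegant?</a></li>
    <li><a href="/events/ddt">Debugging your code with DDT</a></li>
  </ul>
</body></html>
"""

parser = LinkExtractor()
parser.feed(page)
links = parser.links
for text, href in links:
    print(f"{text} -> {href}")
```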

View Event →
Nov
15

PAST: Squeeze more juice out of a single GPU in deep learning

WATCH HERE

Presenter: Weiguang Guan, SHARCNET

It’s well known that GPUs can significantly accelerate neural network training. However, not everyone knows that a single GPU is sufficient to train most neural networks except for a few large ones (like LLMs). In fact, a GPU is under-utilized in most cases. In this talk, we are addressing the under-utilization issue and proposing a way to make full use of the GPU capacity. The goal is to increase the throughput with a single GPU. We will use a small NN training as an example to illustrate how to achieve the goal by splitting a physical GPU into multiple logical GPUs and then running a particular training process per logical GPU.

View Event →
Nov
8

PAST: Reference-Counted Multidimensional Arrays for C++ with rarray

WATCH HERE

Presenter: Ramses van Zon, SciNet

Compared to languages like Fortran and Python, the support for large multidimensional arrays in C++ is quite poor. There are many libraries trying to fill this gap, and there is hope on the horizon in the form of the planned C++23 and C++26 standards. But we would rather not wait for these, nor require C++ programmers to learn large frameworks, or worry about performance, when all they need is a multidimensional array. We will look at "rarray", a header-only library requiring only a C++11-compliant compiler. This library provides reference-counted multidimensional arrays that are easy to use, work now, are efficient, and can interface with many scientific libraries.

View Event →
Nov
1

PAST: Generalized End to End Python and Neuroscience Workflows on a Compute Cluster

WATCH HERE

Presenter: Tyler Collins, SHARCNET

Often, researchers are given the seemingly impossible task of taking a piece of code that they did not develop and generalizing it to run in an HPC environment. This can be a significant roadblock for researchers whose fields do not often teach skills such as parallel programming, or even terminal use. Through the lens of Python and Neuroscience, this talk will provide a step-by-step guide on how to build the skills necessary to both understand and create workflows. While Neuroscience will be used as an example, many of the concepts will generalize to other fields due to the widespread use of Python. Particular attention will be paid to what training materials exist, what technologies can be leveraged, and who should be contacted when roadblocks arise. Beginner-level skills with Python and Git will be assumed.

View Event →
Oct
25

PAST: SWIFT: A Modern Highly Parallel Gravity and Smoothed Particle Hydrodynamics Solver for Astrophysical and Cosmological Applications

WATCH HERE

Presenter: James Willis, SciNet

Numerical simulations have become one of the key tools used by theorists in all the fields of astrophysics and cosmology. The development of modern tools that target the largest existing computing systems and exploit state-of-the-art numerical methods and algorithms is thus crucial. In this talk, we introduce the fully open-source, highly parallel, versatile, and modular coupled hydrodynamics, gravity, cosmology, and galaxy-formation code Swift. The software package exploits hybrid task-based parallelism, asynchronous communications, and domain-decomposition algorithms based on balancing the workload, rather than the data, to efficiently exploit modern high-performance computing cluster architectures. Gravity is solved for using a fast multipole method, optionally coupled to a particle mesh solver in Fourier space to handle periodic volumes. For gas evolution, multiple modern flavours of Smoothed Particle Hydrodynamics are implemented. Swift also evolves neutrinos using a state-of-the-art particle-based method. Two complementary networks of sub-grid models for galaxy formation as well as extensions to simulate planetary physics are also released as part of the code. An extensive set of output options, including snapshots, light-cones, power spectra, and a coupling to structure finders, is also included. We describe the overall code architecture, summarize the consistency and accuracy tests that were performed, and demonstrate the excellent weak-scaling performance of the code using a representative cosmological hydrodynamical problem with ≈300 billion particles. The code is released to the community alongside extensive documentation for both users and developers, a large selection of example test problems, and a suite of tools to aid in the analysis of large simulations run with Swift.

View Event →
Oct
18

PAST: p2rng – A C++ Parallel Random Number Generator Library for the Masses

WATCH HERE

Presenter: Armin Sobhani, SHARCNET

p2rng (https://github.com/arminms/p2rng) is a modern header-only C++ library for parallel algorithmic (pseudo) random number generation supporting OpenMP, CUDA, ROCm and oneAPI. Playing fair, mostly required for debugging and unit testing, is one of the unique features of p2rng. That means that, using the same seed and distribution, you always get the same sequence of random numbers on all supported platforms. p2rng provides parallel versions of STL’s std::generate() and std::generate_n() algorithms with the same interface. In this seminar we first start with a quick review of preliminary concepts about algorithmic random number generators in general and parallelization techniques in particular. Then we continue with the standard way of generating random numbers with STL algorithms and how we can turn them into parallel versions using p2rng.

View Event →
Oct
11

PAST: High performance computing in R

WATCH HERE

Presenter: Alexey Fedoseev, SciNet

In a world where data has become extremely important, scientists require tools to process large volumes of data efficiently. R has become increasingly popular in recent years for data processing, statistical analysis, and data science. In this session we will discuss tools that measure the performance of an R code, so that we can understand the nature of performance issues. We will also describe techniques that will improve the computational speed of R code. Basic knowledge of programming in R will be assumed.

View Event →
Oct
4

PAST: Exploring job wait times on Alliance compute clusters: a holistic view

WATCH HERE

Presenter: James Desjardins, SHARCNET

Job wait times on the Alliance clusters are impacted by several factors. The target share of an account in relation to its recent usage, availability of resources being requested and overall load on the system from all users are major contributors to wait time variances. This presentation builds on previous seminars regarding wait time assessments, this time with a focus on analytics that describe the dynamic states of the clusters as a whole.

View Event →
Sep
13

PAST: Data Wrangling with Tidyverse

WATCH HERE

Presenter: Tyson Whitehead, SHARCNET

Tidyverse is a cohesive set of packages for doing data science in R. We have demonstrated the graphics portion of this in prior talks (ggplot). In this one we are going to demonstrate the data munging portions (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by restoring the underlying data hierarchy implicit in the layout of a 500-page reference PDF file given only the words on each page and their bounding boxes.

View Event →
Sep
6

CANCELLED: Advanced Container Use on Clusters and Personal Computers

Presenter: Paul Preney, SHARCNET

This presentation continues exploring the topics of the earlier presentations, "Leveraging the power of Linux on Windows with WSL" and "Parallel Computing: Start from Your Own Computer", and will discuss in detail:
* how to run multiple Windows Subsystem for Linux (WSL) instances under Windows,
* how to import other Linux containers into WSL on your computer,
* how to export a WSL container and use it with Apptainer on the Digital Research Alliance of Canada's clusters (which all run Linux and have Apptainer installed on them),
* various aspects and issues of doing the above in practice, and
* the benefits and limitations of using one's own (Windows) computer in this way.
(Using macOS to do the same will not be discussed in this talk.)

View Event →
Aug
9

PAST: Automating scientific workflows with AiiDA

WATCH HERE

Presenter: Pawel Pomorski, SHARCNET

AiiDA is an open source Python package to help researchers with automating complex workflows. It seamlessly integrates with High Performance Computing resources, enabling automatic submission of jobs to clusters with a SLURM scheduler. Inputs and outputs are tracked automatically so that a record of data provenance is preserved. A set of plugins to support popular scientific packages is available for AiiDA; packages supported include Gaussian, GROMACS, LAMMPS, NWChem, Quantum ESPRESSO and VASP. This seminar will introduce AiiDA and demonstrate its usage on the Alliance clusters.

View Event →