Weekly Colloquia Series
Sponsored by Compute Ontario and presented by CAC, SciNet and SHARCNET
The Compute Ontario Colloquia is a weekly informational series hosted via Zoom. These presentations cover a wide range of Digital Research Infrastructure (DRI) topics, such as advanced research computing (ARC), research data management (RDM), and research software (RS). The presentations are delivered by Compute Ontario and consortium staff as well as featured speakers. Each colloquium is one hour in length and includes time for questions. No registration is required. Click on the button below to access the Zoom link for each webinar. Past events and links to recordings can be found at the bottom of this page. Presentations are also uploaded to the hosting consortium's video channel.
UPDATED TOPIC! Survival guide for the upcoming GPU upgrades (more total power, but fewer GPUs)
Presenter: Sergey Mashchenko, SHARCNET
In the coming months, national systems will be undergoing significant upgrades. In particular, older GPUs (P100, V100) will be replaced with the newest H100 GPUs from NVIDIA. The total GPU computing power of the upgraded systems will grow by a factor of 3.5, but the number of GPUs will drop significantly (from 3200 to 2100). This will present a significant challenge for our users, as "business as usual" (using a whole GPU for each process or MPI rank) will no longer be feasible in most cases. Fortunately, NVIDIA provides two powerful technologies that can mitigate this situation: MPS (Multi-Process Service) and MIG (Multi-Instance GPU). This presentation will walk you through both technologies and discuss the ways they can be used on our clusters. We will also discuss how to determine which approach will work best for your code. The presentation will conclude with a live demonstration.
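As a rough sketch of the MPS approach, a Slurm batch script might start the MPS control daemon so several processes can share a single GPU. This is an assumption-laden illustration, not cluster documentation: directory paths, resource requests, and the application name (`./my_app`) are hypothetical and cluster-specific.

```shell
#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=8
# Hypothetical sketch: point MPS at job-local directories and start
# the control daemon so multiple processes can share one GPU.
export CUDA_MPS_PIPE_DIRECTORY=$SLURM_TMPDIR/mps-pipe
export CUDA_MPS_LOG_DIRECTORY=$SLURM_TMPDIR/mps-log
mkdir -p "$CUDA_MPS_PIPE_DIRECTORY" "$CUDA_MPS_LOG_DIRECTORY"
nvidia-cuda-mps-control -d

# Launch four copies of the (hypothetical) application on the same GPU.
for i in 1 2 3 4; do
    ./my_app --task "$i" &
done
wait

# Shut the MPS daemon down at the end of the job.
echo quit | nvidia-cuda-mps-control
```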
Causal Inference using Probabilistic Variational Causal Effect in Observational Studies
Presenter: Usef Faghihi, UQTR, SHARCNET
In this presentation, I introduce a novel causal analysis methodology called Probabilistic Variational Causal Effect (PACE), designed to evaluate the impact of both rare and common events in observational studies. PACE quantifies direct causal effects by integrating total variation, which captures the purely causal component, with interventions on varying treatment levels. This integration also incorporates the likelihood of transitions between different treatment states. A key feature of PACE is the parameter d, which allows the metric to emphasize less frequent treatment scenarios when d is low and more common treatments when d is high, providing a causal effect function dependent on d.
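To build intuition for that last point, here is a toy sketch (explicitly not the actual PACE formula, whose definition is not given in this abstract) of how an exponent-style parameter d can shift emphasis between rare and common treatment transitions:

```python
# Toy illustration only: weight each treatment transition by its
# probability raised to the power d. Small d makes rare transitions
# count almost as much as common ones; large d lets common
# transitions dominate.

def transition_weights(probs, d):
    """Return normalized weights p**d for transition probabilities."""
    raw = [p ** d for p in probs]
    total = sum(raw)
    return [w / total for w in raw]

probs = [0.01, 0.99]  # a rare and a common treatment transition

low_d = transition_weights(probs, d=0.1)   # rare transition keeps real weight
high_d = transition_weights(probs, d=2.0)  # common transition dominates
```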
Setting Up Compute Infrastructure for Sensitive Data
Presenter: Yohai Meiron & Shawn Winnington-Ball, SciNet
We introduce a secure computing enclave at the SciNet High-Performance Computing Consortium. Codenamed S4H, this environment is already available to groups at the University of Toronto as a pilot project. S4H aims to meet researchers’ needs for hosting and working with sensitive data, which SciNet’s main cluster, Niagara, does not accommodate. In the first part (Yohai), we’ll delve into the technical details. We’ll explain how S4H differs from Niagara in that the data are encrypted at rest and access is hardened, and what that means in practice. We will talk about the difficulties of providing isolation for different research groups on a shared system, and explore the different components that make it possible, such as key management and containerization mechanisms. The second part (Shawn) will focus on adopting the Cybersecurity Maturity Model Certification (CMMC) framework. We’ll describe our journey deciphering the control set’s complexities, developing metadata for organizing remediation efforts, and crafting Plans of Action and Milestones for compliance gaps. Future steps include internal and potentially external assessments to verify compliance, along with initiatives like a Privacy Impact Assessment and penetration testing, with the eventual goal of being certified for Level 4 data.
PAST: Git Part 3: Managing Workflows
Presenter: Ed Armstrong, SHARCNET
This session explores strategies and tools that enhance collaboration, improve workflow efficiency, and streamline code management. Building on foundational Git concepts presented in parts 1 & 2, we'll explore branching strategies, focusing on how they shape team collaboration and project stability. We'll cover essential commands for rebasing, merging, and resolving conflicts, providing insights into when and why each method works best. Additionally, we'll look at interactive rebase for rewriting commit history, leveraging Git tags for versioning, and working with stashes to manage changes across different contexts. This deeper dive will enable users to maintain cleaner commit histories, manage complex workflows, and handle more advanced project requirements.
PAST: Delivering Code-Heavy Presentations with Markdown
Video not available.
Presenter: Ramses van Zon, SciNet
There is no shortage of tools available for producing presentations in various formats, each with its own strengths and limitations. However, technical presentations containing substantial amounts of code present additional challenges and demand specific features. Drawing from extensive experience in delivering technical presentations, this session will share key lessons learned in creating polished, clear, and maintainable slide decks. Markdown, a simple but versatile text-based format, will be demonstrated to be an effective input format to achieve these goals. This presentation will also provide a curated overview of markdown-based tools that can turn markdown input files into PDF or HTML presentations.
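As a small sketch of the idea (file names and title are hypothetical), a markdown slide source with a fenced code block can look like this:

````markdown
% Code-Heavy Slides in Markdown
% A. Presenter

# A slide with code

```python
def greet(name):
    return f"Hello, {name}!"
```

Text after the code renders as regular slide content.
````

With pandoc and a LaTeX installation, `pandoc -t beamer slides.md -o slides.pdf` turns a file like this into a PDF deck, while `pandoc -t revealjs` produces an HTML presentation instead; these are two of the markdown-based tools of the kind the talk surveys.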
PAST: Parallel Programming: MPI I/O Basics
Presenter: Jemmy Hu, SHARCNET
MPI-IO is a set of extensions to the MPI library that enables parallel, high-performance I/O operations. It provides a parallel file-access interface that allows multiple processes to read from and write to the same file simultaneously. MPI-IO allows for efficient data transfer between processes and enables high-performance I/O operations on large datasets. It also provides additional features such as collective I/O, non-contiguous access, and file locking. This seminar will cover some basic features of MPI I/O.
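The core idea behind calls such as `MPI_File_write_at` is that each rank owns a disjoint byte range of one shared file, computed from its rank, so all ranks can write concurrently without coordination. The serial sketch below (plain Python, no MPI; the "ranks" run in a loop purely to illustrate the offset arithmetic) shows that layout:

```python
# Serial analogy of rank-offset writes to a single shared file.
import os
import tempfile

RECORD = 16   # bytes each "rank" writes
nranks = 4

path = os.path.join(tempfile.mkdtemp(), "shared.dat")

for rank in range(nranks):
    data = f"rank {rank:02d} data".ljust(RECORD).encode()
    # rank 0 creates the file; the others open it for in-place update
    with open(path, "r+b" if rank else "w+b") as f:
        f.seek(rank * RECORD)   # offset = rank * record size
        f.write(data)

with open(path, "rb") as f:
    contents = f.read()

assert len(contents) == nranks * RECORD
```

In real MPI-IO the ranks perform these writes simultaneously, and "file views" generalize this fixed-offset scheme to non-contiguous access patterns.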
PAST: Introduction to Data Preparation
Video not available.
Presenter: Shadi Khalifa, CAC
This presentation provides you with the essential knowledge to effectively prepare data for analysis. Starting with an overview of the Data Analytics pipeline and its processes, you will explore various statistical and visualization techniques used in Exploratory and Descriptive Analytics to understand historical data. You will then delve into the art of Data Preparation, gaining expertise in cleaning data, handling missing values, detecting and handling outliers, and transforming and engineering features. By the end of the presentation, you will have a general understanding of the tools necessary to ensure data quality and integrity, enabling you to make informed decisions and derive valuable insights from your data.
PAST: Introspection for Jobs: in-job monitoring of performance
Presenter: Mark Hahn, SHARCNET
Several types of performance data are collected while a job is running; ultimately, much of this winds up in portals where it can be examined. The same data is also available to the job itself, while it runs, and can provide additional insight. Here, we'll demonstrate scripts that can execute within the job context to, for instance, provide summaries of performance at the job's end or within sections of the job.
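One minimal sketch of this kind of introspection (not the presenter's actual scripts) is to have the job report its own CPU time and peak memory at the end of a section, using Python's Unix-only `resource` module:

```python
# Sketch: a process summarizes its own resource usage from within the
# job, the same idea as calling a summary script at the end of a
# batch job section. resource is Unix-only.
import resource

def job_summary():
    """Return CPU time and peak memory for the current process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user_cpu_s": ru.ru_utime,
        "system_cpu_s": ru.ru_stime,
        "peak_rss_kb": ru.ru_maxrss,  # kilobytes on Linux
    }

# Do a little work, then report on it.
_ = sum(i * i for i in range(100_000))
stats = job_summary()
```

The same idea scales up to reading per-job counters from the scheduler or from /proc inside the job's context.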
PAST: Multi-dimensional arrays in C++
Presenter: Paul Preney, SHARCNET
The C++ 2023 standard has std::mdspan, which provides a lightweight, non-owning multidimensional view of a contiguous single-dimensional array. This enables the reinterpretation of an underlying contiguous array as a multidimensional array, with support for different memory layouts (e.g., C and Fortran) and different ways of accessing elements (e.g., directly, using atomics, etc.). As in other programming languages, subset views ("slices") are also possible. One does not need the latest compiler tools or C++ standard: a reference implementation of mdspan has been backported to C++17 and C++14 and also works with NVIDIA's CUDA and NVHPC tools. This talk will discuss how to use mdspan in C++ programs.
PAST: Debugging and Optimization of PyTorch Models
Presenter: Colin Wilson, SHARCNET
Deep learning models are often viewed as uninterpretable "black boxes". As researchers, we often extend this thinking to the memory and compute utilization of such models. Using PyTorch Profiler, we can identify model bugs and bottlenecks and understand how to improve model performance from an efficiency perspective. This improves training scaling and allows large hyperparameter optimizations to complete more efficiently. Here we will discuss the usage of PyTorch Profiler, including case studies of real training examples, and possible optimizations based on profiler results.
PAST: Using machine learning to predict rare events
Presenter: Weiguang Guan, SHARCNET
In some binary classification problems, the underlying distribution of positive and negative samples is highly imbalanced. For example, fraudulent credit card transactions are rare compared to the volume of legitimate transactions. Training a classification model in such a case needs to take the skewed distribution into account. In this seminar, we will develop a fraud detector that can be used to screen credit card transactions, and we will describe the methods used to handle training on imbalanced data.
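One common remedy of the kind the seminar may cover (the abstract does not name the specific methods) is inverse-frequency class weighting, so that errors on the rare fraud class cost more than errors on the common class. A minimal sketch:

```python
# Sketch of inverse-frequency class weights: weight[c] is large when
# class c is rare, so the loss penalizes mistakes on rare classes more.
from collections import Counter

def class_weights(labels):
    """weight[c] = n_total / (n_classes * n_c) (the 'balanced' rule)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}

# 2 fraudulent transactions among 100 total
labels = [1, 1] + [0] * 98
weights = class_weights(labels)  # fraud class gets a much larger weight
```

These weights can then be passed to most training losses (e.g. a weighted cross-entropy); oversampling the rare class or undersampling the common one are alternative remedies.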
PAST: Diagnosing Wasted Resources from User Facing Portals on the National Clusters
Presenter: Tyler Collins, SHARCNET
Researchers often leave resources on the table when specifying their job requirements on the national systems. This talk builds on previous sessions and uses the Digital Research Alliance of Canada's User Facing Portals to explore what different types of jobs look like when they waste resources. Demonstrations will include interactive jobs, parallel jobs, GPU workflows, and more. With more accurate job specifications, researchers can expect shorter wait times and higher throughput on any general-purpose system.
PAST: The Emergence of WebAssembly (Wasm) in Scientific Computing
Presenter: Armin Sobhani, SHARCNET
Developed collaboratively by major browser vendors, including Mozilla, Google, Microsoft, and Apple, WebAssembly (Wasm) addresses the limitations of traditional web programming languages like JavaScript. But what makes it so compelling for scientists? First, Wasm allows code written in languages like C/C++, Fortran or Rust to be compiled into its instruction format and run directly in the browser, making it accessible to anyone without installation hassles and eliminating the need for external servers. Second, with Wasm, developers can recycle existing code with near-native performance but without the hassle of rewriting it in JavaScript. Join us as we explore how Wasm is reshaping scientific workflows and empowering researchers worldwide.
PAST: Exploring Compute Usage from User Facing Portals on the National Clusters
Presenter: James Desjardins, SHARCNET
Previous seminars in this series have described using Python tools to explore job properties and usage characteristics on the Digital Research Alliance of Canada general purpose compute clusters. The end goal of exploring job properties and usage characteristics is to get the most out of the resources available to research accounts and to minimize wait times in the job queue. This seminar reviews important properties of the scheduling configuration that may impact research throughput, then demonstrates how portals can help explore the relevant properties of research account compute usage on the clusters.
PAST: Bioinformatics: Advancements and challenges in the era of big data analysis
Presenter: Sridhar Ravichandran
Advancements in sequencing technologies have revolutionized the biological sciences. Next Generation Sequencing (NGS) approaches are now a routine part of biological research, generating staggering amounts of data. This rapid growth poses significant challenges in data acquisition, storage, and distribution. In addition, large-scale data analysis within national and international collaborations requires portable, scalable, and reproducible computational analysis. This webinar will cover best practices, including workflow management for pipeline development and handling software installations and databases, to run on different compute platforms and enable workflow portability and sharing.
PAST: CO Summer School 2024
Presenter: Pawel Pomorski, SHARCNET
In this colloquium, we will present the curriculum of the 2024 Compute Ontario Summer School, to be held from the 3rd to the 21st of June. Jointly organized by the Centre for Advanced Computing, SciNet, and SHARCNET, in collaboration with the Research Data Management Network of Experts, the school offers around 40 free courses, from introductory to advanced levels, delivered by experts in the field and covering a wide range of topics.
PAST: C++ Modules
Presenter: Ramses van Zon, SciNet
Modules are source code components with a well-defined interface, such that they can be reused in other code without requiring knowledge of the implementation or exposing its internals. In this talk, we will discuss how modules can be supported in C++. We will see that for most of its existence, C++ had to use the "header file plus object files" paradigm inherited from C to support modules, until proper C++-based modules were introduced in the C++20 standard. Unfortunately, few C++ compilers fully support C++20 modules, and their implementations and usage vary considerably. We will give an overview not only of the issues but also of what is possible with current compilers.
PAST: Data Wrangling with Tidyverse (part 2)
Presenter: Tyson Whitehead, SHARCNET
Tidyverse is a cohesive set of packages for doing data science in R. In an earlier talk, we began reviewing the data munging portions of tidyverse (dplyr, forcats, tibble, readr, stringr, tidyr, and purrr) by using it to reconstruct the data hierarchy in a 500-page reference PDF given only the words on each page and their bounding boxes. This talk will complete that work. If you have not seen the first part, or wish to review it, you can find it here: https://www.youtube.com/watch?v=8_Q-WwqY_Og For completeness, we also covered the graphical portion of tidyverse (ggplot) here: https://www.youtube.com/watch?v=PR2Rs0W4zYg
PAST: How to Buy a Supercomputer for Scientific Computing
Presenter: James Willis, SciNet
We will discuss how to set criteria for selecting a replacement for an existing advanced research computing cluster, and in particular, which criteria were chosen for Niagara, Canada's large parallel computing cluster.
PAST: Accelerating data analytics with RAPIDS cuDF
Presenter: Nastaran Shahparian, SHARCNET
Pandas is renowned as the go-to library for data manipulation and analysis in Python and is widely adopted in machine learning. However, Pandas can be slow. With the introduction of NVIDIA cuDF.pandas, the accelerated power of GPUs is integrated into Pandas, enabling faster processing without the need for any code changes. A live demo will showcase this enhancement on the clusters.
PAST: Accelerating graph analysis on GPUs
Presenter: Jinhui Qin, SHARCNET
Graph analysis plays a critical role in many applications across various domains, ranging from social network analysis and bioinformatics to fraud detection, cybersecurity, and recommendation systems. NetworkX is the go-to library for graph analysis in Python. However, as dataset and graph sizes grow, the performance of NetworkX becomes a significant concern. This webinar introduces NVIDIA cuGraph for accelerating graph analysis on GPUs. Moreover, a recent integration of NetworkX with cuGraph, named nx-cugraph, allows accelerating NetworkX workflows on GPUs with zero code changes. A live demo will be given on the clusters.
PAST: Make: obsolete or elegant?
Presenter: Mark Hahn, SHARCNET
Make is a classic Unix development tool, which may seem archaic and narrow-purpose. But if you think of it as a declarative, parallelized workflow automation tool, it sounds more relevant. We'll consider stereotypical use of make, then its general properties, and show some interesting examples of make applied to unusual uses.
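The "declarative, parallelized workflow" view can be sketched with a small Makefile (file and tool names here are hypothetical): two independent analysis steps that `make -j2` will run in parallel, plus a combining step that waits for both.

```makefile
# Hypothetical workflow: make infers that stats_a.txt and stats_b.txt
# are independent, so `make -j2` runs them in parallel; report.txt is
# rebuilt only when an input changed.
all: report.txt

stats_a.txt: data_a.csv
	./analyze data_a.csv > stats_a.txt

stats_b.txt: data_b.csv
	./analyze data_b.csv > stats_b.txt

report.txt: stats_a.txt stats_b.txt
	cat stats_a.txt stats_b.txt > report.txt
```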