In conversation with Marcelo Ponce about his COVID19.Analytics R package

Data is integral to research and is playing a critical role in helping our health system planners and response teams combat the COVID-19 pandemic. We spoke to Marcelo Ponce, Applications Analyst, SciNet HPC, about the COVID19.Analytics R package he developed to give researchers access to the latest worldwide data about the pandemic. The package consolidates the latest data as reported by international health organizations and lets researchers run statistical analyses and build visualizations on live data to gain insights into the spread of COVID-19.

Can you tell us more about the COVID19.Analytics R package?

I developed the package to help researchers get quick access to the latest worldwide COVID-19 data. It is designed to detect and warn users about inconsistencies, glitches or “spurious values” in the collected data, which are common given how rapidly the situation is evolving.
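To make the idea of “spurious values” concrete, here is a minimal Python sketch of one plausible consistency check on a cumulative case series. It is an illustration only and not the package's own R code: the function name `find_spurious_values`, the two rules it applies (negative counts and a cumulative total that drops) and the sample numbers are all assumptions made for this example.

```python
def find_spurious_values(cumulative_counts):
    """Return (index, reason) pairs for suspicious entries in a cumulative series.

    Hypothetical helper for illustration; not part of the COVID19.Analytics package.
    """
    issues = []
    previous = None  # last non-negative value seen so far
    for i, value in enumerate(cumulative_counts):
        if value < 0:
            # A count of cases can never be negative.
            issues.append((i, "negative count"))
            continue
        if previous is not None and value < previous:
            # A cumulative total should never go down from one day to the next.
            issues.append((i, "cumulative total decreased"))
        previous = value
    return issues


# Made-up daily cumulative totals containing two glitches.
reported = [10, 25, 40, 38, 55, -3, 70]
for index, reason in find_spurious_values(reported):
    print(f"day {index}: {reason} (value={reported[index]})")
```

A check like this only flags suspicious entries and leaves the decision about how to treat them to the user.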

Among other things, the package also offers a set of analysis and visualization tools to help the user digest and make sense of the data. It has some basic functionality to model the spread of the disease, and I add more features when a user reaches out to me with a specific request, provided I have the chance and the time. One of the most recent additions to the package is up-to-date data from the city of Toronto.

Who can use the package, and how?

The COVID19.Analytics R package is open source, so any researcher using the R language can use it. Some functions are basic, so even users with foundational knowledge can use them. Advanced users who would like to dive into the code or modify and update functions can do so too.

How many researchers have used this package, and can you tell us a little about how it has been used?

Since its original release on April 9th, 2020, more than 5,000 users have downloaded the package, according to the CRAN network repository, and it has averaged around 2,500 downloads a month. This link shows a plot, updated daily, of the number of times the package has been downloaded.

The package has been used in several interesting projects, such as:

  • An ecologist on The Nature Conservancy’s LANDFIRE team, based in Marquette, U.S., is developing a website, ‘covid19 explorer’, that uses the package to visualize cases and extract data.
  • I helped a professor at the Instituto Federal de Educação, Ciência e Tecnologia do Rio Grande do Norte in Brazil upload data from his city so he can use the package to model the spread of the virus there. If data is structured in a particular form, as a “time series,” then many of the functions available in the package will work for a user.
  • I have also been contacted by the following faculties and teams about using the package:
    • Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina
    • Facultad de Ciencias, Universidad de Granada – Computational Epigenomics Lab
    • CoronaWhy (https://www.coronawhy.org/), a globally distributed, volunteer-powered research organization that is trying to assist the medical community’s ability to answer critical questions related to COVID-19, is using the package to make the latest data sets findable and accessible for the medical science community.

How can we support the use of the package?

Spreading the word about the package within the research and advanced research computing (ARC) communities would be great!

Anything else you would like to add?

From my perspective and personal experience, supporting the idea of, and the opportunity to develop, a package like this is essential. I would like to thank my team at SciNet HPC, home of the largest supercomputer available to academics in Canada and the main hub for ARC at the University of Toronto, for supporting me in this instance.

We at SciNet have the opportunity to interact with many different departments, students and researchers. We not only teach and train them in ARC and HPC techniques but also learn from them and, in many cases, establish collaborations that benefit the scientific community. None of this would be possible without the support of the whole team at SciNet. I’d say this is already remarkable under normal circumstances, but being able to continue serving scientists, students and the entire research community during such unprecedented times deserves additional recognition.

Without the continued support from my colleagues at SciNet, it wouldn’t have been possible to develop this package, so a big thanks to them!