Summary

Canada’s national advanced research computing (ARC) platform is delivered through the Compute Canada Federation (CCF), which is a partnership of Compute Canada, regional organizations (WestGrid, Compute Ontario, Calcul Québec and ACENET) and institutions across Canada. Providing researchers with access to the infrastructure and expertise they need to accomplish globally competitive, data-driven, transformative research, it serves the needs of nearly 14,000 researchers, including over 4,100 faculty based at Canadian institutions as of January 1, 2019.

Recent investments have enabled a renewal of Canada’s national ARC platform. The incorporation of the new systems Cedar (Simon Fraser University), Graham (University of Waterloo), Niagara (University of Toronto) and Béluga (Calcul Québec) yielded over 90PB of new online storage capacity and approximately 170,300 core years.

However, the dual challenge of retiring legacy systems and ongoing growth in researcher demand means that demand continues to outstrip supply. The 2019 RAC competition received the highest number of applications in its history, with 507 projects applying for an allocation, 8.1% more than in 2018. As a result, this year’s RAC was only able to award 40% of the total compute requested, 86% of the total storage requested (see Table 5), and 20% of the total GPUs requested. Virtual CPUs and cloud allocations are included in the total compute allocation, and this year’s RAC was able to allocate 95% of the total virtual CPUs requested.

While 80% of the resources available through the CCF are allocated through the Resource Allocation Competition (RAC), the CCF reserves 20% for researchers to use through the Rapid Access Service (RAS). All users have access to these modest quantities of compute, storage and cloud resources as soon as they have a Compute Canada account.

If you have questions about the terminology used in this page, please consult the Technical Glossary.

Table 1: Applications submitted to the Resource Allocation Competitions

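| Year | Total | Year-on-Year Increase |
| --- | --- | --- |
| 2019 | 507 | 8.1% |
| 2018 | 469 | 14.7% |
| 2017 | 409 | 11.7% |
| 2016 | 366 | 4.6% |
| 2015 | 350 | 20.3% |
| 2014 | 291 | 37.9% |
| 2013 | 211 | 32.7% |
| 2012 | 159 | 17.8% |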
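
The year-on-year figures in Table 1 follow directly from the application counts. As a minimal sketch, the Python below recomputes them. The counts are copied from the table, and the helper name `year_on_year_increase` is chosen here for illustration only. The 2012 figure cannot be reproduced this way because the table does not list the 2011 count.

```python
# Application counts per competition year, copied from Table 1.
applications = {2012: 159, 2013: 211, 2014: 291, 2015: 350,
                2016: 366, 2017: 409, 2018: 469, 2019: 507}

def year_on_year_increase(counts, year):
    """Percentage increase in applications over the previous year."""
    previous = counts[year - 1]
    return (counts[year] - previous) / previous * 100

# Skip the first year, which has no predecessor in the table.
for year in sorted(applications)[1:]:
    print(f"{year}: {year_on_year_increase(applications, year):.1f}%")
```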

Computational Resources

CPU Allocations

Based on available computing resources, RAC 2019 met 40% of CPU (core year) requests, nearly 15 percentage points less than in 2018. New systems (Cedar, Graham, Niagara and Béluga), which are faster and have more memory than older systems, provided nearly 85% of the available capacity, or approximately 170,300 cores. This resulted in a modest increase in available cores.

The new systems were allocated at close to 80% capacity, leaving approximately 20% capacity for new users and smaller development projects to access resources via RAS.

Table 2: 2019 Compute Allocations per System

| CPU Resource | Supply: Allocatable Core Years (100% capacity) | Need: Total Core Years Requested | Provided: Total Core Years Allocated | % of CPU Capacity Allocated |
| --- | --- | --- | --- | --- |
| Béluga | 28,000 | 47,283 | 19,656 | 70% |
| Cedar | 53,888 | 107,533 | 44,239 | 82% |
| Graham | 28,448 | 65,743 | 23,166 | 81% |
| MP2 | 30,984 | 41,034 | 13,815 | 45% |
| Niagara | 60,000 | 128,759 | 56,385 | 94% |
| Total | 201,320 | 390,352 | 157,262 | 78% |

As of February 18, 2019

Table 3: Historical Compute Ask vs. Allocation

| Year | Supply: Allocatable CPU Core Years | Need: Total Core Years Requested | Provided: Total Core Years Allocated | Shortfall Capacity Core Years | % of the Demand Awarded |
| --- | --- | --- | --- | --- | --- |
| 2019 | 201,320 | 390,352 | 157,262 | 233,089 | 40.3% |
| 2018 | 211,020 | 287,957 | 158,632 | 129,325 | 55.1% |
| 2017 | 182,760 | 255,63? | 148,10? | 107,538 | 57.9% |
| 2016 | 155,952 | 237,862 | 128,463 | 109,399 | 54.0% |
| 2015 | 161,888 | 191,690 | 123,699 | 67,991 | 64.5% |
| 2014 | 190,466 | 172,989 | 133,508 | 39,481 | 77.2% |
| 2013 | 187,227 | 142,106 | 126,677 | 15,429 | 89.1% |
| 2012 | 189,024 | 103,845 | 87,312 | 16,533 | 84.1% |
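
The last two columns of Table 3 are derived from the requested and allocated totals. The sketch below shows that arithmetic for two rows of the table. The function name `shortfall_and_share` is an illustrative choice, not CCF terminology.

```python
def shortfall_and_share(requested, allocated):
    """Return (unmet core years, percentage of the request that was allocated)."""
    return requested - allocated, allocated / requested * 100

# 2018 and 2016 rows of Table 3: (requested, allocated) in core years.
rows = {2018: (287_957, 158_632), 2016: (237_862, 128_463)}

for year, (requested, allocated) in rows.items():
    shortfall, share = shortfall_and_share(requested, allocated)
    print(f"{year}: shortfall {shortfall:,} core years, {share:.1f}% awarded")
```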

GPU Allocations

The allocation of GPU resources was more competitive this year than the allocation of CPUs. As Table 4 shows, requests for GPUs have increased nearly fivefold since 2016 (from 1,357 to 6,555 GPU years). In 2019, 688 new GPU devices became available as part of the Béluga cluster. However, due to increased demand for GPU resources, the allocation success rate in 2019 was 20.3%, slightly lower than the 20.5% in 2018.

Table 4: Historical GPU demand vs. supply (GPU years)

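| Year | Supply: Allocatable GPUs | Need: GPUs Requested | Provided: Total GPUs Allocated | Shortfall Capacity GPUs | % of the Need Awarded |
| --- | --- | --- | --- | --- | --- |
| 2019 | 1,664 | 6,555 | 1,331 | 5,224 | 20.3% |
| 2018 | 976 | 4,092 | 840 | 3,252 | 20.5% |
| 2017 | 1,420 | 2,790 | 1,047 | 1,743 | 37.5% |
| 2016 | 373 | 1,357 | 269 | 1,088 | 19.8% |
| 2015 | 482 | 608 | 300 | 308 | 49.3% |
| 2014 | n/a | 420 | 308 | 112 | 73.3% |
| 2013 | n/a | 390 | 259 | 131 | 66.4% |
| 2012 | n/a | 10 | 10 | 0 | 100% |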

Cloud Allocations

The Arbutus cluster at the University of Victoria has 24,960 allocatable virtual CPUs. These are available via RAC and RAS and are also utilized for internal Compute Canada services such as software development and hosting. A further 36 legacy nodes remain in service as part of Cloud East at l’Université de Sherbrooke, and relatively small cloud offerings are implemented alongside Cedar and Graham. RAC 2019 received a 56% increase in requests for virtual CPUs. Between Arbutus and the additional nodes at Cloud East (UdeS), Cedar (SFU) and Graham (Waterloo), this year’s RAC was able to allocate 95% of the total virtual CPUs requested. In total, cloud storage was allocated at 74% of its capacity for 2019.

Persistent Disk Storage Allocations

Storage integrated with Cedar, Graham, Arbutus, Niagara and Béluga yielded a total of approximately 100PB of online capacity available for 2019. Across all types of storage, the CCF was able to allocate 86% of the storage requested, which corresponds to about 75% of its available storage capacity (see Table 5).

Table 5: 2019 Storage Need vs. Supply by Storage Type (TB)

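| Storage Type | Supply (TB) | Need: Storage Requested (TB) | Provided: Storage Allocated (TB) | % of the Demand Awarded |
| --- | --- | --- | --- | --- |
| Project | 54,600 | 44,114 | 33,673 | 76% |
| dCache | 9,060 | 9,459 | 9,376 | 99% |
| Cloud | 5,184 | 4,802 | 3,850 | 80% |
| Nearline | 32,500 | 30,687 | 29,400 | 96% |
| Total | 101,344 | 89,063 | 76,299 | 86% |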
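
Table 5 reports allocations as a share of demand, while the Cloud Allocations section quotes cloud storage as a share of capacity. Both ratios come from the same row, as the short sketch below shows for the Cloud and Total rows. The function name `shares` is illustrative.

```python
def shares(supply, requested, allocated):
    """Return (percentage of demand awarded, percentage of supply allocated)."""
    return allocated / requested * 100, allocated / supply * 100

# Rows copied from Table 5: (supply, requested, allocated), all in TB.
rows = {
    "Cloud": (5_184, 4_802, 3_850),
    "Total": (101_344, 89_063, 76_299),
}

for name, row in rows.items():
    of_demand, of_supply = shares(*row)
    print(f"{name}: {of_demand:.0f}% of demand, {of_supply:.0f}% of supply")
```

The two percentages answer different questions: the first measures how well requests were met, and the second measures how fully the installed capacity was committed.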

Review Process

The majority of RAC applicants request resources to support research programs and highly qualified personnel (HQP) that are already funded through other Tri-Council and peer-reviewed sources.

Submissions are evaluated for the merit and feasibility of the application. For the 2019 competitions, there were 507 submissions to RAC. All of them received a technical evaluation, and the 239 that were new applications were also peer-reviewed.

Please find here the list of 2019 RAC Expert Review Committees.

| Technical Review | Technical Staff | Technical adjustments are made to ensure requests are compliant with policy and aligned with the technical capabilities of available resources. |
| Expert Review | Disciplinary peer review panel evaluates each proposal | Each proposal receives multiple independent reviews; Expert Review Committees meet to discuss the applications; the peer review panel may or may not recommend specific cuts for an application; the peer review panel gives an overall score. |

Scaling for Compute Requests

As described above, there were insufficient ARC resources to fully meet the allocations requested through RAC 2019.

As a result, a scaling function was applied to RAC 2019 to guide allocation decisions where capacity was insufficient. This function, which is endorsed by the RAC Chairs Committee, was set so that only applications with an overall score of 2.3 or higher (out of 5) received an allocation. Applicants who did not receive a compute allocation can still make opportunistic use of system resources via the Rapid Access Service.

Until this year, the scaling function took into account only the overall score. However, researchers raised concerns that this did not address differences in the size of requests. In response to this feedback, this year, the CCF implemented a more complex function that takes into account two variables: the overall score and the size of the request. For example, applications with large requests and low scores received larger cuts, while applications with low scores but smaller requests received smaller cuts.
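
The actual scaling function is not published on this page. The sketch below is only a hypothetical illustration of how a function of two variables could behave as described: nothing below the 2.3 cutoff, and deeper cuts for low-scoring, large requests than for low-scoring, small ones. The formula and the parameters `reference_size` and `size_weight` are assumptions made for this example. They are not the CCF implementation.

```python
SCORE_CUTOFF = 2.3  # from the text: minimum overall score (out of 5) for an allocation
MAX_SCORE = 5.0

def scaled_allocation(requested, score, reference_size=1000.0, size_weight=0.5):
    """Hypothetical scaling: the awarded fraction rises with score and falls with size.

    reference_size and size_weight are illustrative parameters, not CCF values.
    """
    if score < SCORE_CUTOFF:
        return 0.0
    # Score term: 0 at the cutoff, 1 at the maximum score.
    score_term = (score - SCORE_CUTOFF) / (MAX_SCORE - SCORE_CUTOFF)
    # Size term: near 0 for small requests, approaching 1 for very large ones.
    size_term = requested / (requested + reference_size)
    # The cut shrinks as the score rises and grows with the size of the request.
    cut = (1.0 - score_term) * (size_weight + (1.0 - size_weight) * size_term)
    return requested * (1.0 - cut)

# Same score, different request sizes (core years): the larger request keeps a smaller share.
for requested in (100, 10_000):
    awarded = scaled_allocation(requested, score=3.0)
    print(f"requested {requested:>6}: awarded {awarded / requested:.0%}")
```

In this toy form, a top-scoring application receives its full request regardless of size, and the size penalty only matters as the score approaches the cutoff.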

If you would like details of the mathematical implementation, please contact our RAC team.

Monetary Value of the 2019 Allocations

These values represent an average across the national ARC platform’s facilities and include total capital and operational costs incurred to deliver the resources and associated services. These are not commercial or market values. For the 2019 competition, the value of the resources allocated was calculated on a per-year basis using the following rates.

Table 6: Historical Financial Value of RAC Awards

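| Financial Value of Award | 2019 | 2018 | 2017 | 2016 | 2015 |
| --- | --- | --- | --- | --- | --- |
| 1 core year | $121.34 | $156.78 | $188.84 | $279.00 | $275.00 |
| 1 GPU year | $2,435.89 | $2,960.77 | $566.52 | $1,100.00 | $1,100.00 |
| 1 TB of project storage / year | $54.96 | $36.48 | $128.00 | $173.00 | $190.00 |
| 1 TB of nearline / year | $25.66 | NA | NA | NA | NA |
| 1 VCPU year | $80.93 | $91.05 | $40.50 | NA | NA |
| 1 TB of cloud storage (Ceph) / year | $117.70 | $236.81 | $178.50 | NA | NA |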

Costs for CPUs also reflect the inclusion of 30,984 legacy CPU cores valued at $279/year each. Capital costs are not included for legacy cores, only operational costs. The overall calculation methodology was improved from 2018 to be more accurate.

The valuation of these resources generally decreases over time as older, more expensive resources are retired and replaced with newer, more cost-effective ones.
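
As a rough sketch of how the 2019 rates in Table 6 translate into the value of an award, the snippet below multiplies each allocated quantity by its per-year rate and sums the results. The rates are copied from the table. The dictionary keys, the function name `allocation_value` and the example award are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# 2019 per-unit annual rates in dollars, copied from Table 6.
RATES_2019 = {
    "core_year": Decimal("121.34"),
    "gpu_year": Decimal("2435.89"),
    "project_tb": Decimal("54.96"),
    "nearline_tb": Decimal("25.66"),
    "vcpu_year": Decimal("80.93"),
    "cloud_tb": Decimal("117.70"),
}

def allocation_value(allocation, rates=RATES_2019):
    """Sum quantity x rate over every resource in the allocation."""
    return sum(rates[resource] * quantity for resource, quantity in allocation.items())

# Hypothetical award: 100 core years, 2 GPU years, 50 TB of project storage.
example_award = {"core_year": 100, "gpu_year": 2, "project_tb": 50}
print(f"${allocation_value(example_award):,.2f}")
```

`Decimal` is used instead of floating point so that the dollar amounts add up exactly to the cent.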