Friday, March 17, 2017 Happy St. Patrick's Day! - WestGrid

Transcription

Happy St. Patrick’s Day!
WestGrid Town Hall
Patrick Mann, Director of Operations
Sergiy Stepanenko, University of Saskatchewan, Site Lead
Erin Trifunov, Manager, Projects & Outreach
Friday, March 17, 2017

Introduction & Outline
1. New System Updates
2. WestGrid Legacy Systems Migration
3. Silo Decommissioning Update
4. National Updates
5. WestGrid Updates

Admin
To ask questions:
  Webstream: Email info@westgrid.ca
  Vidyo: Un-mute & ask question or use Vidyo Chat (chat bubble icon in Vidyo menu)
Vidyo Users: Please MUTE yourself when not speaking

WestGrid Legacy Systems Migration
Patrick Mann
Director of Operations
WestGrid

National Compute Update
System | Status | In-production Estimate
Arbutus (GP1, UVic) | West.cloud.computecanada.ca: 7,640 cores. UPDATE: migrating Ceph to new storage | DONE (Sep, 2016)
Cedar (GP2, SFU) | Racks and servers installed, cabling in progress | Mid-April 2017
Graham (GP3, Waterloo) | Renovations complete (end of January). Delivery in progress | Late-April 2017
Network | Master agreements being signed. Delayed | May 1, 2017
Parallel FS | Lustre. Available for installation. | DONE
Scheduler | Open-source Slurm with commercial support. Working on policy details (led by WG) | Acquired.
Niagara (LP1, Toronto) | RFP issued. Site visits (Toronto) this week. RFP closes April 24. | Late 2017

National Storage Update
System | Status | In-production Estimate
Silo Interim | Waterloo: migration complete. SFU: migration complete | Available
NDC-SFU | Waiting for delivery of SBB4’s | Mid April 2017
NDC-Waterloo | 13 PB of SBB’s delivered. | Early April 2017
NDC - Object Storage | Object Storage. DDN WOS. Lots of demand but not allocated. Initial prototype for internal testing installed on cloud. | Summer 2017
Attached Scratch | High performance storage attached to clusters | Available with the clusters
NDC: “National Data Cyberinfrastructure”

IMPORTANT UPDATES
Site | System(s) | Previous defunding date | NEW defunding / data deletion date
Victoria | Hermes/Nestor | March 31, 2017 | June 1, 2017
Calgary | Breezy/Lattice | March 31, 2017 | August 31, 2017
Edmonton | Hungabee/Jasper | March 31, 2017 | October 1, 2017
IMPORTANT: Data on defunded systems will be deleted after the published deletion date. WestGrid will not retain any long term or back-up copies of user data and as noted above users must arrange for migration of their data.
WestGrid Migration Details: https://www.westgrid.ca/migration process
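As a rough planning aid, the new deletion dates above can be turned into a count of days remaining. This is a minimal sketch, not a WestGrid tool: the names `TOWN_HALL`, `DELETION_DATES`, and `days_remaining` are hypothetical, and the dates are copied from the table on this slide.

```python
from datetime import date

# Date of this town hall, used as the default "today" (assumption for illustration).
TOWN_HALL = date(2017, 3, 17)

# New defunding / data deletion dates, copied from the slide.
DELETION_DATES = {
    "Hermes/Nestor": date(2017, 6, 1),
    "Breezy/Lattice": date(2017, 8, 31),
    "Hungabee/Jasper": date(2017, 10, 1),
}


def days_remaining(today=TOWN_HALL):
    """Return the number of days from `today` to each system's deletion date."""
    return {
        system: (deadline - today).days
        for system, deadline in DELETION_DATES.items()
    }


if __name__ == "__main__":
    for system, days in days_remaining().items():
        print(f"{system}: {days} days until data deletion")
```

Passing a different `today` (for example `date.today()`) gives an updated count; a negative value means the deletion date has already passed.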

WestGrid 2017 Migration Schedule
Timeline, March to September 2017. Events shown on the slide:
  Emails sent with timeline and further instructions
  RAC 2017 award letters sent
  New systems available
  WestGrid legacy systems defunding starts
  RAC 2016 ENDS - RAC 2017 implemented*
  UVic - Hermes & Nestor defunded
  UofC - Breezy & Lattice defunded
  UofA - Jasper & Hungabee defunded
*Please note the 2017 implementation date is TBC.

2018 Defunding Schedule
Site | System(s) | Defunding date* | Current Status
Calgary | Parallel | Mar 31, 2018 | Shared storage
Vancouver - UBC | Orcinus | Mar 31, 2018 | Available but with conditions
Winnipeg | Grex | Mar 31, 2018 | New storage coming
Vancouver - SFU | Bugaboo | Mar 31, 2018 | Storage support issues
*Please note these dates may be subject to change.
WestGrid Migration details: https://www.westgrid.ca/migration process

Who needs to migrate?
Any user* of a WestGrid system scheduled to be “defunded” must move any stored data and/or compute use to another system BEFORE the system’s defunding date.
Please move your data well in advance of the defunding date to avoid network bottlenecks with file transfers.
*Note: Some sites may continue operating systems for local use only. (more details in following slides)
WestGrid Migration details: https://www.westgrid.ca/migration process

Future Use after Defunding
UofA: Will be available for users affiliated with the UofA (1-3 years)*. Contact: research.support@ualberta.ca
UofC: Contact: support@hpc.ucalgary.ca
UVic: Contact: sysadmin@uvic.ca

How to migrate?
1. Delete unneeded files.
2. Archive and compress files.
3. Review recommendations by systems.
4. Transfer files to new system
   a. see General Directives for Migration
5. Email support@westgrid.ca
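The "archive and compress" step above could be sketched as follows, assuming Python 3 is available on the source system. The helper names `archive_directory` and `demo` and the sample paths are hypothetical illustrations, not part of WestGrid's migration directives, which remain the authoritative guide.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_directory(src_dir, archive_path):
    """Pack src_dir into a gzip-compressed tar file and return the stored names."""
    src = Path(src_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the absolute path.
        tar.add(str(src), arcname=src.name)
    # Re-open the archive to confirm what was actually written.
    with tarfile.open(str(archive_path), "r:gz") as tar:
        return sorted(tar.getnames())


def demo():
    """Archive a throwaway directory and list the archive members."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        project.mkdir()
        (project / "results.txt").write_text("sample output\n")
        return archive_directory(project, Path(tmp) / "project.tar.gz")


if __name__ == "__main__":
    print(demo())
```

Packing many small files into one archive generally means fewer individual files to transfer, which helps when the network is busy close to a defunding date.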

Where to Migrate?
1. Users with RAC 2017 allocations: move to the 2017 system.
   a. Legacy system: can move anytime (soon is good!).
   b. Cedar / Graham: once 2017 allocations are implemented (TBC)
2. General Users: opportunistic use / Rapid Access Service
   Try to choose a suitable system for your application (Serial, Parallel, ...)
(More details from Erin in the RAC section of the town hall)

Available Resources
2017-18 Transition year - Mix of Legacy & New Systems
STORAGE: 13 Allocatable systems (3 in WestGrid & SFU NDC). NDC available mid-April
COMPUTE / CLOUD: 18 Allocatable systems (4 in WestGrid, Cedar & Cloud)
6 Allocatable GPU systems (Parallel in WestGrid & Cedar)
ALSO: Opportunistic Use only on some legacy systems
Consider:
  Types of jobs (serial vs parallel)
  Memory requirements
  RAC 2017 (upcoming slides)

Questions?
Questions about Migration?

Silo Decommissioning Update
Sergiy Stepanenko
WestGrid Site Lead
University of Saskatchewan

Silo Migration Stats
Silo to Waterloo completed Jan. 11, 2017: 85M files, 850TB, 140 Users. Note: Only large RAC allocated groups from Silo
Silo to SFU completed March 9th, 2017: 103M files, 560TB, 4381 Users. Note: majority of Silo users
Ongoing transfers:
  From ONC to Waterloo
  From UofC to Waterloo
  From CANFAR to SFU

Interim Silo Storage I
Transfer to Waterloo and SFU is COMPLETE.
Silo interim storage “Storage Building Block” (SBB):
  Waterloo: dtn2.sharcnet.ca - NFS based storage cluster
  SFU: dtn.sfu.computecanada.ca - Gluster based storage cluster
Users login similar to Silo logins, but using National LDAP accounts. For details 16:User Accounts and Groups
Backed up to the new tape systems. Backup currently is up to date in SFU and Waterloo (each site data ONLY)

Interim Storage Solution II
This is an interim solution for Silo data.
A second migration to the final storage system will be required - likely summer 2017.
Silo decommissioning starts on March 23, 2017
Support for National Data Cyberinfrastructure and Silo Interim being provided by USask team

TSM backups
Total size of TSM at UofS: 3,587 TB
We switched 11 external clients from UofS to SFU
Total backup for those 11 nodes at UofS: 220 TB
DTN.SFU.COMPUTECANADA.CA (SFU) TSM usage: 475TB
DTN2.SHARCNET.CA (UofW) TSM usage: 965TB
We have a complete backup for transferred data, and external clients are on the way to having full backups in a couple of days.

Questions?
Questions for Sergiy?

National & Regional Updates
Erin Trifunov
Manager, Projects and Outreach
WestGrid

Account Renewals
Must renew by April 13, 2017
If no action taken by April 13, your account will automatically expire and be deactivated.
Light-weight process this year -- no CCV required!
Sponsored accounts (e.g. PDFs, grad students, etc.) must also be renewed. (If no longer needed can let them expire).
See Renewals FAQ on Compute Canada website or email renewals@computecanada.ca for more info.

Renewals
Background color of boxes will change from RED to GREEN when required steps are complete.
Must check THIS BUTTON when all required steps are completed to SUBMIT RENEWAL REQUEST.

Resource Allocations (RAC)
2017 award letters being sent next week
Implementation date TBC
  Dependent on availability of new systems -- Cedar and Graham
  Confirm implementation dates by email early April
2016 RAC priorities continue until 2017 allocations implemented
Questions?
  Frequently Asked Questions (Compute Canada website)
  rac@computecanada.ca

More resources, more need

Resource Allocation - 2017
Resource | 2017 Requests | 2016 Requests | % Change
Compute - CPU-years | 256,000 | 238,000 | 7.5%
Compute - GPU-years | 2,660 | 1,357 | 96%
Storage (TBs) | 55,000 | 28,660 | 92%

Resource | 2017 Requested Fraction Available | 2016 Requested Fraction Available
Compute - CPU | 54%* | 54%
Compute - GPU | 38% | 20%
Storage | 90% | 90%
* 54% in 2017 includes 50k new cores with better performance
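The "% Change" column can be reproduced from the request totals with a short calculation. This is a sketch for checking the arithmetic only; `REQUESTS` and `percent_change` are hypothetical names, and the input figures are copied from the slide. The computed values come out at roughly 7.6, 96.0 and 91.9 percent, close to the rounded figures shown.

```python
# (2017 requests, 2016 requests), copied from the slide.
REQUESTS = {
    "Compute - CPU-years": (256_000, 238_000),
    "Compute - GPU-years": (2_660, 1_357),
    "Storage (TBs)": (55_000, 28_660),
}


def percent_change(new, old):
    """Year-over-year change, expressed as a percentage of the older value."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for resource, (req_2017, req_2016) in REQUESTS.items():
        print(f"{resource}: {percent_change(req_2017, req_2016):.1f}%")
```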

RAC & Migration
Reminder: 2016 RAC priorities CONTINUE until 2017 awards are IMPLEMENTED (TBC -- April)
If you have 2016 RAC on a to-be-defunded system and are moving to:
  Legacy system: can move anytime (soon is good!) But 2017 priorities won’t take effect yet (date TBC)
  Cedar / Graham: wait until 2017 allocations are implemented (TBC)
  Or can use opportunistic use on ANY system

Other Resource Options
No 2017 RAC? Don’t worry
  20% of resources are UNALLOCATED, available to any user
  Rapid Access Service (RAS) on Cedar or Graham (available mid-April)
  Opportunistic use on regional legacy systems
  Also out-of-round requests (new PIs or projects)

CC Staff Awards of Excellence
Have any ARC staff gone above and beyond to support your research?
Nominate an individual or team from any region and help us recognize & celebrate those who help Canada’s research community achieve bigger, better, and faster results!
Submissions due April 21, 2017
www.computecanada.ca/events/awards

HPCS
Registration Now Open (Early Bird 225 - ends April 30)
http://2017.hpcs.ca
Submission Deadlines:
  Papers & Posters due April 17
  SEEING BIG data visualization showcase entries due May 31

WestGrid Updates
WestGrid User Training

Training Sessions
Date | Topic | Target Audience
March 29 | Using ParaViewWeb for 3D Visualization and Data Analysis in a Web Browser | Anyone
May 04 | Data Visualization Workshop, University of Calgary | Anyone (in person)
June 05 - 15 | Training Workshops / Seminar Series on using ARC in Bioinformatics, Genomics, etc. | Researchers in Bioinformatics, Genomics, Life Sciences, etc.
June 19 - 22 | WestGrid Research Computing Summer School - University of British Columbia | Anyone (in person)
July 24 - 27 | WestGrid Research Computing Summer School - University of Saskatchewan | Anyone (in person)
Full details online at www.westgrid.ca/training

Support
Contact us putecanada.ca

Questions?
  Webstream viewers: Email info@westgrid.ca
  Vidyo viewers: Un-mute & ask question or use Vidyo Chat (chat bubble icon in Vidyo menu)
