Scalable Parallel Computing Lab, SPCL @ ETH Zurich

22:08

How to find Relevant Items using Approximate Nearest Neighbor Search

32:44

Exascale Cloud Computing – A Foggy Tale of Networks, AI, Containers, and Ultra Ethernet

25:52

Swing: Short-cutting Rings for Higher Bandwidth Allreduce

14:33

Neural Graph Databases

20:16

HOT - Higher-Order Dynamic Graph Representation Learning with Efficient Transformers

14:02

LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems

15:50

Compressing Multidimensional Weather and Climate Data Into Neural Networks

25:05

VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores

22:41

Motif Prediction with Graph Neural Networks

06:18

Demystifying Chains, Trees, and Graphs of Thoughts

01:12:22

[SPCL_Bcast] The digital revolution of Earth system modelling

01:01:01

[SPCL_Bcast] Capturing Computation with Algorithmic Alignment

24:58

Co-design Hardware and Algorithm for Vector Search

07:10

Demystifying Graph Databases

21:28

Fortran is dead – Long live Fortran!

25:03

Hot Interconnects - EtherNET: the present and future of datacenter and supercomputers

01:02:20

[SPCL_Bcast] Can I Cook a 5 o'clock Compiler Cake and Eat It at 2?

44:49

AI-Driven Performance Metaprogramming

33:59

HammingMesh: A Network Topology for Large-Scale Deep Learning

24:42

GDI: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of Cores

59:51

[SPCL_Bcast] Scalable Graph Machine Learning

01:12:46

[SPCL_Bcast] Heterogeneous multi-core systems for efficient EdgeML

58:43

[SPCL_Bcast] Evaluating Large-Scale Learning Systems

01:08:09

ML for High-Performance Climate: Data Post Processing, Compression, and Earth Virtualization Engines

14:05

HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement

13:13

How to Adjust Network-on-Chip Topologies to Design Goals and Architectures

22:28

Noise in the Clouds: Influence of Network Performance Variability on Application Scalability

29:29

Scheduling Task Graphs on Dataflow Architectures

22:32

Bjorn Stevens on Earth Virtualization Engines (EVE)

32:47

"From Two Strong Oxen to Billions of Fleas." Torsten Hoefler's Sidney Fernbach Award Lecture at SC22

01:04:59

[Bcast] HPVM: Performance, Programmability and Retargetability for Heterogeneous Parallel Systems

39:44

Rusty Lusk’s legacy: The role of MPI in modern AI

01:00:55

[SPCL_Bcast] Realizing Petabit/s IO and sub-pJ/bit System-wide Communication with Silicon Photonics

01:06:21

[SPCL_Bcast] A chiplet based generative inference architecture with block floating point datatypes

32:32

Deinsum: Practically I/O Optimal Multi-Linear Algebra

58:33

[SPCL_Bcast] HPC and AI/ML: A Synergistic Relationship

01:02:06

[SPCL_Bcast] Follow the Data: Memory-Centric Designs for Modern Datacenters

24:26

Accelerating Data Serialization/Deserialization Protocols with In-Network Compute

01:00:35

[SPCL_Bcast] Democratizing Deep Learning with DeepHyper

01:03:56

[SPCL_Bcast] Innovating the Next Discontinuity

24:05

[Demo] Cppless: A single-source programming model for high-performance serverless

45:07

Efficient AI: From supercomputers to smartphones

13:26

#SC22 panel - Quantum Computing: A Future for HPC Acceleration?

36:04

Productive Performance Engineering for Weather and Climate Modeling with Python

07:21

#SC22 panel - Reinventing High-Performance Computing

27:45

Interactive Program Performance Optimization

54:29

AI Engine Architecture: Data Movement, Synchronization, Reconfiguration & Application Mapping

47:17

[SPCL_Bcast] Heterogeneous Serverless Computing

08:51

An Efficient Algorithm for Sparse Quantum State Preparation

49:57

[SPCL_Bcast] Next-generation Networks for Machine Learning

40:27

[SPCL_Bcast] Self-Adjusting Networks

16:19

Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models

37:38

Post-Moore spatial computing: from chips to clusters.

47:38

Portable high-performance Python on CPUs, GPUs, and FPGAs

55:50

[SPCL_Bcast] Automating Distributed Heterogeneous Computing for Domain Experts

18:43

Lifting C Semantics for Dataflow Optimization

28:44

Neural Parameter Allocation Search

16:42

Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching

15:05

Metamorphic Fuzzing of C++ Libraries

28:19

Deep Learning for Weather Prediction and Ensemble Post-Processing