Vadim Karpusenko | Poke

Vadim Karpusenko @UCxD2TEeLQopByu6jzKs65UA@youtube.com

1K subscribers

Mostly Intel Xeon Phi-related stuff, but principles of paral

Attention of Convolutional Neural Network (VGG16)

Episode 5.19 - Closing words

Episode 5.18 - Additional Topic: Load Balancing in Heterogeneous Systems

Episode 5.17 - Optimization of Communication: MPI

Episode 5.16 - Optimization of Communication: Offload

Episode 5.15 - NUMA and Allocation on First Touch

Episode 5.14 - Example of Cache-Oblivious Recursion

Episode 5.13 - Example of Loop Tiling

Episode 5.12 - Optimization of Memory Access

Episode 5.11 - Thread affinity control

Episode 5.10 - Do you have enough parallelism in your code

Episode 5.9 - Elimination of False Cache Line Sharing

Episode 5.8 - Optimization of Synchronization in Multithreaded applications

Episode 5.7 - Vectorization Tuning Knobs

Episode 5.6 - Strip-Mining for Vectorization

Episode 5.5 - Optimization of Vectorization: Regularizing Pattern

Episode 5.4 - Optimization of Vectorization: Alignment and Hints

Episode 5.3 - Optimization of Vectorization: Data Structures

Episode 5.2 - Scalar Tuning and General Optimization

Episode 5.1 - Optimization roadmap

Episode 4.9 - Distributed-memory Parallelism and MPI

Episode 4.8 - Parallel Reduction

Episode 4.7 - Race Conditions and Mutexes

Timelapse of video course recording from March 8th, 2015

teaser-v3ct0r1z@t10n

A teaser video of webinar series

Episode 4.6 - Fork-Join Model. OpenMP Tasks

Episode 4.5 - Parallel Loops, Private and Shared Variables, Scheduling

Episode 4.3 - Assumed Vector Dependence and Pointer Disambiguation

Episode 4.4 - Thread Parallelism and OpenMP

Episode 4.2 - Automatic Vectorization and Array Notation

Episode 4.1 - SIMD Parallelism and Intrinsics

Episode 3.9 - File IO in MPI Applications on Coprocessors

Episode 3.8 - Heterogeneous Programming using MPI

Episode 3.7 - Asynchronous Offload

Episode 3.6 - Shared Virtual Memory

Episode 3.5 - Additional Offload Controls

Episode 3.4 - Explicit Offload

Episode 3.3 - Native MPI Applications

Episode 3.2 - Native Coprocessor Applications

Episode 3.1 - Overview of Programming Options

How To Record Video Lectures

Episode 2.1 - Purpose of the MIC architecture

Episode 2.6 - Knights Landing, the Next Manycore Architecture

Episode 2.5 - Will My Application Benefit from the MIC Architecture?

Episode 2.4 - Software Tools for Intel Xeon Phi Coprocessors

Episode 2.3 - Vector Instruction Support in Intel Architectures

Episode 2.2 - Details of Intel MIC Architecture

February 2015 in Sunnyvale, California.

Episode 08 - Native Coprocessor Applications

Episode 03 - IMCI set and VPUs

Episode 02 - Detail of Intel MIC architecture

Strip-mining optimization for vectorization (Russian)

Sign up for SC14

Sign-up ad for SC14

Lecture - strip-mining for vectorization

Practical Lab - strip-mining for vectorization

High Peaks, Pinnacles National Park, San Benito County, CA