University of North Carolina Charlotte
University of North Carolina Wilmington
Parallel Programming Fall 2013
Tuesday/Thursday, 11:00 am - 12:15 pm
Dr. Barry Wilkinson
University of North Carolina at Charlotte

Office Hours:
2 pm - 3:30 pm T/Th

and
Dr. Clayton Ferner
University of North Carolina at Wilmington

Office Hours:
12:30 pm - 2 pm T/Th

This page is continually updated as the course proceeds. Watch for announcements. Modification date: Dec 9, 2013. Always make sure you have the most recent copy of this page (not a cached copy; reload the page).


ANNOUNCEMENTS


Assignment Frequently Asked Questions
Academic calendar
Lecture Materials
Reading materials
Assignments
Tests
UNC-C Moodle 2
Class videos

Lecture Materials

The following slides are provided as PowerPoint slides. You may wish to print these slides out as 1 x 2 or 2 x 3 thumbnails. The slides are not ready for use until the date of the class. They are likely to be revised just before the class.
Lecture slides

Wk   Date, 2013   No of slides   Slides   Review/Quiz questions   Topics
1
Thurs Aug. 22
26 Outline
  Course outline, prerequisites, course text,  course contents, instructor details.
1
Thurs Aug. 22
23 Assignment Preliminaries
  Assignment preliminaries, access to servers, Moodle, student accounts. TA details and responsibilities.
1/2
Tues Aug 27
13 Parallel Comp. Demand
  Demand for computational speed, grand challenge problems
2
Tues Aug 27
18 Parallel Comp. Potential
Quiz questions Potential for speed-up using multiple process(or)s, speed-up factor, max speed up, Amdahl's law, Gustafson's law.
2
Tues Aug 27
20 Parallel Computers   Types of parallel computers, shared memory systems, multicore, programming shared memory, distributed memory platform, networked computers, cluster computing, programming, GPU systems.
2
Thurs Aug 29
30 Pattern Programming-1
 Quiz questions Parallel patterns for structured parallel programming, workpool, pipeline, divide and conquer, stencil, all-to-all patterns, advantages of starting with patterns, tools, Seeds pattern programming framework, user interface, programming example (Monte Carlo pi), Seeds documentation.
2/3
Thurs Aug 29/Tues Sept 3
43 Pattern Programming-2
Matrix add/multiply
  Seeds Framework, workpool module methods, bootstrapping class, further details of Seeds workpool for Assignment 1, Monte Carlo pi code, matrix addition workpool code.
2/3  Thurs Aug 29/Tues Sept 3
  Assignment 1   Using the Seeds Pattern Programming Framework: 1 - Workpool
3
Thurs Sept 5
47 Compiler directive approach   Introduction to Paraguin, parallel regions, barrier, forall, broadcast, scatter, gather, and reduction.
4 Tues Sept 10 37 and 34 Compiler directive approach

Examples
Quiz Questions Patterns, Scatter/Gather, Stencil
4 Thurs Sept 12 23 Assignment 2 slides
Assignment 2
  Using Paraguin to Create MPI Programs - hello world, matrix multiplication, stencil pattern, and Monte Carlo.
5
Tues Sept 17
54 Lower Level Message-passing Computing - MPI
  Basics of message-passing programming, MPI, point-to-point message passing, message tags, MPI communicator, blocking send/recv, command line compiling and executing MPI programs, instrumenting code for execution time, Eclipse IDE Parallel Tools Platform.
5
Thurs Sept 19
61 More MPI routines
Quiz questions MPI collective routines, general features, broadcast, scatter, gather, reduce, barrier, alltoall broadcast, synchronous message passing, asynchronous (non-blocking) message passing, changing to synchronous message passing.
6 Tues Sept 24 16 Synchronization

Quiz questions

Quiz questions

Barrier implementations, counter, reentrant code, tree, butterfly, local synchronization, safety and deadlock, safe MPI routines, MPI_Sendrecv(), MPI_Bsend(), MPI_Isend()/MPI_Irecv().
6
Thurs Sept 26
Assignment 3   Compiling and executing MPI programs. Comparison with Seeds
6
Tues Sept 24/Thurs Sept 26
39 Programming with Shared Memory
  Programming shared memory systems, processes, threads, issues, interleaved statements, thread safe routines, re-ordering code, compiler/processor optimizations, accessing shared data, critical sections, locks, condition variables, deadlock, semaphores, monitors, Pthreads program example.
6/7
Thurs Sept 26/Tues Oct 1
50 Introduction to OpenMP
  Introduction to OpenMP, directives/constructs, parallel, shared and local variables, work-sharing, sections, for, loop scheduling, for reduction, single, master, critical, barrier, atomic, flush.
7/8 Tues Oct 1/ Thur Oct 10 32 Shared memory performance issues   Shared memory performance issues, specifying parallelism, par, forall constructs, dependency analysis (Bernstein's conditions), critical sections serializing code, data shared in caches, false sharing, sequential consistency, code re-ordering
7 Thurs Oct 3   Class Test   No lecture. Take at any location
8
Mon/Tues Oct 7-8
  Fall recess, no classes. The fall break follows the UNC-Charlotte calendar; students at other sites with a different break will need to watch the video of the missed class at their convenience.
8 Thur Oct 10   Shared memory performance issues
Quiz questions
8 Thur Oct 10 25 Java threads and synchronization   Brief review of Java threads, Thread class, Runnable interface, Java synchronization, synchronized methods and statements, atomic.
9 Tues Oct 15
21 Hybrid Programming
  Combining MPI and OpenMP to take advantage of clusters that have both distributed memory and shared memory. Discussion of whether hybrid is any better than using only MPI or only OpenMP.
14 Paraguin Hybrid Programming
  Using the Paraguin compiler to generate a hybrid program.

9
October 15/17
Assignment 4
cci-grid0x cluster
  OpenMP and hybrid MPI/OpenMP assignment using command line.

9 October 15/17
32 Synchronous All-To-All Patterns
Demo
  Synchronous All-To-All pattern, example use in gravitational N-body problem, Barnes-Hut algorithm, Seeds CompleteSynchGraph pattern code for N-body problem, iterative synchronous All-To-All pattern, solving system of linear equations by iteration, Jacobi iteration, convergence rate, MPI_Allgather() routine.

10 Tues Oct 22 37 Stencil pattern Quiz questions Stencil pattern, applications, solving Laplace's eq., heat distribution problem, Seeds stencil pattern, cellular automata, game of life, ways to improve performance, partially synchronous method, red-black, multigrid.
10 Oct 22/24 32 Pipeline pattern Quiz questions Pipeline pattern, space time diagram, speed up factor, matrix-vector multiplication, matrix multiplication, adding rows of an array, unfolding loops, frequency filter, insertion sort, prime numbers, upper triangular linear equations.  Seeds pipeline pattern.
10 Thurs Oct 24 23 Sieve of Eratosthenes Quiz questions Sieve of Eratosthenes Algorithm for computing prime numbers
11 Tues Oct 29 42 Graph Algorithms   Prim's Algorithm for Minimum Spanning Tree, Dijkstra's Algorithm for Single-Source Shortest Path, Dijkstra's and Floyd's Algorithms for All-Pairs Shortest Path
11
Thurs Oct 31
40 Sorting Algorithms Quiz questions Potential speedup of sorting in parallel, compare and exchange, bubble sort, odd-even transposition sort, mergesort, quicksort, odd-even mergesort, bitonic mergesort, shearsort, rank sort, counting sort, radix sort
12
Tues Nov 5
14 Data Parallel Pattern
  Data parallel pattern, use of forall notation, example, data parallel prefix sum algorithm, matrix multiplication.
12
Tues Nov 5
21 Intro to GPUs and CUDA
  CPU-GPU architecture evolution, 1970s to present, dedicated pipelined GPUs, general purpose GPU design, NVIDIA products, Fermi architecture, GPU performance gains, CUDA.
21 CUDA Prog. Model
  CUDA SIMT prog. model, CUDA kernel routines, CPU and GPU memories, basic CUDA program structure, code example adding two vectors, compiling and executing on Linux command line, Windows MS Visual Studio.

12/13 Nov 7/12
38 Multidimensional thread structure
  CUDA programming: threads, blocks, grid, multidimensional grid and blocks, compute capabilities, thread addressing, predefined variables, flattening array, 2-D grid and block code: matrix addition/multiplication.

12
Thurs Nov 7
Assignment 5
  CUDA assignment using Linux environment to compile and execute simple CUDA programs, make file, vector/matrix addition/multiplication, and sorting.

13 Tues Nov 12
21 Performance measurements
  Measuring performance, timing program execution, CUDA “events”, synchronous and asynchronous CUDA routines, max and effective bandwidth, computation measures, FLOPs.
14 Device routines
  Declaring routines called from device and from host, local device variables, accessing kernel variables from host, cudaMemcpyToSymbol/cudaMemcpyFromSymbol.
12 Thread synchronization
  Ways to achieve thread synchronization, __syncthreads(), CPU synchronization, cudaThreadSynchronize(), __threadfence().

13
Thurs Nov 14

Class Test
  No lecture. Take at any location
14
Tues Nov 19
34 GPU memory structures
  Memory structures and bandwidth optimization, memory coalescing.
28 Performance Analysis tools
  Introduction to a few performance analysis tools: time, gettimeofday, read_real_time, MPI_Wtime, prof, gprof, xprofiler, mpiP.

14 Thurs Nov 21       Review/discussion/sample finals
15
Tues Nov 26
  Last NCREN class. Review/discussions/sample finals. Teaching evaluations done on-line.

Reading materials

Seeds Pattern Programming Framework
MPI

Paraguin

Paraguin Compiler Version 2.1 User Manual

CUDA

UNC-Charlotte Spring CUDA programming course (ITCS 4/5010 Spring 2013)

UNCC parallel programming cci-grid0x cluster

cci-grid0x cluster (newly reorganized)

Generating X11 graphical output

Notes on installing MPI on your own computer

OpenMPI on a Mac: notes from Tristan Bithell, Spring 2013 class

MPICH on Ubuntu http://jetcracker.wordpress.com/2012/03/01/how-to-install-mpi-in-ubuntu/ from Thomas Kraft, Spring 2013 class

Notes on installing OpenMP on your own computer

To be added. Let me know if you have anything to post on MPI or OpenMP installation. Thanks. BW



Assignments

No assignment is ready for use until its set date.

Assignment 1
Seeds Software
Topic: Using the Seeds Pattern Programming Framework 1 - Workpool Pattern
Date set: Thursday Aug 29th, 2013
Date to report system/account problems: Monday Sept 2nd, 2013
Date due, 12 pm (noon): Wednesday Sept 11th, 2013

Assignment 2
Topic: Using Paraguin to Create MPI Programs - hello world, matrix multiplication, stencil pattern, and Monte Carlo.
Date set: Thursday Sept 12th, 2013
Date to report system/account problems: Monday Sept 16th, 2013
Date due, 12 pm (noon): Wednesday Sept 25th, 2013

Assignment 3
Topic: Compiling and running MPI programs. Comparison with Seeds and Paraguin
Date set: Thursday Sept 26, 2013
Date to report system/account problems: Monday Sept 30, 2013
Date due, 12 pm (noon): Wednesday Oct 16, 2013

Assignment 4
Test file (G = 100)
Generating X11 graphical output
cci-grid0x cluster
Topic: OpenMP and hybrid OpenMP/MPI assignment
Date set: Thursday Oct 17, 2013
Date to report system/account problems: Monday Oct 21, 2013
Date due (new): Thursday Nov 7th, 2013 at 11:55 pm

Assignment 5
Topic: CUDA programs, Linux environment to compile and execute simple CUDA programs, make file, vector/matrix addition, and sorting.
Date set: Thursday Nov 7, 2013
Date to report system/account problems: Monday Nov 11, 2013
Date due, 12 pm (noon): Tuesday Nov 26, 2013


Tests

Class test 1 date: Thursday Oct 3rd, 2013 -- 60 minutes scheduled during class time. Opens 10:45 am. Closes 12:30 pm.

Format: 40 questions, multiple choice, Moodle quiz. Closed book.
Topics: All lecture materials presented in class from beginning of course (week 1) to week 6 inclusive, and materials in Assignment 1 (pattern programming) and Assignment 2 (Paraguin compiler directives) and MPI but not Assignment 3.  Does not include shared memory programming.


Class test 2 date: Nov 14, 2013 -- 60 minutes scheduled during class time. Opens 10:45 am. Closes 12:30 pm.

Format: 40 questions, multiple choice, Moodle quiz. Closed book.
Topics: All materials after test 1, week 6 to week 13 inclusive - shared memory programming, OpenMP, shared memory performance issues, Java threads, hybrid programming, synchronous all-to-all pattern, stencil pattern, pipeline pattern, Sieve of Eratosthenes, graph algorithms, numerical algorithms, sorting algorithms, CUDA, Assignment 4. Does not include Assignment 5 (although includes CUDA lectures).

Final exam (2 hour exam) date:

    UNC-C students: 11 am - 1:00 pm, Tuesday December 10th 2013
    UNC-W students: 11 am - 1:00 pm, Tuesday December 10th 2013
    UNC A&T students: 1 pm - 3 pm, Thursday December 12, 2013
    ECU students:  11 am - 1:00 pm, Thursday December 5th 2013
    UNC-G students: 12 noon - 2:00 pm, Tuesday December 10th 2013
    WSSU students: 11 am - 1 pm, Tuesday, December 10, 2013

Topics: Comprehensive
Format: Paper test in the format of previously posted final tests. Closed book.

Previous tests (UNC-C courses that did not do patterns)
