
University of North Carolina Charlotte
University of North Carolina Wilmington
Parallel Programming Fall 2014
Tuesday/Thursday, 2:00 pm - 3:15 pm


Dr. Barry Wilkinson
University of North Carolina at Charlotte

Office Hours:
Tuesday/Thursday 9:30 am - 10:30 am
(Skype or in person/office)

and
Dr. Clayton Ferner
University of North Carolina at Wilmington

Office Hours:
Tuesday/Thursday 11 am - 1 pm

This page is continually updated as the course proceeds. Watch for announcements. Modification date: Dec 3, 2014. Always make sure you have the most recent copy of this page (reload the page rather than relying on a cached copy).


ANNOUNCEMENTS


Academic calendar
Lecture Materials
Reading materials
Assignments
Tests
UNC-C Moodle 2
Class videos (Streaming)
Assignment FAQs

Lecture Materials

The following lecture materials are provided as PowerPoint slides. The slides are not ready for use until the date of the class, and they are likely to be revised just before the class. The order of materials may also change.
Lecture slides

Wk; Date, 2014; No of slides; Slides; Review/Quiz questions; Topics; Class Videos (Trimmed)
1 Thurs Aug. 21 28 Outline   Course outline, prerequisites, course text, course contents, instructor details, TA details and responsibilities. Lecture 1

1 Thurs Aug. 21 12 Assignment Preliminaries   Assignment preliminaries, Moodle, student accounts.
1 Thurs Aug. 21 14 pages Pre-Assignment   Test software environment on your computer  
1 Thurs Aug. 21 10 Parallel Comp. Demand   Demand for computational speed, grand challenge problems.
2 Tues Aug 26 14 Parallel Comp. Potential Quiz questions Potential for speed-up using multiple process(or)s, speed-up factor, max speed-up, Amdahl's law, Gustafson's law. Lecture 2 not available
2 Tues Aug 26 19 Parallel Computers   Types of parallel computers, shared memory systems, multicore, programming shared memory, distributed memory platform, networked computers, cluster computing, programming, GPU systems.

2 Tues Aug 26 17 Programming with Shared Memory-1   Programming shared memory systems, processes, fork, fork-join pattern, threads, Pthreads, thread pool pattern.
2 Thurs Aug 28 37 Introduction to OpenMP   Introduction to OpenMP, thread team pattern, directives/constructs, parallel, shared and local variables, work-sharing, sections, for, loop scheduling, for reduction, single, master. Lecture 3
2 Thurs Aug 28 9 pages Assignment 1, Matrix addition and multiplication   OpenMP tutorial
3 Tuesday Sept 2 29 Programming with Shared memory-2 OpenMP Quiz questions Accessing shared data, critical sections, locks, condition variables, critical sections serializing code, deadlock, semaphores, monitors, Pthreads program example. Lecture 4
3 Tuesday Sept 2 10 OpenMP continued   Sharing data and synchronization, critical, barrier, atomic, flush.  
3 Thursday Sept 4 13 Intro to stencil pattern   Stencil pattern, heat distribution Lecture 5
3 Thursday Sept 4 6 pages Assignment 2   OpenMP heat distribution program, graphics  
3/4 Thurs Sept 4/Tues Sept 9 34 Programming with Shared Memory-3 Shared Memory Quiz questions Shared memory performance issues, specifying parallelism, par, forall constructs, dependency analysis (Bernstein's conditions), data shared in caches, false sharing, sequential consistency, code re-ordering Lecture 6
4 Tues Sept 9/Thurs Sept 11 62 Lower Level Message-passing Computing - MPI   Basics of message-passing programming, MPI, point-to-point message passing, message tags, MPI communicator, blocking send/recv, command line compiling and executing MPI programs, instrumenting code for execution time, Eclipse IDE Parallel Tools Platform. Lecture 7
4/5 Thurs Sept 11/Tues Sept 16 45 Collective data transfer patterns and MPI routines Quiz questions Message passing patterns, MPI collective routines, broadcast, scatter, gather, reduce, barrier, all-to-all broadcast. Lecture 8
5 Tues Sept 16 22 pages Assignment 3   MPI tutorial, using command line and Eclipse-PTP  
5 Thurs Sept 18 35 Synchronization Quiz questions Barrier implementations, counter, reentrant code, tree, butterfly, local synchronization, safety and deadlock, safe MPI routines, MPI_Sendrecv(), MPI_Bsend(), MPI_Isend()/MPI_Irecv(), synchronous message passing, asynchronous (non-blocking) message passing, changing to synchronous message passing. Lecture 9
6 Tues Sept 23 25 Introduction to Patterns   Pattern programming concepts, problem addressed, low-level message-passing patterns, point-to-point data transfer, broadcast, scatter, gather, reduce, all-to-all broadcast, higher-level message-passing patterns, workpool, pipeline, divide and conquer, all-to-all, iterative synchronous patterns, iterative synchronous all-to-all, stencil, advantages and disadvantages of patterns, our tools. Lecture 10
6 Tues Sept 23 21 Suzaku framework Quiz questions Suzaku, macros, routines, implementation  
6 Thurs Sept 25, 2014   Class quiz   No lecture. Take quiz at any location  
7 Tues Sept 30 4 pages Assignment 4, Monte Carlo pi   MPI application. Monte Carlo pi workpool Lecture 11
7 Tues Sept 30 47 Compiler directive approach   Introduction to Paraguin, parallel regions, barrier, forall, broadcast, scatter, gather, and reduction.
7 Thurs Oct 2 37 and 34 Compiler directive approach, Examples Quiz questions Patterns, Scatter/Gather, Stencil Lecture 12
8 Tues Oct 7       Fall Recess, no class. Fall break follows the UNC-Charlotte calendar. Students at other sites with a different break will need to watch the video of any class missed during their break, at their convenience.
8 Thurs Oct 9       Paraguin continued  
8 Thurs Oct 9 45 Seeds framework Quiz questions Seeds pattern programming framework, module method, bootstrapping class, network and multicore versions, workpool programming examples - Monte Carlo pi, matrix addition, matrix multiplication. Lecture 13
9 Tues Oct 14 29 Patterns and Applications, Synchronous All-To-All Patterns, Demo Quiz questions All-To-All pattern, iterative synchronous All-To-All pattern, gravitational N-body problem, Barnes-Hut algorithm, solving system of linear equations by iteration, Jacobi iteration, convergence rate. Lecture 14
9 Tues Oct 14 22 Stencil pattern Quiz questions Stencil pattern, applications, solving Laplace's eq., heat distribution problem, ways to improve performance, partially synchronous method, red-black, multigrid.  
9 Thurs Oct 16 23 Assignment 5 slides, Assignment 5   Using Paraguin to Create MPI Programs - hello world, matrix multiplication, stencil pattern, and Monte Carlo.
9 Thurs Oct 16 26 Pipeline pattern Quiz questions Pipeline pattern, space time diagram, speed up factor, applications, matrix-vector multiplication, matrix multiplication, insertion sort, prime numbers, upper triangular linear equations. Lecture 15
9/10 Thurs Oct 16/Tues Oct 21 40 Sorting Algorithms Quiz questions Potential speedup of sorting in parallel, compare and exchange, bubble sort, odd-even transposition sort, mergesort, quicksort, odd-even mergesort, bitonic mergesort, shearsort, rank sort, counting sort, radix sort. Lecture 16
10 Thurs Oct 23 23 Sieve of Eratosthenes Quiz questions Sieve of Eratosthenes Algorithm for computing prime numbers Lecture 17
10 Thurs Oct 23 42 Graph Algorithms   Prim's Algorithm for Minimum Spanning Tree, Dijkstra's Algorithm for Single-Source Shortest Path, Dijkstra's and Floyd's Algorithms for All-Pairs Shortest Path  
11 Tues Oct 28 26 pages Assignment 6   Using the Seeds Pattern Programming Framework: 1 - Workpool
11 Tues Oct 28 21 and 14 Hybrid Programming, Paraguin Hybrid Programming   Combining MPI and OpenMP to take advantage of clusters that have both distributed memory and shared memory. Discussion of whether hybrid is any better than using only MPI or only OpenMP. Using the Paraguin compiler to generate a hybrid program. Lecture 18
11 Thurs Oct 30 14 Data Parallel Pattern   Data parallel pattern, use of forall notation, example, data parallel prefix sum algorithm, matrix multiplication. Lecture 19
11 Thurs Oct 30 21 and 21 Intro to GPUs and CUDA, CUDA Prog. Model Quiz questions CPU-GPU architecture evolution, 1970s to present, dedicated pipelined GPUs, general purpose GPU design, NVIDIA products, Fermi architecture, GPU performance gains, CUDA. CUDA SIMT prog. model, CUDA kernel routines, CPU and GPU memories, basic CUDA program structure, code example adding two vectors, compiling and executing on Linux command line, Windows MS Visual Studio.
12 Tues Nov 4 38 Multidimensional thread structure   CUDA programming: threads, blocks, grid, multidimensional grid and blocks, compute capabilities, thread addressing, predefined variables, flattening array, 2-D grid and block code: matrix addition/multiplication. Lecture 20
12 Tues Nov 4 21 and 14 Performance measurements, Device routines   Measuring performance, timing program execution, CUDA “events”, synchronous and asynchronous CUDA routines, max and effective bandwidth, computation measures, FLOPs. Declaring routines called from device and from host, local device variables, accessing kernel variables from host, cudaMemcpyToSymbol()/cudaMemcpyFromSymbol().
12 Nov 6th 5 pages Assignment 7   CUDA assignment using Linux environment to compile and execute simple CUDA programs, makefile, vector/matrix addition/multiplication, prefix sum, sorting.
12 Nov 6th 12 and 34 Thread synchronization, GPU memory structures   Ways to achieve thread synchronization, __syncthreads(), CPU synchronization, cudaThreadSynchronize(), __threadfence(). Memory structures and bandwidth optimization, memory coalescing. Lecture 21
13 Nov 11 28, 22 and 11 Performance Analysis tools, Memory coalescing, Shared Memory Demo   Introduction to a few performance analysis tools: time, gettimeofday, read_real_time, MPI_Wtime, prof, gprof, xprofiler, mpiP. Demonstration of memory coalescing, code, performance improvements. Demonstration of using shared memory, code, performance improvements. Lecture 22
13 Nov 13th       Review for quiz Lecture 23
14 Nov 18th   Moodle Quiz   No lecture. Take at any location
14 Nov 20th       Review/discussion/sample finals Lecture 24
15 Nov 25th       Review/discussion/sample finals Lecture 25
15 Nov 27th       Thanksgiving break. No classes  
16 Dec 2nd, 2014       Last NCREN class. Review/discussion. Teaching evaluations done on-line. Lecture 26


Reading materials

OpenMP

MPI

Paraguin

Seeds Pattern Programming Framework

CUDA

Textbook home page: http://www.cs.uncc.edu/par_prog

English to American translation



Assignments

No assignment is ready for use until its set date.

Assignment FAQs

Clusters used in assignments:

  • Using UNC-C cci-gridgw cluster
  • Using UNC-W babbage.cis.uncw.edu cluster

    Date set; Date to report system/software problems; Assignment; Topic; Date due, 12 pm (noon)

    Thurs Aug. 21 Tues Aug 26, 2014 Pre-assignment, Course Software Test software environment on your computer Thursday Aug 28, 2014
    Thursday Aug 28, 2014 Tues Sept 2, 2014 Assignment 1 OpenMP tutorial Thursday Sept 4, 2014
    Thursday Sept 4, 2014 Tues Sept 9, 2014 Assignment 2, X11 graphics notes OpenMP heat distribution program, graphics Tuesday Sept 16, 2014
    Tuesday Sept 16, 2014 Thurs Sept 18, 2014 Assignment 3 MPI tutorial, using command line and Eclipse-PTP Tues Sept 30, 2014
    Tues Sept 30, 2014 Thurs Oct 2, 2014 Assignment 4 MPI program. Monte Carlo pi workpool Friday Oct 10, 2014
    Thurs Oct 16, 2014 Tues Oct 21, 2014 Assignment 5 Paraguin Tues Oct 28, 2014
    Tues Oct 28, 2014 Thurs Oct 30, 2014 Assignment 6 Seeds Thurs Nov 6th, 2014
    Thurs Nov 6th, 2014 Tues Nov 11, 2014 Assignment 7 CUDA (On UNC-C K20 cci-grid08.uncc.edu) Thurs Nov 20, 2014



    Tests

    Class test 1 date: Thursday Sept 25th, 2014, 2 pm - 3:15 pm (during class period) Take at any location.

    Format: 40 questions, multiple choice, Moodle quiz. Closed book.
    Topics: All lecture materials presented in class from beginning of course (week 1) to week 5 inclusive, and materials in assignments.


    Class test 2 date: Tuesday Nov 18th, 2014, 2 pm - 3:15 pm (during class period) Take at any location.

    Format: 40 questions, multiple choice, Moodle quiz. Closed book.
    Topics: All materials after test 1, week 6 to week 13 inclusive and assignments.

    Final exam date (2 1/2 hour exam within the university-scheduled exam period for the class):

    UNCC students:       Tuesday Dec 9, 2014, 2 pm - 4:30 pm
    UNCW students:       Thursday Dec 11, 2014, 3:00 pm - 6:00 pm
    App State students:  Thursday Dec 11, 2014, 12:00 pm (Noon) - 2:30 pm
    ECU students:        Tuesday Dec 16, 2014, 2 pm - 4:30 pm
    UNCG students:       Saturday Dec 6, 2014, 3:30 pm - 6:30 pm
    WCU students:        Monday Dec 8, 2014, 12:00 pm (Noon) - 2:30 pm

    Topics: Comprehensive
    Format: Paper test in format of previous posted final tests. Closed book.

    Previous tests
