This course introduces participants to the computing environment found in UH high-performance computing clusters such as Opuntia, including how to prepare workflows, submit jobs to the queuing systems, and retrieve results. Other topics covered include general HPC concepts, Opuntia’s system architecture, system access, customizing your user environment, compiling and linking codes for CPUs or GPUs, the PBS/SLURM batch scheduling systems, batch job scripts, MATLAB jobs, and submission of serial, interactive, or parallel (CPU/GPU) jobs to the batch system.

Linux topics covered include user accounts, file permissions, file system navigation, the Command Line Interface (CLI), command line utility programs, file & folder manipulation, and common text editors.

Shell scripting topics covered include built-in commands, control structures, file descriptors, functions, and parameters & variables.

Prerequisites: None.

Date: 3rd March 2020 - 16th April 2020

Time: Tue Thurs 11:30 AM - 01:00 PM

Instructor: Dr. Amit Amritkar

Location: AERB Room 200, 202

Class Capacity: 44

Evaluation: 2 homework assignments at 25% each (50% total); 1 final exam at 50% (last day of class)

Python is an easy-to-learn, powerful programming language. It has efficient high-level data structures that make it suitable for rapid application development. Topics covered in this session will include data types, conditional and loop statements, functions, input/output, modules, classes, and exceptions. Upon completion of this tutorial series, participants should be able to understand existing scientific Python codes as well as write their own simple Python applications. This training session also introduces participants to scientific computing extensions of Python, such as NumPy, for use in high-performance computing. The use of more advanced Python tools for everyday scientific computing, such as regular expressions, SciPy, pandas, seaborn, and scikit-learn, is also taught.

Prerequisites: Participants are expected to have a working knowledge of the UNIX/Linux environment or to have taken the Cluster Computing course from the HPE DSI department.

UPDATED 3/12 Date: 2nd March 2020 - 15th April 2020 

Time: Mon Weds 01:00 PM - 02:30 PM

Instructor: Dr. Jerry Ebalunode

Class Capacity: 44

Location: AERB 200, 202 


This tutorial will provide hands-on skills in using modern data visualization and analysis platforms, specifically the open-source, parallel ParaView and Tableau. ParaView is very powerful and popular in the HPC scientific and engineering research communities. In the ParaView part, we will explore representations, color scales and their controls, data filters, how to build pipelines, multi-view & camera links using synthetic seismic data, streamline plots, plot-over-line analysis, and histograms. We will also cover the calculator tool, datasets & time, animations & their controls, time interpolation, camera animations, static vector field animations, and Python scripting. Finally, the course will cover how to use these tools and skills to do remote, parallel visualization using HPE DSI computer clusters. In the Tableau workshop, we will use Tableau Public to create interactive data visualizations. The workshop will give an overview of the program and provide hands-on experience creating basic charts and maps, as well as interactive web-based visualization dashboards. We will also use more advanced features in Tableau to manage data, and use calculations and parameters to make views more interactive. At the end, students will publish their visualizations to the Tableau Public web server.

Date: 26th February 2020 - 15th April 2020

Time: Mon Weds 9:00 AM - 10:30 AM

Instructor: Dr. Martin Huarte-Espinosa and MSc. Wenli Gao

Location: AERB 200, 202

Class Capacity: 44

Machine learning is the science of developing statistical methods that quantify relationships within data. This branch of mathematics/computer science has seen explosive growth over the past decade as our ability to store and process digital data has dramatically increased. Prediction, classification, regression, and identification are the aims of learning from data. All of these tasks are routinely performed in data analytics.

To obtain an overview of the literature in learning-based methods and applications.

To obtain an understanding of a variety of machine learning techniques for classification, regression, and prediction.

To obtain the ability to implement and experiment with a wide range of machine learning algorithms in Python with examples.

To apply unsupervised and supervised learning and clustering concepts, dimensionality reduction, kernels and kernel-based classifiers such as SVM, and deep learning algorithms.

To understand and implement learning-based methods for classification of images, signals and features.

Prerequisites: Participants are expected to have a working knowledge of the UNIX/Linux environment or to have taken the Cluster Computing course from the HPE DSI department.

Date: 25th February 2020 - 14th April 2020

Time: Tue Thurs 10:00 AM - 11:30 AM

Instructor: Dr. Pablo Guillen-Rondon

Location: AERB 200, 202

Class Capacity: 44

Part 1: This part of the course will help you get started with debugging and using the gdb/idb debuggers. Topics covered include understanding debugging, naive debugging, an introduction to debugging tools, serial code debugging, and parallel code (OpenMP and MPI) debugging.

Part 2: In an ideal case, parallelization would lead to a speed-up that scales linearly with the number of processors used, compared to the original serial program running on a single processor. What if a program’s performance does not meet these expectations? Indeed, there are good reasons why these expectations most likely will not be met, and we will explore those reasons and their remedies in this hands-on course. This part of the course will cover understanding serial and parallel performance (benchmarking), optimizing sequential programs (serial code profiling and analysis), tuning parallel programs, parallel code profiling, collecting runtime information, and evaluating, analyzing, and presenting the collected data.
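One common reason the ideal linear speed-up is not reached can be illustrated with Amdahl's law. The course description does not name it, so take it here only as an illustrative sketch. It assumes a fixed fraction of the run time can be parallelized perfectly while the rest stays serial. The function name `amdahl_speedup` and the 90% parallel fraction are hypothetical choices for this example, not course material.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Theoretical speed-up under Amdahl's law.

    parallel_fraction: share of the serial run time that parallelizes
                       perfectly (assumed value, between 0 and 1).
    processors:        number of processors used (at least 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed example: 90% of the run time is parallelizable.
    for n in (1, 2, 4, 8, 16, 64):
        speedup = amdahl_speedup(0.9, n)
        efficiency = speedup / n  # fraction of ideal linear speed-up
        print(f"{n:3d} processors: speed-up {speedup:5.2f}, "
              f"efficiency {efficiency:4.0%}")
```

Under this assumption the speed-up can never exceed 10, no matter how many processors are added, because the 10% serial part always remains. Real codes also lose time to communication and load imbalance, which this sketch ignores.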

Prerequisites: Familiarity with a low-level programming language such as C/C++ or Fortran, or with MATLAB, and comfort working in a UNIX/Linux environment, or completion of the corresponding CACDS courses (cluster computing and C++).

UPDATED 3/12 Date: 3rd March 2020 - 14th April 2020

Time: Tue Thurs 01:00 PM - 02:30 PM

Instructor: Dr. Amit Amritkar

Location: AERB 202

Class Capacity: 12