- Instructor: Pasquale Claudio Africa pafrica@sissa.it
- Assistant: Dario Coscia dcoscia@sissa.it
GitHub: timetable, lecture notes, exercise sessions.
Books (see course syllabus):
- Parallel and High Performance Programming with Python, Fabio Nelli.
- Python Parallel Programming Cookbook, Giancarlo Zaccone.
- High Performance Python: Practical Performant Programming for Humans, Micha Gorelick & Ian Ozsvald.
Internet (plenty of free or paid resources).
Check out GitHub regularly for up-to-date timetable, rooms, lecture topics, and course material.
Course balance (approximate):
- ~24 hours.
- Frontal lectures: 40%, practical sessions: 60%.
For practical sessions please bring your own laptop.
- Ask!
- Engage with each other!
- Office hours: send an email to book a session.
- Introduction to the UNIX shell.
- Version control (Git) and dependency management (conda, Docker).
- Hardware architectures and parallel computing paradigms.
- Python ecosystem for data science and scientific computing.
- Data types for efficient computing.
- NumPy, SciPy, scikit-learn, visualization tools.
- Libraries for deep learning (PyTorch).
- Best practices for writing reliable code: error handling, unit testing, code profiling, optimization, and software deployment.
- How to use HPC resources.
- Libraries for parallel computing (Numba, Lightning).
- Prior knowledge of programming fundamentals (syntax, data types, variables, control structures, functions).
- Prior experience with C, C++, Java, or Python is recommended, but not mandatory.
Please bring your own laptop with a working UNIX/Linux environment, whether standalone, dual boot, or virtualized.
For beginners: https://ubuntu.com/tutorials/install-ubuntu-desktop.
You can write code using any text editor (such as Emacs, Vim, or Nano), or an Integrated Development Environment (IDE) (such as VSCode, Eclipse, or Spyder).
- Python 3. The presence of Jupyter and conda is recommended.
- A C++ compiler installed with full support for C++17, such as GCC 10 or newer, or Clang 11 or newer. The presence of CMake is recommended.
- Docker Desktop. Please follow the instructions in the official guide and the post-installation steps thoroughly.
Any recent Linux distribution, such as Ubuntu
- Windows Subsystem for Linux (WSL2). Ubuntu version recommended, then follow Ubuntu-specific instructions.
- Virtual machine (such as VirtualBox).
- (Expert users) Dual boot.
- Install Python 3 and GCC using your package manager (such as apt, yum, pacman).
- You can't really understand/modify/improve a text written in English, unless you are proficient in English!
- Career opportunities:
- Coding opens doors to a diverse array of high-demand careers in technology and data-driven sectors.
- It cultivates critical thinking and problem-solving skills.
- Artificial Intelligence (AI) and chatbots lack creativity!
- They derive knowledge from historical data.
- Innovation, idea generation, implementation of novel concepts through software and technology remain a human prerogative (at least for now 😅).
- Understand how AI and chatbots work under the hood.
Source: https://pypl.github.io/PYPL.html
- The build process:
- Compiled vs. interpreted languages.
- Preprocessor, compiler, linker, loader.
- Introduction to the UNIX shell:
- What is a shell.
- Variables.
- Basic commands and scripting.
- Shell scripting.
- Handles directives and macros before compilation.
- Originated for code reusability and organization.
- `#include`: Includes header files.
- `#define`: Defines macros for code replacement.
- `#ifdef`, `#ifndef`, `#else`, `#endif`: Conditional compilation.
- `#pragma`: Compiler-specific directives.
- Example: `#define SQUARE(x) ((x) * (x))`
- Usage: `int result = SQUARE(5); // Expands to: ((5) * (5))`
- Translates source code into assembly/machine code.
- Evolved with programming languages and instructions.
- Lexical analysis: Tokenization.
- Syntax analysis (parsing): Syntax tree.
- Semantic analysis: Checking.
- Code generation: Assembly/machine code.
- Optimization: Efficiency improvement.
- Output: Object files.
-O: Optimization levels; -g: Debugging info; -std: C++ standard.
- Combines object files into an executable.
- Supports modular code.
- Symbol resolution: Match symbols.
- Relocation: Adjust addresses.
- Output: Executable.
- Linker errors/warnings.
- Example:
g++ main.o helper.o -o my_program
- Static: Larger binary, library inclusion.
- Dynamic: Smaller binary, runtime library reference.
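The preprocessor, compiler, and linker steps described so far can be run one at a time from the shell. The following is a minimal sketch, assuming `g++` is installed; the file names (`main.cpp`, `helper.cpp`, `my_program`) and the `FIVE` macro are hypothetical, and the script skips the build if no compiler is found.

```shell
#!/bin/bash
# Sketch: run each build stage separately on two tiny, hypothetical source files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > helper.cpp <<'EOF'
int square(int x) { return x * x; }
EOF

cat > main.cpp <<'EOF'
#define FIVE 5
int square(int x);
int main() { return square(FIVE) == 25 ? 0 : 1; }
EOF

if command -v g++ > /dev/null; then
    g++ -E main.cpp -o main.ii          # Preprocessor only: macros are expanded.
    g++ -c main.cpp -o main.o           # Compile to an object file (no linking).
    g++ -c helper.cpp -o helper.o
    g++ main.o helper.o -o my_program   # Link the object files into an executable.
    ./my_program                        # The loader starts the program.
    echo "Exit status: $?"
else
    echo "g++ not found: skipping the build."
fi
```

Inspecting `main.ii` shows the source after macro expansion, which is often the quickest way to debug a misbehaving macro.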
- Loads executables for execution.
- Tied to memory management evolution.
- Memory allocation: Reserve memory.
- Relocation: Adjust addresses.
- Initialization: Set up environment.
- Execution: Start execution.
- Inclusion of external libraries during execution.
- Enhances flexibility.
From http://www.linfo.org/shell.html:
A shell is a program that provides the traditional, text-only user interface for Linux and other UNIX-like operating systems. Its primary function is to read commands that are typed into a console [...] and then execute (i.e., run) them. The term shell derives its name from the fact that it is an outer layer of an operating system. A shell is an interface between the user and the internal parts of the OS (at the very core of which is the kernel).
Bash stands for Bourne Again Shell, a homage to Stephen Bourne, the author of the original Bourne shell (`sh`). It is the default shell on most Linux distributions. It is both a command interpreter and a scripting language. You can switch to another shell by simply typing its name, and the default shell for all sessions can be changed as well.
Since v10.15 Catalina, macOS uses zsh as its default shell instead; zsh is mostly compatible with Bash.
Other shells available: tcsh, ksh, csh, Dash, Fish, Windows PowerShell, ...
As the shell is a program, it has its own variables. You can assign a value to a variable with the equal sign (no spaces!), for instance by typing `A=1`. You can then retrieve its value using the dollar sign and curly braces; for instance, to display it, type `echo ${A}`.
Some variables affect the way running processes behave on a computer; these are called environment variables. Some of them are set by default: for instance, to display the user's home directory, type `echo ${HOME}`.
To set an environment variable, just prepend `export`; for instance, `export PATH="/usr/sbin:$PATH"` prepends the folder `/usr/sbin` to the `PATH` environment variable. `PATH` specifies the set of directories where the shell looks for executable programs.
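The difference between a plain shell variable and an exported one becomes visible when a child process is started. A minimal sketch, where `COURSE_NAME` is a hypothetical variable name chosen only for illustration:

```shell
#!/bin/bash
# COURSE_NAME is a hypothetical variable used only for this illustration.
COURSE_NAME="hpc"

# A child process does not see a plain shell variable...
before=$(sh -c 'echo "${COURSE_NAME:-unset}"')

# ...until the variable is exported to the environment.
export COURSE_NAME
after=$(sh -c 'echo "${COURSE_NAME:-unset}"')

echo "Before export: ${before}"   # Before export: unset
echo "After export:  ${after}"    # After export:  hpc
```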
- A login shell logs you into the system as a specific user (it requires username and password). When you hit `Ctrl+Alt+F1` to log into a virtual terminal, you get, after a successful login, a login shell (that is interactive).
- A non-login shell is executed without logging in (it requires a currently logged-in user). When you open a graphical terminal, it is a non-login (interactive) shell.
- In an interactive shell (login or non-login) you can interactively type or interrupt commands, for example in a graphical terminal (non-login) or a virtual terminal (login). In an interactive shell the prompt variable (`$PS1`) must be set.
- A non-interactive shell is usually run from an automated process. Input and output are not exposed (unless explicitly handled by the calling process). This is normally a non-login shell, because the calling user has logged in already. A shell running a script is always a non-interactive shell (but the script can emulate an interactive shell by prompting the user to input values).
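A script can check which kind of shell it is running in. The sketch below relies on the special parameter `$-` (the current option flags, where `i` marks an interactive shell) and, for Bash only, on the `login_shell` option; the variable names `mode` and `kind` are chosen here for illustration.

```shell
#!/bin/bash
# The special parameter $- lists the current shell options; "i" means interactive.
case "$-" in
    *i*) mode="interactive" ;;
    *)   mode="non-interactive" ;;
esac
echo "This shell is ${mode}."

# Bash-specific: the login_shell option tells login and non-login shells apart.
if [ -n "${BASH_VERSION:-}" ]; then
    if shopt -q login_shell; then kind="login"; else kind="non-login"; fi
    echo "This is a ${kind} shell."
fi
```

Run as a script, this reports a non-interactive shell; pasted into a terminal, it reports an interactive one.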
When launching a terminal a UNIX system first launches the shell interpreter specified in the SHELL environment variable. If SHELL is unset it uses the system default.
After having sourced the initialization files, the interpreter shows the prompt (defined by the environment variable $PS1).
Initialization files are hidden files stored in the user's home directory, executed as soon as an interactive shell is run.
Initialization files in a shell are scripts or configuration files that are executed or sourced when the shell starts. These files are used to set up the shell environment, customize its behavior, and define various settings that affect how the shell operates.
- login:
  - `/etc/profile`, `/etc/profile.d/*`, `~/.profile` for Bourne-compatible shells.
  - `~/.bash_profile` (or `~/.bash_login`) for Bash.
  - `/etc/zprofile`, `~/.zprofile` for zsh.
  - `/etc/csh.login`, `~/.login` for csh.
- non-login:
  - `/etc/bash.bashrc`, `~/.bashrc` for Bash.
- interactive:
  - `/etc/profile`, `/etc/profile.d/*` and `~/.profile`.
  - `/etc/bash.bashrc`, `~/.bashrc` for Bash.
- non-interactive:
  - `/etc/bash.bashrc` for Bash (but most of the time the script begins with `[ -z "$PS1" ] && return`, i.e., don't do anything if it's a non-interactive shell).
  - Depending on the shell, the file specified in `$ENV` (or `$BASH_ENV`) might be read.
To get a little hang of the shell, let’s try a few simple commands:
- `echo`: prints whatever you type at the shell prompt.
- `date`: displays the current time and date.
- `clear`: cleans the terminal.
- `pwd` stands for Print working directory and it prints the current working directory, that is, the directory that the shell is currently looking at. It's also the default place where the shell commands will look for data files.
- `ls` stands for List and it lists the contents of a directory. A new shell usually starts in our home directory; if we type `ls` by itself, it prints the contents of the current working directory.
- `cd` stands for Change directory and changes the active directory to the path specified.
- `cp` stands for Copy and it copies one or more files or directories from one place to another. We need to specify what we want to copy, i.e., the source, and where we want to copy it, i.e., the destination.
- `mv` stands for Move and it moves one or more files or directories from one place to another. We need to specify what we want to move, i.e., the source, and where we want to move it, i.e., the destination.
- `touch` is used to create new, empty files. It is also used to change the timestamps on existing files and directories.
- `mkdir` stands for Make directory and is used to make a new directory or a folder.
- `rm` stands for Remove and it removes files or directories. By default, it does not remove directories, unless you provide the `-r` flag (`rm -r`, where `-r` means recursively). ⚠️ Warning: Files removed via `rm` are lost forever, please be careful!
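The file commands above can be tried safely inside a throwaway directory. A minimal sketch, assuming `mktemp` is available; the directory and file names (`data`, `notes.txt`, and so on) are hypothetical:

```shell
#!/bin/bash
sandbox=$(mktemp -d)                  # Throwaway directory, so nothing important is touched.
cd "$sandbox" || exit 1
pwd                                   # Prints the path of the sandbox.

mkdir data                            # Create a directory.
touch data/notes.txt                  # Create an empty file.
cp data/notes.txt data/copy.txt       # Copy: both files now exist.
mv data/copy.txt data/renamed.txt     # Move (rename): copy.txt is gone.
listing=$(ls data)                    # notes.txt and renamed.txt, one per line.
echo "$listing"

rm data/notes.txt                     # Remove a single file.
rm -r data                            # Remove the directory and what is left in it.
ls                                    # Prints nothing: the sandbox is empty again.
```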
Commands can be written in a script file, i.e. a text file that can be executed.
Remember that the first line of the script (the so-called shebang) tells the system which interpreter to use while executing the file. So, for example, if your script starts with `#!/bin/bash` it will be run by Bash; if it starts with `#!/usr/bin/env python` it will be run by Python.
To run your brand new script you may need to change the access permissions of the file. To make a file executable, run
chmod +x script_file
When executing a command, like `ls`, a subprocess is created. A subprocess inherits all the environment variables from the parent process, executes the command, and returns control to the calling process.
A subprocess cannot change the state of the calling process.
The command `source script_file` executes the commands contained in `script_file` as if they were typed directly on the terminal. It is typically used for scripts that have to change some environment variables or define aliases or functions. Typing `. script_file` does the same.
If the environment should not be altered, use ./script_file, instead.
Some commands, like `cd`, are executed directly by the shell, without creating a subprocess.
Indeed, it would be impossible to have `cd` as a regular command!
The reason is: a subprocess cannot change the state of the calling process, whereas `cd` needs to change the value of the environment variable `PWD` (which contains the name of the current working directory).
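The difference between executing and sourcing a script can be observed directly. In the sketch below, the script name `set_greeting.sh` and the variable `GREETING` are hypothetical; the script sets a variable and changes directory, and only sourcing it affects the calling shell.

```shell
#!/bin/bash
workdir=$(mktemp -d)
script="$workdir/set_greeting.sh"      # Hypothetical script name.
cat > "$script" <<'EOF'
#!/bin/bash
GREETING="hello"
cd /
EOF
chmod +x "$script"

start_dir=$PWD
unset GREETING

"$script"                              # Executed: runs in a subprocess.
after_exec=${GREETING:-unset}
dir_after_exec=$PWD
echo "Executed: GREETING=$after_exec, PWD=$dir_after_exec"

source "$script"                       # Sourced: runs in the current shell.
after_source=${GREETING:-unset}
dir_after_source=$PWD
echo "Sourced:  GREETING=$after_source, PWD=$dir_after_source"
```

After execution the variable is still unset and the directory is unchanged; after sourcing, `GREETING` is `hello` and the working directory is `/`.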
In general a command can refer to:
- A builtin command.
- An executable.
- A function.
The shell looks for executables with a given name within directories specified in the environment variable PATH, whereas aliases and functions are usually sourced by the .bashrc file (or equivalent).
- To check what `command_name` is: `type command_name`.
- To check its location: `which command_name`.
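As a quick illustration, `type` can be applied to a builtin, a function, and an executable. In this sketch `greet` is a hypothetical function, and `command -v` is used as a POSIX alternative to `which`, which is not installed on every system:

```shell
#!/bin/bash
greet() { echo "Hello!"; }             # A hypothetical function.

cd_kind=$(type cd)                     # cd is a shell builtin.
greet_kind=$(type greet)               # greet is the function defined above.
ls_path=$(command -v ls)               # Path of the executable found through PATH.

echo "$cd_kind"
echo "$greet_kind" | head -n 1
echo "ls is located at $ls_path"
```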
Space characters in file names should be forbidden by law! The space is used as a separator character, so having it in a file name makes things a lot more complicated in any script (not just shell scripts).
Use underscores (snake case): `my_wonderful_file_name`, or uppercase characters (camel case): `myWonderfulFileName`, or hyphens: `my-wonderful-file-name`, or a mixture: `myWonderful_file-name`, instead.
But not `my wonderful file name`. It is not wonderful at all if it has to be parsed in a script.
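To see why spaces cause trouble, consider how the shell splits an unquoted variable into words. A minimal sketch, where `count_words` is a hypothetical helper that prints how many arguments it received:

```shell
#!/bin/bash
count_words() { echo "$#"; }           # Hypothetical helper: prints its number of arguments.

file="my wonderful file name"
unquoted=$(count_words $file)          # Word splitting: four separate arguments.
quoted=$(count_words "$file")          # Quoting keeps the name together as one argument.

echo "Unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
# Unquoted: 4 argument(s), quoted: 1 argument(s)
```

Any command receiving the unquoted variable (`rm`, `cp`, ...) would therefore act on four non-existent files instead of one.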
- `cat` stands for Concatenate and it reads a file and outputs its content. It can read any number of files, and hence the name concatenate.
- `wc` is short for Word count. It reads a list of files and generates one or more of the following statistics: newline count, word count, and byte count.
- `grep` stands for Global regular expression print. It searches for lines with a given string or looks for a pattern in a given input stream.
- `head` shows the first line(s) of a file.
- `tail` shows the last line(s) of a file.
- `file` reads the files specified and performs a series of tests in an attempt to classify them by type.
We can add operators between commands in order to chain them together.
- The pipe operator `|` forwards the output of one command to another. E.g., `cat /etc/passwd | grep my_username` checks system information about "my_username".
- The redirect operator `>` sends the standard output of one command to a file. E.g., `ls > files-in-this-folder.txt` saves a file with the list of files.
- The append operator `>>` appends the output of one command to a file.
- The operator `&>` sends the standard output and the standard error to a file.
- `&&` runs the second command only if the exit status of the first command is 0. It is used to chain commands together, e.g., `sudo apt update && sudo apt upgrade`.
- `||` runs the second command only if the exit status of the first command is different from 0.
- `;` executes two commands in sequence regardless of their exit status.
- `$?` is a variable containing the exit status of the last command.
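The operators above can be combined in a short experiment. A minimal sketch using a scratch file created with `mktemp`; all variable names are chosen here for illustration:

```shell
#!/bin/bash
log=$(mktemp)                                    # Scratch file for the redirections.
printf 'apple\nbanana\ncherry\n' > "$log"        # > overwrites the file.
echo "date" >> "$log"                            # >> appends a fourth line.

line_count=$(cat "$log" | wc -l | tr -d ' ')    # | feeds one command's output to the next.
matches=$(grep -c "an" "$log")                   # Lines containing "an": only "banana".

true && and_result="second command ran"          # && continues on success (status 0).
false || or_result="fallback ran"                # || continues on failure (status != 0).
false; status=$?                                 # ; runs both; $? holds the last exit status.

echo "$line_count lines, $matches match, status $status"
# 4 lines, 1 match, status 1
rm "$log"
```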
- `tr` stands for translate. It supports a range of transformations including uppercase to lowercase, squeezing repeating characters, deleting specific characters, and basic find and replace. For instance:
  - `echo "Welcome to Advanced Programming!" | tr "a-z" "A-Z"` converts all lowercase characters to upper case.
  - `echo -e "A;B;c\n1,2;1,4;1,8" | tr "," "." | tr ";" ","` replaces commas with dots and semicolons with commas.
  - `echo "My ID is 73535" | tr -d "[:digit:]"` deletes all the digits from the string.
- `sed` stands for stream editor and it can perform lots of functions on files, like searching, find and replace, insertion, or deletion. We give just a hint of its true power:
  - `echo "UNIX is great OS. UNIX is open source." | sed "s/UNIX/Linux/"` replaces the first occurrence of "UNIX" with "Linux".
  - `echo "UNIX is great OS. UNIX is open source." | sed "s/UNIX/Linux/2"` replaces the second occurrence of "UNIX" with "Linux".
  - `echo "UNIX is great OS. UNIX is open source." | sed "s/UNIX/Linux/g"` replaces all occurrences of "UNIX" with "Linux".
  - `echo -e "ABC\nabc" | sed "/abc/d"` deletes lines matching "abc".
  - `echo -e "1\n2\n3\n4\n5\n6\n7\n8" | sed "3,6d"` deletes lines from 3 to 6.
- `cut` is a command for cutting out sections from each line of files and writing the result to standard output.
  - `cut -b 1-3,7- state.txt` cuts bytes (`-b`) from 1 to 3 and from 7 to the end of the line.
  - `echo -e "A,B,C\n1.22,1.2,3\n5,6,7\n9.99999,0,0" | cut -d "," -f 1` gets the first column of a CSV (`-d` specifies the column delimiter, `-f n` specifies to pick the $n$-th column from each line).
- `find` is used to find files in specified directories that meet certain conditions. For example: `find . -type d -name "*lib*"` finds all directories (not files), starting from the current one (`.`), whose name contains "lib".
- `locate` is less powerful than `find` but much faster, since it relies on a database that is updated on a daily basis or manually using the command `updatedb`. For example: `locate -i foo` finds all files or directories whose name contains `foo`, ignoring case.
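The text-processing commands above are most useful when chained. The following sketch converts a small semicolon-separated file with decimal commas into CSV and then queries it; the file names and sample data are hypothetical.

```shell
#!/bin/bash
workdir=$(mktemp -d)
# Hypothetical sample: semicolon-separated values with decimal commas.
printf 'name;score\nalice;1,5\nbob;2,5\n' > "$workdir/scores.txt"

# tr: turn decimal commas into dots, then semicolons into commas (CSV).
tr ',' '.' < "$workdir/scores.txt" | tr ';' ',' > "$workdir/scores.csv"

names=$(sed '1d' "$workdir/scores.csv" | cut -d ',' -f 1)          # Drop the header, keep column 1.
shouted=$(echo "$names" | tr 'a-z' 'A-Z')                          # Upper-case the names.
renamed=$(sed 's/bob/robert/' "$workdir/scores.csv" | tail -n 1)   # Replace in the stream, keep last line.

mkdir "$workdir/mylib"
found=$(cd "$workdir" && find . -type d -name "*lib*")             # Directories whose name contains "lib".

echo "$shouted"
echo "$renamed"
echo "$found"
```

Note that `sed` without redirection leaves `scores.csv` untouched: it only transforms the stream it prints.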
Double quotes may be used to identify a string where the variables are interpreted. Single quotes identify a string where variables are not interpreted. Check the output of the following commands:
a=yes
echo "$a"
echo '$a'
The output of a command can be converted into a string and assigned to a variable for later reuse:
list=`ls -l` # Or, equivalently:
list=$(ls -l)
- Run a command in background: `./my_command &`.
- `Ctrl-Z` suspends the current subprocess.
- `jobs` lists all subprocesses running in the background in the terminal.
- `bg %n` reactivates the $n$-th subprocess and sends it to the background.
- `fg %n` brings the $n$-th subprocess back to the foreground.
- `Ctrl-C` terminates the subprocess in the foreground (when not trapped).
- `kill pid` sends a termination signal to the subprocess with id `pid`. You can get a list of the most computationally expensive processes with `top` and a complete list with `ps aux` (usually `ps aux` is filtered through a pipe with `grep`).
All subprocesses in the background of the terminal are terminated when the terminal is closed (unless launched with nohup, but that is another story...)
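Part of this workflow can be reproduced non-interactively. A minimal sketch, using `sleep 30` as a stand-in for a long-running command and the special parameter `$!` (the process id of the last background job):

```shell
#!/bin/bash
sleep 30 &                      # & starts the command in the background.
pid=$!                          # $! holds the process id of the last background job.
jobs                            # Lists the background job.

kill "$pid"                     # Sends the default termination signal (SIGTERM).
wait "$pid"                     # Waits for the job to end and collects its exit status.
status=$?
echo "Background job $pid ended with status $status"
```

A status above 128 indicates that the process was ended by a signal: here 143, that is, 128 plus 15, the number of SIGTERM.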
Most commands provide a `-h` or `--help` flag to print short help information, for instance:
find --help
`man command` prints the documentation manual for `command`.
There is also an info facility that sometimes provides more information: info command.
A function in a shell is a block of reusable code that you can define and call throughout your script. Functions are useful for organizing complex scripts and avoiding repetition. The general syntax for defining a function is:
function_name() {
# Commands to be executed.
}
Example:
greet() {
echo "Hello, $1!"
}
In this example, `greet` is a function that takes one argument and echoes a greeting message.
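- `$0`: The name of the script itself. Inside a function, `$0` still refers to the script; in Bash, the function name is available as `${FUNCNAME[0]}`.
- `$1`, `$2`, `$3`, etc.: The first, second, third (and so on) argument passed to the script/function.
- `$#`: The number of arguments passed.
- `$@`: All the arguments passed; when quoted (`"$@"`), each argument is kept as a separate word.
- `$*`: All the arguments passed; when quoted (`"$*"`), they are joined into a single word (not often used).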
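The difference between `"$@"` and `"$*"` only shows up when the arguments are passed on to another command. A minimal sketch, where `count_args` and `show_difference` are hypothetical helper functions:

```shell
#!/bin/bash
count_args() { echo "$#"; }           # Prints how many arguments it received.

show_difference() {
    with_at=$(count_args "$@")        # "$@" keeps each argument as its own word.
    with_star=$(count_args "$*")      # "$*" joins all arguments into one word.
    echo "Received $#: \"\$@\" passes on $with_at, \"\$*\" passes on $with_star."
}

show_difference "Alice Smith" "Bob"
# Received 2: "$@" passes on 2, "$*" passes on 1.
```

This is why `"$@"` is the usual choice when forwarding arguments, especially if they may contain spaces.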
A shell script is simply a file containing a sequence of commands. It starts with a shebang (#!) that tells the system which interpreter to use.
Example:
#!/bin/bash
echo "Hello, World!"
Make the script executable and run it:
chmod +x my_script.sh
./my_script.sh
Shell variables store data and can be set as follows:
name="Alice"
echo "Hello, $name!"
Input can be read from the user with `read`:
echo "Enter your name:"
read user_name
echo "Welcome, ${user_name}!"
Conditional statements use `if`:
#!/bin/bash
echo "Enter a number:"
read num
if [ "$num" -gt 10 ]; then
echo "Number is greater than 10."
else
echo "Number is 10 or less."
fi
See also: Bash conditional expressions, POSIX shell cheat sheet.
Multiple choices can be handled with `case`:
echo "Choose an option: start, stop, restart"
read action
case $action in
start) echo "Starting service...";;
stop) echo "Stopping service...";;
restart) echo "Restarting service...";;
*) echo "Invalid option";;
esac
A `for` loop:
for i in 1 2 3 4 5
do
echo "Iteration $i"
done
A `while` loop:
count=1
while [ $count -le 5 ]
do
echo "Count: $count"
((count++))
done
An `until` loop:
num=0
until [ $num -eq 3 ]
do
echo "Number: $num"
((num++))
done
- `set -e`: Exit script on error.
- `set -x`: Enable debugging (prints each command before execution).
- `trap`: Catch errors and execute custom actions.
Example:
trap 'echo "An error occurred!"' ERR
function cleanup() { ... }
trap cleanup EXIT
set -e
mkdir my_directory
cd my_directory
rm nonexistent_file # This will trigger the trap.
Using `$1`, `$2`, etc. to read input:
echo "First argument: $1"
echo "Second argument: $2"
Scripts share the same syntax as functions for parsing arguments.
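Error handling and positional arguments are often used together. The sketch below defines a hypothetical helper, `count_lines`, that takes a file name as `$1`, works on a scratch copy, and relies on `set -e` and `trap` to stop and clean up when something fails. Its body runs in a subshell, so `set -e` and the traps stay local to the function; the `ERR` trap is Bash-specific.

```shell
#!/bin/bash
# Hypothetical helper: counts the lines of the file given as $1, using a scratch copy.
count_lines() (
    set -e                                          # Exit the subshell on the first error.
    input=$1
    scratch=$(mktemp -d)
    trap 'rm -rf "$scratch"' EXIT                   # Clean up, whatever happens.
    trap 'echo "Error while processing: $input" >&2' ERR
    cp "$input" "$scratch/copy"
    wc -l < "$scratch/copy" | tr -d ' '
)

sample=$(mktemp)
printf 'one\ntwo\nthree\n' > "$sample"
lines=$(count_lines "$sample")
echo "Lines: $lines"                                # Lines: 3

count_lines "/nonexistent/file"                     # cp fails: ERR trap, then EXIT trap.
status=$?
echo "Exit status: $status"
rm "$sample"
```

The failing call is deliberately not written as `count_lines ... || ...`: inside such a list, Bash ignores `set -e` and the `ERR` trap for the commands of the function.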
More advanced argument handling with getopts:
while getopts "u:p:" opt; do
case $opt in
u) username=$OPTARG ;;
p) password=$OPTARG ;;
*) echo "Invalid option" ;;
esac
done
echo "User: $username, Password: $password"
A complete example with a function and its special parameters:
#!/bin/bash
my_function() {
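echo "Script name: $0" # $0 is the script name, even inside a function.
echo "Function name: ${FUNCNAME[0]}" # Bash-specific.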
echo "First argument: $1"
echo "Second argument: $2"
echo "All arguments (\$@): $@" # As separate strings.
echo "All arguments (\$*): $*" # As a single string.
echo "Number of arguments: $#"
}
my_function "Alice" "Bob" "Charlie"
