Lecture VII - NumPy for Scientific Computing

Programming with Python

Dr. Tobias Vlćek

Kühne Logistics University Hamburg - Fall 2025

Quick Recap of the last Lecture

Modules

  • Modules are .py files containing Python code
  • They are used to organize and reuse code
  • They can define functions, classes, and variables
  • Can be imported into other scripts

We can import entire modules or individual functions, classes or variables.
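
As a brief reminder, both import styles might look like the following sketch. The modules math and random are chosen only for illustration.

```python
# Import an entire module and access its contents via the module name
import math

# Import a single function directly into the current namespace
from random import randint

root = math.sqrt(16)     # call through the module name
roll = randint(1, 6)     # call the imported function directly
print(root, roll)
```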

Standard Libraries

  • Python includes many built-in modules like:
    • random provides functions for random numbers
    • os allows interaction with the operating system
    • csv is used for reading and writing CSV files
    • re is used for working with regular expressions

Packages

  • Packages are collections of modules
  • Often available from the Python Package Index (PyPI)
  • Install using uv add <package_name>
  • Virtual environments help manage dependencies

Virtual environments are not that important for you right now, as they are mostly used if you work on several projects with different dependencies at once.

NumPy Module

What is NumPy?

  • NumPy is a package for scientific computing in Python
  • Provides large, multi-dimensional arrays and matrices
  • Wide range of functions to operate on these
  • Python lists can be slow for numerical work - NumPy arrays are much faster

The name of the package comes from Numerical Python.

Why is NumPy so fast?

  • Arrays are stored in a contiguous block of memory
  • This allows for efficient memory access patterns
  • Operations run in compiled code, written mainly in C (with some C++)
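
The points above can be illustrated with a small sketch. The variable names are made up for this example, and it only compares results and inspects the memory layout; it does not measure speed.

```python
import numpy as np

values = list(range(1_000))
array = np.array(values)

# Pure-Python approach: loop over every element
squared_list = [v * v for v in values]

# NumPy approach: one vectorized operation executed in compiled code
squared_array = array * array

# Both approaches give the same numbers
print(np.array_equal(squared_array, squared_list))

# A freshly created array occupies one contiguous block of memory
print(array.flags["C_CONTIGUOUS"])
```

To see the speed difference on your own machine, you could time both approaches, for example with the timeit module.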

Question: Have you heard of C and C++?

How to get started

  1. Install NumPy using uv add numpy
  2. Import NumPy in a script using import numpy as np
import numpy as np
x = np.array([1, 2, 3, 4, 5]); type(x)
numpy.ndarray

You don’t have to use as np, but it is common practice to do so.

Array Basics

Creating Arrays

  • The backbone of NumPy is the so-called ndarray
  • Can be initialized from different data structures:
import numpy as np

array_from_list = np.array([1, 1, 1, 1])
print(array_from_list)
[1 1 1 1]
import numpy as np

array_from_tuple = np.array((2, 2, 2, 2))
print(array_from_tuple)
[2 2 2 2]

Heterogeneous Data Types

  • You can pass different data types to np.array()
  • NumPy then converts all elements to one common type (here: strings)
import numpy as np

array_different_types = np.array(["s", 2, 2.0, "i"])
print(array_different_types)
['s' '2' '2.0' 'i']

But this is usually not recommended, as it can lead to unexpected conversions and performance issues. If possible, try to keep the types homogeneous.
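
A short sketch of what happens to mixed inputs. The variable names are hypothetical, and dtype=object is shown only as one way to keep the original Python objects.

```python
import numpy as np

mixed = np.array(["s", 2, 2.0, "i"])                  # everything becomes a string
numbers = np.array([1, 2.5, 3])                       # integers are promoted to float
as_objects = np.array(["s", 2, 2.0], dtype=object)    # keeps Python objects, but is slow

print(mixed.dtype.kind)       # 'U' stands for unicode string
print(numbers.dtype)          # float64
print(type(as_objects[1]))    # <class 'int'>
```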

Prefilled Arrays

Improve performance by allocating memory upfront

  • np.zeros(shape): to create an array of zeros
  • np.random.rand(d0, d1, ...): array of random values in [0, 1)
  • np.arange(start, stop, step): evenly spaced values with a fixed step
  • np.linspace(start, stop, num): num evenly spaced values

The shape refers to the size of the array. It can have one or multiple dimensions.
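
A minimal sketch of the four functions, with arbitrarily chosen arguments. Note that np.random.rand takes the dimensions as separate arguments rather than as a single tuple.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
randoms = np.random.rand(2, 3)    # 2x3 array of random values in [0, 1)
stepped = np.arange(0, 10, 2)     # step size 2, stop is excluded
spaced = np.linspace(0, 1, 5)     # 5 values, stop is included

print(zeros.shape, randoms.shape)
print(stepped)                    # [0 2 4 6 8]
print(spaced)
```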

Dimensions

  • The shape is specified as a tuple for these arrays
  • (2,) or 2 creates a 1-dimensional array (vector)
  • (2,2) creates a 2-dimensional array (matrix)
  • (2,2,2) 3-dimensional array (3rd order tensor)
  • (2,2,2,2) 4-dimensional array (4th order tensor)
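
The shapes listed above can be checked with a short loop. This is only a sketch; the names shapes and dims are chosen for illustration.

```python
import numpy as np

shapes = [(2,), (2, 2), (2, 2, 2), (2, 2, 2, 2)]
dims = {}

for shape in shapes:
    array = np.zeros(shape)
    dims[shape] = array.ndim   # number of dimensions of the array
    print(shape, "->", array.ndim, "dimension(s),", array.size, "elements")
```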

Arrays in Action

Task: Practice working with NumPy:

# TODO: Create a 3-dimensional tensor filled with zeros
# Choose the shape of the tensor, but it should have 200 elements
# Add the number 5 to all values of the tensor

# Your code here
assert tensor.sum() == 1000

# TODO: Print the shape of the tensor using the attribute shape
# TODO: Print the dtype of the tensor using the attribute dtype
# TODO: Print the size of the tensor using the attribute size

Operations

Element-wise Operations

  • We can apply operations to the entire array at once
  • This is much faster than looping over the elements in Python

Question: What would happen here?

import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Addition:", a + b)
print("Multiplication:", a * b)
print("Division:", a / b)
print("Power:", a ** 2)
Addition: [5 7 9]
Multiplication: [ 4 10 18]
Division: [0.25 0.4  0.5 ]
Power: [1 4 9]

Matrix vs Element-wise

Important: * is element-wise, @ is matrix multiplication:

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print("Element-wise (*):"); print(A * B); print("---")
print("Matrix multiplication (@):"); print(A @ B)
Element-wise (*):
[[ 5 12]
 [21 32]]
---
Matrix multiplication (@):
[[19 22]
 [43 50]]

Broadcasting

NumPy can perform operations on arrays with different shapes!

Question: What would you expect here?

matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([10, 20, 30])

print("Matrix:"); print(matrix); print("Vector:", vector); print("---")
print("Broadcasting result:"); print(matrix + vector)
Matrix:
[[1 2 3]
 [4 5 6]]
Vector: [10 20 30]
---
Broadcasting result:
[[11 22 33]
 [14 25 36]]

Broadcasting Rules

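Shapes are compared dimension by dimension, starting from the last:

  1. If one array has fewer dimensions, its shape is padded with 1s on the left
  2. Two dimensions are compatible if they are equal OR one of them is 1
  3. Dimensions of size 1 are stretched to match the other array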
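
The rules above can be explored without creating large arrays. The following sketch assumes NumPy 1.20 or newer, where np.broadcast_shapes is available; the example shapes are arbitrary.

```python
import numpy as np

# np.broadcast_shapes reports the resulting shape without creating any data
print(np.broadcast_shapes((2, 3), (3,)))     # (3,) is treated as (1, 3)
print(np.broadcast_shapes((2, 1), (1, 3)))   # each length-1 dimension is stretched

# Incompatible: the trailing dimensions are 3 and 2, and neither is 1
try:
    np.broadcast_shapes((2, 3), (2,))
    compatible = True
except ValueError:
    compatible = False
print("Compatible:", compatible)

# Adding an axis turns shape (2,) into (2, 1), which is compatible with (2, 3)
column = np.array([10, 20])[:, np.newaxis]
result = np.ones((2, 3)) + column
print(result.shape)
```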

Question: What happens with shapes (3,4) and (4,)?

Coffee Shop Analytics

Task: Analyze coffee shop sales using broadcasting:

# Daily sales for 3 products over 4 days (Monday-Thursday)
sales = np.array([[25, 30, 28, 35],    # Coffee
                  [15, 18, 20, 22],    # Pastries
                  [8, 12, 10, 15]])    # Sandwiches

print("Original sales:\n", sales)

# TODO: Apply a 10% Friday discount to all products using broadcasting


# TODO: Calculate total revenue if coffee=3€, pastries=2€, sandwiches=5€

Axis-specific Operations

For multi-dimensional arrays, specify the axis:

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print("Matrix:"); print(matrix); print("---")
print("Sum along axis 0 (columns):", np.sum(matrix, axis=0))
print("Mean along axis 1 (rows):", np.mean(matrix, axis=1))
Matrix:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
---
Sum along axis 0 (columns): [12 15 18]
Mean along axis 1 (rows): [2. 5. 8.]

Joining Arrays

  • You can use concatenate() to join arrays
  • With axis you can specify the dimension
  • In two dimensions, hstack() and vstack() are easier

Question: What do you expect will be printed?

import numpy as np
ones = np.array((1,1,1,1))
twos = np.array((1,1,1,1)) * 2
print(np.vstack((ones,twos))); print(np.hstack((ones,twos)))
[[1 1 1 1]
 [2 2 2 2]]
[1 1 1 1 2 2 2 2]
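
For comparison, a sketch of np.concatenate with an explicit axis on two small 2-dimensional arrays. The arrays a and b are made up for this example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.concatenate((a, b), axis=0)   # same result as np.vstack((a, b))
stacked_cols = np.concatenate((a, b), axis=1)   # same result as np.hstack((a, b))

print(stacked_rows.shape)   # (4, 2)
print(stacked_cols.shape)   # (2, 4)
```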

Statistical Functions

NumPy provides many built-in statistical functions:

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print("Mean:", np.mean(data))
print("Standard deviation:", np.std(data))
print("Minimum:", np.min(data))
print("Maximum:", np.max(data))
print("Sum:", np.sum(data))
Mean: 5.5
Standard deviation: 2.8722813232690143
Minimum: 1
Maximum: 10
Sum: 55

Student Grade Analysis

Task: Analyze a class of student grades:

# German grades: 1.0 (best) to 5.0 (worst), 4.0 is the passing grade
grades = np.array([1.0, 1.3, 1.7, 2.0, 2.3, 2.7, 3.0, 3.3,
                   3.7, 4.0, 4.0, 2.3, 1.7, 2.0, 3.3, 2.7])

print("Student grades:", grades)

# TODO: Calculate and print the following statistics:
# - Class average
# - Standard deviation
# - Best grade
# - Worst grade

# Your statistical analysis here

Indexing and Methods

Indexing and Slicing

  • Accessing and slicing an ndarray works as with lists
  • Elements in higher dimensions are accessed with multiple indices

Question: What do you expect will be printed?

import numpy as np
x = np.random.randint(0, 10, size=(3, 3))
print(x); print("---")
print(x[0:2,0:2])
[[3 2 7]
 [9 2 4]
 [2 9 1]]
---
[[3 2]
 [9 2]]

Boolean Indexing

Use boolean arrays to filter elements based on conditions:

import numpy as np
data = np.array([1, 5, 3, 8, 2, 9, 4])
print("Original data:", data)
print("Values > 4:", data[data > 4])
print("Even numbers:", data[data % 2 == 0])
Original data: [1 5 3 8 2 9 4]
Values > 4: [5 8 9]
Even numbers: [8 2 4]

Fancy Indexing

Access multiple elements using arrays of indices:

data = np.array([10, 20, 30, 40, 50])
indices = np.array([0, 2, 4])
print("Original data:", data)
print("Selected elements:", data[indices])
Original data: [10 20 30 40 50]
Selected elements: [10 30 50]

Question: What happens if we use negative indices?

Most Common Methods

  • sort(): sort the array from low to high
  • reshape(): reshape the array into a new shape
  • flatten(): flatten the array into a 1D array
  • squeeze(): remove all axes of length 1 from the array
  • transpose(): transpose the array

Try experimenting with these methods; they can make your work much easier.
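
A starting point for such experiments might look like this sketch. The array data and the variable names are chosen only for illustration.

```python
import numpy as np

data = np.array([[3, 1, 2], [6, 5, 4]])

sorted_rows = np.sort(data)                  # sorted copy; data.sort() would sort in place
reshaped = data.reshape(3, 2)                # same 6 elements, new shape
flat = data.flatten()                        # 1D copy of all elements
transposed = data.transpose()                # rows and columns swapped, same as data.T
squeezed = np.zeros((1, 3, 1)).squeeze()     # axes of length 1 are removed

print(sorted_rows)
print(reshaped.shape, flat.shape, transposed.shape, squeezed.shape)
```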

Weekly Fitness Challenge

Task: Track your weekly step count progress:

# Your daily step counts for one week
steps = np.array([8500, 12000, 6800, 9500, 11200, 15000, 7800])
days = np.array(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"])
target = 10000  # Daily step goal

# TODO: Find days where you reached your target (>= 10000 steps)


# TODO: Print the successful days using fancy indexing


# TODO: Calculate your weekly achievement rate (percentage of successful days)

Types

Data Types

  • NumPy identifies data types by character codes
  • i: integer
  • b: boolean
  • f: float
  • S: byte string
  • U: unicode string
string_array = np.array(["Hello", "World"]); string_array.dtype
dtype('<U5')
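
The character codes can be inspected through the kind attribute of a dtype. The dictionary below is a hypothetical collection of small example arrays.

```python
import numpy as np

examples = {
    "integers": np.array([1, 2, 3]),
    "floats": np.array([1.0, 2.5]),
    "booleans": np.array([True, False]),
    "text": np.array(["Hello", "World"]),
    "bytes": np.array([b"Hello", b"World"]),
}

# Collect the one-letter kind code of every example array
kinds = {name: array.dtype.kind for name, array in examples.items()}

for name, array in examples.items():
    print(name, array.dtype, kinds[name])
```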

Enforcing Data Types

  • We can also provide the type when creating arrays
x = np.array([1, 2, 3, 4, 5], dtype='f'); print(x.dtype)
float32
  • Or we can change them for existing arrays
x = np.array([1, 2, 3, 4, 5], dtype='f'); print(x.astype('i').dtype)
int32

Note how the types are reported as int32 and float32.

Sidenote: Bits

Question: Do you have an idea what 32 stands for?

  • It’s the number of bits used to represent a number
    • int16 is a 16-bit integer
    • float32 is a 32-bit floating point number
    • int64 is a 64-bit integer
    • float128 is a 128-bit floating point number (only available on some platforms)

Why do Bits Matter?

  • They matter, because they can affect:
    • the performance of your code
    • the precision of your results
  • That’s why numbers have a limited range and precision!
    • An int8 has to be in the range of -128 to 127
    • An int16 has to be in the range of -32768 to 32767
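
These limits can be looked up with np.iinfo. The second half of this sketch uses the value 2**24 as an illustrative example of limited float precision.

```python
import numpy as np

# Integer types have a fixed range that depends on the number of bits
for int_type in (np.int8, np.int16):
    info = np.iinfo(int_type)
    print(int_type.__name__, "range:", info.min, "to", info.max)

# Floats have limited precision: float32 cannot tell these two numbers apart
big = 16_777_216                                # 2**24
same_in_float32 = np.float32(big) == np.float32(big + 1)
same_in_float64 = np.float64(big) == np.float64(big + 1)
print(same_in_float32)   # True
print(same_in_float64)   # False
```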

Question: Size difference between int16 and int64?

That’s it for today!

And that’s it for today’s lecture!
You now have the basic knowledge to start working with scientific computing.

Literature

Interesting Books

  • Downey, A. B. (2024). Think Python: How to think like a computer scientist (Third edition). O’Reilly. Link to free online version
  • Elter, S. (2021). Schrödinger programmiert Python: Das etwas andere Fachbuch (1. Auflage). Rheinwerk Verlag.

For more interesting literature to learn more about Python, take a look at the literature list of this course.