# Important Python packages and how to use them

A big part of Python's power is the vast number of packages available for nearly every task.
Before you implement something yourself, it is always worth a quick web search to check whether
a package already does just that.

### important packages we will discuss:

- numpy: fast processing of large numerical arrays. It implements many mathematical functions, so its usage is highly recommended for any sort of math task
- matplotlib: plots of all sorts

## 1. import
Packages can be imported with the import keyword.
For some packages there is a common alias, like np for numpy. You can specify an alias as shown below. You can also import only a certain module from a package, or only a function or class from a module.

In [1]:
# packages are included this way:
import numpy

# you can give an 'alias' to imported packages
import numpy as np

# you can import specific functions or classes from a module
from scipy.optimize import curve_fit
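
As a quick check of the import styles above, here is a minimal sketch. It uses the standard-library math module purely for illustration (it is not part of the cell above) and shows that an alias is only a second name for the same module, and that a directly imported function is called without a prefix.

```python
import math
import numpy
import numpy as np
from math import sqrt

# the alias refers to exactly the same module object
same_module = np is numpy
print("np is numpy:", same_module)

# a directly imported function is used without the module prefix
root = sqrt(16.0)
print("sqrt(16.0) =", root, "| same as math.sqrt:", root == math.sqrt(16.0))
```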

## 2. numpy: numerical python

Processing arrays of numbers is needed nearly all the time. If you work with numerical arrays, you should almost always use numpy arrays.

In [None]:
# numpy arrays are the recommended choice whenever you work with arrays of numbers
a = np.array([1,2,3,4])

print(a)

# the handling is quite similar to lists
print(a[-2:-1]) # from the second-to-last entry up to (but excluding) the last one -> [3]

# very useful: functions called with numpy arrays normally act on each entry without code changes:
def f(x):
    return x**2

# a big exception is the if statement, which won't work as expected on arrays!

print(f(a))
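
To illustrate the remark about if statements, here is a small sketch. The function names clip_scalar and clip_array are made up for this example. It shows that an if statement on a whole array raises a ValueError, and that np.where is one possible entry-wise replacement.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

def clip_scalar(x):
    # works for single numbers only
    if x > 2:
        return 2
    return x

# calling the scalar version with an array fails, because the truth
# value of an array with more than one entry is ambiguous
try:
    clip_scalar(a)
    raised = False
except ValueError:
    raised = True
print("if statement raised ValueError:", raised)

def clip_array(x):
    # np.where evaluates the condition for every entry separately
    return np.where(x > 2, 2, x)

result = clip_array(a)
print(result)
```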

### numpy - some important features
You will almost certainly need the following functions.

In [None]:
# important features:

# array of length 4 filled with zeros
a = np.zeros(4)
print(a)

# array of length 3 filled with ones
b = np.ones(3)
print(b)


# matrix of zeros with the size 3x4, meaning there will be 3 rows and 4 columns
a = np.zeros((3,4))
print(a)
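
A short sketch of two details that are easy to miss with these creation functions: the default data type is a float, and the optional dtype argument changes that. np.full is shown as a related function that the cells above do not mention.

```python
import numpy as np

# without a dtype argument the entries are floats
z = np.zeros(4)
print(z.dtype)

# the dtype argument selects another type, here integers
zi = np.zeros((3, 4), dtype=int)
print(zi.shape, zi.dtype)

# np.full fills an array of the given shape with an arbitrary value
sevens = np.full((2, 3), 7.0)
print(sevens)
```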

In [None]:
# linearly spaced numbers between start and end (end included)
# the third argument gives the number of points
x = np.linspace(0, 10, 21)
print('x =', x)

# logarithmically spaced numbers; the first two arguments are the exponents (base 10)
# this means from 10**1 = 10 up to 10**5 = 100000
y = np.logspace(1, 5, 11)
print('y =', y)

# numbers between low (included) and high (excluded) with the step width as third argument
k = np.arange(0, 16, 3)
print('k =', k)

# many functions for arrays are efficiently implemented in numpy
y = np.exp(x)
print('np.exp(x) =', y)

# similar: np.log, np.sqrt, np.sin, np.cos, ...
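
The different endpoint conventions of these three functions are a common source of off-by-one errors. The following sketch repeats the calls from the cell above and checks the endpoints explicitly.

```python
import numpy as np

# linspace: third argument is the number of points, end point included
x = np.linspace(0, 10, 21)
step = x[1] - x[0]
print("first:", x[0], "last:", x[-1], "step:", step)

# arange: third argument is the step width, end point excluded
k = np.arange(0, 16, 3)
print("k =", k)  # stops at 15, because 18 would exceed the upper bound

# logspace is equivalent to 10 raised to a linspace of exponents
y = np.logspace(1, 5, 11)
same = np.allclose(y, 10.0 ** np.linspace(1, 5, 11))
print("logspace equals 10**linspace:", same)
```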


In [None]:
# in np.random there are several random generators, for example normal, uniform, poisson, ...
x = np.random.normal(0, 1, 1000) #this means, the mean of the distribution is 0, the sigma is 1 and we generate 1000 numbers

# there are many useful methods implemented for arrays like mean, var, std, sum, prod
mean = x.mean()
print("Mean for x:", mean)
variance = x.var()
print("Variance for x:", variance)
# ddof=1 gives the sample standard deviation; var() above uses the default ddof=0
std_deviation = x.std(ddof=1)
print("Std. deviation for x:", std_deviation)
x_sum = x.sum()
print("Sum of x:", x_sum)
x_prod = x.prod()
print("Prod of x:", x_prod)
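
The ddof argument used for the standard deviation deserves a closer look. The sketch below uses a seeded generator from np.random.default_rng (an assumption made here for reproducibility; the cell above uses np.random.normal directly) and compares the default ddof=0 with ddof=1, which divides by N-1 instead of N.

```python
import numpy as np

# seeded generator so that the numbers are reproducible
rng = np.random.default_rng(seed=42)
x = rng.normal(0, 1, 1000)
n = x.size

# default: divide the sum of squared deviations by N
var_population = x.var()
# ddof=1: divide by N - 1 (sample variance)
var_sample = x.var(ddof=1)

# the same sample variance computed by hand
var_by_hand = ((x - x.mean()) ** 2).sum() / (n - 1)

print("ddof=0:", var_population)
print("ddof=1:", var_sample)
print("by hand:", var_by_hand)
print("std(ddof=1)**2:", x.std(ddof=1) ** 2)
```

For 1000 entries the difference is small, but for short arrays the choice of ddof matters noticeably.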

In [None]:
# Finding the maximum/minimum or their indices is also quickly done via

x_max = x.max()
print("Max value in x:", x_max)

x_min = x.min()
print("Min value in x:", x_min)

index_max = x.argmax() # or argmin()
print("Index of the maximum:", index_max)

# Check whether it is the right index
print(x[index_max])
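
For arrays with more than one dimension, argmax returns an index into the flattened array. A minimal sketch with a small hand-written matrix shows how np.unravel_index converts it back, and how the axis argument restricts the search to rows or columns.

```python
import numpy as np

m = np.array([[1, 7, 3],
              [9, 2, 5]])

# index of the maximum in the flattened array
flat_index = m.argmax()
print("flat index:", flat_index)

# convert the flat index back to (row, column)
row, col = np.unravel_index(flat_index, m.shape)
print("row, col:", row, col, "-> value", m[row, col])

# maximum of every column (axis=0 runs over the rows)
col_max = m.max(axis=0)
print("column maxima:", col_max)

# position of the maximum within every row
row_argmax = m.argmax(axis=1)
print("argmax per row:", row_argmax)
```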

In [None]:
# How to handle higher-dimensional arrays

# We create an array of shape 100x100x100 with normally distributed random values (mean 0, sigma 1)
x_3dim = np.random.normal(0,1, [100,100,100])

# With .shape one can check the shape of an array
print("Shape:", x_3dim.shape)

# With .size one can check the number of entries of an array
print("Size:", x_3dim.size)

# With .reshape one can change the dimensions of an array, for example to have 1D indices
# Warning: this only works if the size of the array stays the same
x_flattened = x_3dim.reshape(100*100*100)

# Or you can use .flatten() to get a 1D array
x_flattened_2nd_ver = x_3dim.flatten()

print("Shape of flattened array:", x_flattened.shape)
print("Size of flattened array:", x_flattened.size)
print("Size of flattened array according to 2nd version: ", x_flattened_2nd_ver.size)
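
One practical difference between the two approaches: reshape returns a view on the same data whenever that is possible, while flatten always returns a copy. The sketch below demonstrates this on a small array; reshape(-1) lets numpy work out the length itself.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# -1 tells numpy to infer the length from the size of the array
view = a.reshape(-1)
copy = a.flatten()

# writing into the view also changes the original array ...
view[0] = 100
print("a after writing into the view:\n", a)

# ... while the copy made earlier is unaffected
print("copy:", copy)

print("a and view share memory:", np.shares_memory(a, view))
print("a and copy share memory:", np.shares_memory(a, copy))
```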

In [None]:
# Array / Matrix Operations
# We initialize 2 random arrays of shape 2x2 with random integers from 0 to 9 (the upper bound 10 is excluded)
x = np.random.randint(0,10,[2,2])
y = np.random.randint(0,10,[2,2])
print("x = \n", x)
print("y = \n", y)

In [None]:
# Addition and subtraction work element-wise, as expected
print("x+y = \n", x+y)
print("x-y = \n", x-y)
# * and / are element-wise multiplication/division
# (where y is 0, x/y yields inf or nan together with a RuntimeWarning)
print("x*y = \n", x*y)
print("x/y = \n", x/y)

In [None]:
# Matrix Multiplication is possible via @
print("x@y = \n", x@y)
# Matrices can be transposed via .T
print("x = \n", x)
print("x.T = \n", x.T)
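
Because the arrays above are random, the difference between * and @ is easier to verify with fixed values. The following sketch uses two hand-written 2x2 matrices.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
y = np.array([[0, 1],
              [1, 0]])

# element-wise product: every entry is multiplied with its counterpart
elementwise = x * y
print("x*y = \n", elementwise)

# matrix product: rows of x times columns of y
# multiplying with this y from the right swaps the columns of x
matrix_product = x @ y
print("x@y = \n", matrix_product)

# transposing swaps rows and columns
print("x.T = \n", x.T)
```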

### masking
Very often you want to "filter" an array by a value, e.g. select all values larger than a certain threshold. This can be done very easily with a boolean mask!

In [None]:
x = np.random.normal(0,1,5)

# a comparison yields a boolean array (mask) that can be used for indexing
print('Gaussian-distributed x =', x)
print('Is x>0.2?', x>0.2)
print('The numbers with x>0.2 are ',x[x>0.2])
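
A mask can do more than select values. The sketch below, with fixed numbers instead of random ones, counts the selected entries and uses a mask on the left-hand side of an assignment to overwrite only those entries.

```python
import numpy as np

x = np.array([-1.5, 0.3, 2.0, -0.2, 0.7])

# True counts as 1, so the sum of a mask is the number of selected entries
mask = x > 0.2
n_selected = mask.sum()
print("number of entries with x>0.2:", n_selected)
print("selected values:", x[mask])

# a mask on the left-hand side assigns only to the selected entries
x_clipped = x.copy()
x_clipped[x_clipped < 0] = 0.0
print("negative entries set to zero:", x_clipped)
```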

In [None]:
# often overlooked but an extremely powerful tool:
x = np.arange(-5, 5)
print("Values: ", x)

# entry-wise evaluation of a condition
print("Values are below zero at these indices: ", np.where(x < 0)[0])
# with three arguments, np.where picks a value entry-wise: here 0 where the condition is True, x itself where it is False
print("Rectified values: ", np.where(x < 0, 0, x))

# you can chain multiple evaluations with 
# AND: (first condition in brackets) & (second condition in brackets)
# OR: (first condition in brackets) | (second condition in brackets)
print("Values: ", x)
print("Values are below zero and above -3 at these indices: ", np.where((x < 0) & (x > -3))[0])
print("Values are below zero or above 1 at these indices: ", np.where((x < 0) | (x > 1))[0])
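
Combined conditions are themselves boolean masks, so they can be stored, inverted and counted. A minimal sketch follows; the variable names inside and outside are chosen for this example only.

```python
import numpy as np

x = np.arange(-5, 5)

# the brackets are required, because & and | bind more tightly than < and >
inside = (x > -3) & (x < 0)
print("values between -3 and 0:", x[inside])

# ~ inverts a mask (entry-wise NOT)
outside = ~inside
print("all other values:", x[outside])

# count how many entries fulfil a condition
n_outside = np.count_nonzero(outside)
print("number of other values:", n_outside)
```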

## 3. matplotlib
matplotlib is a flexible library for creating plots. Usually pyplot from matplotlib is imported with the alias plt.

In [None]:
import matplotlib.pyplot as plt

# fill a histogram with the created numbers and show the histogram
x = np.random.normal(0, 1, 1_000)
plt.hist(x, bins=20)
plt.show()

### python histograms
Python does not treat histograms quite the same way as, for example, ROOT. In Python they are stored and handled as a collection of:
> (bin contents, bin edges, "drawn" bars for display)
<br>

You can get these from the return value of plt.hist.

In [None]:
things = plt.hist(x)
print(things)
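
If only the numbers are needed and nothing should be drawn, np.histogram returns the bin contents and bin edges without the bars. The sketch below uses a seeded generator (an assumption for reproducibility) and also computes the bin centres from the edges.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
x = rng.normal(0, 1, 1000)

# returns (bin contents, bin edges); nothing is drawn
counts, edges = np.histogram(x, bins=20)

# there is always one more edge than there are bins
print("number of bins:", len(counts))
print("number of edges:", len(edges))

# bin centres: mean of each pair of neighbouring edges
centres = (edges[:-1] + edges[1:]) / 2
print("first bin centre:", centres[0])

# by default the bins span min(x) to max(x), so every value is counted
print("sum of bin contents:", counts.sum())
```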

### Example 3.1: Distributions and histograms

Look up another distribution available from numpy random. Generate some data points, fill them into a histogram and show it.

### plot with matplotlib
This is an example of a more "complex" plot

In [None]:
# let's create an array and evaluate a formula on it
x = np.linspace(-4, 4) # by default, linspace creates 50 equally spaced numbers, here between -4 and 4
y = x**2

# get a figure and specify the size (in inches ...)
# first number corresponds to the width, second to the height
# dpi are "dots per inch" and refer to image quality
# 300 is good for printing (like a thesis), but you should prefer pdf files for this anyway
f = plt.figure(figsize=(6,4), dpi=300) 

# plot the values, add a 'style' and a label
plt.plot(x, y, 'k+', label = '$f(x) = x^2$') # k stands for black, the + will plot a + as the marker

# add labels to the axes
plt.xlabel('$x$')
plt.ylabel('$y$')

# add a legend at the best position
plt.legend(loc='best')

# activate the grid (True is optional as argument)
plt.grid()

# adjust the spacing so that labels and titles are not cut off
plt.tight_layout()

# save to file
plt.savefig('plot.png')

# show
plt.show()
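
The same plot can also be written with matplotlib's object-oriented interface, where the figure and axes are explicit objects. This is only an alternative sketch of the cell above, not a replacement. It saves a PDF, as recommended for printing; the file name plot.pdf is chosen for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-4, 4)
y = x**2

# create a figure and a single axes object in one call
fig, ax = plt.subplots(figsize=(6, 4))

# the plotting functions are methods of the axes object
ax.plot(x, y, 'k+', label='$f(x) = x^2$')

# the label functions carry a set_ prefix here
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.legend(loc='best')
ax.grid()

# layout and saving are methods of the figure
fig.tight_layout()
fig.savefig('plot.pdf')
```

In a notebook the figure is displayed at the end of the cell; in a script, call plt.show() afterwards.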

### Example 3.2: Enhance your histograms
Add a label, x- and y-axis labels and a legend to your histograms. Use a different color and save the figure.

### Example 3.3: Red dashed sine function
Create an array from 0 to 2pi (np.pi is available ...) and compute the sine function of it. Plot it to a canvas using a red dashed line.

### Example 3.4: Slicing plots
Plot positive values black, negative ones red.

### Example 3.5: Add gaussian noise
Add gaussian noise to the sine plot and use the same color for all data points again.

### errorbar
For many plots in physics you also need to show the uncertainties of a given value. This can be done with the errorbar method, passing the uncertainties as yerr=values.

In [None]:
# another very important 'plotstyle': errorbar

# get another figure
plt.figure(figsize=(6,4), dpi=300)

# the hist function returns the bin contents, the bin edges and the drawn bars (patches)
x = np.random.normal(0, 1, 1000)
content, edges, patches = plt.hist(x, label='histogram')
#print(edges)
#print(edges[:-1])
#print(edges[1:])
#print((edges[:-1]+edges[1:])/2)

# add the errorbar plot
# the style is specified using the fmt argument
## we need the bin centres for the position of the errorbars
## we can get them from the bin edges by averaging each pair of neighbouring edges
## we can use our known form of indexing for that
### plt.errorbar(x-position, y-position, yerr=values, xerr=values, ...)
plt.errorbar((edges[:-1]+edges[1:])/2, y=content, yerr=np.sqrt(content), fmt='+', label='uncertainties')

# adding the legend and show
plt.legend(loc='best')
plt.show()
plt.clf()
# errorbar also accepts an array of shape (2, N) as yerr for asymmetric errorbars
# (first row: lower errors, second row: upper errors)

# other useful plot styles: semilogy, semilogx, loglog
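
The two closing remarks can be combined in one small sketch: asymmetric errorbars on a logarithmic y-axis. The data values and the relative uncertainties of 30% and 60% are invented for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# invented example data spanning several orders of magnitude
x = np.arange(1, 6)
y = 10.0**x

# asymmetric uncertainties: 30% downwards, 60% upwards
lower = 0.3 * y
upper = 0.6 * y

# yerr of shape (2, N): first row lower errors, second row upper errors
yerr = np.array([lower, upper])
print("shape of yerr:", yerr.shape)

fig, ax = plt.subplots(figsize=(6, 4))
ax.errorbar(x, y, yerr=yerr, fmt='+', label='asymmetric uncertainties')

# logarithmic y-axis, same effect as plotting with semilogy
ax.set_yscale('log')
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.legend(loc='best')
```

In a notebook the figure is displayed at the end of the cell; in a script, call plt.show() afterwards.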