Notebook 1: A brief introduction to Python and Jupyter¶

In this course, we will use the programming language Python (https://www.python.org/) to put into practice the AI and machine learning algorithms we are going to discuss. To make Python easier to use, we will also work with Jupyter (http://jupyter.org/) notebooks. We will therefore first spend a bit of time familiarizing ourselves with these tools.

How to use Jupyter¶

Using cells¶

In a Jupyter Notebook, you have cells that you can fill with Python code. As we do not assume you know Python yet, let us just mention four simple types of valid Python statements:

Any number is valid Python code: 54, 34.2
Any arithmetic operation on numbers: 23+45, 12.2*3-1
The print() command can be used to explicitly ask for a number to be printed: print(45)
A line starting with # is a comment: it is not executed and usually contains an explanation of the code

Executing a cell¶

To execute the content of a cell, press Ctrl+Enter (simply pressing Enter will just go to the next line inside the cell).

For example, enter a simple Python line in the next cell. Like:

print(3+5)

Then press Ctrl+Enter. You should see (of course) 8 being printed.

print(3+5)
print(4)

8
4

You can of course have several Python instructions in one cell, one per line:

print(4+6)
10+5
65*8-30
100*2

10

200

Output of a cell¶

Note that, if you do not use the print command, the result of an operation is not displayed after execution (like 10+5 or 65*8-30 above).

The exception is the last command of a cell (here: 100*2), whose result gets displayed automatically after an Out[] prefix.

By the way, note the number inside brackets after Out and In at the left of each cell. It indicates the order in which the cell was executed. When you execute the first cell, it is labeled with In [1]. But if you execute it again, it will get a higher number: In [2], In [3], ...

Alternative ways to execute a cell¶

Instead of Ctrl+Enter, you can also execute a cell by pressing Shift+Enter or Alt+Enter. The difference is as follows:

Ctrl+Enter will execute the cell and keep the current cell active (i.e. if you press Ctrl+Enter again, the same cell will be executed a second time).
Alt+Enter will execute the current cell, then insert a new cell after the current one and make it the active one.
Shift+Enter will execute the current cell, then move to the next cell (and create a new cell if the current cell is the last cell).

print("Hello")

print("Good Bye")

# TRY TO PRESS ALT+ENTER HERE

Hello
Good Bye

print ("Where am I?")

# TRY TO PRESS SHIFT+ENTER HERE

Where am I?

Creating new cells¶

There are several ways to create a new cell. Here are a few useful ones:

Execute a cell with Alt+Enter to insert a new cell after the current one
Execute the last cell of the notebook (with Shift+Enter): this automatically adds a new cell at the end of the notebook
Use Shift+Ctrl+- to split the cell where the cursor is

There are other ways (such as the Insert sub-menu in the top menu, or pressing Alt+Ctrl, ...)

#
#
#Try to split this cell here:
#

#
#
#

Edit and Command mode¶

Normally, you should see that the thin rectangle that surrounds the active cell is green. It indicates that you are in Edit Mode. This is the normal mode in which you should be most of the time. Pressing the Esc key will turn the thin rectangle blue, indicating that you are now in Command Mode. This mode allows you to perform a few manipulations on cells (such as deleting them). It is needed less often, so you should stay in Edit Mode most of the time. To go back from Command Mode to Edit Mode, simply press Enter.

Different types of cell¶

Jupyter actually has 2 types of cells: Code cells and Markdown cells.

The cells in which you executed Python code are Code cells. The text that you see in between the cells containing Python code is actually inside Markdown Cells. You can change the type of the current cell by using the dropdown menu at the top of the notebook. In practice you should not have to do it.

Deleting a cell¶

Deleting a cell can be done by pressing Esc, then D twice, or by pressing Esc, then X.

Esc actually puts you into Command Mode. Then pressing D twice will delete the current cell. Pressing X, on the other hand, will cut the cell. You might have to press Enter afterwards to go back to Edit Mode.

# DELETE ME TOO!

An introduction to Python¶

This is an AI course, not a Python course, nor a general course on programming. Therefore, we will only discuss the aspects of Python that will be useful to us for now. As the course continues, we will introduce a few additional aspects of Python when we need them. For now, we just need to learn the following:

Numbers, strings and variables
Defining a function
Using the Numpy library for manipulating arrays of numbers
Using the Matplotlib library for visualisation

Basic objects: Numbers and strings¶

There are 2 main elementary types of objects in Python: numbers and strings. (In a moment, we will also use arrays.)

Numbers are just that: numbers, that you can add, multiply, etc. Strings are used to represent text. The text has to be put inside double or single quotes: "like this" or 'like this'

4.567
345000
234.56
"Hello, how are you?"
"何時ですか"

'何時ですか'

Strings and numbers can be printed in combination with the print command:

print("I would like to buy", 5+3, "apples and", 5/2, "oranges")

I would like to buy 8 apples and 2.5 oranges

Variables in Python¶

A variable is a name to which we assign a value. For example, if we type my_age = 29 in a cell, and execute the cell, then using the name my_age is equivalent to using 29

my_age = 29
print("my age is", my_age)

my age is 29

Note that the name of a variable has to follow some rules:

it must start with a letter or an underscore (hello123 is a good name, but not 123hello)
it should only contain letters, numbers or the underscore sign (_)
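As a quick way to check these rules, Python strings have an isidentifier() method that reports whether a piece of text follows the naming rules. The sketch below is only an illustration; the candidate names are made up for this example.

```python
# Candidate names chosen only for illustration
starts_with_letter = "hello123".isidentifier()
starts_with_digit = "123hello".isidentifier()
contains_hyphen = "my-age".isidentifier()

print("hello123 is a valid name:", starts_with_letter)
print("123hello is a valid name:", starts_with_digit)
print("my-age is a valid name:", contains_hyphen)
```

Only the first candidate follows the rules: the second starts with a digit, and the third contains a character (-) that is neither a letter, a number nor an underscore.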

Note that the letters can be from any alphabet, including Japanese scripts. But I recommend that you only use Latin alphabet letters (i.e. "romaji").

私の年齢 = 29 #Just to show you it is possible, but please try to only use half-width romaji letters

print("my age is", 私の年齢)

my age is 29

On the other hand, it is perfectly fine to use Japanese characters inside a string:

my_name = "明子"
print("私は", my_name, "です")

私は 明子 です

But except inside a string, please write everything in half-width romaji, or you will run into issues...

Variables are shared between all cells. So if you have assigned a variable, in one cell (and executed that cell), you can still use it in another one:

print(my_name)

明子

On the other hand, if you use a variable name that was never assigned, you will get an error:

print(name_not_defined)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-b3dd37838c1d> in <module>()
----> 1 print(name_not_defined)

NameError: name 'name_not_defined' is not defined

A variable can of course be used to assign to another variable:

my_name_again = my_name
print(my_name_again)

明子

Caveat: Cells execution order¶

Something that can be confusing with Jupyter: what matters is the order in which you have executed the cells, not the order of the cells in the notebook. This means that, for example, if we change the value of my_name in the next cell, execute it, and then go back and execute the cell above, the result will change. This might not be totally intuitive, as you might expect cells to only influence each other from top to bottom:

my_name = "Patrick"
# Execute this cell, then execute the cell above

Self-referencing assignment¶

If you have already assigned a variable, you can use the value of this variable to assign a new value to the same variable. For example, to add one year to my_age, you can write my_age = my_age + 1. Doing this is especially useful in loops.

my_age = 5

print("I am", my_age, "years old")
my_age = my_age + 1

# Try to press Ctrl+Enter several times in this cell

I am 5 years old
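To give an idea of why this is useful in loops, here is a minimal sketch. The for loop syntax is not covered in this notebook, so treat it as a preview: the indented line is simply repeated three times.

```python
my_age = 5

# Repeat the self-referencing assignment three times
for _ in range(3):
    my_age = my_age + 1

print("After three years, I am", my_age, "years old")
```

Each repetition reads the current value of my_age, adds one, and stores the result back under the same name, so the value ends up at 8.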

Defining a function¶

Another important aspect of Python we need to know about is how to define functions. The syntax for that is:

    def name_of_function(argument_1, argument_2):
        action1
        action2
        return result

A function can have any number of arguments. It can also have no arguments at all. action1, action2, ... are the Python commands that are going to be executed every time we run the function.

Note the indentation of action1, action2, return result: this indentation is mandatory in Python. (It is optional in many other programming languages.) You are free to indent with spaces (i.e. pressing the spacebar two or four times) or by pressing Tab. If you use the spacebar, each indented line should have exactly the same number of spaces at the beginning. This is not valid:

    def name_of_function(argument_1, argument_2):
      action1
           action2
        return result

If your indentation is not correct, you will have an error message:

def function_with_bad_indentation():
     weather = "sunny"
    print("The weather is", weather)

  File "<ipython-input-19-362d3b118a24>", line 3
    print("The weather is", weather)
                                       ^
IndentationError: unindent does not match any outer indentation level

The return part is optional. When the function is executed (we also say the function is called), the actions are executed in sequence from top to bottom. When we encounter return something, the value corresponding to something is returned.

def compute_square(x):
    square = x*x
    return square

print(compute_square(3))
print(compute_square(2))
compute_square(4)  # The result of this line will not be displayed because it is not in a print function,
                    # and it is not the last statement of the cell
compute_square(5)

9
4

25

def print_1_two_and_san():
    print(1)
    print("two")
    print("三")
    
print_1_two_and_san() # Note that, since this function has no `return`, there is no `Out []` field below
print_1_two_and_san()
print_1_two_and_san()
print_1_two_and_san()

1
two
三
1
two
三
1
two
三
1
two
三
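To make the role of return more concrete, the following small sketch (with a made-up function name, say_hello) stores the result of a function that has no return statement. In that case Python hands back the special value None, which Jupyter does not display as an Out[] result.

```python
def say_hello():
    print("Hello")  # This prints text, but nothing is returned

result = say_hello()
print("Value returned by say_hello:", result)
```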

Once you have executed a cell containing a definition, this definition is also available for all other cells (like variables)

print_1_two_and_san()
print(compute_square(5))

1
two
三
25

Variable scope¶

Something that might sometimes be confusing in Python is variable scope. We have seen that we could give a value to a variable in a cell, and this value was available in other cells as well. When we assign a value to a variable like that, outside of any function, the variable is a global variable.

If we do such an assignment inside a function, the variable will be a local variable.

A global variable and a local variable can have the same name, but Python will treat them as different variables.

The variables used as arguments of the function in the function definition are also automatically considered local variables.

Moreover, in a function call, if a variable name is used, Python will first look for the value of the corresponding local variable. If no such local variable exists, it will look for the value of the global variable with the same name.

Example: Consider the following two function definitions:

x = 4           # Global variable
divisor = 5     # Global variable

def compute_half(x):
    divisor = 2     # Local Variable
    return x/divisor

print(compute_half(7))
print(divisor)

3.5
5

x = 4          # Global variable
divisor = 5    # Global variable

def compute_half(x):
    return x/divisor

print(compute_half(7))

1.4
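One consequence worth illustrating: when a function relies on a global variable, the value is looked up each time the function is called, not when the function is defined. The sketch below uses a made-up function name (divide_by_global) to show this.

```python
divisor = 5    # Global variable

def divide_by_global(x):
    # No local variable named divisor exists here,
    # so Python uses the global one at the time of the call
    return x / divisor

first_result = divide_by_global(10)

divisor = 2    # We change the global variable
second_result = divide_by_global(10)

print(first_result)
print(second_result)
```

The same call gives 2.0 the first time and 5.0 the second time, because the global divisor changed in between. This is closely related to the caveat about cell execution order.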


Numpy arrays¶

We will often need to manipulate arrays of numbers. What is an array? It is a collection of numbers organized in a line or a table. It corresponds to the mathematical concepts of vectors, matrices and tensors. (You might never have heard about tensors; do not worry, this is normal.)

Python does not natively have a very efficient way of handling arrays. It only has lists, which are often not efficient enough.

Therefore we will use a library. In programming languages, libraries are sets of external functionalities that can be added to the basic programming language. For arrays, we will use the Numpy library.

Python lists¶

First let us see the native way of handling lists of numbers in Python. We use square brackets ([ and ]) to mark the beginning and the end of a list, and separate its entries by commas.

list_of_odd_numbers = [1, 3, 5, 7, 9, 11, 13]
print(list_of_odd_numbers)

[1, 3, 5, 7, 9, 11, 13]

You can access the elements of a list by adding a number in brackets. For example, my_list[0] represents the first element of my_list. my_list[1] represents the second element. my_list[6] represents the seventh element. Note that since we count from zero, my_list[n] is actually the (n+1)-th element of the list.

print("The fourth odd number is ", list_of_odd_numbers[3])

The fourth odd number is  7

A list can contain other lists as elements.

list_of_list_of_even_numbers_and_odd_numbers = [[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]]
print("list of even numbers:", list_of_list_of_even_numbers_and_odd_numbers[0])
print("list of odd numbers:", list_of_list_of_even_numbers_and_odd_numbers[1])
print("The fourth odd number is ", list_of_list_of_even_numbers_and_odd_numbers[1][3])

list of even numbers: [0, 2, 4, 6, 8, 10]
list of odd numbers: [1, 3, 5, 7, 9, 11]
The fourth odd number is  7

If you use an index equal to or larger than the list size, you will get an error.

print("The twentieth odd number is ", list_of_odd_numbers[20])

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-29-d31c9939a34c> in <module>()
----> 1 print("The twentieth odd number is ", list_of_odd_numbers[20])

IndexError: list index out of range

You can use the function len() to know the size of the list:

print("This list only contains the", len(list_of_odd_numbers), "first odd numbers")

This list only contains the 7 first odd numbers
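Because we count from zero, the last valid index of a list is its length minus one. A minimal sketch combining len() and indexing (the variable names last_index and last_odd_number are only for illustration):

```python
list_of_odd_numbers = [1, 3, 5, 7, 9, 11, 13]

# The list has 7 elements, so valid indices go from 0 to 6
last_index = len(list_of_odd_numbers) - 1
last_odd_number = list_of_odd_numbers[last_index]

print("The last index is", last_index)
print("The last odd number in the list is", last_odd_number)
```

Using len(list_of_odd_numbers) itself as an index would raise the IndexError shown earlier.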

Importing a library¶

As we said, we are going to use the Numpy library to manipulate arrays.

Before using a library, we need to import it. The syntax will be: import library_name or import library_name as alias_name

With numpy, it is customary to import it with the alias name np. What this means is that later, we will be able to refer to this library by simply writing np.

import numpy as np

Creating an array¶

To use the functionalities of a library, we have to add its name followed by a dot before the name of the function we want to use. The first function we will look at is np.array, which we can use to create Numpy arrays from Python lists:

my_first_array = np.array([2, 4, 6, 7, 0.5])

print(my_first_array)

[2.  4.  6.  7.  0.5]
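You may wonder why the integers are displayed as 2. and 4. in this output. All the values of a Numpy array share a single numeric type, so a single decimal value such as 0.5 causes every entry to be stored as a decimal number. The sketch below checks this with the dtype attribute; the two array names are made up for this example.

```python
import numpy as np

array_of_integers = np.array([2, 4, 6, 7])
array_with_one_decimal = np.array([2, 4, 6, 7, 0.5])

# dtype tells us which numeric type is used to store the values
print(array_of_integers.dtype)       # an integer type (exact name depends on the platform)
print(array_with_one_decimal.dtype)  # a floating-point type
```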

The values in an array are accessed just like for a list:

print(my_first_array[1])

4.0

Furthermore, we can apply arithmetic operations to arrays:

double_of_the_array = 2* my_first_array
print(double_of_the_array)

array_minus_1 = my_first_array - 1
print(array_minus_1)

[ 4.  8. 12. 14.  1.]
[ 1.   3.   5.   6.  -0.5]
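This is one place where arrays and Python lists behave very differently. As a small comparison (with made-up variable names), multiplying a list by 2 repeats the list, while multiplying an array by 2 doubles each value:

```python
import numpy as np

my_list = [1, 2, 3]
my_array = np.array(my_list)

doubled_list = 2 * my_list     # The list is repeated twice
doubled_array = 2 * my_array   # Each value is multiplied by 2

print(doubled_list)
print(doubled_array)
```

The first print shows [1, 2, 3, 1, 2, 3], the second shows [2 4 6].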

Arrays of the same size can be added (or multiplied, or subtracted, etc.):

my_second_array = np.array([1, -1, 2, -2, 3])

print(my_first_array + my_second_array)

print(my_second_array * my_first_array)

[3.  3.  8.  5.  3.5]
[  2.   -4.   12.  -14.    1.5]

If the arrays are not the same size and you try to add them, you will get an error:

my_third_smaller_array = np.array([1, -1, 2])
print(my_first_array + my_third_smaller_array)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-22b8463de8c7> in <module>()
      1 my_third_smaller_array = np.array([1, -1, 2])
----> 2 print(my_first_array + my_third_smaller_array)

ValueError: operands could not be broadcast together with shapes (5,) (3,)

We can know the size of a one-dimensional array by using the len function like with Python lists:

print("this array has a size of", len(my_first_array))
print("this other array has a size of", len(my_third_smaller_array))

this array has a size of 5
this other array has a size of 3

With arrays, it is also possible to know the size by using the shape attribute. In Python, an attribute is some additional information contained by an object that you obtain by adding a dot (.) followed by the name of the attribute to the variable containing the object: object_variable.attribute_name.

The value returned by the shape attribute is a bit different than the one returned by the len function:

print("len() of the array:", len(my_first_array))
print("shape of the array:", my_first_array.shape)

len() of the array: 5
shape of the array: (5,)

We see that the shape attribute returns (5,), which represents a sequence of only one element (5). This sequence appears as (5,) instead of [5] (as a Python list would) because it is actually a tuple and not a list. In practice, we can use a tuple like a list in this case.

print("len() of the array:", len(my_first_array))
print("shape of the array:", my_first_array.shape)
print("length extracted from the shape of the array:", my_first_array.shape[0])

len() of the array: 5
shape of the array: (5,)
length extracted from the shape of the array: 5

The reason the shape attribute returns a sequence of one number instead of simply returning the number is that it is also useful for multi-dimensional arrays.

Multidimensional arrays¶

We can also make 2 dimensional arrays like this:

my_2D_array = np.array([[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]])

print(my_2D_array)

[[ 0  2  4  6  8 10]
 [ 1  3  5  7  9 11]]

The values in an array can be accessed by putting a sequence of indices in brackets: my_array[2,3]

print("Second line of the 2D array", my_2D_array[1])
print("Value in the third column of second line:", my_2D_array[1, 2])

Second line of the 2D array [ 1  3  5  7  9 11]
Value in the third column of second line: 5

As we mentioned, the shape of a multi-dimensional array will contain more than one number (one size per dimension):

print("The shape is:", my_2D_array.shape)

The shape is: (2, 6)
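As a short sketch of how these pieces fit together for a 2D array: the first entry of the shape is the number of rows, the second is the number of columns, len() only reports the first dimension, and the size attribute gives the total number of values. The names number_of_rows and number_of_columns are only for illustration.

```python
import numpy as np

my_2D_array = np.array([[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]])

number_of_rows = my_2D_array.shape[0]
number_of_columns = my_2D_array.shape[1]

print("rows:", number_of_rows)
print("columns:", number_of_columns)
print("len():", len(my_2D_array))               # Same as the number of rows
print("total number of values:", my_2D_array.size)
```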

Numpy universal functions¶

Numpy also defines many mathematical functions that can be applied either to numbers or to arrays of numbers. Some of the most useful ones are np.sin, np.cos, np.log, np.exp:

print("sine of a number:")
print(np.sin(2))

print("sine of an array:")
print(np.sin(my_2D_array))

sine of a number:
0.9092974268256817
sine of an array:
[[ 0.          0.90929743 -0.7568025  -0.2794155   0.98935825 -0.54402111]
 [ 0.84147098  0.14112001 -0.95892427  0.6569866   0.41211849 -0.99999021]]
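The other functions work the same way. As an illustrative sketch (the values are arbitrary), np.exp and np.log are applied element by element, and since the logarithm undoes the exponential we get back the original values, up to tiny rounding errors:

```python
import numpy as np

values = np.array([0.5, 1.0, 2.0])

exponentials = np.exp(values)     # e raised to each value
recovered = np.log(exponentials)  # Natural logarithm of each result

print(exponentials)
print(recovered)

# np.allclose compares two arrays while tolerating tiny rounding errors
print(np.allclose(recovered, values))
```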

Numpy functions for generating arrays¶

We saw that we could create numpy arrays from lists with the array function. This is not the only way. Numpy provides many other functions for generating arrays. Here are a few useful ones.

Arrays of zeros / Array of ones:¶

The functions np.zeros and np.ones can be used to create arrays filled with zeros or ones. You have to give them the shape as argument:

print("Array of zeros of shape 3x4:")
print(np.zeros((3,4)))

print("Arrays of ones of shape 2x5:")
print(np.ones((2, 5)))

Array of zeros of shape 3x4:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
Arrays of ones of shape 2x5:
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]

Range of values¶

np.arange will generate a one-dimensional array of increasing numbers:

print(np.arange(15))

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]
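np.arange also accepts a start value and a step, in the form np.arange(start, stop, step). Note that the stop value itself is never included. A minimal sketch with arbitrary numbers:

```python
import numpy as np

from_5_to_9 = np.arange(5, 10)       # Starts at 5, stops before 10
every_third = np.arange(2, 11, 3)    # Starts at 2, steps of 3, stops before 11

print(from_5_to_9)
print(every_third)
```

The first array contains 5, 6, 7, 8, 9 and the second contains 2, 5, 8.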

Randomly generated arrays¶

We can also ask numpy to generate an array with random values.

Two useful functions for this are np.random.randn and np.random.uniform. (They begin with np.random because they are located in the random subpackage of numpy.)

np.random.uniform(a, b, size=shape) will fill an array of the given shape with values between a and b

print("Array of shape (2,) filled with random numbers between -10 and 5:")
print(np.random.uniform(-10, 5, size=(2,)))

# Re-execute this cell several times to see that the values generated change (this shows they are random)

Array of shape (2,) filled with random numbers between -10 and 5:
[-8.78586657  3.95389508]

np.random.randn(dimension1, dimension2, ...) generates an array of shape dimension1 x dimension2 x ... with values taken from the standard normal distribution (a.k.a. Gaussian distribution, a.k.a. bell curve distribution). Note that, unfortunately, the calling conventions are not very consistent between these functions.

print("Array of shape 3x4 filled with values from a Normal distribution")
np.random.randn(3,4)

Array of shape 3x4 filled with values from a Normal distribution

array([[ 0.44971859, -0.47976418,  0.96843294, -1.14501106],
       [-0.47502743, -0.81492178,  0.26485823, -0.9034812 ],
       [ 0.3682693 , -0.14166568, -1.91812275,  2.61126857]])
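Random values change at every execution, which can make results hard to reproduce. One common option, shown here only as a sketch, is to fix the starting point of the random generator with np.random.seed; the seed value 0 is an arbitrary choice. The example also shows the different calling conventions of the two functions.

```python
import numpy as np

np.random.seed(0)   # Fix the starting point of the random generator
first_draw = np.random.uniform(-10, 5, size=(2,))

np.random.seed(0)   # Same seed again: the same values will be generated
second_draw = np.random.uniform(-10, 5, size=(2,))

print(first_draw)
print(second_draw)
print("Same values:", np.array_equal(first_draw, second_draw))

# randn takes the dimensions directly, not a size=(...) argument
normal_values = np.random.randn(3, 4)
print("Shape of the normal array:", normal_values.shape)
```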

Linspace¶

The function np.linspace(a, b) will generate uniformly spaced numbers between a and b (both included). It will be especially useful for us to plot functions with Matplotlib. By default, it generates 50 values, but this can be changed with the optional num argument.

print("50 numbers uniformly spaced between 0 and 10:")
print(np.linspace(0, 10))

print("5 numbers uniformly spaced between 0 and 10:")
print(np.linspace(0, 10, num=5))

50 numbers uniformly spaced between 0 and 10:
[ 0.          0.20408163  0.40816327  0.6122449   0.81632653  1.02040816
  1.2244898   1.42857143  1.63265306  1.83673469  2.04081633  2.24489796
  2.44897959  2.65306122  2.85714286  3.06122449  3.26530612  3.46938776
  3.67346939  3.87755102  4.08163265  4.28571429  4.48979592  4.69387755
  4.89795918  5.10204082  5.30612245  5.51020408  5.71428571  5.91836735
  6.12244898  6.32653061  6.53061224  6.73469388  6.93877551  7.14285714
  7.34693878  7.55102041  7.75510204  7.95918367  8.16326531  8.36734694
  8.57142857  8.7755102   8.97959184  9.18367347  9.3877551   9.59183673
  9.79591837 10.        ]
5 numbers uniformly spaced between 0 and 10:
[ 0.   2.5  5.   7.5 10. ]
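Since both endpoints are included, num points leave num - 1 gaps between them, so the spacing is (b - a) / (num - 1). The following sketch checks this for the second example above; np.diff computes the difference between consecutive values, and the variable names are only for illustration.

```python
import numpy as np

a, b, num = 0, 10, 5
points = np.linspace(a, b, num=num)

expected_step = (b - a) / (num - 1)   # 10 / 4 = 2.5
gaps = np.diff(points)                # Differences between consecutive points

print("expected step:", expected_step)
print("gaps between points:", gaps)
```

The same reasoning explains the spacing of about 0.204 in the default case: 10 / 49.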

Visualisation with matplotlib¶

To understand intuitively what we are doing with functions, it will be useful to have some visualisation tools. Mostly, we will use the matplotlib library, which is one of the most commonly used libraries for plots and charts in Python.

Importing matplotlib¶

Like we imported numpy before, we need to import matplotlib here with import matplotlib. We will also explicitly import a subpackage of matplotlib that we will use often: import matplotlib.pyplot as plt.

In addition, we will use a command specific to Jupyter: %matplotlib inline. It allows Jupyter to display the plots below each cell.

The plt.rcParams['figure.figsize'] = [15, 6] statement is optional. It serves to adjust the size of the plots displayed in Jupyter.

%matplotlib inline

import matplotlib
import matplotlib.pyplot as plt

#plt.rcParams['figure.figsize'] = [15, 6]

Plotting¶

To plot a function, we will mostly use the function plt.plot. What plt.plot does is to take a list of x-values and a list of y-values, and then draw lines corresponding to these x and y values.

print("plot of a line going from (x=1, y=-10) to (x=2, y=0) to (x=3, y=1)")
plt.plot([1,2,3], [-10, 0, 1])

plot of a line going from (x=1, y=-10) to (x=2, y=0) to (x=3, y=1)

[<matplotlib.lines.Line2D at 0x11345b4a8>]

Instead of drawing the line, we can also just display the points:

plt.plot([1,2,3], [-10, 0, 1], "o")

[<matplotlib.lines.Line2D at 0x113550b00>]

We can also change the color:

plt.plot([1,2,3], [-10, 0, 1], "r")

[<matplotlib.lines.Line2D at 0x1135bf320>]

We can do both at the same time:

plt.plot([1,2,3], [-10, 0, 1], "ro")

[<matplotlib.lines.Line2D at 0x1136dd438>]
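The format string can also include a line style, so that points and lines are drawn together. As a sketch, "ro-" asks for red color, circle markers and a solid line. The example also stores what plt.plot returns: a list of line objects, which is what Jupyter displays as [<matplotlib.lines.Line2D at ...>] under the cells above.

```python
import matplotlib.pyplot as plt

x_points = [1, 2, 3]
y_points = [-10, 0, 1]

# "r" = red, "o" = circle markers, "-" = solid line joining the points
lines = plt.plot(x_points, y_points, "ro-")

# plt.plot returns a list containing one line object per curve drawn
first_line = lines[0]
print(first_line.get_color(), first_line.get_marker(), first_line.get_linestyle())
```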

Plotting a function¶

To plot a function, we will use the np.linspace function to generate the x-values. Then we can apply a function to the x-values to obtain the y-values, which gives us the points of our graph.

x_values = np.linspace(-2,2) # We compute x values between -2 and 2

def compute_square(x):
    return x*x

y_values = compute_square(x_values)
print("x values:")
print(x_values)
print("y values:")
print(y_values)

x values:
[-2.         -1.91836735 -1.83673469 -1.75510204 -1.67346939 -1.59183673
 -1.51020408 -1.42857143 -1.34693878 -1.26530612 -1.18367347 -1.10204082
 -1.02040816 -0.93877551 -0.85714286 -0.7755102  -0.69387755 -0.6122449
 -0.53061224 -0.44897959 -0.36734694 -0.28571429 -0.20408163 -0.12244898
 -0.04081633  0.04081633  0.12244898  0.20408163  0.28571429  0.36734694
  0.44897959  0.53061224  0.6122449   0.69387755  0.7755102   0.85714286
  0.93877551  1.02040816  1.10204082  1.18367347  1.26530612  1.34693878
  1.42857143  1.51020408  1.59183673  1.67346939  1.75510204  1.83673469
  1.91836735  2.        ]
y values:
[4.00000000e+00 3.68013328e+00 3.37359434e+00 3.08038317e+00
 2.80049979e+00 2.53394419e+00 2.28071637e+00 2.04081633e+00
 1.81424406e+00 1.60099958e+00 1.40108288e+00 1.21449396e+00
 1.04123282e+00 8.81299459e-01 7.34693878e-01 6.01416077e-01
 4.81466056e-01 3.74843815e-01 2.81549354e-01 2.01582674e-01
 1.34943773e-01 8.16326531e-02 4.16493128e-02 1.49937526e-02
 1.66597251e-03 1.66597251e-03 1.49937526e-02 4.16493128e-02
 8.16326531e-02 1.34943773e-01 2.01582674e-01 2.81549354e-01
 3.74843815e-01 4.81466056e-01 6.01416077e-01 7.34693878e-01
 8.81299459e-01 1.04123282e+00 1.21449396e+00 1.40108288e+00
 1.60099958e+00 1.81424406e+00 2.04081633e+00 2.28071637e+00
 2.53394419e+00 2.80049979e+00 3.08038317e+00 3.37359434e+00
 3.68013328e+00 4.00000000e+00]

We can then plot the function with plt.plot

plt.plot(x_values, y_values)

[<matplotlib.lines.Line2D at 0x113740390>]

plt.plot(x_values, y_values, "o")

[<matplotlib.lines.Line2D at 0x1138045c0>]

What do we do if we want to plot the function on the interval [-3, 5]?

Now, please try to plot a function of your choice (e.g. $$f(x)=\sin(x)\cos(x)$$ or $$f(x)=\sin(x^2) \cdot x$$ ...)