In this course, we will use the programming language Python (https://www.python.org/) to put into practice the AI and Machine Learning algorithms we are going to discuss. To make Python easier to use, we will also use Jupyter (http://jupyter.org/) notebooks. Therefore, we will first spend a bit of time familiarizing ourselves with these tools.
In a Jupyter Notebook, you have cells that you can fill with Python code. As we do not assume that you know Python yet, let us just mention a few simple types of valid Python statements:
- Numbers and arithmetic expressions: 54, 34.223+45, 12.2*3-1
- The print() command can be used to explicitly ask for a number to be printed: print(45)
- A line starting with # is a comment: it is not executed and usually contains an explanation of the code
To execute the content of a cell, press Ctrl+Enter (simply pressing Enter will just go to the next line inside the cell).
For example, enter a simple Python line in the next cell. Like:
print(3+5)
Then press Ctrl+Enter. You should see (of course) 8 being printed.
print(3+5)
print(4)
You can of course have several Python instructions in one cell, one per line:
print(4+6)
10+5
65*8-30
100*2
Note that, if you do not use the print command, the result of an operation is not printed after execution (like 10+5 or 65*8-30 above).
The exception is the last command of a cell (here: 100*2), whose result gets printed automatically after an Out[] prefix.
By the way, note the number inside the brackets after Out and In at the left of each cell. It indicates the order in which the cell was executed. When you execute the first cell, it is labeled with In [1]. But if you execute it again, it will get a higher number: In [2], In [3], ...
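Instead of Ctrl+Enter, you can also execute a cell by pressing Shift+Enter or Alt+Enter. The difference is as follows:
- Ctrl+Enter executes the cell and stays on it.
- Shift+Enter executes the cell and moves to the next cell.
- Alt+Enter executes the cell and inserts a new empty cell below it.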
print("Hello")
print("Good Bye")
# TRY TO PRESS ALT+ENTER HERE
print ("Where am I?")
# TRY TO PRESS SHIFT+ENTER HERE
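There are several ways to create a new cell. Here are a few useful ones:
- Press Alt+Enter to execute the current cell and insert a new cell below it.
- Press Esc, then A or B, to insert a new cell above or below the current one.
- Press Ctrl+Shift+- (minus) to split the current cell in two at the cursor position.
There are other ways (such as the Insert sub-menu in the top menu, or pressing Alt+Ctrl, ...).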
#
#
#Try to split this cell here:
#
#
#
#
In theory, you should see that the thin rectangle that surrounds the active cell is green. It indicates that you are in Edit Mode. This is the normal mode in which you should be most of the time. Pressing the Esc key will turn the thin rectangle blue, indicating that you are now in Command Mode. This mode allows you to do a few manipulations on cells (such as deleting them). It is less useful, so you should stay in Edit Mode most of the time. To go back from Command Mode to Edit Mode, simply press Enter.
Jupyter has actually 2 types of cells: Code cells and Markdown Cells
The cells in which you executed Python code are Code cells. The text that you see in between the cells containing Python code is actually inside Markdown Cells. You can change the type of the current cell by using the dropdown menu at the top of the notebook. In practice you should not have to do it.
Deleting a cell can be done by pressing Esc, then D twice, or by pressing Esc, then X.
Esc actually puts you into Command Mode. Then pressing D twice will delete the current cell. Pressing X, on the other hand, will cut the cell. You might have to press Enter afterwards to go back to Edit Mode.
# DELETE ME TOO!
This is an AI course, not a Python course, nor a general course on programming. Therefore, we will only discuss the Python aspects that will be useful to us for now. As the course continues, we will introduce a few additional Python aspects when we need them. For now, we just need to learn the following:
There are 2 main elementary types of objects in Python: Numbers and Strings (in a moment, we will also use arrays).
Numbers are just that: numbers, that you can add, multiply, etc.
Strings are used to represent text. The text has to be put inside double or single quotes: "like this" or 'like this'
4.567
345000
234.56
"Hello, how are you?"
"何時ですか"
Strings and numbers can be printed in combination with the print command:
print("I would like to buy", 5+3, "apples and", 5/2, "oranges")
A variable is a name to which we assign a value. For example, if we type my_age = 29 in a cell, and execute the cell, then using the name my_age is equivalent to using 29
my_age = 29
print("my age is", my_age)
Note that the name of a variable has to follow some rules:
- It can contain letters, digits and underscores (_).
- It cannot start with a digit (hello123 is a good name, but not 123hello).
Note that the letters can be from any alphabet, including the Japanese ones. But I recommend that you only use Latin alphabet letters (i.e. "romaji").
私の年齢 = 29 #Just to show you it is possible, but please try to only use half-width romaji letters
print("my age is", 私の年齢)
On the other hand, it is perfectly fine to use Japanese characters inside a string:
my_name = "明子"
print("私は", my_name, "です")
But except inside a string, please write everything in half-width romaji, or you will run into issues...
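As a quick way to check the naming rules above, here is a minimal sketch using the built-in string method isidentifier(), which tells whether a piece of text has the form of a valid Python name. The candidate names in the list are only examples chosen for illustration.

```python
# Candidate variable names (illustrative examples only)
candidate_names = ["hello123", "123hello", "my_age", "my-age", "私の年齢"]

# str.isidentifier() returns True when the text has the form of a valid name
validity = {}
for name in candidate_names:
    validity[name] = name.isidentifier()
    print(name, "->", validity[name])
```

Names starting with a digit or containing a hyphen are rejected, while Japanese characters are accepted (even though we recommend avoiding them in names).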
Variables are shared between all cells. So if you have assigned a variable in one cell (and executed that cell), you can still use it in another one:
print(my_name)
On the other hand, if you use a variable name that was never assigned, you will get an error:
print(name_not_defined)
The value of a variable can of course be assigned to another variable:
my_name_again = my_name
print(my_name_again)
Something that can be confusing with Jupyter: what matters is the order in which you have executed the cells, not the order of the cells in the Notebook. This means that, for example, if we change the value of my_name in the next cell, execute it, and then go back to execute the cell above, the result will change. This might not be totally intuitive, as you might expect that cells only influence each other from top to bottom:
my_name = "Patrick"
# Execute this cell, then execute the cell above
If you have already assigned a variable, you can use the value of this variable to assign a new value to the same variable. For example, to add one year to my_age, you can write my_age = my_age + 1. Doing this is especially useful in loops.
my_age = 5
print("I am", my_age, "years old")
my_age = my_age + 1
# Try to press Ctrl+Enter several times in this cell
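To hint at why this kind of self-assignment is useful in loops, here is a minimal sketch. The for loop and the range function have not been introduced yet in this notebook; they are used here only as an assumption to repeat the same lines three times.

```python
my_age = 5

# Repeat the indented lines 3 times: each pass adds one year to my_age
for _ in range(3):
    my_age = my_age + 1
    print("I am now", my_age, "years old")
```

After the loop, my_age has been increased three times, so it holds 8.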
Another important aspect of Python we need to know about is how to define functions. The syntax for that is:
def name_of_function(argument_1, argument_2):
    action1
    action2
    return result
A function can have any number of arguments. It can also have 0 arguments.
action1, action2, ... are the Python commands that are going to be executed every time we run the function.
Note the indentation of action1, action2 and return result: this indentation is mandatory in Python (it is optional in many other programming languages). You are free to indent with spaces (e.g. pressing the spacebar two or four times) or by pressing Tab. If you use the spacebar, each indented line should have exactly the same number of spaces at the beginning. This is not valid:
def name_of_function(argument_1, argument_2):
    action1
      action2
   return result
If your indentation is not correct, you will have an error message:
def function_with_bad_indentation():
    weather = "sunny"
      print("The weather is", weather)
The return part is optional. When the function is executed (we also say the function is called), the actions are executed in sequence from top to bottom. When we encounter return something, the value corresponding to something is returned.
def compute_square(x):
    square = x*x
    return square
print(compute_square(3))
print(compute_square(2))
compute_square(4) # The result of this line will not be output because it is not in a print function,
                  # and it is not the last statement of the cell
compute_square(5)
def print_1_two_and_san():
    print(1)
    print("two")
    print("三")
print_1_two_and_san() # Note that, since this function has no `return`, there is no `Out []` field below
print_1_two_and_san()
print_1_two_and_san()
print_1_two_and_san()
Once you have executed a cell containing a definition, this definition is also available for all other cells (like variables)
print_1_two_and_san()
print(compute_square(5))
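To illustrate the remarks above about the number of arguments and the optional return, here is a minimal sketch. The function names say_hello and add_three are hypothetical and chosen only for this example.

```python
def say_hello():
    # No argument and no return: the function only prints something
    print("Hello")

def add_three(a, b, c):
    # Three arguments and one returned value
    return a + b + c

result_of_say_hello = say_hello()   # prints Hello
print(result_of_say_hello)          # a function without return gives back None
total = add_three(1, 2, 3)
print(total)
```

A function without a return statement still gives back a value when called: the special value None, which is why nothing useful appears in an Out[] field.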
Something that might sometimes be confusing in Python is variable scope. We have seen that we can give a value to a variable in a cell, and that this value is then available in other cells as well. When we assign a value to a variable like that, outside of any function, the variable is a global variable.
If we do such an assignment inside a function, the variable will be a local variable.
A global variable and a local variable can have the same name, but they will be considered different by Python.
The variables used as arguments of the function in the function definition are also automatically considered local variables.
Moreover, when a variable name is used inside a function, Python will first look for the value of the corresponding local variable. If no such local variable exists, it will look for the value of the global variable with the same name.
Example: Consider the two following function definitions:
x = 4 # Global variable
divisor = 5 # Global variable
def compute_half(x):
    divisor = 2 # Local Variable
    return x/divisor
print(compute_half(7))
print(divisor)
x = 4 # Global variable
divisor = 5 # Global variable
def compute_half(x):
    return x/divisor
print(compute_half(7))
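To make the lookup rule more concrete, the following sketch puts the two situations side by side. The function names divide_locally and divide_globally are hypothetical; the point is only to show which value of divisor each function sees.

```python
divisor = 5  # Global variable

def divide_locally(x):
    divisor = 2          # Local variable: hides the global one inside the function
    return x / divisor

def divide_globally(x):
    return x / divisor   # No local divisor, so the global one is used

local_result = divide_locally(10)     # 10 / 2
global_result = divide_globally(10)   # 10 / 5
print(local_result, global_result, divisor)

divisor = 4                           # Changing the global variable...
changed_result = divide_globally(10)  # ...changes the result: 10 / 4
print(changed_result)
```

Note that the assignment inside divide_locally never modifies the global divisor, whereas a function that reads a global variable depends on whatever value it holds at the time of the call.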
jdkfsj5566 = 5
We will often need to manipulate arrays of numbers. What is an array? It is a set of numbers organized in a line or a table. It corresponds to the mathematical concepts of vectors, matrices and tensors. (You might never have heard about tensors; do not worry, that is normal.)
Python does not natively have a very efficient way of handling arrays. It only has lists, which are often not efficient enough.
Therefore we will use a library. In programming languages, libraries are sets of external functionalities that can be added to the basic programming language. For arrays, we will use the Numpy library.
First let us see the native way of handling lists of numbers in Python. We use brackets ([ and ]) to mark the beginning and the end of a list, and separate its entries by commas.
list_of_odd_numbers = [1, 3, 5, 7, 9, 11, 13]
print(list_of_odd_numbers)
You can access the elements of a list by adding a number in brackets. For example, my_list[0] represents the first element of my_list, my_list[1] represents the second element, and my_list[4] represents the fifth element. Note that since we count from zero, my_list[n] is actually the (n+1)-th element of the list.
print("The fourth odd number is ", list_of_odd_numbers[3])
A list can contain other lists as elements.
list_of_list_of_even_numbers_and_odd_numbers = [[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]]
print("list of even numbers:", list_of_list_of_even_numbers_and_odd_numbers[0])
print("list of odd numbers:", list_of_list_of_even_numbers_and_odd_numbers[1])
print("The fourth odd number is ", list_of_list_of_even_numbers_and_odd_numbers[1][3])
If you use an index greater than or equal to the list size, you will get an error.
print("The twentieth odd number is ", list_of_odd_numbers[20])
You can use the function len() to know the size of the list:
print("This list only contains the first", len(list_of_odd_numbers), "odd numbers")
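Since indices start at zero, the largest valid index of a list is its size minus one. The short sketch below combines len() and indexing to show this; the variable names size and last_index are introduced only for this example.

```python
list_of_odd_numbers = [1, 3, 5, 7, 9, 11, 13]

size = len(list_of_odd_numbers)   # number of elements in the list
last_index = size - 1             # indices go from 0 to size - 1
last_odd_number = list_of_odd_numbers[last_index]

print("The list has", size, "elements")
print("The last valid index is", last_index)
print("The last element is", last_odd_number)
```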
As we said, we are going to use the Numpy library to manipulate arrays.
Before using a library, we need to import it. The syntax will be:
import library_name or import library_name as alias_name
With numpy, it is customary to import it with the alias name np. What this means is that later, we will be able to refer to this library by simply writing np.
import numpy as np
To use the functionalities of a library, we have to add its name followed by a dot before the name of the function we want to use. The first function we will look at is np.array, which we can use to create Numpy arrays from Python lists:
my_first_array = np.array([2, 4, 6, 7, 0.5])
print(my_first_array)
The values in an array are accessed just like for a list:
print(my_first_array[1])
Furthermore, we can apply arithmetic operations to arrays:
double_of_the_array = 2* my_first_array
print(double_of_the_array)
array_minus_1 = my_first_array - 1
print(array_minus_1)
Arrays of the same size can be added (or multiplied, or subtracted, etc.) element by element:
my_second_array = np.array([1, -1, 2, -2, 3])
print(my_first_array + my_second_array)
print(my_second_array * my_first_array)
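This element-by-element arithmetic is one of the main differences between Numpy arrays and plain Python lists. As a small sketch (the variable names are chosen for this example only), compare what the same operators do in each case:

```python
import numpy as np

numbers_as_list = [1, 2, 3]
numbers_as_array = np.array(numbers_as_list)

# With a list, * repeats the list and + concatenates two lists
repeated_list = 2 * numbers_as_list
joined_list = numbers_as_list + numbers_as_list

# With an array, * and + act on each element
doubled_array = 2 * numbers_as_array
summed_array = numbers_as_array + numbers_as_array

print(repeated_list, joined_list)
print(doubled_array, summed_array)
```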
If the arrays are not the same size and you try to add them, you will get an error:
my_third_smaller_array = np.array([1, -1, 2])
print(my_first_array + my_third_smaller_array)
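If you want to see the error without stopping the execution of the cell, one option is the try / except construct. It has not been introduced in this notebook, so take the following as a sketch only: Numpy reports the size mismatch with an error of type ValueError, which we catch here.

```python
import numpy as np

array_of_size_5 = np.array([2, 4, 6, 7, 0.5])
array_of_size_3 = np.array([1, -1, 2])

try:
    print(array_of_size_5 + array_of_size_3)
    addition_worked = True
except ValueError as error:
    # Numpy refuses to add arrays whose sizes do not match
    print("Could not add the arrays:", error)
    addition_worked = False
```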
We can know the size of a one-dimensional array by using the len function like with Python lists:
print("this array has a size of", len(my_first_array))
print("this other array has a size of", len(my_third_smaller_array))
With arrays, it is also possible to know the size by using the shape attribute. In Python, an attribute is some additional information contained by an object that you obtain by adding a dot (.) followed by the name of the attribute to the variable containing the object: object_variable.attribute_name.
The value returned by the shape attribute is a bit different than the one returned by the len function:
print("len() of the array:", len(my_first_array))
print("shape of the array:", my_first_array.shape)
We see that the shape attribute returns (5,), which represents a sequence of only one element (5). This sequence appears as (5,) instead of [5] (as a Python list would) because it is actually a tuple and not a list. In practice, we can use a tuple like a list in this case.
print("len() of the array:", len(my_first_array))
print("shape of the array:", my_first_array.shape)
print("length extracted from the shape of the array:", my_first_array.shape[0])
The reason that the shape attribute returns a sequence of one number instead of simply returning the number is that it is also useful for multi-dimensional arrays.
We can also make 2-dimensional arrays like this:
my_2D_array = np.array([[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]])
print(my_2D_array)
The values in an array can be accessed by putting a sequence of indices in brackets: my_array[2,3]
print("Second row of the 2D array", my_2D_array[1])
print("Value in the third column of the second row:", my_2D_array[1, 2])
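As a small sketch of how the two indices work together (the name grid is used only for this example): the first index selects a row and the second one selects a column inside that row, so indexing in two steps gives the same value.

```python
import numpy as np

grid = np.array([[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]])

second_row = grid[1]          # the whole second row
value_direct = grid[1, 2]     # row index 1, column index 2
value_two_steps = grid[1][2]  # first take the row, then the element in it

print(second_row)
print(value_direct, value_two_steps)
```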
As we mentioned, the shape of a multi-dimensional array will contain more than one number (one size per dimension):
print("The shape is:", my_2D_array.shape)
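The shape tuple can be split into one variable per dimension. The sketch below also shows that, for a 2D array, len() only gives the size of the first dimension, while the size attribute gives the total number of values. The names grid, number_of_rows and number_of_columns are chosen for this example.

```python
import numpy as np

grid = np.array([[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9, 11]])

number_of_rows, number_of_columns = grid.shape  # one size per dimension
print("rows:", number_of_rows, "columns:", number_of_columns)
print("len() gives the number of rows:", len(grid))
print("total number of values:", grid.size)
```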
Numpy also defines many mathematical functions that can be applied either to numbers or to arrays of numbers. Some of the most useful ones are np.sin, np.cos, np.log, np.exp:
print("sine of a number:")
print(np.sin(2))
print("sine of an array:")
print(np.sin(my_2D_array))
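The other functions mentioned above work the same way. As a brief sketch, np.exp computes the exponential and np.log the natural logarithm, so applying one after the other gives back the original values (up to rounding). The array called values is just an example.

```python
import numpy as np

values = np.array([0.5, 1.0, 2.0])

exponentials = np.exp(values)         # e to the power of each value
back_to_values = np.log(exponentials) # natural logarithm undoes the exponential

print(exponentials)
print(back_to_values)
print(np.exp(0), np.log(1))
```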
We saw that we could create numpy arrays from lists with the array function. This is not the only way. Numpy provides many other functions for generating arrays. Here are a few useful ones.
The functions np.zeros and np.ones can be used to create arrays filled with zeros or ones. You have to give them the shape as argument:
print("Array of zeros of shape 3x4:")
print(np.zeros((3,4)))
print("Arrays of ones of shape 2x5:")
print(np.ones((2, 5)))
np.arange(n) will generate a one-dimensional array of increasing integers, from 0 to n-1:
print(np.arange(15))
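np.arange also accepts a start value, an end value and a step. A short sketch: note that the end value itself is never included.

```python
import numpy as np

from_zero = np.arange(5)          # 0, 1, 2, 3, 4
even_numbers = np.arange(2, 10, 2)  # start at 2, stop before 10, step of 2
quarters = np.arange(0, 1, 0.25)    # the step can also be a decimal number

print(from_zero)
print(even_numbers)
print(quarters)
```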
We can also ask numpy to generate an array with random values.
Two useful functions for this are np.random.randn and np.random.uniform. (They begin with np.random because they are located in the random subpackage of numpy.)
np.random.uniform(a, b, size=shape) will fill an array of the given shape with random values uniformly distributed between a and b.
print("Array of shape 3x5 filled with random numbers between 0 and 5:")
print(np.random.uniform(0, 5, size=(3, 5)))
# Re-execute this cell several times to see that the values generated change (this shows they are random)
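To convince yourself that the generated values really stay between a and b, one possible check is to generate many values and look at the smallest and the largest one. This sketch uses the min and max methods of Numpy arrays; the bounds -10 and 5 are arbitrary.

```python
import numpy as np

samples = np.random.uniform(-10, 5, size=(1000,))

smallest = samples.min()
largest = samples.max()
print("smallest value:", smallest)
print("largest value:", largest)
```

The exact numbers change at every execution, but they should always lie between -10 and 5.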
np.random.randn(dimension1, dimension2, ...) generates an array of shape dimension1 x dimension2 x ... with values drawn from the standard normal distribution (a.k.a. Gaussian distribution, a.k.a. bell curve distribution). Note that, unfortunately, the way to call these functions is not very consistent: np.random.randn takes the dimensions as separate arguments, whereas np.random.uniform, np.zeros and np.ones take the shape as a single tuple.
print("Array of shape 3x4 filled with values from a Normal distribution")
np.random.randn(3,4)
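As a rough sanity check, the values produced by np.random.randn should have an average close to 0 and a standard deviation close to 1 when many of them are generated. The sketch below uses the mean and std methods of Numpy arrays, and also contrasts the two calling conventions for the shape.

```python
import numpy as np

many_values = np.random.randn(100000)
average = many_values.mean()
spread = many_values.std()
print("average:", average)   # close to 0
print("spread:", spread)     # close to 1

# Same shape, two different calling conventions
normal_values = np.random.randn(3, 4)                  # dimensions as separate arguments
uniform_values = np.random.uniform(0, 1, size=(3, 4))  # shape given as a tuple
print(normal_values.shape, uniform_values.shape)
```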
The function np.linspace(a, b) will generate uniformly spaced numbers between a and b (both included). It will be especially useful for us to plot functions with Matplotlib. By default, it generates 50 values, but this can be changed with the optional num argument.
print("50 numbers uniformly spaced between 0 and 10:")
print(np.linspace(0, 10))
print("5 numbers uniformly spaced between 0 and 10:")
print(np.linspace(0, 10, num=5))
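A small sketch to make the behaviour of np.linspace explicit: both end points are part of the result, and the gap between two consecutive values is the same everywhere. The variable names are chosen for this example.

```python
import numpy as np

five_values = np.linspace(0, 10, num=5)
default_values = np.linspace(0, 10)

print(five_values)                       # the end points 0 and 10 are included
print("gap:", five_values[1] - five_values[0])
print("default number of values:", len(default_values))
```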
To understand intuitively what we are doing with functions, it will be interesting to have some visualisation tools. Mostly, we will use the matplotlib library, which is one of the most commonly used libraries for plots and charts in Python.
Like we imported numpy before, we need to import matplotlib here with import matplotlib. We will also explicitly import a subpackage of matplotlib that we will use often: import matplotlib.pyplot as plt.
In addition, we will use a command specific to Jupyter: %matplotlib inline. It allows Jupyter to display the plots below each cell.
The plt.rcParams['figure.figsize'] = [15, 6] statement is optional. It serves to adjust the size of the plots displayed in Jupyter.
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
#plt.rcParams['figure.figsize'] = [15, 6]
To plot a function, we will mostly use the function plt.plot. What plt.plot does is to take a list of x-values and a list of y-values, and then draw lines corresponding to these x and y values.
print("plot of a line going from (x=1, y=-10) to (x=2, y=0) to (x=3, y=1)")
plt.plot([1,2,3], [-10, 0, 1])
Instead of drawing the line, we can also just display the points:
plt.plot([1,2,3], [-10, 0, 1], "o")
We can also change the color:
plt.plot([1,2,3], [-10, 0, 1], "r")
We can do both at the same time:
plt.plot([1,2,3], [-10, 0, 1], "ro")
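The short text given as third argument is a format string that can combine a color letter with a marker or a line style. As a sketch, here are two other combinations that matplotlib accepts: "g--" for a green dashed line and "b^" for blue triangles. Calling plt.plot twice in the same cell draws both on the same figure.

```python
import matplotlib.pyplot as plt

x_points = [1, 2, 3]
y_points = [-10, 0, 1]

# plt.plot returns a list containing the line objects it has drawn
dashed_line = plt.plot(x_points, y_points, "g--")  # green dashed line
triangles = plt.plot(x_points, y_points, "b^")     # blue triangle markers
```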
To plot a function, we will use the np.linspace function to generate the x-values. Then we can apply a function to the x-values to obtain the y-values, which gives us the points of our graph.
x_values = np.linspace(-2,2) # We compute x values between -2 and 2
def compute_square(x):
    return x*x
y_values = compute_square(x_values)
print("x values:")
print(x_values)
print("y values:")
print(y_values)
We can then plot the function with plt.plot
plt.plot(x_values, y_values)
plt.plot(x_values, y_values, "o")
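The number of x-values decides how smooth the curve looks, since plt.plot simply joins the points with straight lines. The following sketch compares 5 points and 200 points for a hypothetical function compute_cube, and adds axis labels and a legend with plt.xlabel, plt.ylabel and plt.legend.

```python
import numpy as np
import matplotlib.pyplot as plt

def compute_cube(x):
    # Hypothetical example function, used only for this illustration
    return x * x * x

coarse_x = np.linspace(-2, 2, num=5)    # few points: the curve looks angular
fine_x = np.linspace(-2, 2, num=200)    # many points: the curve looks smooth

plt.plot(fine_x, compute_cube(fine_x), "b", label="200 points")
plt.plot(coarse_x, compute_cube(coarse_x), "ro--", label="5 points")
plt.xlabel("x")
plt.ylabel("x cubed")
plt.legend()
```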
What do we do if we want to plot the function on the interval [-3, 5]?
Now, please try to plot a function of your choice (e.g. $$f(x)=\sin(x)\cdot\cos(x)$$ or $$f(x)=\sin(x^2)\cdot x$$ ...)