
This coursework is only compulsory for MSc students taking the 20cr module. We released a different Lab 2 with an earlier deadline for UG students taking the 20cr module.

INSTRUCTIONS TO CANDIDATES

ANSWER ALL QUESTIONS


You need to implement one program that solves Exercises 1-3 using any programming language. In Exercise 5, you will run a set of experiments and describe the results using plots and a short discussion.

(In the following, replace abc123 with your username.) You need to submit one zip file with the name niso3-abc123.zip. The zip file should contain one directory named niso3-abc123 containing the following files:

• the source code for your program

• a Dockerfile (see the appendix for instructions)

• a PDF file for Exercises 4 and 5

In this lab, we will do a simple form of time series prediction. We assume that we are given some historical data (e.g., bitcoin prices for each day over a year) and need to predict the next value in the time series (e.g., tomorrow's bitcoin value).

We formulate the problem as a regression problem. The training data consists of a set of $m$ input vectors $X = (x^{(0)}, \ldots, x^{(m-1)})$ representing historical data, and a set of $m$ output values $Y = (y^{(0)}, \ldots, y^{(m-1)})$, where for each $0 \le j \le m-1$, $x^{(j)} \in \mathbb{R}^n$ and $y^{(j)} \in \mathbb{R}$. We will use genetic programming to evolve a prediction model $f : \mathbb{R}^n \to \mathbb{R}$, such that $f(x^{(j)}) \approx y^{(j)}$.
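The handout does not say how the pairs $(x^{(j)}, y^{(j)})$ are derived from a raw time series. A minimal sketch of one common construction, assuming a sliding window in which each input vector holds $n$ consecutive values and the output is the value that follows, is given below. The function name `make_windows` is hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], n: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (X, Y) with a sliding window of width n.

    Assumption (not stated in the handout): each x holds n consecutive
    values and y is the value immediately after that window.
    """
    X, Y = [], []
    for start in range(len(series) - n):
        X.append(list(series[start:start + n]))
        Y.append(series[start + n])
    return X, Y


X, Y = make_windows([1.0, 2.0, 3.0, 4.0, 5.0], n=2)
print(X)  # [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(Y)  # [3.0, 4.0, 5.0]
```

With a series of length 5 and $n = 2$, this yields $m = 3$ training pairs.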

Candidate solutions, i.e. programs, will be represented as expressions, where each expression evaluates to a value, which is considered the output of the program. When evaluating an expression, we assume that we are given a current input vector $x = (x_0, \ldots, x_{n-1}) \in \mathbb{R}^n$. Expressions and evaluations are defined recursively. Any floating point number is an expression which evaluates to the value of the number. If $e_1$, $e_2$, $e_3$, and $e_4$ are expressions which evaluate to $v_1$, $v_2$, $v_3$ and $v_4$ respectively, then the following are also expressions:

• (add $e_1$ $e_2$) is addition, which evaluates to $v_1 + v_2$, e.g. (add 1 2) ≡ 3

• (sub $e_1$ $e_2$) is subtraction, which evaluates to $v_1 - v_2$, e.g. (sub 2 1) ≡ 1

• (mul $e_1$ $e_2$) is multiplication, which evaluates to $v_1 v_2$, e.g. (mul 2 1) ≡ 2

• (div $e_1$ $e_2$) is division, which evaluates to $v_1 / v_2$ if $v_2 \neq 0$ and 0 otherwise, e.g. (div 4 2) ≡ 2 and (div 4 0) ≡ 0

• (pow $e_1$ $e_2$) is power, which evaluates to $v_1^{v_2}$, e.g. (pow 2 3) ≡ 8

• (sqrt $e_1$) is the square root, which evaluates to $\sqrt{v_1}$, e.g. (sqrt 4) ≡ 2

• (log $e_1$) is the logarithm base 2, which evaluates to $\log_2(v_1)$, e.g. (log 8) ≡ 3

• (exp $e_1$) is the exponential function, which evaluates to $e^{v_1}$, e.g. (exp 2) ≡ $e^2 \approx 7.39$

• (max $e_1$ $e_2$) is the maximum, which evaluates to $\max(v_1, v_2)$, e.g. (max 1 2) ≡ 2

• (ifleq $e_1$ $e_2$ $e_3$ $e_4$) is a branching statement, which evaluates to $v_3$ if $v_1 \le v_2$ and to $v_4$ otherwise, e.g. (ifleq 1 2 3 4) ≡ 3 and (ifleq 2 1 3 4) ≡ 4

• (data $e_1$) is the $j$-th element $x_j$ of the input, where $j \equiv |\lfloor v_1 \rfloor| \bmod n$

• (diff $e_1$ $e_2$) is the difference $x_k - x_\ell$, where $k \equiv |\lfloor v_1 \rfloor| \bmod n$ and $\ell \equiv |\lfloor v_2 \rfloor| \bmod n$

• (avg $e_1$ $e_2$) is the average $\frac{1}{|k-\ell|} \sum_{t=\min(k,\ell)}^{\max(k,\ell)-1} x_t$, where $k \equiv |\lfloor v_1 \rfloor| \bmod n$ and $\ell \equiv |\lfloor v_2 \rfloor| \bmod n$

In all cases where the mathematical value of an expression is undefined or not a real number (e.g., $\sqrt{-1}$, $1/0$ or (avg 1 1)), the expression should evaluate to 0.
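The recursive definitions above can be sketched as a small parser and evaluator. This is only an illustration under stated assumptions, not a reference solution: the names `parse`, `index` and `evaluate` are hypothetical, a hand-written tokenizer stands in for an s-expression library, and results that overflow or are not finite are mapped to 0, which is an interpretation of the "undefined or not a real number" rule.

```python
import math


def parse(text):
    """Parse one s-expression into nested lists of operator names and floats."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        token = tokens[pos]
        if token == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1  # skip the closing parenthesis
        try:
            return float(token), pos + 1
        except ValueError:
            return token, pos + 1  # operator name such as "add"

    return read(0)[0]


def index(v, n):
    """Map a value to an input index: |floor(v)| mod n."""
    return abs(math.floor(v)) % n


def evaluate(node, x):
    """Evaluate a parsed expression on the input vector x."""
    if isinstance(node, float):
        return node if math.isfinite(node) else 0.0
    op, args = node[0], node[1:]
    if op == "ifleq":  # only the selected branch needs to be evaluated
        a, b = evaluate(args[0], x), evaluate(args[1], x)
        return evaluate(args[2] if a <= b else args[3], x)
    v = [evaluate(arg, x) for arg in args]
    n = len(x)
    try:
        if op == "add":
            r = v[0] + v[1]
        elif op == "sub":
            r = v[0] - v[1]
        elif op == "mul":
            r = v[0] * v[1]
        elif op == "div":
            r = v[0] / v[1] if v[1] != 0 else 0.0
        elif op == "pow":
            r = math.pow(v[0], v[1])  # raises ValueError for non-real results
        elif op == "sqrt":
            r = math.sqrt(v[0])
        elif op == "log":
            r = math.log2(v[0])
        elif op == "exp":
            r = math.exp(v[0])
        elif op == "max":
            r = max(v[0], v[1])
        elif op == "data":
            r = x[index(v[0], n)]
        elif op == "diff":
            r = x[index(v[0], n)] - x[index(v[1], n)]
        elif op == "avg":
            k, ell = index(v[0], n), index(v[1], n)
            r = sum(x[min(k, ell):max(k, ell)]) / abs(k - ell)
        else:
            raise NotImplementedError(op)
    except (ValueError, OverflowError, ZeroDivisionError):
        return 0.0  # undefined value, e.g. sqrt(-1) or (avg 1 1)
    # Assumption: infinities and NaN are also treated as "not a real number".
    return r if math.isfinite(r) else 0.0


print(evaluate(parse("(mul (add 1 2) (log 8))"), [1.0]))         # 9.0
print(evaluate(parse("(max (data 0) (data 1))"), [1.0, 2.0]))    # 2.0
print(evaluate(parse("(sqrt -1)"), [1.0]))                       # 0.0
print(evaluate(parse("(avg 0 2)"), [10.0, 20.0, 30.0]))          # 15.0
```

Note that `math.pow` is used rather than the `**` operator because `**` would return a complex number for a negative base with a fractional exponent instead of signalling an undefined value.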

We can build large expressions from the recursive definitions. For example, the expression

(add (mul 2 3) (log 4))

evaluates to

2 · 3 + log(4) = 6 + 2 = 8.

To evaluate the fitness of an expression $e$ on training data $(X, Y)$ of size $m$, we use the mean square error

$$f(e) = \frac{1}{m} \sum_{j=0}^{m-1} \left( y^{(j)} - e(x^{(j)}) \right)^2,$$

where $e(x^{(j)})$ is the value of the expression $e$ when evaluated on the input vector $x^{(j)}$.
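As a small illustration of the mean square error, the sketch below computes it for any callable that maps an input vector to a prediction. The name `mean_squared_error` and the stand-in model `predict` are hypothetical; in the lab, the callable would evaluate an expression on the input vector.

```python
from typing import Callable, Sequence


def mean_squared_error(predict: Callable[[Sequence[float]], float],
                       X: Sequence[Sequence[float]],
                       Y: Sequence[float]) -> float:
    """Average of the squared differences y - predict(x) over the training data."""
    m = len(Y)
    return sum((y - predict(x)) ** 2 for x, y in zip(X, Y)) / m


# Stand-in model for illustration: predict the sum of the two inputs.
predict = lambda x: x[0] + x[1]
X = [[1.0, 2.0], [2.0, 3.0]]
Y = [3.0, 6.0]
print(mean_squared_error(predict, X, Y))  # 0.5
```

The first prediction is exact and the second is off by 1, so the error is $(0^2 + 1^2)/2 = 0.5$. Lower values mean fitter expressions.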

Exercise 1. (30 % of the marks)

Implement a routine to parse and evaluate expressions. You can assume that the input describes a syntactically correct expression. Hint: Make use of a library for parsing s-expressions¹, and ensure that you evaluate expressions exactly as specified above.

Input arguments:

• -expr an expression

• -n the dimension of the input vector n

• -x the input vector

Output:

• the value of the expression

Example:

[pkl@phi ocamlec]$ niso_lab3 -question 1 -n 1 -x "1.0" -expr "(mul (add 1 2) (log 8))"

9.0

[pkl@phi ocamlec]$ niso_lab3 -question 1 -n 2 -x "1.0 2.0" -expr "(max (data 0) (data 1))"

2.0
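The command lines in the example can be read with Python's standard `argparse` module, which accepts single-dash option names such as `-expr`. The sketch below is one possible way to do this; the function name `build_parser` is hypothetical, and the `-question` option is taken from the example invocation above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line options as they appear in the Exercise 1 example."""
    parser = argparse.ArgumentParser(prog="niso_lab3")
    parser.add_argument("-question", type=int, required=True)
    parser.add_argument("-expr")
    parser.add_argument("-n", type=int)
    # The input vector arrives as one quoted string of space-separated numbers.
    parser.add_argument("-x", type=lambda s: [float(t) for t in s.split()])
    return parser


args = build_parser().parse_args(
    ["-question", "1", "-n", "2", "-x", "1.0 2.0",
     "-expr", "(max (data 0) (data 1))"]
)
print(args.question, args.n, args.x, args.expr)
```

In a real program, `parse_args()` would be called without arguments so that it reads `sys.argv`.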

Exercise 2. (10 % of the marks)

Implement a routine which computes the fitness of an expression given a training data set.

Input arguments:

• -expr an expression

• -n the dimension of the input vector

• -m the size of the training data $(X, Y)$

• -data the name of a file containing the training data in the form of m lines, where each line contains n + 1 values separated by tab characters. The first n elements in a line represent an input vector x, and the last element in a line represents the output value y.

Output:

• The fitness of the expression, given the data.

¹ See e.g. the implementations at http://rosettacode.org/wiki/S-Expressions
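The training data file format described for the -data argument (m lines, each with n + 1 tab-separated values) could be read as sketched below. The function names `parse_training_data` and `load_training_data` are hypothetical, and checking the line count against m is an added safeguard, not something the handout requires.

```python
from typing import Iterable, List, Tuple


def parse_training_data(lines: Iterable[str], n: int, m: int) -> Tuple[List[List[float]], List[float]]:
    """Turn tab-separated lines into (X, Y): n inputs, then one output, per line."""
    X, Y = [], []
    for line in lines:
        if not line.strip():
            continue  # tolerate a trailing blank line
        values = [float(field) for field in line.strip().split("\t")]
        if len(values) != n + 1:
            raise ValueError(f"expected {n + 1} values per line, got {len(values)}")
        X.append(values[:n])
        Y.append(values[n])
    if len(Y) != m:
        raise ValueError(f"expected {m} lines of data, got {len(Y)}")
    return X, Y


def load_training_data(path: str, n: int, m: int) -> Tuple[List[List[float]], List[float]]:
    """Read the file named by the -data argument."""
    with open(path) as handle:
        return parse_training_data(handle, n, m)


X, Y = parse_training_data(["1.0\t2.0\t3.0\n", "2.0\t3.0\t5.0\n"], n=2, m=2)
print(X)  # [[1.0, 2.0], [2.0, 3.0]]
print(Y)  # [3.0, 5.0]
```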

Exercise 3. (30 % of the marks)

Design a genetic programming algorithm to do time series forecasting. You can use any genetic operators and selection mechanism you find suitable.
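Since the choice of operators is left open, the sketch below shows one possible set, assuming expressions are stored as nested lists: random tree generation, subtree mutation and subtree crossover. All names (`ARITY`, `random_expr`, `mutate`, `crossover`, `to_sexpr`) and the constants (leaf probability 0.3, constants drawn from [-5, 5]) are illustrative assumptions, and nothing here limits tree growth.

```python
import random

# Number of arguments for each function in the expression language.
ARITY = {"add": 2, "sub": 2, "mul": 2, "div": 2, "pow": 2, "sqrt": 1, "log": 1,
         "exp": 1, "max": 2, "ifleq": 4, "data": 1, "diff": 2, "avg": 2}


def random_expr(rng, depth):
    """Grow a random expression of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return round(rng.uniform(-5, 5), 2)  # leaf: a numeric constant
    op = rng.choice(sorted(ARITY))
    return [op] + [random_expr(rng, depth - 1) for _ in range(ARITY[op])]


def paths(expr, prefix=()):
    """Yield the position of every subtree as a tuple of child indices."""
    yield prefix
    if isinstance(expr, list):
        for i, child in enumerate(expr[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(expr, path):
    for i in path:
        expr = expr[i]
    return expr


def replace(expr, path, new):
    """Return a copy of expr with the subtree at path replaced by new."""
    if not path:
        return new
    copy = list(expr)
    copy[path[0]] = replace(expr[path[0]], path[1:], new)
    return copy


def mutate(expr, rng, depth=2):
    """Subtree mutation: swap a random subtree for a freshly grown one."""
    return replace(expr, rng.choice(list(paths(expr))), random_expr(rng, depth))


def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a random position of a."""
    donor = get(b, rng.choice(list(paths(b))))
    return replace(a, rng.choice(list(paths(a))), donor)


def to_sexpr(expr):
    """Print a nested-list expression in the s-expression syntax of this lab."""
    if isinstance(expr, list):
        return "(" + " ".join([expr[0]] + [to_sexpr(c) for c in expr[1:]]) + ")"
    return repr(expr)


rng = random.Random(0)
parent_a, parent_b = random_expr(rng, 3), random_expr(rng, 3)
print(to_sexpr(mutate(crossover(parent_a, parent_b, rng), rng)))
```

The operators never modify their arguments in place, so parents can safely share subtrees with their offspring.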

Input arguments:

• -lambda population size

• -n the dimension of the input vector

• -m the size of the training data $(X, Y)$

• -data the name of a file containing training data in the form of m lines, where each line contains n + 1 values separated by tab characters. The first n elements in a line represent an input vector x, and the last element in a line represents the output value y.

• -time_budget the number of seconds to run the algorithm

Output:

• The fittest expression found within the time budget.
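To illustrate how the population size and the time budget fit together, the sketch below gives a generic generational loop with tournament selection that stops once the budget has elapsed. It is one possible design, not a prescribed one: the function `evolve` and its parameters are hypothetical, the operators are passed in as callables, and the demonstration uses plain numbers instead of expressions to stay short. Fitness is minimised, as with the mean square error.

```python
import random
import time


def evolve(fitness, random_individual, mutate, crossover, lam, time_budget,
           tournament_size=2, rng=None):
    """Run a generational loop until time_budget seconds have passed.

    Returns the best individual seen and its fitness (lower is better).
    The deadline is checked once per generation, so a slow generation
    can overshoot the budget slightly.
    """
    rng = rng or random.Random()
    deadline = time.monotonic() + time_budget
    population = [random_individual(rng) for _ in range(lam)]
    scores = [fitness(ind) for ind in population]
    i = min(range(lam), key=scores.__getitem__)
    best, best_score = population[i], scores[i]

    def select():
        # Tournament selection: the fittest of a few random contenders wins.
        contenders = rng.sample(range(lam), min(tournament_size, lam))
        return population[min(contenders, key=scores.__getitem__)]

    while time.monotonic() < deadline:
        offspring = [mutate(crossover(select(), select(), rng), rng)
                     for _ in range(lam)]
        population, scores = offspring, [fitness(c) for c in offspring]
        i = min(range(lam), key=scores.__getitem__)
        if scores[i] < best_score:
            best, best_score = population[i], scores[i]
    return best, best_score


# Toy demonstration: individuals are numbers and the target value is 3.
best, score = evolve(
    fitness=lambda v: (v - 3.0) ** 2,
    random_individual=lambda r: r.uniform(-10, 10),
    mutate=lambda v, r: v + r.gauss(0, 0.1),
    crossover=lambda a, b, r: (a + b) / 2,
    lam=20,
    time_budget=0.05,
)
print(best, score)
```

For this lab, `random_individual`, `mutate` and `crossover` would operate on expressions, and `fitness` would be the mean square error on the training data.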

Exercise 4. (10 % of the marks)

Describe your algorithm from Exercise 3 in the form of pseudo-code. The pseudo-code should be sufficiently detailed to allow an exact re-implementation.

Exercise 5. (20 % of the marks)

In this final task, you should try to determine parameter settings for your algorithm which lead to expressions that are as fit as possible.

Your algorithm is likely to have several parameters, such as the population size, mutation rates, the selection mechanism, and other components, such as diversity mechanisms.

Choose parameters which you think are essential for the behaviour of your algorithm. Run a set of experiments to determine the impact of these parameters on the solution quality. For each parameter setting, run 100 repetitions, and plot box plots of the fittest solution found within the time budget.
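The experimental protocol above (100 repetitions per parameter setting, one box plot per setting) could be organised as sketched below. The names `run_experiments`, `five_number_summary` and `save_boxplots` are hypothetical, `run_once` is a stand-in for a call to your own algorithm, and the plotting function assumes the third-party matplotlib package is installed.

```python
import random
import statistics


def run_experiments(run_once, settings, repetitions=100):
    """Run every parameter setting `repetitions` times with different seeds.

    `run_once(params, seed)` must return the fitness of the best solution
    found within the time budget for that run.
    """
    return {name: [run_once(params, seed) for seed in range(repetitions)]
            for name, params in settings.items()}


def five_number_summary(values):
    """Minimum, quartiles and maximum: the numbers a box plot visualises."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


def save_boxplots(results, path):
    """Draw one box per parameter setting (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(list(results.values()))
    ax.set_xticks(range(1, len(results) + 1))
    ax.set_xticklabels(list(results))
    ax.set_ylabel("fitness of best expression")
    fig.savefig(path)


# Stand-in for a real run: returns a random number scaled by the setting.
def fake_run(params, seed):
    return random.Random(seed).random() * params["scale"]


results = run_experiments(fake_run, {"lambda=10": {"scale": 1.0},
                                     "lambda=100": {"scale": 2.0}},
                          repetitions=10)
for name, values in results.items():
    print(name, five_number_summary(values))
```

Using the repetition index as the random seed makes each run reproducible while keeping the repetitions independent.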
