Assignment Overview
This module is assessed by portfolio. Your portfolio will consist of:
A learning journal that uses the specified
Your source code and data files that use the specified
It is recommended that you use a revision control system, such as subversion, for maintaining your source code. This will enable your code to be backup up and will also provide a source of evidence in your defence if collusion is
The module is assessed by evidence that you provide of:
Analysis of supplied programs and experimentation with
Capturing results and depicting them as
Written answers to the given related theoretical
· Writing programs to solve computationally expensive problems that require High Performance Computing.
Written explanations of your results and reflections of your
Two different technologies, POSIX Threads and CUDA will be combined with one application domain of password cracking. The following sections are first ordered, by technology then by application domain.
Portfolio Tasks
Task 1 Parallel and Distributed Systems (15 Marks)
What are threads and what they are designed to solve? (2 marks)
Name and describe two process scheduling policies. Which one is preferable and does the choice of policies have any influence on the behavior of Java threads? (2 marks)
Distinguish between Centralized and Distributed systems? (2 marks)
Explain transparency in D S? (2 marks)
The following three statements contain a flow dependency, an antidependency and an output dependency. Can you identify each?
Given that you are allowed to reorder the statements, can you find a permutation that produces the same values for the variables C and B as the original given statements?
Show how you can reduce the dependencies by combining or rearranging calculations and using temporary variables. (4 marks)
Note: Show all the works in your report and produce a simple C code simulate the process of producing the C and B values. (2marks for solving dependencies and 2marks for the code)
B=A+C B=C+D C=B+D
What output do the following 2 programs produce and why? (3 marks)
#include #include |
#include #include |
int counter;static void * thread_func(void * _tn) { int i; |
int counter;static void * thread_func(void * _tn) { int i; |
for (i = 0; i < 100000; i++) |
for (i = 0; i < 100000; i++) |
counter++; |
counter++; |
return NULL; |
return NULL; |
} |
} |
int main() { |
int main() { |
int i, N = 5; |
int i, N = 5; |
pthread_t t[N]; |
pthread_t t[N]; |
for (i = 0; i < N; i++) |
for (i = 0; i < N; i++) { |
pthread_create(&t[i], NULL, thread_func, NULL); |
pthread_create(&t[i], NULL, thread_func, NULL); |
for (i = 0; i < N; i++) pthread_join(t[i], NULL); |
pthread_join(t[i], NULL); } |
printf("%dn", counter); |
printf("%dn", counter); |
return 0; |
return 0; |
} |
} |
Outputs: (1.5 marks) |
Outputs: (1.5 marks) |
Task 2: Applications of Matrix Multiplication and Password Cracking using HPC- based CPU system: (35 Marks)
Part A: Single Thread Matrix Multiplication (5 Marks)
Study the following algorithm that is written for multiplying two matrices A and B and storing the result in C.
Now answer each of the following questions:
Analyse the above algorithm for its complexity. (1 marks)
Suggest at least three different ways to speed up the matrix multiplication algorithm given here. (Pay special attention to the utilisation of cache memory to achieve the intended speed up). (1 marks)
Write your improved algorithms as pseudo-codes using any editor. Also, provide a reasoning as to why you think the suggested algorithm is an improvement over the given algorithm. (1 marks)
Write a C program that implements matrix multiplication using both the loop as given above and the improved versions that you have written. (1marks)
Measure the timing performance of these implemented algorithms. Record your (Remember to use large values of N, M and P – the matrix dimensions when doing this task). (1 marks)
Part B: Write a code to implement matrix multiplication using multithreading (10 Marks)
Write a C program to implement matrix multiplication using multithreading. The number of threads should be configurable at run time, for example, read via an external file. (5 marks)
The code should print the time it takes to do the matrix multiplication using the given number of threads by averaging it over multiple runs. (2.5 marks)
Plot the time it takes to complete the matrix multiplication against the number of threads and identify a sweet spot in terms of the optimal number of threads needed to do the matrix multiplication. (2.5 marks)
(In doing this exercise, you should use matrices of large sizes e.g. 1024 * 1024 sized matrices).
Part C: Password cracking using POSIX Threads (20 Marks)
You will be provided with a template code of encrypting a password using SHA-512 algorithm. You are asked to run a code using the given template to encrypt a password contains TWO uppercase letters and TWO integer numbers (total 4 digits). Afterwards, you need to run the given code to crack the password, i.e. find the plain- text equivalents. An example password is AS12.
Run the given cracking program 10 times and calculate the mean running time (Using 2 uppercase letters and 2 integer numbers). (2 marks)
In your learning journal make an estimate of how long it would take to run on the same computer if the number of initials were increased to 3. Include your working in your answer. (2 marks)
Modify the program to crack the three-initials-two-digits password given in the
three_initials variable. An example password is HPC19. (2 marks)
4. Write a short paragraph to compare the running time (average of 10 times) of your three_initials program with your earlier estimate. If your estimate was wrong explain why you think that is. (2 mark)
Modify the original version of the program to run on 2 threads. It does not need to do anything fancy, just follow the following algorithm. (10 marks)
Record the system time a nano second timer
Launch thread_1 that calls kernel_function_1 that can search for passwords starting from A-M
Launch thread_2 that calls kernel_function_2 that can search for passwords starting from N-Z
Wait for thread_1 to finish
Wait for thread_2 to finish
Record the system time using a nano second time and print the elapsed time.
Compare the results of the mean running time of the original program, (obtained by step no. 1), with the mean running time of the multithread version (10 running times). (2 marks)
DescriptionIn this final assignment, the students will demonstrate their ability to apply two ma
Path finding involves finding a path from A to B. Typically we want the path to have certain properties,such as being the shortest or to avoid going t
Develop a program to emulate a purchase transaction at a retail store. Thisprogram will have two classes, a LineItem class and a Transaction class. Th
1 Project 1 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of
1 Project 2 Introduction - the SeaPort Project series For this set of projects for the course, we wish to simulate some of the aspects of a number of