CSCI.6500/CSCI.4971 Distributed Computing over the Internet

Spring, 2016

Programming Assignment 2.

This project is to be done individually or in pairs. Do not show your code to any other student/group and do not look at any other student/group's code. Do not put your code in a public directory or otherwise make it public. You are encouraged to use the LMS Discussions page to post problems so that other students can also answer and see the answers.

The goal of this assignment is to practice concurrent and distributed programming using the SALSA programming language.

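You are to analyze three-dimensional data from the Sloan Digital Sky Survey, in particular stars in our Milky Way galaxy, for future human space colonization. Your program needs to compute the following five quantities, each reported together with the star(s) that attain it, as in the output format below:

  1. The minimal pairwise distance.
  2. The maximal pairwise distance.
  3. The minimum maximal distance.
  4. The maximum minimal distance.
  5. The minimal average distance.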

For each computation, output all answers if there are ties. For example, when computing closest neighbours, if multiple pairs share the same minimum pairwise distance, output every such pair.

More information on Milky Way visualization can be found in the associated MilkyWay@Home forum.

You are given a stars text file. Its first line gives the total number of stars in the file, and each following line gives one star's three coordinates <x,y,z>, written as X Y Z. Your program should remove duplicate entries. Your output should look as follows:

d1  // minimal pairwise distance
s1 s2
...

d2  // maximal pairwise distance
s1 s2
s3 s2
...

d3  // minimum maximal distance
s3 s1
...

d4  // maximum minimal distance
s4
s5
...

d5 // minimal average distance
s1
...

where d_i denotes a distance and s_i denotes a star, which is represented as (x,y,z), corresponding to the x, y, and z coordinates of the star.
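
As a sequential reference against which a SALSA solution could be checked, the following is a minimal sketch in Python (not SALSA). The function names parse_stars, select, and analyze are hypothetical. It assumes that quantities d3, d4, and d5 are taken per star over the distances to all other stars, and that ties are detected with a small floating-point tolerance; neither assumption is stated in the assignment text.

```python
import math
from itertools import combinations

def parse_stars(text):
    """Parse the stars file: first line is the count, then one 'X Y Z' line per star."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    declared = int(lines[0])  # number of stars announced on the first line
    stars = [tuple(float(v) for v in ln.split()) for ln in lines[1:declared + 1]]
    # dict.fromkeys keeps the first occurrence of each star, in order
    return list(dict.fromkeys(stars))

def select(items, key, pick, tol=1e-9):
    """Return (best value, every item tied for it); pick is min or max."""
    values = [key(item) for item in items]
    best = pick(values)
    tied = [item for item, v in zip(items, values)
            if math.isclose(v, best, rel_tol=tol, abs_tol=tol)]
    return best, tied

def analyze(stars):
    """Compute the five quantities; needs at least two distinct stars."""
    pairs = list(combinations(stars, 2))

    def pair_dist(pair):
        return math.dist(pair[0], pair[1])

    def others(star):
        # distances from one star to every other (deduplicated) star
        return [math.dist(star, t) for t in stars if t != star]

    def average(star):
        ds = others(star)
        return sum(ds) / len(ds)

    return {
        "d1": select(pairs, pair_dist, min),                    # minimal pairwise distance
        "d2": select(pairs, pair_dist, max),                    # maximal pairwise distance
        "d3": select(stars, lambda s: max(others(s)), min),     # minimum maximal distance
        "d4": select(stars, lambda s: min(others(s)), max),     # maximum minimal distance
        "d5": select(stars, average, min),                      # minimal average distance
    }
```

The sketch is quadratic in the number of stars and keeps all pairs in memory, so it is only suitable for checking small inputs.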

Part 1 - Concurrent Solution

Write an actor-based solution to the space colonization problem.
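
One possible decomposition, sketched here in Python rather than SALSA, gives each worker a share of the stars, lets it find its local closest pairs, and has a coordinator merge the partial results while preserving ties. The thread pool merely stands in for worker actors, and the names local_closest, merge, and closest_pairs are hypothetical; the assignment does not prescribe this design.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def local_closest(stars, rows):
    """Worker: closest pairs (i, j) with i in rows and j > i."""
    best, pairs = math.inf, []
    for i in rows:
        for j in range(i + 1, len(stars)):
            d = math.dist(stars[i], stars[j])
            if d < best:
                best, pairs = d, [(stars[i], stars[j])]
            elif d == best:  # exact comparison; a tolerance may be needed for real data
                pairs.append((stars[i], stars[j]))
    return best, pairs

def merge(results):
    """Coordinator: keep the global minimum and every pair that attains it."""
    best = min(d for d, _ in results)
    return best, [p for d, ps in results if d == best for p in ps]

def closest_pairs(stars, workers=4):
    # Interleaved rows balance the load, since row i holds len(stars) - i - 1 pairs
    chunks = [range(w, len(stars), workers) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda rows: local_closest(stars, rows), chunks))
    return merge(results)
```

The same split-and-merge shape applies to the other four quantities: only the per-worker computation and the merge rule (minimum or maximum) change.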

Part 2 - Distributed Solution

Write a distributed space colonization solution. Note that in this case, the actors must communicate over a network although you do not necessarily have to run each actor on a separate machine. When using SALSA, your solution will be distributed if you make use of universal actors.

Time-Saving Hints

  1. For reference, please see the SALSA webpage, including its FAQ. Read the tutorial and a comprehensive example illustrating distributed programming in SALSA.
  2. salsac and salsa are UNIX aliases or Windows batch scripts that run javac and java with the expected arguments. See .cshrc for UNIX, and salsac.bat and salsa.bat for Windows.
  3. To run the distributed program, first, run the name server and the theaters:

    [host0:dir0]$ wwcns [port number 0]
    [host1:dir1]$ wwctheater [port number 1]
    [host2:dir2]$ wwctheater [port number 2]
    ...
    
    where wwcns and wwctheater are UNIX aliases or Windows batch scripts. See .cshrc for UNIX, and wwcns.bat and wwctheater.bat for Windows. Make sure that the theaters are run where the actor behavior code is available; that is, the pa2 directory should be visible in the directories host1:dir1 and host2:dir2. Then, run the distributed program as mentioned above.

  4. The theaters all cache actor behaviors. Restart all the theaters each time changes are made to the code.
  5. The module/behavior names in SALSA must match the directory/file hierarchy in the file system. For example, a Space behavior should be in the relative path pa2/Space.salsa and should start with the line module pa2;.
  6. Messaging is asynchronous. m1(...);m2(...); does not imply m1 occurs before m2.
  7. Notice that in the code m(...)@n(...);, n is processed after m is executed, but not necessarily after messages sent inside m are executed. For example, if inside m, messages m1 and m2 are sent, in general, n could happen before m1 and m2.
  8. (Named) tokens can only be used as arguments to messages.
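
The ordering issue in hints 6 and 7 can be illustrated by analogy with Python's asyncio. This is not SALSA code and does not reproduce SALSA's semantics exactly; the coroutine names are hypothetical. In the first variant, m only sends m1 and m2 and finishes, so n runs before they are processed. In the second, m explicitly waits for both sub-messages, so n runs after them.

```python
import asyncio

async def sub(log, name, delay):
    """Stand-in for a message that takes some time to be processed."""
    await asyncio.sleep(delay)
    log.append(name)

async def m_fire_and_forget(log, pending):
    # m only *sends* m1 and m2; it finishes before either is processed
    pending.append(asyncio.create_task(sub(log, "m1", 0.02)))
    pending.append(asyncio.create_task(sub(log, "m2", 0.01)))
    log.append("m done")

async def m_joined(log):
    # m waits for both sub-messages before it counts as finished
    await asyncio.gather(sub(log, "m1", 0.02), sub(log, "m2", 0.01))
    log.append("m done")

async def demo():
    log1, pending = [], []
    await m_fire_and_forget(log1, pending)
    log1.append("n")                 # n runs before m1 and m2 complete
    await asyncio.gather(*pending)   # drain the outstanding sub-messages

    log2 = []
    await m_joined(log2)
    log2.append("n")                 # n runs only after m1 and m2 complete
    return log1, log2

if __name__ == "__main__":
    first, second = asyncio.run(demo())
    print(first)
    print(second)
```

In a solution that aggregates partial results from several workers, the continuation that prints the final answer needs the second kind of guarantee.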

Due Date: April 12th, 6:00PM

Grading: The assignment will be graded mostly on correctness, but code clarity / readability will also be a factor (comment, comment, comment!).

Submission Requirements: Please submit a ZIP file with your code, including a README file. In the README file, list the names of all group members, along with the specific features and known bugs of your solution. Name your ZIP file after your LMS user name(s), either userid1.zip or userid1_userid2.zip. Submit only one assignment per pair via LMS.