1.0 The Problem
2.0 The Java Language
2.1 A Java Applet
2.2 Java Tools
3.0 The Fuzzy Clustering Applet
4.0 The Original Source Application - Fitter
5.0 Loading Data Files Into The Applet
6.0 Running the Algorithms
7.0 Saving the Data Points
8.0 The Server Side Application
8.1 The FuzzyClustInit CGI Script
8.2 The FuzzyClustResults CGI Script
8.3 The FuzzyClustSave CGI Script
8.4 The FuzzyClustHelp.html File
9.0 Development Problems Encountered
10.0 The Source Code Description
11.0 The Source Code Listings
1.0 The Problem
There exists a very useful fuzzy-logic clustering simulation
program which is strongly dependent upon the UNIX operating system,
the X Windows interface and the SUIT prototyping tool. It is
highly unlikely any group not possessing these three tools can
successfully utilize the simulation program. There is therefore
a strong incentive to port the application to a more readily
accessible platform in order to better demonstrate its usefulness
and promote its acceptance. Additionally, it would be highly
desirable to make the application accessible via the World Wide
Web offering the broadest possible audience. Clearly the platform
of choice then is the Java language.
2.0 The Java Language
For the first time in my recollection there exists a practical programming tool that allows for effortless portability among diverse hardware and OS platform combinations. But this platform independence does not come without penalty. Java is an interpreted language, and interpreted code generally runs slower than natively compiled code. To maintain compatibility, Java forces the programmer to the lowest common denominator among all the platforms it is written to support. Features like multi-threading, memory allocation and memory deallocation are all left to the run-time interpreter. In the name of compatibility and portability, this leaves the developer little room for optimization.
Writing applications in Java is quite simple compared to other
contemporary languages. Java is object-oriented by design.
Unlike C++, Java does not support pointers and does not have a
preprocessor, making the language easy to debug. The Java compiler
enforces strict type checking, and the run-time system checks array
indexes, trapping many potential problems early. The default packages
include a full graphics library and a powerful networking library,
making Java well positioned in today's development environment.
2.1 A Java Applet
A final feature of the language is its most exciting. Programs
written in Java that extend (are children of) the class Applet
can be executed inside of a web browser. This simple feature
opens a whole world of opportunities for developers. But, at the
same time the user must be protected. A malicious program must
not be able to harm the client's computer or browser. Because
of this concern, Java applets have two major restrictive rules
under which they must operate. First, an applet cannot read from
or write to the client's local disk storage. This prevents snooping
attacks and the renaming or removal of files. Second, an applet can
only open a network connection to the server from which it was
downloaded. This prevents bypassing network firewall security. While
these security measures clearly protect the user, they make it
very difficult to develop interactive applications, like fuzzy
clustering. To overcome this problem I chose to use a combination
of HTML forms and CGI scripts, which I will discuss later.
2.2 Java Tools
The Java language is relatively young; the current release is version 1.1.x. Surprisingly, there already exists quite a range of development tools. I had a chance to evaluate two integrated development environment (IDE) packages: Microsoft's J++ and SUN Microsystems' Java Workshop. Both tools were very good but imposed upon the resultant applications the added overhead of loading extra class files from the IDE. These files, some of which were over 200K in size, seemed like overkill for our applet. If the Fuzzy Clustering applet was to be truly web based, it should be as small as possible to minimize the download time. Therefore I ruled out using an IDE.
The Java language was upgraded from version 1.0.2 to version 1.1
earlier this year. Because the release is so recent, the major browser
suppliers, Netscape and Microsoft, do not yet fully support the new
Java. Therefore, to minimize incompatibilities, I chose to write
the applet using the older Java release. The Java compiler I used
comes directly from SUN Microsystems: the Java Development Kit
(JDK) version 1.0.2.
3.0 The Fuzzy Clustering Applet
Figure 1. The Fuzzy Clustering Applet
Above, in figure 1, is a screen shot of the Fuzzy Clustering Applet. The generic controls of the application are straightforward. The white square with a cube in it is the plotting canvas where all data points are plotted. This surface is also sensitive to mouse clicks. The user can add points to the canvas by simply clicking the left mouse button on the canvas. The points on the plotting canvas may be manipulated using the controls under the heading Point Marker Control (see the section labeled 5 in figure 1). To change the point size use the size controller. To change the point color use the color controller. To hide the cube, display the cube, or display a truncated cube, use the cube controller.
At the bottom of the applet there are a number of buttons whose functions are self-explanatory. To run the selected algorithm press the Run Algorithm button. To save the data points press the Save Points button. To clear the computed clusters or computed shells press the Clear Clusters button. To delete all the data points press the Clear Points button. For a help screen press the Help button.
The remaining inputs of the applet control the parameters to the different algorithms. In figure 1, the section labeled 4 contains two parameters valid for all algorithms. The approximate number of clusters is an integer value representing the number of groups of points in the data set. The weight assignment strategy can be: alternate, random or in order. This specifies the way in which cluster membership weights are initialized for every point in the graph.
Section 1 of figure 1 is where the user chooses the clustering process: Robust clustering or Shell clustering. If the user selects Shell clustering, they can then select a Shell clustering algorithm to run in the section labeled 2. The choices for Shell clustering algorithms are: AFCES Simple, AFCES Newton, and AFCES U. The section labeled 3 in figure 1 contains all the controls for the Robust clustering algorithm. The Lambda and Fuzziness Value must be real numbers greater than zero. The Initial Membership Weight is a fraction. It must be greater than zero and must not exceed one.
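The constraints on the Robust clustering inputs can be illustrated with a small validation sketch. It is not taken from the applet source and is written in present-day Java rather than JDK 1.0.2; the class name RobustParams and its method names are hypothetical, and only the three range rules come from the text above.

```java
/** Hypothetical holder for the three Robust clustering inputs described above. */
class RobustParams {
    final double lambda;
    final double fuzziness;
    final double initialWeight;

    RobustParams(double lambda, double fuzziness, double initialWeight) {
        // The negated comparisons also reject NaN, which fails every comparison.
        if (!(lambda > 0.0)) {
            throw new IllegalArgumentException("Lambda must be greater than zero");
        }
        if (!(fuzziness > 0.0)) {
            throw new IllegalArgumentException("Fuzziness Value must be greater than zero");
        }
        if (!(initialWeight > 0.0 && initialWeight <= 1.0)) {
            throw new IllegalArgumentException(
                "Initial Membership Weight must be greater than zero and at most one");
        }
        this.lambda = lambda;
        this.fuzziness = fuzziness;
        this.initialWeight = initialWeight;
    }

    /** Parses the raw text of three input fields; malformed numbers raise NumberFormatException. */
    static RobustParams parse(String lambda, String fuzziness, String initialWeight) {
        return new RobustParams(
            Double.parseDouble(lambda.trim()),
            Double.parseDouble(fuzziness.trim()),
            Double.parseDouble(initialWeight.trim()));
    }
}
```

Because NumberFormatException is a subclass of IllegalArgumentException, a caller can catch the latter to handle both malformed and out-of-range input in one place.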
Figure 2. Display Clusters
The radio buttons Display Clusters and Display Confidence control how the Robust clusters are plotted. If the former is selected, the points that are members of a cluster are colored a unique color. If the latter is selected, the points surrounding the cluster are colored according to the probability that they belong to that cluster (see figures 2 and 3).
If the Run Once option is selected the Robust clustering algorithm will compute one iteration using the given number of clusters. Otherwise, the algorithm will first compute the optimum number of clusters then compute the clustering. This process takes a significant amount of time.
Figure 3. Display Confidence
The second to last option, Display After Each Iteration, plots
the intermediate clustering results while computing. Finally,
the last option, Show All Computed Clusters, plots all clusters
even if they fall below the noise threshold.
4.0 The Original Source Application - Fitter
Figure 4. The Fitter Application
For the reader's reference, here is a screen snapshot of the original
C program, Fitter (see figure 4), written by Mary Anne Egan.
I took care to reproduce the controls and functionality of the
original within my applet.
5.0 Loading Data Files Into The Applet
Figure 5. The HTML Data and Image Loading Form
Figure 5 is a screen shot of the HTML loading form. This is the first screen the user sees upon starting the applet. As noted in section 2.1, a Java applet cannot read from the local disk of a client's computer. This is a real problem: how is the user to give the applet personal test data? I solved the problem by using a little-known extension of HTML, the File Selection field. When an HTML form containing a File Selection field is submitted, the browser automatically uploads the user-specified file to the server.
Once the file is safely on the server, I spool it until the Fuzzy Clustering Applet calls for it. So in a roundabout way, data files are transferred from the user of the applet into the applet using the server as the transfer medium. The only drawback of this procedure is that to load another data set it is necessary to return to the HTML form, specify the new data file, then reload the applet.
To load a file into the applet, you simply have to enter its name in the appropriate field on the form. The first field is for points files, and the second is for image files. If you don't know a filename, select the browse button to search your local directories.
The layout of a points file is simple. Just create an ASCII file with the following:
The format of this file is quite liberal. You can use any white space character, a comma, or even a newline interchangeably as field delimiters.
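The delimiter rule above can be sketched in a few lines of present-day Java. This is only an illustration of the tokenizing step, not the applet's actual loader: the class name PointsParser is hypothetical, and the sketch assumes every field is a number, since the layout of the file's fields is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tokenizer for the liberal delimiter rule described above. */
class PointsParser {
    /** Splits the text on any run of white space (including newlines) and commas. */
    static double[] parseNumbers(String text) {
        List<Double> values = new ArrayList<>();
        for (String token : text.split("[\\s,]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter produces one empty token
            }
            values.add(Double.parseDouble(token));
        }
        double[] result = new double[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Commas, spaces, tabs and newlines are mixed freely.
        double[] numbers = parseNumbers("10, 20\n30\t40,50");
        System.out.println(numbers.length + " numbers read");
    }
}
```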
If you wish to load an image, the applet supports GIF and JPEG
encoded image files. The plotting canvas is 300 by 300 pixels,
so remember to keep your images small.
6.0 Running the Algorithms
Figure 6. The Algorithm Results Dialog (Running)
Figure 7. The View Results Button
As described in section 3.0, to start an algorithm you first
have to load some data points and set the computation controls, then
select the Run Algorithm button. When an algorithm is
run, two things happen: a new thread is created to control the
computation and a Results Dialog is created (see figure 6).
Any textual output from the running algorithm is collected in the
scrollable area of the dialog box. While an algorithm is active,
the Results Dialog appears as in figure 6. If you select the
Hide button, the dialog disappears from view. To restore
the dialog, press the View Results button (see figure
7). You can also stop the computation before it completes by
pressing the Stop button. This discards any work that has
not completed and kills the thread.
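The run-and-stop behavior described above can be sketched as a worker that checks a stop flag between iterations. This is a swapped-in technique chosen for illustration, since the text does not say how the applet ends its thread; the class StoppableRun and its methods are hypothetical, and the loop body is only a placeholder for one clustering iteration.

```java
/** Hypothetical computation task that a Stop button could cancel between iterations. */
class StoppableRun implements Runnable {
    // volatile so a change made by the user-interface thread is seen by the worker
    private volatile boolean stopRequested = false;
    private volatile int iterationsDone = 0;
    private final int maxIterations;

    StoppableRun(int maxIterations) {
        this.maxIterations = maxIterations;
    }

    /** Would be called from the Stop button's event handler. */
    void requestStop() {
        stopRequested = true;
    }

    int iterationsDone() {
        return iterationsDone;
    }

    @Override
    public void run() {
        for (int i = 0; i < maxIterations && !stopRequested; i++) {
            // Placeholder: one iteration of the selected algorithm would run here.
            iterationsDone = i + 1;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StoppableRun job = new StoppableRun(1000);
        Thread worker = new Thread(job); // the "new thread" that controls the computation
        worker.start();
        job.requestStop();               // simulate the user pressing Stop
        worker.join();
        System.out.println("iterations completed: " + job.iterationsDone());
    }
}
```

The number of iterations printed by main depends on thread timing, so no particular value should be expected.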
When the algorithm has finished computing, the thread terminates. The appearance of the Results Dialog then changes to something similar to figure 8. To close this dialog without saving the text output, press the Close button. To save the captured textual output, select the Save Results button. If you opt to save, the next screen you will see is similar to figure 9. The saving process is similar to the loading process. The applet must upload its information to the server, then instruct the user's browser to fetch this information back from the server. To complete the saving process, the user must select the save option from the browser's file menu, writing to local disk.
Figure 8. The Algorithm Results Dialog (Finished)
Figure 9. The Save Results Screen
7.0 Saving the Data Points
Figure 10. The Save Data Points Screen
If you have entered data points on the canvas manually by
clicking the mouse, you should save your changes. To save the
data points select the Save Points button (see figure 1).
The screen in figure 10 is similar to what you will see next.
It is a new browser window into which the data points in the applet
are exported. As in all saving and loading operations, the applet
must first upload the data to the server then instruct the browser
to fetch it from the server. To complete the saving process,
the user must select the save option from the browser's file menu,
writing to local disk.
8.0 The Server Side Application
The previous sections have dealt solely with the client side Java applet. This section presents the data handling and initialization scripts on the server that the applet needs in order to function as an application.
The first piece of the server side application is the input form
InitFC.html. This file generates the web page seen in
figure 5. It is the entry point to the applet. This HTML form
calls the initialization script FuzzyClustInit.cgi. This
file with a .cgi extension is called a CGI script. CGI
stands for Common Gateway Interface. The power of a CGI file
is that when it is accessed by the web server, it is executed
as if it were a program. The actual content of the FuzzyClustInit.cgi
script is written in Perl.
8.1 The FuzzyClustInit CGI Script
This script's job is twofold: it is responsible for both the spooling and the despooling of the user's input data files. Its operation is simple. If the script is called using an HTTP "post" request, it attempts to spool data. If the script is called using an HTTP "get" request, it attempts to despool data. The script assumes that a "post" request originated from a user's browser that processed the InitFC.html form. This form is configured to call the script when the user selects the Start Applet button (see figure 5). The script checks the "post" request for the proper format, and then spools any data files that it might contain. At this point the client's browser is waiting for a response from the server. The script dynamically generates an HTML web page and passes it to the browser. Embedded in this web page are the applet, a data file key, and an image file key.
When the applet is fully loaded in the client's browser, it searches
for these file keys. A file key present indicates that there
is data on the server to be downloaded. For example, if the file
key for an image exists, then the server has spooled an image
file. To retrieve the spooled data, the applet opens a connection
to the server, then generates an HTTP "get" request.
Inside this request is the key to the file the applet needs.
When the FuzzyClustInit script is called with an HTTP "get"
request, it searches for the spooled file that matches the key.
If found, the script returns the file to the applet and then
deletes it.
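The spool and despool cycle can be modeled with a small in-memory sketch. The real script is written in Perl and keeps its files on the server's disk, so this Java version only illustrates the key-based protocol; the class name Spool, the use of random UUIDs as file keys, and the in-memory map are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical in-memory model of the spool directory and its file keys. */
class Spool {
    private final Map<String, byte[]> files = new HashMap<>();

    /** The "post" side: store the uploaded data and hand back a fresh file key. */
    synchronized String spool(byte[] data) {
        String key = UUID.randomUUID().toString(); // assumption: any unique token would serve
        files.put(key, data.clone());
        return key;
    }

    /** The "get" side: return the data for the key and delete it, or null if no such file. */
    synchronized byte[] despool(String key) {
        return files.remove(key);
    }
}
```

Because despool removes the entry as it returns it, a second request with the same key finds nothing, which mirrors the script deleting a file once it has been delivered.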
8.2 The FuzzyClustResults CGI Script
This script is very similar to the script of the previous section
and, for that matter, the script of the next section. It performs
two different operations depending upon how it is called. This
script differs in that it assumes that it will always be called
from inside the Fuzzy Clustering Applet. Like the previous script,
an HTTP "post" request means spool data and an HTTP "get"
request means despool data. The applet invokes this script when
the user selects the Save Results button in the Results Dialog
(see section 6.0). The applet opens a "post" connection
to the script and uploads the data. When the upload is finished
the server sends the file key back to the applet. The applet now
opens a new browser window with an HTTP "get" request,
including the key. The script fetches the appropriate file then
deletes it from the spool directory.
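As a rough illustration of the "get" half of this exchange, the sketch below builds a request URL that carries the file key. The query parameter name key, the class name DespoolRequest and the example host are hypothetical; the text does not say how the applet encodes the key in its request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the "get" URL for a spooled file. */
class DespoolRequest {
    static String buildGetUrl(String scriptUrl, String fileKey) {
        try {
            // URL-encode the key so that characters such as '&' or spaces survive the query string.
            return scriptUrl + "?key=" + URLEncoder.encode(fileKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // example.org stands in for the server the applet was downloaded from.
        System.out.println(
            buildGetUrl("http://example.org/cgi-bin/FuzzyClustResults.cgi", "results 42"));
    }
}
```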
8.3 The FuzzyClustSave CGI Script
This is the last of the server scripts, all of which operate virtually
the same. Like the two previous scripts, it performs dual operations
depending upon how it is called. It also assumes that it will
always be called from inside the Fuzzy Clustering Applet. This
script differs internally because it has to parse and format the
input files while spooling them. Like the other scripts, an HTTP
"post" request means spool data and an HTTP "get"
request means despool data. The applet invokes this script when
the user selects the Save Points button (see section 7.0).
The applet opens a "post" connection to the script
and uploads the data. The script reads in the data. Before spooling
it, the script adds a header and then formats the numbers, ten
to a line separated by commas. When the upload is finished the
server sends the file key back to the applet. The applet now opens
a new browser window with an HTTP "get" request, including
the key. The script fetches the appropriate file then deletes
it from the spool directory.
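The formatting step can be sketched as follows. This is a Java illustration rather than the Perl script itself; the class name PointsFormatter is hypothetical, the header text is left to the caller because its content is not given here, and the choice to end each line without a trailing comma is an assumption.

```java
/** Hypothetical formatter: a header line, then the numbers ten to a line, comma separated. */
class PointsFormatter {
    static String format(String header, double[] values) {
        StringBuilder out = new StringBuilder();
        out.append(header).append('\n');
        for (int i = 0; i < values.length; i++) {
            out.append(values[i]);
            boolean lastValue = (i == values.length - 1);
            boolean endOfRow = ((i + 1) % 10 == 0);
            // Assumption: a line ends with a newline only, with no trailing comma.
            out.append(lastValue || endOfRow ? '\n' : ',');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        double[] values = new double[12];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        // Twelve values give one full line of ten and one line of two.
        System.out.print(format("# placeholder header", values));
    }
}
```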
8.4 The FuzzyClustHelp.html File
The last piece of the server side application is the help file
FuzzyClustHelp.html. This file is retrieved by the applet
when the Help button is selected.
9.0 Development Problems Encountered
The first problem I encountered in the project was applet security. As stated in section 2.1, the framers of the Java language wanted the applet mechanism to be hacker proof. While they have succeeded in doing so, they have made it very difficult to write an applet that supports the loading and saving of data. For the project I was able to code an acceptable compromise solution using server side CGI scripts. An unfortunate consequence of this is that only the browsers Netscape Navigator version 3.x and newer and Microsoft's Internet Explorer version 4.x and newer can run the Fuzzy Clustering Applet.
Another problem, which turned out to be very difficult to fix, was the inconsistency of the applet's layout between different browsers on different OS platforms. Netscape implements spacing more liberally than does Microsoft. The PC versions of the browsers have totally different looking controls: radio buttons, choice boxes and text fields all appear differently on the PC platform. Finally, the default font sizes and screen sizes wreaked havoc. The appearance of the final version of my applet should be acceptable on all major OS/browser combinations.
A final and most shocking problem I encountered with the
Java language is threads. I discovered that on UNIX platforms,
particularly Solaris, the Java virtual machine implementations
are cooperatively threaded. This means that if an executing thread
does not yield the processor, all the other Java threads will
starve. I could not find a clear solution to the problem, which
impacted my application tremendously. Having to code around this
problem decreased the applet's performance by as much as 20 percent.
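One common way to code around cooperative threading is to have a long-running loop call Thread.yield() at regular intervals, so that other threads get a chance to run. The sketch below shows only that general idea and is not necessarily what the applet does; the class name YieldingLoop, the interval parameter and the summing loop that stands in for real computation are all hypothetical.

```java
/** Hypothetical long-running loop that periodically offers the processor to other threads. */
class YieldingLoop {
    /** Sums 1..n, yielding after every yieldEvery iterations. */
    static long sumWithYield(int n, int yieldEvery) {
        if (yieldEvery <= 0) {
            throw new IllegalArgumentException("yieldEvery must be positive");
        }
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i; // stands in for one unit of clustering work
            if (i % yieldEvery == 0) {
                // A hint to the scheduler; under cooperative threading this is what
                // lets other threads, such as the user interface, make progress.
                Thread.yield();
            }
        }
        return sum;
    }
}
```

Yielding more often keeps the other threads responsive but spends more time switching between them, which is consistent with the kind of slowdown reported above.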
10.0 The Source Code Description
The Fuzzy Clustering Applet version 2.0b contains the following source files:
The Java source code:
The build utilities:
The server application (CGI) files:
The server HTML files:
11.0 The Source Code Listings