An associative array, when implemented in Perl, has come to be known as a "hash" -- a word that is also used to describe the digested value (i.e., the "message digest") generated from a longer piece of text and used to ensure that the text has not been altered (if a message yields the same hash before and after transmission, the text can be assumed to be unaltered).
This choice of the word "hash" for a seemingly unrelated use is not surprising when the function of a hash is considered. The storage location of each hash element is computed from its key in a manner that is not entirely dissimilar from the way in which message digests are computed from the text that they represent -- thus, the concept of computing a "hash" that leads to the proper data value for an element in the array.
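To make the idea of computing a location from a key a little more concrete, here is a minimal sketch. The toy_bucket function is hypothetical and is not how Perl hashes keys internally; it simply adds up the character codes in a key and folds the total into a small number of slots.

```perl
use strict;
use warnings;

# Toy illustration only: this is NOT Perl's internal hash function.
# It sums the character codes of the key and folds the total into
# one of $buckets slots.
sub toy_bucket {
    my ($key, $buckets) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $buckets;
}

for my $color (qw(red orange yellow)) {
    printf "%-6s -> bucket %d\n", $color, toy_bucket($color, 8);
}
```

With only eight slots, two different keys can land in the same bucket, which is why a real implementation also needs a way to handle collisions.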
Using the array from last week's column, the wavelength of various rainbow colors can be assigned to a hash in several different ways. In one of the assignments we looked at last week, the hash elements were set up in one command that incorporated key/value groups using separate lines to clarify the relationships between the color names and wavelength values:
%wavelength = ("red", 650,
"orange", 590,
"yellow", 570,
"green", 510,
"blue", 475,
"indigo", 445,
"violet", 400);
This format is essentially no different from assigning the values in a single line, as illustrated in the line below. We've just introduced line breaks to highlight the association between red and 650, orange and 590 and so on.
%wavelength=("red",650,"orange",590,"yellow",570,"green",510,"blue",475,"indigo",445,"violet",400);
Most Perl programmers prefer to use the newer, more readable syntax that makes the association between each hash key and its hash value even clearer:
%wavelength = (
    red    => 650,
    orange => 590,
    yellow => 570,
    green  => 510,
    blue   => 475,
    indigo => 445,
    violet => 400
);
And, yes, you could put this assignment all on a single line if you were so inclined. It's just not as nice to read:
%wavelength=(red=>650,orange=>590,yellow=>570,green=>510,blue=>475,indigo=>445,violet=>400);
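If you'd like to convince yourself that the comma and => notations really do build the same hash, a quick comparison along these lines should do it. The variable names %with_commas and %with_arrows are made up for this sketch, and only three colors are used to keep it short. Note that => also quotes the bare word to its left, which is why the color names need no quotation marks in the second assignment.

```perl
use strict;
use warnings;

my %with_commas = ("red", 650, "orange", 590, "yellow", 570);
my %with_arrows = (red => 650, orange => 590, yellow => 570);

# Start by comparing sizes, then compare key by key; the hashes match
# only if every key in one maps to the same value in the other.
my $same = keys(%with_commas) == keys(%with_arrows);
for my $key (keys %with_commas) {
    $same = 0 unless exists $with_arrows{$key}
                 and $with_arrows{$key} == $with_commas{$key};
}
print $same ? "identical\n" : "different\n";
```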
To loop through the elements of a Perl hash, we can use a while statement like this one in which each key/value pair is retrieved and displayed.
while ( my ($key, $value) = each(%wavelength) ) {
    print "$key => $value\n";
}
We could also set up a for loop which might look like this:
for my $key ( keys %wavelength ) {
    my $value = $wavelength{$key};
    print "$key => $value\n";
}
Either of these loops would result in the following output:
blue => 475
green => 510
indigo => 445
violet => 400
red => 650
yellow => 570
orange => 590
As we saw last week, the order in which the values are displayed is odd. Hashes are unordered collections of key/value pairs, so the spectral order in which we think of rainbow colors has no bearing on the way that hashes are actually stored, nor does the order in which we add elements to our hash. Interestingly, this order may not be the same from one system to another. Here's the same hash displayed on Mac OS X:
blue => 475
orange => 590
green => 510
violet => 400
yellow => 570
red => 650
indigo => 445
This doesn't mean, of course, that we're confined to looping through hashes in these difficult-to-fathom orders. We can elect to display hashes in key or value order with loops like these:
foreach my $key (sort(keys %wavelength)) {
    print $key, ' => ', $wavelength{$key}, "\n";
}
This loop displays our wavelength hash in key order:
blue => 475
green => 510
indigo => 445
orange => 590
red => 650
violet => 400
yellow => 570
The following loop displays the same hash in descending value order -- undoubtedly a more meaningful ordering for this particular hash:
foreach my $key (sort { $wavelength{$b} <=> $wavelength{$a} } keys %wavelength) {
    printf "%4d %s\n", $wavelength{$key}, $key;
}
Here is the output:
650 red
590 orange
570 yellow
510 green
475 blue
445 indigo
400 violet
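Swapping $a and $b in the sort block reverses the direction of the sort. As a sketch of one possible variation (not something the loop above requires), the block below sorts from shortest to longest wavelength and falls back to comparing the color names if two values ever turn out to be equal. The @by_value array is a name chosen just for this example.

```perl
use strict;
use warnings;

my %wavelength = (red => 650, orange => 590, yellow => 570, green => 510,
                  blue => 475, indigo => 445, violet => 400);

# Ascending by value; fall back to key order if two values are equal.
my @by_value = sort {
    $wavelength{$a} <=> $wavelength{$b} or $a cmp $b
} keys %wavelength;

printf "%4d %s\n", $wavelength{$_}, $_ for @by_value;
```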
To view the size of a hash in Perl, you can use the keys function like this:
print "the size of the hash: " . keys( %wavelength ) . "\n";
The keys function returns the keys of the named hash when it is called in list context. So the command "print keys( %wavelength );" would print:
bluegreenindigovioletredyelloworange
In scalar context, on the other hand (for example, when it is concatenated into a string), "keys( %wavelength )" gives you the size of the hash, that is, the number of elements in it:
the size of the hash: 7
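The difference comes down to the context in which keys is called. The short sketch below, which uses a three-color version of the hash and the made-up variables @names and $count, shows the same call yielding a list of keys in one case and a count in the other.

```perl
use strict;
use warnings;

my %wavelength = (red => 650, orange => 590, yellow => 570);

my @names = keys %wavelength;    # list context: the keys themselves
my $count = keys %wavelength;    # scalar context: how many keys there are

print "keys: @names\n";
print "count: $count\n";

# scalar() forces scalar context where a list would otherwise be expected
print "count again: ", scalar(keys %wavelength), "\n";
```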
If you want to delete an element in a hash, you can use the delete function. In this respect, working with a hash is considerably easier than working with an indexed array. Imagine what you would have to do to squeeze out an element from the middle of an indexed array and you will see what I mean.
delete $wavelength{green};
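One detail worth knowing is that delete hands back the value it removed, so you can hold on to it if you need to. Here is a small sketch; the $removed variable is just a name picked for this example.

```perl
use strict;
use warnings;

my %wavelength = (red => 650, orange => 590, yellow => 570, green => 510,
                  blue => 475, indigo => 445, violet => 400);

# delete returns the value that was stored under the key
my $removed = delete $wavelength{green};

print "removed green ($removed)\n";
print "colors left: ", scalar(keys %wavelength), "\n";
```

No renumbering or shuffling of the remaining elements is needed; the other six keys are untouched.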
Hashes are particularly useful because they reflect the way that most people think about most of the information they work with. Hours worked on Monday makes more sense to most of us than hours worked on day 1 (or day 0!).
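As a rough illustration of the difference, compare the two lookups below. The hours shown are invented for this sketch, as are the names @hours_by_index and %hours_by_day.

```perl
use strict;
use warnings;

# Hypothetical data: the same hours stored two ways
my @hours_by_index = (8, 7.5, 8);
my %hours_by_day   = (Monday => 8, Tuesday => 7.5, Wednesday => 8);

print "Monday (indexed array): $hours_by_index[0]\n";
print "Monday (hash):          $hours_by_day{Monday}\n";
```

With the indexed array, you have to remember that Monday is element 0; with the hash, the key says so.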
Hashes do not require that each possible key has a value. We can store the wavelength for the colors of the rainbow, for example, but omit other wavelengths, such as those for ultraviolet and infrared.
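To check whether a key is present before using it, you can use the exists function. The sketch below assumes a cut-down, two-color version of the hash and tests a few keys, two of which were never stored.

```perl
use strict;
use warnings;

my %wavelength = (red => 650, violet => 400);

for my $color (qw(red infrared ultraviolet)) {
    if (exists $wavelength{$color}) {
        print "$color => $wavelength{$color}\n";
    } else {
        print "$color has no entry\n";
    }
}

# Testing for a key with exists does not add it to the hash
print "still ", scalar(keys %wavelength), " keys\n";
```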
Which Korn Shell?
In last week's column, I mentioned a script that can display which version of the Korn shell you are using. Several people wrote in to say that they prefer to type Escape, ^V with the vi option set. In the output below, the Escape, ^V sequence was typed after the set command. The M in this output refers to the "multi-byte" binary build of the 1988 or newer versions of the Korn shell. The "88i" indicates that this is the 9th ("i" being the ninth letter of the alphabet) version of ksh88.
$ set -o vi
$ Version M-11/16/88i
Looking for ksh93 in Solaris?
While versions of Solaris as late as Solaris 10 use ksh88 for the default Korn shell (/bin/ksh), you might have ksh93 installed as well. If you have CDE installed, you should have ksh93 available as /usr/dt/bin/dtksh.
Using Associative Arrays in the Korn Shell and Perl (08/17/2006)
Using Indexed Arrays in the Korn Shell (08/10/2006)