One of the many significant differences between Unix systems and Windows boxes
is the degree to which file extensions are mandated. Unix commands, tools and
scripts are perfectly content to exist without the encumbrance of file extensions
while Windows systems require that files have extensions for even the most routine
use (try running an executable that doesn't have a .exe extension and you'll
see what I mean). Part of the reason for this divide is that Unix systems emerged
back when the command line was all there was. No windows (note the lower case
"w"), no file managers, no drag and drop -- no GUIs! With the emergence
of sophisticated desktops, Unix systems today attach considerably more significance
to file extensions than they did in those early days, but still with far more
flexibility than their Windows counterparts. What happens when you double click
on a file on your Unix desktop is, after all, largely driven by its file extension.
Click on an html file and a browser will open regardless of whether you're working
on Unix or Windows. Even so, whether sysadmins call their scripts myscript.sh,
myscript.pl, myscript.cgi or just myscript is as much determined by local custom
as by the dictates of the OS.
What happens when you click on a file with a particular extension depends on
the file "binding" that is in effect. If you have more than one
browser installed on a Windows box, you have probably noticed that each of them
at some point has asked whether you would like it to be your default browser.
If you answer "yes", then it will assign itself the duty of opening
html files when you click on them, bumping any other browser that might have
previously had this responsibility.
When working on the command line, knowing what particular file extensions mean
is often necessary before you know how to work with them. A tgz file, for example,
has to be both unzipped and untarred before you can work with its contents,
whether you do this in one step or two.
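As a rough illustration of the one-step approach, here is a minimal Python sketch that uses the standard tarfile module to decompress and read a tgz file in a single pass. The function names and the sample file name are made up for this example; on the command line, tar's z option does much the same job.

```python
import io
import tarfile


def list_tgz_members(path):
    """Return the member names of a gzip-compressed tar archive."""
    # Mode "r:gz" decompresses and reads the tar stream in a single pass,
    # the equivalent of doing the unzip and untar steps at once.
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()


def make_sample_tgz(path):
    """Build a tiny archive so the example has something to read (hypothetical helper)."""
    payload = b"hello\n"
    with tarfile.open(path, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="notes/readme.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
```

Calling make_sample_tgz("sample.tgz") and then list_tgz_members("sample.tgz") should list the single member, notes/readme.txt.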
You probably know what a tgz file is, what a jpg is, what an mp3 is, but do
you know what an ISO file is? Do you know what an AUP file is? How about a CSV
file? What about SCP files? The fact is there are literally hundreds of file
extensions, many of which represent defined standards like PNG (Portable Network
Graphics) files and others which represent the personal choice of the person
assigning them. Some file extensions, like PNG, might even be used for more
than one file type. PNG is used both for (losslessly compressed) Portable
Network Graphics files and for Paint Shop Pro browser catalogs.
Arbitrary file extensions often reflect the preferences of those assigning
them. You have probably seen files called *.old, *.save, *.keep, *.bak and *-
(as in "mv myfile myfile-"). While these file extensions likely mean
nothing to your file manager, they tell sysadmins that, regardless of the type
of file being so named, these files are meant to be preserved, most likely so
that whoever made a system change can back out of it. A similar convention
is to use a date string (such as 071130) as an extension or to use an arbitrary
number, often written as $$, which the shell expands to its own process ID.
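The date-string and process-ID conventions can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names backup_name and make_backup are invented here, and the copy is made with shutil.copy2 so that the original file stays in place.

```python
import os
import shutil
import time


def backup_name(path, style="date"):
    """Return a backup file name for path using a date or process-ID suffix."""
    if style == "date":
        # Two-digit year, month and day, matching the 071130 style of date string
        suffix = time.strftime("%y%m%d")
    elif style == "pid":
        # os.getpid() plays the role that $$ plays in the shell
        suffix = str(os.getpid())
    else:
        raise ValueError("style must be 'date' or 'pid'")
    return f"{path}.{suffix}"


def make_backup(path, style="date"):
    """Copy path to its backup name, keeping timestamps, and return the new name."""
    target = backup_name(path, style)
    shutil.copy2(path, target)
    return target
```

A call such as make_backup("myfile") would leave myfile untouched and create a copy whose extension is the current date.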
One hopes that any other sysadmin who happens upon a file named sshd_config.bak
or sshd_config- will understand that the file is meant to be saved -- at least
until the system changes have proven themselves. But not all file extensions
are so obvious. Does .bad indicate that a file is corrupt or is it some type
of address file? Sometimes only the context in which a file was found can answer
questions such as this. Sometimes, you have to analyze the file with various
display and dump commands to determine what it is.
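One way to look past a misleading extension is to check the first few bytes of the file, which is roughly what the file command does with a far larger table. The sketch below is a simplified stand-in, not a replacement for that command: the SIGNATURES table and the function names are my own, and only a handful of well-known signatures are included.

```python
# A few well-known leading byte sequences ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x1f\x8b", "gzip-compressed data"),
    (b"ID3", "MP3 audio with an ID3v2 tag"),
]


def sniff(data):
    """Return a description for the leading bytes in data, or None if unknown."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return None


def sniff_file(path):
    """Read the start of a file and guess its type, ignoring its extension."""
    with open(path, "rb") as handle:
        return sniff(handle.read(16))
```

Because a tgz file is a gzip-compressed tar archive, sniff_file would report it as gzip-compressed data no matter what the file happens to be named.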
File Extensions Online
If you're curious about the variety of file extensions that are used on systems
today -- both Unix and non-Unix -- you might browse your way over to http://www.file-extensions.org/.
This site contains a growing collection of information on file extensions, what
they mean and what applications use them. On file-extensions.org, you could
learn that an AUP file might be an Audacity project file. Audacity is a multi-track
audio editor and recorder for Linux, BSD, Mac OS and Windows. You could learn
that ISO files are CD/DVD ISO binary image files -- the type of disk image files
that you would use in burning data files onto these media types. The same file
extension is used, however, for a type of bitmap graphics.
CSV files are comma-separated value files. In this format, the fields in each
record are separated from each other with commas instead of tabs, whitespace
or some other character. This same extension, however, is used for CompuShow
adjusted EGA/VGA palettes.
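To make the comma-separated layout concrete, here is a small sketch using Python's standard csv module. The sample data and the function name read_csv_records are invented for illustration. It also shows one common convention that the description above glosses over: a field that itself contains a comma is usually wrapped in double quotes.

```python
import csv
import io


def read_csv_records(text):
    """Split CSV text into records, each record being a list of field strings."""
    return list(csv.reader(io.StringIO(text)))


# Hypothetical sample: a header line followed by one record.
SAMPLE = 'name,shell,comment\nroot,/bin/sh,"admin, local"\n'

for record in read_csv_records(SAMPLE):
    print(record)
```

The second record comes back as three fields, with the quoted comment kept whole as "admin, local" rather than being split at its comma.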
SCP files might be BITCOM scripts, ColoRIX bitmap images, Microsoft Dial-Up
Networking scripts or Palm OS configuration files. Go figure.
File-Extensions.org also groups file extensions into categories. If you would
like to browse a list of file extensions used for audio and music, you can click
on the link that points to that category and read the descriptions of all 66
file extensions in this category. I got a kick out of learning that Monkey's
Lossless Audio Compression Format is called "ape". Very funny. Most
of the 66 file extensions were completely new to me.
On any particular system, you are unlikely to run into many files of unknown
type. On the other hand, it's good to have a resource available to help you
identify the type of unusual files when you run across them or to learn a little
more about the format of a familiar file type.