Random Ramblings About BSD on MacOS X (Part 2)

By Jeffrey Carl and Matt Loschert

Daemon News, December 2000

This is the second chapter in a series of observations, representing the adventures of a couple of BSD admins (one with a lot of prior MacOS experience, the other with more on the BSD side) poking around the command line on an iBook laptop running Apple’s Mac OS X Public Beta. We’ll attempt to provide a few notes and observations that may make a BSD admin’s work with Mac OS X easier.

Note: Again, as members of the FreeBSD Borg collective MIND-STABLE, we’ll refer to our various comments/sections by the singular “I.” This also prevents either of us from admitting which dumb parts of the article were ours specifically. 🙂

New Information from Last Time

The quality and quantity of the feedback that we (OK, Borg segfault here) received after part one of this series was fantastic. Thanks to everyone (you know who you are) who wrote in to clear up points or to show us a better way to do things. Aside from a few flames on Slashdot (“You are stupid and blow goats!”, “Duh!”, “N4t4li3 P0rtm4n r00tz y00”, etc.), the feedback was very helpful.

The number of NeXT gurus (and some BSD overlords, as well) out there who came to the rescue to correct mistakes and offer answers to the questions posed in the article was amazing. This time around, our readers who are l33t NeXT h4Xors, BsD r00tz or k3wl m4c d00dz are still invited to help clear up the questions or postulations herein on Mac OS X. So, to follow up and answer some of the questions posed by the first article, here are some of the best responses received:

------------------------------------------------
From: Dwarf
Subject: Daemon News Article
Not sure if any of this is new info to you guys....
OSXPB inherits a lot of "philosophy" from OS X Server, thus the lack of logging that occurs. Apple seems to have come down in favor of system responsiveness versus usage monitoring. Their rationale for turning off logging (for almost everything) by default is that it impacts network thruput. If the logs are something you need, they can all be enabled from the command line, but network (and probably GUI) responsiveness will likely suffer as a result. Apple also seems to have made several assumptions about how OS X (any flavor) will be used.
Apparently their idea is that it will provide services to a LAN and be hidden from the world by a firewall of some sort, thus the default enabling of NFS and having NetInfo socketed to the network by default. Since NetInfo is a multi-tiered database, your "local" NI Server may also be the "master" NI Server for subnetted machines, while being either a clone or a client of a still higher level NI Server. So, they hook it to the net by default. This also provides the mechanism by which other machines can automatically be added to your network. At bootup, each machine tries to learn who it is and where it lives by querying the network for its "mommy" (a Master NI Server). If it finds one, it accepts the name and IP that server furnishes and initializes itself accordingly. If it doesn't, it uses the defaults from its initialization scripts. Getting this all to work painlessly is one of the things about which the NetInfo documentation is pretty obscure, owing primarily to the fact that it is written in terms of tools that no longer exist as separate entities, but have been combined into more powerful tools. Further, if NFS is properly set up, each machine will automount the appropriate NFS volumes at startup; this is another area where making it work is not clearly explained. I will only touch on the confusion that exists about setting up MailApp and making it work, which is another documentation shortcoming.
Another facet of operation that isn't clearly explained is the Apple philosophy about how the file tree is organized. Their thinking is that users should only install things into the /Local tree. /System should be reserved for those things that administrators add. My guess is that naïve users will be fine so long as they confine themselves to operating within the GUI, as the GUI tools seem to be pretty smart about where they put things. But, if those users start installing things from the CLI....
A problem area about which not much has been written is the fundamental incompatibility between the Mac HFS and HFS+ filesystems and BSD. Mac files are complex, containing both a Data Fork and a Resource Fork. BSD knows nothing about complex files, and thus when BSD filesystem tools are used to manipulate Mac files, the resource forks get orphaned. See http://www.mit.edu/people/wsanchez/papers/USENIX_2000/ for a better explanation of this. This may be the source of a longstanding OS X Server problem whereby the Desktop Database process eventually goes walkabout and consumes over 90% of the CPU.

Authors’ Note: I’ve received a large number of comments about how the existing state of Mac OS X/Darwin documentation sucks. Frankly, I agree – that’s why I/we wrote these articles. While there’s a certain thrill to “spelunking” a new OS, it’s not what an administrator would like to have to be doing in their spare time. However, it’s hard to point a finger at Apple, since they’re currently under a hiring freeze after their recent absurd stock devaluation (post-Q3 results), and they would be perfectly right to have every man/woman/droid/vertebrate/etc. working on developing the OS rather than documenting it. 

Nonetheless, there is still a significant problem lurking. There are tens of thousands of otherwise non-super-technical folks who have become MacOS gurus through inclination and experience, able to roam around a school or office and fix traditional MacOS problems. At the moment, the folks who (working with the current paltry documentation) can do this for MacOS X are incredibly few, since it requires significant knowledge of MacOS as well as Unix experience – and even then, it’s only NeXTstep mavens who will be truly at home with some of its aspects.

The good folks at Daemon News have provided a space here to try to answer some of these questions, but it’s up to the knowledgeable folks already out there to contribute to sites like Daemon News, Darwinfo, Xappeal, MacFixit, etc. to make whatever knowledge is out there available to the soon-to-be Mac OS X community. If Apple can’t document this OS thoroughly while rushing every available resource to develop it, it’s up to the folks who (at least marginally) understand it to do so, for the good of all its users.

From: Brian Bergstrand <[email protected]>
Subject: Re: Daemon News : Random Ramblings about BSD on MacOS X (part 1)
In your article you said: "(as mentioned, etc, tmp and var are all links to inside /private; I refuse to speculate why without having several drinks of Scotch first)".
The /private dir. is again a part of Mac OS X's NeXT heritage. Originally, the thought behind /private was that it could be mounted as a local drive for NeXTstations that were net-booted. That way you would not have to mount volumes for /etc, /var, or whatever else needed write perms. This also worked well if you booted from a CD. /private meant data that was to be used only on a specific machine.
HTH.
Brian Bergstrand
Systems Programmer Northern Illinois University
Imagination is more important than knowledge. - Albert Einstein

Authors’ Note: We’re discovering more and more that Mac OS X seems very much like the next revision of OpenStep – with MacOS 9 compatibility and a new GUI thrown in. Not that this is necessarily a bad thing; it just seems like NeXT was the “Addams Family” member of the BSD clan that nobody else noticed, and we’re not sure why. If anyone would like to speculate on the reasons that NeXT’s new ideas were largely ignored by the industry (aside from the typical Steve Jobs-ian tendency to make your computers too expensive for normal people to buy), we’d love to find out more.

------------------------------------------------
From: Daniel Trudell <[email protected]>
Subject: random bsd os x ramblings...netinfo and ipconfigd
[...]
Netinfo is interesting. One thing I noticed is that most of the stuff in the utility applications is like mini versions of NetInfoManager, i.e. I can edit/add/delete users in NetInfoManager, including root and daemon, and those changes are present in multiple users after the fact, and vice-versa. However, some things still depend on /etc/passwd even in multiuser mode. I installed samba, and I needed an /etc/passwd file for it. I used "nidump passwd . > /etc/passwd" to generate one from netinfo...but there was a twist...some of the users were shadowed, some were not. I s'pose that might be an issue. Also, I was conforming UIDs on my box to the company UIDs...if somebody with a UID of 500 logs in remotely, the machine forgets how to handle users...a reboot fixes this.
[...]
In general, i think there's a consensus about acceptance of netinfo. When you first run around in tcsh, a geek asks themself "what the f**k...this is jacked up, what's up with /etc?", but once you figure out netinfo, a geek says "hey, check this out, it's nifty!"
tack

Authors’ Note: I agree. Still, it will take a while (and some seriously improved documentation [see above]) to get used to.
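
For reference, the flat-file dump Daniel mentions generalizes to the other standard formats that nidump(1) understands. A minimal sketch (run as root; the “.” names the local NetInfo domain):

nidump passwd . > /etc/passwd    # regenerate passwd(5) from NetInfo
nidump group . > /etc/group      # same trick for group(5)

Keep his caveat in mind, though: some of the dumped password entries may carry live hashes while others show up shadowed.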

------------------------------------------------
From: Rick Roe
Subject: Re: Mac OS X article
Well, I'm not the foremost expert on Darwin, but I've learned a few things from "this side of the playground" that might help...
- The "Administrator" issue is Apple's compromise between the single-powerful-user paradigm of Classic Mac OS and the Unix/NeXT multiuser system with it's too-powerful-for-your-own-good root.
An "administrator" has the privileges an upgrading Mac user expects: ability to change system settings and edit machine-wide domains on the disk (like /Applications). However, it still protects them from the dangers of running as root all the time, since they don't get write access to the likes of /etc (except through configuration utilities), or to /System (which is a partitioning that keeps the Apple-provided stuff separate from the stuff you install, like /usr vs. /usr/local).
The inability of "administrator" users to make changes to items at the top level of the filesystem is a bug in the current version.
- Actually, we got NTP support back in Mac OS 8.5, not 9.0 :)
- The developer tools are available separately from Mac OS X through Apple's developer program. The basic membership level is free, and gets you access to not only the BSD/GNU developer tools, but also the cool GUI tools, headers, examples, and documentation out the wazoo. Of course, you can also get a lot of this stuff from the Darwin distribution, too.
- Regarding the list of top-level files and directories in .hidden:
- Desktop DB and Desktop DF are used by Mac OS 9 to match files to their "parent" applications. OS X maintains them for the sake of the Classic environment but only uses them as a fallback, as it has a more sophisticated per-user system for this purpose.
- Desktop Folder is where native OS 9 stores items that show up on the desktop. In OS X, they're in ~/Library/Desktop.
- SystemFolderX was where the BootX file (a file containing info for Open Firmware and some bootstrap stuff to get the kernel started) was kept in previous developer releases. It's elsewhere now.
- Trash is the Mac OS 9 version. OS X uses /.Trash, /.Trashes, and ~/.Trash.
- So you've discovered how cool NetInfo is. I got tired of reading reviews that were just complaints about not being able to edit stuff in /etc to change things. :) Here's some extra info for you:
- There's a convenient GUI utility for editing NetInfo domains, /Applications/Utilities/NetInfoManager.
- root's password can be changed in NetInfoManager, or the Password panel in System Preferences, in addition to the command line.
- NetInfo is a pretty cool "directory service" for administering groups of computers... one of those unfortunate "best kept secrets" of NeXT... but what's cooler is that, in OS X, it's just one possible backend to a generic directory services API. So it's also possible to run your network using LDAP, or Kerberos/WSSAPI (er, whatever that acronym was), or NDS, or (god help you) Active Directory -- and the user experience for Mac OS X will be the same.
- You might like this... try entering ">console" at the login window.

Authors’ Note: Typing “>console” at the login window (no password necessary) and hitting “Return [Enter]” boots you directly into Darwin, skipping the Mac OS X GUI layer entirely. Sooper cool.

------------------------------------------------
From: Larry Mills-Gahl <[email protected]>
Subject: NetInfo and changing network settings
[...]
One bit that I've been sending feedback on since the Rhapsody builds (pre-OS X Server) is the suggestion that you must reboot to have network settings changes take effect. This is one area in NT that drives me absolutely nuts and I feel like billing Bill G for the time it takes for multiple restarts of every NT or 9X machine you set up!!! Unix seems to have figured this out long ago. The Mac OS has figured this out long ago!!! I appreciate the engineers being conservative because their market is notoriously unforgiving about issues that work, but are not clairvoyant and anticipate how each luser wants the system to work. I hope that they will have this cleared up by release time.
In the interim, here is a script that HUPs netinfo services to get a hot restart.
#!/bin/sh
case `whoami` in
root)
    ;;
*)
    echo "Not Administrator (root). You need to be in order to restart the network."
    exit 1    # not root; bail out
    ;;
esac
echo "Restarting the network, network will be unavailable."
kill `ps aux | grep ipconfigd | grep -v grep | awk '{print $2}'`
echo " - Killed 'ipconfigd'."
ipconfigd
echo " - Started 'ipconfigd' right back up."
sleep 1
ipconfig waitall
echo " - Ran 'ipconfig waitall' to re-configure for new settings."
sleep 1
kill -HUP `cat /var/run/nibindd.pid`
echo " - Killed 'nibindd' with a HUP (hang up)."
sleep 2
kill -HUP `cat /var/run/lookupd.pid`
echo " - Killed 'lookupd' with a HUP (hang up)."
echo "The network has successfully been restarted and/or re-configured and is now available."

Authors’ Note: Larry gives credit to “Timothy Hatcher” as the original author of this script. You can find the original script at the bottom of the page at http://macosrumors.com/?view=archive/10-00, as well as another script about 3/4 of the way down the page, which reproduces (roughly) the very useful MacOS 8-9 “Location Manager” functionality. I don’t like to cite the site MacOS Rumors as any kind of source of reliable info, since it’s 90 percent pretentiously uninformed speculation that doesn’t admit itself as such, but one of its readers did give the Mac community “the scoop” here, as far as I can tell. If Timothy Hatcher or anyone else out there wants to speak up as the original author of this script, please let me know – I’d love to ask you some questions about NetInfo. 😉

------------------------------------------------

From: Dag-Erling Smorgrav <[email protected]>
Subject: mach.sym in Darwin
I'll bet you a dime to a dollar the mysterious mach.sym in MacOS/X's root directory is simply a debugging kernel, i.e. an unstripped copy of mach_kernel.

------------------------------------------------

From: Paul Lynch <[email protected]>
Subject: MacOS X Daemon News
I can give you a few updates on some parts that might be of interest. In no particular order:
- the kernel (Mach) is only supplied in binary. Most MacOS X admins won't be expected to be able to build a new kernel; that requires a BSD/Mach background (and who's got that outside of Apple?) and a Darwin development system. So building it with firewall options enabled is reasonably smart.
- as well as /.hidden, you will notice that dot files aren't visible in the Finder. .hidden should be looked for in the root of any mounted filesystem, not just /.
- /private is a hangover from the old days of diskless workstations. NeXT had a good netboot option, which meant that you could stash all the local configuration and high access files (like swap - /private/vm/swapfile) in a locally mounted disk. This is all part of the Mach, as opposed to BSD, option.
- MacOS X doesn't only support HFS. It also supports UFS, and that may shed some light on some of the "but HFS does this" quirks.
Paul

Authors’ Note: The inclusion of a distinct /private now seems to make a lot of sense, especially for those who are willing to believe in grand computer-industry conspiracy theories. 🙂 I saw a Steve Jobs keynote in which he showed 50 iMacs net-booting from a single server, showing their abilities as (relatively) low-cost “Network Computers.” And who is a greater believer in NCs than Apple board member (and reputed best buddy of Steve Jobs) Oracle chief Larry Ellison? No specific documentation of the role of /private has thus far been provided by Apple (as far as I can tell), but the above explanation seems very plausible, leaving open the door for future uses as described.

------------------------------------------------
From: Peter Bierman
Subject: http://www.daemonnews.org/200011/osx-daemon.html
Try 'nicl .'
Then try combinations of cd, ls, and cat.
cd moves around the netinfo directory structure
cat prints the properties of the current directory
ls prints the subdirectories of the current directory
A few minutes in nicl, and NetInfo will make a lot more sense.
Unfortunately, there's no man page yet.
Another tidbit:
In X-GM, volumes will mount under /Volumes instead of at /
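
Authors’ Note: For the curious, here is roughly what a first nicl session looks like. This is a paraphrased sketch based on Peter’s description (the prompts and output are from memory and may differ on your build):

% nicl .
/ > ls
users
groups
machines
/ > cd users
/users > cat
(properties of the /users directory print here)
/users > quit

It really does make NetInfo’s structure click much faster than reading about it.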

------------------------------------------------

From: James F. Carter <[email protected]>
Subject: Comments on Random Ramblings about BSD on MacOS X
Why a firewall with no rules? Because the firewall code has to be selected at compile time, but when the CD is burned they don't know what rules the end user will want, and they don't want to lock out any "traditional" behavior, such as the ability to play host to script kiddies with a port of Trinoo for the Mac. I agree that a set of moderately restrictive default rules would be a good idea for the average grandmother, but I can understand the developers' attitude too.
Why have a plain file /tmp/console.log in addition to the syslog? In case syslogd dies. I have this problem on Linux: there's a timing dependency which if violated kills syslogd, and I'm running a driver (you don't want to know the gory details when I suspend the laptop to RAM and restart an hour later). If "sync" got done, and if the file is rotated by the code that opens it, you have a chance to see your machine's death cry when it crashed and burned. I've hacked my Linux startup files to do this partially to catch (sysadmin-induced) screwups during the boot process.
Why put resolv.conf in /var/run? ppp and dhcpd often obtain the addresses of the ISP's DNS servers during channel setup, and have to write them into resolv.conf. Modern practice, at least on SysV-ish systems like Solaris and Linux, is that /etc is potentially mounted read-only on a diskless workstation or CDROM, and dynamic info goes in /var/something.
Running the NFS daemon: Agreed that it's a security hole. Solaris only starts nfsd if /etc/dfs/sharetab (the successor of /etc/exports) contains the word "nfs". I've hacked my Linux startup files to do something similar.
/private/Drivers: I assume this contains drivers, similar to /lib/modules/$version on Linux. You wouldn't want to intermix code segments with device inodes, would you? :-) Or does recent BSD do something weird and wonderful along these lines? I've thought about a hypothetical UNIXoid operating system in which the device inode is [similar to] a symbolic link to its driver. (Paraquat on the grass?)
James F. Carter 
Internet: [email protected] (finger for PGP key)

Authors’ Note: The NFS inclusion seems like yet another attempt by Apple to include functionality “in the background” that they may or may not make use of. While it runs counter to their attempts in recent versions of Mac OS (via the “Extension Manager”) to let users enable or disable anything that patches traps or otherwise alters the functions of the base OS, it still makes sense. The current Extension Manager functionality was most likely included because third-party utilities offered it first, and because so many poorly-written MacOS extensions could interfere with Apple-provided OS functionality (if not hose the OS completely), rather than because Apple really wanted end-users to have fine-grained control over the OS.

The prevailing attitude at Apple regarding OS X may very likely be that since it would be very difficult for typical users to modify their kernel (at least with existing tools), it’s best to open up everything that might be needed at some current or future point. However, this holds only until someone creates a kernel extension/module interface as easy as the current MacOS Extension Manager (something that, despite FreeBSD’s /stand/sysinstall attempts, is still far away for any other *nix).

------------------------------------------------

Further Exploration: Das Boot

Last time, we mentioned that holding down the “v” key at startup shows a BSD-esque text console startup rather than the standard MacOS X GUI startup. Considering that the hardware on a revision-A iBook is pretty different from the hardware in your average x86 Free/Open/NetBSD box, we thought it would be interesting to see just what XNU (the Darwin kernel) does on startup.

Let’s look at what happens at boot time (as shown in the message buffer using dmesg just after boot time). Comments are shown on following lines (after “<=”).

minimum scheduling quantum is 10 ms

<= Haven’t seen this before on any BSD boot.

vm_page_bootstrap: 37962 free pages

<= RAM on this iBook is 160 MB.

video console at 0x91000000 (800x600x8)

<= The iBook’s screen, 800×600 resolution (x 8 bit color?)

IOKit Component Version 1.0:
Wed Aug 30 23:17:00 PDT 2000; root(rcbuilder):RELEASE_PPC/iokit/RELEASE
_cppInit done

<= IOKit is Apple’s Darwin device driver scheme

IODeviceTreeSupport done
Copyright (c) 1982, 1986, 1989, 1991, 1993
      The Regents of the University of California. All rights reserved.

<= It’s nice that they included this BSD-style, without any “copyright Apple blah blah” stuff

AppleOHCI: config @ 5505000 (80080000)
AppleUSBRootHub: USB Generic Hub @ 1

<= This is the iBook’s one built-in USB port

AppleOHCI: unimplemented Set Overcurrent Change Feature

<= I believe that OHCI is the USB Open Host Controller Interface, a generic standard for USB host controllers. This option appears to be a standard USB driver parameter which is not currently implemented (?).

AppleUSBRootHub: Hub attached - Self powered, power supply good
PMU running om NonPolling hardware
IOATAPICDDrive: Using DMA transfers
IOCDDrive drive: MATSHITA, CD-ROM CR-175, rev 5AAE [ATAPI].

<= The iBook’s ATAPI 24x CD-ROM drive

IOATAHDDrive: Using DMA transfers

<= The iBook’s HD is UltraDMA/66, if I recall correctly

IOHDDrive drive: , TOSHIBA MK3211MAT, rev J1.03 G [ATA].
IOHDDrive media: 6354432 blocks, 512 bytes each, write-enabled.

<= The iBook’s 3.2 GB hard drive; there are three logical volumes on this one.

ADB present:8c

<= Not sure about this. ADB in Apple-speak generally refers to the legacy Apple Desktop Bus, which was a low-speed serial bus used for connecting keyboards/mice. The iBook does not have an ADB port; so this probably just indicates the presence of the ADB driver.

struct nfsnode bloated (> 256bytes)
Try reducing NFS_SMALLFH
nfs_nhinit: bad size 268

<= Not sure why it’s reporting errors against its own default settings for NFS (asking the average Mac user to recompile their kernel with this option is like asking the average driver to rebuild their engine with an extra cylinder). This is presumably a “beta” bug.

devfs enabled

<= The Unix devfs (separate from MacOS X drivers?) is enabled.

IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging disabled

<= The packet filtering that we mentioned last time. 

IOKitBSDInit
From path: "/pci@f2000000/mac-io@17/ata-4@1f000/@0:11,\\mach_kernel", Waiting on <dict ID="0"><key>IOProviderClass</key>
<string ID="1">IOMedia</string><key>IOPath Separator</key>
<string ID="2">:</string><key>IOPath Extension</key>
<string ID="3">11</string><key>IOLocationMatch</key>
<dict ID="4"><key>IOUnit</key>
<integer size="32" ID="5">0x0</integer><key>IOLocationMatch</key>
<dict ID="6"><key>IOPathMatch</key>
<string ID="7">IODeviceTree:/pci@f2000000/mac-io@17/ata-4@1f000</string></dict></dict></dict>

<= System preferences in MacOS X are set with XML files. Kudos to Apple for this forward-looking use of XML. See below for more on this and the “defaults” command.

UniNEnet: Debugger client attached
UniNEnet: Ethernet address 00:0a:27:92:04:3a

<= I think that “UniN” here refers to the “UniNorth” Apple MoBo chipset used in the iBook, which has a 10/100BT RJ-45 interface, among other things, built into it (with Ethernet set up as the default network interface in this configuration).

ether_ifattach called for en

<= Presumably “en” is the device driver type for this NIC

Got boot device = IOService:/Core99PE/pci@f2000000/AppleMacRiscPCI/mac-io@17/KeyLargo/ata-4@1f000/AppleUltra66ATA/IOATAStandardDevice/IOATAHDDrive/IOATAHDDriveNub/IOHDDrive/TOSHIBA MK3211MAT Media/IOApplePartitionScheme/Hard Drive@11
BSD root: disk0s11, major 14, minor 11
bsd_init: rootdevice = 'disk0s11'.

<= Finding the boot device; unsure why it calls it “BSD root” rather than “Darwin root” or just “MacOS X root.”

devfs on /dev
Ethernet(UniN): Link is up at 10 Mbps - Half Duplex

<= Yep, it’s plugged into the 10BT/half-duplex hub in my Netopia SDSL router.

Resetting IOCatalogue.
kext "IOFWDV" must change "IOProbe Score" to "IOProbeScore"

<= This appears to be a debugging (?) warning in a Darwin kernel extension.

kmod_create: ATIR128 (id 1), 23 pages loaded at 0x5878000, header size 0x1000

<= This appears to describe a kernel module/driver for the ATI Rage 128 chipset, although this Rev. A iBook uses only an ATI Rage Pro chipset. Perhaps this is a driver for the ATI family up to the Rage 128 series?

kmod_create: com.apple.IOAudioFamily (id 2), 16 pages loaded at 0x588f000, header size 0x1000
kmod_create: com.apple.AppleDBDMAAudio (id 3), 5 pages loaded at 0x589f000, header size 0x1000
kmod_create: com.apple.AppleDACAAudio (id 4), 9 pages loaded at 0x58a4000, header size 0x1000

<= Loading drivers for the audio chips on the iBook MoBo.

PPCDACA:setSampleParameters 45158400 / 2822400 =16
kmod_create: com.apple.IOPortServer (id 5), 13 pages loaded at 0x58be000, header size 0x1000
kmod_create: com.apple.AppleSCCSerial (id 6), 9 pages loaded at 0x58cb000, header size 0x1000

<= More drivers for Apple MoBo chipsets.

creating node ttyd.irda-port...
ApplePortSession: not registry member at registerService()

<= This looks like the IrDA infrared transfer port driver attempting to create a connection and failing.

creating node ttyd.modem...
ApplePortSession: not registry member at registerService()

<= It looks like it’s trying to create a connection to the modem port and failing.

.Display_ATImach64_3DR3 EDID Version 1, Revision 1
Vendor/product 0x0610059c, Est: 0x01, 0x00, 0x00,
Std: 0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101, 0x0101,
.Display_ATImach64_3DR3: user ranges num:1 start:91800480 size:ea680
.Display_ATImach64_3DR3: using (800x600@0Hz,16 bpp)

<= These appear to be setting GUI resolution at 800 x 600 in 16-bit color (which is what they had been set to in the GUI controls) 

kmod_create: SIP-NKE (id 7), 7 pages loaded at 0x59b8000, header size 0x1000
kmod_destroy: ATIR128 (id 1), deallocating 23 pages starting at 0x5878000

<= Not sure about this; probably unloading the ATI Rage 128 driver from the kernel (?)

MacOS X’s Hardware Drivers and Support

Following on the above: many people, when they think about hardware and drivers that Apple needs to create for Darwin/Mac OS X, have one of two thoughts. It’s either “That should be really easy since Apple has all standardized hardware,” or “Won’t that be hard, since Macs use a bunch of whacked-out hardware that nobody else has ever heard of?” The answer is somewhere in between.

One of Apple’s few built-in advantages has always been that, since it creates its own hardware as well as software, it needs to support only a small fraction of the devices that any commodity x86-based OS might need to support. Furthermore, Apple’s dictum that Mac OS X will support only “Apple G3-based computers” makes it seem that hardware driver support would be relatively straightforward. This, however, is not the case. Apple’s “G3” support actually involves support for two wildly differing branches of hardware (and some “in-between” models).

Apple (circa 1996 or so) suffered in terms of hardware manufacturing and compatibility because of the amount of “non-standard” hardware it used. Aside from the obvious use of Motorola/IBM PowerPC CPUs instead of commodity x86 CPUs, some of Apple’s desktops used the Texas Instruments NuBus expansion system instead of PCI, AGP or ISA; a proprietary serial bus for printers/modems; the Apple Desktop Bus (ADB) for keyboards and mice; external SCSI-1 for all other peripherals; and a variety of other custom Apple MoBo (motherboard) components.

However, Apple in the “Age of Steve” has moved to a more industry standard-compliant position. When Apple ditched beige colors (starting with the Jobs-directed iMac in 1998), it moved to a legacy-free environment, ditching a lot of its older custom hardware. Apple also effected a number of other hardware changes, moving more of the Mac OS “Toolbox” routines from custom ROM chips into software (MacOS X shouldn’t need them at all) and moving to unify all of its lines with a unified motherboard architecture (UMA-1). 

The new machines (the ever-fly Steve Jobs’ “kickin’ it new-school legacy-free style”) ditched floppy disk drives and the old Apple Desktop Bus and serial ports in favor of USB for keyboards, mice and low/medium-speed peripherals. Built-in external SCSI was soon eliminated in favor of a mixture of USB and IEEE 1394 (“FireWire”) for higher-speed peripherals. With the introduction of the G4-based desktops, 2xAGP replaced PCI as the video card slot, leaving three 33-MHz PCI slots for internal expansion.

Drawing OS X’s “supported models” cut-off line at Apple’s G3s (which excludes Apple or clone-vendor models with G3 CPU upgrade cards) eliminates needing to support much of the legacy hardware that Apple has used in the past. However, there are a few Apple G3 models that bridge both technology generations, creating a notable thorn in Apple’s side. Because the original G3 desktops and PowerBooks included legacy technology (ADB, Apple serial ports, older chipsets), Apple must support these devices in Darwin and Mac OS X.

Type I/II PC card support does not appear to be available in Mac OS X Public Beta (possibly the reason there is no *official* support for Apple’s IEEE 802.11 “Airport” cards); IEEE 1394 (FireWire) support does not appear to be available, either. Apple’s new UMA-2 chipset is rumored to be introduced with new models at January’s Macworld expo; however, the specs of this chipset can only be guessed at now.

The Mysteries of “defaults”

Much of this article and the previous one have been the result of aimlessly exploring the file system of MacOS X from the command line, finding things that seemed odd or interesting and wondering, “Hmm, what will this do when I poke it?” I almost wish I hadn’t stumbled onto this next item. As I investigated its operation and the history of its implementation, I just became more curious about certain design decisions. Sort of like … well, NetInfo.

The item in question is the defaults command. After finding it, reading its man page (defaults(1)), and playing with it a little, it appeared sort of cool, but pretty mundane. The command allows for the storage and retrieval of system and application level user preferences. Please forgive the reference if it’s too far off, but it’s like a sort of “Windows registry” for MacOS X.

To get an idea of what information the system stored, I experimented with the command to see what it would tell me. Typing ‘defaults domains’ spit back the following list of information categories, or “domains” in Apple parlance:

% defaults domains
NSGlobalDomain ProcessViewer System%20Preferences com.apple.Calculator com.apple.Console com.apple.HIToolbox com.apple.Sherlock com.apple.Terminal com.apple.TextEdit com.apple.clock com.apple.dock com.apple.finder com.apple.internet com.apple.keycaps loginwindow

Those domains that related to applications illustrated Apple’s suggested domain naming convention of preceding the application name with the software vendor’s name. In this case, on a system containing solely Apple software, the application domains all began with ‘com.apple’.

I found that there were a variety of ways to view the data contained in these domains. In order to view the data for all domains, simply type ‘defaults read’. To query instead for information specific to a domain, use the form ‘defaults read <domain name>’ (without the arrows), or to grab a specific value, use ‘defaults read <domain name> <key>’. The command also allows the setting of values using the form ‘defaults write <domain name> <key> <value>’. The man page describes a variety of other ways to use the command.
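
For example, using the clock preferences examined below (the 24Hour key belongs to that domain, the output shown is what we’d expect on this build, and the write is easy to reverse):

% defaults read com.apple.clock 24Hour
0
% defaults write com.apple.clock 24Hour 1
% defaults read com.apple.clock 24Hour
1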

Having read about XML-based plists (property lists) in MacOS X (originally from John Siracusa’s excellent series of Ars Technica articles on the OS, indexed at http://arstechnica.com/reviews/4q00/macosx-pb1/macos-x-beta-1.html), I assumed that this service might be based on an XML back-end. A quick search through the filesystem confirmed this suspicion. Per-user preferences were found under ~/Library/Preferences, with each application having its own plist file. System-wide preferences showed up under /Library/Preferences, and although they were not present on this non-networked machine, I would be willing to bet that network-wide preferences could be found under /Network/Library/Preferences.

I did a little comparison using the preference data from the system clock application. First, I read the data using the defaults command, then I located the plist version in my ~/Library/Preferences directory. The results below show how the system translates the clock data into the XML plist format.

% defaults read com.apple.clock 
{
 24Hour = 0; 
 ColonsFlash = 0; 
 InDock = 0; 
 "NSWindow Frame Clock" = "-9 452 128 128 0 4 800 574 "; 
 ShowAnalogSeconds = 1;  
 Transparancy = 4.4;  
 UseAnalogClock = 1;  
 UseDigitalClock = 0;  
} 

% cat Library/Preferences/com.apple.clock.plist 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd">
<plist version="0.9">
<dict>
 <key>24Hour</key>
 <false/>
 <key>ColonsFlash</key>
 <false/>
 <key>InDock</key>
 <false/>
 <key>NSWindow Frame Clock</key>
 <string>-9 452 128 128 0 4 800 574 </string>
 <key>ShowAnalogSeconds</key>
 <true/>
 <key>Transparancy</key>
 <real>4.4e+00</real>
 <key>UseAnalogClock</key>
 <true/>
 <key>UseDigitalClock</key>
 <false/>
</dict>
</plist>

I was a bit intrigued by the date of the defaults man page. It was dated March 7, 1995, which as far as I knew was prior to the advent of XML. Apparently, there was some amount of history to this command and database that was not immediately obvious. A little research revealed that the defaults command and database date back at least to (yes, you guessed it) NeXTstep. The information found confirms that it was subsequently adopted for OpenStep and finally MacOS X. It also appears that the “back-end” evolved over time, from straight ASCII string based configuration files to today’s XML-based plist files. (As usual, if you know more about the history of this database, please feel free to share your knowledge, and please forgive my lack of knowledge of all things NeXT.)

With this history information in hand, I became curious about the present implementation. I was also curious whether an API existed. Obviously, normal applications would not use this command to store and retrieve persistent data. However, there was no mention of an API in the defaults man page, nor was there a link to any other information. A quick trip to the Apple developer website provided the information I was looking for. The API is called “Preference Services” and is a part of MacOS X’s “Core Foundation” (http://developer.apple.com/techpubs/corefoundation/PreferenceServices/Preference_Services/Concepts/CFPreferences.html). Apple provides two sets of APIs based on the developer’s needs: a high-level one that allows the developer to quickly set and retrieve data in the default domain (defined as the current user, application, and host), and a low-level API for setting and retrieving data in very specific domains (specific user(s), application(s), and host(s)). The API also provides a method of storing data for suites of applications such as the standard office productivity apps we all know and love to hate.

During this exploration, I have been having a hard time deciding whether I think this is all cool or not. On one hand the idea of a system-(and even network-)wide preferences database seems pretty cool, especially to a BSD-er like myself, but at the same time, it is really nothing new. In the same way, at first glance, the idea of an XML-based back-end seems pretty innovative, but is it? Sure it’s cool to look at, but so what? The existence of the defaults command and an access API mean that the actual plist files are not intended for public consumption.

The defaults command and Preference Services API indicate that the whole database is supposed to be a black box to the system, the application, and especially the user. If this is the case, why not go with a high-horsepower back-end, one that offers more robust searchability and speed than could be achieved via the “crapload of text files” approach? I think the argument from Apple was supposed to be that the files should be architecture-neutral in order to be easily portable. If this was the case, why not just leverage an existing architecture-independent binary database format? I know, for instance, that MySQL can do it; why not Apple?

The other argument I can think of might be that the XML format is essentially patch-able. Assuming the data is not customized too much by the running application, updates could be distributed and installed from small plist patch files. However, that doesn’t seem like a very convincing argument. All this having been said, this article will probably be published and the next day, a reader will say, “Well, <profoundly obvious answer succinctly stated>. Please unset your $LUSER variable.” The first person to write in with this wins a prize. 😉

Random Ramblings about BSD on MacOS X (Part 1)

By Jeffrey Carl and Matt Loschert

Daemon News, November 2000

This is the first chapter in a series of observations, representing the adventures of a couple of BSD admins (one with a lot of prior MacOS experience, the other with more on the BSD side) poking around the command line on an iBook laptop running Apple’s Mac OS X Public Beta. We’ll attempt to provide a few notes and observations that may make a BSD admin’s work with Mac OS X easier.

Mac vs. Unix

Right up front, I should mention what a strange animal Mac OS X is. Since 1984, the Mac OS has always been about the concept that “my OS should work easily, without my knowing how it works.” Unix has always been about the idea that “I should be able to control every aspect of my OS, even if it isn’t always easy to figure out how.” Apple – whose very name is anathema to many Unix admins – is now trying to combine both, in the world’s first BSD-based OS that is likely to ship 500,000 copies in its first year.

Apple isn’t alone in this hybridization effort; Microsoft will presumably ship “Whistler,” its NT/2000-based consumer OS, sometime in mid-2001. It appears that everywhere, the rush is on to deliver consumer OSes based on the kernels of server operating systems. In the long run, this is probably a Good Thing®, but Mac OS X and MS Whistler will have to introduce millions of desktop users to the advantages and problems of server OS design that we as server OS admins have been dealing with for years. Real protected memory and pre-emptive multitasking, sure … but also real multi-user systems and all of the permission and driver complexities that requires, as well. However, both Microsoft and Apple are investing the time and effort into developing a real user-friendly interface that your grandmother could configure (sorry, KDE and GNOME). It should be interesting to watch.

If you, like most other Internet server admins with a fully-functional brain stem, prefer Unix over the NT kernel, then Mac OS X will be the first true consumer OS that you will ever feel comfortable with. And, whether you love or hate Apple, it’s worth your time to get acquainted with Mac OS X and what it offers to a BSD Unix administrator.

Right now, there are a lot of questions out there about Mac OS X and its attempts to marry a BSD / Mach microkernel with an all-singing, all-dancing Mac GUI. This is largely because there is unfortunately a comparatively small crossover between Mac administrators and Unix administrators (much like the slim crossover between supermodels and serious “Dungeons and Dragons” players). Though neither Jeff nor Matt is a total über-guru in Mac OS or BSD, we’ll attempt to bridge the gap a little and give the average BSD admin an idea of what he or she will see if they peek around under the hood of Mac OS X Public Beta. Much of this is “first impression” notes and questions, without too much outside research behind it; explanations or clarifications from those who have more detailed answers are gratefully appreciated. 

On a side note, Matt and Jeff will both use “I” rather than “we” in this article as we go, since we are both members of the FreeBSD Borg Collective and think of ourselves as one entity. 😉

Poking Under the Hood: First Impressions

To examine the hidden Unix world on OS X, navigate in the GUI to Applications >> Utilities >> Terminal. The first thing you may notice is that – as you might expect – the default shell is tcsh. You’ll quickly find that many of the user apps you’d expect in a typical Unix installation are there, plus a few extras and minus a few others (showing the refreshing influence of many developers, each pushing for the inclusion of their favorite tools). Interestingly, pico is there, although Pine is not; Emacs and vi are there, also (ed may be there, but I didn’t check since I thought everyone who might use ed was fossilized and hanging on display in a museum by now). Other fun inclusions are gawk, biff, formail, less, locate, groff, perl 5.6, wget, procmail, fetchmail, and an openssh-based client and server.

The first thing that I do when beginning to work on a new system is to set my shell and copy over my personal config files.  When presented with the Mac OS X shell prompt, I was pleased to see that it was tcsh; but at the same time I was a little confused at how to customize it. Usually, I just drop my .cshrc file in ~/, and off I go. Well, ~/ in this case is /Users/myusername.  This is unlike the normal /usr/home or /home that most admins are used to.  Would tcsh be able to find my .cshrc there?

Well, since I didn’t have the machine network-connected while I was originally exploring, and I was too lazy to hand-copy parts of my personal .cshrc file over to this directory, I instead began poking around /etc looking at how the default .cshrc worked.  I quickly found a small csh.cshrc which contained a single line sourcing /usr/share/init/tcsh/rc.  This is where it gets interesting.  

This rc file sets up a very cool method of providing standard system shell features that can be overridden by each Mac OS X user (assuming he/she realizes that the “shell” does not refer to the funky computer case).  The rc file first checks to see if the user has a ~/Library/init/tcsh directory.  If so, it sets this as the tcsh initialization directory; otherwise it sets the /usr/share/init/tcsh directory as the default.  It then proceeds to source other files in the default directory including, for instance, the environment, aliases, and completions files, which each in turn source files (if they exist) found in the abovementioned tcsh initialization directory.
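
To make the mechanism concrete, here is a paraphrased sketch of the dispatch logic (the variable name is ours, not Apple’s, and the real rc file does more):

# paraphrase of /usr/share/init/tcsh/rc -- not the verbatim file
if ( -d ~/Library/init/tcsh ) then
    setenv TCSH_INIT_DIR ~/Library/init/tcsh    # user override directory wins
else
    setenv TCSH_INIT_DIR /usr/share/init/tcsh   # fall back to system defaults
endif
# rc then sources the standard files; each one consults ${TCSH_INIT_DIR}
# for user-supplied additions alongside the system defaults
source /usr/share/init/tcsh/environment
source /usr/share/init/tcsh/aliases
source /usr/share/init/tcsh/completions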

In this way, the system provides some powerful “standard” features, but still gives the experienced user the ability to override anything and everything.  I dropped some customizations in my personal ~/Library/init/tcsh directory and immediately felt at home.  Without a doubt, the old school UNIX curmudgeons will hate the built-in shortcuts, but I still must applaud the developers for making an attempt at providing a useful set of features for the “new” user.  No one will ever agree on a standard set, but it’s good to see that the average Mac “What is this, some kind of DOS prompt?” user will have a very functional command line, should he/she choose to explore it.  I must admit, though, that some of these completions can be a little annoying while typing quickly.

Digging Around

When you drop to the shell, you’ll notice that you’re logged in with the “user name” you selected for yourself when you installed the OS. When you create a user during the installation process, you are informed that “you will be an administrator of this computer.” Yet when you take a look at /etc/passwd in the shell, you’ll see no entry for your username. Nonetheless, if you su to “root,” you’ll notice that root’s password is the same as that of the administrative user you created. Even stranger, although you are an “administrator,” as long as you’re logged in as that user, you still don’t have the permissions of “root.” What’s going on here?  Well, it turns out that this functionality has been co-opted by Apple’s NeXT-inherited NetInfo (which I will describe below).

I couldn’t resist running dmesg to see what would come up.  To my surprise, the output was pretty boring.  The most bizarre thing was that a packet filter was enabled, but with a default-to-open policy and no logging.  An open firewall … why?  If they weren’t planning to enable firewall rules, why compile packet filtering code into the kernel in the first place?  At this point, I felt I had to do at least a little searching for some sign that this had been considered at one point.  Well, /etc held no clues, and there was nothing obvious in /var (the only other probable location I could think of).  I guess they simply wanted to leave the door conveniently ajar should they choose to go back and reconsider the decision.
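
Assuming the stock BSD ipfw userland shipped alongside the kernel code (I haven’t confirmed the binary is on the Beta), the open policy should show up as the classic catch-all rule when run as root:

ipfw list
65535 allow ip from any to any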

When I visit a machine, I am always curious to check out /etc.  This directory usually reveals some interesting information about the operating system and/or the administrator.  In this case, while touring the usual suspects, I noticed something that, again, I did not expect.  The wheel group listed in /etc/group did not contain any users (it seemed like some Communist Linux system!).  You typically expect to find root along with at least one administrative user account in /etc/group.  In this case, wheel was empty, leaving me to wonder why they even kept the concept, since this functionality had apparently been folded into an alternate privilege manager. As I found out, this functionality is also handled by NetInfo.

While checking out /etc/syslog.conf, I noted that quite a bit of system information gets routed to /dev/console, as one would expect.  What was strange was that later while exploring /var/tmp (to look for tell-tale temp files), I found a file named console.log.  It appeared to contain a file version of the most recent couple of console messages.  I verified that identical (albeit fewer) entries appear when using the GUI-based console viewer.  I’m no super OS guru … but this strikes me as a strange hack.  Why not simply send the same information to a standard log file via syslog and have the GUI console app read from that?  That’s what syslog’s there for!  Maybe someone else out there can shed some light on this one.

Another oddity was /etc/resolv.conf.  It was symlinked to /var/run/resolv.conf, which seemed a little strange.  Even more so when /var/run/resolv.conf turned out to be non-existent. Okay … well, there was no named running (I checked) and the resolver man pages gave no clue.  So what was going on here?  Apparently something odd, since a quick nslookup responded with the DNS server listed as being the local machine.  No named, no resolv.conf, how about /etc/hosts?  Well, /etc/hosts had no special magic in it, but it did have a note mentioning that the file is only consulted while the machine is in single-user mode.  At all other times, the request is handled by lookupd working through the facilities provided by NetInfo.  Hmm … NetInfo was beginning to sound very interesting.
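
In transcript form, the dead end looked something like this (listing fields elided and illustrative):

% ls -l /etc/resolv.conf
lrwxrwxrwx  1 root  wheel  ... /etc/resolv.conf -> /var/run/resolv.conf
% ls /var/run/resolv.conf
ls: /var/run/resolv.conf: No such file or directory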

After a bit of roaming around, I became curious as to whether root would have a home directory on the machine.  Given the difference in user and administrator handling on Mac OS X, I wasn’t even sure that root would have a home directory.  But, lo and behold, there was ~root under /var (or more properly /private/var since it’s a symlink) with a set of default files and directories resembling any other user in /Users.  None of this should have come as a surprise to me since /etc/passwd contained this information.

Strange Services

While exploring the system, you’ll find some interesting services enabled.  I was surprised to see the nfs daemon running. Thus, it didn’t faze me when I found the portmap daemon (a service required in order to run nfs) hanging around.  Finding nfs active was unexpected, since I would only run it if necessary, due to potential security concerns.

Another surprise was seeing a full-fledged NTP daemon running.  It was cool to see this service as a standard part of Mac OS, but I thought it a bit strange to do continuous high-accuracy time synchronization via ntp for a desktop, consumer OS.  Why not use ntp’s little brother, ntpdate, on a periodic basis for this same functionality?  Do Mac OS users really need the accuracy of continuous ntp, when a functionally similar result could be obtained without the security risk of running another network daemon?

The answer is that, for now, “yes they do.” Since the introduction of Mac OS 9, Mac users have been given the default option of using NTP to synchronize their desktop clocks via NTP servers, and it appears that Apple wants to give Mac OS X users (presumably in a server setting) the option of running an NTP server for other Macs on a LAN/WAN.

Speaking of network services, a little netstat lovin’ showed a couple of other interesting tidbits.  The machine was not as quiet network-wise as I had expected.  Along with the nfs, portmap, and ntp network activity, I found listening sockets open on ports 752 and 755, with an active connection to port 752 from another local privileged port.

After some man page reading and poking at a few of the running daemons, I was able to narrow things down a little.  I found that HUP-ing lookupd (an interface daemon to NetInfo) caused the active connection to be disconnected and reestablished (obvious since the new connection originated from a new port).  I also found that HUP-ing nibindd (the NetInfo binder, a sort of librarian for NetInfo) caused the listening sockets to be closed and reappear on new ports. That struck me as quite odd.  I would have expected them to re-bind to the same ports.  Even with these addresses changing, lookupd somehow knew where to find the ports, as I saw new connections established shortly after the HUP signal was received.
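
The experiment is easy to reproduce using the pid files the daemons drop in /var/run (run as root; the port numbers will differ from boot to boot):

netstat -an | grep LISTEN                # note the NetInfo listener ports
kill -HUP `cat /var/run/lookupd.pid`     # its connection drops and is re-established
kill -HUP `cat /var/run/nibindd.pid`     # the listening sockets close...
netstat -an | grep LISTEN                # ...and reappear on different ports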

Not knowing much about NetInfo, I was curious why this service was implemented using tcp sockets.  I assumed that the service must be distributed, available to and from remote hosts.  Otherwise, the developers probably would have implemented this using Unix domain sockets since they are substantially faster and are safer for local-only protocols. Not having much background on NetInfo at the time, I was somewhat puzzled.

Welding the Hood Shut

Apple’s longstanding (since Mac System 1.0 in 1984) policy has been to prevent users from screwing up their systems by hiding important items from them, or making it difficult to access the cool things that they might play with and hose their hard drive. This attitude has extended to OS X, and things like header files, gcc and gdb are not present (although these were included in Apple’s earlier “developers-only” previews of Mac OS X), not that the average Mac user would be able to “accidentally” damage their system with gcc.

Not to worry, though; the Mach / BSD portion of Mac OS X is maintained separately as an open-source operating system that Apple calls “Darwin.” In fact, a uname -a at the command line in MOSXPB reveals:

Darwin ibook 1.2 Darwin Kernel Version 1.2: Wed Aug 30 23:32:53 PDT 2000; root:xnu/xnu-103.obj~1/RELEASE_PPC  Power Macintosh powerpc

You can get the abovementioned developer goodies (and much more) by downloading Darwin or following the CVS snapshots for various packages. 

At the root of the filesystem, it’s not hard to see that most of what you would expect to find is there somewhere; it may just be in a different place than you expected. The first thing you’ll likely notice is that a lot of what’s there is being hidden from the user by the MacOS X GUI. The Mac OS has always had its share of hidden files (for volume settings, the desktop database, etc.) However, you’ll notice a file in / called “.hidden” which lists the items in / that the GUI hides. The contents of .hidden are:

bin (the Unix directory for binaries needed before /usr is mounted)

cores (a link to /private/cores; a directory – theoretically, a uniform location for placing Unix core dumps)

Desktop DB (the GUI desktop database for files, icons, etc.)

Desktop DF (don’t remember what this does)

Desktop Folder (for items that are shown on the desktop)

dev (the Unix devices directory)

etc (a link to /private/etc; contains normal Unix configuration files and other normal /etc stuff)

lib (listed, but not present in /)

lost+found (the Unix directory for “orphaned” files after an unclean system shutdown. Classic Mac OS users are used to looking in the Trash for temp files saved after a crash or other emergency shutdown)

mach (a link to mach.sym)

mach_kernel (the Mach kernel itself)

mach.sym (listed as a Mach-O executable for PowerPC, but I’m not sure exactly what it does. Any Darwin users out there with more knowledge should correct me.)

private (contains var, tmp, etc, cores and a Mac OS X directory called Drivers, whose relationship to /dev is unclear)

sbin (Unix system binaries traditionally needed before /usr mounts)

SystemFolderX (not present in /; may be a typo or used in the future?) 

tmp (a link to /private/tmp; the Unix temporary files directory)

Trash (files selected in the GUI to be deleted)

usr (just about everything in BSD Unix these days 😉 )

var (Unix directory meant once upon a time for “variable” files – mail, FTP, PID files, and other goodies)

VM Storage (the classic MacOS virtual memory filesystem – usually equal to the size of physical memory + swap space)

A brief digression about .hidden: Editing this file and adding any directory or file name will cause the GUI to hide the item. For example, edit this file in the shell and add ‘Foo,’ create a directory in / with that name, log out and log back in, and it won’t be seen. 
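
In transcript form (as root; the Finder appears to re-read /.hidden only at login):

echo "Foo" >> /.hidden    # add the name to the GUI's hide list
mkdir /Foo                # create a matching item at the root
# log out and back in: /Foo is still there in the shell, but invisible in the GUI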

However, it isn’t as simple as it sounds. Strangely, there’s an item in / called “TheVolumeSettingsFolder” whose name isn’t in .hidden, but still doesn’t show up in the GUI. This indicates that there is more to what’s shown and hidden in the GUI than just the “.hidden” file. Also, adding a “.hidden” file to another directory than “/” does not appear to cause files of that name to be hidden in that directory. Furthermore, /.hidden does not prevent files with those names from appearing in lower directories, nor does it hide directories in lower directories when an absolute path is placed in .hidden. If anyone can clear this up for me, I’d love to know what’s at work here.

Getting back on track: You’ll notice that all of the traditional BSD file hierarchies are present, although the GUI hides them from the user so that they are unaware that these Unix directory structures exist. Furthermore, some of the directories you see are in fact soft links to unusual places (as mentioned, etc, tmp and var are all links to inside /private; I refuse to speculate why without having several drinks of neat Scotch first). Also, unlike previous incarnations of the Mac OS, the user is unable to rename the “System” directory from the GUI.

As a brief aside, it should be noted that Apple’s great struggle here (and the immensity of their effort should be appreciated) is to combine the classic Mac OS “I can move or rename damn near anything I want” ethos with Unix’s “it’s named that way for a reason” ethos. How Apple chooses to resolve this conflict in the final release will speak volumes about their dilemma over (I’m ashamed to admit I’ve forgotten who put it this way, but it’s brilliant) an OS that “the user controls,” versus an OS that “allows you to use it.” And, let’s face it, Unix has always been the latter.

Also, logical or physical volumes are listed under “/” using their names. Rather than a rigid Unix filesystem hierarchy, DOS’s A: or C: drives, or even Windows 9x’s “My Computer,” Mac users have always been able to name their hard drives or partitions arbitrarily; each drive or partition was then always mounted and visible on the desktop. If I have a drive in my computer which I’ve named “Jeff’s Drive,” you’ll see it from the shell as /Jeff’s\ Drive, and all of its files and directories are viewable underneath it. Similarly, if a user installs MOSX PB on a drive with Mac OS 9 already installed, at installation time their old files and directory hierarchies are all moved to a directory called “Mac OS 9” in /.

** What is strange about the above?? It sounds very UNIXy. As an example, many people mount their floppy drives as ‘/floppy’, but nothing is preventing them from mounting the drives as ‘/Cool\ Ass\ Plastic\ Coaster\ Partition’. **

NetInfo & Friends

Well, by the time I got this far, I realized that this article wouldn’t really be complete without some discussion of the mysterious NetInfo.  I also knew that a little bit of research was in order.  While searching, I learned that NetInfo is a distributed configuration database consisting of information that would normally be queried through a half-dozen separate sub-systems on a typical UNIX platform.  For example, NetInfo encompasses user and group authentication and privilege information, name service information, and local and remote service information, to name just a few.

When the NetInfo system is running (i.e. when the system is not in single-user mode), it supersedes the information provided in the standard /etc configuration files, and is favored as an information source by system services, such as the resolver.  The Apple engineers have accomplished this by hooking a check into each libc system data lookup function to see if NetInfo is running.  If so, NetInfo is consulted; otherwise the standard files or services are used.
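
You can see this split for yourself with the nidump utility, which renders the NetInfo database in the familiar flat-file formats (the “.” argument means the local NetInfo domain):

    cat /etc/passwd    # consulted only in single-user mode
    nidump passwd .    # what the system actually uses when NetInfo is up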

The genius of NetInfo is that it provides a uniform way of accessing and manipulating all system and network configuration information.  A traditional UNIX program can call the standard libc system lookup functions and use NetInfo without knowing anything about it.  On the other hand, MacOS X-centric programs may talk to NetInfo directly, using a common access and update facility for all types of information.  No longer does one have to worry about updating multiple configuration files in multiple formats, then restarting one or more system daemons as necessary.

The other benefit of the system is that it is designed to be network-aware from the ground up.  If information cannot be found on the local system, NetInfo may query upward to a possibly more knowledgeable information host.

NetInfo also knows how to forward requests to the appropriate traditional services if it does not have the requisite information.  It can hook into DNS, NIS, and other well-known services, all without the knowledge of the application making the initial data request.

NetInfo is a complex beast, easily worth an article of its own.  If you want more information, here are a few tips.  I found that reading NetInfo man pages was frustrating.  Most of the pages tended to heavily use NetInfo-related terms and concepts with little to no definition.  Nevertheless, if interested, check out netinfo(5) (netinfo(3) is simply the API definition), netinfod(8), nibindd(8), and lookupd(8).  However, the best information that I found was on Apple’s Tech Info Library at http://til.info.apple.com/techinfo.nsf/artnum/n60038?OpenDocument&macosxs.

Getting More Information

Finally, for a great resource on all things Darwin, don’t go to Apple’s website (publicsource.apple.com/projects/darwin/). Instead, go to www.darwinfo.org, and you’ll find lots of great stuff, including the excellent “Unofficial Darwin FAQ.”

If you don’t have access to MacOS X Public Beta but would like to read its man pages to get a better idea of how some of these commands are implemented, you can find a collection of MOSXPB/Darwin man pages online at www.osxfaq.com/man.

Jeff and Matt hope this little tour has been semi-informative and has raised the curiosity of the few brave souls who have made it to the end of this MacOS travelogue.  Next time we will take a look at … and discuss ….  See you next time.

Why Microsoft Will Rule the World: A Wake-Up Call at Open-Source’s Mid-Life Crisis

By Jeffrey Carl

Boardwatch Magazine, August 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche following but an influential one among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

In a Nutshell: The original hype over open-source software has died down – and with it, many of the companies built around it. Open-source software projects like Linux, *BSD, Apache and others need to face up to what they’re good at (power, security, speed of development) and what they aren’t (ease of use, corporate-friendliness, control of standards). They will either have to address those issues or remain niche players forever while Microsoft goes on to take over the world.

The Problem

There is a gnawing demon at the center of the computing world, and its name is Microsoft. 

For all the Microsoft-bashing that will go on in the rest of this column, let me state this up front: Microsoft has done an incredible job at what any company wants to do – leverage its strengths (sometimes in violation of anti-trust laws) to support its weaknesses and keep pushing until it wins the game. That’s the reason I hold Microsoft stock – I can’t stand the company, but I know a ruthless winner when I see it. I hope against reason that my investment will fail.

It has been nearly two years since I wrote a column that wasn’t “about” something, that was just a commentary on the state of things. Many of you may disagree with this, and assign it a mental Slashdot rating of “-1: Flamebait.” Nonetheless, I feel very strongly about this, and I think it needs to be said.

Here’s the bottom line: no matter how good the software you create is, it won’t succeed unless enough people choose to use it. Given enough time and the accelerating advances of competing software, I guarantee that failure will happen. You may not think this could ever be true of Linux, Apache, or any other best-of-breed open-source software. But ask any die-hard user of AmigaOS, OS/2 or WordPerfect, and they’ll tell you that you’re just wishing.

Sure, there are plenty of reasons these comparisons are dissimilar; Amiga was tied to proprietary hardware, and OS/2 and WordPerfect are the property of companies which must produce profitable software or die. “Open-source projects are immune to these problems,” you say. Please read on, and I hope the similarities will become obvious.

For the purposes of this column, I’m going to include Apple and MacOS among the throng (even though Apple’s commitment to open source has frequently been lip service at best), because they’re the best example of software that went up against Microsoft and failed.

You say it’s the best software out there for you – so why does it matter what anyone else thinks? It doesn’t, at first. But, slowly, the rest of the world makes life harder for you until you don’t have much choice.

In the Software Ghetto

Look at Apple, for example (disclaimer: I’m a longtime supporter of MacOS, along with FreeBSD and Linux). My company’s office computers are all Windows PCs; my corporate CIO insisted, despite my arguments for “the right tool for the job,” that I couldn’t get Macs for my graphics department. “We’ve standardized on a single platform,” is what he said. He’s not evil or dumb; it’s just that Windows networks are all he knows and is comfortable with.

Big deal, right? Most Mac users are fanatics. There’s a community of millions of fellow Mac-heads out there that I can always count on to keep buying Macs and keep the platform alive forever, right? An installed base of more than 20 million desktops, plus sales of 1.5 million new computers in the past year alone, is enough for perpetual life, right?

Sure – until that number falls below the number of licenses Microsoft decides it needs to justify producing MS Office for Mac. Right now, my Mac coexists with the Windows-only network at my office because I can seamlessly exchange files with my Windows/Office brethren. But as soon as platform-neutral file-sharing goes out the Window (pun intended) – which Microsoft could easily arrange by upgrading Office with a proprietary document format that can’t be decoded by other programs without violating the DMCA or something asinine like that – I’m going to have to get a Windows workstation.

Or Intuit decides there just aren’t enough users to justify a Mac version of Quicken. Or, several years from now, Adobe decides it’s just not profitable to make a Mac version of Photoshop, InDesign or Illustrator – or ships critical new releases six months or a year behind the Windows version. I can still keep buying Macs … but I’ll need a Windoze box to run my critical applications. As more people do this, Apple won’t have the revenues to fund development of hardware and software worth keeping my loyalty. And I’ll keep using the Windows box more and more until I finally decide I can’t justify the expense of paying for a computer I love that can’t do what I need.

“Apple,” you say, “is a for-profit company tied to a proprietary hardware architecture! This could never happen to open-source software running on inexpensive, common hardware!”

Open Source with a Closed Door

Let’s step back and look at Linux. A friend of mine works as a webmaster at a company that recently made a decision about what software to use to track website usage statistics. His boss found a product which provided live, real-time statistics – which only ran on Windows with Microsoft IIS. My friend showed off the virtues of Analog as a web stats tool, but its reports were too complicated for his boss to decipher. Whatever arguments my friend provided (“Stability! Security! The Virtues of Open Development!”) were simply too intangible to outweigh the benefits his boss wanted, which this one Windows/IIS-only software package provided. So, they switched to Windows as the hosting environment.

There may come a day when you suggest an open-source software solution (let’s say Apache/Perl/PHP) to your boss or bosses, and they ask you who will run it if you’re gone. “There are plenty of people who know these things,” you say, and your boss says, “Who? I know plenty of MCSEs we can hire to run standardized systems. How do we know we can hire somebody who really knows about ‘Programming on Pearls’ or running our website on ‘PCP’ or whatever you’re talking about? There can’t be that many of them, so they must be more expensive to hire.” Protest as you might, there isn’t a single third-party statistic or study you can cite to prove them wrong.

If you ask the average corporate IT manager about open source, they’ll point to the previous or imminent failures of most Linux-based public companies as “proof” that open-source vendors won’t be there to provide paid phone support in two years like Microsoft will. 

I’m willing to bet that most of you out there can cite examples of the dictum that corporate IT managers don’t ever care about the costs they will save by using Linux. They are held responsible to a group of executives and users that aren’t computer experts, aren’t interested in becoming computer experts, and wouldn’t know the virtues of open source if it walked up and bit them on the ass. They want it to be easy for these people, and fully and seamlessly compatible with what the rest of the world is using, cost be damned. Say what you will – right now, there’s just no logical reason for these people not to choose Windows.

So maybe Linux users drop down to the point where Mac users are now (still a significant number) – only the die-hard supporters. But how many of you Linux gurus out there don’t have a separate Windows box or boot partition to play all the games that aren’t developed for Linux because of its lack of users/market share? What happens when the next killer app is Windows-only – will you use Linux less and less? Or when the next cool web hosting feature is MS/IIS-only, or as more MS Internet Explorer-optimized websites appear?

I’m not arguing that Linux or BSD would ever truly disappear (there are still plenty of OS/2 users out there). I am, however, saying that as market share erodes, so does development; and, over the long run – if things continue on the present course – Windows has already won.

The main point is this: niche software will eventually die. It may take a very long time, but it eventually will die. Mac or Linux supporters claim that market share isn’t important: look at BMW or Porsche, which have tiny market shares but still thrive. The counterpoint is that if those cars could only drive on compatible roads, and the local Department of Transportation had to choose between building roads that 95% of cars could ride on or building separate roads for the few, they would soon have nowhere to drive. True, Linux/BSD has several Windows ABI/API compatibility projects and Macs have the excellent Connectix VirtualPC product for running Windows on Mac, but very few corporate IT managers or novice computer users are going to choose those over “the real thing.” And I’m willing to bet that those two groups make up 90% of the total computer market.

You can argue all you like that small market share doesn’t mean immediate death. You’re right. But it means you’re moribund. One of the last bastions of DOS development, the MAME arcade game emulator, is switching after all these years to Win32 as its base platform – because the lead developer simply couldn’t use DOS as a true operating platform anymore. It will take time, but it will happen. Think of all the hundreds of thousands (if not millions) of machines out there right now running Windows for Workgroups 3.11, OS/2, VMS, WordPerfect 5.1, FrameMaker, or even Lotus 1-2-3. They do what they do just fine. But, eventually, they’ll be replaced. With something else.

The Solution

All this complaining aside, the situation certainly isn’t hopeless. The problems are well known; it’s easier to point out problems than solutions. So, what’s the answer? 

For the 90% of users who will decide marketshare and acceptance, two things matter: visible advantages in ease of use, and quantifiable bottom-line cost savings. Note, for example, how Mac marketshare declined from 25% to less than 10% as the “visible” ease-of-use differential between Mac System 7 and Windows 95 narrowed. Or consider the math of more-expensive Mac computers plus fewer support personnel versus cheaper Windows PCs plus more (but certified, with predictable salary costs) support personnel.

Open-source software development is driven by programmers. Bless their hearts, they create great software, but they’re leading it to its eventual doom. They need to ally firmly with their most antithetical group: users. Every open-source group needs to recruit (or have volunteered to it) at least one user-interface or marketing person. Every open-source project that doesn’t have at least one person asking the developers at every step “Why can’t my grandmother figure this out?” is heading for disaster. Those that do are making progress.

Similarly, open-source projects that have proprietary competitors and deal with some sort of industry standard, but don’t take a Microsoft-esque “embrace and extend” approach, are going to fall behind. If they don’t provide new APIs or hooks for new features (and there’s nothing against making these open and well-documented), Microsoft will when it releases a competing product (and, believe me, it will; wait until Adobe has three consecutive bad quarters and Microsoft buys them). The upshot of this point is that open-source projects can’t just conform to standards that others with greater marketshare will extend; they need to provide unique, fiscally-realizable features of their own.

Although Red Hat has made steps in this direction, other software projects (including Apple, Apache, GNOME, KDE and others) should work much harder to provide some rudimentary form of certification process, offering a standardized qualification. Otherwise, corporate/education/etc. users will have no idea what it costs to hire qualified support personnel.

Lastly, those few corporate entities staking their claims on open source should be sponsoring plenty of studies to show the quantifiable benefits of using their products (including the costs of support personnel, etc.). The concepts of “ease of use” or “open software” don’t mean jack to anyone who isn’t a computer partisan; those who consider computers to merely be tools must be shown why something is better than the “safe” choice.

The Great 1U Linux Server Shoot-Out

By Jeffrey Carl and Eric Trager

Web Hosting Magazine, July 2001

Web Hosting Magazine represented the high water mark of the market for print publications around the Internet Service Provider industry. In other words, it was probably my last-ever opportunity to get a paid gig writing for something that would appear physically on a newsstand. It was intended to be more focused than Boardwatch and, according to Wikipedia, “was written and edited with a deliberately ‘edgy’ style, designed to appeal to the primarily young and male constituency of the ISP industry.” Matt Loschert, Eric Trager and I contributed a few long-form articles before the magazine – like the rest of the telecom industry at the end of the dot-com era – vanished without much of a trace.

Are you looking for excitement, adventure and romance? 

Well, so are we. However, if you’re looking for 1U rack-mountable Linux webservers, we’ve got the goods.

Something Old, Something New, Something Borrowed, Something Blew

Last year, we agreed to meet with Web Hosting Magazine editor Dmitri Eroshenko about being hired to write an in-depth exposé on tradeshow booth babes. Instead, we were chloroformed and when we woke up, we had been locked in a small room with four thin rack-mountable Linux-based webservers to evaluate.

Some of the servers sent to us were early or pre-release models (dating back to December 2000). We were going to test them right away, but in early 2001, the web hosting provider that we all worked for at the time did a belly-flop into the financial deep end of the pool. This created some serious havoc, since it’s hard to do adequate server testing when you keep getting interrupted by people named “Guido” coming to the server facility to repossess your routers. 

As a result, we should note up-front that many of the questions or shortcomings noted in our reviews may have been addressed in more current versions of the servers (like improved documentation, or inclusion of the current Linux 2.4 kernel instead of the then-current 2.2.x kernel with SMP support compiled in). Some of the equipment we reviewed was clearly work-in-progress material, and that shouldn’t be held against the manufacturers, since they were kind enough to supply the results of their hard work to chloroform-hungover and inherently cynical server reviewers.

Hey, Nice Rackmount

Each of the servers was evaluated on 10 criteria:

1.) Out of box/set up/rack mount – How easy is it to get the server out of the box and into a rackmount production location? Is anything unusual about the server hardware and components?

2.) Documentation – What is the quality and quantity of documentation, and its comprehensibility for an intelligent but not necessarily experienced server administrator? How polished and professional is it? 

3.) Basic net configuration – How easy is it to configure its essential network connection information to get it online and accessible to the rest of the ‘net? 

4.) GUI tools – The intended user will probably want to use graphical setup tools rather than relying on the command line. How intuitive, powerful and flexible are these tools?

5.) Set up/start Apache/add a virtual host to Apache – Apache is the de facto web server of choice (and not without good reason) for the free *nix world. How easy is it to set up and configure a typical (but basic) Apache setup?

6.) Set up/start secure web server – If the server included secure (HTTPS) web server software (essential for encrypted e-commerce transactions), how easy was it to set up and/or configure?

7.) Set up/start SSH – Secure SHell (SSH) is more or less indispensable these days as a security-conscious encrypted replacement for the venerable telnet; if you’re going to do remote command-line administration, it’s a must. Is it installed, and how easy is it to get running?

8.) Set up/start MySQL – MySQL is the “Budweiser” of free relational database software, and is almost a necessity for DB-based site hosting on *nix. Is it installed? If not installed, is it easy to get and install?

9.) Sex appeal/miscellaneous – This section is for how the box looked and felt sitting on a rack, and other cool features or options which don’t fit in anywhere else. I guess the fact that we were willing to describe rackmount servers in any way with the word “sex” involved shows what losers we are.

10.) The big picture – What’s the overall assessment of the server, taking into account the good, the bad, and the ugly?

We decided to approach ease of use for administration from the point of view of a non-expert user – someone with an understanding of the concepts of what needed to be done, but without specific experience in using those tools. The idea was to be akin to someone who understood routing but would not know what to do at a Cisco IOS command prompt. 

We were unsure how exactly to address this, since we were all fairly well-trained *nix server administrators. We eventually decided that the best solution was to each drink about a gallon of gin before testing, to simulate the reactions of an unfamiliar administrator, AOL user or above-average intelligence eggplant. Okay, we were kidding about the last part, but we did decide to “jump in” to each server without thoroughly reading all of the included documentation – as many impatient server admins are likely to do. We hoped that skimming the manuals and trying to use our common sense as much as possible would replicate the experience of a “common-denominator” user.

The Heart of Dorkness

Here, then, were our testing results:

Siliconrax/Sliger:

Model: ????

Suggested Price: ????

Website: www.siliconrax.com

1. Out of Box Setup: It’s powerful, but it isn’t pretty. The dual Pentium-III (866 MHz) server we received came without a power cable or mounting brackets – making out-of-the-box setup difficult for those who aren’t prepared (we assume this little oversight has been remedied for production models). Access to the Ultra SCSI 3 (cool!) hard drives comes through the back fan setup – not entirely intuitive, but nonetheless pretty easy.  For access to the motherboard, unscrew the top of the case; upon looking inside, we questioned why the fan seemed to be placed right next to the internal cabling. 

2. Documentation: Frankly, the documentation we were provided with the demo unit was, well … awful for anyone other than an already-experienced Linux user. Clearly, the good folks at Siliconrax had spent their time on their server rather than their documentation (true to form for engineers), but what was included didn’t give us any “warm, fuzzy” feelings about the server. 

Items 3 – 9: The server was well-stocked, and included a standard version of Red Hat Linux 7.0 running an SMP-enabled 2.2 Linux kernel (2.2.16-22smp), with the powerful (although not, in our likely-to-be-controversial opinion, as user-friendly out of the box as KDE) GNOME installed as the GUI environment. 

Shortly before we were to begin detailed testing of the server, it was recalled home. We can easily imagine why, since the evaluation server we had been given was clearly a beta-stage unit. While we were unable to put the server through its normal paces, this was probably a positive move, since the unit seemed to have been prepared for us in a rush and the loose edges were fairly evident. However, the server appeared to have a lot of potential, and we’re confident that more recent units (which already included some pretty impressive hardware) will tie up those loose edges and have a more polished, “finished-product” feel.

10. The Big Picture: Impressively powerful hardware for a 1U server, and Red Hat is a good choice of Linux distributions to ensure maximum compatibility with third-party applications. Due to the pre-release feel of the server, it probably isn’t kosher for us to offer an evaluation of the other aspects of the unit without adequate production-unit evaluation. While the server we reviewed wasn’t a very polished package, we think it certainly has potential.

StarBox:

Model: iBox 1U Server

Suggested Price: ????

Website: www.starbox.net

1. Out of Box Setup: The test server we received included no install media; this can be a problem if, for example, you need to reformat the drives and re-install the default configuration. (Whether this is the case for production servers, we’re not sure; we guessed that this might be an early pre-production unit, since the dual-Pentium III server came in a box with a large “UltraSPARC-powered” sticker.) The server included extra ports, although we weren’t sure why. Nice features included an easy-open case for hardware enthusiasts, and dual power supplies for high reliability (a feature not seen on any of the other servers we evaluated).

2. Documentation: Like reading someone else’s college Computer Science Unix course notes. The box included a sheaf of black-and-white papers that were loaded with information but lost some points on presentation and organization. This, however, seemed like an excusable fault for a review unit; we assume that the documentation shipping with production servers is a bit more polished.

3. Basic Net Configuration: This posed a bit of a problem for us, since the IP address assigned to the machine in the default configuration we received was a non-routable one; we figured that either 1.) the presumption was that this server would be set up inside a protected LAN environment, or 2.) the machine had been shipped to us having just come from such an environment. Since our testing environment was a “real, live IPs” production setup, we initially could not access the server from any machine other than itself. As a result, we could not access the web-based administration from another computer and needed to play at the command line to get it set up for proper networking (in this case, we used linuxconf to change the IP address, then manually edited the Plesk httpsd.conf file to reflect the new IP). However, if you have a fair amount of experience with *nix servers, this shouldn’t prove too much of a problem.
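
A rough sketch of the equivalent raw commands follows (the addresses are placeholders; linuxconf does the same thing and makes the change persistent, and the location of the Plesk config varies, so we went hunting for it):

    # give the box a routable address and a default route
    ifconfig eth0 192.0.2.10 netmask 255.255.255.0
    route add default gw 192.0.2.1
    # then track down Plesk's web server config and update its IP entries
    find / -name httpsd.conf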

4. GUI Tools: Once we had the server set up for networking, we tried the web-based administration. For its graphical interface, the server uses the popular Plesk GUI. Unfortunately for us, when we tried to run it, we received the message: “Time expired. If you want to continue to use PLESK, please buy a license.” This was obviously a simple error made in a pre-production server sent out for review but, alas, it ended our experiments with the GUI tools on the server. However, from other experiences with Plesk, we can vouch that it’s a very cool admin tool, and a good choice on the part of the designers.

5. Apache Setup: Kids, don’t try this at home. Lacking the Plesk GUI, we figured that we’d just set it up from the command line as we normally do. We can’t pinpoint exactly what happened along the way, but after we finished our normal setup routine, Apache was f***ed up beyond all recognition. This, in a sense, is an unfair criticism, since the system doesn’t seem to have been intended to be configured via the command line, and as noted, Plesk is a very good GUI configuration tool. Still, those seeking to administer via the command line should thoroughly check the documentation beforehand to avoid the kind of problems we encountered.

6. Secure Server: We learned our lesson and didn’t even try this one from the command line. However, a secure server is already installed, and outside experiences confirm that it isn’t too hard to set up via the intuitive Plesk GUI.

7. SSH Setup: Thankfully, SSH was pre-installed on the server, so we didn’t have to break anything manually.

8. MySQL Setup: Also thankfully, MySQL was already installed, so this prevented us from trying to set it up, which presumably would have ended up with the CPU catching on fire. 

9. Sex Appeal: The box itself was solid, although unremarkable. It was certainly impressive in the sense that it sounded like it had a jet engine running inside; the noise was noticeable enough to be slightly troublesome, but not loud enough to make us think that something was broken. Our only lingering question about the hardware we received was the choice of giving it great processing power (dual 800 MHz Pentium IIIs) but only 128 MB of RAM. 

For some reason, despite the obvious power of the machine’s hardware, we found for example that the popular emacs text editor seemed to run very slowly. This was possibly due to the seemingly underpowered (compared to CPU) RAM allocation, or the fact that emacs (“now with Kitchen Sink module!”) is a notorious resource hog. 

10. The Big Picture: Again, limitations in the testing unit prevented us from providing a really accurate evaluation of a production server. Still, from what we knew of the Plesk GUI and the installed software on the server, it should be a pretty user-friendly box, and the included software made for a good start for inexperienced server administrators. There’s obviously some solid engineering behind this server. Aside from minor quibbles about the included server hardware, we felt that the server has a great deal of potential.

ServeLinux:

Model: ServeLinux 1U Rackmount Server

Suggested Price: ????

Website: www.servelinux.com

1. Out of Box Setup: Whoa, this one was large and heavy. Its size and weight will make you wonder briefly about the stability of the supplied rack mounting brackets, but we can assure you that the included brackets hold up just fine. The install media we received was evidently an in-house-made TurboLinux 6 CD, with a hand-scrawled label. 

On the positive side, the box included ports on both front and back, and the case had keys and lockable drive bays for those concerned that their colo providers might randomly rip the drives out and use them for ping-pong paddles. The setup included wire ties, and there was thermal tape for the fan as well. The two 46 GB IBM DeskStar hard drives offered ample (if not especially fast) storage, and the dual 933 MHz Pentium III processors showed serious power.

2. Documentation: Bound, printed manuals with diagrams and good step-by-step walkthrough text. There is a separate manual for Slash, the GUI administration tool – a good choice.

3. Basic Net Configuration: On original startup, it launches a ServeLinux-branded, LinuxConf-esque semi-graphical tool to configure basic network settings (allowing DHCP or static config). After you enter settings into this tool, it saves them appropriately and then boots normally into Linux. This original setup screen also allows you to choose whether you’d like to use an SMP-enabled kernel or a single-CPU kernel.

4. GUI Tools: Uses Slash, a proprietary ServeLinux GUI for administrator setup. It runs on a secure port, which is a good choice. Aside from Slash for the administrator, it also includes Webmin for virtual hosting users. 

5. Apache Setup: Straightforward and easy with the GUI tools. As a bonus, Microsoft FrontPage extensions are included by default; we found it heart-warming that they would include default support for those individuals who have been enslaved by their unfortunate crack cocaine habit and chose to use FrontPage for their web publishing. As a nice bonus, the well-featured OpenMerchant shopping cart software was included as well.

6. Secure Server: Includes ApacheSSL, the open-source Secure Server variant of Apache. In the past, ApacheSSL has been derided for its relative lack of speed compared to commercial secure servers; but its integration with non-secure Apache, and speed increases in more recent versions, should make it a respectable choice.

7. SSH Setup: OpenSSH was pre-installed on the server, and was ready for starting via Slash without any problems.

8. MySQL Setup: MySQL was pre-installed and configured to work with Slash, which was a nice touch.

9. Sex Appeal: Like the StarBox server, the loudness of the fan made the server (sporting dual 933 MHz Pentium IIIs) sound like it contained a 12-cylinder engine idling in fourth gear. The box itself was pretty cool, sporting a hip shade of purple.

We applauded the choice of the powerful and easy-to-use Qmail as the default mail transfer agent; we love the venerable Sendmail MTA as much as the next *nix geeks, but in-depth configuration of the program is a mysterious black art like alchemy, necromancy or NASCAR. The server we tested ran TurboLinux 6 (a plucky upstart but solid distribution) with Linux kernel 2.2.14, SMP enabled.

One of the nicest touches we found (something that should be an option with every *nix machine) was the “Mulligan” option, which refers to the golf term for re-taking a muffed shot. After running the original net configuration, if you find that you’ve screwed it up, you can run a special command (/root/Mulligan/Mulligan) from the command line interface. The server will then reset to its default net config options, reboot and present you with the original net configuration screen.

10. The Big Picture: For raw power, this one’s the winner. It lacks a bit of polish, and Slash may not have all the niceties that some other commercial GUIs offer; still, it’s got pretty much all the software you’d want, plus a few extra thoughtful touches. Overall, it’s a very nice package.

Cobalt:

Model: ????

Suggested Price: ????

Website: www.cobalt.com

1. Out of Box Setup: Cobalt certainly pays attention to style as well as function. The box included rackmount brackets, and the server itself has plastic “feet” for surface-mounting. The front of the server has a cool LCD panel with simple button controls, through which initial network configuration settings can be made. Unlike the other servers, the Cobalt seemed deliberately constructed to make it difficult to access its internal structure;  simple enough for most users, but difficult for those of us who were determined to violate the warranty and crack it open.

2. Documentation: Documentation for Unix servers is frequently a pretty sparse thing, leaving novice users to thrash around; Cobalt does its best to ease the pain. The Cobalt includes a pretty cool spiral-bound book with thorough directions and screen captures. It’s also pretty obviously the documentation that the makers spent the most time and expense on. 

3. Basic Net Configuration: Basic net setup was easy, and could be done either through an LCD panel on the front of the unit or through a graphical web interface. Changes to IP address or net configuration required a reboot, although it seems that this information must also be stored in some sort of flash RAM (when setting the server up originally, we entered the network settings via the LCD before Linux even began booting).

4. GUI Tools: The Cobalt offered a good interface for configuring the server via a web page, with very simple layout and a rich set of features. The server management stuff was very cool, but we found it somewhat troublesome that the configuration was done through a non-encrypted session (call us paranoid, but we’re security-conscious enough that we’re troubled about sending our Slashdot usernames and passwords over insecure links). 

The GUI interface also offers configuration tools for many of the non-webserver and third-party software that ships with the server, like time, DNS, RAID and the Legato and Arkeia backup systems. It should be noted that besides the web interface (and the obvious command-line Linux tools), most of the most basic configuration options can be set with buttons on the LCD console. 

5. Apache Setup: Setup was very easy, and the included GUI tool was very cool; the more we worked on the server, the clearer it became that Cobalt had certainly spent more time than anyone else on the slickness and user-friendliness of their administration tools. The default Apache build includes mod_perl and mod_frontpage. Chilisoft’s ASP-on-Linux software was included as well (a very nice and otherwise expensive freebie), and its setup web tool was very cool; it also offered the non-Apache-standard ability to limit bandwidth by IP address.

6. Secure Server: Secure server setup is quick and easy to understand; we were never tempted to touch the command line to tweak it.

7. SSH Setup: For some reason we couldn’t explain, Cobalt had elected not to include an SSH server/client with the default installation (we figured the reason was probably Alien Mind Control Rays from Outer Space). Nonetheless, installing OpenSSH was rather quick and simple.
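
The install was the usual source-build dance; here’s a sketch (the version number is illustrative, and you’ll want zlib and OpenSSL in place first):

    tar xzf openssh-2.5.2p2.tar.gz
    cd openssh-2.5.2p2
    ./configure && make && make install
    # generate host keys with ssh-keygen if the install step didn't,
    # then start the daemon (default install prefix is /usr/local)
    /usr/local/sbin/sshd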

8. MySQL Setup: For a change, MySQL wasn’t pre-installed, but feisty challenger PostgreSQL was. We found it to be a nice choice.

9. Sex Appeal: Let’s just say it up front: the deep blue case looks damn cool, and the Cobalt logo on front lights up from inside – normally the sort of nice design touches you’d expect from someone like Apple. 

The default install includes APOP (secure POP3), and to add extra software, you can install Debian .pkgs provided with the server, or via HTTP through the GUI. One odd touch (depending on your point of view) is that the server can’t be shut down through the web interface.

10: The Big Picture: Unquestionably, the Cobalt is the slickest and most user-friendly of the bunch. The extra third-party software included also makes it the top value – as long as you have need of it. However, we questioned some of the absent software; and it seems a little unfair to judge the newer and less well-funded outfits against Cobalt on polish alone – we feel that the others may easily catch up in time. Still, overall, the Cobalt is the likely choice, based on the units we tested, for a server with the power of Linux without the hassles (for many users) of Linux administration.

Summing it All Up

Let’s face it – Linux is Linux, Apache is Apache, and Pentium-based servers are Pentium-based servers. Variances in version numbers or clock speeds among these servers are important, but only a minor issue in the long run. What really matters are the price, the power and tested stability of the hardware, the polish and ease-of-use, and the extra goodies that are included – and which of these qualities are most important to you. One person’s winner is the other person’s loser.

All of the servers we tested (considering the improvements that were likely made between testing and release models, based on the promising work that had been done already) are more than adequate for the task of mid-range webserving. In the end, it comes down to which server suits your needs best. For price, ???? was the winner. Based on sheer power, ServeLinux comes out on top (but needs more default RAM!) And for software goodies and ease-of-use, Cobalt finishes way ahead of the rest of the pack. 

So, in the end, the decision comes down to which factors are most important to you. Fortunately, the fact that all of these servers are working with mutually available choices of software and hardware will keep them competing over who can build the best server from the newest equipment for the best price – making this month’s winner into next month’s potential loser. All of the servers we reviewed are worth keeping an eye on as they grow and progress – and they will all prove great potential values for web hosters.

Darwin Evolves – Apple’s BSD Hits the Prime Time

By Jeffrey Carl

Boardwatch Magazine, May 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche following but an influential one among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

In a Nutshell: DarwinOS, the core of Apple’s just-released MacOS X, is open-source and available as a free download. While it inherits many characteristics from current BSD Unixes, it’s also radically different – revealing its true identity as the direct descendant of the ahead-of-its-time NeXTSTEP operating system. Darwin has some rough edges and missing documentation, but offers some very interesting possibilities. Unfortunately, the fact that it’s intended for Apple hardware will limit its appeal to many ISPs and sysadmins.

By the time you read this, Apple’s MacOS X should be on the street – the first-ever consumer OS with a Mach microkernel/BSD Unix core. That core is called Darwin, and it’s an open-source project – available under the BSD-ish Apple Public Source License – that can be downloaded for free (MacOS X, which includes the new Mac GUI and lots of other goodies, costs $129).

I’ve talked about Darwin here before, when it was in its earlier stages, but wasn’t able to go into many specifics about what Darwin was really like for a Unix admin. Now that the version that ships with MacOS X 1.0 is here, let’s take a look at this remarkable OS change for Apple.

The Politics Behind Darwin

Darwin is the triumph of NeXT over Apple. The GUI-only, outmoded classic MacOS of Apple’s last 17 years is gone, and a tough, sexy Unix core has replaced it (although MacOS X includes a compatibility layer to run existing MacOS apps). To understand why – and to understand why a Unix admin might be very interested in an Apple OS (other than the “research project” MkLinux) for the first time in many years – it helps to know the politics behind the situation.

In case you haven’t been paying attention to Apple’s corporate soap opera for the last few years (or, more likely, just don’t care), here’s the short version. Apple was at its lowest depths in late 1996, having wasted more than six fruitless years trying to create a modern operating system to replace MacOS (which was still building on foundations laid in 1984). Then-Apple CEO Gil Amelio instead looked to buy a company that already had an advanced OS, and settled on NeXT, the failing company run by Apple’s exiled cofounder Steve Jobs. The brilliant, mercurial, occasionally tantrum-prone Jobs was instrumental in building Apple from Steve Wozniak’s garage into an industry giant. But in 1985, he was fired by his own board of directors, essentially for being a major jerk to everyone within a 50-yard radius. It was a very nasty “divorce,” and Jobs left in a huff to found NeXT.

The NeXTSTEP (later called OPENSTEP) operating system which powered NeXT’s computers was years ahead of its time – but shipped on underpowered, grossly overpriced computers (sound familiar?) and was rejected by the marketplace. NeXTSTEP was based on Unix and the Mach microkernel, and included a radical new GUI and a pretty cool object-oriented development framework. 

Microkernels (of which Mach is the most famous example) provided a much more elegant OS design than most OSes used (and still use), by moving everything but memory management and other lowest-level tasks out of the kernel. However, the overhead of this elegant and scalable design meant reduced performance in many “everyday” situations (it was like replacing a simple Excel spreadsheet with a full relational database), and microkernels were sidelined as academic curiosities. The NeXT object-oriented development kit was very advanced, but required knowledge of the relatively obscure Objective-C language and was largely ignored as well. 

For Apple’s $400 million purchase of NeXT in 1997, they got not only the company’s OS but CEO Jobs as well. Amelio thought he was getting a valuable figurehead/consultant in Steve Jobs. But an ex-Apple employee who knew Jobs better than Amelio did predicted to Apple CTO Ellen Hancock that “Steve is going to f*** Gil so hard his eardrums will pop.” Sure enough, in July 1997, Gil Amelio was forced to resign and Steve Jobs once again was in charge of Apple. (Hancock resigned when Amelio did, and went on to become president of web hosting and datacenter giant Exodus.)

The rest is history. In the foreground, Jobs was introducing the fruit-colored, legacy-free  iMac to the world, sparking Apple’s sales resurgence. In the background, nearly all of Jobs’ loyal NeXT troops were assuming the top posts at Apple and changing the company’s technology and direction.

In 1999, the hype about Linux and open source was at its height, and Apple felt the pressure to join the crowd. Since BSD and Mach – which formed the core of NeXTSTEP – were already open source, it wasn’t hard for the normally ultra-proprietary Apple to take the step of officially open-sourcing the core of the forthcoming MacOS X. The NeXTSTEP core of MacOS X officially became “Darwin,” and a developer community of Apple engineers, Macolytes and Unix hackers began to form around the project. Along the way it saw contributions from Quake designer John Carmack, a basic port to Intel x86 hardware, and a “1.0” version that has evolved significantly as MacOS X neared release.

It’s worth keeping in mind how much of a departure Darwin is from the old MacOS. It’s as if the “House that GUI Built” was taken over by the Unix geeks from Carnegie-Mellon and handed over to the hacker community for safekeeping. NeXT was the “ugly step-child” of the BSD family that nobody else noticed; now it will claim more users than the rest of the family combined – probably in less than six months.

Getting In-Depth with Darwin

After all that, it’s time to get under the hood of Darwin as a Unix admin would approach it. Many reports have claimed erroneously (at times, I have said this as well) that Darwin was basically “FreeBSD on Mac.” In fact, it’s a bit of updated Mach, a bit of traditional Free/Open/NetBSD, and a lot of NeXTSTEP flavoring. 

Darwin uses the Mach microkernel, but attempts to solve some of its performance problems by moving much of the BSD functionality into the same “kernel space” (rather than in “userland” as a pure microkernel would). As such, it merges the two worlds in a way that is designed to keep the architectural elegance of a microkernel design while minimizing the performance overhead that microkernel process scheduling causes.

The first thing a BSD Unix admin will notice upon logging into Darwin is its familiarity. The default shell is tcsh, and nearly all of the customary /bin, /usr/bin and /usr/sbin utilities are there. In addition, you’ll find vi, pico, perl, sendmail, Apache, and other typical goodies. Ports to Darwin of popular open-source Unix apps – from MySQL to XFree86 4.0.2 – are proliferating rapidly. From a typical user’s perspective, it’s almost indistinguishable from *BSD.

It’s only once you become root and muck around in the system’s administration internals that you start to notice what makes the system a true child of NeXTSTEP. You’ll notice in /etc (actually a link to /private/etc; see below for more information) that /etc/resolv.conf doesn’t contain what you would expect. Nor does /etc/hosts, /etc/group or /etc/passwd. Why? It’s because Darwin wraps many of the functions contained in separate system and network configuration files in *BSD into a single service (inherited from NeXT) called NetInfo. Why is this useful rather than just annoying?

In *BSD and Linux, a number of different services derive their information from separate text configuration files – each with its own syntax and options. There’s no global preferences database – for example, any application that doesn’t automatically know how to read all the items in /etc/resolv.conf can’t find out what name servers your computer is using, or a program that can’t parse the syntax of /etc/passwd doesn’t know which users are on your system. 

Somewhat like the Microsoft Windows Registry, Darwin’s NetInfo provides a database of network and user settings that can be read by any other NetInfo-aware application. NetInfo supersedes the information in the traditional /etc/* configuration files, as well as being favored by system services. NetInfo is consulted not only by MacOS X-native applications, but also by traditional BSD/Unix applications as well (making it much easier to port these apps to Darwin). The Apple engineers have accomplished this by hooking a check into each libc system data lookup function to consult NetInfo if it’s running (by default, it’s only “off” in single-user mode). 

MacOS X’s GUI provides graphical tools for manipulating the NetInfo database; in Darwin, this can be done using the niutil and nicl commands (use man niutil and man nicl to see the syntax and options; it’s interesting to note that these man pages are dated from NeXT days). NetInfo can also “inherit” its settings from a “parent” NetInfo server, so you can create one server which has everyone’s account information on it, and all of its client machines will have their login info, network setup, et cetera (imagine a “family” of servers where users can interchangeably log in with the same accounts, settings, etc.).
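
A taste of the syntax (the “.” argument means the local NetInfo domain; any user record will do in place of root):

    niutil -list . /users        # enumerate the user records
    niutil -read . /users/root   # dump the properties of one record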

Like NetInfo settings, application preferences are stored in a global XML database; they can be manipulated from the command-line defaults program. Typing man defaults from the command line will give you an idea of how its structure works.
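
For example (the domain and key here are invented for illustration):

    defaults write com.example.FooApp FavoriteColor -string purple
    defaults read com.example.FooApp
    # per-user preferences end up as XML property lists
    # under ~/Library/Preferences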

One area that has been changed since the MacOS X Public Beta is that it is no longer necessary to reboot or log out/log back in to change network settings. Anyone who has used *BSD/Linux on mobile computers, or changed network profiles in Windows NT will appreciate this difference.

Darwin/MacOS X includes a /private directory at the root filesystem level, which contains the normal BSD /etc, /var and /tmp, plus the Darwin cores and Drivers directories. It appears that (in behavior inherited from NeXT) this directory is filled with info that is machine-specific, so that the rest of the filesystem can be booted from a parent network server. This clearly plants the roots for Darwin or MacOS X-based systems to serve as “thin clients.”

As for performance, recent builds of Darwin work admirably on mid-range Mac hardware. The only real complaint I have about Darwin (my gripes with the MacOS X user interface could fill a small book, however) is its woeful lack of documentation. While many BSDs suffer a similar problem, their user communities have had time to fill in the gaps; the changes that Darwin makes to the traditional BSD model are largely known only to the Darwin community and old-school NeXT gurus.

The closer I look at it, the more Darwin (and MacOS X) is appearing to take up where Novell left off in the race to compete with Windows NT/2000 in the corporate network space. It might be possible that Apple is looking to go head-to-head with Windows Whistler/XP while nobody is looking. And any victory in that space for *nix is something to be cheered.

Darwin’s Dilemma

Unfortunately, most of the work that has gone into Darwin thus far (understandably) has been in developing it for recent Apple hardware. While an Intel x86 port exists (though as of this writing its hardware support is still embryonic), and older PowerPC Mac hardware support will undoubtedly extend over time, the current release of Darwin is not officially supported on Mac hardware older than G3 Power Macs. This sadly eliminates (at least for now) using Darwin to make a great server out of any old Mac hardware you have sitting on a shelf.

When you buy a new Mac, you’re paying for things (like full MacOS X, an ATI Radeon or nVidia GeForce video card, optical mouse, etc.) that you aren’t going to care about as a server admin. While new Mac systems are surprisingly powerful considering their CPU clock speed (I would sooner compare a G4 to an UltraSPARC III than a Pentium III), you still won’t get the same performance dollar-for-dollar as you would with commodity x86 hardware and a free OS. 

As a result, buying a new Mac just to make into a Darwin server simply isn’t worth the money. If you love the GUI tools of MacOS X, it may be worthwhile (I’m personally salivating over the new Titanium PowerBook G4); otherwise, it still doesn’t make bottom-line dollars and sense to purchase a new Mac as a Darwin server.

As Darwin expands its processor base, this may change. In the meantime, it’s well worth your while to keep an eye on the Darwin project, and to get to know it, since some of its features are well worth adopting by the other BSDs and Linux.

Some of the preceding items stem from a series of articles that BSD guru Matt Loschert and I wrote for the BSD news site Daemon News (see www.daemonnews.org/200011/ and www.daemonnews.org/200012/ for excessively long and drawn-out versions). 😉 For great info on Darwin, you can skip its home page (www.opensource.apple.com) and go straight to Darwin Info (www.darwinfo.org). Also, check Xappeal (www.xappeal.org) and Apple’s Darwin lead Wilfredo Sanchez’s updates page (www.advogato.org/person/wsanchez). As always, let me know about any comments, corrections or suggestions you have, and I’ll publish them in a future column.

Open-Source vs. Commercial Shoot-Out

By Jeffrey Carl, Matt Loschert and Eric Trager

Web Hosting Magazine, May 2001

Web Hosting Magazine represented the high water mark of the market for print publications around the Internet Service Provider industry. In other words, it was probably my last-ever opportunity to get a paid gig writing for something that would appear physically on a newsstand. It was intended to be more focused than Boardwatch and, according to Wikipedia, “was written and edited with a deliberately ‘edgy’ style, designed to appeal to the primarily young and male constituency of the ISP industry.” Matt Loschert, Eric Trager and I contributed a few long-form articles before the magazine – like the rest of the telecom industry at the end of the dot-com era – vanished without much of a trace.

Welcome to part one of the Open Source vs. Commercial Server Software Shootout. This time around, we’ll be looking at Web Server Software (Microsoft IIS vs. Apache); Site Scripting (Microsoft ASP vs. PHP); and Microsoft Web Compatibility (FrontPage extensions and ASP options). Before you send nasty response e-mails (these, especially nasty threats, should be addressed to [email protected]), keep in mind that this is merely the advice of three long-time server admins; your mileage may vary. Tax, tags, title and freight not included.

Web Server Software

Overview:

The breakdown between commercial and open-source web server engines is more about the operating system than the software itself. The most popular commercial option is Microsoft’s IIS, or Internet Information Server. It runs only on commercial operating systems: Windows NT, or the more robust Windows 2000. The far-and-away leader in open-source server options is Apache, and, while it features a version that runs in the Windows server environment, its flagship server runs under even obscure flavors of Unix and has no real competition. 

The battle of IIS vs. Apache in raw performance evaluations is as tangled as the Linux vs. NT benchmarks are. Depending on whose tests or opinion you listen to, you could hear everything from “Apache is blown away at heavy web loads” to “IIS all-around blows goats.” However – all OS and configuration issues aside – it’s probably safe to say that IIS performs better with heavy loads of static pages and Apache tends to handle lots of dynamic DB/CGI stuff better (at low loads, there’s little performance difference to speak of). Keep in mind, though, that because IIS is so tied to Windows and the Apache/Win32 version lags behind its Unix counterparts, it’s unlikely that there will ever be a really even contest between the two.

Since IIS is the clear leader with Windows servers and Apache is by far the most popular choice for Unix foundations, the comparisons here are largely about the duets of OS and webserver. Such a comparison is appropriate, considering how critical the operating system is in funneling the horsepower to a server with hundreds if not thousands of simultaneous clients (we hope your state motor vehicles department is listening).

 

Commercial Option: Microsoft Internet Information Server 5.0

Platforms: Windows NT/2000

Publisher: Microsoft, www.microsoft.com/windows2000/library/howitworks/iis/iis5techoverview.asp

IIS comes free when Windows 2000 Server is acquired, and was available as part of an option pack for Windows NT Server. So IIS is technically free, but if your goal is to run a webserver and you choose IIS, you’re paying for it by choosing NT or Windows 2000. NT still starts at around $500 for the base server package, while Windows 2000 goes for around the same price – and that’s with only a five-client license. That license permits unlimited web hits, but only for unauthenticated requests; simultaneous accesses under a username/password are limited to the number of users permitted by the license (which really cramps your style if you’re running adult-oriented membership houses).

There are advantages to using IIS. One is a relatively simple installation and set-up process. Using an all-too-familiar GUI, a few points and clicks will have IIS serving pages on your Windows system in no time. Once running, IIS is administered with the same familiar Windows GUI and allows for easy access to run-time information including current access statistics and resource consumption. IIS also includes some niceties Apache doesn’t offer, such as making it easy to throttle bandwidth for individual hosts. The IIS setup allows file security to be set easily through the GUI (Apache allows some of this, but not as many options as IIS, and relies on Unix file security options for the rest). Also, it has a clear advantage in dealing with sites created using Microsoft tools (see Microsoft Web Compatibility below).

Using IIS also has its drawbacks. When looking at the open-source model as another option, the potential for security issues with IIS has to make you sleep less easily. Open-source bugs are usually squashed pretty quickly when thousands of developers can examine the insides of a program. Microsoft has had issues in the past with addressing revealed exploits in an (un-)timely fashion, and there’s no reason to believe that IIS is any more immune than Internet Explorer or Windows itself.

Second, because it is tied to Microsoft’s server OSes, you’re … well, you’re tied to Microsoft’s server OSes. We don’t have the several thousand pages it would take to get into this, so we’ll leave it as a matter of preference. If it’s “the right tool for the job” for you, so be it. 

ALTERNATIVES: There are plenty of other commercial web server software options. The leader in market share after IIS, as measured by Netcraft (www.netcraft.co.uk/survey/), is iPlanet Enterprise Edition 4.1 (www.iplanet.com/products/infrastructure/web_servers) – the former Netscape server, now a Netscape/Sun collaboration – which starts at $1,495; the iPlanet FastTrack edition is available for free. Zeus (www.zeus.com) is widely known as a heavily optimized, very speedy server for Unix platforms, and starts at $1,699.

WebLogic (www.bea.com/products/weblogic/server) is a popular application server that costs some amount of money, which I wasn’t able to figure out by reading their website. WebSite Professional (website.oreilly.com), the “original Win32 webserver,” starts at $799. And, for those brave enough to run a webserver on MacOS 8/9, there’s WebStar (www.webstar.com/products/webstar.html), which starts at $599.

Open-Source Option: Apache 1.3.12

Platforms: AIX, BSD/OS, DEC Unix, FreeBSD, HP/UX, Irix, Linux, MacOS X, NetBSD, Novell NetWare, OpenBSD, OS/2, QNX, Solaris, SCO, Windows 9x/NT/2000 (plus many others; see www.apache.org/dist/binaries for a list of supported platforms with pre-compiled binaries)

Publisher: The Apache Software Foundation, www.apache.org/httpd.html

The Netcraft (www.netcraft.co.uk/survey/) survey listed the Apache webserver as running 61 percent of the sites reporting. While Apache’s awesome domination has slipped a little in recent years, it still holds the majority and runs well ahead of IIS, which has (as of this writing) a relatively paltry 19.63 percent of the market. Why does Apache have such a Goliath-like grasp on the field of webservers in use?

The reasons are numerous and simple. First, Apache is free. Consider that the business of the Internet is still relatively young and, therefore, not terribly consolidated… yet. There are quite a few small hosting providers that need powerful web servers, and Apache presents an obvious choice, running on equally inexpensive operating systems like Linux and FreeBSD. Consider also that many businesses host sites remotely with those same providers, and Apache has traditionally been the easiest server to configure and maintain remotely (through Telnet/SSH).

Second, Apache is stable and powerful. In years of work on both lightly and heavily loaded Apache servers on different operating systems, I’ve had to work very hard to create a scenario that choked the server. The only problem that consistently knocked down Apache was the human one: like any program, Apache can’t work around typos in its configuration file, and it sometimes isn’t too descriptive about what’s wrong, even through its apachectl configtest mechanism (although you’d be amazed at what you can get away with). Also, while IIS allows easy setup of most security permissions through its GUI, it can’t compete with the easy flexibility of Apache’s htaccess feature (especially when doing things like limiting access to a host with password permissions).
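As a quick illustration of that flexibility, here’s a minimal sketch of password-protecting a directory with htaccess. The paths and username are hypothetical, and your server’s AllowOverride setting must permit AuthConfig for the .htaccess file to take effect:

    # Create a password file (-c creates it; omit -c to add more users later)
    htpasswd -c /usr/local/apache/passwd/members alice

    # Drop an .htaccess file into the directory you want to protect
    cat > /usr/local/www/site/members/.htaccess <<'EOF'
    AuthType Basic
    AuthName "Members Only"
    AuthUserFile /usr/local/apache/passwd/members
    Require valid-user
    EOF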

The Apache system is modular in nature as well. The initial package has what is necessary to get a web server online, but to accommodate special functions like FrontPage compatibility, persistent CGI, on-the-fly server status or WebDAV, server modules can be downloaded and installed to be incorporated into the Apache daemon. Apache maintains a directory of resources on available modules at www.apache.org/related_projects.html#modulereg. The benefit is a streamlined program by default for what the majority of users need, and the ability to fatten it up for special functions only when necessary.
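For DSO-enabled Apache 1.3 builds, adding a third-party module can be as simple as the sketch below. The module name is hypothetical; apxs (the extension tool shipped with Apache) handles the compiling, installing and httpd.conf editing:

    # Compile (-c), install (-i) and activate (-a) a module in one pass;
    # -a adds the LoadModule line to httpd.conf for you
    apxs -c -i -a mod_example.c

    # Restart gracefully so the new module is picked up
    apachectl graceful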

Apache has a few disadvantages. It isn’t nearly as easy to install as IIS; but for anyone familiar with the administrative aspects of Unix systems (and that base is growing fast with Linux’s ballooning popularity), it’s rather simple to obtain, install, and configure. Support for Apache is free, but not as easy to obtain as picking up the phone and calling Microsoft. Of course, considering the user base and its roots in a largely helpful Unix culture, plenty of assistance is there if you’re willing to look (and, please, look first). Lastly, Apache’s performance in some situations isn’t all it could be – hopefully the impending arrival of the heavily-rewritten Apache 2.0 will fix much of that.

ALTERNATIVES: Popular (relatively speaking) alternatives include AOL’s in-house (but open-source) AOLserver (www.aolserver.com); and the tiny but very fast thttpd (www.acme.com/software/thttpd).

Final Judgement:

Even the O.J. Simpson jury wouldn’t have much problem with this one. Apache takes the lead in so many categories that it’s not much of a contest. The user statistics don’t lie, either: in terms of servers out there, Apache beats IIS by three to one. While on the surface this could be attributed to its cost (zero), remember that IIS is also free if you already have Windows NT/2000 (you did pay for that, right?). Ignoring that factor, Apache is generally regarded to be superior to IIS in reliability and overall performance, which, with a price tag of zero and a support network of users across the globe, makes it an extraordinary value.

IIS has some bright points, which is important if you’re locked into running your webserver on a Microsoft platform. If Microsoft is your exclusive IT vendor, if you’re hosting sites created with Microsoft tools or using third-party apps only available for Windows, or if you have lots of Win NT/2K experience and little or no desire to learn Apache and Unix, IIS is your clear choice. Otherwise, even if Microsoft platforms run your IT infrastructure, administrative functions, etc., it may be well worth it to purchase additional systems running Unix, just so your web functions can be handled by Apache.

Microsoft FrontPage Compatibility

For a web hoster, it’s becoming increasingly imperative to be able to provide compatibility for the primary issues associated with sites designed using Microsoft tools: support for FrontPage (Microsoft’s website-building tool, included with Office 2000) and Active Server Pages (Microsoft’s site scripting/application development framework with ODBC database connectivity). On Windows, the commercial option of Microsoft Internet Information Server (IIS) enjoys an obvious advantage, since it’s playing on its “home field,” built by its “home team.” Still, there is a free FrontPage alternative willing to put up a fight on Unix, as well as open-source and commercial alternatives for ASP and ODBC on Unix and Windows.

Commercial Option: Internet Information Server 5.0

Platforms: Windows NT, Windows 2000

Publisher: Microsoft, www.microsoft.com/windows2000/library/howitworks/iis/iis5techoverview.asp

Basically, IIS is the software that FrontPage and ASP were built to run with. Windows 2000 Server (which includes IIS 5) runs for about $950 – $1000 with a 10-client license. If you’re hosting on a Windows NT/Windows 2000 platform, this is a no-brainer. If, however, you’re hosting in a Unix (or anything else besides WinNT/2K) environment, IIS simply isn’t an option. There are no open-source alternatives I’m aware of for Apache/Win32.

Under Windows 2000, setup of IIS Server Extensions is pretty simple. After creating a new Web Site in the IIS administration interface, right-click on a site and select “Install Server Extensions.” A wizard will guide you through setting up the extensions for that host, including e-mail options (which you can set for that server or another mail host). After installation, you can set properties for that host’s server extensions through the “Properties” dialog box by right-clicking on the host.

Once they’re set up, IIS (at least in my experience) supports FrontPage extensions and ASP scripts very well and without much need for special administration. If (when) you experience problems, you should check out the FrontPage Server Extensions Resource Kit (SERK) at officeupdate.microsoft.com/frontpage/wpp/serk/. Note that IIS 5 also provides support for Active Server Pages and ODBC (see below) on the Windows platform.

Free Software Option: FrontPage Extensions 4.0 for Unix

Platforms: AIX, BSD/OS, FreeBSD, HP/UX, Irix, Linux, Solaris/SPARC, Solaris/x86, Tru64 (Digital)

Publisher: Ready to Run Software, www.rtr.com/fpsupport

The FrontPage Server Extensions for Unix (FPSE-U) version 4.x (version 3 was for FrontPage 98, 4.x is for FrontPage 2000) are available only for Apache on Unix platforms. These consist of an Apache module (mod_frontpage) and special binary server extensions that duplicate the publishing and server-side features offered by FrontPage. Note that I say this is a “free” software alternative, since it doesn’t cost you anything, but it isn’t open-source (sorry, Richard Stallman).

To install the FPSE-U, go to the Ready to Run Software site mentioned above, and download the extensions for your platform. These contain the FrontPage Server Extensions, the Apache module, and two installation scripts: fp_install.sh (to install the extensions on the server and for each required host/virtual host) and change_server.sh (to replace your Apache installation with a default one including the FP module). I strongly recommend that instead of running change_server.sh, you download the “Improved mod_frontpage” (home.edo.uni-dortmund.de/~chripo) and then compile your own Apache (following the instructions at www.rtr.com/fpsupport/serk4.0/inunix.htm#installingtheapachepatch), which will allow you to customize it and include the modules of your choice.
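In rough outline, the install looks something like the sketch below. The tarball name is a placeholder – grab the one matching your platform from the RTR site – and fp_install.sh will prompt you for details about your server and virtual hosts:

    # Unpack the extensions somewhere convenient
    tar xzf fp40.linux.tar.gz

    # Run the installer; it prompts for the server root, host names
    # and an administrative user for each host/virtual host
    ./fp_install.sh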

Once you’ve installed it, the FPSE-U generally work fine. You’ll need to learn to use the FPSE-U administration tool, fpsrvadm.exe (check the FPSE-U SERK for documentation), usually installed into /usr/local/frontpage/currentversion/bin/, which generally isn’t much of a hassle. Note that I say “generally,” since (in my experience, at least) the FPSE-U have a habit of occasionally throwing weird errors that require a reinstall (especially as you try to support more FP users on Unix with more advanced requirements).

For troubleshooting, check out the RTR FrontPage 2000 documentation (www.rtr.com/fpsupport/faq2000.htm) and their message boards (www.rtr.com/fpsupport/discuss.htm) as well as the official FrontPage SERK (listed above). Note that the FPSE-U don’t provide any sort of compatibility for ASP or ODBC connectivity to Microsoft Access databases (which IIS/Win32 does, as do the ASP alternatives listed below).

Final Judgement:

This simply comes down to your choice of platform. If you’re running Windows NT/2000 Server, there’s no reason (you’ve already paid for it) not to use IIS for this. If you’re running any version of Unix, the free software alternatives are all you have. If you’re running Apache for Win32 … well, why are you running Apache for Win32?

Active Server Page (.asp) Support on Unix

Commercial Option: Chili!Soft ASP 3.x

Platforms: AIX, Solaris, OS/390, HP/UX, Linux

Publisher: Chili!Soft, www.chilisoft.com/chiliasp/default.asp

ASP on Unix is already playing at a disadvantage, but for those wanting to support it on Unix hosting platforms (which have their own long list of benefits), it’s a necessary evil to deal with. Chili!Soft’s ASP product is relatively expensive ($795 for a single-CPU Linux or NT license; $1995 for Solaris, HP/UX or AIX). However, it seems to be reliable enough that IBM and Cobalt (recently purchased by Sun) have agreed to make Chili!Soft’s product an option on their server offerings; and, from most reports, it seems to work as advertised. 

For all supported platforms, Chili!Soft offers a fully functional 30-day trial version at www.chilisoft.com/downloads/default.asp. Installation may require some pain-in-the-butt steps, but these are documented pretty well by going to www.chilisoft.com/caspdoc and following the directions for your platform of choice. After installation, things usually go pretty much as expected.

For support, you can check out their well-maintained forums (www.chilisoft.com/newforum), which offer frequent responses from Chili!Soft support employees, read their other documentation resources (www.chilisoft.com/support/default.asp), or call a phone number (for certain paid support packages).

ALTERNATIVES: These include HalcyonSoft’s Instant ASP [iASP] (www.halcyonsoft.com/products/iasp.asp), which is a Java-based solution that will run on any platform which supports a Java Virtual Machine (JVM). HalcyonSoft seems very committed to making their product available for as many platforms as possible, and offers a free “developer version” download. Note that for Win32 platforms, your commercial alternatives include Microsoft IIS, Chili!Soft ASP, iASP, and others.

Open-Source Option: Apache::ASP

Platforms: Any Unix platform supported by mod_perl (perl.apache.org)

Publisher: NodeWorks, www.nodeworks.com/asp

Apache::ASP is a Perl-based Apache module that works with mod_perl (with the common CGI.pm Perl module to handle uploads, and Perl’s DBI.pm/DBD.pm modules for connectivity to databases) to deliver ASP functionality. This module is heavily dependent on Perl, and the better you are with Perl, the better you will be with using and debugging Apache::ASP.

To install, find the latest version of Apache::ASP at http://www.perl.com/CPAN-local/modules/by-module/Apache/. Run perl on the Makefile.PL, then run ‘make’ and ‘make install’ on the module (for more help, see www.nodeworks.com/asp/install.html). Additional resources for using Apache::ASP can be found at its homepage.
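Spelled out as shell commands, a typical install from the unpacked distribution (assuming mod_perl is already built into your Apache) looks like this; the httpd.conf stanza shown in the comments follows the pattern in the Apache::ASP documentation, with a hypothetical Global directory:

    # Build and install the module
    perl Makefile.PL
    make
    make test
    make install

    # Then hand .asp files to Apache::ASP via mod_perl in httpd.conf:
    # <Files ~ (\.asp$)>
    #     SetHandler perl-script
    #     PerlHandler Apache::ASP
    #     PerlSetVar Global /tmp/asp
    # </Files>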

For support, try checking out the Apache::ASP FAQ at www.nodeworks.com/asp/faq.html, or the mod_perl mailing list archives at www.egroups.com/group/modperl.

ALTERNATIVES: For more information on ODBC database connectivity through Perl’s DBI/DBD, see www.symbolstone.org/technology/perl/DBI (free) or www.openlinksw.com (commercial). 

Final Judgement:

Since Apache::ASP works through Apache’s Perl module, be aware that its performance should be good but, under heavy server loads, is unlikely to match ASP support through a native binary (a similar concern applies to Halcyon’s Java-based iASP product). Apache::ASP is a good choice for basic or low-traffic installations, but heavy usage, absolute mission-criticality or the need for commercial-quality support will probably demand the Chili!Soft product (iASP provides commercial-quality support as well).

Site Scripting Options

Commercial Option: Active Server Pages (ASP)

Platforms: Windows NT/2000, Windows 9x

Publisher: Microsoft, http://support.microsoft.com/support/default.asp?PR=asp&FR=0&SD=GN&LN=EN-US

Open Source Option: PHP Hypertext Preprocessor 4

Platforms: Most Unix platforms, Windows NT/2000, Windows 9x (See Apache platform information above for details)

Publisher: PHP Project, http://www.php.net

Overview:

As the Internet and the web have matured, CGI and dynamic content have rapidly replaced static HTML content for many sites. Unfortunately, as most developers will tell you, the increasingly complex requirements for dynamic sites have exposed many of the limitations of traditional CGI. However useful CGI was in extending the old infrastructure, it falls way short of current requirements for dynamic site design.

Quite a few software products attempt to address the shortcomings of CGI and the desires of developers, among them Microsoft ASP, Allaire ColdFusion, PHP, Apache mod_perl, and JSP.

We’ll take a look at ASP and PHP, two popular scripting packages that, at first glance, appear to be very different beasts; but on closer examination, show a striking number of similarities. 

ASP/PHP Common Features:

Script-Handling in the Web Server

One of the problems with CGI is that script execution usually requires the web server to spawn a new process in order to run the CGI script. The overhead of process creation is very high on some platforms. Even on platforms where it is efficient, CGI process creation can devastate performance on even moderately busy sites.

ASP and PHP have both sidestepped this problem by allowing the web server to handle the script itself. Neither one requires the web server to create an external process to handle a dynamic request. The script execution functionality resides in a module that is either built-in or dynamically loaded into the server on demand. The ASP functionality is part of Microsoft IIS, while PHP may be compiled directly into a variety of web servers including Apache and IIS. Although ASP and PHP handle this differently, the end result is more or less the same.
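For example, building PHP into Apache as a module means the interpreter is loaded once with the server rather than spawned per request. A sketch of the PHP 4 build (paths and configure options vary by system):

    # Build PHP 4 as an Apache DSO; --with-apxs points at your apxs binary
    ./configure --with-apxs=/usr/local/apache/bin/apxs --with-mysql
    make
    make install

    # Then map .php files to the in-server interpreter in httpd.conf:
    # AddType application/x-httpd-php .php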

Session Handling

Due to the “stateless” nature of HTTP and CGI, it’s very difficult to “follow” a site visitor and keep track of his/her previous selections. As a kludge, developers have often posted data from page to page in hidden HTML form fields. This is neither convenient nor fail-safe, since it’s unwieldy and data can be lost between pages or compromised by malicious users.

ASP and PHP both have added support for “sessions.” A session allows the web server to simulate a continuous interaction between the user and the server. The server assigns the user a form of identification, and then remembers it and the data attached to it until the next request from that user arrives. Sessions minimize the amount of housekeeping work the developer must do to track the user across page requests. Again, ASP and PHP both provide good facilities for this.
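On the PHP side, a session-backed hit counter is about as small as the idea gets. This is a hypothetical page written out via a heredoc, using PHP 4-era calls (session_register() persists the named global between requests, assuming the era’s default register_globals setting):

    cat > counter.php <<'EOF'
    <?php
    session_start();              // create or resume this visitor's session
    session_register("hits");     // persist the global $hits across requests
    $hits++;
    echo "You have viewed this page $hits time(s).";
    ?>
    EOF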

Persistent Database Connections

Database access is a critical element for dynamic sites, providing a way to store and retrieve vast amounts of data that can be used to display tables of information, store user information/preferences, or control the operation of a site. A traditional CGI script has to connect to the database for each web request, query it, read the data returned, close the connection, and finally output the data to the user. If that sounds inefficient, it’s because it is.

To address this problem, ASP and PHP both support “persistent” database connections. In this environment, the web server creates a database connection the first time a user makes a request. The connection remains open, and the server reuses it for subsequent requests that arrive. The connection is closed only after a set period of inactivity (or when the web server is shut down). This is possible because the script execution environment remains intact inside the web server, as opposed to CGI where it is created and destroyed with the execution of each web request.

Again, this feature is a win for both ASP and PHP. Their syntaxes (though different) are simple and require no relearning of one’s database development knowledge.
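As a sketch of the PHP side (the database, table and credentials are made up), note that the only change from a per-request connection is the function name – mysql_pconnect() instead of mysql_connect():

    cat > news.php <<'EOF'
    <?php
    // Reuses an already-open connection to this host/user if one exists;
    // otherwise opens one that outlives this request
    $db = mysql_pconnect("localhost", "webuser", "secret");
    mysql_select_db("site", $db);
    $result = mysql_query("SELECT title FROM articles", $db);
    while ($row = mysql_fetch_array($result)) {
        echo $row["title"], "<br>\n";
    }
    ?>
    EOF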

Facilities for Object Design

A lot of useless information about Object-Oriented (“OO”) programming has been written, much of it presumably by people who either 1.) didn’t know what they were talking about, or 2.) were under the influence of Jack Daniel’s and horse tranquilizers. Here’s the real deal: OO design and programming allows you to encapsulate ideas in code through the combination of data and related data manipulation methods. Traditional programming has focused on the procedure that a program should follow, with data used as a “glue” to store information between and during procedures.

OO design focuses instead on the data and how it should act under different circumstances. This is done by housing data and function in distinct packages, or “objects,” that can’t interfere (except in predefined ways) with other objects. It is not a cure-all for poor programming, but it is a facility that allows for the creation of better software through good design and programming techniques. ASP and PHP both support the OO notion of classes (the definition of an object) in order to develop applications that are more modular, robust, and easier to maintain.
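A tiny, hypothetical PHP 4 class shows the idea – the data (the item list) travels with the only functions allowed to touch it:

    cat > cart.php <<'EOF'
    <?php
    class Cart {
        var $items = array();            // data lives inside the object

        function add($sku, $qty) {       // ...and changes only via methods
            $this->items[$sku] = $qty;
        }
        function count_items() {
            return count($this->items);
        }
    }

    $cart = new Cart();
    $cart->add("BSD-TSHIRT", 2);
    echo $cart->count_items();           // prints 1 (one distinct item)
    ?>
    EOF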

Access to External Logic Sources

Many designers would argue that for a truly complex installation, the site’s application or business logic should consist of objects completely external to the website and its presentation facilities. This is highly desirable since it allows you to build business objects that can be used for more than just your website and can be maintained apart from your site presentation code.

ASP and PHP allow you to access two types of external logic sources. The first, COM, is a Windows-based object framework. The second, CORBA, is a generalized network-transparent, architecture-neutral object framework. This support, especially of CORBA, is a huge win for both ASP and PHP.

ASP Features:

Ability to Utilize Different Programming Languages

When designing ASP, Microsoft made the decision not to tie the technology to a particular programming language. Although VBScript is what most people think of when they hear ASP, other popular language options include Microsoft’s JScript and PerlScript. This flexibility is especially useful if you or your development team has previous experience with one of these languages and your schedule precludes learning a new language for your current project.

Tight Integration with other Microsoft Software and Tools

For some, the ability to use Microsoft tools is a blessing; for others, a curse. For better or for worse, ASP is tightly integrated with other Microsoft tools. If you are comfortable with a Microsoft development environment, this could easily decide your scripting package for you. On the other hand, this tight integration can turn into a nightmare if you encounter insurmountable platform issues, as many Microsoft Windows tools do not have analogues on other platforms.

PHP Features:

Written Specifically As a Web Scripting Language

PHP benefits from being written from the ground up as a web-scripting language. It is tight and targeted, without the bloat that a general purpose programming language would have. This results in a package that is extremely stable, has a small memory footprint, and is lightning fast.

Integrates With Many Platforms and Web Servers

Version 4 of PHP was written using a new generalized API, which allows it to be ported more easily to other architectures and integrated into a variety of web servers, including Apache, Roxen, thttpd, and Microsoft’s IIS. It also is available for just about every type of UNIX as well as Windows 9x/NT/2000.

Final Judgement:

Both PHP and ASP have addressed many of today’s most critical web development needs. Both offer a ton of features, easy interfacing with all sorts of databases, and the facilities to create complex applications. For many, the decision will come down to a preference for one language or another, depending on what he/she has experience with already.

If the programming language is not a decision-making factor, the decision will probably come down to a choice of hosting platform. If you host on Windows, IIS/ASP is the obvious choice because of their tight integration. PHP operates quite nicely under Windows, but who in his/her right mind would turn down all of the resources Microsoft provides when working on its own platform? On the other hand, if you’re Unix-savvy, the decision to go with PHP is just as simple. PHP was born on Unix and built on top of Apache.

In the end, the question may not really be which solution is better, but which better fits the resources and needs of your project.

————————————————————————————————-

About the Authors:

Jeffrey Carl (Microsoft Web Compatibility) is the Linux/BSD columnist for Boardwatch Magazine, and has written for the Boardwatch Directory of Internet Service Providers, CLEC Magazine and ISPworld. His writings have been featured on Daemon News, Linux Today, and Slashdot. He saw the plane dragging the “Web Hosting Magazine Kicks Ass” banner over the Spring 2000 ISPCON and agreed.

Matt Loschert (Site Scripting) is a web-based systems developer with ServInt Internet Services. Although he works with a variety of systems, programming languages, and database packages, he has a soft spot in his heart for FreeBSD, Perl, and MySQL. He begged Jeff to let him pretend to be a writer after hearing how cool Web Hosting Magazine was.

Eric Trager (Web Servers) currently serves as managing director for ServInt. With a background in personnel and operational management, Mr. Trager oversees ServInt’s daily business functions. His technical background includes four years in Unix server administration. He’s never read Web Hosting Magazine, but Matt and Jeff told him it was cool, so he foolishly believed them and wrote an article.

Linux 2.4: What’s in it For You?

By Jeffrey Carl

Boardwatch Magazine
Boardwatch Magazine, April 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche but influential following among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

In a Nutshell: Linux’s day as a scalable server is here. The long wait for Linux kernel 2.4 made its release seem somewhat anti-climactic, so many of its new features  have gone largely unnoticed. Although many of the changes were available as special add-ons to 2.2.x kernels before, the 2.4 kernel wraps them all together in a neat package – as well as integrating a number of great new features, notably in the networking, firewall and server areas. 

Bringing Everybody Up to Speed

If you’re at all familiar with the Linux kernel and its upgrade cycle, you can skip the next several paragraphs and go on to make fun of the technical inaccuracies and indefensible opinions in the rest of the column. Everybody else should read these introductory paragraphs, and only then should they go on and make fun of the rest of the column.

The Linux kernel is the foundation of the Linux operating system, handling all of the low-level “dirty work”: managing processes and memory, I/O to drives and peripherals, networking protocols and other goodies. The capabilities and performance of the kernel in many ways circumscribe the capabilities and performance of all the programs that run on Linux, so its features and stability are critical.

In Linux, odd-numbered “minor” version numbers (like x.3) are unstable, developmental versions where adding cool new features is more important than whether or not they make the system crash. New features are tested in these developmental kernels and, once the bugs are worked out, are “released” as the next highest even-numbered minor version (like x.4), which is considered stable enough for normal users to run. Linux kernel 2.2.x was the “stable” kernel (while kernel 2.3.x was the developmental version) from January 1999 to January 2001, when the new stable kernel became the very long-awaited 2.4.x (as of this writing, the most recent version was 2.4.2).

What’s New, Pussycat?

The changes since Linux kernel version 2.2 largely reflect the expansion of Linux as it comes to be used on an ever-wider variety of hardware and for different user needs. The new kernel wraps in features required to run on tiny embedded devices and multi-CPU servers as well as traditional workstations. Improving Linux’s multiprocessing capabilities also meant cleaning up a lot of other kernel parts so they can take advantage of (and stay out of the way of) multiple processors. To expand Linux’s acceptance in the consumer marketplace, it includes drivers for a large number of new devices. And to hasten Linux’s acceptance in the server market (especially on the high end), its networking performance has been enhanced – notably in areas where earlier benchmarks had shown it losing to Microsoft’s Windows NT/2000.

High-end server vendors (most notably IBM) embracing Linux have pushed for the kernel to include the features that would make it run reliably on high-end hardware. In addition to all the CPU types supported by Linux 2.2, the Intel ia64 (Itanium) architecture is now supported, as are the IBM S/390 and Hitachi SuperH (Windows CE hardware) architectures. There are optimizations not only for the latest Intel x86 processors but also for their AMD and Cyrix brethren, plus Memory Type Range Register (MTRR/MCR) support for these processor types. Support for Transmeta Crusoe processors is built in (as you would expect, with Linus Torvalds being an employee of Transmeta). Whereas kernel 2.2 scaled well only up to four processors, 2.4 supports up to 16 CPUs.

As part of the “scalability” push, a number of previous limitations have been removed in kernel 2.4. The former 2 GB size limit for individual files has been erased. Intel x86-based hardware can now support up to 4 GB of RAM. One system can now accept up to 16 Ethernet cards, as well as up to 10 IDE controllers. The previous system limit of 1024 threads has been removed, and the new thread limit is set at run time based on the system’s amount of RAM. The maximum number of users has been increased to 2^32 (about 4.2 billion). The scheduler has been improved to be more efficient on systems with many processes, and the kernel’s resource management code has been rewritten to make it more scalable as well.

Improved Networking

Kernel 2.4’s networking layer has been overhauled, with much of the effort going into making the improvements necessary for dealing efficiently with multiprocessing. Improved routing capabilities have been added by splitting the network subsystem into improved packet filtering and Network Address Translation (NAT) layers; modules are included to provide backward compatibility with kernel 2.0 ipfwadm and kernel 2.2 ipchains applications. Firewall and Internet protocol functions have also been added to the kernel.

Linux’s improved routing capabilities make use of a package called iproute2. They include the ability to throttle bandwidth for or from certain computers, to multiplex several servers as one for load-balancing purposes, or even to do routing based on user ID, MAC address, IP address, port, type of service or even time of day.
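To give the flavor of it, here’s a hedged sketch using the iproute2 tools – the device names, rates and addresses are examples only:

    # Throttle outbound traffic on eth0 to 256 kbit/sec with a token bucket
    tc qdisc add dev eth0 root tbf rate 256kbit burst 10k latency 50ms

    # Policy routing: send traffic from one source address out a
    # different gateway, via its own routing table
    ip rule add from 10.1.2.3 table 10
    ip route add default via 192.168.1.254 table 10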

The new kernel’s firewall system (Netfilter) provides Linux’s first built-in “stateful” firewalling system – one that remembers the state of previous packets received from a particular IP address. Stateful firewalls are also easier to administer with rules, since they automatically exclude many more “suspect” network transactions. Netfilter also provides improved logging via the kernel log system: it can automatically flag things like SMB requests coming from outside your network, set different warning levels for different activities, and send items at a given warning level to a different destination (like sending some logging activity directly to a printer, so the records are physically untouchable by a cracker who could erase the logfiles).

The system is largely backward-compatible, but it now allows Netfilter to detect many “stealth” scans (say goodbye to hacker tool nmap?) that Linux firewalls previously couldn’t detect, and blocks more DoS attacks (like SYN floods) by intelligently rate-limiting user-defined packet types. 
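A minimal sketch of both ideas in iptables syntax – the rules are simplified for illustration, and a real firewall needs a full policy around them:

    # Stateful rule: admit packets belonging to connections we initiated
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

    # Rate-limit new TCP connection attempts to blunt SYN floods
    iptables -A INPUT -p tcp --syn -m limit --limit 5/second --limit-burst 10 -j ACCEPT
    iptables -A INPUT -p tcp --syn -j DROP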

Under kernel 2.2 (using a model that is standard across most Unix variants), all Unix network sockets waiting for an event were “awakened” when any activity was detected – even though the request was addressed to only one of those sockets. The new “wake one” architecture awakens only one socket, reducing processor overhead and improving Linux’s server performance.

A number of new protocols have been added as well, such as ATM and PPP-over-Ethernet support. DECnet support has been added for interfacing with high-end Digital (now Compaq) systems, along with support for ARCNet protocols. Support for the Server Message Block (SMB) protocol is now built-in rather than optional. SMB allows Linux clients to share files with Windows PCs, although the popular Samba package is still required for the Linux box to act as an SMB server.

Linux 2.4 has a web server called khttpd that integrates web serving directly into the kernel (like Microsoft’s IIS on WinNT or Solaris’s NCA). While not intended as a real replacement for Apache, khttpd’s ability to serve static-only content (it passes CGI or other dynamic content to another web server application) from within the kernel memory space provides very fast response times.
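khttpd is controlled through a /proc interface rather than a config file. A sketch of turning it on, following the khttpd documentation shipped with the kernel source (the port and document root are examples):

    # Configure and start the in-kernel web server
    echo 8080 > /proc/sys/net/khttpd/serverport
    echo /var/www > /proc/sys/net/khttpd/documentroot
    echo 1 > /proc/sys/net/khttpd/start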

Get On the Bus

Linux’s existing bus drivers have been improved as part of the new resource subsystem, and there are significant improvements and new drivers (including Ultra-160!) in the SCSI bus support. Logical Volume Manager (LVM), a standard in high-end systems like HP/UX and Digital/Tru64 UNIX that allows volumes to span multiple disks or be dynamically resized, is now part of the kernel. Support is also there for ISA Plug-and-Play, Intelligent Input/Output (I2O, a superset of PCI), and an increased number of IDE drivers.

The device filesystem has been changed significantly in kernel 2.4, and the naming convention for devices has been changed to add new “name space” for devices. These device names will now be added dynamically to /dev by the kernel, rather than all potential device names needing to be present beforehand in /dev whether used or not. While backward-compatibility is intended, this may interfere with some applications (most notably Zip drive drivers) that worked with previous kernel versions.

New filesystems have been added (including a functional OS/2 HPFS driver, IRIX XFS (EFS), NeXT UFS with CD-ROM support, and NFS version 3). Support for accessing shares via NFSv3 is a major step forward, although Linux volumes will still be exported using NFSv2. Linux’s method for accessing all filesystems has been optimized, with the cache layer using a single buffer for reading and writing operations; file operations should now be faster on transfers involving multiple disks.
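Mounting a share over NFSv3 from a 2.4 client is a one-liner; the hostname and paths below are examples:

    # Ask for NFS version 3 explicitly; larger block sizes help throughput
    mount -t nfs -o nfsvers=3,rsize=8192,wsize=8192 fileserver:/export/home /mnt/home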

For the Masses

There are, of course, a large number of updates to Linux that are primarily oriented towards the desktop (rather than server) user. A generic parallel port driver has been added which enables abstract communication with devices; this can be used for things like Plug-and-Play (PnP) polling or splitting the root console off to a parallel port device (like a printer). The new Direct Rendering Manager (DRM) provides a “cleaned-up” interface to the video hardware and removes the crash-inducing problem of multiple processes writing to a single video card at once.

There are a wide variety of new graphics and sound card drivers (including support for speech synthesizer cards for the visually impaired). The UDF filesystem used by DVDs has been added, and infrared (IrDA) port support is also included. Support for Universal Serial Bus (USB) has been added but isn’t yet perfect (although, IMHO, whose USB implementation is?). PCMCIA card support for laptops is now part of the standard kernel rather than requiring a custom kernel version, but an external daemon is still required for full support. FireWire/i.Link (IEEE 1394) support is there, as well as IEEE 802.11b wireless (Apple’s “AirPort,” Lucent’s “WaveLAN”).

Probably the most far-out “consumer”-level enhancement is that kernel 2.4 has added support for the rare infrared RS-219 standard, a management interface used by specialized remote controls for Mobil and Amoco station (and some others) car washes! With the optional xwash software package, this can actually be used (on a laptop) to send signals for a “free” carwash. 

I’m kidding about that last one.

Is Anything Missing?

The 2.4 kernel itself does not have encryption technology built into it; that’s probably a wise decision, based on the various cryptography regulations of countries worldwide that might actually make it prohibitive to export or import the Linux kernel. Unlike the 2.2 kernel which included Java support automatically, you must specifically include it when building a 2.4 kernel.

Although Journaling File System (JFS) efforts have been underway for a while, their maturity was not sufficient for inclusion in kernel 2.4. Journaling filesystems – a major requirement for true mission-critical servers – record (“journal”) all of their operations (analogous to a transactional database), so advanced data recovery operations (such as after a crash or power loss during read/write operations) are possible. See IBM’s open-sourced JFS project (http://oss.software.ibm.com/developerworks/opensource/jfs/?dwzone=opensource) for more information and software availability.

For Mac Linux users, support for newer Mac Extended Format (HFS+) disks has not yet been added. As of this writing, the NTFS (Windows NT/2000 file system) driver can read but not write data from within Linux. Alas, support for Intel 8086 or 80286 chips is not present either.

Lastly, you should not immediately assume that things that worked with kernel 2.2 will always work with 2.4. Changes in the device filesystem and the block device API (block devices are non-serial devices – i.e., hard disks or CDs that can have any sector accessed randomly rather than receiving input in order) may break compatibility with some existing drivers.

Getting In-Depth with Linux 2.4

In this column, I’ve only been able to touch the surface of the new functionality available in Linux kernel 2.4. The “definitive” (most frequently quoted) analysis of 2.4 kernel changes is an ongoing set of posts to the Linux kernel-developers list by Joe Pranevich. The (currently) most recent version can be found at http://linuxtoday.com/stories/15936.html.

There’s also a good “kernel 2.2 vs. 2.4 shootout” with specific application test results and details at http://www.thedukeofurl.org/reviews/misc/kernel2224 and upgrade instructions for kernel 2.2.x systems at http://www.thedukeofurl.org/reviews/misc/kernel2224/5.shtml.

For an excellent overview of Linux 2.4’s new firewalling capabilities, see the article at SecurityFocus (http://securityportal.com/cover/coverstory20010122.html). For great information on the new network system’s routing capabilities, check the HOWTO (http://www.ds9a.nl/2.4Networking/HOWTO//cvs/2.4routing/output/2.4routing.html). A detailed article on the new security features in 2.4 can be found at http://www.linuxsecurity.com/feature_stories/kernel-24-security.html.

For a more in-depth overview of the general features, read the IBM DeveloperWorks kernel preview part 1 (http://www-106.ibm.com/developerworks/library/kernel1.html) and part 2 (http://www-106.ibm.com/developerworks/library/kernel2.html).

For an interesting comparison between Linux 2.4 and FreeBSD 4.1.1 (ignoring many of the advanced features of the new Linux kernel and concentrating on common tasks), see Byte Magazine’s article (http://www.byte.com/column/BYT20010130S0010).

For the kernels themselves and the most recent change logs, visit http://www.kernel.org/pub/linux/kernel/v2.4/. For ongoing news about Linux 2.4 and life in general, see http://slashdot.org.

Webhosting with Free Software Cheat Sheet

By Jeffrey Carl

Boardwatch Magazine
Boardwatch Magazine, April 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche but influential following among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

So you want to run a webserver without paying a dime for software, eh? Or you want to make sure you have the source code to all your webserving applications in case you get bored and decide to try and port them to your networked office Xerox copier? Well, you’re in luck; webhosting with free (open-source, or free as in “free speech”) and free (no cost, or free as in “free beer”) software isn’t just possible, it also provides some of the best tools out there at any price.

In case you’re new to the game, or you’re looking for alternatives to packages you’re using now, the following is a brief guide to some of the more popular options out there. Trying to condense the features of any OS or application down to a couple of sentences is inherently a dangerous thing, and I’m sure that many fans of the software listed below will disagree with elements of my “Reader’s Digest Condensed (Large Print Version)” summaries. Still, the following – based on my experiences and those of others – should provide at least a basic idea of what’s out there and why you might – or might not – want to choose it.

Operating Systems

Linux vs. BSD:

These OSes show the characteristics of their development styles: BSD was developed by small teams, largely focused on server hardware. Linux has been developed by many more people with wider uses, focusing more on desktop/workstation uses. 

BSD has been around longer and is (in some ways) more optimized for server use. Due to its hype, Linux has many more developers, and almost all new third-party software is available for Linux first. Linux has the edge in user-friendliness, because distributions are targeting new users; BSD is, for the most part, more for the “old school.” Linux has also been adopted by a number of server hardware vendors producing “integrated” solutions.

Ultimately, it’s a matter of what you feel most comfortable with. Either way, with commodity x86 hardware, your server components (RAM, drives, etc.) and network connection will affect your performance much more than your choice of Linux vs. BSD will.

• FreeBSD (www.freebsd.org)

Best known among the BSDs. Concentrates on x86 architecture, server performance, integration of utilities. Standout features include ports collection, sysinstall admin utility, Linux binary compatibility, frequent releases.

• NetBSD (www.netbsd.org)

BSD with a focus on porting it to as many platforms as possible and keeping code portable. Great for using old/odd hardware as a server. Infrequent releases, not as popular as other BSDs.

• OpenBSD (www.openbsd.org)

BSD with a focus on security. Still in the process of line-by-line security audit of the whole OS. Infrequently released, utilities/packages lag behind other OSes because of security audits, but it’s the #1 choice if security is your primary concern.

• Red Hat Linux (www.redhat.com) 

The number one U.S. distro, considered by many (rightly or wrongly) as “the standard.” As a result, it’s what many third-party/commercial Linux apps are tested against/designed for. Early adopter of new features in its releases; is on the cutting edge, but sometimes buggy until “release X.1.” Standout features: Red Hat Package Manager (RPM) installation, third-party support.

• SuSE Linux (www.suse.com)

The number one European distro. Favored by many because its six-CD set includes lots and lots of third-party software to install on CD. Less “cutting-edge” than Red Hat. Standout features include the YaST/YaST2 setup utility and the SaX X Windows setup tool.

• Slackware Linux (www.slackware.com)

Designed for experts: Slackware has no training wheels, and is probably the most “server-oriented” of Linux distros (maybe because of its close relationship to the BSDs). Not cutting-edge, few frills, but designed to be stable and familiar to BSD administrators.

• Linux Mandrake (www.linux-mandrake.com/en)

A solid, user-friendly distribution with good (but not great) documentation. Standout features include the DrakX system configuration utility and the DiskDrake disk partitioning utility.

• Debian GNU/Linux (www.debian.org)

The ideological Linux – totally supported by users rather than a corporation, with only free (as in the GNU definition) software included. This is “ideologically pure,” GNU-approved Linux, but releases are infrequent and it’s not necessarily a good choice for beginners.

• Caldera OpenLinux (www.caldera.com/eserver)

Very user-friendly for new users. Standout features include LIZARD, its setup/configuration wizard.

• Corel LinuxOS (linux.corel.com)

By the time you read this, Corel will have sold its LinuxOS product to someone else, but the distro should remain the same. Ease of use for Windows converts is stressed, includes great SAMBA integration. Good for new users. Focus is mainly on desktop use.

Essentials

• Perl (www.perl.com)

From CGI to simple administration tasks, Perl scripts can cover a lot of territory. Perl is a must, and is practically part of Unix now. Check the Comprehensive Perl Archive Network (www.cpan.org) to find modules to extend Perl’s functionality.

• Perl’s cgi-lib.pl (cgi-lib.berkeley.edu) and/or CGI.pm (stein.cshl.org/WWW/software/CGI)

These are also “must-haves” for CGI scripts, whether you’re writing your own or using scripts found “out there” on the web.

• Sendmail (www.sendmail.org) or Qmail (www.qmail.org)

Free mailservers. Sendmail has the history and the documentation (a good thing, since its internals are famously complex), but Qmail has a less-complicated design, and a strong and growing band of followers.

• wu-ftpd (www.wu-ftpd.org)

A significant improvement in features over the classic BSD FTP daemon – for both BSD and Linux. Despite an older security flaw that was recently exploited by the “Ramen” Linux worm, it’s a very good program.

• OpenSSH (www.openssh.com/)

In this day and age, Telnet has become a liability for security reasons. There’s no reason not to migrate users who need a shell account to SSH. See www.freessh.org for a list of clients.

Web Servers

• Apache 1.3.x (www.apache.org/httpd.html)

The current king of web servers. Very good performance, stable enough to run on mission-critical systems. Very user-friendly to install and configure (due to comments in httpd.conf), but not always as easy as it should be to debug problems.

• Apache 2.x (www.apache.org/httpd.html)

Still in beta development, but may be final by the time you read this. It probably shouldn’t be used for mission-critical systems until it’s had a few months of time “out there” to find bugs after its final release. Version 2.0 will be much easier to add new protocols into (like FTP or WAP), and should have significantly better performance because of its multi-threaded nature.

• Roxen (www.roxen.com/products/webserver)

Roxen is much more than just a simple webserver – it includes its own web admin interface, secure server, and more. Used by real.com; it shows promise but doesn’t yet have Apache’s level of acceptance.

Secure Web Servers

Note: You may receive a “security warning” in most web browsers about your secure certificate if you generate your own secure certificate (free). For a non-free certificate created by an authority that most web browsers will accept without a warning, see VeriSign (www.verisign.com/products/site/ss/index.html), Thawte (www.thawte.com), Baltimore (www.baltimore.com/cybertrust), ValiCert (www.valicert.com/), Digital Signature Trust Co. (www.digsigtrust.com) or Equifax (www.equifaxsecure.com/ebusinessid) for more information.
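For the do-it-yourself route, a self-signed certificate takes one OpenSSL command. This is a sketch with filenames of your choosing; point your secure server at the results (mod_ssl, for instance, uses the SSLCertificateFile and SSLCertificateKeyFile directives):

    # Generate an unencrypted RSA key and a self-signed cert good for a year
    openssl req -new -x509 -nodes -days 365 \
        -keyout server.key -out server.crt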

• ApacheSSL (www.apache-ssl.org)

Focused on stability/reliability, and lacking in “bells and whistles” features. It’s simple and it works, but it lacks some features of mod_ssl and it isn’t updated very often.

• mod_ssl (www.modssl.org)

Originally based on ApacheSSL, mod_ssl is now largely rewritten and offers a number of extra features, plus better documentation. 

Microsoft Web Compatibility

• FrontPage Extensions for Unix (www.rtr.com/fpsupport)

On one hand, it allows you to host sites built and published with FrontPage on a Unix server. On the other hand, it’s possibly the biggest piece of junk Unix software ever created. Use it if you have to; avoid it if you can.

• Improved mod_frontpage (home.edo.uni-dortmund.de/~chripo)

Addresses a number of problems with mod_frontpage (www.darkorb.net/pub/frontpage), with extra security and documentation, support for Dynamic Shared Objects (DSOs), better logging, as well as (unverified) claims of increased performance.

• Apache::ASP (www.nodeworks.com/asp)

An alternative to the very expensive ChiliSoft or Halcyon ASP Unix solutions, using Perl as the scripting language for ASPs. Requires the Apache mod_perl.

• asp2php (asp2php.naken.cc)

As its FAQ says, “ASP2PHP was written to help you correct the mistake of using ASP.” Converts ASP scripts to PHP scripts for use with Apache/PHP.

Application Building

• Zope (www.zope.org)

A complete tool for building dynamic websites; there’s a (somewhat) stiff learning curve that may be too much for basic needs. Zope offers incredible functionality, and is well-suited to large projects and web applications; it may be overkill for simple scripting that could be done with PHP or Perl CGIs.

• PHP (www.php.net)

The favorite open-source tool for building dynamic websites, and the open-source alternative to ASP. Reliable, uses syntax that seems like a cross between Perl and C, and features native integration with Apache. Version 4 is thread-safe, modular, and reads then compiles code rather than executing it as it reads (making it much faster with large, complex scripts).

Database Software

Note: for a more in-depth comparison, I highly recommend the O’Reilly book MySQL and mSQL, as well as the article “MySQL and PostgreSQL Compared” (www.phpbuilder.com/columns/tim20000705.php3).

• MySQL (www.mysql.com)

The “Red Hat” of free relational database software. Well-documented, and its performance for most users is excellent, since it’s designed around fast “read” rather than “write” operations. It doesn’t offer “subselect” functionality, and tends to buckle under very heavy loads (more than about 15 concurrent users), but it’s very fast and reliable for most sites.

• PostgreSQL (www.postgresql.org)

Has an active developer community, especially popular among the “GPL-only” crowd. Offers advanced features that MySQL doesn’t (subselects, transactional features, etc.), but traditionally wasn’t as fast for common uses and sometimes suffered data corruption. New versions appear to have remedied most of these deficiencies.

• mSQL (www.hughes.com.au) 

The first of the bunch, but appears to have fallen behind. More mature than MySQL or PostgreSQL, but may not have all of the features of its rapidly developing brethren.

Administration

• Webmin (www.webmin.com/webmin)

Fully featured web-based administration tool for web, mail, etc. Offers excellent functionality, but can present a potential security risk (I get really nervous about anything web-accessible which runs with root permissions).

Java Servlets

• Tomcat (jakarta.apache.org) and JServ/mod_jserv (java.apache.org)

Tomcat is an implementation of the Java Servlet 2.2 and JavaServer Pages 1.1 specifications that works with other web servers as well as Apache; it can also run standalone, serving JSPs independently of Apache. JServ is an Apache module for the execution of servlets, and the two work together to bring servlets and JSPs to Apache-based sites.

Website Search

• ht://Dig (www.htdig.org)

ht://dig is relatively simple to set up, and (with a few quirks) offers excellent searching capabilities. Easily customizable, and has a good “ratings-based” results engine.

• MiniSearch (www.dansteinman.com/minisearch)

A simple Perl search engine, which can also be run from the command line. Not as fully featured as ht://dig, but good enough for basic needs.

Web Statistics

• analog (www.statslab.cam.ac.uk/~sret1/analog)

Analog is extremely fast, reliable and absolutely filled with features. Its documentation is a bit confusing for beginners, however, and it takes some configuration to make it pretty.

• wwwstat (www.ics.uci.edu/pub/websoft/wwwstat)

A no-frills, simple statistics analysis program that delivers the basics.

Other Goodies

Configuration Scripts:

• Install-Webserver (members.xoom.com/xeer)

• Install-Qmail (members.xoom.com/xeer)

• Install-Sendmail (members.xoom.com/xeer)

Shopping Carts:

• Aktivate (www.allen-keul.com/aktivate)

Aktivate is an “end-to-end e-commerce solution” for Linux and other Unixes. It is targeted at small-to-medium-sized businesses or charities that want to accept credit card payments over the Web and conduct e-commerce. 

• OpenCart (www.opencart.com)

OpenCart is an open source Perl-based online shopping cart system. It was originally built to handle the consumer demands of Walnut Creek CDROM, was later expanded to also work with The FreeBSD Mall, and was finally developed to be used by the general public.

• Commerce.cgi (www.careyinternet.com)

Commerce.cgi is a free shopping cart program. Included is a Store Manager application to update program settings, and you can add/remove products from the inventory through a web interface.

Message Boards:

• WaddleSoft Message Board (www.ewaddle.com)

WaddleSoft is a message board system that includes polls, user registration, an extensive search engine, and sessions to track visitors.

• MyBoard (myboard.newmail.ru)

MyBoard is a very simple, lightweight web message board system. It also has some extended features, such as search and user registration.

• NeoBoard (www.neoboard.net)

NeoBoard is a Web-based threaded message board written in PHP. It includes a wide variety of advanced features for those comfortable with PHP.  

• PerlBoard (caspian.twu.net/code/perlboard)

PerlBoard is a threaded messageboard system written in Perl. It is very easy to use and set up, and has been time-tested for the past several years on the site it was originally written for.

• RPGboard (www.resonatorsoft.com/software/rpgboard)

RPGBoard is a WWWBoard-style message board script. It includes a list of features as long as your arm, and is well worth checking out for those who need a rather advanced message board.

Notes from the Underground

If you see a favorite package here that I’ve overlooked, or would like to offer comments on any of the package descriptions, e-mail me at [email protected]. I’ll update this list with more information for a future column.

The Web Server First Aid Kit

By Jeffrey Carl

Boardwatch Magazine, March 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche but influential following among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

It’s a sad fact that most system administration learning is done in the minutes and hours after you say the words, “Wow. I’ve never seen something get broken that way before.” Learning to be a sysadmin means that you discover how to fix all the problems that pop up, until you find a problem you’ve never run into before. Then you scramble to learn how to fix that, and you’re fine until the next new Unidentified Weird Thing™ happens. And so on.

Fortunately, about 90 percent of Unix web/mail/etc. server problems can be discovered or fixed with just a few tools – much like 90 percent of all household repairs can be done with a screwdriver, a wrench or a baseball bat. Knowing just a few likely trouble spots and troubleshooting tools can help you resolve a lot of that unidentified weirdness without getting so frustrated that you want to rip the hard drives out of the computer and make refrigerator magnets out of them.

The key here is that unlike cars or girlfriends, everything that goes wrong on a Unix system happens for a clearly defined reason. While that reason may sometimes be freakish or undocumented, it’s almost always one of a few fairly common issues. 

So, with that in mind, we’re going to take a look at the Handy Tools and the Usual Suspects – the top commands and tools to use, and common places to look that will at least shed a clue on most server problems. I’m going to use FreeBSD as the example system – but most other BSDs and Linux can use the same tools, even if they act slightly differently or are located in a different place in the filesystem.

The Handy Tools

• When a server is responding slowly, you need to figure out whether the problem is on the server or in the network. After you’ve logged in to the server and become root, your first stop should be uptime.

The important part of the information it provides is the server’s load averages – shown for the last one, five and 15 minutes. If the load average is high (above two or three), the most likely cause of the slowness is one or more “runaway” processes or some other processes extensively utilizing the system. If the load average isn’t high, then you’re probably looking at a networking issue that is slowing access to the server.
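
For example, a quick check looks like this:

    uptime    # the three numbers at the end are the 1-, 5- and 15-minute load averages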

• If you’ve found that a high load average is the likely culprit, turn to top. The top command lists the server’s processes in order of CPU and memory utilization.

By default, top shows as many processes as will fit on your screen, or you can use it in the form top N, where N is the number of processes you wish it to show. If you have one or more “runaway” processes (a leftover tcsh process, for example – most likely from an improperly terminated login session), you can quickly identify it and issue a kill or kill -9 (which effectively means “I don’t care what you think you’re doing, just shut up and go away”) command to the process ID number (PID) of the runaway.
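
For example (the PID here is made up; substitute whatever top reports):

    top 20         # show the 20 busiest processes
    kill 4242      # politely ask process 4242 to quit
    kill -9 4242   # if it won't listen, make it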

• For a more complete listing of the processes that are running on your computer, use ps. The ps -auxw command (on BSD-based systems; ps -ef on System V-based systems; the ps on most Linuxes will accept either) will show all system processes owned by all users, whether active or background.

You can use this to find any active process and get its PID if you need to “re-HUP” or kill it. You can find processes for a single service by using ps in combination with the venerable grep, such as finding all Apache processes by using ps -auxw | grep httpd | grep -v grep. Compare the number of active web processes to the server’s “hard” and “soft” limits (the “hard” limit is set when Apache is compiled; the “soft” limit is set in the [apache_dir]/conf/httpd.conf file for recent versions). If those numbers are close to being equal, consider either upgrading your hardware or reconfiguring/recompiling Apache with higher limits.
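
To put a quick number on it, count the httpd processes:

    ps -auxw | grep httpd | grep -v grep | wc -l    # number of running web processes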

• If you’re worried that a user on your system is running an unauthorized program, hacking the system or otherwise foobaring things, then w is a simple check.

The w command lists active users on the system and what they’re doing. If any of them are performing unauthorized activities, simply kill that user’s shell and use vipw to either give them a password (the second colon-separated field, immediately after the username) of “*” or assign them a shell (the last field of each user’s line) of /sbin/nologin until you have sorted out what they were doing and whether it violated your policies. A kill -9 may be necessary for “phantom” or “zombie” processes that were left running after improper logouts.
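
A locked-down entry might look something like this (“baduser” and its numbers are hypothetical; on FreeBSD, vipw edits /etc/master.passwd):

    vipw    # safely edit the password file; the user's line then becomes:
    # baduser:*:1001:1001::0:0:Bad User:/home/baduser:/sbin/nologin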

• If your problem is a crashed or non-starting Apache webserver, use the built-in apachectl command to work out the issue. It’s generally installed in the bin subdirectory of the Apache installation; if this isn’t in your shell’s command path, you may need to specify the full path to this command. Aside from the basic apachectl start and apachectl stop commands, one of the more useful options is the apachectl configtest command, which performs a basic evaluation of Apache’s httpd.conf configuration file (where almost all options are specified for Apache 1.3.4 and later). 

Unfortunately, apachectl is notorious for providing “okay” readings when some configuration problems are still present (most notably when a directory specified for a virtual host is not found or not readable, which causes Apache to fail). For these situations, you’ll need to consult your Apache error logs (see below). Also, apachectl consults the file /var/run/httpd.pid to find its originating process; if the PID recorded there doesn’t match the running parent process, the apachectl stop command won’t work. In these cases, find the httpd process owned by root using ps (this will be the “parent” Apache process) and kill that process.
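
In practice, the sequence looks roughly like this (the path and the PID are examples; adjust for your install):

    /usr/local/apache/bin/apachectl configtest   # sanity-check httpd.conf
    ps -auxw | grep httpd | grep -v grep         # the httpd owned by root is the parent
    kill 123                                     # substitute the parent's actual PID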

• Your first tool for diagnosing whether a problem may be in the server’s network connection rather than on the server itself is ping. Using ping to test the connection to a server is a common test, but some problems (such as an error in duplex or other settings between a server and its switch) may not show up using ping normally. If a ping to a server appears normal but you suspect a network error is involved, try using ping with larger-than-normal packet sizes. The default size of the data packet used by ping is only 56 bytes, but many errors will only show up when large ping packets (2048 bytes or greater) are used. Use the -s flag with ping to specify a larger packet size (use the -c option to specify the number or “count” of pings to send).

With large packet sizes, a longer-than-usual round-trip time is normal, but excessively long times or packet loss are good indicators that there is a network configuration problem present. Try sending large ping packets for at least a count of 50, and compare the results with a long-count ping with normal packet sizes.
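
For instance (the hostname is a placeholder):

    ping -c 50 -s 2048 server.example.com   # 50 large-packet pings
    ping -c 50 server.example.com           # 50 normal pings, for comparison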

• If a network misconfiguration between a server and its switch (or router) is possible, then you’ll want to check the status of your server’s network connections with netstat -f inet. Netstat will show you which ports are open on your server and which services are active, as well as what foreign host is connecting to the port or service in question.

If you’re concerned that your server is being attacked across the network, this will generally show up in excessive usage of the memory that the kernel has allocated to networking. To find this out, use the -m (memory buffer, or “mbuf”) flag for netstat. If you find that normal services like httpd aren’t heavily burdened but the percentage of memory allocated to networking is still high (90 percent or more), consider shutting down network services or ports that are open and may be under attack or misuse.
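
Both checks together:

    netstat -f inet   # open ports, active services and their foreign hosts
    netstat -m        # mbuf usage; sustained readings of 90 percent or more spell trouble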

• If a network issue is the likely cause of your problem, use ifconfig (the interface configuration command) to check how the NICs (Network Interface Cards) on the server are set up.

You can ignore the lo0 (loopback) interface; what really matters are the settings for your server’s NIC(s), which are named by their driver type. These will show each NIC’s IP address(es), netmask, duplex and speed, as well as which driver is in use.

Very frequently, a server which otherwise boots up and appears fine but has a problematic or nonexistent network connection can be fixed with a check of its network interface configuration. Double-check the options set for your default ifconfig startup settings in the file /etc/rc.conf (at least in recent versions of FreeBSD). Frequently, a slow network connection is the result of a NIC configured for a different speed or duplex than its switch/router port, especially when “autosense” options are set but fail for whatever reason. This can frequently be remedied by resetting the connection with a simple ifconfig [interface] down followed by an ifconfig [interface] [options] up command.
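
For example, to force 100 Mbps full duplex on an Intel EtherExpress interface (the driver name and media options are illustrative; check your NIC driver’s man page):

    ifconfig fxp0 down
    ifconfig fxp0 media 100baseTX mediaopt full-duplex up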

• Weird errors with files or services may sometimes be caused by a full hard drive (preventing the system from writing logfiles or other operations). Use the df command to show your server’s mounted partitions and their available capacity.
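
A quick check:

    df -k    # capacity in kilobyte blocks; look for filesystems at or near 100%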

• A whole nasty horde of seemingly inexplicable problems is caused by simple issues with file permissions. In these cases, the humble ls command can be your best ally. Using ls -l will show you the permissions settings for files in any directory. Common issues include missing “x” (executable) permissions on CGI scripts or applications, or directory permissions which don’t allow “r” (reading) or “x” (entering).
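
For example, to inspect and then fix a CGI script that won’t execute (the path is hypothetical):

    ls -l /usr/local/www/cgi-bin/form.cgi
    chmod 755 /usr/local/www/cgi-bin/form.cgi   # owner rwx; group and world r-x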

The Usual Suspects

• When bizarre things are happening, the system logfiles are the first place to check. Under BSD, you’ll find these in /var/log; the first place to look is /var/log/messages, where syslog deposits all the messages that aren’t specified to go into another logfile. In fact, the entire /var/log directory is home to the messages for different services – from telnet/SSH or FTP logins to SMTP and POP connections to system errors and kernel messages.

Checking these files can often provide the answers to 90 percent of “I can’t do X” messages from desperate system users. Check /etc/syslog.conf to see where the syslog daemon is sending the errors it receives; check the config files for individual applications or services to see which logfiles they’re writing to.
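
A couple of starting points (the grep pattern is just an example):

    more /etc/syslog.conf            # see where syslogd routes each message type
    grep -i ftpd /var/log/messages   # pull one service's messages out of the pile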

• If the webserver won’t start, but there aren’t any clues elsewhere, immediately look at the webserver logfiles. Using Apache, these are generally located in the file [apache_dir]/logs/error_log or something similar. Even if apachectl runs and fails while printing a simple message like “httpd: could not be started” (this message is the winner of the “FrontPage Memorial ’Duh’ Award for Unhelpful Error Handling” three years in a row), the problem will almost certainly be logged to Apache’s errors file.

For problems with specific virtual hosts on a server, check wherever their logfiles are located. This is generally specified inside that domain’s <Virtual Host> … </Virtual Host> directive in the httpd.conf file. If no error logfile is specified for that virtual host, then errors will be logged to the main Apache error file.
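
The last few lines of the error log usually name the culprit (the path assumes a default layout):

    tail -n 20 /usr/local/apache/logs/error_log   # the most recent errors, including failed starts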

Getting a Second Opinion on First Aid

Of course, all of the above are merely a few recommendations derived from my experience; if you have found other “First Aid Tools” or “Usual Suspects” that you rely on for server administration, please let me know at [email protected] and I’ll include them in an upcoming column. 

The Next Apache

By Jeffrey Carl

Boardwatch Magazine, February 2001

Boardwatch Magazine was the place to go for Internet Service Provider industry news, opinions and gossip for much of the 1990s. It was founded by the iconoclastic and opinionated Jack Rickard in the commercial Internet’s early days, and by the time I joined it had a niche but influential following among ISPs, particularly for its annual ranking of Tier 1 ISPs and through the ISPcon tradeshow. Writing and speaking for Boardwatch was one of my fondest memories of the first dot-com age.

The Apache webserver is (along with Linux and Perl) probably the most widely used open-source software in the world. After beginning as a set of patches to the NCSA http server (“a patchy server” was how it got its name), Apache had by 1996 become the most popular http server out there. According to Netcraft, Apache today powers 59 percent of all the webservers on the Internet, far more than the 20 percent share of the runner-up, Microsoft Internet Information Server.

Apache’s last “full-point” release (Apache 1.0) shipped on December 1, 1995, just over five years ago. Naturally, there’s a lot of excitement about the long-awaited Apache 2.0, which should be in beta release by the time you read this. To find out what’s new with Apache 2, I asked the Apache Project’s Dirk-Willem van Gulik and Ryan Bloom. What follows are selected portions of an e-mail interview with these Apache team members:

Boardwatch: Why was the Apache 2.0 project started? What shortcomings in Apache 1.x was it created to address, or what missing features was it designed to implement?

Bloom: There were a couple of reasons for Apache 2.0. The first was that a pre-forking server just doesn’t scale on some platforms. One of the biggest culprits was AIX, and since IBM had just made a commitment to using and delivering Apache, it was important to them that we get threads into Apache. Since forcing threads into 1.3 would be a very complex job, starting 2.0 made more sense. Another problem was getting Apache to run cleanly on non-Unix platforms. Apache has worked on OS/2 and Windows for a long time, but as we added more platforms (Netware, Mac OS X, BeOS) the code became harder and harder to maintain. 

Apache 2.0 was written with an eye towards portability, using the new Apache Portable Run-time (APR), so adding new platforms is simple. Also, by using APR we are able to improve performance on non-Unix platforms, because we are using native function calls on all platforms. Finally, in Apache 2.0, we have implemented Filtered I/O. This is a feature that module writers have been requesting for years. It basically allows one module to modify the data from another module. This allows CGI responses to be parsed for PHP or SSI tags. It also allows the proxy module to filter data.

Boardwatch: What are the significant new features of Apache 2.0?

Bloom: Threading, APR, Filtered I/O. 🙂 And Multi-Processing modules and Protocol modules. These are two new module types that allow module writers more flexibility. A Multi-Processing module basically defines how the server is started, and how it maps requests onto threads and processes. This is an abstraction between Apache’s execution profile and the platform it is running on. Different platforms have different needs, and the MPM interface allows porters to define the best setup for their platform. For example, Windows uses two processes. The first monitors the second. The second serves pages. This is done by a Multi-Processing Module. 

Protocol modules are modules that allow Apache to serve more than just HTTP requests. In this respect, Apache can act like inetd on steroids. Basically, each thread in each process can handle either HTTP or FTP or BXXP or WAP requests at any time, as long as those protocol modules are in the server. This means no forking a new process just to handle a new request type. If 90% of your site is served by HTTP, and 10% is served by WAP, then the server automatically adjusts to accommodate that. As the site migrates to WAP instead of HTTP, the server continues to serve whatever is requested. There is no extra setup involved. I should mention now that only an HTTP protocol module has been written, although there is talk of adding others.

Boardwatch: How much of a break with the past is Apache 2.0, in terms of 1.) the existing code base, 2.) the administration interface, and 3.) the API for modules?

Bloom: I’ll answer this in three parts.

1) The protocol handling itself is mostly the same. How the server starts and stops, generates data, sends data, and does anything else is completely different. By adding threads to the mix, we realized that we needed a new abstraction to allow different platforms to start up differently and to map a request to a thread or process the best way for that platform.

2) The administrative interface is the same. We still use a text file, and the language hasn’t changed at all. We have added some directives to the config file, but if somebody wants to use the prefork MPM (it acts just like 1.3), then a 1.3 config file will migrate seamlessly into 2.0. If the MPM is changed, then the config file will need slight modifications to work. Also, some of the old directives no longer do what they used to. The definition of the directive is the same, but the way the code works is completely different, so they don’t always map. For example, SetHandler isn’t as important as it once was, but the Filter directives take its place.

3) The module API has changed a lot. Because we now rely on APR for most of the low-level routines, like file I/O and network I/O, the module doesn’t have as much flexibility when dealing with the OS. However, in exchange, the module has greater portability. Also, the module structure that is at the bottom of every module has shrunk to about 5 functions. The others are registered with function calls. This allows the Apache group to add new hooks without breaking existing modules. Also, with the filter API, modules should take more care to generate data in sizeable chunks.

Boardwatch: What are the advantages and disadvantages (if any) of Apache 2.0’s multithreaded style? What does it mean to have the option of being multi-process and multi-threaded?

Bloom: The multi-threading gives us greater scalability. I have actually seen an AIX box go from being saturated at 500 connections to being saturated at more than 1000. As for disadvantages, you lose some robustness with this model. If a module segfaults in 1.3, you lose one connection, the connection currently running on that process. If a module segfaults in 2.0, you lose N connections, depending on how many threads are running in that process, which MPM you have chosen, etc. However, we have different MPMs distributed with the server, so a site that only cares about robustness can still use the 1.3 pre-forking model. A site that doesn’t care about robustness or only has VERY trusted code can run a server that has more threads in it.

van Gulik: In other words, Apache 2.0 allows the webmaster to make his or her own tradeoffs between scalability, stability and speed. This opens a whole new world of Quality of Service (QoS) management. Another advantage of these flexible process management models is that integration with languages like Perl, PHP and, in particular, Java can be done more cleanly and robustly without losing much performance. Large e-commerce integration projects especially will be significantly easier.

Boardwatch: What is the Apache Portable Runtime? What effect does this have on the code, and on portability?

Bloom: The Apache Portable Runtime is exactly what it says. 🙂 It is a library of routines that Apache is using to make itself more portable. This makes the code much more portable and shrinks the code size, making the code easier to maintain.  My favorite example is apachebench, a simple benchmarking tool distributed with the server. AB has never worked on anything other than Unix. We ported it to APR, and it works on Unix, Windows, BeOS, OS/2, etc without any work. As more platforms are ported to APR, AB will just work on them, as will Apache. This also improves our performance on non-POSIX platforms. Apache on Windows, the last I checked, is running as fast as Apache on Linux.

Boardwatch: Can Apache 1.3.x modules be used with Apache 2.0? How will this affect things like Apache-PHP or Apache-FrontPage?

Bloom: Unfortunately, no. However, they are very easy to port. I have personally ported many complex modules in under an hour. The PHP team is already working on a port of PHP to 2.0, as is mod_perl. Mod_perl has support for some of the more interesting features already, such as writing filters in Perl, and writing protocol modules in Perl. FrontPage will hopefully be subsumed by DAV, which is now distributed with Apache 2.0.

van Gulik: Plus, it is likely that various Apache-focused companies, such as IBM, Covalent and C2/RedHat, will assist customers with the transition with specific tools and migration products.

Boardwatch: Do you predict that server administrators used to Apache 1.3.x will have a hard time adjusting to anything about Apache 2.0? If so, what?

Bloom: I think many admins will move slowly to Apache 2.0. Apache 2.0 has a lot of features that people have been asking for for a very long time. The threading issues will take some getting used to, however, and I suspect that will keep some people on 1.3 for a little while. Let’s be honest, there are still people running Apache 1.2 and even 0.8, so nobody thinks that every machine running Apache is suddenly going to migrate to 2.0. Apache tends to do what people need it to do. 2.0 just allows it to do more.

Boardwatch: Who should/shouldn’t use the alpha or beta releases?

Bloom: The alpha releases are developer releases, so if you aren’t comfortable patching code and fixing bugs, you should probably avoid the alphas. The betas should be stable enough to leave running as a production server, but there will still be issues, so only people comfortable with debugging problems and helping to fix them should really be using the betas. (Napster is using alpha 6 to run their web site.)

Boardwatch: For server administrators, what guidelines can you give about who should or shouldn’t upgrade to Apache 2.0 when it becomes a final release? For what reasons?

Bloom: Personally, I think EVERYBODY should upgrade to Apache 2.0. 🙂 Apache 2.0 has a lot of new features that are going to become very important once it is released. However, administrators need to take it slowly, and become comfortable with Apache 2.0. Anybody who is not on a Unix platform should definitely upgrade immediately. Apache has made huge strides in portability with 2.0, and this shows on non-Unix machines.

van Gulik: I’d concur with Ryan insofar as the non-Unix platforms are concerned; even an early beta might give your site a major boost. As for the established sites running on well-understood platforms such as Sun Solaris and the BSDs, I am not so sure they will upgrade quickly or see the need. The forking model has proven to be very robust and scalable. Those folks will need features such as the filter chains to be tempted to migrate.

Boardwatch: Apache is the most popular web server out there, meeting the needs of many thousands of webmasters. What else is there to do? What is on the Apache web server team’s “wish list” for the future?

Bloom: Oh, what isn’t on it? I think for the most part, we are focusing on 2.0 right now, with the list of features that I have mentioned above. We are very interested in the effect of internationalization on the web and Apache in particular. There are people who want to see an async I/O implementation of Apache. I think that we will see some of the Apache group’s focus move from HTTP to other protocols that complement it, such as WAP and FTP. And I think finally we want to continue to develop good stable software that does what is needed. I do think that we are close to hitting a point where there isn’t anything left to add to Apache that can’t be added with modules. When that happens, Apache will just become a framework to hang small modules off of and it will quiet down and not be released very often.

Boardwatch: Anything else you’d like to add? 😉

Bloom: Just that Apache 2.0 is coming very close to its first beta, and hopefully not long after that we will see an actual release. The Apache Group has worked long and hard on this project, and we all hope that people find our work useful and Apache continues to be successful.