The multiple versions solution

Benjamin Pierce had a problem that besets almost everyone who works with files on more than one computer. He couldn’t keep track of the various versions of files on his laptop and workstation.

But Pierce, an assistant professor of computer and information science, knew how to solve the problem. He could create new software.

“I was pounding my hands and saying, ‘Why can’t I figure this out?’” he said. “So I said, ‘Why don’t I take a weekend and write a little tool?’

“That was four years ago, and now I’m still writing my little tool.”

Pierce’s little tool is an application called Unison, which can compare files on two different networked computers and bring them up to date.

People who work in offices equipped with Lotus Notes software ought to grasp the basic concept, as that package lets multiple users work on a single file simultaneously. But it’s not quite the same thing as what Unison does. Pierce explains that Notes not only knows there is stuff in the file, it knows how the stuff is arranged.

Unison works on a simpler level. It knows only that a file has changed from a previous version, and it replaces the old version with the new one. But it also has cross-platform capability. Where Notes requires everyone using it to run Windows on their computers, Unison also works on Unix machines, the overwhelming choice of scientists and engineers.
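To make the idea concrete, here is a minimal sketch of change detection at the whole-file level. It is an illustration only, not Unison's actual code: the function names are hypothetical, the two "computers" are modeled as in-memory dictionaries mapping a path to its contents, and the use of a SHA-256 fingerprint is an assumption made for the example.

```python
import hashlib
from typing import Dict, Set


def fingerprint(data: bytes) -> str:
    """Hash the raw bytes; the contents are never interpreted."""
    return hashlib.sha256(data).hexdigest()


def snapshot(replica: Dict[str, bytes]) -> Dict[str, str]:
    """Record one fingerprint per file, as of the moment of synchronization."""
    return {path: fingerprint(data) for path, data in replica.items()}


def changed_paths(replica: Dict[str, bytes], last_sync: Dict[str, str]) -> Set[str]:
    """Paths that were added, removed, or modified since the last snapshot."""
    current = snapshot(replica)
    all_paths = current.keys() | last_sync.keys()
    return {p for p in all_paths if current.get(p) != last_sync.get(p)}


def push_changes(source: Dict[str, bytes],
                 target: Dict[str, bytes],
                 last_sync: Dict[str, str]) -> Set[str]:
    """One-way update (an assumed simplification): new versions replace old ones."""
    updated = changed_paths(source, last_sync)
    for path in updated:
        if path in source:
            target[path] = source[path]   # replace the old version with the new
        else:
            target.pop(path, None)        # the file was deleted on the source
    return updated
```

In this sketch, editing a file on the laptop after a snapshot makes its path show up in `changed_paths`, and `push_changes` then overwrites the workstation's copy. Nothing in it looks inside the file, which is the contrast with Notes drawn above.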

Because Unison is open-source software, people may freely revise and distribute it. So Pierce predicts that hackers will soon get it working on Macintoshes.

But as Pierce readily admits, Unison is just a first step towards solving the problem of file synchronization across multiple computers and operating systems. The problem has attracted wide interest among computer scientists, in part because solving it would make their own lives easier: “Open-source projects usually begin with someone scratching their own itch,” he said.

He is scratching other people’s itches too. As of early September, 3,000 users had downloaded Unison since its public release. Part of the widespread interest in Unison stems from the general need to synchronize files. And some of it arises because file synchronization raises “a lot of interesting algorithmic problems” that Pierce still hasn’t solved, problems such as:

How does the software handle a case where both copies of a file have been changed since the last synchronization?

... Or what if a file in a specific directory on one computer is changed, while on another, the whole directory containing the file is erased?

... And what about copies of files on more than two computers?
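The first two questions can be framed as a three-way comparison. The sketch below is one common way to set it up, not a description of how Unison resolves these cases. The function name and the returned labels are hypothetical. Each argument is assumed to be a content fingerprint for one path, or `None` if the file is absent. A deleted directory is modeled as every file inside it being absent on that side.

```python
from typing import Optional


def reconcile(base: Optional[str], a: Optional[str], b: Optional[str]) -> str:
    """Decide what to do with one path.

    base: fingerprint recorded at the last synchronization
    a, b: current fingerprints on the two computers (None means absent)
    """
    if a == b:
        return "in sync"           # both sides agree, nothing to copy
    if a == base:
        return "propagate b to a"  # only b changed (or deleted) the file
    if b == base:
        return "propagate a to b"  # only a changed (or deleted) the file
    return "conflict"              # both sides diverged from the last sync


# Both copies changed since the last synchronization:
print(reconcile("h1", "h2", "h3"))   # conflict

# File changed on one computer, its directory erased on the other:
print(reconcile("h1", "h2", None))   # conflict
```

The sketch can only flag such cases as conflicts; deciding which version should win is left to a person. It also assumes exactly two computers and does not address the third question, which concerns more than two copies.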

Each new advance in this area raises as many questions as it answers.

While the application has its uses, Pierce and the downloaders are also in it for “the intellectual pleasure of understanding this tricky domain,” he said.