(The following post is cut-n-pasted from an email conversation I had recently about a Personal Search Engine with Jeff Barr. I'm posting it on my typepad so that I can use Google to find these notes again. I note the irony that if I had a Fisher, I wouldn't need to blog it publicly to be sure of finding it later.)
I've been thinking long and hard lately about finding things in my (20 Gigabytes representing 15 years of) email archives (or, as Rohit calls it, The "Search My Mail D*mnit" Problem) so I can properly deal with Ham. In the "email must evolve" context, I think ZOË-as-Personal-Server points in the right direction.
I find myself wondering if there is still an opportunity to launch a desktop search product that fits the classic definition of platform. The equivalent of a "Browser" for the next decade that brings together existing disparate tools by mixing SMTP and HTTP and throws in a healthy dose of instant messaging and RSS -- except that instead of browsing for information it lets you go (for lack of a better word) "Fish" for information. It's got a simple browser interface and query language (like Google), is lightning fast (due to regular re-indexing), and offers search results of your personal stuff in that simple UI.
Rohit is three steps ahead of me here -- that any good "Fisher" of all your emails, IM's, desktop files, web history, and RSS feeds needs a great algorithm to rank the results of your "Go Fish" queries. (Ranking is something ZOË doesn't do, and therefore it cannot handle the volumes of email I receive daily.)
A "Fisher" isn't a replacement for our existing PIM's and browsers and IM clients and RSS readers, in the same way that Google doesn't replace the Web. Actually, Google is a good analogy since it provides a set of ranked results for any given query of the Public Web. But Google is an anonymous search of the Public Web of information. Fisher, by contrast, provides a set of ranked results for any given query with a personal search of your Private Web of information. It's a customized search of your personal stuff.
I don't have a better sketch of the opportunity yet, but if it works for your personal stuff anything close to the way Google works for public stuff, I think it's a killerApp just waiting for an auteur to get around to writing it. The platform play of course is that it is scriptable -- not just the query engine but the "tap on your ethernet" which can watch HTTP and SMTP traffic crossing your machine and do things on your behalf based on inspecting all the data that crosses through.
I have to think about it. Like I said, I'm sure Rohit is three steps ahead of me on this one... as was jwz six years ago in his write-up of Intertwingle (and short discussion), "a potential project to make it easier to deal with a massive volume of personal messages: excavating, traversing, relating, reporting, annotating. Intertwingle can be seen as a unification of a search tool and an address book. It is not, however, a mail reader. The presentation of query results could be done through a mail reader, but the intention is that ones choice of mail reader should be orthogonal to the use of this tool. The two kinds of tools just happen to operate on the same data." Might this be what the Twingle effort of Kasei is all about?
Update, 3/4/2004 at 5am. Rohit emailed me a description of Fisher in his own words:
Personally, I'm getting very aggravated by the irony that it's easier to find stuff on the Internet than on my own PC. The vast majority of this problem is email, specifically. And while email is impoverished in hyperlinks, ruling out the sorts of PageRank algorithms web search engines use, it is very rich in social network information. The correspondence information can help us choose which bits of text are likely to be the most relevant hit for a query, because it does matter who said what.
I believe Intertwingle is a good early manifesto about Fishers as a product category, and I think that ZOË and X1 are good first-generation instantiations of Fisher-as-a-product. Looking around SearchTools.Com I don't see any other Fishers... yet.
Admittedly, this may have something to contribute about searching multiple-agent discourses in general -- anywhere you can clearly identify authorship of a snippet, and thence calibrate which authors a user reads most.
Of course, if we simply ranked people by frequency of interaction, it would be kind of boring. Frame it as a simple principal-eigenvector problem -- count who reads your readers, and so on -- and interesting patterns emerge. It could well be as useful as PageRank itself was, by comparison to text search alone.
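(An aside to make the principal-eigenvector framing concrete. This is only a rough sketch, not anything Rohit or any existing Fisher has specified: the function name `rank_authors`, the `(author, reader)` input format, and the PageRank-style `damping` factor are all assumptions made for illustration. Each reader passes their own score to the authors they read, in proportion to how often they read them, and power iteration settles on the dominant eigenvector.)

```python
from collections import defaultdict


def rank_authors(messages, damping=0.85, iterations=100):
    """Score people by who reads them, weighted by how well-read the readers are.

    messages: iterable of (author, reader) pairs, one per message read.
    damping is a PageRank-style assumption, not something from the post.
    """
    reads = defaultdict(lambda: defaultdict(float))  # reader -> author -> count
    people = set()
    for author, reader in messages:
        reads[reader][author] += 1.0
        people.update((author, reader))
    people = sorted(people)
    n = len(people)
    if n == 0:
        return {}

    score = {p: 1.0 / n for p in people}
    for _ in range(iterations):
        # Everyone keeps a small baseline so scores never collapse to zero.
        nxt = {p: (1.0 - damping) / n for p in people}
        for reader in people:
            authors = reads.get(reader)
            if not authors:
                # Someone who reads nobody spreads their weight evenly.
                share = damping * score[reader] / n
                for p in people:
                    nxt[p] += share
                continue
            total = sum(authors.values())
            for author, count in authors.items():
                nxt[author] += damping * score[reader] * count / total
        score = nxt
    return score


if __name__ == "__main__":
    sample = [("alice", "bob"), ("alice", "carol"),
              ("bob", "carol"), ("carol", "alice")]
    for person, value in sorted(rank_authors(sample).items()):
        print(person, round(value, 3))
```

A Fisher could then multiply a plain text-match score by the author's score, so that a hit from someone you (and the people you read) actually read outranks a hit from a stranger.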
Imagine the Google UI for your own PC. You aim your web browser at localhost and get back a results page that looks eerily familiar, but the hits are actually documents, mails, photos, and cached web pages from your own personal archive. This means nailing the challenges of grouping similar results, generating short excerpts, converting file formats for indexing, and so on. There's more magic to how it installs -- you don't change your email client at all! -- but that's just more technology.
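(To give a feel for the indexing and excerpting pieces mentioned above, here is a toy sketch. It is an assumption-laden illustration, not how ZOË, X1, or any real product works: the names `build_index`, `excerpt`, and `search`, and the in-memory dictionary of documents, are all hypothetical. It builds an inverted index from terms to document ids, answers a query by intersecting the sets for each term, and returns a short excerpt around the first query term.)

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return TOKEN.findall(text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def excerpt(text, term, width=30):
    """Return a short window of text around the first occurrence of term.

    Uses a plain substring match, so it may land inside a longer word.
    """
    pos = text.lower().find(term)
    if pos < 0:
        return text[: 2 * width].strip()
    start = max(0, pos - width)
    end = min(len(text), pos + len(term) + width)
    return text[start:end].strip()


def search(index, documents, query):
    """Return (doc_id, excerpt) for documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return [(doc_id, excerpt(documents[doc_id], terms[0]))
            for doc_id in sorted(hits)]


if __name__ == "__main__":
    docs = {
        "mail/1": "Lunch on Friday with Rohit about the Fisher idea",
        "notes/2": "Fisher ranking needs social network data",
        "web/3": "Curry recipes",
    }
    idx = build_index(docs)
    for doc_id, snippet in search(idx, docs, "fisher ranking"):
        print(doc_id, "->", snippet)
```

Results here come back in document-id order; the ranking, the grouping of similar hits, and the conversion of file formats into text are exactly the hard parts this sketch leaves out.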
See also: Fisher as a product category, Lookout, Software!, Fluffy Bunny (aka Google Desktop).
"Imagine the Google UI for your own PC."
You don't have to imagine, you just need to buy a Google Appliance. See http://www.google.com/appliance/ for details.
Posted by: Steve Wilhelm | July 19, 2004 at 11:20 PM
For the money it would cost me to buy a Google Appliance, I could buy a car. A really nice car. :)
Plus, I'm not convinced the Google Appliance is useful for spidering and indexing a single desktop PC. I'm willing to be convinced that I'm wrong, though. :) :)
Posted by: Adam | July 20, 2004 at 06:27 PM
I'm sure somewhere in Google Labs there is a desktop indexer being worked on that will eventually be integrated into Google Toolbar/Deskbar. It would be the perfect addition for those tools, and throw AdWords into the mix and they have a whole new source of revenue.
Posted by: Cameron Mirza | July 30, 2004 at 01:40 PM
Somewhere in Google Labs they're also curing world hunger, drafting plans for world peace, and prototyping a new delicious type of curry that is code-named blue curry (red, green, and yellow curries already exist, so once blue is perfected, Google's marketing team will be able to spell their entire logo in curry!).
I believe there's nothing the Google folks can't and won't do. It's all just a matter of time.
Not that that will stop Microsoft, who just demo'd hard-drive searching:
The race to build a great Fisher has started, but like the race to build a great Browser, this will be a 10-20 year war...
Posted by: Adam | August 02, 2004 at 01:22 PM
The most promising fisher I use:
http://www.lookoutsoft.com/
Was just bought by M$.
Posted by: Marcel Keller | September 26, 2004 at 01:42 PM
Marcel, thanks for your note.
I believe Microsoft's purchase of Lookout speaks to big investments in Fishers by not just Microsoft but also Google and Yahoo in coming years...
Also, TidBITS explains how Apple's Tiger OS in 2005 will have a Fisher called Spotlight:
This will be a long race, like the browser race, spanning a decade, and the entrants have only begun to think about the problems associated with making a fast, useful Fisher.
Posted by: Adam | September 28, 2004 at 06:33 PM
It's here...
http://desktop.google.com
Posted by: | October 15, 2004 at 11:01 AM
I know, and I'm thrilled... so far the Desktop Google has not disappointed me...
Posted by: Adam | October 17, 2004 at 01:26 PM