Indexing...why hasn't someone written a plugin

Hey folks,
It would be great if someone could write a plugin for Opus to index the content of documents, kind of like Google says they do. I set up Google Desktop to index only one folder on one of my hard drives. I know there are at least 4 documents there that contain "pvns" (pigmented villonodular synovitis) because I see them in a folder. A Google Desktop search can't find them. These aren't the only things Google Desktop can't find. Lucene is a decent indexer if you're a programmer and can write a front end for it. I'd just like to have software that could index the content of *.doc, *.pdf, etc., as well as the headers of DICOM files.

Anyone know of anything?

There are a few different indexing tools out there (e.g. X1, Yahoo Desktop) in addition to Google Desktop. I think Google Desktop is the only one which can be integrated into Opus at the moment, but if there are APIs for integrating the others then you could ask GPSoft to look into them, assuming they work better than Google Desktop. (Integrating an indexing program isn't something you can do with plugins, so only GPSoft can help there really.)

I've only spoken to a handful of people who work for Google but they seem interested in improving their software so it's probably worth mentioning that Google Desktop isn't finding things in your documents. Can't hurt!

Thanks for your response. I think the folks at Google are quite busy. I've sent them several e-mails. The first time I sent in suggestions I got a personal response, but it said the things I suggested weren't a priority for them. Lately I've just gotten an auto-reply thanking me for sending feedback. There are a lot of things they could do to improve their software. Perhaps if I were a programmer, I'd understand why they haven't implemented a number of things. For example, it would be nice to have checkboxes to limit searches to pictures, documents, movies, pdf, jpg, etc., as well as date range modified, location, etc.

I've also tried Copernic, Yahoo and Lucene, none of which have any richer features. It seems like it would be very easy to write those queries into the GUI.
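For anyone wondering what "those queries" might look like behind the checkboxes, here is a rough Python sketch. It is not how Google Desktop, Copernic or any other tool mentioned here works; the `KINDS` groupings and the `filter_files` name are made up for illustration, and it walks the folder directly rather than querying an index.

```python
import os
from datetime import datetime

# Hypothetical grouping of extensions into "checkbox" categories (an assumption).
KINDS = {
    "pictures": {".jpg", ".jpeg", ".png", ".gif"},
    "documents": {".doc", ".pdf", ".txt"},
    "movies": {".avi", ".mpg", ".mov"},
}


def filter_files(root, kinds=None, modified_after=None, modified_before=None):
    """Return sorted paths under root that match the chosen kinds and date range."""
    allowed = set()
    for kind in kinds or []:
        allowed |= KINDS[kind]

    matches = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            extension = os.path.splitext(name)[1].lower()
            # Skip files whose extension is not in any ticked category.
            if kinds and extension not in allowed:
                continue
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if modified_after and modified < modified_after:
                continue
            if modified_before and modified > modified_before:
                continue
            matches.append(path)
    return sorted(matches)
```

Something like `filter_files("D:/cases", kinds=["documents"], modified_after=datetime(2009, 1, 1))` would then stand in for ticking "documents" and setting a start date (the folder path is just a placeholder).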

I've found Copernic to be far superior to GDS. Mostly thanks to its GUI, but also because it has built-in support for indexing network drives and it has exactly what you asked for, "checkboxes to limit searches to pictures, documents, movies, pdf, jpg, etc., as well as date range modified, location, etc."

But I really don't feel a need to have it integrated into DO. I don't think it would offer any real convenience for myself.

I have tried both Google Desktop Search (GDS) and Copernic Desktop Search (CDS). I really wish GP Software could provide integration for both of them in Dopus.

The Dopus integration for Google Desktop works beautifully. The problem is that GDS is very intrusive, slowing down my work at times. Also, if I leave the computer sitting while GDS is indexing, it really thrashes the hard drive, making loud clicking noises no other program has made, and driving my wife nuts. So I have to turn it off.

Copernic is not intrusive at all, but the fact that Dopus does not integrate it is a major obstacle for me.

I suggest that GP Soft provide integration for Copernic (or perhaps X1, which I haven't tried) in addition to the current support for GDS, so that Dopus users have a choice of at least two indexing services.

I agree and thanks for reviving the issue. But to quote Leo's response to an earlier message:

As an X1 user, I'd like that one. But I manage to use it outside DOpus just fine. I will, though, ask the X1 folks if there is the sort of API that Leo mentioned.

Ideally, DOpus would be able to pick up on whatever the user has set as the search engine for their Windows installation.

I don't know about the alternatives, but X1 now offers to make itself the default search engine for Vista. I assume that the same will apply to Windows 7.

What about support in DOpus for "Everything" (no joke):

voidtools.com/

It has a nice API and it finds files on your WHOLE computer in a fraction of a second WITHOUT INDEXING!

The webpage for Everything mentions indexing as the third bullet point.

(Also the Everything FAQ.)

This is not indexing like with Copernic or GDS or similar programs, where the files are indexed by a relatively slow file search, but Everything rather accesses the MFT (Master File Table) to "index" file-names. From the FAQ:

""Everything" only uses file and folder names and generally takes a few seconds to build it's database. A fresh install of Windows XP SP2 (about 20,000 files) will take about 1 second to index. 1,000,000 files will take about 1 minute."

See here: voidtools.com/faq.php

That's still indexing. It might be faster than normal indexing, because it only indexes file names and not contents, but it's still indexing.

Not saying it's good or bad; just that it's misleading to say it doesn't use indexing.
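To make the point concrete, here is a minimal Python sketch of a file-name-only index. It is an illustration under assumptions, not Everything's method: Everything reads the NTFS Master File Table, whereas this sketch swaps in a plain `os.walk` pass, and all the function names are invented. The shape is the same, though: build a table of names once, keep it, and answer later searches from the table instead of from the disk.

```python
import json
import os


def build_name_index(root):
    """Walk root once and map each lower-cased file name to its full paths."""
    index = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            index.setdefault(name.lower(), []).append(os.path.join(folder, name))
    return index


def save_index(index, index_path):
    """Persist the index so it survives between runs and reboots."""
    with open(index_path, "w", encoding="utf-8") as handle:
        json.dump(index, handle)


def load_index(index_path):
    """Load a previously saved index without touching the indexed folders."""
    with open(index_path, encoding="utf-8") as handle:
        return json.load(handle)


def search_names(index, fragment):
    """Return every indexed path whose file name contains fragment."""
    fragment = fragment.lower()
    hits = []
    for name, paths in index.items():
        if fragment in name:
            hits.extend(paths)
    return sorted(hits)
```

Because `search_names` only looks at the saved table, it stays fast however large the drive is, and that saved table is exactly what makes it an index.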

It depends on what you mean by indexing: if you define indexing as searching for files in the traditional way, then Everything does not index files in this manner.

Sigh.

There is no argument about what indexing means. There is no "depends [upon]" about it.

It refers to building, and usually retaining, an index. It does not refer to on-the-fly searching for files.

The extent, and size, of the index is often in the user's control.

For example, I use X1. This can index in many ways.

X1 can simply record file names and types.

It can record this and content.

It can record a mixture of these, for example, indexing the content of text files while indexing only the names of media files.
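A rough Python sketch of that kind of mixed policy, for illustration only: it is not X1's implementation, and `TEXT_EXTENSIONS`, `build_mixed_index` and `lookup` are made-up names. Text files get their content indexed, everything else is indexed by name only. Real .doc or .pdf files would need a format-specific text extractor, which is left out here.

```python
import os
import re

# Assumption: only these extensions are treated as plain text worth reading.
TEXT_EXTENSIONS = {".txt", ".md", ".csv"}
WORD = re.compile(r"[a-z0-9]+")


def tokens(text):
    """Split text into a set of lower-cased alphanumeric words."""
    return set(WORD.findall(text.lower()))


def build_mixed_index(root):
    """Map each word to the set of files where it occurs in the name or content."""
    index = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            words = tokens(name)  # every file is indexed by name
            if os.path.splitext(name)[1].lower() in TEXT_EXTENSIONS:
                # Only text files also have their content indexed.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    words |= tokens(handle.read())
            for word in words:
                index.setdefault(word, set()).add(path)
    return index


def lookup(index, word):
    """Return the sorted paths recorded for a single word."""
    return sorted(index.get(word.lower(), set()))
```

With an index like this, looking up "pvns" finds a text file that mentions it, while a media file is only found by words in its file name.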

Anything that remembers stuff between operations and reboots has an index, even if it is one that is part of the operating system.

In other words, if software "remembers" the results of the search that you say takes next to no time at all, then it has to build an index. Unless, that is, the software writers have come up with some freaky version of telepathy.

An example of on-the-fly search is FindOnClick from 2brightsparks which "does not create index files that use up space on your drive".

I intervene only because someone has to provide a slightly more detailed account of the process that Leo has actually described clearly enough but that seems to pass you by.

Let's lose the pedantry and return to the original question. =)

That's the best comment in this thread! :wink:

[quote="soja"]Let's lose the pedantry and return to the original question. =)[/quote]

What you call pedantry answers the original question.

Thread locked - it's become little more than a pointless argument. Indexing is indexing no matter what sort of index it creates. End of story really.

omg many years later and this tool is amazing thankyou