ITAZ doQuments

FAQ - Full Text Search (IFilters)

Frequently Asked Questions (F.A.Q.) about doQuments and related topics.


1 What are IFilters and why do I need them?
A From version 4.4 onwards, doQuments uses a new and improved mechanism to extract and index text from documents. Text extraction is now done using IFilters installed on the user's machine.
IFilters act as plug-ins and are part of the Microsoft Indexing Service (they are also used by Windows Desktop Search). Using the IFilter mechanism improves the accuracy and performance of text extraction in doQuments.

For doQuments to be able to extract text from a file of a particular format, an IFilter for that file format must be installed on the user's machine.
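
To check whether an IFilter is registered for a given extension, a small script can follow the registry chain that Windows commonly uses for filter registration: the extension's PersistentHandler key, then the handler's PersistentAddinsRegistered entry for the IFilter interface. The sketch below is illustrative only and is not how doQuments itself locates filters. The helper names (read_hkcr_default, find_ifilter_clsid) are made up for this example, and the lookup assumes the handler is registered directly under the extension, which is not true for every file type.

```python
import sys

# IID of the IFilter COM interface; persistent handlers list their filter under this key.
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def read_hkcr_default(subkey):
    """Return the default value of HKEY_CLASSES_ROOT\\<subkey>, or None if absent."""
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _ = winreg.QueryValueEx(key, "")
            return value or None
    except OSError:
        return None


def find_ifilter_clsid(extension, read_default=read_hkcr_default):
    """Follow extension -> PersistentHandler -> IFilter CLSID; None if no filter."""
    handler = read_default(extension + "\\PersistentHandler")
    if handler is None:
        return None
    return read_default(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IFILTER_IID
    )


if __name__ == "__main__" and sys.platform == "win32":
    for ext in (".doc", ".xls", ".ppt", ".html", ".txt", ".pdf"):
        print(ext, "->", find_ifilter_clsid(ext) or "no IFilter registered")
```

An extension reported as having no IFilter here is a candidate for installing one of the filters mentioned in this FAQ. The read_default parameter exists only so the lookup can be exercised without a real registry.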

IFilters for the following file formats are installed by default on Windows 2000/XP/2003/Vista machines:

  • PPT (Microsoft PowerPoint presentation)
  • DOC (Microsoft Word document)
  • XLS (Microsoft Excel spreadsheet)
  • HTML documents
  • TXT documents

For PDF files, existing users will need to download the freely available PDF IFilter from:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

You can also install third-party IFilters to enable doQuments to extract text from other file types, for example:

  • Microsoft XML IFilter
  • Microsoft RTF IFilter
  • Microsoft Visio IFilter

More information and download links for various IFilters (both free and commercial) are available at...

2 Why aren't all IFilters automatically installed along with doQuments?
A Although some IFilters are available for free, we cannot ship them with ITAZ doQuments as they are published by different companies. You will find download links for available IFilters (both free and commercial) at …
http://ifilter.org/Links.htm

3 I have enabled Full Text Search for my DB, but when I search for some text, documents that I am sure contain that text are not listed in the search results.
A A likely reason is that the IFilter for that particular file format is not installed on the machine. For doQuments to be able to extract text from a file of a particular format, the IFilter for that file format must be installed on the machine.

Unless doQuments has extracted text from a document, that document cannot be found via full text search.
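
The effect can be illustrated with a toy word index. This is only a sketch of the general idea, not the doQuments index format. The file names and the build_index and search helpers are hypothetical, and None stands for a document whose text could not be extracted.

```python
from collections import defaultdict


def build_index(extracted_text_by_doc):
    """Map each lowercase word to the set of documents whose extracted text contains it."""
    index = defaultdict(set)
    for doc, text in extracted_text_by_doc.items():
        if text is None:  # no IFilter available, so nothing was extracted
            continue
        for word in text.lower().split():
            index[word].add(doc)
    return index


def search(index, word):
    """Return the sorted names of documents indexed under the given word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical documents: both contain "invoice", but text was extracted from only one.
docs = {
    "report.doc": "Quarterly invoice summary",
    "scan.pdf": None,  # PDF IFilter missing, so no text reached the index
}
index = build_index(docs)
print(search(index, "invoice"))  # prints ['report.doc']
```

The PDF never appears in the results, even though it contains the word, because none of its text was ever added to the index.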

4 Will doQuments complain if it cannot extract text from a particular document?
A No. doQuments attempts to find the IFilter for every document and proceeds without complaining (and without extracting text) if the IFilter for a particular file cannot be found on the machine. However, for backward compatibility reasons, doQuments does complain if it cannot find the IFilter for PDF files.
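
The behaviour described in this answer can be sketched as a simple loop. This is an illustration under assumptions, not the actual doQuments code. The extract_all function, the find_filter callback, and the warning text are all hypothetical.

```python
def extract_all(paths, find_filter, warn=print):
    """Extract text where a filter exists; warn only for PDFs lacking one."""
    extracted = {}
    for path in paths:
        ext = "." + path.rsplit(".", 1)[-1].lower() if "." in path else ""
        text_filter = find_filter(ext)
        if text_filter is None:
            if ext == ".pdf":
                # Only the missing PDF filter is reported.
                warn("No IFilter found for PDF file: " + path)
            continue  # every other format is skipped silently
        extracted[path] = text_filter(path)
    return extracted


# Hypothetical setup: only a TXT "filter" is available.
filters = {".txt": lambda path: "text of " + path}
warnings = []
result = extract_all(["a.txt", "b.pdf", "c.xyz"], filters.get, warnings.append)
print(result)    # prints {'a.txt': 'text of a.txt'}
print(warnings)  # prints ['No IFilter found for PDF file: b.pdf']
```

The unknown .xyz file is skipped with no message at all, which is why a missing IFilter for such a format only becomes apparent when a search fails to find the document.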