Automatically Extract Text from Documents


You can automatically extract text from documents on check-in by selecting the Automatically extract text from documents while adding option from the Options window.

 

To Automatically Extract Text from Documents

1. In Globodox, click the Globodox button.
2. Click the Options button. The Options window will be launched.
3. Select the DB options node in the left pane. The DB options will now be displayed in the right pane.
4. Check the Automatically extract text from documents on check-in option.
5. Click the OK button to apply the changes.
 

Note:
This option can only be selected if the Document Check-in/Checkout feature has been enabled. Selecting this option can significantly slow down adding and checking in documents.

 
Notes:

You can also switch from the Globodox OCR engine to the Microsoft Office OCR engine to extract text from documents. For more info, see Extract Text from Document.
On slower machines, you may want to turn off the automatic extraction and indexing of documents.
Globodox uses its built-in text extractor for MS Word (DOC, DOCX), MS Excel (XLS, XLSX) and PDF files (only PDF files that contain text, not just scanned images). For any other file format, an IFilter for that format must be installed on the user's machine before Globodox can extract text from files of that format.

 
IFilters for the following file formats are installed by default on Windows 2000/XP/2003/2008/Vista/7 machines:

PPT (Microsoft PowerPoint presentation)
HTML documents
TXT documents
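
The format rules in the notes above can be summarized as a small lookup. The following Python sketch is only an illustration of those rules and is not part of Globodox; the function name extraction_route, the returned labels, and the mapping of "HTML documents" to the .html and .htm extensions are assumptions made for this example.

```python
from pathlib import Path

# Formats the notes list as handled by the Globodox built-in text extractor.
# PDF qualifies only when the file contains text, not just scanned images;
# this sketch cannot check that and looks at the extension alone.
BUILT_IN = {".doc", ".docx", ".xls", ".xlsx", ".pdf"}

# Formats the notes list as having an IFilter installed by default on Windows.
# Treating "HTML documents" as .html/.htm is an assumption.
DEFAULT_IFILTER = {".ppt", ".html", ".htm", ".txt"}


def extraction_route(filename: str) -> str:
    """Return which mechanism the notes say applies to this file's format."""
    ext = Path(filename).suffix.lower()
    if ext in BUILT_IN:
        return "built-in extractor"
    if ext in DEFAULT_IFILTER:
        return "default IFilter"
    # Any other format needs an IFilter installed on the user's machine.
    return "IFilter must be installed"


if __name__ == "__main__":
    for name in ("report.docx", "slides.ppt", "drawing.dwg"):
        print(name, "->", extraction_route(name))
```

A lookup like this can help predict whether text from a given file is likely to be searchable after check-in, or whether an IFilter needs to be installed first.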

 
 


Related Topics
Search for text in a document
Automatically Index a document
Automatically Index a stack
Recognize barcodes on documents

 


Page URL: http://www.itaz.com/globodox/help/index.htm?automatically_extract_text_from_documents.htm