
Omnipage Pro 14 - How do I save OCR data in TIFF so I can ..

November 4, 2004 3:09:44 PM

Archived from groups: alt.comp.periphs.scanner,comp.ai.doc-analysis.ocr,microsoft.public.office.misc (More info?)

I want to scan documents, save the image for legal reasons but be able to
search the documents with the Start/Search/Documents utility.

I have Omnipage Pro 14 on a Home XP system. Omnipage OCR is much more
accurate than the MS Document Imaging so I'd like to use it.

However, when I scan a document, OCR it and save it to TIFF format the file
does not show up as being OCR'd by XP and documents scanned with Omnipage
don't show up in Start/Search/Documents search. Documents scanned and OCR'd
with Office Document Imaging are found correctly by the
Start/Search/Documents.

How do I save my OCR'd Omnipage documents in a TIFF so the data is available
to MS Document search?

I realize I could save as a TIFF plus a searchable Word document but I'd rather
have a single source picture/OCR data file. I could save as PDF but I
believe XP cannot search a PDF file. I'd appreciate any advice.

thanks
rj
Anonymous
November 4, 2004 3:09:45 PM

ScanSoft OmniPage Pro 14 is neither a part of MS Office, nor is it a
Microsoft product. If you bothered to read ScanSoft OmniPage Pro 14's
excellent Help files, you would know that....

TIFF is a GRAPHICS file format -- when you save your OCR as a TIFF, you are
not saving it as editable / searchable text -- you are saving it as a
PICTURE of text. Since there is, by definition, no text in a graphic file,
neither MS Windows, nor any other program, can find text within a graphic
file.

--
steve

nhit_whit_thenut_@yahoo.com
remove _thenut_ to reach me
Anonymous
November 4, 2004 9:04:07 PM

TIFF is an image (picture) format. It is not searchable; it's just dots. I
think you mean .RTF, which is a text format that allows different fonts and is
searchable. You could open the files in Word and save them as .DOC. It would
be easier to search for the text using *.rtf from the Start button. To have
the original image (picture) and be able to search it requires two files:
the text/doc file and the image file.
HTH
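
The two-file approach above can be sketched in Python. This is only an illustration under assumptions: the OCR text is saved next to each scan under the same base name (for example `invoice.tif` plus `invoice.txt`), and the function name `find_scans` is made up for this example. It is not part of OmniPage or Windows.

```python
from pathlib import Path


def find_scans(folder, query, image_ext=".tif"):
    """Return image files whose sidecar .txt file contains the query.

    Assumes each scan 'name.tif' has its OCR text saved as 'name.txt'
    in the same folder (a hypothetical naming convention).
    """
    needle = query.lower()
    hits = []
    for text_file in sorted(Path(folder).glob("*.txt")):
        text = text_file.read_text(encoding="utf-8", errors="ignore")
        # Case-insensitive substring match on the OCR text.
        if needle in text.lower():
            image = text_file.with_suffix(image_ext)
            # Only report the hit if the matching image really exists.
            if image.exists():
                hits.append(image)
    return hits
```

Calling `find_scans("C:/scans", "lease")` would then list the TIFF files whose text twin mentions "lease"; the folder path is just a placeholder.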
Anonymous
November 4, 2004 11:27:10 PM

RJ wrote:

> How do I save my OCR'd Omnipage documents in a TIFF so the data is available
> to MS Document search?

You can't. OCR takes you from an image file (like TIFF) to a text file or
word processor file. That kind of file can be word-indexed for searches. Going back
to an image file loses that benefit: you convert words to pixels, and lose the words.
And you need to OCR it over again to get back at the words.
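
To make "word-indexed for searches" concrete, here is a minimal sketch of a word index built from OCR output. It assumes the recognized text is already available as plain strings keyed by document name; `build_index` and `search` are illustrative names, not functions of any OCR product or of Windows search.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    # Start with the documents for the first word, then intersect the rest.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

A file that holds only pixels gives an index like this nothing to work with, which is why the words have to be stored somewhere as text.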

If you want to keep the original image, but with the text for searches,
PDF (as you note) or perhaps DjVu would be the right choices.

> I could save as PDF but I
> believe XP cannot search a PDF file.

Adobe Reader can -- and you're going to need it to access the PDF file anyway.
If you have very many PDF files, talk to Adobe -- I'm almost sure they have
a solution. It may not fit your budget, though.

I believe DjVu now is capable of something similar, but I don't think
many OCR programs can save to that format directly.

--
Anders Thulin ath*algonet.se http://www.algonet.se/~ath
Anonymous
November 4, 2004 11:44:41 PM

In article <ygwid.7948$d5.66434@newsb.telia.net>, Anders Thulin
<ath_no_spam_please@algonet.se> writes

>> I could save as PDF but I
>> believe XP cannot search a PDF file.
>
> Adobe Reader can -- and you're going to need it to access the PDF file
>anyway.
>If you have very many PDF files, talk to Adobe -- I'm almost sure they have
>a solution. It may not fit your budget, though.

For a cheap way to search PDF files, you could try SearchWithin from
http://www.software995.com/

--
Graham Jones
http://www.visiv.co.uk
Emails to graham@visiv.co.uk may be deleted as spam
Please add a j just before the @ to ensure delivery
Anonymous
November 5, 2004 3:34:20 AM

Well, in the first place, TIFF is an image-only format (a picture).

To have a searchable document, it must be OCR'ed to a text format such as
Microsoft Word or a plain TXT file. You can also save a searchable PDF from
Omnipage Pro 14.

To search a PDF you need Adobe Acrobat Reader.
http://www.adobe.com/products/acrobat/readstep2.html

--
CSM1
http://www.carlmcmillan.com
--
January 31, 2005 4:23:07 AM

Archived from groups: alt.comp.periphs.scanner (More info?)

RJ Wrote:
>
> How do I save my OCR'd Omnipage documents in a TIFF so the data is
> available to MS Document search?
>
> I realize I could save as TIFF and searchable Word document but I'd
> rather have a single source picture/OCR data file. I could save as PDF
> but I believe XP cannot search a PDF file. I'd appreciate any advice.
>
> thanks
> rj

Hello rj
There is a product called DocSmart (free download from
www.versis.co.uk) which we use for what it sounds like you are
trying to achieve. DocSmart can do instant full-text search and
retrieval on the text content of TIFFs, DjVu files and PDFs (and all
electronic files as well). You can preview and/or open the files and/or
use the built-in Windows Explorer type functions on the files (e.g.
Print, Copy, Send, etc.).
Bonus - the workstation version is free.

NOTE - regarding the DjVu file format: when you create a DjVu file its OCR
is done by the IRIS engine, so it is really accurate. DocSmart
searches this OCR'd text content.
We used to use TIFF but had the same problem as you are having. Then
we needed colour scanning for some documents, for which TIFF was no good, so
we used PDF for a while. However, the files were still too large most of
the time, so we now use DjVu for everything and have never looked
back.

A 300 dpi scanned A4 page in colour is about 50 KB and looks exactly the
same as the TIFF or PDF version (24 MB and 5 MB respectively). You view
DjVu docs in Internet Explorer.
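
As a rough check on the TIFF figure, the uncompressed size of a scanned page can be worked out from the page dimensions. The sketch below assumes A4 at 210 x 297 mm, 24-bit colour (3 bytes per pixel) and no compression; the function name is made up for this example. The PDF and DjVu sizes depend on the compression used and on the page content, so they cannot be derived this way.

```python
def uncompressed_page_bytes(width_mm=210, height_mm=297, dpi=300, bytes_per_pixel=3):
    """Estimate the raw size of a scanned page, ignoring headers and compression."""
    # 25.4 mm per inch converts the page dimensions to pixels at the given dpi.
    width_px = round(width_mm / 25.4 * dpi)
    height_px = round(height_mm / 25.4 * dpi)
    return width_px * height_px * bytes_per_pixel


size = uncompressed_page_bytes()
print(f"{size / 1024**2:.1f} MiB uncompressed")
```

Under those assumptions the result is a little under 25 MiB, which is in line with the size quoted for the TIFF version.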

I can't understand why more people are not going crazy over the DjVu
format, as it is a life saver for anyone who needs to scan paper or send
really small, non-editable electronic files (e.g. a Word,
PowerPoint or CAD document).


I hope this helps.

Cheers

Barry


--
Barry