Omnipage Pro 14 - How do I save OCR data in TIFF so I can ..

Archived from groups: alt.comp.periphs.scanner,comp.ai.doc-analysis.ocr,microsoft.public.office.misc

I want to scan documents, save the image for legal reasons but be able to
search the documents with the Start/Search/Documents utility.

I have Omnipage Pro 14 on a Home XP system. Omnipage OCR is much more
accurate than the MS Document Imaging so I'd like to use it.

However, when I scan a document, OCR it and save it to TIFF format the file
does not show up as being OCR'd by XP and documents scanned with Omnipage
don't show up in Start/Search/Documents search. Documents scanned and OCR'd
with Office Document Imaging are found correctly by the
Start/Search/Documents.

How do I save my OCR'd Omnipage documents in a TIFF so the data is available
to MS Document search?

I realize I could save as TIFF and searchable Word document but I'd rather
have a single source picture/OCR data file. I could save as PDF but I
believe XP cannot search a PDF file. I'd appreciate any advice.

thanks
rj
  1.

    ScanSoft OmniPage Pro 14 is neither a part of MS Office, nor is it a
    Microsoft product. If you bothered to read ScanSoft OmniPage Pro 14's
    excellent Help files, you would know that....

    TIFF is a GRAPHICS file format -- when you save your OCR as a TIFF, you are
    not saving it as editable / searchable text -- you are saving it as a
    PICTURE of text. Since there is, by definition, no text in a graphic file,
    neither MS Windows, nor any other program, can find text within a graphic
    file.

    --
    steve

    nhit_whit_thenut_@yahoo.com
    remove _thenut_ to reach me
  2.

    TIFF is an image (picture) format. It is not searchable; it's just dots. I
    think you mean .RTF, a text format that allows different fonts and is
    searchable. You could open the files in Word and save them as .DOC. It would
    be easier to search for the text using *.rtf from the Start button. To have
    the original image (picture) and still be able to search it requires two
    files: the text/doc file and the image file.
    HTH
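The two-file approach described in this reply can be sketched in Python. This is only an illustration, assuming each scanned image (say `invoice.tif`) sits next to a text file with the same base name (`invoice.txt`) that holds the OCR output. The naming convention and the `find_scans` helper are hypothetical; neither OmniPage nor Windows provides them.

```python
from pathlib import Path


def find_scans(folder, phrase, image_ext=".tif", text_ext=".txt"):
    """Return image files whose sidecar text file contains `phrase`.

    Assumes a hypothetical layout where scan.tif and scan.txt share a base name.
    """
    needle = phrase.lower()
    matches = []
    for text_file in sorted(Path(folder).glob(f"*{text_ext}")):
        image_file = text_file.with_suffix(image_ext)
        if not image_file.exists():
            continue  # OCR text without a matching image: skip it
        text = text_file.read_text(encoding="utf-8", errors="replace")
        if needle in text.lower():  # case-insensitive substring match
            matches.append(image_file)
    return matches
```

Calling `find_scans("C:/scans", "lease")` would then list the images whose OCR text mentions "lease", so the picture stays untouched while the text file does the searching.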
  3.

    RJ wrote:

    > How do I save my OCR'd Omnipage documents in a TIFF so the data is available
    > to MS Document search?

    You can't. OCR takes you from an image file (like TIFF) to a text file or
    word processor file. That kind of file can be word-indexed for searches. Going back
    to an image file loses that benefit: you convert words to pixels, and lose the words.
    And you need to OCR it over again to get back at the words.
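To illustrate what "word-indexed" means here, a minimal sketch of an inverted index built from OCR text output. The document names and the `build_index` and `search` helpers are hypothetical, and a real desktop search tool does considerably more (stemming, ranking, incremental updates).

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return names of documents that contain every word in `query`."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

An image file gives an indexer like this nothing to work with, because there are no words to split out, only pixels.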

    If you want to keep the original image, but with the text for searches,
    PDF (as you note) or perhaps DjVu would be the right choices.

    > I could save as PDF but I
    > believe PDF but XP cannot search a PDF file.

    Adobe Reader can -- and you're going to need it to access the PDF file anyway.
    If you have very many PDF files, talk to Adobe -- I'm almost sure they have
    a solution. It may not fit your budget, though.

    I believe DjVu now is capable of something similar, but I don't think
    many OCR programs can save to that format directly.

    --
    Anders Thulin ath*algonet.se http://www.algonet.se/~ath
  4.

    In article <ygwid.7948$d5.66434@newsb.telia.net>, Anders Thulin
    <ath_no_spam_please@algonet.se> writes

    >> I could save as PDF but I
    >> believe PDF but XP cannot search a PDF file.
    >
    > Adobe Reader can -- and you're going to need for accessing the PDF file
    >anyway.
    >If you have very many PDF files, talk to Adobe -- I'm almost sure they have
    >a solution. It may not fit your budget, though.

    For a cheap way to search PDF files, you could try SearchWithin from
    http://www.software995.com/

    --
    Graham Jones
    http://www.visiv.co.uk
    Emails to graham@visiv.co.uk may be deleted as spam
    Please add a j just before the @ to ensure delivery
  5.

    Well, in the first place, TIFF is an image-only format (a picture).

    To have a searchable document, it must be OCR'ed to a text format such as
    Microsoft Word or a plain TXT file. You can also save a searchable PDF from
    Omnipage Pro 14.

    To search a PDF you need Adobe Acrobat Reader.
    http://www.adobe.com/products/acrobat/readstep2.html

    --
    CSM1
    http://www.carlmcmillan.com
    --
  6. Archived from groups: alt.comp.periphs.scanner

    RJ Wrote:
    >
    > How do I save my OCR'd Omnipage documents in a TIFF so the data is
    > available
    > to MS Document search?
    >
    > I realize I could save as TIFF and searchable Word document but I'd
    > rather
    > have a single source picture/OCR data file. I could save as PDF but I
    > believe PDF but XP cannot search a PDF file. I'd appreciate any
    > advice.
    >
    > thanks
    > rj

    Hello rj
    There is a product called DocSmart (free download from
    www.versis.co.uk) which we use to achieve what it sounds like you are
    trying to achieve. DocSmart can do instant full text search and
    retrieval on the text content of TIFFs, DjVu, and PDFs (and all
    electronic files as well). You can preview and/or open the files and/or
    use the built in Windows Explorer type functions on the files (eg.
    Print, Copy, Send, etc..).
    Bonus - the workstation version is free.

    NOTE - regarding DjVu file format: When you create a DjVu file its OCR
    is done by the IRIS engine and so it is really accurate. DocSmart
    searches this OCR'd text content.
    We used to use TIFF but had the same problem as you are having. Then
    we needed colour scanning for some documents, so TIFF was no good and
    we used PDF for a while. However, the files were still too large most of
    the time, so we now use DjVu for everything and have never looked
    back.

    A 300 dpi scanned A4 page in colour is about 50 KB and looks exactly the
    same as the TIFF or PDF version (24 MB and 5 MB respectively). You view
    DjVu docs in Internet Explorer.
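The size quoted for the TIFF version is consistent with an uncompressed 24-bit colour scan. A rough check in Python, assuming A4 is 210 x 297 mm, 3 bytes per pixel and no compression (the post does not say how the TIFF was saved, so these are assumptions):

```python
MM_PER_INCH = 25.4


def uncompressed_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Raw pixel data size for a scan, ignoring headers and compression."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel


size = uncompressed_bytes(210, 297, 300)  # A4 at 300 dpi, 24-bit colour
print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 26099520 bytes = 24.9 MiB
```

That works out to 2480 x 3508 pixels, or roughly 25 MB of raw pixel data, which matches the figure given for the TIFF.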

    I can't understand why more people are not going crazy over the DjVu
    format as it is a life saver for anyone who needs to scan paper or send
    really small file sized, non-editable electronic files (eg. a Word,
    Powerpoint, CAD document...).


    I hope this helps.

    Cheers

    Barry


    --
    Barry