Batch Conversion of image PDFs to PDF Searchable Image (Exact) format
Hi,
I have several hundred large PDFs that were compiled from image files. I would like to set up a workflow to convert these PDFs into the Searchable Image (Exact) PDF format. I have been able to convert PDF files one by one using Adobe Acrobat 6.0. As each file may take an hour or more to convert, I was hoping to set up a multiple-file conversion process that I can leave to run overnight, possibly using Adobe Capture. I have Capture 3.0.
I have tried setting up workflows to do this in Capture, but have not been able to get Capture to accept a PDF as an input document (the PDF selection check boxes are greyed out when I try to select them).
I suppose one option is to convert my PDFs back to individual images and then set up workflows to compile these image files into Searchable Image (Exact) PDF documents, but this would be a very time-consuming exercise.
I am a very inexperienced user of Capture, so I may be missing something simple.
Any assistance would be much appreciated.
Regards
David.
David_C_Rowland@adobeforums.com
Re: Batch Conversion of image PDFs to PDF Searchable Image (Exact) format
Hi Sebastien,
Yes, I found I could do batch conversion using Acrobat 6.0 Professional (other versions may be able to do it too).
To do multiple conversions:
Click Advanced on the menu bar
Click Batch Processing
Click New Sequence
Give the new sequence a name
Click Select Commands
Click Paper Capture (under Document options)
Click Add to add this to the sequence
Click Edit and set your conversion options
Click OK
Select the desired option from Run commands on
Select the desired output location and output options (change file name, storage location, etc.)
Click OK.
Then run the sequence.
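After an overnight run over several hundred files, it can help to check which source PDFs still have no converted copy. The following is only a minimal sketch and is not part of Acrobat: it assumes the sequence writes each output to a separate folder under the same file name, and the function name `pending_pdfs` and the folder names in the usage note are hypothetical.

```python
from pathlib import Path


def _pdfs_by_name(folder):
    """Map lower-cased file name -> path for every PDF directly inside folder."""
    folder = Path(folder)
    if not folder.is_dir():
        return {}
    return {
        p.name.lower(): p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    }


def pending_pdfs(input_dir, output_dir):
    """Return source PDFs with no same-named file in output_dir.

    Assumption: the batch sequence keeps the original file name when it
    saves the converted PDF to the output folder.
    """
    sources = _pdfs_by_name(input_dir)
    done = _pdfs_by_name(output_dir)
    return [sources[name] for name in sorted(sources) if name not in done]
```

For example, `pending_pdfs("scans_in", "scans_out")` would list the files still to be processed, which could then be copied to a fresh folder and used as the input for the next run of the sequence.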
I was very happy with the processing and output.
Hope this helps with your work.
Cheers
Dave.
David_C_Rowland@adobeforums.com
Re: Batch Conversion of image PDFs to PDF Searchable Image (Exact) format
Capture runs OCR on image files, not PDFs. The beginning of your workflow should include the Split PDF and Convert PDF to Image steps; this gives you single-page TIF files for the OCR process. The end of your workflow should include Bind PDF, which rolls the pages back up into a multi-page PDF equivalent to what you started with.
Note that if your PDF had any text added to the image (pagination, etc.) prior to your OCR processing, it will probably fail to process through Capture. It is a safety feature that precludes running recognition on text-based files that might already include text of a better quality than that which might be generated by OCR.
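The order of the workflow steps described above (split, convert to image, OCR, bind) can be sketched as a plain data flow. This is only an illustration of the ordering, not Capture's API: every callable passed in is a hypothetical stand-in for the corresponding workflow step, and the toy versions below only track names.

```python
def ocr_via_images(pdf, split_pdf, pdf_to_image, ocr_image, bind_pdf):
    """Run the four stages in order; every callable is a hypothetical stand-in."""
    pages = split_pdf(pdf)                       # one single-page PDF per page
    images = [pdf_to_image(p) for p in pages]    # one single-page TIF per page
    searchable = [ocr_image(i) for i in images]  # OCR runs on images only
    return bind_pdf(searchable)                  # back to one multi-page PDF


# Toy stand-ins that only manipulate names, to show how data moves
# from one stage to the next.
result = ocr_via_images(
    "report.pdf",
    split_pdf=lambda pdf: [f"{pdf}#p{n}" for n in (1, 2)],
    pdf_to_image=lambda page: page + ".tif",
    ocr_image=lambda image: image + "+text",
    bind_pdf=lambda pages: " | ".join(pages),
)
print(result)
```

The point of the sketch is that the OCR stage never sees a PDF: it only receives the single-page images produced by the first two stages, and binding happens last.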