Re: Copying Hebrew text from a PDF into a Translation tool - OCR (Optical Character Recognition) Help Request #general


Joyaa Antares
 

Thank you all very much for your input and some really wonderful ideas!  I'll give a status report here for those interested in the topic now and for the record.
Unfortunately, the original PDFs, whilst legible and intelligible to someone fluent in Hebrew, cannot be read as text by Adobe Acrobat or by any of the solutions suggested to date. I am reasonably sure that Dahn Cukier has given the correct reason: the original documents may have been created as images and then saved as PDFs, which would mean there is no text layer to copy. (The suggestions from Gary Binetter and Meir Razy, whilst offering hope, didn't work in this instance. I have also tried copying text from the document using a paid, full version of Adobe Acrobat, without success [thank you Peter Straus].)
Therefore, I am running with Avraham Kahana's suggestion of trialling https://convertio.co/ on one of my five files. The program converted the PDF into an MS Word document (my choice from the list of output types offered) that looks like utter garbage, containing Chinese characters, numbers, and all kinds of glyphs. Still, this is much more promising than the blank content that resulted from other attempts at file conversion. I plan to send this "garbage" file to Dahn Zukrowicz to see what can be made of it. If this fails, I'll follow up a suggestion from David Lewin to approach the National Library of Israel to see whether they hold the documents in a better format (I think it's unlikely, so I am trying Dahn's method first).
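For anyone who wants a quick, rough way to judge whether converted text is plausibly Hebrew or mostly "garbage" of the kind described above, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name `script_share` is hypothetical, the text is assumed to have already been pasted or exported out of the Word file as a plain string, and the buckets use the Unicode Hebrew block (U+0590 to U+05FF) and the main CJK ideograph block (U+4E00 to U+9FFF) as a crude stand-in for "Hebrew" and "Chinese characters".

```python
def script_share(text: str) -> dict[str, float]:
    """Return the fraction of non-whitespace characters in each rough bucket.

    Buckets (an assumption made for illustration, not a full script detector):
      hebrew - Unicode Hebrew block, U+0590..U+05FF
      cjk    - CJK Unified Ideographs, U+4E00..U+9FFF
      digit  - any character for which str.isdigit() is true
      other  - everything else (Latin letters, punctuation, stray glyphs)
    """
    counts = {"hebrew": 0, "cjk": 0, "digit": 0, "other": 0}
    total = 0
    for ch in text:
        if ch.isspace():
            continue  # whitespace says nothing about the script
        total += 1
        cp = ord(ch)
        if 0x0590 <= cp <= 0x05FF:
            counts["hebrew"] += 1
        elif 0x4E00 <= cp <= 0x9FFF:
            counts["cjk"] += 1
        elif ch.isdigit():
            counts["digit"] += 1
        else:
            counts["other"] += 1
    if total == 0:
        # Empty or whitespace-only input: report zero for every bucket.
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    sample = "שלום 漢字 12"  # made-up sample, not taken from the actual files
    for name, share in script_share(sample).items():
        print(f"{name}: {share:.0%}")
```

A low "hebrew" share alongside a high "cjk" or "other" share would suggest the conversion did not recover usable Hebrew text, though it cannot say why.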
I will report back here.
Joyaa ANTARES
Gold Coast, Qld, Australia
