# Will OCR help with this?



## Slackjaw (Jan 14, 2011)

Seeking some technical guidance regarding a book my 94 year old Mum-in-law is writing.
 
The issues are numerous. I’ll do my best to sum up what I’m up against.
 
The book has been written entirely as a PDF using MS “Wordpad”.
 
The process to spell check the PDFs and make other corrections is remarkably painful, to the point I offered some assistance, only to discover that what I thought might be a somewhat easy fix is anything but…
 
I’ve used Google Docs to convert the PDFs to LibreOffice .odt, which allows spell checking, BUT all ‘images’ do not carry over. This means all images would have to be imported and inserted back in the appropriate spots. This would be a more daunting process than I am willing to undertake. Why? You might ask….
 
Well, the book is now up to approximately 500 pages with more to come, and on those 500 pages there are more than SEVEN HUNDRED AND THIRTY NINE images!!! With more to come.
 
I’m aware of the limitations PDFs present, but Wordpad is the only app mum feels comfortable with and at this point, 500 pages in… what do you do?
 
I’m wondering if an OCR program of some description can be employed to convert the complete document with all images/photos intact and in place.

Hopefully I’ve explained my dilemma. Thanks for what comes back.

CliveK
macOS 10.15.4


----------



## wonderings (Jun 10, 2003)

Sounds like you need a page layout program. I would take a look at Affinity Publisher, which is an InDesign-like application for a fraction of the cost and no subscription.

https://affinity.serif.com/en-gb/publisher/

You should be able to import a Word file with images like you can do in InDesign. I think there is a 90-day full-feature demo available, so you could give it a shot. If it works, the price is around $60 I believe, and it may be on sale for 50% off at the moment. So both ways it is really cheap for software that can hopefully help get this set up more easily.


----------



## eMacMan (Nov 27, 2006)

wonderings said:


> Sounds like you need a page layout program. I would take a look at Affinity Publisher, which is an InDesign-like application for a fraction of the cost and no subscription.
> 
> https://affinity.serif.com/en-gb/publisher/
> 
> You should be able to import a Word file with images like you can do in InDesign. I think there is a 90-day full-feature demo available, so you could give it a shot. If it works, the price is around $60 I believe, and it may be on sale for 50% off at the moment. So both ways it is really cheap for software that can hopefully help get this set up more easily.


Does it import correctly from the pdf format the OP referred to? I would think it should but nothing is ever certain in the digital world. 

I suspect if he was working with a Word file there would have been no problem in the first place.


----------



## wonderings (Jun 10, 2003)

eMacMan said:


> Does it import correctly from the pdf format the OP referred to? I would think it should but nothing is ever certain in the digital world.
> 
> I suspect if he was working with a Word file there would have been no problem in the first place.


Oh, missed that bit.

This might be the one place where Publisher's PDF handling could be a good thing. Affinity Publisher likes to make a PDF editable. It is a nightmare as it cannot use embedded fonts, so if you are working with client-supplied PDFs and place them in Publisher they can get seriously messed up. But in this case it might be a good thing, as it "should" make your PDF editable in Publisher with all your images and formatting. Trial is free so no harm in trying, but if I am understanding correctly this might just work.


----------



## pm-r (May 17, 2009)

I think I would want to contact the potential publisher of the book and get their suggestions as to what and how to use their suggested applications.

No one will want to have to redo a 500+ page book for spelling, formatting, file type, etc.

I would have thought that writing a lengthy book as a PDF would be the last choice, and I can't see any advantage in using any OCR. That's at least doubling the amount of work involved, I would think.



- Patrick
======


----------



## WCraig (Jul 28, 2004)

Slackjaw said:


> ...
> The book has been written entirely as a PDF using MS “Wordpad”.
> ...


Wordpad's native file format is NOT pdf. Get access to the original file on Windows. MS Word can read native Wordpad files. Or from Windows, save the file in rtf format and then many word and document processors will be able to read it.

Craig
(Let me guess...the original file is corrupt and there are no backups.)


----------



## Slackjaw (Jan 14, 2011)

Thanks for the suggestions. I'll take a look at the programs suggested.

@WCraig. I'm aware that .pdf is not the native file format for Wordpad, but mum saved everything as a .pdf. There are no rtf files to be found.

CliveK


----------



## pm-r (May 17, 2009)

> Craig
> (_Let me guess...the original file is corrupt and there are no backups._)



Isn't it amazing how often that seems to be so true!!!

Multiple years of work and typing so often will disappear into uselessness.




- Patrick
======


----------



## WCraig (Jul 28, 2004)

Slackjaw said:


> ... @WCraig. I'm aware that .pdf is not the native file format for Wordpad, but mum saved everything as a .pdf. There are no rtf files to be found.
> 
> CliveK


This makes no sense. How can your mother continue to write and edit the book? Wordpad cannot read a pdf file; it can only output them. Thus there must be a native file somewhere. I'm not at a Windows machine but I don't think rtf is the native format for Wordpad files, either.
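If an editable original does exist somewhere on the Windows machine, a quick search may turn it up. The following is only a minimal sketch, not something anyone in the thread ran: the function name `find_editable_files` and the list of extensions are assumptions for illustration, and the list should be adjusted to whatever formats the installed version of Wordpad can actually save.

```python
import sys
from pathlib import Path

# Assumed set of editable document extensions to look for (hypothetical list;
# adjust it to match what the installed Wordpad version can save).
EDITABLE_SUFFIXES = {".rtf", ".docx", ".doc", ".odt", ".txt"}


def find_editable_files(root):
    """Return editable-looking documents under root, most recently modified first."""
    matches = []
    for path in Path(root).rglob("*"):
        try:
            if path.is_file() and path.suffix.lower() in EDITABLE_SUFFIXES:
                matches.append((path.stat().st_mtime, path))
        except OSError:
            # Skip files that cannot be inspected (permissions, broken links).
            continue
    # Newest files first, since the working copy was probably edited recently.
    matches.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in matches]


if __name__ == "__main__":
    # Usage: python find_originals.py C:\Users\SomeName\Documents
    search_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for found in find_editable_files(search_root):
        print(found)
```

Sorting by modification time puts the most recently edited candidates at the top, which is where a working copy of the book would most likely appear if there is one.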

Craig


----------



## pm-r (May 17, 2009)

WCraig said:


> This makes no sense. How can your mother continue to write and edit the book? Wordpad cannot read a pdf file; it can only output them. Thus there must be a native file somewhere. I'm not at a Windows machine but I don't think rtf is the native format for Wordpad files, either.
> 
> Craig



I gather she must be using a Windows computer as well???



> RTF was created by the Microsoft Word team back in the 1980’s. It was intended as a universal format that could be used by most word processors, making it easier for people to share Word documents with people who don’t use Word. _It was also incorporated as the default format used by Windows’ built-in WordPad app—a lightweight word processor._






- Patrick
======


----------



## WCraig (Jul 28, 2004)

My bad on the rtf file format.

The OP might try converting the pdf into something more useable. I've never tried it, but the following utility says it can translate a pdf to .doc or .docx format. 

https://pdfsam.org

Whether it will choke on the 700+ images is anybody's guess.

Craig


----------



## pm-r (May 17, 2009)

> My bad on the rtf file format.



I don't think you goofed up regarding the RTF file format at all.

I do think the OP's description of how their 94 year old Mum-in-law is working with the large document is all rather confusing and a bit messed up.

I would suggest they make a backup or two and then get it into a format that an application designed for writing large 500+ page books can use.
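For the "backup or two" step, here is a minimal sketch of a timestamped copy, so that no conversion experiment ever touches the only copy of the book. The function name `backup_file` and the example paths are hypothetical, and two backups made within the same second would share a file name.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source, backup_dir):
    """Copy source into backup_dir under a timestamped name and return the new path."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Example result: book-20240131-154500.pdf
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}-{stamp}{source.suffix}"
    # copy2 copies the file contents and tries to preserve timestamps.
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Hypothetical paths; point these at the real book file and an external drive.
    print(backup_file("book.pdf", "backups"))
```

Keeping the backup folder on a different drive from the original is the safer choice, since a copy on the same disk does not help if that disk fails.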

As I posted earlier, I would also check with the potential publisher as to what they suggest one should use, and what the final format file type should be. 

They may even suggest what fonts to avoid using and a few other helpful hints.

I also wonder if the book's file is just one large file, rather than being broken down into smaller chapter files.

Some proper organization is going to be needed for proofreading, spellchecking, grammar, etc. Maybe even chapters and page numbers; are graphics included, and are any footnotes or subtitles needed?

Lots of stuff to be considered I would say. But I wouldn't think that using a PDF would be the ideal way of writing and editing a book. Certainly NOT a book of 500+ pages. Gads!!! 





- Patrick
======


----------



## unblocktheplanet (Feb 5, 2008)

eMacMan said:


> I suspect if he was working with a Word file there would have been no problem in the first place.


Unfortunately, Word does not, to my working knowledge, support inline images.

ODT is a good choice for text-only. But RTFD, via TextEdit, is what I use to create most docs with text & images. PDFs are a big drag to format to beautiful pages.

Will Wordpad convert to plaintext?

This is one of those many dratted problems of PC-to-Mac. (Remember, Bill Gates just wants to save the world!)

There are good Mac OCR apps for PDFs such as FineReader & Readiris. But I don't think simple OCR is going to solve your problem.

Good on your Mum for giving this a go! Way to go, girl!


----------



## unblocktheplanet (Feb 5, 2008)

One other idea. Can PDFs be exported to Google Docs? Those are editable.


----------



## WCraig (Jul 28, 2004)

unblocktheplanet said:


> Unfortunately, Word does not, to my working knowledge, support inline images.
> ...


Your knowledge is quite incomplete then. Of course MS Word allows images to be embedded in a document inline. Lots of people use Word as a page layout program. It is miles more capable than Wordpad. Reading between the lines, I imagine the OP's 94 y.o. mother is bewildered by Word's user interface.

Craig


----------



## unblocktheplanet (Feb 5, 2008)

Me, too, Craig!


----------



## pm-r (May 17, 2009)

> I imagine the OP's 94 y.o. mother is bewildered by Word's user interface.



Actually, the OP's _94 year old Mum-in-law_. And yes, even the interface of MS Office 2011 was bad enough and I'm sure it has got even more involved since, but I'll give it credit for its abilities, and it may be very worthwhile learning them.

It's much easier working with an application that CAN do what you want than trying to work with an application that CAN'T and having to figure out a workaround, regardless of any seemingly complicated interface.

BTW: I'm still a bit confused as to what computer and application she is using, Apple or Windows.





- Patrick
======


----------

