Frustratingly, if you take plain text documents from the magnificent Project Gutenberg and use a word processor to convert them, they end up with ‘hard returns’ which break the lines and prevent them from wrapping properly. This makes for a very awkward read.
The solution is to copy the text into a word processor file and remove all of the hard returns while at the same time (and this is important) making sure that the paragraph breaks remain.
- First step: protect the paragraph breaks. Find all instances of two consecutive hard returns (these separate paragraphs from each other, and headings from the text) and replace them with something unique, say FLESHISGRASS, which I will turn back into hard returns in the third step. This protects them from the second step. In MS Word, for example, I do this by Finding all instances of ^p^p (if you don’t know the code for a ‘Paragraph Mark’, as Word calls it, reveal all the Search Options and look for the menu of special, non-alphanumeric characters).
- The second step finds all the remaining hard returns and replaces them with either a single space or nothing. Which one depends on what currently ends the lines of text in your book: a space followed by the hard return, or just the hard return. Find this out by turning on the formatting marks, deleting a hard return and observing whether the words run together. It’s important to ascertain this, because if there is no space before the hard return and I don’t insert one during the Find/Replace, the last word of each line runs into the first word of the next.
- The third step finds all the unique codes (FLESHISGRASS in my example) entered in the first step and replaces them with two hard returns, so restoring the breaks between paragraphs, and between headings and text.
- Finally, convert to a PDF file as normal.
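
If you would rather script it than click through Find/Replace, the same three passes can be sketched in Python. This is only a rough illustration, not what Word does internally. The function name `unwrap` and the `PLACEHOLDER` constant are my own inventions. It assumes a hard return is a newline character, that the placeholder never occurs in the book, and (a small departure from the ^p^p search) it treats any run of two or more hard returns as a single paragraph break.

```python
import re

# Hypothetical marker; any string that never occurs in the book will do.
PLACEHOLDER = "FLESHISGRASS"


def unwrap(text: str) -> str:
    """Remove hard returns inside paragraphs but keep paragraph breaks."""
    # Treat Windows line endings as ordinary hard returns.
    text = text.replace("\r\n", "\n")

    # Step 1: protect paragraph breaks (two or more consecutive hard returns).
    text = re.sub(r"\n{2,}", PLACEHOLDER, text)

    # Step 2: replace each remaining hard return with one space. Any spaces
    # already sitting before the return are absorbed, so the words neither
    # run together nor end up with a double space between them.
    text = re.sub(r" *\n", " ", text)

    # Step 3: put the paragraph breaks back as two hard returns.
    return text.replace(PLACEHOLDER, "\n\n")


sample = "The quick brown\nfox jumps.\n\nSecond paragraph \nhere."
result = unwrap(sample)
print(result)
```

Because the second pass swallows any trailing spaces itself, the script does not need the "is there a space before the hard return?" check that the manual method depends on.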