How to get rid of junk OCR character leftover in Word
Thread poster: Susan Welsh
Susan Welsh  Identity Verified
United States
Local time: 22:08
Russian to English
Apr 16, 2013

I have converted a PDF to Word using ABBYY FineReader, and wherever there was a hyphen at a line ending, the Word version has put in a junk character that I cannot search for and replace to get rid of. It looks like a horizontal line with a short vertical line hanging down from the back of it -- like an L rotated 90 degrees clockwise. I have copied it into my Find field, but Word can't find it.

There are hundreds of these things in this rather long document, and I would really like to get a clean text to make translating easier.

Any suggestions?

Thanks in advance!


 
Kevin Fulton  Identity Verified
United States
Local time: 22:08
German to English
Look under special characters Apr 16, 2013

If I recall correctly, this is the optional hyphen, which the Find box matches with ^-.

 
Sam Pinson  Identity Verified
United States
Local time: 20:08
Member (2011)
Russian to English
Optional hyphens can be replaced in Word Apr 16, 2013

Hi, Susan.

Please see my blog article on how to replace these "optional hyphens".
http://pinsonlingo.com/blog/2011/05/27/tag-char-namesoftbreakhyphen-removed/.


 
LEXpert  Identity Verified
United States
Local time: 21:08
Member (2008)
Croatian to English
Easy! Apr 16, 2013

This is very common in multi-column articles.
Open Word's Find and Replace dialog.
Under Find, click the "More >>" button.
Place the cursor in the Find box, and from the Special drop-down menu select "optional hyphen".
Leave the Replace box blank.
Click Replace All.


That's it.
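
If the text has already been exported from Word to plain text, the same cleanup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the optional hyphens survive the export either as the Unicode soft hyphen (U+00AD) or as the control character 0x1F, and the function name `strip_optional_hyphens` is made up for this example. It does not operate on .doc/.docx files directly.

```python
# Characters assumed to stand in for Word's optional hyphen in exported text.
SOFT_HYPHEN = "\u00ad"          # Unicode SOFT HYPHEN
WORD_OPTIONAL_HYPHEN = "\x1f"   # assumption: control code used for the optional hyphen

# Translation table that deletes both characters and leaves everything else alone.
_REMOVE = str.maketrans("", "", SOFT_HYPHEN + WORD_OPTIONAL_HYPHEN)


def strip_optional_hyphens(text: str) -> str:
    """Return text with optional-hyphen characters deleted; ordinary hyphens are kept."""
    return text.translate(_REMOVE)


if __name__ == "__main__":
    sample = "trans\u00adlation of a multi-column arti\x1fcle"
    print(strip_optional_hyphens(sample))  # prints: translation of a multi-column article
```

Ordinary hyphens (as in "multi-column") are left untouched, which mirrors what Replace All does when only "optional hyphen" is in the Find box.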


 
esperantisto  Identity Verified
Local time: 05:08
Member (2006)
English to Russian
SITE LOCALIZER
Better take care of it in FR Apr 16, 2013

In FineReader, go to Tools → Options → 4. Save → Format Settings → RTF/DOC/Word XML, tick Remove Optional Hyphens, and re-export your document.

[Edited at 2013-04-16 07:57 GMT]


 
Susan Welsh  Identity Verified
United States
Local time: 22:08
Russian to English
TOPIC STARTER
Thanks! Apr 16, 2013

I used Rudolf's solution, and it worked like a charm. (I didn't want to go back to FR, because I had already done some formatting work on the Word file, like moving footnotes around.)

Thanks to all.


 

