Problems with OCR and small text Thread poster: James Greenfield
| James Greenfield United Kingdom Local time: 01:45 Member (2013) French to English + ...
Hi, I am currently translating a dead PDF. I managed to OCR the document and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader I tried to increase the resolution, but the results were equally bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and it is still unable to recognise the text. Many thanks for any advice.
send me the file? also, if it is raster, then all you have is all you have.

James Greenfield, TOPIC STARTER
Sergei Leshchinsky wrote: send me the file? also, if it is raster, then all you have is all you have.
Thanks, I've just sent you an email.

James Greenfield, TOPIC STARTER
Could anyone help? | Nov 29, 2015 |
I don't suppose anyone has really powerful OCR software and would be prepared to do me a massive favour? I can't manage to OCR the bibliography, which is in small text, and typing out the 64 entries by hand is going to take me a long time. Thanks very much.
Not sure if post-facto solutions will help | Nov 29, 2015 |
Hi James, I'm not an expert, but I think if the original document was not scanned at a high enough resolution, then attempts to increase the resolution of the scan won't help, because the "raw material" is inadequate. If I take a blurry photo of something, no amount of fiddling with the sharpness or resolution of the photo will give me a clear photo. I think the only alternative to typing out the text is to get a better scan. Good luck! Melissa

James Greenfield, TOPIC STARTER
Hi Melissa, Yes, I think that's right. This section is in English anyway, so I have decided not to include it. I thought about including it, as it is the bibliography and the French text refers to these English journals, but as you say there is no way of increasing the resolution, and typing it out by hand would take me an awfully long time. James

Do you really need to type it? | Nov 30, 2015 |
If the list of references is already in the target language anyway, it makes sense to ask the client if they'd accept it as a pasted image instead of text. If so, you can just copy it using the Snapshot tool of Adobe Reader, then paste it into your target document.

esperantisto Local time: 03:45 Member (2006) English to Russian + ... SITE LOCALIZER
Convert to black and white | Nov 30, 2015 |
In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results, even for small print. However, there is one setting (off by default) that can be useful: Tools → Options → General → More options… → Convert color/gray-scale images to black and white (I am translating these menu items from the Russian UI for FR 8.0, so they may be different in your case). Try it switched on. Also, if the sections in question are French only, do select French only for the language and (re)recognize.
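For anyone who wants to try the black-and-white conversion outside FineReader, here is a minimal sketch of the idea in plain Python. It is not FineReader's own algorithm, which nobody in this thread describes; it uses Otsu's method, a common way of picking a cut-off between "ink" and "paper". The function names `otsu_threshold` and `binarize` are made up for illustration, and the image is assumed to be a list of rows of gray values from 0 (black) to 255 (white).

```python
def otsu_threshold(pixels):
    """Return the gray level t that best separates dark pixels from light ones.

    Pixels with value <= t are treated as dark (text), the rest as light (paper).
    """
    # Histogram of the 256 possible gray levels.
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_dark = 0    # pixels at or below the candidate threshold
    sum_dark = 0  # sum of their gray values
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue  # nothing on the dark side yet
        if n_light == 0:
            break     # everything is on the dark side; no split left to test
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        # Between-class variance (up to a constant factor): larger is better.
        score = n_dark * n_light * (mean_dark - mean_light) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[0 if value <= threshold else 255 for value in row] for row in pixels]


# Hypothetical 3x4 patch: light paper with a few dark "ink" pixels.
patch = [
    [200, 190, 60, 210],
    [195, 70, 65, 205],
    [198, 202, 58, 199],
]
t = otsu_threshold(patch)
black_and_white = binarize(patch, t)
```

With small, faint print the choice of threshold matters a lot, which may be why switching the option on changes the recognition results in some cases and not in others.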
Tom in London United Kingdom Local time: 01:45 Member (2008) Italian to English
James Greenfield wrote: Hi, I am currently translating a dead PDF. I managed to OCR the document and the results were fine apart from the bibliography section at the end, which is in very small print. The results for this section make no sense. Using ABBYY FineReader I tried to increase the resolution, but the results were equally bad. Does anyone have any advice? This is the first time I have had this problem. Perhaps someone with ABBYY FineReader could guide me on how to increase the resolution properly. When I try to do this, the image size automatically becomes smaller and it is still unable to recognise the text. Many thanks for any advice.

I don't know about you, James, but my ABBYY FineReader for MacOS outputs to plain text. The resulting file can then be opened in Word and saved as a .doc file. Then you can alter the text any way you want to. I do this all the time.
[Edited at 2015-11-30 07:51 GMT]

Rolf Keller Germany Local time: 02:45 English to German
Enlarge the picture externally | Dec 1, 2015 |
esperantisto wrote: In my experience, increasing the resolution above 300 dpi has no noticeable effect on recognition results even for small print.
Ack.
Convert color/gray-scale images to black and white
Ack.
Plus plan C: Enlarge the picture beforehand. If need be, go to a copy shop, make an enlarged copy, try different contrast settings etc., then scan/export the result onto a USB stick. The shop staff will help you with this. Back in your office, OCR the file on the stick.
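A software analogue of plan C, for those who cannot get to a copy shop, is to enlarge the image before feeding it to the OCR program. The sketch below does this in plain Python with bilinear interpolation. It is only an illustration under stated assumptions: the function name `enlarge` is invented, the image is assumed to be a list of rows of gray values (0 to 255), and nothing here is taken from FineReader. As noted earlier in the thread, enlarging adds no detail that the scan did not already contain; it only makes the characters bigger for the recognition engine.

```python
def enlarge(pixels, factor):
    """Return a copy of a grayscale image scaled up by an integer factor.

    Uses bilinear interpolation: each new pixel is a weighted average of the
    (up to) four nearest pixels in the original image.
    """
    height, width = len(pixels), len(pixels[0])
    result = []
    for y in range(height * factor):
        # Map the centre of the output pixel back into source coordinates,
        # clamped so that we never read outside the original image.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), height - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, height - 1)
        fy = sy - y0
        row = []
        for x in range(width * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), width - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, width - 1)
            fx = sx - x0
            # Blend horizontally on the two source rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(round(top * (1 - fy) + bottom * fy))
        result.append(row)
    return result


# Hypothetical 2x2 patch enlarged to 4x4.
small = [
    [0, 255],
    [255, 0],
]
big = enlarge(small, 2)
```

In practice one would enlarge first and then experiment with contrast or black-and-white conversion, much as Rolf suggests doing at the copier.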