Calculating number of words from PDF files
Thread poster: Josephine Cassar
Josephine Cassar
Josephine Cassar  Identity Verified
Malta
Local time: 22:53
Member (2012)
English to Maltese
+ ...
Sep 10, 2015

Hello, I need to count the words in a PDF file I translated. When I copied and pasted the file, I could not see how many words there were, and I need the number for invoicing. Originally, I was told that there were some 4,100 words in 5 files, but I translated only 3 files. In the target language (UK English) I got 4,785 words, but I cannot tell how many there were in the source text (French). Can anyone help, please? I assume the source word count will be somewhat lower than the 4,785 I got in my translation. I would rather not leave it to the agency, as agencies may try to play about with word counts. Thank you for any help or ideas.


 
Serena Basili
Serena Basili  Identity Verified
Belgium
Local time: 22:53
English to Italian
+ ...
Scanned or original PDF? Sep 10, 2015

Is the PDF a scan or a normal document? When the file is an original PDF, I usually select the paragraphs whose word count I want to know and simply copy and paste them into Word.
Hope this helps!
Cheers

S.

Note: alternatively, you can put the PDF through a programme that converts it into a Word file (there are plenty of free tools on the internet) and then use the normal word count.

[Edited at 2015-09-10 08:52 GMT]
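If the pasted text ends up in a plain text file rather than in Word, a rough count can also be scripted. The following is only a minimal sketch in Python, assuming that a word is any run of characters separated by whitespace; Word's own counting rules may differ slightly, and the name `count_words` and the sample sentence are purely illustrative.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens (an approximation of a word count)."""
    # str.split() with no arguments splits on any run of spaces, tabs or newlines.
    return len(text.split())


# Illustrative French sentence, not taken from the actual files.
sample = "L'acte de naissance a été délivré le 10 septembre 2015."
print(count_words(sample))  # prints 10
```

Note that an elided form such as "L'acte" counts as one word here, which is one reason different tools can report different totals for French.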


 
Josephine Cassar
Josephine Cassar  Identity Verified
Malta
Local time: 22:53
Member (2012)
English to Maltese
+ ...
TOPIC STARTER
PDF or Scan Sep 10, 2015

A certificate is scanned, and there is a PDF file to accompany it, so when I copied and pasted I did not get the word count. Thank you. Do you know of a free conversion tool, please?

 
Josephine Cassar
Josephine Cassar  Identity Verified
Malta
Local time: 22:53
Member (2012)
English to Maltese
+ ...
TOPIC STARTER
Edit in word Sep 10, 2015

I tried to do it from the files in the email and could edit them, but the word count is given only as 'about' a certain number, so I do not know whether it is reliable. At least I have an idea now: in fact, there were more words than I got in the target language.

 
Andrzej Mierzejewski
Andrzej Mierzejewski  Identity Verified
Poland
Local time: 22:53
Polish to English
+ ...
OCR Sep 10, 2015

Use OCR (Optical Character Recognition) software to extract the text from the PDF file and save it in a Word-supported format (doc, docx, rtf, txt, etc., whichever you need). You will then have a document ready for your translation work, and Word will tell you the actual character or word count, whichever you need.

Such software is a "must have" for determining text volume. When your source file is an image of a scanned print, you have no other solution.

And there's one more important issue: do you calculate your payment based on the source or the target word count? Whatever the answer, you should make it clear with the agency before you start working. Please note that the word count depends on the counting settings: make sure that text fields etc. are included.

HTH

[Edited at 2015-09-10 09:16 GMT]
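To see why the counting basis matters for the invoice, here is a rough sketch in Python. It uses two figures mentioned in this thread (the 4,100 words originally quoted and the 4,785 words in the finished translation); the per-word rate of 0.08 and the function name `invoice_amount` are purely hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def invoice_amount(word_count: int, rate_per_word: Decimal) -> Decimal:
    """Return word_count * rate_per_word, rounded to two decimal places."""
    total = Decimal(word_count) * rate_per_word
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


RATE = Decimal("0.08")  # hypothetical rate per word, not from the thread

quoted = invoice_amount(4100, RATE)  # figure originally quoted by the agency
target = invoice_amount(4785, RATE)  # target-language count from the translation
print(quoted, target, target - quoted)  # prints 328.00 382.80 54.80
```

Decimal is used instead of float so that the amounts are not affected by binary rounding.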


 
Maija Cirule
Maija Cirule  Identity Verified
Latvia
Local time: 23:53
German to English
+ ...
Yes, I always use Sep 10, 2015

Andrzej Mierzejewski wrote:

Use OCR (Optical Character Recognition) software to extract the text from the PDF file and save it in a Word-supported format (doc, docx, rtf, txt, etc., whichever you need). You will then have a document ready for your translation work, and Word will tell you the actual character or word count, whichever you need.

Such software is a "must have" for determining text volume. When your source file is an image of a scanned print, you have no other solution.



HTH

[Edited at 2015-09-10 09:16 GMT]


the OCR program http://finereaderonline.com. It is not free, but it is not expensive. You can convert almost any PDF file to Word.


 
Maria Teresa Borges de Almeida
Maria Teresa Borges de Almeida  Identity Verified
Portugal
Local time: 21:53
Member (2007)
English to Portuguese
+ ...
Try FineCount Sep 10, 2015

http://www.tilti.com/software-for-translators/finecount/

 
Serena Basili
Serena Basili  Identity Verified
Belgium
Local time: 22:53
English to Italian
+ ...
Free tools Sep 10, 2015

This is a free OCR tool for the scanned file:

http://www.onlineocr.net/

For the PDF file, just copying and pasting the text into Word should do. Otherwise you can use this website, which converts PDF to Word; I have used it a few times (NB: use a copy of the PDF for the conversion!):

https://www.pdftoword.com/

[Edited at 2015-09-10 10:16 GMT]


 
Josephine Cassar
Josephine Cassar  Identity Verified
Malta
Local time: 22:53
Member (2012)
English to Maltese
+ ...
TOPIC STARTER
@ Andrzej Sep 10, 2015

The agreement was per source word, but the word count was quoted wrongly in the first email, and I was too busy to check the number of words before I finished the job. Thank you for your suggestions; I will try them out, as editing in Word gives an enormous number of words, which just cannot be right.

 
Peter Linton (X)
Peter Linton (X)  Identity Verified
Local time: 21:53
Swedish to English
+ ...
Scanned and normal in one PDF Sep 10, 2015

Unlikely to happen, but I once received a file that seemed straightforward, and the customer then complained that I had omitted a page.
It turned out that pages 1 and 3 were normal text, while page 2 was a graphics page, a snapshot that looked just like the original normal page, but didn't show up in a page count.
Worth a check when you receive a PDF.
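
A check like this can be roughly automated once the text of each page has been extracted with whatever tool you use. The sketch below is in Python and assumes you already have one text string per page; the function name `pages_without_text` and the sample data are purely illustrative.

```python
def pages_without_text(page_texts: list[str], min_chars: int = 1) -> list[int]:
    """Return the 1-based numbers of pages whose extracted text is (nearly) empty."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


# Illustrative data: page 2 is an image, so no text came out of it.
pages = ["Text of page one.", "", "Text of page three."]
print(pages_without_text(pages))  # prints [2]
```

Any page reported here is probably a scanned image and would need OCR before its words can be counted.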


 
Josephine Cassar
Josephine Cassar  Identity Verified
Malta
Local time: 22:53
Member (2012)
English to Maltese
+ ...
TOPIC STARTER
Managed to convert Sep 10, 2015

Just a short note to thank you all. One can use Google Drive, and there is a video on YouTube that shows how it can be done. I happened to come across NitroPro, which I found easy to use, so I converted the files with it. Surprisingly, there were over 5,000 words. It just shows that you must never take an agency's word for it, even with regard to word count, even if pressed. Thanks all.

 

