
Find the number of characters in a file using Python


Sum up the length of all words in a line:

characters += sum(len(word) for word in wordslist)

The whole program:

with open('my_words.txt') as infile:
    lines=0
    words=0
    characters=0
    for line in infile:
        wordslist=line.split()
        lines=lines+1
        words=words+len(wordslist)
        characters += sum(len(word) for word in wordslist)
print(lines)
print(words)
print(characters)

Output:

3
13
35

This:

(len(word) for word in wordslist)

is a generator expression. It is essentially a loop in one line that produces the length of each word. We feed these lengths directly to sum:

sum(len(word) for word in wordslist)
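To make the equivalence concrete, here is a small sketch that computes the same total with an explicit loop and with the generator expression. The sample line and the names total_loop and total_gen are made up for illustration.

```python
# Hypothetical sample line; any string works here.
line = "the quick brown fox\n"
wordslist = line.split()

# Explicit loop: add up the length of each word.
total_loop = 0
for word in wordslist:
    total_loop += len(word)

# Generator expression: the same lengths, fed straight into sum().
total_gen = sum(len(word) for word in wordslist)

print(total_loop, total_gen)  # both totals are 16
```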

Improved version

This version takes advantage of enumerate, so you save two lines of code, while keeping the readability:

with open('my_words.txt') as infile:
    words = 0
    characters = 0
    for lineno, line in enumerate(infile, 1):
        wordslist = line.split()
        words += len(wordslist)
        characters += sum(len(word) for word in wordslist)
print(lineno)
print(words)
print(characters)
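One caveat with the enumerate approach: if the file is empty, the loop body never runs and lineno is never assigned, so print(lineno) raises a NameError. A possible guard is to initialize lineno before the loop. The sketch below uses io.StringIO in place of a real file, and count_lines is a hypothetical helper name.

```python
import io

def count_lines(stream):
    # Hypothetical helper: start lineno at 0 so an empty stream
    # still leaves it defined after the loop.
    lineno = 0
    for lineno, _line in enumerate(stream, 1):
        pass
    return lineno

print(count_lines(io.StringIO("one\ntwo\nthree\n")))  # 3
print(count_lines(io.StringIO("")))                   # 0
```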

This line:

with open('my_words.txt') as infile:

opens the file with the promise to close it as soon as you leave the indented block. It is always good practice to close a file after you are done using it.
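You can check this behaviour yourself through the file object's closed attribute. The sketch below writes a throwaway temporary file first so that it runs on its own. The file contents and variable names are assumptions for illustration.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello world\n")
    path = tmp.name

with open(path) as infile:
    first_line = infile.readline()
    inside = infile.closed    # False: still open inside the block

outside = infile.closed       # True: closed once the block is left

print(inside, outside)
os.remove(path)               # clean up the temporary file
```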


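Remember that each line (except possibly the last) ends with a line separator, i.e. "\r\n" on Windows or "\n" on Linux and Mac. When a file is opened in text mode, Python hands you either form as a single "\n".

Thus, exactly two extra characters are counted in this case, which is why you get 47 and not 45.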
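The effect is easy to reproduce on a tiny input. In this sketch the three-line text is made up, and io.StringIO stands in for an open file. The two totals differ by exactly the two line separators.

```python
import io

# Hypothetical three-line text; the last line has no trailing newline.
text = "ab cd\nef\ngh ij"

with_newlines = 0
without_newlines = 0
for line in io.StringIO(text):
    with_newlines += len(line)                  # counts the "\n" too
    without_newlines += len(line.rstrip("\n"))  # separator removed first

print(with_newlines, without_newlines)  # 14 12
```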

A nice way to overcome this could be to use:

fname = input("enter the name of the file:")
lines = 0
words = 0
characters = 0
with open(fname, 'r') as infile:
    for line in infile:
        # drop the trailing line separator before counting
        line = line.rstrip("\r\n")
        wordslist = line.split()
        lines = lines + 1
        words = words + len(wordslist)
        characters = characters + len(line)
print(lines)
print(words)
print(characters)


To count the characters, add up the length of each individual word. You could use another loop for that:

for word in wordslist:
    characters += len(word)

That ought to do it. Note that split() with no arguments already discards surrounding whitespace, including the trailing newline, so wordslist = line.rstrip().split() is harmless but not required.
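A quick way to see how split() treats the trailing newline is to compare both forms on one line. The sample string below is hypothetical.

```python
# Hypothetical line as it might be read from a file.
line = "  some words here \n"

plain = line.split()              # no argument: splits on runs of whitespace
stripped = line.rstrip().split()  # strip the right side first, then split

print(plain)
print(plain == stripped)
```

Both calls give the same list of words here, so the newline never ends up inside a word.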