Python text files

One of the things I've always enjoyed about Python is how easy it is to write simple text transform programs that can get more sophisticated over time, and we're going to go down that path today.

Now that we can take some arguments in our Python scripts, let's get to work.

There are many ways to read a text file in Python, but I'm going to talk about just two: line by line, or all at once.

Line-by-line

Line by line is my default choice when I only need to look at one line at a time. It does less work up front and uses less memory, because the whole file is never loaded at once.

import argparse

def main_with_args(args):
  line_count = 0
  with open(args.infile) as f:
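    # Iterating over the file object yields one line at a time.
    for line in f:
      # Count only the lines that do not start with a comment marker.
      if not line.startswith("#"):
        line_count += 1
  print("Found {} non-comment lines in {}".format(line_count, args.infile))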

def main():
  parser = argparse.ArgumentParser(description="My program")
  parser.add_argument("infile")
  args = parser.parse_args()
  main_with_args(args)

if __name__ == '__main__':
  main()

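Note that the with block closes the file for us as soon as the loop is done, and that iterating over f reads one line at a time instead of loading the whole file. Each line also keeps its trailing newline.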

All at once

Now, if we needed to work with all the lines at once, perhaps because we want to look ahead or behind when processing a line, this is how main_with_args would change.

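def main_with_args(args):
  with open(args.infile) as f:
    # readlines() loads the whole file into a list, one string per line.
    lines = f.readlines()
  # The file is already closed here, but the list is still available.
  print("Found {} total lines in {}".format(len(lines), args.infile))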

I didn't show the filtering in this case, but note how I have all the lines in my lines list.
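Since the filtering isn't shown above, here is a minimal sketch of what looking behind could look like once everything is in a list. The function name count_repeated_lines and the "same as the previous line" rule are made up for illustration; swap in whatever comparison your own transform needs.

```python
def count_repeated_lines(lines):
  """Count lines that are identical to the line right before them."""
  repeats = 0
  # Start at index 1 so there is always a previous line to compare with.
  for i in range(1, len(lines)):
    # Look behind: compare the current line with the one before it.
    if lines[i] == lines[i - 1]:
      repeats += 1
  return repeats


def read_all_lines(path):
  # Same pattern as before: read everything, then let the with block close the file.
  with open(path) as f:
    return f.readlines()
```

Calling count_repeated_lines(read_all_lines(args.infile)) inside main_with_args would plug this into the script above. This kind of index arithmetic is only possible because the whole file is sitting in a list.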

The second method is also very handy when you want to overwrite the original file: you want the file closed before you open it again for writing, and by that point its contents are already safely in memory.
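As a rough sketch of that read-then-overwrite pattern, the function below drops comment lines from a file in place. The name strip_comment_lines is hypothetical, and it reuses the "#" rule from the first example; treat it as one possible way to do this, not the only one.

```python
def strip_comment_lines(path):
  """Remove lines starting with '#' from the file at path, in place."""
  # Read everything first; the with block closes the file when it ends.
  with open(path) as f:
    lines = f.readlines()

  # Keep only the lines that are not comments.
  kept = [line for line in lines if not line.startswith("#")]

  # The file is closed now, so reopen it for writing, which truncates it.
  with open(path, "w") as f:
    f.writelines(kept)

  # Report how many lines were dropped.
  return len(lines) - len(kept)
```

Because readlines() keeps the trailing newline on every line, writelines() can write the kept lines straight back without adding anything.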

Happy text reading!
