Python Scripting for Computational Molecular Science: Glossary

Key Points

Introduction
  • You can assign the values of several variables at once.

  • Indentation is very important in for loops and if statements. Don’t forget the colon (:) at the end of the for or if line, and indent the statements that belong to it.
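
  As a minimal sketch of both points, the snippet below assigns three variables in one statement and then filters a list with an indented for loop and if statement. The variable names and numbers are made up for illustration.

```python
# Assign several variables at once (illustrative values).
symbol, mass, charge = "O", 15.999, -2

# The colon ends the 'for' and 'if' lines; the indented lines form their bodies.
energies_kcal = [-13.4, -2.7, 5.4, 42.1]  # made-up example values
negative_energies = []
for energy in energies_kcal:
    if energy < 0:
        negative_energies.append(energy)

print(symbol, mass, charge)
print(negative_energies)
```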

File Parsing
  • You should use the os.path module to work with file paths.

  • One of the most flexible ways to read in the lines of a file is the readlines() method of a file object.

  • An if statement can be used to find a particular string within a file.

  • The split() method can be used to separate the elements of a string.

  • Data read from a file arrives as strings, so you will often need to recast it into a different data type, such as float or int.
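
  The points above can be combined into one short sketch. It assumes a hypothetical output file named ethanol.out containing a "Final Energy" line; the file is written to a temporary directory first so the example runs on its own.

```python
import os
import tempfile

# Write a small stand-in output file so the example is self-contained.
# The file name and its contents are invented for illustration.
data_dir = tempfile.mkdtemp()
file_path = os.path.join(data_dir, "ethanol.out")
with open(file_path, "w") as outfile:
    outfile.write("Starting calculation\n")
    outfile.write("  Final Energy:   -154.09130176573018\n")
    outfile.write("Done\n")

# Read every line of the file into a list of strings.
with open(file_path, "r") as infile:
    lines = infile.readlines()

# Use an if statement to find the line containing the string of interest.
energy = None
for line in lines:
    if "Final Energy" in line:
        words = line.split()      # ['Final', 'Energy:', '-154.09130176573018']
        energy = float(words[2])  # recast the string to a float

print(os.path.basename(file_path), energy)
```

  Building the path with os.path.join avoids hard-coding a path separator, so the same script works on Linux, macOS, and Windows.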

Processing Multiple Files and Writing Files
  • Use the glob function in the Python library glob to find all the files you want to analyze.

  • You can have multiple for loops nested inside each other.

  • The write() method of a file object accepts only strings, so convert other data types to strings before writing them.

  • Don’t forget to close files; Python may not write buffered output to disk until the file is closed.
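
  A minimal sketch of this workflow follows. The input files (a.out, b.out), their contents, and the summary file name energies.txt are hypothetical and are created in a temporary directory so the example is self-contained.

```python
import glob
import os
import tempfile

# Create a few stand-in input files; names and numbers are invented.
work_dir = tempfile.mkdtemp()
for name, value in [("a", 1.5), ("b", 2.5)]:
    with open(os.path.join(work_dir, name + ".out"), "w") as f:
        f.write("Final Energy: " + str(value) + "\n")

# glob.glob returns every path matching the wildcard pattern.
file_names = sorted(glob.glob(os.path.join(work_dir, "*.out")))

summary_path = os.path.join(work_dir, "energies.txt")
summary = open(summary_path, "w")
for file_name in file_names:             # outer loop: over files
    with open(file_name) as infile:
        for line in infile.readlines():  # inner loop: over lines in one file
            if "Final Energy" in line:
                energy = float(line.split()[2])
                # write() accepts only strings, so convert the number first.
                summary.write(os.path.basename(file_name) + " " + str(energy) + "\n")
summary.close()  # closing flushes any buffered text to disk

# Read the summary back to confirm what was written.
with open(summary_path) as check:
    written = check.read()
print(written)
```

  Opening files in a with block, as done for the input files here, closes them automatically when the block ends.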

Working with Tabular Data
  • If you are reading in a file that is mostly numerical data, there are better ways to read in the data than using readlines().

  • The notation to refer to a particular element of an array is array_name[row,column].

  • Typically, you should not write a function to perform a standard math operation. The function probably already exists in numpy.
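
  As a hedged illustration, the sketch below reads a small made-up table with numpy.genfromtxt (one of several numpy readers for numerical text), indexes it with [row, column], and uses numpy.mean instead of a hand-written averaging loop. The column names and values are invented.

```python
import io

import numpy as np

# A small made-up table standing in for a data file: one header row, then numbers.
text = io.StringIO("frame,dist_a,dist_b\n1,8.9,7.2\n2,9.1,7.0\n3,9.4,6.8\n")

# genfromtxt parses numeric text directly into a 2D array.
data = np.genfromtxt(text, delimiter=",", skip_header=1)

first_dist_a = data[0, 1]          # row 0, column 1
mean_dist_b = np.mean(data[:, 2])  # mean of every row in column 2

print(data.shape, first_dist_a, mean_dist_b)
```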

Plotting and Data Visualization
  • The matplotlib library is the most commonly used plotting library.

  • You can import libraries with shorthand names.

  • You can save a figure with the savefig command.

  • Matplotlib is highly customizable. You can change the style of your plot, and there are many kinds of plots.
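
  A minimal plotting sketch, assuming matplotlib is installed. The data values and the output name distance.png are made up, and the Agg backend is selected only so the script also runs on machines without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt  # 'plt' is the conventional shorthand name

frames = [1, 2, 3, 4]
distances = [8.9, 9.1, 9.4, 9.0]  # made-up values

plt.figure()
plt.plot(frames, distances, "-o", label="sample distance")
plt.xlabel("Frame")
plt.ylabel("Distance (angstrom)")
plt.legend()

# Save the figure to a temporary directory, then release the figure's memory.
figure_path = os.path.join(tempfile.mkdtemp(), "distance.png")
plt.savefig(figure_path, dpi=300)
plt.close()
```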

Writing Functions
  • Functions make your code easier to read, more reusable, more portable, and easier to test.

  • If a function returns True or False, you can use it in an if statement as a condition.
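
  The sketch below shows one function that computes a value and another that returns True or False and is used directly as an if condition. The function names, the coordinates, and the 1.5 angstrom cutoff are illustrative assumptions.

```python
import math


def calculate_distance(coords_a, coords_b):
    """Return the Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coords_a, coords_b)))


def bond_check(distance, minimum_length=0.0, maximum_length=1.5):
    """Return True if distance lies within the assumed bonded range."""
    return minimum_length < distance <= maximum_length


oxygen = [0.0, 0.0, 0.0]
hydrogen = [0.0, 0.0, 0.96]  # made-up coordinates
distance = calculate_distance(oxygen, hydrogen)

# Because bond_check returns a boolean, it can serve as the condition itself.
if bond_check(distance):
    status = "bonded"
else:
    status = "not bonded"
print(status)
```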

Running code from the Linux Command Line
  • You must import argparse in your code to accept user arguments.

  • You must first create an argument parser using parser = argparse.ArgumentParser()

  • You add arguments using parser.add_argument()
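
  A minimal sketch of these three steps follows. The argument names xyz_file and --max_length are hypothetical, and a list is passed to parse_args() only so the example runs without a real command line.

```python
import argparse

# Step 1: create the parser.
parser = argparse.ArgumentParser(description="Analyze a molecular geometry file.")

# Step 2: add arguments (these names are illustrative, not from the lesson).
parser.add_argument("xyz_file", help="path to the input file")  # positional
parser.add_argument("--max_length", type=float, default=1.5,
                    help="largest distance counted as a bond")  # optional

# Step 3: parse. With no argument, parse_args() reads the real command line;
# an explicit list is supplied here so the sketch runs anywhere.
args = parser.parse_args(["water.xyz", "--max_length", "1.2"])
print(args.xyz_file, args.max_length)
```

  In a real script the last step would be args = parser.parse_args(), and running the script with -h prints the help text that argparse builds from the help strings.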

Running code from the Linux Command Line
  • You must import sys in your code to accept user arguments.

  • The name of the script itself is always sys.argv[0] so the first user input is normally sys.argv[1].
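
  The indexing can be sketched with a small helper. The function name first_user_argument and the file names are hypothetical; the helper takes the argument list as a parameter so it can be tried without a command line.

```python
import sys


def first_user_argument(argv):
    """Return the first user-supplied argument, or None if there is none."""
    # argv[0] is the script name, so user input starts at index 1.
    if len(argv) < 2:
        return None
    return argv[1]


# Simulated command line: python analyze.py water.xyz
example = first_user_argument(["analyze.py", "water.xyz"])
print(example)

# In a real script, pass the actual argument list.
from_command_line = first_user_argument(sys.argv)
```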

Testing Code with pytest
  • The Python package pytest looks for functions whose names start with test and runs them.

  • It is particularly important to write tests for the edge and corner cases of your functions.

  • While writing tests can seem time-consuming, they are essential to writing good code. Testing is particularly important when multiple people are collaborating on a complex code.
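
  As a rough sketch, a test file might look like the following. The function under test and the test names are illustrative; if the code were saved in a file such as test_geometry.py (a hypothetical name), running pytest in that directory would be expected to collect and run both test functions.

```python
import math


def calculate_distance(coords_a, coords_b):
    """Return the Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(coords_a, coords_b)))


def test_calculate_distance():
    # Typical case: a 3-4-5 right triangle gives a distance of exactly 5.
    assert calculate_distance([0, 0, 0], [3, 4, 0]) == 5.0


def test_calculate_distance_identical_points():
    # Edge case: the distance from a point to itself should be zero.
    assert calculate_distance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]) == 0.0
```

  Each test uses a plain assert statement; pytest reports a failure when the asserted expression is false.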

Version Control with git
  • Version control keeps a complete, organized history of all work on a project. It is extremely useful whether you are working individually or on a team.

  • Good commit messages are critical to maintaining an organized and useful repository.

Sharing Code
  • Putting your code on GitHub is the best way to easily share your code, collaborate, and track changes.
