Mischiefblog
I make apps for other people

How I code Python with tests

Posted by Chris Jones
On November 19th, 2015 at 13:25

Posted in Python

Setting up the project and where to put the code

  1. Project creation

    I use virtualenv to create the project. If I’m using a different version of Python than the system default, I can specify the interpreter version when creating the environment.

    $ virtualenv -p /opt/local/bin/python2.7 my_project

  2. Environment activation

    Before I can do anything with the environment, I need to activate it.

    $ source my_project/bin/activate

  3. Installation of modules

    By default, virtualenv installs pip to allow me to download and install Python libraries in my environment. Depending on what I’m writing, I need to pick out libraries that are needed to complete the project.


    For a web application, for instance, I may choose to use Flask or Django. To access DB2, I’ll need the ibm_db package.

    (my_project)$ pip install flask
    (my_project)$ pip install ibm_db

    This installs the packages into my virtual environment’s lib/python2.7/site-packages directory and makes them available to my Python scripts through the import statement.

  4. Source directory

    It’s your choice as to where you write your Python files: they don’t even need to be inside your virtual environment, as long as your environment is active. To keep my project well organized and to separate shell scripts, configuration, and resources, I created a my_project/src directory to contain the Python source. This is not appropriate for all projects, so consider this a guideline and not a rule.

  5. Packages and Pythonic thinking

    If you’re used to writing Java or C# code, you’re used to creating many domain-oriented packages, like com.mycompany.myprogram. In Python, we typically create packages to handle naming conflicts, versioning (such as two different implementations of an API, as is commonly done in OpenStack modules), or to encapsulate similar pieces of logic. A package is defined and made visible to Python by the presence of a special file, __init__.py, which can be empty.

    As a practical example, I created a myproject.models package. myproject is the project name and allows us to safely use common names like Server. myproject.models contains and provides imports to the SQLAlchemy entities to make them more convenient to use.

    myproject.models.Model defines the entity classes like Agent, Customer, Order, Shipment, etc. The package, myproject.models, exposes myproject.models.Model.Agent as myproject.models.Agent as a convenience and provides high-level query abstractions for working with the database. This is accomplished through the __init__.py file, which is executed when the package is imported.

    As a rule of thumb, create a new package when:

    • you have classes or implementations that are closely related (e.g., the models package, which deals with entities, persistence, and querying), or
    • you have classes that have conflicting names, or implement different versions of a protocol, such as multiple Server or Client classes or concrete implementations of abstract, common types.
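
The re-export described for myproject.models can be sketched roughly as follows. This is only an illustration: demo_project is a hypothetical stand-in for myproject, the files are written to a temporary directory so the example is self-contained, and a plain class takes the place of a real SQLAlchemy entity.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk so the example is self-contained.
# "demo_project" and its contents are hypothetical stand-ins for myproject.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_project", "models")
os.makedirs(pkg_dir)

# demo_project/__init__.py can be empty; its presence marks the package.
open(os.path.join(root, "demo_project", "__init__.py"), "w").close()

# demo_project/models/Model.py defines the entity class.
with open(os.path.join(pkg_dir, "Model.py"), "w") as f:
    f.write("class Agent(object):\n    pass\n")

# demo_project/models/__init__.py runs on import and re-exports Agent.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from demo_project.models.Model import Agent\n")

sys.path.insert(0, root)
from demo_project.models import Agent                       # short path
from demo_project.models.Model import Agent as FullPathAgent  # long path

# Both names refer to the same class object.
print(Agent is FullPathAgent)
```

Callers can then write the shorter import and never need to know which module inside the package actually defines the class.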

Documenting, testing, and implementation at once

  1. Documentation

    I’m creating an API document as I develop the RESTful interface to the service. By creating minimal documentation immediately prior to development, I keep it up to date and also have to think about and clearly communicate my intention prior to developing any code.

    In the case of a RESTful interface, it’s convenient to create an API description page on the root resource. I attached the HTML document to a Flask route. Later, this document can be copied to Word for the User’s Guide.

  2. Test cases

    It is vital in Python that you create test cases and aim for 100% code coverage. Python is a dynamic language: a script is compiled to bytecode only when it is run (in most cases; you can precompile the bytecode to improve first-execution speed), and names are duck typed, bound, and checked at runtime, so many errors surface only when the offending line actually executes. (Python 3.5 adds type hints, described in PEP 484, intended to improve static analysis.)

    I created a myproject.test package to contain my unittest.TestCase implementations. Test cases extend unittest.TestCase, and the assertions are built into the parent class (hence the self.assert* methods).

    When possible, I write my test cases before implementing the code under test: it clarifies how I’m going to call the function or method, what variables or structures need to be passed into it, and is a great opportunity to make sure I was clear in my documentation.

    For the service to date, since I still have to add other utilities, I have the following components or classes:

    1. Service, which binds action implementations to routes
    2. Queries, which provides high level methods around SQLAlchemy queries
    3. Models, which defines classes (entities) representative of the database tables

    When adding a new action to the server, I may touch all three layers and thus need to add unit tests to each. I don’t care that the unit tests fail or cause compilation errors at this point: the code isn’t done until the tests run.

    Test cases should be organized into suites.
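
As a minimal sketch of a test case collected into a suite, consider the following. The function find_duplicates and the test names are hypothetical and exist only for illustration; they are not part of the service described above.

```python
import unittest


def find_duplicates(existing, candidates):
    """Hypothetical function under test: return candidates already present."""
    return [c for c in candidates if c in existing]


class FindDuplicatesTest(unittest.TestCase):

    def test_reports_existing_numbers(self):
        self.assertEqual(find_duplicates([1, 2, 3], [3, 4]), [3])

    def test_returns_empty_list_when_nothing_matches(self):
        self.assertEqual(find_duplicates([1, 2], [5]), [])


def suite():
    """Collect the test cases into one suite."""
    loader = unittest.TestLoader()
    tests = unittest.TestSuite()
    tests.addTests(loader.loadTestsFromTestCase(FindDuplicatesTest))
    return tests


# Run the suite and keep the result object for inspection.
result = unittest.TextTestRunner(verbosity=0).run(suite())
```

Writing the two test methods first pins down the call signature and the return type before find_duplicates has a body at all.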

  3. Implementation

    After I’ve got my basic classes created, such as Server or Queries, adding new behaviors is as easy as adding a new method or modifying existing methods. This implementation should cause the unit test code to complete successfully and no more.

  4. Coverage

    I’m using IntelliJ IDEA with the Python development plugin as my coverage tool, but that’s not the only option. A standard tool for Python coverage is coverage.py by Ned Batchelder, and PyDev for Eclipse includes a coverage tool.

    Our goal is 100% coverage. Until we determine otherwise, 95% is the minimum acceptable line coverage.

Surprising Pythonic Things

You may well be surprised by how Python works.

Python has a REPL

If you ever have a question about how syntax works in Python, or whether you can get a function or some code to work, remember that Python has a Read-Eval-Print Loop, or REPL, built into the interpreter. Just type python at the command line.

Spaces matter!

Python indentation defines a block. Use tools to auto-format your code to make sure your indentation is consistent, and indent with four spaces per level instead of tab characters. See also PEP 8.

This was one of the largest complaints Google had when developing Python for their infrastructure and was one of the reasons cited for the development of Go.

Getting test cases to run

To get your test cases to run, you need to include a hook to __main__. This is boilerplate code:

if __name__ == '__main__':
    unittest.main()

You’ll need to do something similar to get any command-line application to run, except you’ll call a different function or method.
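
For a command-line application, the same hook might look like the sketch below. The main function and its greeting behavior are hypothetical and only show the shape of the entry point.

```python
import sys


def main(argv):
    """Hypothetical entry point: greet each name given on the command line."""
    names = argv[1:] or ["world"]
    for name in names:
        print("Hello, %s" % name)
    return 0


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == '__main__':
    main(sys.argv)
```

Keeping the logic in main() rather than directly under the hook means a test case can import the module and call main() with its own argument list.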

Mixing functions, classes, and methods

Python allows you to freely mix object oriented code with methods, normal functions, and functional code (lambdas). Keep in mind that Python has well defined scoping rules (LEGB).
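
A small sketch of the LEGB lookup order (Local, Enclosing, Global, Built-in), with a lambda thrown in, may help. All names here are made up for the illustration.

```python
label = "global"              # G: module-level (global) scope


def outer():
    label = "enclosing"       # E: scope of the enclosing function

    def inner():
        label = "local"       # L: scope of the innermost function
        return label

    def reads_enclosing():
        return label          # no local binding, so the enclosing one is used

    return inner(), reads_enclosing()


def reads_global():
    return label              # no local or enclosing binding: falls to global


shout = lambda text: text.upper()   # a lambda is just an anonymous function

print(outer())
print(reads_global())
print(len("abc"))             # B: len is found in the built-in scope
print(shout(label))
```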

Parents and superclasses

When defining a class, the child determines the order in which multiple parent classes are inherited. Inheritance and super-class method invocation follow a linear method resolution order (MRO) in which parents are searched left to right as listed:


# Bar and Frobble are assumed to be defined elsewhere.
class Baz(Bar, Frobble):
    pass

class Foo(Baz):
    pass

If Bar and Frobble both define the method doIt(), the implementation on Bar will be called. This allows the child class (Baz) to specify which implementation to use and resolve the diamond problem, and children that extend from Baz (such as Foo) to either inherit the ordering or redefine it.
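
Here is a self-contained sketch of that behavior. The method bodies are placeholders, and Qux is a hypothetical extra class added only to show the effect of reversing the parent order.

```python
class Bar(object):
    def doIt(self):
        return "Bar"


class Frobble(object):
    def doIt(self):
        return "Frobble"


class Baz(Bar, Frobble):      # Bar is listed first, so Bar.doIt wins
    pass


class Foo(Baz):               # inherits Baz's ordering
    pass


class Qux(Frobble, Bar):      # hypothetical sibling with reversed order
    pass


print(Foo().doIt())
print(Qux().doIt())
# The full lookup order is available on the class itself.
print([cls.__name__ for cls in Foo.__mro__])
```

Inspecting __mro__ in the REPL is a quick way to check which implementation a call will reach.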

Truth

You may see lines like:

        customer_found = self.session.query(Customer).filter(Customer.number.in_(customer_numbers)).all()
        if customer_found:
            raise DuplicateEntityException(customer_found)

SQLAlchemy’s query().all() method will return a list but we’re treating it like a boolean. What gives?

Python is liberal in interpreting truth. Something is true when it’s:

  • not False
  • not None
  • not empty
  • not zero

This means we can quickly test whether a list is populated, for example, by simply passing the list to an if statement instead of checking its length or invoking an .isEmpty() method.
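
The rules above can be checked directly. In this sketch, describe is a hypothetical helper that reports how an if statement would treat a value.

```python
def describe(value):
    """Return 'truthy' or 'falsy' the same way an if statement would decide."""
    if value:
        return "truthy"
    return "falsy"


# False, None, empty containers, empty strings, and zero are all falsy.
for value in (False, None, [], {}, "", 0, 0.0):
    print("%r is %s" % (value, describe(value)))

# Non-empty containers and strings are truthy even if their contents are not.
for value in (True, [0], {"a": 1}, "0", -1):
    print("%r is %s" % (value, describe(value)))
```

Note that a list containing a single zero and the string "0" are both truthy: only emptiness matters for containers and strings, not what they hold.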

Ranges and iteration

Iteration over a list or the keys in a dictionary is very straightforward:

 
my_list = [1, 2, 3, 4, 5]

for elem in my_list:
    print(elem)

my_dict = {'apple': 'fruit', 'pear': 'fruit', 'lettuce': 'vegetable'}

# Iterating over a dict yields its keys
for key in my_dict:
    print("%s is a %s" % (key, my_dict[key]))

# items() yields (key, value) pairs
for (key, value) in my_dict.items():
    print("%s is a %s" % (key, value))


Likewise, you can iterate over a range of numbers. Because we’re using Python 2, you should use xrange(), which produces values lazily, instead of range(), which allocates the whole list up front.

# print 1-20 by threes:  1, 4, 7, 10, 13, 16, 19
for i in xrange(1, 20, 3):
    print(i)

Python objects are dynamic, not static

Even if you define a class and instantiate a new instance, your class isn’t locked into a specific structure as it would be in C++ or Java. You can add new methods and attributes to a class or instance at runtime. This can be a source of error in your programs.

class Foo(object):

    def __init__(self, message='Hello'):
        self.message = message

f = Foo("Welcome")
f.massage = "Willkommen"


While attempting to overwrite message, I made a typo and entered massage instead. Now my Foo object, f, has two attributes on it, message and massage. Any code that reads f.message still sees the old value, and no error is raised.
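
The sketch below makes the problem visible and shows one possible guard. The __slots__ variant is not part of the original example: StrictFoo is a hypothetical class added here to show that Python can be asked to reject unknown attribute names.

```python
class Foo(object):

    def __init__(self, message='Hello'):
        self.message = message


f = Foo("Welcome")
f.massage = "Willkommen"          # typo: silently creates a second attribute
print(sorted(vars(f)))            # lists every attribute now on the instance
print(f.message)                  # the intended attribute was never updated


class StrictFoo(object):
    __slots__ = ('message',)      # only these attribute names are allowed

    def __init__(self, message='Hello'):
        self.message = message


s = StrictFoo("Welcome")
try:
    s.massage = "Willkommen"      # same typo, but now it raises an error
    typo_caught = False
except AttributeError:
    typo_caught = True
print(typo_caught)
```

Whether __slots__ is worth using depends on the project; thorough test coverage catches the same class of mistake without restricting the class.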

Global and function scope

When you want to set a variable in a function, make sure you’re not writing to a local version of the variable.

name = 'Chris'

def update_name():
    name = 'Ivy'

In this case, the name Ivy is written to the local name variable. You need to tell Python that you want to use the global variable.

name = 'Chris'

def update_name():
    global name
    name = 'Ivy'

You won’t run into this problem as often when using classes and methods: most instance or class variables are scoped with the self or class name reference respectively, but you may still run into globals that need to be marked that way.

How to use coverage.py

mbp:~ chris$ cd python/my_project
mbp:my_project chris$ source bin/activate
(my_project)mbp:my_project chris$ pip install coverage
Collecting coverage
  Downloading coverage-3.7.1.tar.gz (284kB)
    100% |████████████████████████████████| 286kB 1.0MB/s
Installing collected packages: coverage
  Running setup.py install for coverage
Successfully installed coverage-3.7.1
(my_project)mbp:my_project chris$ echo $PYTHONPATH
(my_project)mbp:my_project chris$ export PYTHONPATH=`pwd`/src
(my_project)mbp:my_project chris$ coverage run --branch --source src -m myproject.test.TestSuite
. . . omitted . . .
(my_project)mbp:my_project chris$ coverage report

Name                                      Stmts   Miss Branch BrMiss  Cover
---------------------------------------------------------------------------
 . . . omitted . . .
---------------------------------------------------------------------------
TOTAL                                      3589    849    716    328    73%

You can also use coverage html to generate an HTML report with per-file line coverage details.
