2014-02-08

Flask TodoMVC: SQLAlchemy

This is the sixth article in the Flask TodoMVC tutorial, a series that creates a Backbone.js backend with Flask for the TodoMVC app. In this article, we will replace the existing dataset persistence with the SQLAlchemy ORM using our existing tests to assist during refactoring.

Previous articles in this series:

  1. Getting Started
  2. Server Side Backbone Sync
  3. Dataset Persistence
  4. Custom Configuration
  5. Testing Todo API

We will begin with where we left off in the previous article. If you would like to follow along, but do not have the code, it is available on GitHub.

# Optional, use your own code if you followed the previous article
$ git clone https://github.com/kevinbeaty/flask-todomvc
$ cd flask-todomvc
$ git checkout -b sqlalchemy testing

Let's get started.

Installation and setup

In the third article, we added server side persistence using dataset. We found it a quick and easy way to add database persistence to our app. Dataset is a great project to use during prototyping and could be used beyond with the appropriate validation and access controls in place. In this article, we are going to replace it with a more common alternative.

SQLAlchemy is a SQL abstraction layer and an Object Relational Mapper (ORM). There are several ways to use SQLAlchemy within Flask with varying levels of abstraction. We've actually already used one: dataset is powered by SQLAlchemy. In this article, we are going to use the Flask SQLAlchemy extension.

Before we install SQLAlchemy, start the server using python server.py and make sure you have existing items in your list. Also run python tests.py and ensure your tests pass. We are going to replace the persistence layer, and we want to ensure we don't break anything during the transition.

Installation

OK. We're ready. Let's use pip to remove dataset and install Flask-SQLAlchemy.

$ pip uninstall dataset
$ pip install Flask-SQLAlchemy

Also remove the dataset import, connection and todo table initialization from server.py.

-import dataset
...
-db = dataset.connect(app.config['DATABASE'])
-todos = db['todos']

Replacing it with imports and initialization of Flask-SQLAlchemy.

# server.py

from flask_sqlalchemy import SQLAlchemy
...

db = SQLAlchemy(app)

def init_db():
    db.create_all()

...

if __name__ == '__main__':
    init_db()
    app.run(port=8000)

We initialized the SQLAlchemy extension and added an init_db function, which is called before running the app. This will create any tables that do not exist.

It does not handle database migrations to existing tables: if you add another column, for example, it will not be added to an existing database table. This is one reason that dataset is such a powerful tool during prototyping. Migrations are not difficult to add, and we will revisit this topic in a future article.

Tests and configuration

We need to change the setup and tear down methods of our tests to work with the Flask-SQLAlchemy API instead of dataset.

     def setUp(self):
         self.client = server.app.test_client()
         self.order = 1
+        server.init_db()

     def tearDown(self):
-        server.todos.drop()
-        server.todos = server.db['todos']
+        server.db.drop_all()

We reused the init_db function added above in setUp. Again, this creates all tables configured with Flask-SQLAlchemy. In tearDown, we do the reverse and drop all configured tables. This will allow us to start with a clean slate every time a test is run.

You may have also noticed that we removed the configuration of the database URI. Most Flask extensions use a naming convention to lookup configuration variables. We need to replace our DATABASE key with SQLALCHEMY_DATABASE_URI in our config files.

# config/default.py
SQLALCHEMY_DATABASE_URI = 'sqlite:///todos.db'
DEBUG = True

...

# config/testing.py
SQLALCHEMY_DATABASE_URI = 'sqlite:///test.db'
TESTING = True

Also, update test_config_settings to check the correct configuration key.

     def test_config_settings(self):
         config = server.app.config
-        assert config['DATABASE'] == 'sqlite:///test.db'
+        assert config['SQLALCHEMY_DATABASE_URI'] == \
+            'sqlite:///test.db'
         assert config['TESTING']
         assert config['DEBUG']

We do not have to update any other tests as each use the Flask test client setup in the previous article to test against the API. Those tests are not dependent on the database layer. This is a good thing: we can reuse our existing tests to ensure we have no regression when we are refactoring the routes to use SQLAlchemy.

Our installation and setup is complete. We are now ready to add our first model.

Todo model

When using dataset, we didn't have to define a model for our todo items. We simply stored the JSON object from the client directly into the database. Dataset made this incredibly easy, the table, database and columns were automatically added as we received them. It is great tool to use during prototyping. As we discussed, however, with great power comes great responsibility. Fully trusting the client without validation is not a good idea when programming for the web.

Please note that I am not suggesting that dataset should not be used in production. It is a powerful tool and combining it with a good JSON validation library would make an intriguing combination.

Object Relational Mapping

When using the SQLAlchemy ORM, we explicitly map the database table with a model class and column definitions. We are already familiar with the JSON model of the todo items.

{
    "id":1,
    "title":"Buy groceries",
    "order":1,
    "completed":false
}

When you define a SQLAlchemy model, you specify the column definitions as attributes of a Model class.

class Todo(db.Model):
    __tablename__ = 'todos'
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String)
    order = db.Column(db.Integer)
    completed = db.Column(db.Boolean)

We create a new model, called Todo subclassing the declarative model class created by Flask-SQLAlchemy. We map this model to the todos table by specifying the __tablename__. A primary key id column is added with type integer. All other attributes are added as columns with appropriate types.

Keep in mind while browsing the awesome SQLAlchemy documentation that Flask-SQLAlchemy includes all functions and classes from sqlalchemy and sqlalchemy.orm. The column types are the same types defined within SQLAlchemy and db.Model is a declarative_base. You can, for example, use custom PostgreSQL extensions, such as HSTORE and ARRAY within your model class.

JSON Object Mapping

You can define custom methods on your models (including static and class methods). This is very useful to create "fat models" with data specific business logic on the model class itself. Since we are communicating with Backbone.js using JSON, let's add methods to convert to and from JSON.

    def to_json(self):
        return {
            "id": self.id,
            "title": self.title,
            "order": self.order,
            "completed": self.completed}

    def from_json(self, source):
        if 'title' in source:
            self.title = source['title']
        if 'order' in source:
            self.order = source['order']
        if 'completed' in source:
            self.completed = source['completed']

We return a dict from to_json with properties available as attributes of the instance. In from_json we check for keys in a source dict and set attributes on the instance if necessary. We skip the id column because we do not want that to change.

This is all straightforward, but may quickly become tedious as you add more models transferred as JSON. We will revisit this in a future article, but it works just fine for our purposes today.

Notice that by simply defining the model and column types and explicitly mapping known columns from the JSON source, we removed some trust from our web client. The client can no longer add arbitrary columns or change the id attribute. SQLAlchemy supports further validation using the validates decorator and attribute events.

We have now defined our first model. If you run the tests now, they will fail. Let's fix them.

Fixing the routes

Now that we have a model in place, we are ready to fix our tests. We need to update the routes to use the Todo model we created instead of dataset.

Index, read and response

Start with the index method.

@app.route('/')
def index():
    todos = Todo.query.all()
    todo_list = map(Todo.to_json, todos)
    return render_template(
        'index.html', todos=todo_list)

We queried all Todo items from our model and converted the Todo objects into dicts using to_json defined above. All models that inherit db.Model include a query property that can be used to execute queries against the table mapped by the class. It is a subclass of SQLAlchemy Query with additional methods added by Flask-SQLAlchemy.

Let's also update _todo_response to accept a todo instance and return a JSON response.

def _todo_response(todo):
    return jsonify(**todo.to_json())

Nothing surprising here. We simply use our todo_json method to jsonify a response. Now let's use it to fix todo_read.

@app.route('/todos/<int:id>')
def todo_read(id):
    todo = Todo.query.get_or_404(id)
    return _todo_response(todo)

We use special query method defined Flask-SQLAlchemy to get an object by id or aborting with a 404 if not found. We no longer need our custom _todo_get_or_404, so feel free to delete that function.

If you are getting antsy to see some progress, you can run server.py now and see your old todo item list. But, alas, any modifications will fail.

Create

We'll turn our attention to todo_create.

@app.route('/todos/', methods=['POST'])
def todo_create():
    todo = Todo()
    todo.from_json(request.get_json())
    db.session.add(todo)
    db.session.commit()
    return _todo_response(todo)

We create a Todo object and map the properties from the JSON request using our from_json method defined on our model. We then add the model to the session and commit. And return using _todo_response defined above.

What's a session?

Hold on. What's a session? Glad you asked! From the documentation:

In the most general sense, the Session establishes all conversations with the database and represents a “holding zone” for all the objects which you’ve loaded or associated with it during its lifespan.

A session is the gateway to your database. Even the query property defined above interacts with the session. Essentially, Todo.query can be thought of as sugar for db.session.query(Todo) with the biggest difference being the property uses the custom Flask-SQLAlchemy query class with convenience methods.

All changes to objects maintained by a Session are tracked - before the database is queried again or before the current transaction is committed, it flushes all pending changes to the database. This is known as the Unit of Work pattern.

Flask-SQLAlchemy creates a session scoped to every request. Every request will create a new database transaction. Any queries during a request response cycle will be tracked by the session. Further queries will be flushed to the database within the transaction.

A session must be explicitly committed to make any updates permanent. You can commit the session multiple times if you would like to separate your updates into separate transactions during your request. If you do not commit the session, the transaction will be rolled back.

This pattern works really well for most cases in practice. If an exception occurs during request processing before you call commit, no changes will be made to the database. In general, you should make all updates by querying and adding new objects to the session and commit once right before returning the response.

Take a look at todo_create again. We need to add our new todo item to the session so that SQLAlchemy tracks it for the next commit. We immediately commit the session to make the change permanent and return our response.

Update and delete

We're almost done. Let's turn back to our app and fix our final two routes.

First, update:

@app.route('/todos/<int:id>', methods=['PUT', 'PATCH'])
def todo_update(id):
    todo = Todo.query.get_or_404(id)
    todo.from_json(request.get_json())
    db.session.commit()
    return _todo_response(todo)

Hopefully, most of this seems familiar. We find the todo object from the database, use from_json to update properties from the request, commit the session and return the response. We did not need to add the todo object to the session because it was already added when we queried it from the database. We did need to commit the session because modifications were made.

At last, delete:

@app.route('/todos/<int:id>', methods=['DELETE'])
def todo_delete(id):
    Todo.query.filter_by(id=id).delete()
    db.session.commit()
    return jsonify()

Here we query the todo table using filter_by, which queries the table for columns that match any keyword arguments. Normally, we would use get to retrieve on object by id, but in this case we want a query object which allows us to call delete on the query itself. This executes a delete statement without first executing an unnecessary select. Since we made modifications, we need to commit. Finally, we return an empty JSON response.

Conclusion

We covered a lot of ground. We uninstalled dataset in favor of Flask-SQLAlchemy. We updated the configuration of our app and tests and reused our existing API tests without change to ensure we had no regression. We discussed the power of rapid prototyping with dataset and compared it the SQLAlchemy ORM.

We created our first ORM Model and added convenience methods to assist in translation to and from JSON. We briefly discussed SQLAlchemy validation. We used the model and convenience methods to fix our failing tests by using our replacement ORM. We discussed the lifecycle of a SQLAlchemy session and when to add and commit.

In the next article we will add user authentication to our todo list, and, in the process, add user and role models and discuss SQLAlchemy relationships.

That completes our transition from dataset to SQLAlchemy. If you made it this far you might be interested in following us on twitter or GitHub.

The code is available on GitHub with tag sqlalchemy or compared to previous article.