Friday, March 11, 2011

Part 2: Continuous Deployment with Pinax and Jenkins

Part 1 is here.

So once Jenkins is building and testing the code, we need a way to copy that code over to our staging server (we don't have a real production server yet). Our Jenkins user is the same as the staging server user, so it's simply a matter of copying things over in a script. If this were not the case, we'd have to install the "Publish Over SSH" Jenkins plugin and use that to copy things over and establish the symlinks.

Instead of having it as an additional step in the build process, our setup uses a separate task that is executed after the CI task is completed. However, we start off in the CI task's workspace so that we can clean up before moving things over.

Create a Jenkins Job:

Again, click "new job", select "build a free-style software project" and make sure the title has no spaces in it.

Job Settings:
  • Again, fill out a description. This is not going to do anything with Git or Github, so you don't need to fill out those sections.
  • In "Advanced Options", select "Use custom workspace" and put in the path to the CI task's workspace. The path is relative to the root Hudson folder (typically ".hudson"), so your path should be something like "jobs/[CI task name]/workspace".
  • Instead of polling, we will build after other projects are built. Check the box and type in the name of the CI task.
Build steps:

Step 1: Cleanup


Remember that this is starting from the CI task workspace, so the virtualenv should already be set up. We just use this to remove the pyc files before copying.
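The script itself is tiny. A minimal sketch of what this step looks like:

#!/bin/bash
# Minimal sketch: strip compiled Python files from the CI workspace
# before it gets copied anywhere.
find . -name "*.pyc" -delete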

Step 2: Create new folder and copy


Simple script to create a new folder and copy. We use the environment variable "BUILD_TAG" to name our folders. It comes out as [task-name]-[build number].
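A sketch of what that script looks like; "STAGING_HOME" is a stand-in for wherever your builds live:

#!/bin/bash
# Sketch: copy the cleaned workspace into a folder named after the build.
# STAGING_HOME is a placeholder path.
STAGING_HOME=/usr/local/staging
mkdir -p "$STAGING_HOME/$BUILD_TAG"
cp -R . "$STAGING_HOME/$BUILD_TAG"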

Step 3: Start staging virtualenv and pull in external files


We use a separate virtualenv for the staging server, so we don't need .env anymore. We also copy a separate local_settings.py file and establish a symlink to the staging server's site_media folder.
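Roughly, with all paths as placeholders for our setup:

#!/bin/bash
# Sketch: activate the staging virtualenv and pull in the files that are
# not in source control. All paths here are placeholders.
STAGING_HOME=/usr/local/staging
cd "$STAGING_HOME/$BUILD_TAG"
source "$STAGING_HOME/env/bin/activate"
cp "$STAGING_HOME/local_settings.py" .
ln -s "$STAGING_HOME/site_media" site_media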

Step 4: Update the staging environment's requirements and the database


We update the requirements using the same pip command from part 1. We then sync the database. We also use South for database migrations, so we also execute the migrations.
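Something along these lines (paths are placeholders again; each Jenkins build step runs in a fresh shell, so the virtualenv gets activated once more):

#!/bin/bash
# Sketch: update requirements inside the staging virtualenv, then sync
# and migrate the database with South.
STAGING_HOME=/usr/local/staging
cd "$STAGING_HOME/$BUILD_TAG"
source "$STAGING_HOME/env/bin/activate"
pip install -q -r requirements.txt
python manage.py syncdb --noinput
python manage.py migrate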

Step 5: Establish symlink and reload the code


We use mod_wsgi in Daemon mode, which means we don't need to restart the server once the code changes. mod_wsgi is using the "makahiki" symlink, so all we need to do to update the code is change the symlink. To be extra sure, we touch our wsgi script to make sure it reloads.
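A sketch of the final step; the WSGI script path depends on your project layout:

#!/bin/bash
# Sketch: repoint the "makahiki" symlink that mod_wsgi serves from,
# then touch the WSGI script so the daemon process reloads.
# STAGING_HOME and the WSGI script path are placeholders.
STAGING_HOME=/usr/local/staging
cd "$STAGING_HOME"
ln -sfn "$BUILD_TAG" makahiki
touch makahiki/deploy/pinax.wsgi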

And that's it! We now have a project that polls Github, runs the tests, and then deploys the code. We can also roll back by changing the symlink to a previous build.

Part 1: Continuous Deployment with Pinax and Jenkins

I admire system admins. They do many things with scripts and commands that are a bit arcane to me. I first heard the term "continuous deployment" months ago, back when Digg was going through their redesign. Continuous deployment, if you don't know, basically means that a script updates the code on a live server once a developer commits it. I thought it was an interesting idea, but one I'd never be able to pull off.

Fast forward to today, where we have a Jenkins instance and multiple developers. Being the lead developer/sys admin on this project by default (I was its sole developer for a while), it was up to me to set up continuous integration and then, if possible, pull off continuous deployment. This post describes setting up the CI task.

I had put our code into Hudson months ago, but I had forgotten about it and later found out it wasn't running. It was also having weird connectivity issues, so we figured this would be a great time to upgrade. In the intervening months, other people, like Rob Baines, have written much better posts on how to get Jenkins running with Django/Pinax. As it turns out, our Jenkins setup is not all that different from the one described in Rob's post. I'll lay out the steps and note where we diverged from Rob's scripts; his post is a great place to get a little more detail.

Prerequisites:
  • Jenkins (if you have a Mac, use homebrew and just 'brew install jenkins').
  • virtualenv ('pip install virtualenv')
  • Python 2.4 or higher
  • some kind of database (optional, by default we use SQLite3)
Assumptions:

This guide, like Rob's, assumes the host system is UNIX (Linux or Mac). Sorry, Windows users.

Jenkins plugins:
  • We use Git, so we need the git plugin. You can also install the github plugin if you'd like (provides links to github).
  • Cobertura
  • I don't use the setenv plugin. Rob uses it to set up a path to the virtualenv, but I don't think it's necessary.
Create a Jenkins Job:

Click new job and select "build a free-style software project". Type in a project name and make sure it has no spaces.

Job Settings:
  • Put in a description, link to Github project (if using the Github plugin).
  • In source code management, select Git. Get your project's read-only repo URL (Jenkins doesn't need commit access) and specify a branch to build (I don't know what "default" is, so I explicitly put master).
  • If using the Github plugin, you can fill out the repository browser (githubweb) and URL as well.
  • We set Jenkins to poll the repo every 5 minutes, which in cron syntax comes out as "*/5 * * * *"
Build steps:

Step 1: Create virtualenv if it doesn't exist

It's not all that different from Rob's, but since Pinax as of this writing (version 0.7.3) is not available on PyPI, we download the tarball from pinaxproject.com and install it. This does mean we're somewhat stuck with a certain version of Pinax unless we update it by hand.
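A rough sketch of what this step looks like; the tarball URL and the "pinax-env" directory name are assumptions to adapt for your Pinax version:

#!/bin/bash
# Sketch: bootstrap the virtualenv with Pinax on the first build only.
if [ ! -d pinax-env ]; then
    curl -LO http://downloads.pinaxproject.com/Pinax-0.7.3-bundle.tar.gz
    tar xzf Pinax-0.7.3-bundle.tar.gz
    # The bundle ships with a bootstrap script that creates a virtualenv
    # and installs Pinax into it.
    python Pinax-0.7.3-bundle/scripts/pinax-boot.py pinax-env
fi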

Step 2: Install and update dependencies


Similar to Rob's, though I had dumped everything into a single requirements file. In the future, we might want to split up the requirements based on whether they're for a developer, Jenkins, or the live server. Also, I passed the -q flag to silence pip; otherwise you'd see line after line in the console saying the requirements are already satisfied.
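The step boils down to something like this ("pinax-env" matches the name used in step 1):

#!/bin/bash
# Sketch: activate the build's virtualenv and quietly install the
# single requirements file.
source pinax-env/bin/activate
pip install -q -r requirements.txt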

Step 3: Update local_settings.py

Identical to Rob's. We may move to MySQL, but those database settings cannot be in source control. Instead, I'll probably put a different local_settings.py script on the server.
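The step is essentially a single copy; the source filename here is a placeholder for wherever you keep the Jenkins-specific settings:

#!/bin/bash
# Sketch: overwrite local_settings.py with a CI-specific version so the
# tests run against a known database (SQLite by default in our case).
cp local_settings.py.jenkins local_settings.py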

Step 4: Execute Tests

The only things I changed were some extra parameters passed to manage.py test for nosetests: --with-xunit creates a nosetests.xml file for reporting test results, and --exe tells nose not to skip tests that have executable permissions. As for an explanation of the coverage commands, I'll defer to Rob.
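The test step ends up looking something like this sketch:

#!/bin/bash
# Sketch: run the tests under coverage and emit the XML reports that
# Jenkins picks up in the post-build actions.
source pinax-env/bin/activate
coverage run manage.py test --with-xunit --exe
coverage xml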

Post-build Actions:
  • Check "Publish JUnit report" and give it a path to the nosetests.xml file ("**/nosetests.xml").
  • Check "Publish Cobertura coverage report" and give it a path to the coverage.xml file ("**/coverage.xml").
And that's it for the CI task. So what about continuous deployment?


Monday, May 3, 2010

Deployment Newbie

I've been trying to get our Pinax-based web application on our web server for the past few days. As a Master's student, I have had experience working with web frameworks but I've never had to deploy them. So dealing with the httpd.conf, .htaccess, and various permissions errors tripped me up big time. Thankfully, I've had the benefit of both Google and colleagues. Hopefully, other newbie deployers will find my foray into Apache and mod_wsgi useful.

Permission Denied: /Users/username/.htaccess pcfg_openfile?

So I got this error. A lot. And I couldn't quite figure out why. Everything I read from Google led me to think that I had permissions errors somewhere in the directory. But I changed everything to 755 and it still didn't work. So I asked Robert, a colleague, and he asked me to check my home directory. The permissions on my home directory were 700. Robert pointed out that since Apache needs to get in, it needs access to the files. So once I 'chmodded' those, things started to work.

I had a few errors after that, but they were mostly related to permissions on the project file (they should all be 755).

Setting Up Virtual Hosts:

This was a real newbie mistake. Most of the tutorials I've seen online for Pinax deployment don't mention virtual hosts. And if they do, the code sample is basically just what to put in the VirtualHost definition. Of course, being a newbie, I thought, "Hey, I can change this to be any port that I want and it'll just work!" After looking through Apache's examples, I noticed that I was missing the Listen and NameVirtualHost statements. So the actual configuration would look something like the sketch below, where the port, server name, and paths are placeholders (assuming you're using mod_wsgi):
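# A sketch only; the port, server name, and paths are placeholders.
Listen 8080
NameVirtualHost *:8080

<VirtualHost *:8080>
    ServerName myproject.example.com

    # Pinax generates this WSGI script in the project's deploy folder.
    WSGIScriptAlias / /path/to/myproject/deploy/pinax.wsgi

    # Static files are served directly by Apache.
    Alias /site_media /path/to/myproject/site_media

    <Directory /path/to/myproject>
        Order allow,deny
        Allow from all
    </Directory>
</VirtualHost>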



Just put this in a configuration file, edit the paths, and point to it from the httpd.conf with an Include statement and things should work! I've got to admit that it could've been a lot more painful if Pinax did not provide the WSGI script.

But after all this, I got it to work, at least on my Mac through Apache. Now to put it on our web server.

Monday, April 26, 2010

Javascript Hell

A Crash Course in jQuery

When I first got into web development with Rails, Prototype and Scriptaculous were the most popular Javascript libraries. Since I was relatively new to Javascript at the time, they were a little tricky. After working with Prototype and Scriptaculous for some time, I think I have a pretty good handle on them. But this fancy new Javascript library called jQuery came along, and now everyone's using it. Pinax 0.7.1 ships with version 1.3.2 by default. So, in order to get with the times (and debug Pinax's Javascript), I gave myself a crash course in jQuery.

Following some tutorials, I implemented a toggle for my news articles. Even in this short script, I can see how jQuery differs from Prototype and Scriptaculous. I like how easy it is to implement the toggle without having to assign unique div ids for each article.


// News toggle
$(document).ready(function() {
    // Switch the "Open" and "Close" state per click.
    $("h4.trigger").toggle(function() {
        $(this).addClass("active");
    }, function() {
        $(this).removeClass("active");
    });

    // Hide and unhide the article on click.
    $("h4.trigger").click(function() {
        $(this).next(".article_container").toggle();
    });
});
"Hacking" Google Gadgets

The WattDepot Google Gadgets group has done some great work implementing visualizations for gadgets to be added to iGoogle. I wanted to add the Gauge visualization and the BioHeatMap visualization to my web application. However, I know very little about how visualizations work. Plus, the group created gadgets, so they are not directly embeddable on web pages. But they're only HTML and Javascript, and I have the source code. So why not just extract the Javascript and add it to my web application?

For the most part, it worked out quite well. I added the gauge visualization connected to a WattDepot data source fairly easily. I hard coded some values since they are user preferences (eventually, I'd like to change that) and changed some variables around. Is it just me or are there automagically created Javascript variables based on the ids in the HTML section of the gadget?

However, the BioHeatMap visualization was a little tricky. When I added it, I got some weird errors, like "$(document).ready is not a function". It broke my previously working jQuery code somehow! After a bit of debugging, it turned out that the BioHeatMap visualization requires my old friend Prototype, and Prototype also uses the "$" shortcut. Fortunately, the jQuery developers thought of this and created a function called "noConflict()" that relinquishes the "$" symbol. The jQuery code above then has to be rewritten (replace the "$" with "jQuery"), but it was a minor issue.
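For instance, with noConflict() in place, the click handler from the toggle script above becomes something like:

// Hand the "$" shortcut back to Prototype.
jQuery.noConflict();

jQuery(document).ready(function() {
    // Same toggle as before, with "jQuery" in place of "$".
    jQuery("h4.trigger").click(function() {
        jQuery(this).next(".article_container").toggle();
    });
});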

But after those things were resolved, everything ran smoothly! I've been to a little Javascript hell and survived to blog about it.

Monday, April 19, 2010

Working Git Out

I have been using Subversion for over 3 years now. Ever since I learned it, I've been with organizations that use it as their primary version control system. When I went to RailsConf 2008 in Portland, Oregon, people were just starting to use this thing called Git. I didn't understand it at the time, and the hype didn't really catch up with me until the beginning of this year. That was when I decided to put my project on GitHub instead of Google Code, my default choice until then.

For the most part, I used Git like I used Subversion. I checked out the master branch, made my changes on that branch, and committed them periodically. I thought Git was okay, except that I needed an extra "git push" to commit to the remote repository.

Then, I started to branch out (excuse the pun). I created a branch to hold my stable releases and made that the default branch. It was easy to merge changes between the two branches (sometimes I made changes directly in release, like documentation fixes). I even did a cherry-pick to bring a fix from master into release. And when other developers came on, it seemed only natural to have them work off a branch separate from my changes. Merging between them has never been an issue.

Recently, I found an article by Andy Croll that outlined his typical daily workflow using Git. Since it is so easy to create local branches, why not create a branch for each individual feature? I had never thought of using Git that way, and it made total sense, especially if you have a few features going on at the same time. When using Subversion, I frequently found myself picking files to leave out of a commit because I was still working on them. With this Git workflow, I can merge completed features into the master branch and push when they're ready.
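The whole cycle boils down to a handful of commands (the branch name here is just a placeholder):

# Sketch of the branch-per-feature cycle.
git checkout -b fancy-feature   # start the feature on its own branch
# ...hack and commit as usual...
git checkout master
git merge fancy-feature         # fold the finished feature into master
git push                        # publish it
git branch -d fancy-feature     # delete the merged branch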

I have to say that this article opened my eyes to the possibilities of using a Distributed Version Control System like Mercurial or Git. While the workflow does not work as well for the small features I've been implementing, I can see it being great for when I'm working on one huge feature while submitting bug fixes.

Monday, April 12, 2010

It's the Small Things

I guess at this point, you'd assume that I'm an expert at hacking Django forms. After all, I already wrote two blog posts about my custom validation and saving. But this post is all about the small things I've tried to do. Some with success, and some without.

Fields that are Not Part of the ModelForm

One of the newer requirements in our use case was to create a place in the form for generating activity confirmation codes if the activity's confirmation type is "code". The field would take the number of codes the admin wants to generate. The easy way to do this is to simply have the number of codes be a property of the activity model. Then, the admin form can generate the field and have it on the Django admin form. However, there's no reason for the number of codes to be a part of the model. So there must be a way to present it as a non-model field.

Number of codes field in the Activity admin interface

The solution is to simply add the field as an additional IntegerField in the definition of the ModelForm. Then, when the form is validated or saved, its value is available along with the model fields. From there, a static method on my confirmation code model generates the requested number of codes for a given activity. Great, so now I can generate codes for an activity. Where should I put a link to view them?

I also started inserting help text into the model fields so that they're presented on the form. However, for an activity with confirmation codes, the text should change if the activity has already been created. So this presents an interesting opportunity. If I can change the help text so that it says "Number of additional codes to generate" instead of "Number of confirmation codes to generate", then I can insert a link to view codes.

Well, then we just override the __init__ method (the method called to initialize an object) to edit the num_codes field. There, if the activity has already been created (i.e., it has a created_at value), we can change the help text and insert a link to our view codes view.


class ActivityAdminForm(forms.ModelForm):
    num_codes = forms.IntegerField(required=False,
                                   label="Number of codes",
                                   help_text="Number of confirmation codes to generate",
                                   initial=0)

    def __init__(self, *args, **kwargs):
        """Override to change the num_codes help text if we are editing an activity."""

        super(ActivityAdminForm, self).__init__(*args, **kwargs)
        # self.instance points to the model instance being edited.
        if self.instance and self.instance.created_at and self.instance.confirm_type == "code":
            self.fields["num_codes"].help_text = "Number of additional codes to generate <a href=\""
            self.fields["num_codes"].help_text += reverse("activities.views.view_codes", args=(self.instance.pk,))
            self.fields["num_codes"].help_text += "\" target=\"_blank\">View codes</a>"
Read-Only Fields

I like that the Django admin presents an interface for editing the different fields of a Django model. However, there are a few fields that need to be displayed but not edited. A good example is the ActivityMember model, which is a model that represents a user's participation in an activity. These are the models that are approved/rejected when a user requests points for their participation. There is no reason for admins to be able to change the activity or comments from the user. However, they need to be displayed so that admins have some context.

As it turns out, Django's admin can mark fields as read-only. However, this feature is only available in the development version of Django (1.2). The version of Pinax I am using is 0.7.1, and it only comes with Django 1.0.4. A friend of mine is trying to convince me to update to the dev version of Pinax (0.9), which uses Django 1.2. I think I'll hold off updating until the summer rolls around; I'm hoping both are stable enough by then to use in production in October. I also found an interesting post on StackOverflow that deals with this issue. Perhaps if I have time, I can add that in.

Check out the current implementation at GitHub.

Monday, April 5, 2010

More Django Form Hacking

Inline Formsets:

The activities in the Kukui Cup competition are fairly complex objects. They're complicated enough that the basic Django form validations will not work without a fair bit of tweaking. Some of the fields in an activity are either optional or required depending on the values of other fields. Here are a few things we need beyond the basic form processing:
  1. If an activity is an event (is_event = True), then it must have an event date.
  2. If the confirmation type is either "confirmation code" or "image upload", then a prompt is required. Examples would be "Enter the confirmation code you received at the event" or "Upload a photo of yourself holding a CFL and an incandescent light bulb".
  3. If the confirmation type is "question and answer", then at least one question and answer is required.
  4. Publication date must be before the expiration date.
1, 2, and 4 are pretty straightforward, especially since I had already taken care of 1. Here's the new activity admin form:


class ActivityAdminForm(ModelForm):
    class Meta:
        model = Activity

    def clean(self):
        # Data that has passed field-level validation.
        cleaned_data = self.cleaned_data

        # 1. Check that an event has an event date.
        is_event = cleaned_data.get("is_event")
        event_date = cleaned_data.get("event_date")
        has_date = cleaned_data.has_key("event_date")  # Check if this is in the data dict.

        if is_event and has_date and not event_date:
            self._errors["event_date"] = ErrorList([u"Events require an event date."])
            del cleaned_data["is_event"]
            del cleaned_data["event_date"]

        # 2. Check the confirmation type.
        confirm_type = cleaned_data.get("confirm_type")
        prompt = cleaned_data.get("confirm_prompt")
        # "not prompt" also guards against prompt being None if the field failed validation.
        if confirm_type != "text" and not prompt:
            self._errors["confirm_prompt"] = ErrorList([u"This confirmation type requires a confirmation prompt."])
            del cleaned_data["confirm_type"]
            del cleaned_data["confirm_prompt"]

        # 4. Publication date must be before the expiration date.
        if cleaned_data.has_key("pub_date") and cleaned_data.has_key("expire_date"):
            pub_date = cleaned_data.get("pub_date")
            expire_date = cleaned_data.get("expire_date")

            if pub_date >= expire_date:
                self._errors["expire_date"] = ErrorList([u"The expiration date must be after the pub date."])
                del cleaned_data["expire_date"]

        return cleaned_data
Number 3 is a little tricky, because there can be one or many question and answer pairs. I was already aware of inline forms from the Django tutorial. So the first step was to add in inline forms for the questions and answers.

Question and answer fields in the admin form.

Easy enough, but now I have to implement the validation. These questions and answers should only be provided if the confirm type is "text". I did some research and figured out that I needed to extend the BaseInlineFormSet class to provide my custom validation behavior.

class TextQuestionInlineFormSet(BaseInlineFormSet):
    """Custom formset to override validation."""

    def clean(self):
        """Validates the form data and checks if the activity confirmation type is text."""

        # self.instance is the parent activity being edited.
        activity = self.instance

        # Count the number of filled-in question forms.
        count = 0
        for form in self.forms:
            try:
                if form.cleaned_data:
                    count += 1
            except AttributeError:
                pass

        if activity.confirm_type == "text" and count == 0:
            raise ValidationError("At least one question is required if the activity's confirmation type is text.")
        elif activity.confirm_type != "text" and count > 0:
            raise ValidationError("Questions are not required for this confirmation type.")

class TextQuestionInline(admin.StackedInline):
    model = TextPromptQuestion
    extra = 3
    formset = TextQuestionInlineFormSet
I chose to raise a validation error when the confirm type is not text because I don't want any questions and answers saved for activities that don't need them. This should take care of most of the activity admin interface, but the requirements can always change.

Themes:

There are other students working on my Kukui Cup implementation. Their goal is to redesign the interface through the use of HTML and CSS. We don't want just one redesign; we want them to attempt many redesigns so that we can evaluate each one. So, instead of having them hack the settings.py file to change themes, I added a drop-down form at the top so that they can switch CSS files easily. This also required moving the original files into folders to create a "default" theme. I also created a custom template tag so that any file ending in ".css" is imported in the header.


import os

from django import template
from django.conf import settings

register = template.Library()

def render_css_import():
    """Renders the CSS import header statements for a template."""

    return_string = ""
    css_dir = os.path.join(settings.PROJECT_ROOT, "media", settings.KUKUI_CSS_THEME, "css")
    if os.path.isdir(css_dir):
        # Only pull in actual stylesheets from the theme's css folder.
        items = (item for item in os.listdir(css_dir) if item.endswith(".css"))
        for item in items:
            return_string += "<link rel=\"stylesheet\" href=\"/site_media/static/" + settings.KUKUI_CSS_THEME
            return_string += "/css/" + item + "\" />\n"

    return return_string

register.simple_tag(render_css_import)
This should make it easy for the other students to see how the interface changes as they develop their own CSS.