Deploy¶

Introduction¶

This document explains how to install the project in a user-facing environment, usually for either acceptance testing or production.

Architecture¶

The minimal architecture for this project is a PostgreSQL 9.1+ database and a web server running either Apache2 or Nginx. At present we prefer Apache2 because we are more familiar with installing and configuring it. The project is deployed using WSGI.

A typical cloud deployment uses a load balancer in front of a web server farm (which may consist of a single server) to allow the environment to recover from a failure by launching new web server nodes as required.

Database Server¶

This section explains how to configure a server running Ubuntu 12.04 LTS as a PostgreSQL database server. We expect this to be done only once, when the project is first deployed; subsequent deployments reuse the existing database.

Prerequisites¶

This document assumes that the server has already been installed with Ubuntu 12.04 LTS Server following the standard installation procedure and that the installation has been updated using apt-get to install current security fixes.

Networking must be configured correctly: the hostname must be set correctly in /etc/hostname and /etc/hosts, and the server must use the correct IP addresses.

Installation¶

Install the basic packages:
sudo apt-get --no-install-recommends install postgresql-9.1 postgresql-plpython-9.1 postgresql-9.1-postgis postgresql-contrib-9.1 pgtap
Note that by default PostgreSQL only allows connections from the local machine. Normally we need to update the configuration to allow connections from other machines by users authenticating with MD5 password hashes. For security, specify the particular roles expected to connect remotely rather than all, and reject remote attempts to connect as the postgres user. Where possible, specify exact username and IP address combinations, e.g. for application servers. You can prefix a role name with + in the user field to indicate that any member of that role can connect. If the server has access to a single sign-on system, e.g. LDAP, RADIUS or Active Directory, then passwords should be checked using PAM. Cloud servers should preferably be configured to use SSL connections and cert authentication.

Finally, note that local socket connections use the local account name as the database user name, i.e. on the local machine you cannot log in to the database with a username different from the one used to log in to the operating system. If you need to do this, specify --host {ip address} --username {username} so that the connection is made over TCP instead.

Edit the configuration as follows and then restart PostgreSQL:
sudoedit /etc/postgresql/9.1/main/postgresql.conf

   listen_addresses = '*'

sudoedit /etc/postgresql/9.1/main/pg_hba.conf

   # Allow md5-authenticated connections from the same subnet (except postgres user)
   host    all             postgres        all                     reject
   host    all             dbas            samenet                 pam
   host    fdwdev          fdw_owner       172.17.0.55/32          md5
   host    fdwdev          +fdw_reporter   samenet                 pam

sudo service postgresql restart
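
When several application servers need access, the host entries follow a regular pattern. The following is a minimal sketch, not part of the project, of a helper that prints one pg_hba.conf entry for a given database, role, address and authentication method. The name hba_line is hypothetical, and the column widths simply mirror the example above.

```shell
# hba_line is a hypothetical helper: it only prints text and changes nothing.
# Arguments: database, user (or +role), address, authentication method.
hba_line() {
    printf 'host    %-15s %-15s %-23s %s\n' "$1" "$2" "$3" "$4"
}

# Reproduce two of the entries shown above
hba_line fdwdev fdw_owner 172.17.0.55/32 md5
hba_line fdwdev +fdw_reporter samenet pam
```

Review the output before adding it to pg_hba.conf. Entries are matched in order, so the reject rule for the postgres user must stay above any broader rule.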

Tune the configuration to suit the memory available on the server. Normally you should set shared_buffers to approximately 1/4 of RAM, which requires increasing the kernel's shmmax setting to at least that value (plus some overhead for connections, wal_buffers, etc.). For example, for a server with 1.5GB RAM we set shared_buffers to 256MB and shmmax to 288MB:

   cat << EOF | sudo tee -a /etc/sysctl.conf

   # Roger Hunwicks 2013-04-22
   # Increase maximum shared memory segment size to 256MB + 32MB = 288MB for Postgresql
   kernel.shmmax = 301989888
   EOF
   sudo sysctl -p
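
The shmmax value above is 288MB expressed in bytes. As a minimal sketch of the arithmetic, assuming the same 32MB overhead as in the example (the variable names are illustrative only):

```shell
# Illustrative variable names; the 32MB overhead mirrors the example above
SHARED_BUFFERS_MB=256
OVERHEAD_MB=32

# Convert megabytes to bytes for kernel.shmmax
SHMMAX_BYTES=$(( (SHARED_BUFFERS_MB + OVERHEAD_MB) * 1024 * 1024 ))
echo "kernel.shmmax = ${SHMMAX_BYTES}"
```

Adjust SHARED_BUFFERS_MB to match the value you intend to set in postgresql.conf.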

Once we have increased shmmax we can alter the PostgreSQL parameters:

   sudoedit /etc/postgresql/9.1/main/postgresql.conf

      shared_buffers = 256MB             # approximately 25% RAM, useful range for Windows is 64MB to 512MB
      effective_cache_size = 2GB         # usually 50-75% RAM depending on what else is running on the server
      checkpoint_segments = 16           # default is 3, which causes checkpoints too frequently
      checkpoint_completion_target = 0.9 # given more segments we can spread the write load further
      wal_buffers = 16MB                 # increase to the size of a wal segment (16MB) from 64kb
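
The ratios quoted in the comments above can be turned into starting values for a given amount of RAM. This is a rough sketch only: suggest_pg_memory is a hypothetical helper, it takes the total RAM in megabytes as its argument, and the figures it prints are starting points to be reviewed rather than settings to apply blindly.

```shell
# suggest_pg_memory is a hypothetical helper.
# Argument: total RAM in megabytes.
suggest_pg_memory() {
    ram_mb=$1
    # approximately 25% of RAM
    echo "shared_buffers: $(( ram_mb / 4 ))MB"
    # 50-75% of RAM depending on what else runs on the server
    echo "effective_cache_size: $(( ram_mb / 2 ))MB to $(( ram_mb * 3 / 4 ))MB"
}

suggest_pg_memory 1024
```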

Create a directory to hold database tablespaces:

   sudo mkdir /pgdata
   sudo chown postgres:postgres /pgdata

Install the AdminPack extension:

sudo -u postgres psql postgres --command "CREATE EXTENSION adminpack;"

Create a superuser account for the operating system user that will manage the database cluster:

sudo -u postgres createuser --superuser $USER

Prepare the PostGIS roles. Note that the postgis user must be a superuser in order to be able to create the necessary C functions:

createuser --no-login --no-createdb --no-createrole --superuser postgis
createuser --no-login --no-createdb --no-createrole --no-superuser postgis_user

Prepare the pgTAP roles (development servers only):

createuser --no-login --no-createdb --no-createrole --no-superuser pgtap
createuser --no-login --no-createdb --no-createrole --no-superuser pgtap_user

Enable the debugger (development servers only, and only if plugin_debugger.so is installed):

sudoedit /etc/postgresql/9.1/main/postgresql.conf

   # Roger Hunwicks 2012-06-13
   # Enable the interactive debugger
   shared_preload_libraries = '$libdir/plugins/plugin_debugger.so'

Create the Database(s)¶

Please note the following standards that apply to all PostgreSQL databases within the organization:

* There must be no public schema, i.e. all database objects must be in a designated schema with restricted permissions on which users can create objects in that schema. Note that GDAL <= 1.7 requires that PostGIS is in a schema called public, but we must still ensure that universal write privileges are removed
* Each application database must be owned by a non-superuser schema owner account with privileges to create objects in that database. Ideally that user would not have login privileges and all connections would be either as an application user without schema change privileges or as a member of dbas. However, modern web frameworks usually run database migrations as the same user the application uses to connect to the database, and therefore login privileges for the schema owner are required.
* Each application must have a 3-letter alias, e.g. fdw
* Each application must have separate databases for development (dev), testing (tst) and production (prd) environments
* Each database name must indicate the application and the environment it contains, e.g. fdwdev
* Login accounts used by applications (i.e. system accounts, rather than accounts created for individual users) must not be shared across databases, i.e. if the application uses the schema owner account to connect to the database, then there must be a separate schema owner account for each environment (dev, tst, prd)
* Each database must contain the PostGIS objects
* Development and Testing databases must contain the pgTAP objects
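
The naming rules above can be checked before any database is created. The following is a minimal sketch, not part of the project: db_name is a hypothetical helper, and the restriction to lowercase letters is an assumption based on the fdw example.

```shell
# db_name is a hypothetical helper that builds a database name from an
# application alias and an environment, rejecting values that break the
# naming standards. Lowercase-only aliases are an assumption.
db_name() {
    app=$1
    env=$2
    case "$app" in
        [a-z][a-z][a-z]) ;;
        *) echo "alias must be exactly 3 letters: $app" >&2; return 1 ;;
    esac
    case "$env" in
        dev|tst|prd) ;;
        *) echo "environment must be dev, tst or prd: $env" >&2; return 1 ;;
    esac
    echo "${app}${env}"
}

db_name fdw dev
```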

Note that the following commands must be executed on the database server, e.g. via ssh, as a superuser.

The following commands create a new application owner account meeting the requirements:

APP=fdw
ENV=dev
PGDATABASE=${APP}${ENV}
SCHEMA=${APP}_owner
createuser --login --no-createdb --no-createrole --no-superuser --pwprompt $PGDATABASE

Create a new database meeting the requirements:

createdb --owner $PGDATABASE $PGDATABASE
POSTGIS_DIR=/opt/PostgreSQL/9.1/share/postgresql/contrib/postgis-1.5 # or /usr/share/postgresql/9.1/contrib/postgis-1.5
psql -d $PGDATABASE <<EOF
REVOKE ALL ON SCHEMA public FROM public;
GRANT USAGE ON SCHEMA public TO public;
SET search_path=public;
\i ${POSTGIS_DIR}/postgis.sql
\i ${POSTGIS_DIR}/spatial_ref_sys.sql
\i ${POSTGIS_DIR}/postgis_comments.sql
CREATE SCHEMA ${SCHEMA} AUTHORIZATION ${PGDATABASE};
ALTER DATABASE ${PGDATABASE} SET search_path=${SCHEMA}, public, pg_temp;
EOF
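
Because the SQL is passed through an unquoted here-document, the shell expands ${SCHEMA} and ${PGDATABASE} before psql sees them. As a sketch, the expanded statements can be reviewed first by sending the here-document to cat instead of psql; only two of the statements are reproduced here.

```shell
APP=fdw
ENV=dev
PGDATABASE=${APP}${ENV}
SCHEMA=${APP}_owner

# cat stands in for psql so that nothing is executed against a database
SQL=$(cat <<EOF
CREATE SCHEMA ${SCHEMA} AUTHORIZATION ${PGDATABASE};
ALTER DATABASE ${PGDATABASE} SET search_path=${SCHEMA}, public, pg_temp;
EOF
)
echo "$SQL"
```

Quoting the delimiter (for example <<"EOF") would disable this expansion and pass the literal ${...} text through unchanged.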

The PostGIS objects must also be installed. Note that once PostGIS 2.x is packaged for Ubuntu we will be able to use CREATE EXTENSION postgis; in place of the commands below. Also note that, although it is preferable to create the PostGIS objects in a separate schema, ogr2ogr currently (GDAL 1.7) does not work if PostGIS is in a schema other than public:

export PGDATABASE=${APP}${ENV}
POSTGIS_SCHEMA=public # PostGIS must remain in public while GDAL <= 1.7 is in use
psql <<EOF
REVOKE ALL ON SCHEMA public FROM public;
GRANT USAGE ON SCHEMA public TO public;
SET search_path=public;
\i ${POSTGIS_DIR}/postgis.sql
\i ${POSTGIS_DIR}/spatial_ref_sys.sql
\i ${POSTGIS_DIR}/../postgis_comments.sql
GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE public.geometry_columns TO postgis_user;
GRANT SELECT, UPDATE, INSERT, DELETE ON TABLE public.spatial_ref_sys TO postgis_user;
GRANT SELECT ON public.geography_columns TO postgis_user;
GRANT postgis_user TO ${PGDATABASE};
CREATE SCHEMA ${SCHEMA} AUTHORIZATION ${PGDATABASE};
ALTER DATABASE ${PGDATABASE} SET search_path=${SCHEMA}, ${POSTGIS_SCHEMA}, pg_temp;
EOF

For Development and Testing databases the pgTAP objects must be installed:

export PGDATABASE=${APP}${ENV}
psql <<EOF
CREATE SCHEMA pgtap AUTHORIZATION pgtap;
GRANT USAGE ON SCHEMA pgtap TO pgtap_user;
COMMENT ON SCHEMA pgtap IS 'pgTAP functions and type definitions';
SET ROLE pgtap;
SET search_path=pgtap;
CREATE EXTENSION pgtap;
EOF

Cloud Application¶

Introduction¶

Our current standard is to deploy applications using Amazon Elastic Beanstalk, which provides a self-healing web architecture.

Prerequisites¶

This document assumes that the Amazon Web Services command line tools are installed in the local environment, that the tools (particularly eb) are on the path, and that the required environment variables, e.g. AWS_CREDENTIAL_FILE and ELASTICBEANSTALK_URL, exist and are correct. For example:

source ~/.aws/environment

Initialization¶

If the project has not been previously deployed to Elastic Beanstalk from this workstation then the Git repository must be configured for use by Elastic Beanstalk:

export APP=fdw
export ENV=dev
export REGION=us-east-1
eb init -f --solution-stack "64bit Amazon Linux running Python" -a $APP -e ${APP}${ENV} --region $REGION
   Create an RDS DB Instance?: N
   Attach an instance profile: # aws-elasticbeanstalk-ec2-role (or "Create a default instance profile")

This command may change .gitignore, which must then be committed:

git status .gitignore
git add .gitignore
git commit -m "Initial Elastic Beanstalk configuration"

We should also create a branch specification for the master branch, from which all releases are made:

git checkout master
eb branch
    Environment name: fdwdev
    Copy settings: Yes

Customization¶

Two files are used to configure the Elastic Beanstalk environment: .ebsettings/python.config and .elasticbeanstalk/optionsettings.${APP}${ENV}. Files in .ebsettings are included in the Git repository and should be used for settings that will be the same for every deployment. This includes the operating system packages required by the application and the commands that have to be run when the application is deployed, for example to process JavaScript and CSS assets or to run database migrations. Files in .elasticbeanstalk are not included in the Git repository and should be used for settings that will differ for each deployment, including the number of servers to use and the values of security-sensitive environment variables such as the database connection credentials.

The current version of .ebsettings/python.config will be created when you pull the Git repository down to the workstation. For a new project the file will need to be created.

You will need to create the .elasticbeanstalk/optionsettings.${APP}${ENV} file before you can deploy the application from a workstation for the first time:

cat << EOF | tee  .elasticbeanstalk/optionsettings.${APP}${ENV}
[aws:autoscaling:asg]
Custom Availability Zones=
MaxSize=4
MinSize=1

[aws:autoscaling:launchconfiguration]
EC2KeyName=$EC2_KEYPAIR
InstanceType=t1.micro

[aws:ec2:vpc]
ELBScheme=public
ELBSubnets=
Subnets=
VPCId=

[aws:elasticbeanstalk:application]
Application Healthcheck URL=

[aws:elasticbeanstalk:application:environment]
DJANGO_ADMIN_PASSWORD=adminpassword
DJANGO_SETTINGS_MODULE=$APP.settings.production
EMAIL_HOST=mail.tonic-solutions.com
EMAIL_HOST_PASSWORD=mailpassword
EMAIL_HOST_USER=djangomailuser
PGDATABASE=${APP}${ENV}
PGHOST=1.2.3.4
PGPASSWORD=database_password
PGPORT=5432
PGUSER=${APP}_owner
SECRET_KEY=abcdefghijklmnopqrstuvwyz
SENTRY_DSN=e976bb2fbc554039a4af01fb408c973d

[aws:elasticbeanstalk:container:python]
NumProcesses=1
NumThreads=15
StaticFiles=/static=assets/
WSGIPath=fdw/wsgi.py

[aws:elasticbeanstalk:container:python:staticfiles]
/static=assets/

[aws:elasticbeanstalk:hostmanager]
LogPublicationControl=false

[aws:elasticbeanstalk:monitoring]
Automatically Terminate Unhealthy Instances=true

[aws:elasticbeanstalk:sns:topics]
Notification Endpoint=django_admins@tonic-solutions.com
Notification Protocol=email
EOF
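
The here-document above expands $APP, $ENV and $EC2_KEYPAIR when the file is written, so an unset variable silently produces an empty or malformed setting. A minimal sketch of a guard that could be run first; require_vars is a hypothetical helper and not part of the Elastic Beanstalk tools.

```shell
# require_vars is a hypothetical helper: it fails if any of the named
# shell variables is unset or empty.
require_vars() {
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
}

require_vars APP ENV EC2_KEYPAIR || echo "set the missing variable before generating the file"
```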

Start the environment for the first time, which will deploy the sample app:

eb start -e ${APP}${ENV}

Review the status with:

elastic-beanstalk-describe-configuration-settings -a ${APP} -e ${APP}${ENV} --show-json | python -mjson.tool
eb status

It is normal to get Red or Grey Health at this stage because we have not deployed the Django application yet and Elastic Beanstalk is attempting to run the Sample Application with our Django settings.

Releases¶

Once the setup is done and the app has been released for the first time, new releases can be made with:

APP=fdw
ENV=dev
source ~/.aws/environment
git checkout master # we always release from a tag on master
git aws.push --environment ${APP}${ENV}
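
Because releases are always made from a tag on master, it can help to check both conditions before pushing. The following is a minimal sketch, assuming Git is on the path; can_release is a hypothetical helper and not part of the Elastic Beanstalk tooling.

```shell
# can_release is a hypothetical guard: it succeeds only when the current
# branch is master and HEAD is exactly at a tag.
can_release() {
    branch=$(git rev-parse --abbrev-ref HEAD 2>/dev/null) || {
        echo "cannot determine the current branch" >&2
        return 1
    }
    if [ "$branch" != "master" ]; then
        echo "on branch $branch, expected master" >&2
        return 1
    fi
    if ! git describe --exact-match --tags HEAD >/dev/null 2>&1; then
        echo "HEAD is not tagged" >&2
        return 1
    fi
}
```

For example, can_release && git aws.push --environment ${APP}${ENV} would push only when both checks pass.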

Decommissioning¶

An environment can be terminated using eb stop.

If the application is no longer required at all, then it can be destroyed with eb delete.

Trouble-shooting¶

If you have set a key pair for the Elastic Beanstalk environment then you will be able to log into the running EC2 instances using SSH, e.g.:

ssh -oUserKnownHostsFile=/dev/null -oStrictHostKeyChecking=no ec2-user@ec2-1-2-3-4.compute-1.amazonaws.com

The project is deployed to /opt/python/bundle/x/app/, where x is the release number; the commands below use /opt/python/current/app/ to reach the active release. The first steps in troubleshooting will normally be to change to the application directory and activate the virtual environment:

cd /opt/python/current/app/
source /opt/python/run/venv/bin/activate
source /opt/python/current/env