Deploying the web service¶
To actually deploy the web service, it is necessary to package the Python classes that implement the backend and frontend, then use the build system to install these classes in the correct location, together with other resources such as images, style sheets or text files needed by the web interface.
Prerequisites¶
Every service needs some basic setup:
The service needs its own MySQL database, and two MySQL users set up, one for the backend and the other for the frontend. A sysadmin can set this up on the modbase machine.
The service needs its own user on the modbase machine; for example, there is a modloop user for the ModLoop service. It is this user that runs scons (below). All of the backend also runs as this user, and jobs on the SGE clusters also run under this user’s account. (It is not a good idea to use a regular user for this purpose, as it will use up the regular user’s disk and runtime quota on the cluster, and bugs in the service could lead to deletion of that user’s files or their exposure to outside attack.) A sysadmin can also set up this user account.
The web service user needs a directory on the /wynton disk in order to store running jobs, and at least one directory on a local modbase disk so the frontend can create incoming jobs.
A sysadmin needs to configure the web server on modbase so that the web service files are visible to the outside world. They can also password protect the page if it is not yet ready for a full release.
It is usually a good idea to put the implementation files for a web service on GitHub, or in an SVN repository.
Quick start¶
The easiest way to set up a new web service is to have a sysadmin run the make_web_service script on the modbase machine. Given the name of the web service, it will set up all the necessary files used for a basic web service. Run make_web_service with no arguments for further help.
Note
make_web_service should be run on a local disk (not /wynton). Most users on modbase have their home directories on a local disk, so this is generally OK by default. Note that the home directory should be accessible by the backend user in order for the build system to work; running chmod a+rx ~ should usually be sufficient.
Example usage¶
For example, the user ‘bob’ wants to set up a web service for peptide docking.
He first chooses a “human readable” name for his service, “Peptide Docking”. This name will appear on web pages and in emails, but can be changed later by editing the configuration file, if desired.
He also chooses a “short name” for his service, “pepdock”. The short name should be a single lowercase word; it is used to name system and MySQL users, the Perl and Python modules, etc. It is difficult to change later, but is never seen by end users so is essentially arbitrary.
He asks a sysadmin to set up the web service, giving him or her the “short name” and the human readable name. (The sysadmin will run the make_web_service script.)
Bob can then get the web service from git or Subversion by running:
$ git clone git@github.com:salilab/pepdock.git [git]
$ svn co https://svn.salilab.org/pepdock/trunk pepdock [Subversion]
$ cd pepdock/conf
$ sudo -u pepdock cat ~pepdock/service/conf/backend.conf > backend.conf
$ sudo -u pepdock cat ~pepdock/service/conf/frontend.conf > frontend.conf
Bob edits the configuration file in conf/live.conf to adjust install locations, etc. if necessary, and fills in the template Python modules for the backend and frontend, in backend/pepdock/__init__.py and frontend/pepdock/__init__.py, respectively.
He writes test cases for both the frontend and backend (see Testing) and runs them to make sure they work by typing scons test in the pepdock directory.
He deploys the web service by simply typing scons in the pepdock directory. This will give him further instructions to complete the setup (for example, providing a set of MySQL commands to give to a sysadmin to set up the database).
Once deployment is successful, he asks a sysadmin to set up the web server on modbase so that the URL given in urltop in conf/live.conf works.
Whenever Bob makes changes to the service in his pepdock directory, he simply runs scons test to make sure the changes didn’t break anything, then scons to update the live copy of the service, then git commit and git push to publish the changes on GitHub. (The backend will also need to be restarted when he does this, but scons will show a suitable command line to achieve this.)
If Bob wants to share development of the service with another user, Joe, they should ask a sysadmin to give Joe sudo access to the pepdock account. Joe can then set up his own pepdock directory by cloning the repository from GitHub and then developing in the same way as Bob, above.
Note
Development of the service should generally be done by the regular (‘bob’) user; only the backend itself runs as the backend (‘pepdock’) user. Bob can however run any command as the ‘pepdock’ user using ‘sudo’ (e.g. sudo -u pepdock scons to run scons as the pepdock user). Note that sudo will ask for the regular user’s (Bob’s) password, not that of the pepdock account (which does not have a password anyway, and cannot be logged into).
For advanced access, a shell can be opened as the backend user by running something like sudo -u pepdock bash.
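The requirement that the short name be a single lowercase word can be checked before asking a sysadmin to create the service. The following is a minimal sketch only: is_valid_short_name is a hypothetical helper, and the exact rule it applies (ASCII lowercase letters, optionally followed by digits) is an assumption rather than the check that make_web_service itself performs.

```python
import re

# Assumption: "a single lowercase word" is taken to mean ASCII lowercase
# letters, optionally followed by digits. The real make_web_service script
# may apply different rules.
_SHORT_NAME = re.compile(r'[a-z][a-z0-9]*')


def is_valid_short_name(name):
    """Return True if `name` looks like a usable service short name."""
    return _SHORT_NAME.fullmatch(name) is not None


print(is_valid_short_name("pepdock"))          # the short name from the example
print(is_valid_short_name("Peptide Docking"))  # the human readable name
```

Because the short name is used for system users, MySQL users and module names, rejecting spaces and capitals early avoids a rename that would be difficult later.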
Design tips¶
When designing a web service, the following design tips may be useful:
The web service should implement little or none of the actual algorithm; instead, the algorithm should be implemented in another package that can be used independently. This allows others to use your algorithm on their own machines, rather than having to use Sali lab resources via the web service. The web service itself should only handle generating input files and nicely presenting any results (e.g. with interactive plots or protein structures). For example, ModLoop relies on MODELLER for the actual algorithm, while the algorithm used by the AllosMod web service is implemented in a separate AllosMod library, which allows the AllosMod protocol to be run from a command line.
A web service must be self contained. If you absolutely must use external scripts in your web service, don’t put them in your home directory or some other random place on the disk. Include and install them with the rest of the web service. See the MultiFoXS service for an example (in that case the external scripts are put in a scripts directory and installed in a cluster-accessible location).
Web service dependencies must be well defined. If you need to use external software, like IMP, scikit, or gnuplot, don’t compile your own version of that software and install it in a random place. Use “module load” to load the module for that software instead (if a module isn’t available, ask a sysadmin to build one for you).
The following sections describe the various components of a web service in more detail, for developers that wish to set things up themselves without using the convenience scripts.
Backend Python package¶
The backend for the service should be implemented as a Python package in the backend subdirectory. Its name should be the same as the service, except that it should be all lowercase, and any spaces in the service name should be replaced with underscores. For example, the ‘ModFoo’ web service should be implemented by the file backend/modfoo/__init__.py.
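The naming rule above (lowercase, with spaces replaced by underscores) can be written as a short function. This is an illustrative sketch; backend_package_path is a hypothetical helper and is not part of saliweb.

```python
def backend_package_path(service_name):
    """Return the expected path of the backend package for a service.

    Hypothetical helper: applies the naming rule described in the text
    (all lowercase, spaces replaced with underscores).
    """
    package = service_name.lower().replace(' ', '_')
    return f"backend/{package}/__init__.py"


print(backend_package_path('ModFoo'))
```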
This package should implement a Job subclass and may also optionally implement Database or Config subclasses. It should also provide a function get_web_service which, given the name of a configuration file, will instantiate a WebService object, using these custom subclasses, and return it.
This function will be used by utility scripts set up by the build system to run and maintain the web service. An example, building on previous ones, is shown below.
import saliweb.backend
import glob


class Database(saliweb.backend.Database):
    def __init__(self, jobcls):
        saliweb.backend.Database.__init__(self, jobcls)
        self.add_field(saliweb.backend.MySQLField('number_of_pdbs', 'INTEGER'))


class Job(saliweb.backend.Job):
    runnercls = saliweb.backend.WyntonSGERunner

    def preprocess(self):
        pdbs = glob.glob("*.pdb")
        self._metadata['number_of_pdbs'] = len(pdbs)

    def run(self):
        script = """
for f in *.pdb; do
  grep '^HETATM' $f > $f.het
done
"""
        r = self.runnercls(script)
        r.set_options('-l diva1=1G')
        return r


def get_web_service(config_file):
    db = Database(Job)
    config = saliweb.backend.Config(config_file)
    return saliweb.backend.WebService(config, db)
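When developing a backend it can help to reproduce locally what the cluster script is expected to produce. The sketch below is a pure-Python stand-in for the shell loop in the example run() method (it writes the HETATM lines of each PDB file to a matching .het file). The function name extract_hetatm and the sample PDB content are made up for illustration; none of this is part of saliweb.

```python
import glob
import os
import tempfile


def extract_hetatm(directory):
    """Write the HETATM lines of each *.pdb file to a matching *.pdb.het file.

    Hypothetical stand-in for the shell script in the example above.
    Returns the number of PDB files processed.
    """
    pdbs = sorted(glob.glob(os.path.join(directory, "*.pdb")))
    for pdb in pdbs:
        with open(pdb) as fin, open(pdb + ".het", "w") as fout:
            for line in fin:
                if line.startswith("HETATM"):
                    fout.write(line)
    return len(pdbs)


# Small demonstration on a throwaway directory with made-up PDB content
with tempfile.TemporaryDirectory() as tmpdir:
    with open(os.path.join(tmpdir, "test.pdb"), "w") as f:
        f.write("ATOM      1  N   ALA A   1\n")
        f.write("HETATM    2  O   HOH A   2\n")
    count = extract_hetatm(tmpdir)
    with open(os.path.join(tmpdir, "test.pdb.het")) as f:
        het_lines = f.read().splitlines()

print(count, het_lines)
```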
Frontend Python package¶
The frontend for the service should be implemented as a Python package in the frontend subdirectory, named as for the backend (e.g. the ‘ModFoo’ web service’s frontend should be implemented by the file frontend/modfoo/__init__.py).
An example is shown below. For clarity, only the methods are shown, not their contents; for full implementations of the methods see the Frontend page.
from flask import render_template, request
import saliweb.frontend


app = saliweb.frontend.make_application(__name__)


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/job', methods=['GET', 'POST'])
def job():
    # submit new job or show all jobs (queue)
    pass


@app.route('/job/<name>')
def results(name):
    # show results page
    pass


@app.route('/job/<name>/<path:fp>')
def results_file(name, fp):
    # download results file
    pass
Configuration file¶
The service’s configuration should be placed in a configuration file in the conf subdirectory. Multiple files can be created if desired, for example to maintain both a testing and a live version of the service. Each configuration file can specify a different install location, MySQL database, etc. This directory will also contain the supplementary configuration files that contain the usernames and passwords that the backend and frontend need to access the MySQL database. Since these files contain sensitive information (passwords), they should not be group- or world-readable (chmod 0600 backend.conf), and if using SVN or git, do not put these database configuration files into the repository.
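The permission requirement can be verified with a few lines of standard-library Python. This is a sketch only: is_private is a hypothetical helper, not something the build system provides, and the demonstration uses a temporary file rather than a real backend.conf.

```python
import os
import stat
import tempfile


def is_private(path):
    """Return True if `path` grants no permissions to group or others."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


# Demonstration on a temporary file standing in for backend.conf
with tempfile.NamedTemporaryFile() as conf:
    os.chmod(conf.name, 0o644)       # group- and world-readable
    before = is_private(conf.name)
    os.chmod(conf.name, 0o600)       # equivalent of chmod 0600
    after = is_private(conf.name)

print(before, after)
```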
Using the build system¶
The build system is a set of extensions to SCons that simplifies the setup and installation of a web service. To use it, create a directory in which to develop the web service, and create a file SConstruct in that directory similar to the following:
import saliweb.build
v = Variables('config.py')
env = saliweb.build.Environment(v, ['conf/live.conf', 'conf/test.conf'])
Help(v.GenerateHelpText(env))
env.InstallAdminTools()
Export('env')
SConscript('backend/modfoo/SConscript')
SConscript('frontend/modfoo/SConscript')
This script creates an Environment object which will set up the web service using either the configuration file live.conf or the file test.conf in the conf subdirectory.
The Environment class derives from the standard SCons Environment class, but adds additional methods which simplify the setup of the web service. For example, the InstallAdminTools() method installs a set of command-line admin tools in the web service’s directory (see below). SConscript files in subdirectories can use similar methods (such as InstallPython()) to set up the rest of the necessary files for the web service.
To test the web service, run scons test from the command line on the modbase machine (see Testing).
To actually install the web service, run scons build=live or scons build=test from the command line on the modbase machine, as the web service backend user, to install using either of the two configuration files listed in the example above. (If scons is run with no arguments, it will use the first one, live.conf.) Before actually installing any files, this will check to make sure things are set up for the web service to work properly - for example, that the necessary MySQL users and databases are present.
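The way the build option picks a configuration file can be pictured with the sketch below. It is not the code used by saliweb.build; select_config is a hypothetical function, and matching the build name against the file name without its extension is an assumption based on the live/test example above.

```python
import os


def select_config(config_files, build=None):
    """Pick a configuration file for the given build name.

    Hypothetical illustration: with no build name the first file is used;
    otherwise the file whose base name (without extension) matches is used.
    """
    if build is None:
        return config_files[0]
    for path in config_files:
        stem = os.path.splitext(os.path.basename(path))[0]
        if stem == build:
            return path
    raise ValueError(f"no configuration file matches build={build!r}")


configs = ['conf/live.conf', 'conf/test.conf']
print(select_config(configs))          # like running scons with no arguments
print(select_config(configs, 'test'))  # like running scons build=test
```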
Command-line admin tools¶
The build system creates several command-line admin tools in the bin subdirectory under the web service’s install directory. These can be run by the web service user to control the service itself and manipulate jobs in the system.
service.py¶
This tool is used to start, stop or restart the backend itself for the web service. This daemon performs all functions of the web service, waiting for jobs submitted by the web frontend and submitting them to the cluster, harvesting completed cluster jobs, and expiring old job results. The tool also has a condstart option which will only start the service if it is not already running (the regular start option will complain if the service is running).
resubmit.py¶
This tool will move one or more jobs from the FAILED state back to the INCOMING state. It is designed for resubmitting failed jobs once the problem with the web service that caused them to fail has been resolved.
deljob.py¶
This tool will delete one or more jobs in a given state. It can be used to remove failed jobs from the system, or to purge information from the database on expired jobs. Jobs in other states (such as RUNNING or COMPLETED) can also be deleted, but only if the backend service is stopped first, since that service actively manages jobs in these states.
failjob.py¶
This tool will force one or more jobs into the FAILED state. This is useful if, for example, due to a bug in the backend, a job didn’t work properly but went into the COMPLETED state. The backend service must first be stopped in order to use this tool.
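The constraints described for resubmit.py, deljob.py and failjob.py can be summarized as a small set of rules. The sketch below only restates the prose; the function names are hypothetical, the state name EXPIRED is inferred from the mention of expired jobs, and the real tools may enforce additional checks.

```python
# States that the text says can be deleted while the backend is running.
# "EXPIRED" is inferred from the mention of expired jobs.
DELETABLE_WHILE_RUNNING = {"FAILED", "EXPIRED"}


def may_resubmit(state):
    """resubmit.py moves jobs from FAILED back to INCOMING."""
    return state == "FAILED"


def may_delete(state, backend_running):
    """deljob.py: other states require the backend to be stopped first."""
    return state in DELETABLE_WHILE_RUNNING or not backend_running


def may_force_fail(backend_running):
    """failjob.py can only be used while the backend is stopped."""
    return not backend_running


print(may_delete("COMPLETED", backend_running=True))
print(may_delete("COMPLETED", backend_running=False))
```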
delete_all_jobs.py¶
This tool will delete all of the jobs from the web service, so can be used to ‘restore to factory settings’. It deletes the database table, and all the files in all the job directories (even extraneous files that do not correspond to jobs in the database). It should be used with caution, as this cannot be undone.
list_jobs.py¶
This tool will show all the jobs in the given state(s). It is helpful for internal web services that don’t have an easily accessible queue web page.
Testing¶
Before the web service is put into production it should be tested to make sure it works correctly. There are two main types of tests that should be done:
Unit tests test individual parts of the service to make sure they work in isolation.
System tests test the service as a whole.
Unit tests¶
To test the frontend, make a test/frontend subdirectory and put one or more Python scripts there. Each script can use the functions and classes in the saliweb.test module, together with test functionality provided by the Flask framework, to create simple instances of the web frontend and test various methods given different inputs. For example, a script to test the index page might look like:
import unittest
import saliweb.test

# Import the modfoo frontend with mocks
modfoo = saliweb.test.import_mocked_frontend("modfoo", __file__,
                                             '../../frontend')


class Tests(saliweb.test.TestCase):

    def test_index(self):
        """Test index page"""
        c = modfoo.app.test_client()
        rv = c.get('/')
        self.assertIn(b'ModFoo: Modeling using Foo', rv.data)


if __name__ == '__main__':
    unittest.main()
Then write an SConscript file in the same directory to actually run the scripts, using the RunPythonFrontendTests() method. This might look like:
Import('env')
env.RunPythonFrontendTests(Glob("*.py"))
To test the backend, make a test/backend subdirectory and put one or more Python scripts there. Each script should define a subclass of saliweb.test.TestCase and define one or more methods starting with test_ using standard Python unittest methods such as assertEqual. A number of other utility classes are also provided in the saliweb.test module.
For example, to test that the archive() method of the ModFoo service (see Simple job example) really does gzip all of the PDB files, a test case like that below could be used:
import unittest
import modfoo
import saliweb.test
import os


class JobTests(saliweb.test.TestCase):
    """Check custom ModFoo Job class"""

    def test_archive(self):
        """Test the archive method"""
        # Make a ModFoo Job test job in ARCHIVED state
        j = self.make_test_job(modfoo.Job, 'ARCHIVED')
        # Run the rest of this testcase in the job's directory
        with saliweb.test.working_directory(j.directory):
            # Make a test PDB file and another incidental file
            with open('test.pdb', 'w') as f:
                print("test pdb", file=f)
            with open('test.txt', 'w') as f:
                print("text file", file=f)

            # Run the job's "archive" method
            j.archive()

            # Job's archive method should have gzipped every PDB file but not
            # anything else
            self.assertTrue(os.path.exists('test.pdb.gz'))
            self.assertFalse(os.path.exists('test.pdb'))
            self.assertTrue(os.path.exists('test.txt'))


if __name__ == '__main__':
    unittest.main()
Then write an SConscript file in the same directory to actually run the scripts, using the RunPythonTests() method. This might look like:
Import('env')
env.RunPythonTests(Glob("*.py"))
Run scons test to actually run the tests.
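The backend test pattern (create files in a scratch directory, call the method, then check which files exist) can be tried without saliweb installed. The sketch below uses only the standard library; archive_pdbs is a hypothetical stand-in for a Job's archive() method, not the ModFoo implementation, and a temporary directory takes the place of the job directory.

```python
import glob
import gzip
import os
import shutil
import tempfile
import unittest


def archive_pdbs(directory):
    """Gzip every *.pdb file in `directory`, removing the originals.

    Hypothetical stand-in for a Job's archive() method.
    """
    for pdb in glob.glob(os.path.join(directory, "*.pdb")):
        with open(pdb, "rb") as fin, gzip.open(pdb + ".gz", "wb") as fout:
            shutil.copyfileobj(fin, fout)
        os.unlink(pdb)


class ArchiveTests(unittest.TestCase):
    def test_archive(self):
        """PDB files should be gzipped; other files left untouched"""
        with tempfile.TemporaryDirectory() as tmpdir:
            for name in ("test.pdb", "test.txt"):
                with open(os.path.join(tmpdir, name), "w") as f:
                    f.write("content\n")
            archive_pdbs(tmpdir)
            self.assertTrue(os.path.exists(os.path.join(tmpdir, "test.pdb.gz")))
            self.assertFalse(os.path.exists(os.path.join(tmpdir, "test.pdb")))
            self.assertTrue(os.path.exists(os.path.join(tmpdir, "test.txt")))


# Run the test case programmatically and keep the result for inspection
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArchiveTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real service the same assertions would live in a saliweb.test.TestCase subclass and be run through scons test.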
System tests¶
There is currently no rigorous way to carry out system tests other than deploying the service, then using the web interface to submit a job.
Examples¶
A simple example of a complete web service is ModLoop. The source code for this service can be found at https://github.com/salilab/modloop/ and the service can be seen in action at https://salilab.org/modloop/