One of our more interesting GitHub projects is the Docking Validation project. We use this to establish and document best practices in virtual screening tools (such as docking) and approaches to semi-automating and scaling these procedures.
We just completed a new ‘experiment’ related to our work with the Diamond Light Source’s XChem project, which has done some amazing work on fragment-based screening using X-ray crystallography.
With fragment screening you frequently get a number of crystal structures for your target, each with a different ligand. XChem has made this process relatively routine, and the challenge now is to follow up the fragment structures and turn them into drug leads. A key part of this is to screen potential analogues of those fragments using docking algorithms.
The new experiment looks into one aspect of this: how to select the protein structure for docking. You have a handful of crystal structures of your protein target, each with a different fragment ligand. Once you remove the ligand, each structure is slightly different. You don’t want to use all of these for your docking as it is computationally expensive. So which one do you use? Or do you need more than one?
So we took an approach suggested by Thomas Exner, one of the authors of the PLANTS docking program. This is to dock each ligand into each protein structure and compare the docked pose with the actual pose in the crystal structure. You may find that some protein structures are better at docking the range of ligands than others. Hopefully one structure can dock the whole lot successfully, but you might do better with two or three structures to get a good spread across the range of ligands.
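The pose comparison is typically scored as a heavy-atom RMSD between the docked pose and the crystal pose, counting a ligand as reproduced when the RMSD falls below a conventional threshold of around 2 Å. A minimal sketch of that book-keeping (the threshold, names and toy coordinates here are illustrative, not the project’s actual data or tooling):

```python
import math

RMSD_THRESHOLD = 2.0  # Angstroms: a conventional redocking pass/fail cut-off

def rmsd(pose_a, pose_b):
    """Heavy-atom RMSD between two poses with matched atom ordering."""
    assert len(pose_a) == len(pose_b)
    total = sum(sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
                for atom_a, atom_b in zip(pose_a, pose_b))
    return math.sqrt(total / len(pose_a))

def rank_structures(docked, crystal):
    """docked[protein][ligand] -> docked pose; crystal[ligand] -> crystal pose.
    Rank protein structures by how many ligands they redock successfully."""
    scores = {protein: sum(1 for lig, pose in poses.items()
                           if rmsd(pose, crystal[lig]) <= RMSD_THRESHOLD)
              for protein, poses in docked.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy illustration: 'protA' reproduces the ligand pose closely, 'protB' does not.
crystal = {"lig-1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}
docked = {
    "protA": {"lig-1": [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]},
    "protB": {"lig-1": [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)]},
}
ranking = rank_structures(docked, crystal)
```

With the toy data above the ranking puts protA first: its redocked pose is only 0.5 Å from the crystal pose, while protB’s is well over the threshold.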
So that’s what we set out to achieve, and to partly automate. The data is for the NUDT7 target studied at the SGC in Oxford, and was prepared by Anthony Bradley while he was working at Diamond. There were five crystal structures of this target, each with a different fragment ligand.
One key part of the process is to define the binding cavity of the protein. We are using rDock for the docking, partly because it has good tooling to support the docking process. Part of this is the ‘rbcavity’ program that can be used to define the binding site. One way to use it is with an existing ligand: the coordinates of that ligand’s atoms are used to define the binding cavity. But we have multiple crystal structures of our NUDT7 protein, each with a different ligand, and those fragment ligands can occupy different parts of the binding cavity. So no single ligand really does the job for us. We want the space occupied by all those ligands. But the rbcavity program can only take a single molecule as input.
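For context, rbcavity reads its instructions from an rDock parameter (prm) file; with the ligand-based site mapper, the molecule named by REF_MOL defines the site. A rough sketch of such a file follows (the file names and most parameter values are placeholders; consult the rDock reference guide for the authoritative syntax):

```
RBT_PARAMETER_FILE_V1.00
TITLE NUDT7 cavity definition

RECEPTOR_FILE receptor.mol2

SECTION MAPPER
    SITE_MAPPER RbtLigandSiteMapper
    REF_MOL reference_ligand.sd
    RADIUS 6.0
    SMALL_SPHERE 1.0
    MIN_VOLUME 100
    MAX_CAVITIES 1
END_SECTION

SECTION CAVITY
    SCORING_FUNCTION RbtCavityGridSF
    WEIGHT 1.0
END_SECTION
```

The cavity itself is then generated with something like rbcavity -was -d -r cavity.prm, which writes the active-site definition used by the docking runs.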
The solution seemed to be to create a single hybrid molecule that contained all the ligands and use that to map out the binding cavity. This seemed a bit crazy, but a quick email exchange with Peter Schmidtke and Xavier Barril from the rDock team indicated that this was the way ahead, and Xavier confirmed that he had used this approach before and provided a Perl script that does this. Peter suggested the name ‘Frankenstein molecule’, which seems to have stuck. The script combines the atoms from the multiple ligands into one molecule, skipping atoms that don’t contribute to the ‘outer’ surface of the ligands. No bonds are included, as rbcavity does not need them. We have a cut-and-paste molecule with all the useful atoms, but no bonds. A true monster, but loveable as it does exactly what we need. That monster molecule is used for the cavity definition and seems to work well.
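The idea can be sketched in a few lines of Python. Note this is only an approximation of the approach: the real Perl script’s surface criterion may differ, and here we simply skip an atom when it sits within a small cutoff of an atom we have already kept, on the assumption that such atoms are buried and add nothing to the cavity definition. The cutoff and coordinates are illustrative:

```python
import math

CUTOFF = 1.0  # Angstroms: atoms closer than this to a kept atom are dropped

def frankenstein(ligands):
    """ligands: a list of atom-coordinate lists, one per fragment ligand.
    Pool all atoms into one bond-less 'molecule', skipping any atom that
    sits within CUTOFF of an atom that has already been kept."""
    kept = []
    for atoms in ligands:
        for atom in atoms:
            if all(math.dist(atom, other) > CUTOFF for other in kept):
                kept.append(atom)
    return kept

# Two toy fragments: the first atom of lig2 overlaps lig1 and is dropped.
lig1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
lig2 = [(0.1, 0.0, 0.0), (3.0, 0.0, 0.0)]
monster = frankenstein([lig1, lig2])
```

In the toy case three of the four atoms survive: the overlapping atom is discarded, and the combined atom list (with no bonds) is what would be fed to rbcavity as the reference molecule.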
For more information on the experiment, look at the details on GitHub.
We welcome comments and contributions on the Docking Validation project. As you might expect from us it’s freely accessible to all.
You’ve probably created a machine image at some point: a base image for AWS that builds upon someone else’s work by adding a particular version of Java or Python, or a new utility. Did you create the image on AWS by launching an EC2 instance, logging in, running some apt-get commands and then saving it? Great. But what if someone wants the source code for that image, or you want to build a similar image in a different region or on a different provider? Well, Packer is an Infrastructure-as-Code (IaC) tool for automating the construction of machine images.
The Python Jenkins module is a convenient wrapper for the Jenkins REST API that gives you control of a Jenkins server in a Pythonic way. Here we’ll see how to grab all the jobs from a Jenkins server, and how those jobs can be re-created from the captured material.
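The capture-and-restore idea can be sketched as follows. With the Python Jenkins module the server object would come from jenkins.Jenkins(url, username, password); get_jobs(), get_job_config() and create_job() are the module’s real calls. The job data below is a placeholder, and the _FakeJenkins class exists only so the sketch runs without a live server:

```python
def backup_jobs(server):
    """Return a {job name: config XML} map for every job on the server."""
    return {job["name"]: server.get_job_config(job["name"])
            for job in server.get_jobs()}

def restore_jobs(server, configs):
    """Re-create each captured job on the (possibly different) server."""
    for name, config_xml in configs.items():
        server.create_job(name, config_xml)

class _FakeJenkins:
    """Stand-in exposing the same method names as python-jenkins,
    used here only to make the sketch runnable without a server."""
    def __init__(self, jobs=None):
        self._jobs = dict(jobs or {})
    def get_jobs(self):
        return [{"name": name} for name in self._jobs]
    def get_job_config(self, name):
        return self._jobs[name]
    def create_job(self, name, config_xml):
        self._jobs[name] = config_xml

source = _FakeJenkins({"build-app": "<project>...</project>"})
captured = backup_jobs(source)
target = _FakeJenkins()
restore_jobs(target, captured)
```

Against a real server you would swap each _FakeJenkins for a jenkins.Jenkins connection; the two helper functions stay the same.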
In this post we look at using buildah to generate container images that only contain what we want, no extra fluff. We show how this can let us generate truly small images that will load faster and be more secure, and do this without the need for the Docker daemon to be running.
Here we’re going to be looking at the idea of applying automation tools to the wider product development process. Tools that help you do this are part of a collection known as “Infrastructure as Code”, which refers to the provisioning of compute instances (physical machines and their operating systems) and software applications using revision-controlled, machine-readable text files.
This series of posts describes how we can generate smaller Docker images. In the first post we outlined a common problem with container images - that they frequently contain artefacts that were needed to build the software or to install it into the container. We’ll show one approach that can be used to avoid this extra bloat, and so generate smaller and more secure containers.
I’ve been in meetings, often driven by the root-cause analysis of a software fault found in the field, where the topic of code coverage has cropped up. I’m sure many of us have been in similar meetings. On occasion I’ve also been asked to justify some of my apparently poor line-coverage figures, where the percentage has fallen short of what was perceived by the inquisitor as acceptable.
This is the first in a series of blog posts about building better Docker images.
Docker Inc is widely acknowledged for transitioning containers from geekdom to the real world inhabited by us developers, and it did this by providing easy-to-use tools for building, sharing and running containers. Key to this are the docker build command and the Dockerfile.
Welcome to the Informatics Matters blog.
This is the first post of what will become a regular stream of information about our activities at Informatics Matters in providing solutions for scientific computing, including bioinformatics, genomics, cheminformatics and computational chemistry.