If you’ve read our earlier post you’ll understand how to create an AWS ParallelCluster and adapt the head and compute nodes with custom scripts that you place in an AWS S3 bucket.
Here we discuss using the AWS ParallelCluster ImageBuilder utility to create a custom AMI for your cluster. This is useful if you find you’re installing a lot of custom packages that slow down the formation of new compute nodes.
Rather than installing packages and tools every time a cluster node is created (costing valuable minutes) the ImageBuilder lets you construct a customised AMI.
The essence of the custom image is to put your package configuration into a shell-script and store this in an Amazon S3 bucket, which you refer to in an ImageBuilder YAML-based configuration file.
Your IAM user will need the image build pcluster user policy described in the iam-roles part of the AWS ParallelCluster documentation.
Find a suitable Amazon Linux image to base your custom image on. You can use pcluster list-official-images to find some.
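As a rough sketch, the lookup could be wrapped in a small shell function like the one below. The region passed in the usage example is a hypothetical placeholder, and the choice of alinux2 and x86_64 as filters is an assumption; adjust both to match the image and instance type you intend to use.

```shell
# List official ParallelCluster AMIs for one region.
# $1: AWS region to search (supplied by the caller)
# The alinux2 and x86_64 filters are assumptions; change them to suit your cluster.
list_official_images() {
  pcluster list-official-images --region "$1" --os alinux2 --architecture x86_64
}
```

Calling `list_official_images eu-central-1` prints the matching images, and the AMI ID of the one you pick becomes the parent image for your build.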
If you don’t have a default VPC you will need to provide a
With this information you can create a simple YAML file. Here we’ve chosen to use a c6a.4xlarge instance for the image build and our custom script (which you can also use) has been placed on S3 at s3://im-aws-parallel-cluster/imagebuilder-amazon.sh. Use whatever instance and script are suitable for your cluster: -
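A minimal sketch of such a file, written from the shell, might look like the following. The file name image-config.yaml and the parent AMI ID are hypothetical placeholders; only the instance type and the script location come from the text above.

```shell
# Hypothetical file name and placeholder AMI ID: substitute your own values.
CONFIG_FILE="image-config.yaml"
PARENT_IMAGE="ami-0123456789abcdef0"

# Write a minimal ImageBuilder configuration that runs one custom script
# on top of the chosen parent image.
cat > "$CONFIG_FILE" <<EOF
Build:
  InstanceType: c6a.4xlarge
  ParentImage: ${PARENT_IMAGE}
  Components:
    - Type: script
      Value: s3://im-aws-parallel-cluster/imagebuilder-amazon.sh
EOF
```

The script listed under Components runs once during the image build rather than every time a node starts.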
Now we just have to compile the custom image using the pcluster build-image command. We need to provide an identity for the image and a
In the following example our YAML file is called
Building an image can take a substantial length of time (an hour or so) but you can track image build status using the following command: -
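As a hedged sketch, starting the build and tracking it could be scripted as below. The image identity, configuration file name and region are hypothetical placeholders, and the sed-based field extraction is a simplification that assumes pcluster prints one "key": "value" pair per line; a real JSON parser would be more robust.

```shell
# Hypothetical placeholders: replace with your own values.
IMAGE_ID="my-custom-image"
CONFIG_FILE="image-config.yaml"
REGION="eu-central-1"

# Extract the first string value for key $1 from JSON read on stdin.
# Assumes pretty-printed JSON with one "key": "value" pair per line.
json_field() {
  sed -n "s/.*\"$1\": *\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Start the image build.
start_build() {
  pcluster build-image --image-id "$IMAGE_ID" \
    --image-configuration "$CONFIG_FILE" --region "$REGION"
}

# Print the current imageBuildStatus for the image.
image_status() {
  pcluster describe-image --image-id "$IMAGE_ID" --region "$REGION" \
    | json_field imageBuildStatus
}

# Check once a minute until the build is no longer in progress.
wait_for_image() {
  local status
  while status=$(image_status); [ "$status" = "BUILD_IN_PROGRESS" ]; do
    echo "Still building..."
    sleep 60
  done
  echo "Final status: ${status}"
}
```

Run `start_build` once, then `wait_for_image`; the same `json_field` helper can pull amiId out of the describe-image output when the build finishes.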
Once imageBuildStatus from the above command is BUILD_COMPLETE you should also find the image AMI under ec2AmiInfo -> amiId. You can use this in your cluster configuration by placing the AMI in the Image block, and remove the corresponding CustomActions, which are no longer required: -
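For illustration, a small helper that prints a suitable Image block might look like this. The AMI ID in the usage line is a hypothetical placeholder, and defaulting Os to alinux2 is an assumption; it should match the operating system of the parent image you built from.

```shell
# Print an Image block for the cluster configuration.
# $1: AMI ID reported under ec2AmiInfo -> amiId
# $2: operating system of the parent image (alinux2 assumed if omitted)
image_block() {
  local ami_id="$1" os="${2:-alinux2}"
  printf 'Image:\n  Os: %s\n  CustomAmi: %s\n' "$os" "$ami_id"
}

# Hypothetical AMI ID, for illustration only.
image_block "ami-0123456789abcdef0"
```

Paste the printed block into your cluster configuration in place of any existing Image block.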
You can see a fuller discussion of ParallelCluster and the ImageBuilder in our nextflow-pcluster repository.