Requirements

  • An Amazon AWS account.
  • Paired-end short read data in FASTQ format.
  • Some time and patience. High-quality data can take time.

That's it! The STORMSeq software itself is free and open source, but Amazon will charge you for the compute time and storage used to process your data.

How to use STORMSeq

    0. Set up an Amazon AWS account.

  • Amazon will verify your telephone number and billing information, which typically takes a few minutes but can take longer.

    1. Create an S3 bucket for your reads and results.

  • First, we'll create an S3 persistent storage bucket for our reads (fq/fastq, gzipped fastq, or bam) and results (bam and vcf files).
  • You can do this by going to the AWS console.
  • Click the S3 link.


  • Click "Create Bucket" and name the bucket. The name must not match any other user's bucket name, so you may need to get creative with naming.
  • IMPORTANT: Do not use any capital letters in your bucket name!
  • Note this name, as you will need to enter it on the STORMSeq page later.
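
The naming rules above can be checked before you create the bucket. The following is a minimal sketch, not part of STORMSeq: the function name is hypothetical, and the pattern covers only the basic S3 rules (3 to 63 characters; lowercase letters, digits, hyphens, and dots; starting and ending with a letter or digit). It cannot tell you whether someone else already owns the name.

```python
import re

# Simplified S3 bucket-name pattern (assumption: basic rules only; it does not
# reject edge cases such as consecutive dots or names shaped like IP addresses).
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the simplified check above."""
    return _BUCKET_NAME.fullmatch(name) is not None


# Hypothetical bucket names, for illustration only.
for candidate in ("my-stormseq-reads", "MyStormseqReads", "ab"):
    print(candidate, is_valid_bucket_name(candidate))
```

A name containing capital letters fails this check, which matches the warning above.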

  • You may now upload your reads to this bucket.


    • For multiple individuals, you must create a folder (with a unique name, which will be the sample name later) for each individual, and upload all the reads for that individual into that folder.
    • For a single individual, the reads can be in a named folder or in the main directory.
    • Reads can be *.fq, *.fastq, *.fq.gz (gzipped), or *.bam.
  • The upload may take some time depending on how much data you have (possibly overnight for high-coverage genomes).
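
To double-check the folder layout before starting a run, you can group the object names in your bucket by their top-level folder. This is only a sketch of the convention described above, not STORMSeq's own code: the function name and the `default_sample` placeholder are hypothetical, and the extension list contains only the formats named in this guide.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Read formats listed in this guide (assumption: nothing else is accepted).
READ_SUFFIXES = (".fq", ".fastq", ".fq.gz", ".bam")


def group_reads_by_sample(keys, default_sample="sample"):
    """Map each top-level folder (the sample name) to its read files.

    Files in the bucket's main directory are grouped under `default_sample`.
    Anything that does not look like a read file is ignored.
    """
    samples = defaultdict(list)
    for key in keys:
        if not key.endswith(READ_SUFFIXES):
            continue
        parts = PurePosixPath(key).parts
        sample = parts[0] if len(parts) > 1 else default_sample
        samples[sample].append(key)
    return dict(samples)


# Hypothetical object names, for illustration only.
keys = ["mom/lane1_1.fq.gz", "mom/lane1_2.fq.gz", "dad/reads.bam", "notes.txt"]
for sample, files in group_reads_by_sample(keys).items():
    print(sample, files)
```

Each folder name printed here corresponds to one individual, so a sample showing an unexpected number of files usually means some reads were uploaded to the wrong folder.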

    2. Go to the EC2 console and start a new EC2 instance to run your webserver.

  • When the upload is done or close to done, we can set up the rest of the server.
  • In the AWS console, click on the EC2 link.



  • You are now on the EC2 console, where you can see all your running resources in the left panel.
  • IMPORTANT: Note your region in the top right. Please choose US East (Virginia) for now.
  • Click "Launch Instance" and click Continue through the Classic Wizard.


  • Click "Community AMIs."


  • Search for the STORMSeq AMI (choose the latest version) and select it.


  • Select your instance type. Since this is just the webserver, the instance can be of any type, including Micro.
  • We can now skip ahead to "Configure Security Group."


  • Create a new security group named "stormseq".
  • Select HTTPS from the "Create a new rule" dropdown box, click "Add rule", and then click Continue.
  • If this is not your first time using STORMSeq, you can skip this step if the "stormseq" security group still exists.
  • Click "Review and Launch."


  • Your instance is ready! Click Launch.
  • In the key pair screen, you may proceed without a key pair. Check the box and click "Launch Instances."


  • After a moment (usually about a minute), the server should be up and running. Click the instance itself and copy the address (ec2-XX-XX-XX-XX.compute-1.amazonaws.com).
  • Point your web browser to a secure version of this address (https://ec2-XX-XX-XX-XX.compute-1.amazonaws.com).
  • Your browser will likely warn you that the certificate cannot be trusted. This is because the server signed the certificate itself. It is safe to proceed.
  • You will now see the main STORMSeq page. Leave this open for the next steps.

    3. Get security credentials.

  • While we're waiting, we can get the last of the security credentials needed. Click your name in the top right corner, and then click Security Credentials.
  • Open the "Access Keys" dropdown and click "Create New Access Key" to generate a new access key ID and secret key.


  • Click "Download Key File" and save the .csv file to your computer.


  • Back on the security credentials page, copy your account number.
  • You can either paste this number directly into the STORMSeq page we opened earlier, or open the .csv file we just downloaded in a text editor and add a new line of the form AccountNumber=####-####-####, replacing the #'s with your account number.
  • This is handy for future runs of STORMSeq.
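
If you would rather not edit the key file by hand, the extra line can be added with a few lines of Python. This is a minimal sketch under stated assumptions: both function names are hypothetical, the account number is assumed to be the usual 12 digits, and the key file content shown is a placeholder rather than the real layout of your download.

```python
import re


def format_account_number(raw: str) -> str:
    """Format a 12-digit account number as ####-####-####."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, hyphens, and other separators
    if len(digits) != 12:
        raise ValueError("expected a 12-digit AWS account number")
    return "-".join(digits[i:i + 4] for i in range(0, 12, 4))


def add_account_number(key_file_text: str, raw_account_number: str) -> str:
    """Return the key file text with a single AccountNumber line at the end."""
    new_line = "AccountNumber=" + format_account_number(raw_account_number)
    kept = [
        line for line in key_file_text.splitlines()
        if not line.startswith("AccountNumber=")  # avoid duplicates on re-runs
    ]
    return "\n".join(kept + [new_line]) + "\n"


# Placeholder content and a made-up account number, for illustration only.
original = "first line of the key file\nsecond line of the key file\n"
print(add_account_number(original, "1234 5678 9012"))
```

To update a real file, read its text, pass it through `add_account_number`, and write the result back. Keep the file private, since it contains your secret key.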

    4. Set your parameters and begin processing!

  • Return to your STORMSeq start page (ec2-XX-XX-XX-XX.compute-1.amazonaws.com) and poke around!
  • Paste your account number into the AWS Account Number field, and copy and paste the Access Key ID and Secret Access Key from the csv file into the relevant fields.
  • Alternatively, you can drag and drop the csv file anywhere onto this page and the fields will automatically fill in.


  • In particular, note the Request Type button, where you can set the price you bid for compute time to save money (see Amazon's documentation on spot instances for more information). Note that STORMSeq currently cannot recover from failed spot requests: if your jobs are cancelled due to a low bid, they cannot be resumed and will need to be set up from scratch.
  • Clicking Demo Data will override any samples you have and run the exome demo provided.
  • If you have one sample and it is in the root directory of the bucket, you may name it whatever you'd like. If it is in a subfolder, the folder name should be used as the sample name.
  • If you have multiple samples, the sample names will be guessed from the names of the folders you uploaded previously.
  • Wait until your reads are finished uploading, then click "GO!" to start the pipeline. This may take a few minutes depending on a number of factors, including the number of samples and the size of your upload.
  • Note that clicking GO launches additional EC2 instances, which will be billed to your Amazon account. Importantly, one STORMSeq run may start many machines, so your total hourly rate may be higher than the prices listed or the spot bid you set. For information about your usage and billing, check the billing page of your AWS account.
  • That's it! Your progress will be updated every few minutes, and results will be available on the front page and in the S3 bucket.
  • If STORMSeq finishes successfully, it should automatically shut down the instances it started. However, you will need to terminate the webserver instance yourself, either by right-clicking the instance or by selecting it and clicking "Actions".
  • In some cases, the process may fail to shut down the additional machines, in which case you should select all the instances and terminate them.
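
As a rough illustration of the billing note above, the combined hourly rate scales with the number of machines running at once. The numbers below are hypothetical and are not actual Amazon prices; the function name is also an assumption made for this sketch.

```python
def estimated_hourly_rate(num_instances: int, price_per_instance_hour: float) -> float:
    """Combined hourly rate when several instances run at the same time."""
    if num_instances < 0 or price_per_instance_hour < 0:
        raise ValueError("inputs must be non-negative")
    return num_instances * price_per_instance_hour


# Hypothetical numbers for illustration only: 8 instances at $0.25 per hour each.
rate = estimated_hourly_rate(8, 0.25)
print(f"combined hourly rate: ${rate:.2f}")
```

Machines that fail to shut down keep accruing charges at this rate, which is why it is worth confirming in the EC2 console that everything has been terminated.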




  • Results will be available in the S3 bucket, and visualizations (such as quality metrics, SNP density, and indel length distributions) will be shown on the STORMSeq page.