Scientific Applications on ACCESS Anvil

Anvil System

This is the list of applications, compilers, MPIs, NVIDIA NGC containers, and biocontainers deployed on ACCESS Anvil, which is managed by the Rosen Center for Advanced Computing (RCAC) at Purdue University.

Compilers

aocc

Description

The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux® platforms.

Versions

  • 3.1.0

Module

You can load the modules by:

module load aocc

gcc

Description

The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages.

Versions

  • 8.4.1

  • 10.2.0

  • 11.2.0

  • 11.2.0-openacc

Module

You can load the modules by:

module load gcc
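Because several GCC versions are listed above, a job script may want to pin one instead of relying on the default. The following is a minimal sketch, assuming the `module` command is available in the shell, as it is on the cluster. The helper name `load_pinned` is hypothetical and is not part of any RCAC tooling.

```shell
#!/bin/bash
# load_pinned NAME VERSION: load NAME/VERSION through the module system.
# Hypothetical helper for illustration; it only wraps `module load`.
load_pinned() {
    local name="$1" version="$2"
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found; skipping ${name}/${version}" >&2
        return 1
    fi
    module load "${name}/${version}"
}

# Example: pin GCC 11.2.0, one of the versions listed above.
# `|| true` keeps the script going when run outside the cluster.
load_pinned gcc 11.2.0 || true
```

The same pattern applies to any module in this list that has more than one version.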

intel

Description

Intel Parallel Studio.

Versions

  • 19.0.5.281

Module

You can load the modules by:

module load intel

MPIs

impi

Description

Intel MPI

Versions

  • 2019.5.281

Module

You can load the modules by:

module load impi

mvapich2

Description

MVAPICH2 is a high-performance MPI library for clusters with diverse networks (InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE) and computing platforms (x86 Intel and AMD, ARM, and OpenPOWER).

Versions

  • 2.3.6

Module

You can load the modules by:

module load mvapich2

openmpi

Description

An open source Message Passing Interface implementation.

Versions

  • 3.1.6

  • 4.0.6

  • 4.0.6-cu11.0.3

Module

You can load the modules by:

module load openmpi

Applications

AMD

amdblis

Description

AMD Optimized BLIS. BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries.

Versions
  • 3.0

Module

You can load the modules by:

module load amdblis

amdfftw

Description

The AMD-optimized version of FFTW is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof.

Versions
  • 3.0

Module

You can load the modules by:

module load amdfftw

amdlibflame

Description

The AMD-optimized version of libFLAME is a portable library for dense matrix computations, providing much of the functionality present in the Linear Algebra Package (LAPACK). It includes a compatibility layer, FLAPACK, which includes a complete LAPACK implementation.

Versions
  • 3.0

Module

You can load the modules by:

module load amdlibflame

amdlibm

Description

AMD LibM is a software library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions. Applications can link against the AMD LibM library and invoke its math functions instead of the compiler's math functions for better accuracy and performance.

Versions
  • 3.0

Module

You can load the modules by:

module load amdlibm

amdscalapack

Description

ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It depends on external libraries including BLAS and LAPACK for Linear Algebra computations.

Versions
  • 3.0

Module

You can load the modules by:

module load amdscalapack

Audio/Visualization

ffmpeg

Description

FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.

Versions
  • 4.2.2

Module

You can load the modules by:

module load ffmpeg

gmt

Description

GMT (Generic Mapping Tools) is an open source collection of about 80 command-line tools for manipulating geographic and Cartesian data sets (including filtering, trend fitting, gridding, projecting, etc.) and producing PostScript illustrations ranging from simple x-y plots via contour maps to artificially illuminated surfaces and 3D perspective views.

Versions
  • 6.1.0

Module

You can load the modules by:

module load gmt

gnuplot

Description

Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don't have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.

Versions
  • 5.4.2

Module

You can load the modules by:

module load gnuplot

paraview

Description

ParaView is an open-source, multi-platform data analysis and visualization application.

Versions
  • 5.9.1

  • 5.10.1

Module

You can load the modules by:

module load paraview

visit

Description

VisIt is an open source, interactive, scalable visualization, animation, and analysis tool.

Versions
  • 3.1.4

Module

You can load the modules by:

module load visit

vlc

Description

VLC is a free and open source multimedia player for most multimedia formats.

Versions
  • 3.0.9.2

Module

You can load the modules by:

module load vlc

vtk

Description

The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization.

Versions
  • 9.0.0

Module

You can load the modules by:

module load vtk

Bioinformatics

bamtools

Description

C++ API & command-line toolkit for working with BAM data.

Versions
  • 2.5.2

Module

You can load the modules by:

module load bamtools
Example job

To run bamtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=bamtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module load bamtools

bamtools convert -format fastq -in in.bam -out out.fastq

beagle

Description

Beagle is a software package for phasing genotypes and for imputing ungenotyped markers.

Versions
  • 5.1

Module

You can load the modules by:

module load beagle

beast2

Description

BEAST is a cross-platform program for Bayesian inference using MCMC of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology.

Versions
  • 2.6.4

Module

You can load the modules by:

module load beast2

bismark

Description

A tool to map bisulfite converted sequence reads and determine cytosine methylation states

Versions
  • 0.23.0

Module

You can load the modules by:

module load bismark

blast-plus

Description

Basic Local Alignment Search Tool. BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.

Versions
  • 2.12.0

Module

You can load the modules by:

module load blast-plus
BLAST Databases

Local copies of the BLAST databases can be found in the directory /anvil/datasets/ncbi/blast/latest. The environment variable BLASTDB is also set to /anvil/datasets/ncbi/blast/latest. To use the cdd_delta, env_nr, env_nt, nr, nt, pataa, patnt, pdbnt, refseq_protein, refseq_rna, swissprot, or tsa_nt databases, you do not need to provide the database path; just pass the database name, for example -db nr.
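The database lookup described above can be sketched as a small shell helper. This is an illustration only: the function name `blast_db_arg` is hypothetical, and the fallback to a full path assumes that the database files sit directly under the directory named above.

```shell
#!/bin/bash
# blast_db_arg NAME: print the value to pass to -db for database NAME.
# If BLASTDB is set (the text above says it is set on the cluster), the bare
# database name is enough. Otherwise fall back to the full path (assumption:
# the database files live directly in that directory).
blast_db_arg() {
    local name="$1"
    if [ -n "${BLASTDB:-}" ]; then
        printf '%s\n' "$name"
    else
        printf '%s\n' "/anvil/datasets/ncbi/blast/latest/${name}"
    fi
}

# Print the resulting command line instead of running it.
echo "blastp -query protein.fasta -db $(blast_db_arg nr) -out test_out"
```

Printing the command first makes it easy to confirm which database location a job will use before submitting it.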

Example job

To run blast on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=blast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module load blast-plus

blastp -query protein.fasta -db nr -out test_out -num_threads 4

bowtie2

Description

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences

Versions
  • 2.4.2

Module

You can load the modules by:

module load bowtie2

bwa

Description

Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.

Versions
  • 0.7.17

Module

You can load the modules by:

module load bwa

cutadapt

Description

Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.

Versions
  • 2.10

Module

You can load the modules by:

module load cutadapt

fastqc

Description

A quality control tool for high throughput sequence data.

Versions
  • 0.11.9

Module

You can load the modules by:

module load fastqc

fasttree

Description

FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.

Versions
  • 2.1.10

Module

You can load the modules by:

module load fasttree

fastx-toolkit

Description

The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.

Versions
  • 0.0.14

Module

You can load the modules by:

module load fastx-toolkit

gatk

Description

Genome Analysis Toolkit: variant discovery in high-throughput sequencing data.

Versions
  • 4.1.8.1

Module

You can load the modules by:

module load gatk
Example job

To run gatk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=gatk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module load gatk

gatk  --java-options "-Xmx12G -XX:ParallelGCThreads=24" HaplotypeCaller -R hg38.fa -I 19P0126636WES.sorted.bam  -O 19P0126636WES.HC.vcf --sample-name 19P0126636

htseq

Description

HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.

Versions
  • 0.11.2

Module

You can load the modules by:

module load htseq

mrbayes

Description

MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.

Versions
  • 3.2.7a

Module

You can load the modules by:

module load mrbayes

nf-core

Description

A community effort to collect a curated set of analysis pipelines built using Nextflow and tools to run the pipelines.

Versions
  • 2.7.2

  • 2.8

Module

You can load the modules by:

module load nf-core

perl-bioperl

Description

BioPerl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects. These objects not only do what they are advertised to do in the documentation, but they also interact - Alignment objects are made from the Sequence objects, Sequence objects have access to Annotation and SeqFeature objects and databases, Blast objects can be converted to Alignment objects, and so on. This means that the objects provide a coordinated and extensible framework to do computational biology.

Versions
  • 1.7.6

Module

You can load the modules by:

module load perl-bioperl

picard

Description

Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.

Versions
  • 2.25.7

Module

You can load the modules by:

module load picard

samtools

Description

SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format

Versions
  • 1.12

Module

You can load the modules by:

module load samtools

sratoolkit

Description

The NCBI SRA Toolkit enables reading ("dumping") of sequencing files from the SRA database and writing ("loading") files into the .sra format.

Versions
  • 2.10.9

Module

You can load the modules by:

module load sratoolkit

tophat

Description

Spliced read mapper for RNA-Seq.

Versions
  • 2.1.2

Module

You can load the modules by:

module load tophat

trimmomatic

Description

A flexible read trimming tool for Illumina NGS data.

Versions
  • 0.39

Module

You can load the modules by:

module load trimmomatic

vcftools

Description

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.

Versions
  • 0.1.14

Module

You can load the modules by:

module load vcftools

Climate

cdo

Description

CDO is a collection of command line Operators to manipulate and analyse Climate and NWP model Data.

Versions
  • 1.9.9

Module

You can load the modules by:

module load cdo

ncl

Description

NCL is an interpreted language designed specifically for scientific data analysis and visualization. It supports NetCDF 3/4, GRIB 1/2, HDF 4/5, HDF-EOS 2/5, shapefile, ASCII, and binary formats. Numerous analysis functions are built in.

Versions
  • 6.4.0

Module

You can load the modules by:

module load ncl

Computational chemistry

amber

Description

AMBER (Assisted Model Building with Energy Refinement) is a package of molecular simulation programs.

Versions
  • 20

Module

You can load the modules by:

module load amber

cp2k

Description

CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems

Versions
  • 8.2

Module

You can load the modules by:

module load cp2k

gromacs

Description

GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.

Versions
  • 2021.2

Module

You can check the available gromacs versions by:

module spider gromacs

You can check how to load the gromacs module by the module’s full name:

module spider gromacs/XXXX
Note: RCAC also installed some containerized gromacs modules.

To use these containerized modules, please follow the instructions in the output of “module spider gromacs/XXXX”.

You can load the modules by:

module load gromacs # for default version
module load gromacs/XXXX # for specific version
Usage

The GROMACS executable is gmx_mpi. Use gmx_mpi help commands to list the available commands, and gmx_mpi help followed by a command name for help on that command.

For more details about how to run GROMACS, please check the GROMACS documentation.

Example job
#!/bin/bash
# FILENAME:  myjobsubmissionfile

#SBATCH --nodes=2       # Total # of nodes
#SBATCH --ntasks=256    # Total # of MPI tasks
#SBATCH --time=1:30:00  # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname    # Job name
#SBATCH -o myjob.o%j    # Name of stdout output file
#SBATCH -e myjob.e%j    # Name of stderr error file

# Manage processing environment, load compilers and applications.
module purge
module load gcc/XXXX openmpi/XXXX # or module load intel/XXXX impi/XXXX | depends on the output of "module spider gromacs/XXXX"
module load gromacs/XXXX
module list

# Launch MPI code
gmx_mpi pdb2gmx -f my.pdb -o my_processed.gro -water spce
gmx_mpi grompp -f my.mdp -c my_processed.gro -p topol.top -o topol.tpr
srun -n $SLURM_NTASKS gmx_mpi mdrun -s topol.tpr
Note

Using mpirun -np $SLURM_NTASKS gmx_mpi or mpiexec -np $SLURM_NTASKS gmx_mpi may not work for non-exclusive jobs on some clusters. Use srun -n $SLURM_NTASKS gmx_mpi or mpirun gmx_mpi instead. When the number of ranks is not specified, mpirun gmx_mpi automatically picks up the value of SLURM_NTASKS and works fine.
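The launcher choice in this note can be sketched as a small helper. This is only an illustration: the function name `mpi_launch_prefix` is hypothetical, and falling back to plain mpirun when SLURM_NTASKS is unset is an assumption made here for runs outside a Slurm allocation.

```shell
#!/bin/bash
# mpi_launch_prefix: print the launcher prefix suggested in the note above.
# Inside a Slurm job (SLURM_NTASKS set), use "srun -n $SLURM_NTASKS".
# Otherwise use plain "mpirun" with no explicit rank count (assumption).
mpi_launch_prefix() {
    if [ -n "${SLURM_NTASKS:-}" ]; then
        printf 'srun -n %s' "$SLURM_NTASKS"
    else
        printf 'mpirun'
    fi
}

# Print the full command line instead of running it, so the sketch can be
# tried outside the cluster.
echo "$(mpi_launch_prefix) gmx_mpi mdrun -s topol.tpr"
```

Neither branch produces the mpirun -np or mpiexec -np forms that the note warns against.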

lammps

Description

LAMMPS is a classical molecular dynamics code with a focus on materials modelling. It’s an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.

LAMMPS has potentials for solid-state materials (metals, semiconductors) and soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

Versions
  • 20210310

  • 20210310-kokkos

Module

You can check the available lammps versions by:

module spider lammps

You can check how to load the lammps module by the module’s full name:

module spider lammps/XXXX

You can load the modules by:

module load lammps # for default version
module load lammps/XXXX # for specific version
Usage

LAMMPS reads commands from an input file such as “in.file”. The LAMMPS executable is lmp. To run a LAMMPS input file, use the -in option:

lmp -in in.file

For more details about how to run LAMMPS, please check the LAMMPS documentation.
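A job can spend a long time in the queue before it fails on a missing input file, so it may be worth checking the file first. The sketch below assumes nothing beyond the lmp -in usage shown above; the function name `lammps_cmd` is hypothetical.

```shell
#!/bin/bash
# lammps_cmd INPUT: print the lmp command line for INPUT, or fail with a
# message on stderr if the input file cannot be read.
lammps_cmd() {
    local input="$1"
    if [ ! -r "$input" ]; then
        echo "LAMMPS input file not readable: $input" >&2
        return 1
    fi
    printf 'lmp -in %s\n' "$input"
}

# Example with the placeholder file name used above. The command is printed,
# not executed; `|| true` keeps the script going if the file is missing.
lammps_cmd in.file || true
```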

Example job
#!/bin/bash
# FILENAME:  myjobsubmissionfile

#SBATCH --nodes=2       # Total # of nodes
#SBATCH --ntasks=256    # Total # of MPI tasks
#SBATCH --time=1:30:00  # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname    # Job name
#SBATCH -o myjob.o%j    # Name of stdout output file
#SBATCH -e myjob.e%j    # Name of stderr error file

# Manage processing environment, load compilers and applications.
module purge
module load gcc/XXXX openmpi/XXXX # or module load intel/XXXX impi/XXXX | depends on the output of "module spider lammps/XXXX"
module load lammps/XXXX
module list

# Launch MPI code
srun -n $SLURM_NTASKS lmp -in in.file
Note

Using mpirun -np $SLURM_NTASKS lmp or mpiexec -np $SLURM_NTASKS lmp may not work for non-exclusive jobs on some clusters. Use srun -n $SLURM_NTASKS lmp or mpirun lmp instead. When the number of ranks is not specified, mpirun lmp automatically picks up the value of SLURM_NTASKS and works fine.

namd

Description

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.

Versions
  • 2.14

Module

You can load the modules by:

module load namd

nwchem

Description

High-performance computational chemistry software

Versions
  • 7.0.2

Module

You can load the modules by:

module load nwchem

quantum-espresso

Description

Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.

Versions
  • 6.7

Module

You can load the modules by:

module load quantum-espresso

vasp

Description

The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.

Versions
  • 5.4.4.pl2

  • 6.3.0

Module

You can load the modules by:

module load vasp

vmd

Description

VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.

Versions
  • 1.9.3

Module

You can load the modules by:

module load vmd

wannier90

Description

Wannier90 is an open-source code released under GPLv2 for generating maximally-localized Wannier functions and using them to compute advanced electronic properties of materials with high efficiency and accuracy.

Versions
  • 3.1.0

Module

You can load the modules by:

module load wannier90

Fluid dynamics

openfoam

Description

OpenFOAM is leading software for computational fluid dynamics (CFD).

Versions
  • 8-20210316

Module

You can load the modules by:

module load openfoam

Geospatial tools

gdal

Description

GDAL (Geospatial Data Abstraction Library) is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful command line utilities for data translation and processing.

Versions
  • 2.4.4

  • 3.2.0

Module

You can load the modules by:

module load gdal

geos

Description

GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS). As such, it aims to contain the complete functionality of JTS in C++. This includes all the OpenGIS Simple Features for SQL spatial predicate functions and spatial operators, as well as specific JTS enhanced topology functions.

Versions
  • 3.8.1

  • 3.9.1

Module

You can load the modules by:

module load geos

grads

Description

The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).

Versions
  • 2.2.1

Module

You can load the modules by:

module load grads

proj

Description

PROJ is generic coordinate transformation software that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.

Versions
  • 5.2.0

  • 6.2.0

Module

You can load the modules by:

module load proj

Libraries

arpack-ng

Description

ARPACK-NG is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.

Versions
  • 3.8.0

Module

You can load the modules by:

module load arpack-ng

blis

Description

BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries.

Versions
  • 0.8.1

Module

You can load the modules by:

module load blis

boost

Description

Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.

Versions
  • 1.74.0

Module

You can load the modules by:

module load boost

eigen

Description

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

Versions
  • 3.3.9

Module

You can load the modules by:

module load eigen

fftw

Description

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.

Versions
  • 2.1.5

  • 3.3.8

Module

You can load the modules by:

module load fftw

gmp

Description

GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.

Versions
  • 6.2.1

Module

You can load the modules by:

module load gmp

gsl

Description

The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.

Versions
  • 2.4

Module

You can load the modules by:

module load gsl

hdf5

Description

HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.

Versions
  • 1.10.7

Module

You can load the modules by:

module load hdf5

hdf

Description

HDF4 (also known as HDF) is a library and multi-object file format for storing and managing data between machines.

Versions
  • 4.2.15

Module

You can load the modules by:

module load hdf

intel-mkl

Description

Intel’s Math Kernel Library (MKL) provides highly optimized, threaded and vectorized functions to maximize performance on each processor family. It utilises de-facto standard C and Fortran APIs for compatibility with BLAS, LAPACK and FFTW functions from other math libraries.

Versions
  • 2019.5.281

Module

You can load the modules by:

module load intel-mkl

libfabric

Description

The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.

Versions
  • 1.12.0

Module

You can load the modules by:

module load libfabric

libflame

Description

libflame is a portable library for dense matrix computations, providing much of the functionality present in LAPACK, developed by current and former members of the Science of High-Performance Computing (SHPC) group in the Institute for Computational Engineering and Sciences at The University of Texas at Austin. libflame includes a compatibility layer, lapack2flame, which includes a complete LAPACK implementation.

Versions
  • 5.2.0

Module

You can load the modules by:

module load libflame

libiconv

Description

GNU libiconv provides an implementation of the iconv function and the iconv program for character set conversion.

Versions
  • 1.16

Module

You can load the modules by:

module load libiconv

libmesh

Description

The libMesh library provides a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms.

Versions
  • 1.6.2

Module

You can load the modules by:

module load libmesh

libszip

Description

Szip is an implementation of the extended-Rice lossless compression algorithm.

Versions
  • 2.1.1

Module

You can load the modules by:

module load libszip

libtiff

Description

LibTIFF: Tag Image File Format (TIFF) library and utilities.

Versions
  • 4.1.0

Module

You can load the modules by:

module load libtiff

libv8

Description

Distributes the V8 JavaScript engine in binary and source forms in order to support fast builds of The Ruby Racer

Versions
  • 6.7.17

Module

You can load the modules by:

module load libv8

libx11

Description

Xlib − C Language X Interface is a reference guide to the low-level C language interface to the X Window System protocol. It is neither a tutorial nor a user’s guide to programming the X Window System. Rather, it provides a detailed description of each function in the library as well as a discussion of the related background information.

Versions
  • 1.7.0

Module

You can load the modules by:

module load libx11

libxml2

Description

Libxml2 is the XML C parser and toolkit developed for the Gnome project but usable outside of the Gnome platform. It is free software available under the MIT License.

Versions
  • 2.9.10

Module

You can load the modules by:

module load libxml2

mpfr

Description

The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.

Versions
  • 4.0.2

Module

You can load the modules by:

module load mpfr

netcdf-c

Description

NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.

Versions
  • 4.7.4

Module

You can load the modules by:

module load netcdf-c

netcdf-cxx4

Description

NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C++ distribution.

Versions
  • 4.3.1

Module

You can load the modules by:

module load netcdf-cxx4

netcdf-fortran

Description

NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.

Versions
  • 4.5.3

Module

You can load the modules by:

module load netcdf-fortran

netlib-lapack

Description

LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least-squares solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.

Versions
  • 3.8.0

Module

You can load the modules by:

module load netlib-lapack

openblas

Description

OpenBLAS is an open source implementation of the BLAS API with many hand-crafted optimizations for specific processor types

Versions
  • 0.3.17

Module

You can load the modules by:

module load openblas

parallel-netcdf

Description

PnetCDF (Parallel netCDF) is a high-performance parallel I/O library for accessing files in formats compatible with Unidata's NetCDF, specifically the CDF-1, CDF-2, and CDF-5 formats.

Versions
  • 1.11.2

Module

You can load the modules by:

module load parallel-netcdf

petsc

Description

PETSc is a suite of data structures and routines for the scalable parallel solution of scientific applications modeled by partial differential equations.

Versions
  • 3.15.3

Module

You can load the modules by:

module load petsc

swig

Description

SWIG is an interface compiler that connects programs written in C and C++ with scripting languages such as Perl, Python, Ruby, and Tcl. It works by taking the declarations found in C/C++ header files and using them to generate the wrapper code that scripting languages need to access the underlying C/C++ code. In addition, SWIG provides a variety of customization features that let you tailor the wrapping process to suit your application.

Versions
  • 4.0.2

Module

You can load the modules by:

module load swig

ucx

Description

UCX is a communication library implementing high-performance messaging for MPI/PGAS frameworks.

Versions
  • 1.11.2

Module

You can load the modules by:

module load ucx

zlib

Description

A free, general-purpose, legally unencumbered lossless data-compression library.

Versions
  • 1.2.11

Module

You can load the modules by:

module load zlib

Mathematical/Statistics

gurobi

Description

The Gurobi Optimizer was designed from the ground up to be the fastest, most powerful solver available for your LP, QP, QCP, and MIP (MILP, MIQP, and MIQCP) problems.

Versions
  • 9.5.1

Module

You can load the modules by:

module load gurobi

jupyter

Description

Complete Jupyter Hub/Lab/Notebook environment.

Versions
  • 2.0.0

Module

You can load the modules by:

module load jupyter

matlab

Description

MATLAB (MATrix LABoratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. A proprietary programming language developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and Python.

Versions
  • R2020b

  • R2021b

  • R2022a

  • R2023a

Module

You can load the modules by:

module load matlab

meep

Description

Meep (or MEEP) is a free finite-difference time-domain (FDTD) simulation software package developed at MIT to model electromagnetic systems.

Versions
  • 1.20.0

Module

You can load the modules by:

module load meep

octave

Description

GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language.

Versions
  • 6.3.0

Module

You can load the modules by:

module load octave

r

Description

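R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.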

Versions
  • 4.0.5

  • 4.1.0

Module

You can load the modules by:

module load r

rstudio

Description

This package installs RStudio Desktop from pre-compiled binaries available on the RStudio website. The installer assumes that you are running on CentOS7/Redhat7/Fedora19. Please fix the download URL for other systems.

Versions
  • 2021.09.0

Module

You can load the modules by:

module load rstudio

ML toolkit

learning

Description

The learning module loads the prerequisites (such as anaconda and cudnn) and makes ML applications visible to the user.

Versions
  • conda-2021.05-py38-gpu

Module

You can load the modules by:

module load learning

Example job

Below is an example job script:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --gpus-per-node=1
#SBATCH -p PartitionName
#SBATCH --job-name=learning
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
module load learning/conda-2021.05-py38-gpu
module load ml-toolkit-gpu/pytorch/1.7.1

python torch.py

nco

Description

The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formats.

Versions
  • 4.9.3

Module

You can load the modules by:

module load nco

py-mpi4py

Description

mpi4py provides a Python interface to MPI (the Message-Passing Interface). It is useful for parallelizing Python scripts.

Versions
  • 3.0.3

Module

You can load the modules by:

module load py-mpi4py

python

Description

Native Python 3.9.5 including optimized libraries.

Versions
  • 3.9.5

Module

You can load the modules by:

module load python

spark

Description

Apache Spark is a fast and general engine for large-scale data processing.

Versions
  • 3.1.1

Module

You can load the modules by:

module load spark

NVIDIA

cuda

Description

CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

Versions
  • 11.0.3

  • 11.2.2

  • 11.4.2

  • 12.0.1

Module

You can load the modules by:

module load cuda

cudnn

Description

cuDNN is a deep neural network library from Nvidia that provides a highly tuned implementation of many functions commonly used in deep machine learning applications.

Versions
  • cuda-11.0_8.0

  • cuda-11.2_8.1

  • cuda-11.4_8.2

  • cuda-12.0_8.8

Module

You can load the modules by:

module load cudnn

nccl

Description

Optimized primitives for collective multi-GPU communication.

Versions
  • cuda-11.0_2.11.4

  • cuda-11.2_2.8.4

  • cuda-11.4_2.11.4

Module

You can load the modules by:

module load modtree/gpu
module load nccl
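
The cudnn and nccl version strings above appear to encode the CUDA release each build targets (for example, cuda-11.4_8.2 reads as cuDNN 8.2 for CUDA 11.4). Assuming that naming scheme, and assuming modules accept the usual name/version form, the sketch below prints the module commands for a matching CUDA stack. The helper names cuda_prefix and pick_build are hypothetical; the version lists are copied from this page.

```shell
#!/bin/bash
# Version strings copied from the cudnn and nccl entries on this page.
CUDNN_VERSIONS="cuda-11.0_8.0 cuda-11.2_8.1 cuda-11.4_8.2 cuda-12.0_8.8"
NCCL_VERSIONS="cuda-11.0_2.11.4 cuda-11.2_2.8.4 cuda-11.4_2.11.4"

# Turn a full CUDA version (11.4.2) into the assumed prefix (cuda-11.4).
cuda_prefix() {
    echo "cuda-$(echo "$1" | cut -d. -f1,2)"
}

# Print the first build in list $2 whose name starts with prefix $1 plus "_".
# Prints nothing and returns 1 if no build matches.
pick_build() {
    for build in $2; do
        case "$build" in
            "$1"_*) echo "$build"; return 0 ;;
        esac
    done
    return 1
}

CUDA_VERSION="11.4.2"
PREFIX=$(cuda_prefix "$CUDA_VERSION")

# Print the commands instead of running them, so the sketch works anywhere.
echo "module load modtree/gpu"
echo "module load cuda/$CUDA_VERSION"
echo "module load cudnn/$(pick_build "$PREFIX" "$CUDNN_VERSIONS")"
echo "module load nccl/$(pick_build "$PREFIX" "$NCCL_VERSIONS")"
```

With the lists above, no nccl build is listed for CUDA 12.0, so pick_build prints nothing for that prefix; check the available modules on the cluster before relying on a combination.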

nvhpc

Description

The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC® directives, and CUDA®. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming.

Versions
  • 21.7

Module

You can load the modules by:

module load nvhpc

Programming languages

julia

Description

Julia is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. One can write code in Julia that is nearly as fast as C. Julia features optional typing, multiple dispatch, and good performance, achieved using type inference and just-in-time (JIT) compilation, implemented using LLVM. It is multi-paradigm, combining features of imperative, functional, and object-oriented programming.

Versions
  • 1.6.2

Module

You can load the modules by:

module load julia

tcl

Description

Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.

Versions
  • 8.6.11

Module

You can load the modules by:

module load tcl

System

cue-login-env

Description

XSEDE Common User Environment Variables for Anvil. Load this module to have XSEDE Common User Environment variables defined for your shell session or job on Anvil. See detailed description at https://www.ideals.illinois.edu/bitstream/handle/2142/75910/XSEDE-CUE-Variable-Definitions-v1.1.pdf

Versions
  • 1.1

Module

You can load the modules by:

module load cue-login-env

modtree

Description

ModuleTree (or modtree) helps users navigate between different application stacks and sets up a default compiler and MPI environment.

Versions
  • cpu

  • gpu

Module

You can load the modules by:

module load modtree
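
Because modtree comes in a cpu and a gpu version, a job script can choose the stack at run time. The sketch below is one way to do that, assuming the scheduler exports CUDA_VISIBLE_DEVICES for GPU jobs (an assumption about the cluster configuration, not something this page states). The function name pick_modtree is hypothetical.

```shell
#!/bin/bash
# Print "gpu" when the environment appears to expose GPUs, else "cpu".
# Assumption: the scheduler sets CUDA_VISIBLE_DEVICES for GPU jobs.
pick_modtree() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "gpu"
    else
        echo "cpu"
    fi
}

# Only call module when it exists (it is a shell function on the cluster).
if command -v module >/dev/null 2>&1; then
    module load "modtree/$(pick_modtree)"
else
    echo "would run: module load modtree/$(pick_modtree)"
fi
```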

xalt

Versions
  • 2.10.45

Module

You can load the modules by:

module load xalt

Text Editors

vscode

Description

Visual Studio Code

Versions
  • 1.61.2

Module

You can load the modules by:

module load vscode

Tools/Utilities

anaconda

Description

Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment.

Versions
  • 2021.05-py38

Module

You can load the modules by:

module load anaconda

aws-cli

Description

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services from the command line.

Versions
  • 2.4.15

Module

You can load the modules by:

module load aws-cli

cmake

Description

A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.

Versions
  • 3.20.0

Module

You can load the modules by:

module load cmake

curl

Description

cURL is an open source command line tool and library for transferring data with URL syntax.

Versions
  • 7.76.1

Module

You can load the modules by:

module load curl

emacs

Description

The Emacs programmable text editor.

Versions
  • 27.2

Module

You can load the modules by:

module load emacs

gdb

Description

GDB, the GNU Project debugger, allows you to see what is going on inside another program while it executes – or what another program was doing at the moment it crashed.

Versions
  • 11.1

Module

You can load the modules by:

module load gdb

gpaw

Description

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).

Versions
  • 21.1.0

Module

You can load the modules by:

module load gpaw

hadoop

Description

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

Versions
  • 3.3.0

Module

You can load the modules by:

module load hadoop

hpctoolkit

Description

HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. By using statistical sampling of timers and hardware performance counters, HPCToolkit collects accurate measurements of a program's work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur.

Versions
  • 2021.03.01

Module

You can load the modules by:

module load hpctoolkit

hwloc

Description

The Hardware Locality (hwloc) software project.

Versions
  • 1.11.13

Module

You can load the modules by:

module load hwloc

launcher

Description

Framework for running large collections of serial or multi-threaded applications

Versions
  • 3.9

Module

You can load the modules by:

module load launcher

monitor

Description

System resource monitoring tool.

Versions
  • 2.3.1

Module

You can load the modules by:

module load monitor

mpc

Description

GNU MPC is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result.

Versions
  • 1.1.0

Module

You can load the modules by:

module load mpc

ncview

Description

Simple viewer for NetCDF files.

Versions
  • 2.1.8

Module

You can load the modules by:

module load ncview

numactl

Description

Simple NUMA policy support. It consists of a numactl program to run other programs with a specific NUMA policy and a libnuma shared library (“NUMA API”) to set NUMA policy in applications.

Versions
  • 2.0.14

Module

You can load the modules by:

module load numactl

openjdk

Description

The free and open-source Java implementation.

Versions
  • 11.0.8_10

Module

You can load the modules by:

module load openjdk

papi

Description

PAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events. In addition Component PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack.

Versions
  • 6.0.0.1

Module

You can load the modules by:

module load papi

parafly

Description

Run UNIX commands in parallel

Versions
  • r2013

Module

You can load the modules by:

module load parafly

protobuf

Description

Protocol Buffers (a.k.a., protobuf) are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data.

Versions
  • 3.11.4

Module

You can load the modules by:

module load protobuf

qemu

Description

QEMU is a generic and open source machine emulator and virtualizer.

Versions
  • 4.1.1

Module

You can load the modules by:

module load qemu

qt

Description

Qt is a comprehensive cross-platform C++ application framework.

Versions
  • 5.15.2

Module

You can load the modules by:

module load qt

texlive

Description

TeX Live is a free software distribution for the TeX typesetting system. Note that it is not a reproducible installation: at any point only the most recent version can be installed. Older versions are included for backward compatibility, i.e., if you have that version already installed.

Versions
  • 20200406

Module

You can load the modules by:

module load texlive

tk

Description

Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.

Versions
  • 8.6.11

Module

You can load the modules by:

module load tk

totalview

Description

TotalView is a GUI-based source code defect analysis tool that gives you unprecedented control over processes and thread execution and visibility into program state and variables.

Versions
  • 2020.2.6

Module

You can load the modules by:

module load totalview

valgrind

Description

An instrumentation framework for building dynamic analysis tools.

Versions
  • 3.15.0

Module

You can load the modules by:

module load valgrind

Workflow automation

hyper-shell

Description

Process shell commands over a distributed, asynchronous queue.

Versions
  • 2.0.2

Module

You can load the modules by:

module load hyper-shell

nextflow

Description

Data-driven computational pipelines.

Versions
  • 22.10.1

Module

You can load the modules by:

module load nextflow

parallel

Description

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.

Versions
  • 20200822

Module

You can load the modules by:

module load parallel
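
As the description notes, a job can be a single command run once per input line. A minimal sketch of that pattern follows: it builds one command per input file, and the resulting list can be piped to parallel, which treats each input line as a command when no command is given on its own command line. The function name make_task_list, the file names, and the use of md5sum are illustrative assumptions.

```shell
#!/bin/bash
# Print one shell command per input file; each line is an independent job.
# md5sum is only a stand-in for whatever per-file command you need.
make_task_list() {
    for f in "$@"; do
        printf 'md5sum "%s" > "%s.md5"\n' "$f" "$f"
    done
}

# Preview the jobs for two hypothetical input files.
make_task_list sample_a.fastq sample_b.fastq

# On the cluster, after `module load parallel`, the list could be piped in:
#   make_task_list *.fastq | parallel -j "${SLURM_NTASKS:-1}"
```

Generating the list first makes it easy to inspect the commands before any of them run.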

Biocontainers

Abacas

Introduction

Abacas is a tool for algorithm based automatic contiguation of assembled sequences.

For more information, please check its website: https://biocontainers.pro/tools/abacas and its home page: http://abacas.sourceforge.net.

Versions

  • 1.3.1

Commands

  • abacas.pl

  • abacas.1.3.1.pl

Module

You can load the modules by:

module load biocontainers
module load abacas

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Abacas on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abacas
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers abacas

abacas.pl -r cmm.fasta -q Cm.contigs.fasta -p nucmer -o out_prefix
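
The shebang warning above applies to every biocontainer job script, so it can be worth checking a script before submitting it. The sketch below is one way to do that; the function name check_shebang and the demo file names are hypothetical.

```shell
#!/bin/bash
# Return 0 if the first line of job script $1 starts with "#!/bin/bash".
# Otherwise print a warning to stderr and return 1.
check_shebang() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!/bin/bash'*) return 0 ;;
    esac
    echo "warning: $1 starts with '$first_line'; use #!/bin/bash" >&2
    return 1
}

# Demo on two throwaway scripts written to a temporary directory.
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho ok\n' > "$demo_dir/good.sh"
printf '#!/bin/sh -l\necho ok\n' > "$demo_dir/bad.sh"
check_shebang "$demo_dir/good.sh" && echo "good.sh: ok"
check_shebang "$demo_dir/bad.sh" || echo "bad.sh: fix the shebang"
rm -r "$demo_dir"
```

In practice the check could gate submission, for example: check_shebang myjob.sh && sbatch myjob.sh.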

Abismal

Introduction

Another Bisulfite Mapping Algorithm (abismal) is a read mapping program for bisulfite sequencing in DNA methylation studies.

For more information, please check:

Versions

  • 3.0.0

Commands

  • abismal

  • abismalidx

  • simreads

Module

You can load the modules by:

module load biocontainers
module load abismal

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run abismal on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abismal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers abismal

abismalidx  ~/.local/share/genomes/hg38/hg38.fa hg38

Abpoa

Introduction

abPOA: adaptive banded Partial Order Alignment

For more information, please check:

Versions

  • 1.4.1

Commands

  • abpoa

Module

You can load the modules by:

module load biocontainers
module load abpoa

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run abpoa on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abpoa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers abpoa

abpoa seq.fa > cons.fa

Abricate

Introduction

Abricate is a tool for mass screening of contigs for antimicrobial resistance or virulence genes.

For more information, please check its website: https://biocontainers.pro/tools/abricate and its home page on Github.

Versions

  • 1.0.1

Commands

  • abricate

Module

You can load the modules by:

module load biocontainers
module load abricate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Abricate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=abricate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers abricate

abricate --threads 8 *.fasta
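
In the script above, the task count (-n 8) and the thread count (--threads 8) have to be edited together. One way to keep them in sync is to derive the thread count from SLURM_NTASKS, which Slurm sets in the job environment when -n is given. The fallback to 1 outside a job and the function name thread_count are assumptions of this sketch.

```shell
#!/bin/bash
# Use the task count Slurm granted; fall back to 1 outside a job.
thread_count() {
    echo "${SLURM_NTASKS:-1}"
}

THREADS=$(thread_count)

# Print the command that would run; drop the echo inside a real job script.
echo "abricate --threads $THREADS *.fasta"
```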

Abyss

Introduction

ABySS is a de novo sequence assembler intended for short paired-end reads and genomes of all sizes.

For more information, please check its website: https://biocontainers.pro/tools/abyss and its home page on Github.

Versions

  • 2.3.2

  • 2.3.4

Commands

  • ABYSS

  • ABYSS-P

  • AdjList

  • Consensus

  • DAssembler

  • DistanceEst

  • DistanceEst-ssq

  • KAligner

  • MergeContigs

  • MergePaths

  • Overlap

  • ParseAligns

  • PathConsensus

  • PathOverlap

  • PopBubbles

  • SimpleGraph

  • abyss-align

  • abyss-bloom

  • abyss-bloom-dbg

  • abyss-bowtie

  • abyss-bowtie2

  • abyss-bwa

  • abyss-bwamem

  • abyss-bwasw

  • abyss-db-txt

  • abyss-dida

  • abyss-fac

  • abyss-fatoagp

  • abyss-filtergraph

  • abyss-fixmate

  • abyss-fixmate-ssq

  • abyss-gapfill

  • abyss-gc

  • abyss-index

  • abyss-junction

  • abyss-kaligner

  • abyss-layout

  • abyss-longseqdist

  • abyss-map

  • abyss-map-ssq

  • abyss-mergepairs

  • abyss-overlap

  • abyss-paired-dbg

  • abyss-paired-dbg-mpi

  • abyss-pe

  • abyss-rresolver-short

  • abyss-samtoafg

  • abyss-scaffold

  • abyss-sealer

  • abyss-stack-size

  • abyss-tabtomd

  • abyss-todot

  • abyss-tofastq

  • konnector

  • logcounter

Module

You can load the modules by:

module load biocontainers
module load abyss

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run abyss on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=abyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers abyss

abyss-pe np=4 k=25 name=test B=1G \
    in='test-data/reads1.fastq test-data/reads2.fastq'

Actc

Introduction

Actc is used to align subreads to ccs reads.

For more information, please check:

Versions

  • 0.2.0

Commands

  • actc

Module

You can load the modules by:

module load biocontainers
module load actc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run actc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=actc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers actc

actc subreads.bam ccs.bam subreads_to_ccs.bam

Adapterremoval

Introduction

AdapterRemoval searches for and removes adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3’ end of reads following adapter removal. AdapterRemoval can analyze both single end and paired end data, and can be used to merge overlapping paired-ended reads into (longer) consensus sequences. Additionally, AdapterRemoval can construct a consensus adapter sequence for paired-ended reads, if this information is not available.

For more information, please check:

Versions

  • 2.3.3

Commands

  • AdapterRemoval

Module

You can load the modules by:

module load biocontainers
module load adapterremoval

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run adapterremoval on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=adapterremoval
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers adapterremoval

AdapterRemoval --file1 input_1.fastq --file2 input_2.fastq

Advntr

Introduction

Advntr is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data.

For more information, please check its website: https://biocontainers.pro/tools/advntr and its home page on Github.

Versions

  • 1.4.0

  • 1.5.0

Commands

  • advntr

Module

You can load the modules by:

module load biocontainers
module load advntr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Advntr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=advntr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers advntr

advntr addmodel -r chr21.fa -p CGCGGGGCGGGG -s 45196324 -e 45196360 -c chr21
advntr genotype --vntr_id 1 --alignment_file CSTB_2_5_testdata.bam --working_directory working_dir

Afplot

Introduction

Afplot is a tool to plot allele frequencies in VCF files.

For more information, please check its website: https://biocontainers.pro/tools/afplot and its home page on Github.

Versions

  • 0.2.1

Commands

  • afplot

Module

You can load the modules by:

module load biocontainers
module load afplot

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run afplot on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers afplot

afplot whole-genome histogram -v my_vcf.gz -l my_label -s my_sample -o mysample.histogram.png

Afterqc

Introduction

Afterqc is a tool for quality control of FASTQ data produced by HiSeq 2000/2500/3000/4000, Nextseq 500/550, MiniSeq, and Illumina 1.8 or newer.

For more information, please check its website: https://biocontainers.pro/tools/afterqc and its home page on Github.

Versions

  • 0.9.7

Commands

  • after.py

Module

You can load the modules by:

module load biocontainers
module load afterqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Afterqc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afterqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers afterqc

after.py -1 SRR11941281_1.fastq.paired.fq  -2 SRR11941281_2.fastq.paired.fq

Agat

Introduction

Agat is a suite of tools to handle gene annotations in any GTF/GFF format.

For more information, please check its website: https://biocontainers.pro/tools/agat and its home page on Github.

Versions

  • 0.8.1

Commands

  • agat_convert_bed2gff.pl

  • agat_convert_embl2gff.pl

  • agat_convert_genscan2gff.pl

  • agat_convert_mfannot2gff.pl

  • agat_convert_minimap2_bam2gff.pl

  • agat_convert_sp_gff2bed.pl

  • agat_convert_sp_gff2gtf.pl

  • agat_convert_sp_gff2tsv.pl

  • agat_convert_sp_gff2zff.pl

  • agat_convert_sp_gxf2gxf.pl

  • agat_sp_Prokka_inferNameFromAttributes.pl

  • agat_sp_add_introns.pl

  • agat_sp_add_start_and_stop.pl

  • agat_sp_alignment_output_style.pl

  • agat_sp_clipN_seqExtremities_and_fixCoordinates.pl

  • agat_sp_compare_two_BUSCOs.pl

  • agat_sp_compare_two_annotations.pl

  • agat_sp_complement_annotations.pl

  • agat_sp_ensembl_output_style.pl

  • agat_sp_extract_attributes.pl

  • agat_sp_extract_sequences.pl

  • agat_sp_filter_by_ORF_size.pl

  • agat_sp_filter_by_locus_distance.pl

  • agat_sp_filter_by_mrnaBlastValue.pl

  • agat_sp_filter_feature_by_attribute_presence.pl

  • agat_sp_filter_feature_by_attribute_value.pl

  • agat_sp_filter_feature_from_keep_list.pl

  • agat_sp_filter_feature_from_kill_list.pl

  • agat_sp_filter_gene_by_intron_numbers.pl

  • agat_sp_filter_gene_by_length.pl

  • agat_sp_filter_incomplete_gene_coding_models.pl

  • agat_sp_filter_record_by_coordinates.pl

  • agat_sp_fix_cds_phases.pl

  • agat_sp_fix_features_locations_duplicated.pl

  • agat_sp_fix_fusion.pl

  • agat_sp_fix_longest_ORF.pl

  • agat_sp_fix_overlaping_genes.pl

  • agat_sp_fix_small_exon_from_extremities.pl

  • agat_sp_flag_premature_stop_codons.pl

  • agat_sp_flag_short_introns.pl

  • agat_sp_functional_statistics.pl

  • agat_sp_keep_longest_isoform.pl

  • agat_sp_kraken_assess_liftover.pl

  • agat_sp_list_short_introns.pl

  • agat_sp_load_function_from_protein_align.pl

  • agat_sp_manage_IDs.pl

  • agat_sp_manage_UTRs.pl

  • agat_sp_manage_attributes.pl

  • agat_sp_manage_functional_annotation.pl

  • agat_sp_manage_introns.pl

  • agat_sp_merge_annotations.pl

  • agat_sp_prokka_fix_fragmented_gene_annotations.pl

  • agat_sp_sensitivity_specificity.pl

  • agat_sp_separate_by_record_type.pl

  • agat_sp_statistics.pl

  • agat_sp_webApollo_compliant.pl

  • agat_sq_add_attributes_from_tsv.pl

  • agat_sq_add_hash_tag.pl

  • agat_sq_add_locus_tag.pl

  • agat_sq_count_attributes.pl

  • agat_sq_filter_feature_from_fasta.pl

  • agat_sq_list_attributes.pl

  • agat_sq_manage_IDs.pl

  • agat_sq_manage_attributes.pl

  • agat_sq_mask.pl

  • agat_sq_remove_redundant_entries.pl

  • agat_sq_repeats_analyzer.pl

  • agat_sq_rfam_analyzer.pl

  • agat_sq_split.pl

  • agat_sq_stat_basic.pl

Module

You can load the modules by:

module load biocontainers
module load agat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Agat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers agat

agat_convert_sp_gff2bed.pl  --gff genes.gff -o genes.bed

Agfusion

Introduction

AGFusion (pronounced ‘A G Fusion’) is a python package for annotating gene fusions from the human or mouse genomes.

For more information, please check:

Versions

  • 1.3.11

Commands

  • agfusion

Module

You can load the modules by:

module load biocontainers
module load agfusion

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run agfusion on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers agfusion

Alfred

Introduction

Alfred is an efficient and versatile command-line application that computes multi-sample quality control metrics in a read-group aware manner.

For more information, please check its website: https://biocontainers.pro/tools/alfred and its home page on Github.

Versions

  • 0.2.5

  • 0.2.6

Commands

  • alfred

Module

You can load the modules by:

module load biocontainers
module load alfred

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Alfred on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alfred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers alfred

alfred qc -r genome.fasta -o qc.tsv.gz sorted.bam

Alien-hunter

Introduction

Alien-hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs).

For more information, please check its website: https://biocontainers.pro/tools/alien-hunter and its home page: https://www.sanger.ac.uk/tool/alien-hunter/.

Versions

  • 1.7.7

Commands

  • alien_hunter

Module

You can load the modules by:

module load biocontainers
module load alien_hunter

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Alien_hunter on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alien_hunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers alien_hunter

alien_hunter genome.fasta output

Alignstats

Introduction

AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format.

For more information, please check:

Versions

  • 0.9.1

Commands

  • alignstats

Module

You can load the modules by:

module load biocontainers
module load alignstats

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run alignstats on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alignstats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers alignstats

alignstats -C -i input.bam -o report.txt

Allpathslg

Introduction

Allpathslg is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads.

For more information, please check its website: https://biocontainers.pro/tools/allpathslg and its home page: https://software.broadinstitute.org/allpaths-lg/blog/.

Versions

  • 52488

Commands

  • PrepareAllPathsInputs.pl

  • RunAllPathsLG

  • CacheLibs.pl

  • Fasta2Fastb

Module

You can load the modules by:

module load biocontainers
module load allpathslg

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Allpathslg on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=allpathslg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers allpathslg

PrepareAllPathsInputs.pl \
                       DATA_DIR=data \
                       PLOIDY=1 \
                       IN_GROUPS_CSV=in_groups.csv \
                       IN_LIBS_CSV=in_libs.csv \
                       OVERWRITE=True

RunAllPathsLG PRE=allpathlg REFERENCE_NAME=test.genome \
              DATA_SUBDIR=data  RUN=myrun TARGETS=standard \
              SUBDIR=test OVERWRITE=True

Alphafold

Introduction

Alphafold is a protein structure prediction tool developed by DeepMind (Google). It uses a novel machine learning approach to predict 3D protein structures from primary sequences alone. The source code is available on GitHub. It has been deployed on all RCAC clusters, supporting both CPU and GPU.

It also relies on a huge database. The full database (~2.2TB) has been downloaded and set up for users.

Protein structure prediction by Alphafold is performed in the following steps:

  • Search the amino acid sequence in uniref90 database by jackhmmer (using CPU)

  • Search the amino acid sequence in mgnify database by jackhmmer (using CPU)

  • Search the amino acid sequence in pdb70 database (for monomers) or pdb_seqres database (for multimers) by hhsearch (using CPU)

  • Search the amino acid sequence in bfd database and uniclust30 (updated to uniref30 since v2.3.0) database by hhblits (using CPU)

  • Search structure templates in pdb_mmcif database (using CPU)

  • Search the amino acid sequence in uniprot database (for multimers) by jackhmmer (using CPU)

  • Predict 3D structure by machine learning (using CPU or GPU)

  • Structure optimisation with OpenMM (using CPU or GPU)

Versions

  • 2.1.1

  • 2.2.0

  • 2.2.3

  • 2.3.0

  • 2.3.1

Commands

run_alphafold.sh

Module

You can load the modules by:

module load biocontainers
module load alphafold

Usage

Using Alphafold on our cluster is straightforward. Users create a flagfile containing the database path information and pass it to run_alphafold.sh:

run_alphafold.sh --flagfile=full_db.ff --fasta_paths=XX --output_dir=XX ...

Users can check the detailed user guide on its GitHub page.

full_db.ff

Example contents of full_db.ff:

--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db/
--uniref90_database_path=/depot/itap/datasets/alphafold/db/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb70_database_path=/depot/itap/datasets/alphafold/db/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db/pdb_mmcif/mmcif_files
--max_template_date=2022-01-29
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign

Note

Since version 2.2.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20221014. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20221014. Users need to update the flagfile to use the updated database:

run_alphafold.sh --flagfile=full_db_20221014.ff --fasta_paths=XX --output_dir=XX ...

full_db_20221014.ff (for alphafold v2.2)

Example contents of full_db_20221014.ff (For ACCESS Anvil, please change depot/itap to anvil):

--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20221014/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20221014/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20221014/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20221014/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db_20221014/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20221014/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20221014/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
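
The path change for ACCESS Anvil can be scripted rather than edited by hand. The following is a minimal sketch, assuming the flagfile uses the /depot/itap prefix shown above. The shortened sample flagfile and the output name full_db_20221014_anvil.ff are illustrative only and are not files provided on the clusters.

```shell
# Sketch: rewrite RCAC database paths in a flagfile to their ACCESS Anvil locations.
# The sample flagfile below is a shortened, hypothetical stand-in for full_db_20221014.ff.
workdir=$(mktemp -d)
cd "$workdir"

cat > full_db_20221014.ff <<'EOF'
--db_preset=full_dbs
--data_dir=/depot/itap/datasets/alphafold/db_20221014/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20221014/uniref90/uniref90.fasta
--hhblits_binary_path=/usr/bin/hhblits
EOF

# Replace the /depot/itap prefix with /anvil wherever it starts a flag value.
# Lines without that prefix (for example the /usr/bin tool paths) are left unchanged.
sed 's#=/depot/itap/#=/anvil/#' full_db_20221014.ff > full_db_20221014_anvil.ff

cat full_db_20221014_anvil.ff
```

The same sed expression should apply to the other example flagfiles, since they share the /depot/itap prefix.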

Note

Since version 2.3.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20230311. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20230311. Users need to update the flagfile to use the updated database:

run_alphafold.sh --flagfile=full_db_20230311.ff --fasta_paths=XX --output_dir=XX ...

Note

Since Version v2.3.0, uniclust30_database_path has been changed to uniref30_database_path.

full_db_20230311.ff (for alphafold v2.3)

Example contents of full_db_20230311.ff for monomer (For ACCESS Anvil, please change depot/itap to anvil):

--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb70_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign

Example contents of full_db_20230311.ff for multimer (For ACCESS Anvil, please change depot/itap to anvil):

--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20230311/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
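
The monomer and multimer flagfiles above differ in their template and pairing databases: the monomer file sets --pdb70_database_path, while the multimer file sets --pdb_seqres_database_path and --uniprot_database_path. A quick check before submitting a long job can catch a mismatched flagfile. The sketch below is only an illustration: check_flagfile is a hypothetical helper, not part of the alphafold module, and it checks only the flags that distinguish the two examples above.

```shell
# check_flagfile PRESET FILE (hypothetical helper)
# Returns 0 if FILE contains the preset-specific database flags used in the
# example flagfiles, 1 if one is missing, 2 if PRESET is not recognised.
check_flagfile() {
    preset=$1
    file=$2
    case "$preset" in
        monomer)  required="--pdb70_database_path" ;;
        multimer) required="--pdb_seqres_database_path --uniprot_database_path" ;;
        *) echo "unknown preset: $preset" >&2; return 2 ;;
    esac
    missing=0
    for flag in $required; do
        if ! grep -q -e "^${flag}=" "$file"; then
            echo "missing ${flag} in ${file}" >&2
            missing=1
        fi
    done
    # The flag was renamed in v2.3.0; point out the old name if it is still present.
    if grep -q -e '^--uniclust30_database_path=' "$file"; then
        echo "note: ${file} uses --uniclust30_database_path (pre-v2.3.0 name)" >&2
    fi
    return "$missing"
}

# Demonstration on a shortened, hypothetical monomer flagfile.
demo=$(mktemp)
printf '%s\n' '--db_preset=full_dbs' \
    '--pdb70_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb70/pdb70' > "$demo"

check_flagfile monomer "$demo" && echo "monomer flags present"
check_flagfile multimer "$demo" || echo "not usable as a multimer flagfile"
```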

Example job using CPU

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Note

Since version 2.2.0, the --use_gpu_relax parameter is required. Set --use_gpu_relax=False for CPU jobs.

To run alphafold using CPU:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers alphafold/2.3.1

run_alphafold.sh --flagfile=full_db_20230311.ff  \
    --fasta_paths=sample.fasta --max_template_date=2022-02-01 \
    --output_dir=af2_full_out --model_preset=monomer \
    --use_gpu_relax=False

Example job using GPU

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Note

Since version 2.2.0, the --use_gpu_relax parameter is required. Set --use_gpu_relax=True for GPU jobs.

To run alphafold using GPU:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 11
#SBATCH --gres=gpu:1
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers alphafold/2.3.1

run_alphafold.sh --flagfile=full_db_20230311.ff \
    --fasta_paths=sample.fasta --max_template_date=2022-02-01 \
    --output_dir=af2_full_out --model_preset=monomer \
    --use_gpu_relax=True
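
The CPU and GPU job scripts differ only in their resource requests and in the value of --use_gpu_relax. One way to share a single script is to derive the flag from the job environment. The sketch below assumes that Slurm sets CUDA_VISIBLE_DEVICES for jobs submitted with --gres=gpu:1 and leaves it unset or empty otherwise. Please verify this on your cluster before relying on it. pick_gpu_relax is a hypothetical helper name.

```shell
# Sketch: choose the --use_gpu_relax value from the job environment.
# Assumption: CUDA_VISIBLE_DEVICES is non-empty only when a GPU was allocated.
pick_gpu_relax() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo True
    else
        echo False
    fi
}

use_gpu_relax=$(pick_gpu_relax)

# Print the resulting option; in a job script this value would be passed to run_alphafold.sh.
echo "--use_gpu_relax=${use_gpu_relax}"
```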

Amptk

Introduction

Amptk is a series of scripts to process NGS amplicon data using USEARCH and VSEARCH. It can be used to process any NGS amplicon data and includes databases set up for analysis of fungal ITS, fungal LSU, bacterial 16S, and insect COI amplicons.

For more information, please check its website: https://biocontainers.pro/tools/amptk and its home page on GitHub.

Versions

  • 1.5.4

Commands

  • amptk

Module

You can load the modules by:

module load biocontainers
module load amptk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Amptk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=amptk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers amptk

amptk illumina -i test_data/illumina_test_data -o miseq -f fITS7 -r ITS4  --cpus 4

Ananse

Introduction

ANANSE is a computational approach to infer enhancer-based gene regulatory networks (GRNs) and to identify key transcription factors between two GRNs.

For more information, please check:

Versions

  • 0.4.0

Commands

  • ananse

Module

You can load the modules by:

module load biocontainers
module load ananse

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ananse on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ananse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ananse

mkdir -p ANANSE.REMAP.model.v1.0
wget https://zenodo.org/record/4768075/files/ANANSE.REMAP.model.v1.0.tgz
tar xvzf ANANSE.REMAP.model.v1.0.tgz -C ANANSE.REMAP.model.v1.0
rm ANANSE.REMAP.model.v1.0.tgz

wget https://zenodo.org/record/4769814/files/ANANSE_example_data.tgz
tar xvzf ANANSE_example_data.tgz
rm ANANSE_example_data.tgz

ananse binding -H ANANSE_example_data/H3K27ac/fibroblast*bam -A ANANSE_example_data/ATAC/fibroblast*bam -R ANANSE.REMAP.model.v1.0/ -o fibroblast.binding
ananse binding -H ANANSE_example_data/H3K27ac/heart*bam -A ANANSE_example_data/ATAC/heart*bam -R ANANSE.REMAP.model.v1.0/ -o heart.binding

ananse network -b  fibroblast.binding/binding.h5 -e ANANSE_example_data/RNAseq/fibroblast*TPM.txt -n 4 -o fibroblast.network.txt
ananse network -b  heart.binding/binding.h5 -e ANANSE_example_data/RNAseq/heart*TPM.txt -n 4 -o heart.network.txt

ananse influence -s fibroblast.network.txt -t heart.network.txt -d ANANSE_example_data/RNAseq/fibroblast2heart_degenes.csv -p -o fibroblast2heart.influence.txt

Anchorwave

Introduction

Anchorwave is used for sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation.

For more information, please check its website: https://biocontainers.pro/tools/anchorwave and its home page on GitHub.

Versions

  • 1.0.1

Commands

  • anchorwave

  • gmap_build

  • gmap

  • minimap2

Module

You can load the modules by:

module load biocontainers
module load anchorwave

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Anchorwave on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=anchorwave
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers anchorwave

anchorwave gff2seq -i Zea_mays.AGPv4.34.gff3 -r Zea_mays.AGPv4.dna.toplevel.fa -o cds.fa

ANGSD

Introduction

ANGSD is software for analyzing next-generation sequencing data. Detailed usage can be found here: http://www.popgen.dk/angsd/index.php/ANGSD.

Versions

  • 0.935

  • 0.937

  • 0.939

  • 0.940

Commands

  • angsd

  • realSFS

  • msToGlf

  • thetaStat

  • supersim

Module

You can load the modules by:

module load biocontainers
module load angsd/0.937

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run angsd on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=angsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers angsd/0.937

angsd -b bam.filelist -GL 1 -doMajorMinor 1 -doMaf 2 -P 5 -minMapQ 30 -minQ 20 -minMaf 0.05
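
The -b option takes a text file that lists one BAM path per line. A minimal sketch for building bam.filelist follows. The bams/ directory and the empty placeholder files are hypothetical and exist only so that the snippet runs on its own. In a real job, point find at the directory that holds your BAM files.

```shell
# Sketch: build bam.filelist with one BAM path per line.
workdir=$(mktemp -d)
cd "$workdir"
mkdir bams
# Placeholder files for the demonstration; notes.txt shows that non-BAM files are skipped.
touch bams/sampleB.bam bams/sampleA.bam bams/notes.txt

# Collect every *.bam under bams/ and sort for a reproducible sample order.
find bams -name '*.bam' | sort > bam.filelist

cat bam.filelist
```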

Annogesic

Introduction

ANNOgesic is the Swiss army knife for RNA-Seq based annotation of bacterial/archaeal genomes.

For more information, please check:

Versions

  • 1.1.0

Commands

  • annogesic

Module

You can load the modules by:

module load biocontainers
module load annogesic

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run annogesic on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=annogesic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers annogesic

ANNOGESIC_FOLDER=ANNOgesic
annogesic \
    update_genome_fasta \
    -c $ANNOGESIC_FOLDER/input/references/fasta_files/NC_009839.1.fa \
    -m $ANNOGESIC_FOLDER/input/mutation_tables/mutation.csv \
    -u NC_test.1 \
    -pj $ANNOGESIC_FOLDER

ANNOVAR

Introduction

ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).

For more information, please check its website: https://annovar.openbioinformatics.org/en/latest/.

Versions

  • 2022-01-13

Commands

  • annotate_variation.pl

  • coding_change.pl

  • convert2annovar.pl

  • retrieve_seq_from_fasta.pl

  • table_annovar.pl

  • variants_reduction.pl

Module

You can load the modules by:

module load biocontainers
module load annovar

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ANNOVAR on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=annovar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers annovar

annotate_variation.pl --buildver hg19 --downdb seq humandb/hg19_seq
convert2annovar.pl -format region -seqdir humandb/hg19_seq/ chr1:2000001-2000003

Antismash

Introduction

Antismash allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.

For more information, please check its website: https://biocontainers.pro/tools/antismash and its home page: https://docs.antismash.secondarymetabolites.org.

Versions

  • 5.1.2

  • 6.0.1

Commands

  • antismash

Module

You can load the modules by:

module load biocontainers
module load antismash

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Antismash on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=antismash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers antismash

antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees seq.gbk

Anvio

Introduction

Anvio is an analysis and visualization platform for ‘omics data.

For more information, please check its website: https://biocontainers.pro/tools/anvio and its home page on GitHub.

Versions

  • 7.0

Commands

  • anvi-analyze-synteny

  • anvi-cluster-contigs

  • anvi-compute-ani

  • anvi-compute-completeness

  • anvi-compute-functional-enrichment

  • anvi-compute-gene-cluster-homogeneity

  • anvi-compute-genome-similarity

  • anvi-convert-trnaseq-database

  • anvi-db-info

  • anvi-delete-collection

  • anvi-delete-hmms

  • anvi-delete-misc-data

  • anvi-delete-state

  • anvi-dereplicate-genomes

  • anvi-display-contigs-stats

  • anvi-display-metabolism

  • anvi-display-pan

  • anvi-display-structure

  • anvi-estimate-genome-completeness

  • anvi-estimate-genome-taxonomy

  • anvi-estimate-metabolism

  • anvi-estimate-scg-taxonomy

  • anvi-estimate-trna-taxonomy

  • anvi-experimental-organization

  • anvi-export-collection

  • anvi-export-contigs

  • anvi-export-functions

  • anvi-export-gene-calls

  • anvi-export-gene-coverage-and-detection

  • anvi-export-items-order

  • anvi-export-locus

  • anvi-export-misc-data

  • anvi-export-splits-and-coverages

  • anvi-export-splits-taxonomy

  • anvi-export-state

  • anvi-export-structures

  • anvi-export-table

  • anvi-gen-contigs-database

  • anvi-gen-fixation-index-matrix

  • anvi-gen-gene-consensus-sequences

  • anvi-gen-gene-level-stats-databases

  • anvi-gen-genomes-storage

  • anvi-gen-network

  • anvi-gen-phylogenomic-tree

  • anvi-gen-structure-database

  • anvi-gen-variability-matrix

  • anvi-gen-variability-network

  • anvi-gen-variability-profile

  • anvi-get-aa-counts

  • anvi-get-codon-frequencies

  • anvi-get-enriched-functions-per-pan-group

  • anvi-get-sequences-for-gene-calls

  • anvi-get-sequences-for-gene-clusters

  • anvi-get-sequences-for-hmm-hits

  • anvi-get-short-reads-from-bam

  • anvi-get-short-reads-mapping-to-a-gene

  • anvi-get-split-coverages

  • anvi-help

  • anvi-import-collection

  • anvi-import-functions

  • anvi-import-items-order

  • anvi-import-misc-data

  • anvi-import-state

  • anvi-import-taxonomy-for-genes

  • anvi-import-taxonomy-for-layers

  • anvi-init-bam

  • anvi-inspect

  • anvi-interactive

  • anvi-matrix-to-newick

  • anvi-mcg-classifier

  • anvi-merge

  • anvi-merge-bins

  • anvi-meta-pan-genome

  • anvi-migrate

  • anvi-oligotype-linkmers

  • anvi-pan-genome

  • anvi-profile

  • anvi-push

  • anvi-refine

  • anvi-rename-bins

  • anvi-report-linkmers

  • anvi-run-hmms

  • anvi-run-interacdome

  • anvi-run-kegg-kofams

  • anvi-run-ncbi-cogs

  • anvi-run-pfams

  • anvi-run-scg-taxonomy

  • anvi-run-trna-taxonomy

  • anvi-run-workflow

  • anvi-scan-trnas

  • anvi-script-add-default-collection

  • anvi-script-augustus-output-to-external-gene-calls

  • anvi-script-calculate-pn-ps-ratio

  • anvi-script-checkm-tree-to-interactive

  • anvi-script-compute-ani-for-fasta

  • anvi-script-enrichment-stats

  • anvi-script-estimate-genome-size

  • anvi-script-filter-fasta-by-blast

  • anvi-script-fix-homopolymer-indels

  • anvi-script-gen-CPR-classifier

  • anvi-script-gen-distribution-of-genes-in-a-bin

  • anvi-script-gen-help-pages

  • anvi-script-gen-hmm-hits-matrix-across-genomes

  • anvi-script-gen-programs-network

  • anvi-script-gen-programs-vignette

  • anvi-script-gen-pseudo-paired-reads-from-fastq

  • anvi-script-gen-scg-domain-classifier

  • anvi-script-gen-short-reads

  • anvi-script-gen_stats_for_single_copy_genes.R

  • anvi-script-gen_stats_for_single_copy_genes.py

  • anvi-script-gen_stats_for_single_copy_genes.sh

  • anvi-script-get-collection-info

  • anvi-script-get-coverage-from-bam

  • anvi-script-get-hmm-hits-per-gene-call

  • anvi-script-get-primer-matches

  • anvi-script-merge-collections

  • anvi-script-pfam-accessions-to-hmms-directory

  • anvi-script-predict-CPR-genomes

  • anvi-script-process-genbank

  • anvi-script-process-genbank-metadata

  • anvi-script-reformat-fasta

  • anvi-script-run-eggnog-mapper

  • anvi-script-snvs-to-interactive

  • anvi-script-tabulate

  • anvi-script-transpose-matrix

  • anvi-script-variability-to-vcf

  • anvi-script-visualize-split-coverages

  • anvi-search-functions

  • anvi-self-test

  • anvi-setup-interacdome

  • anvi-setup-kegg-kofams

  • anvi-setup-ncbi-cogs

  • anvi-setup-pdb-database

  • anvi-setup-pfams

  • anvi-setup-scg-taxonomy

  • anvi-setup-trna-taxonomy

  • anvi-show-collections-and-bins

  • anvi-show-misc-data

  • anvi-split

  • anvi-summarize

  • anvi-trnaseq

  • anvi-update-db-description

  • anvi-update-structure-database

  • anvi-upgrade

Module

You can load the modules by:

module load biocontainers
module load anvio

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Anvio on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=anvio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers anvio

anvi-script-reformat-fasta assembly.fa -o contigs.fa -l 1000 --simplify-names  --seq-type NT
anvi-gen-contigs-database -f contigs.fa -o contigs.db -n 'An example contigs database' --num-threads 8
anvi-display-contigs-stats contigs.db
anvi-setup-ncbi-cogs --cog-data-dir $PWD --num-threads 8 --just-do-it --reset
anvi-run-ncbi-cogs -c contigs.db --cog-data-dir COG20 --num-threads 8

Any2fasta

Introduction

Any2fasta can convert various sequence formats to FASTA.

For more information, please check:

Versions

  • 0.4.2

Commands

  • any2fasta

Module

You can load the modules by:

module load biocontainers
module load any2fasta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run any2fasta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=any2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers any2fasta

any2fasta input.gff > out.fasta

Arcs

Introduction

ARCS is a tool for scaffolding genome sequence assemblies using linked or long read sequencing data.

For more information, please check:

Versions

  • 1.2.4

Commands

  • arcs

  • arcs-make

Module

You can load the modules by:

module load biocontainers
module load arcs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run arcs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=arcs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers arcs

Ascatngs

Introduction

AscatNGS contains the Cancer Genome Projects workflow implementation of the ASCAT copy number algorithm for paired end sequencing.

For more information, please check:

Versions

  • 4.5.0

Commands

  • alleleCounter.pl

  • ascatCnToVCF.pl

  • ascatCounts.pl

  • ascatFaiChunk.pl

  • ascatFailedCnCsv.pl

  • ascat.pl

  • ascatSnpPanelFromVcfs.pl

  • ascatSnpPanelGcCorrections.pl

  • ascatSnpPanelGenerator.pl

  • ascatSnpPanelMerge.pl

  • ascatToBigWig.pl

  • bamToBw.pl

  • blast2sam.pl

  • bowtie2sam.pl

  • bwa_aln.pl

  • bwa_mem.pl

  • cgpAppendIdsToVcf.pl

  • cgpVCFSplit.pl

  • export2sam.pl

  • interpolate_sam.pl

  • merge_or_mark.pl

  • novo2sam.pl

  • pkg-config.pl

  • psl2sam.pl

  • sam2vcf.pl

  • samtools.pl

  • seq_cache_populate.pl

  • soap2sam.pl

  • stag-autoschema.pl

  • stag-db.pl

  • stag-diff.pl

  • stag-drawtree.pl

  • stag-filter.pl

  • stag-findsubtree.pl

  • stag-flatten.pl

  • stag-grep.pl

  • stag-handle.pl

  • stag-itext2simple.pl

  • stag-itext2sxpr.pl

  • stag-itext2xml.pl

  • stag-join.pl

  • stag-merge.pl

  • stag-mogrify.pl

  • stag-parse.pl

  • stag-query.pl

  • stag-splitter.pl

  • stag-view.pl

  • stag-xml2itext.pl

  • wgsim_eval.pl

  • xam_coverage_bins.pl

  • zoom2sam.pl

Module

You can load the modules by:

module load biocontainers
module load ascatngs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ascatngs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ascatngs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ascatngs

ASGAL

Introduction

ASGAL (Alternative Splicing Graph ALigner) is a tool for detecting the alternative splicing events expressed in an RNA-Seq sample with respect to a gene annotation.

For more information, please check its Docker Hub page: https://hub.docker.com/r/algolab/asgal and its home page on GitHub.

Versions

  • 1.1.7

Commands

  • asgal

Module

You can load the modules by:

module load biocontainers
module load asgal

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ASGAL on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=asgal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers asgal

asgal -g input/genome.fa \
    -a input/annotation.gtf \
    -s input/sample_1.fa -o outputFolder

Aspera-connect

Introduction

Aspera Connect is software for downloading and uploading data. It includes a command line tool (ascp) that allows scripted data transfer.

For more information, please check:

Versions

  • 4.2.6

Commands

  • ascp

  • ascp4

  • asperaconnect

  • asperaconnect.bin

  • asperaconnect-nmh

  • asperacrypt

  • asunprotect

Module

You can load the modules by:

module load biocontainers
module load aspera-connect

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run aspera-connect on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=aspera-connect
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers aspera-connect

Assembly-stats

Introduction

Assembly-stats is a tool to get assembly statistics from FASTA and FASTQ files.

For more information, please check its website: https://biocontainers.pro/tools/assembly-stats and its home page on GitHub.

Versions

  • 1.0.1

Commands

  • assembly-stats

Module

You can load the modules by:

module load biocontainers
module load assembly-stats

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Assembly-stats on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=assembly-stats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers assembly-stats

assembly-stats seq.fasta

Atac-seq-pipeline

Introduction

The ENCODE ATAC-seq pipeline is used for quality control and statistical signal processing of short-read sequencing data, producing alignments and measures of enrichment. It was developed by Anshul Kundaje’s lab at Stanford University.

For more information, please check:

Versions

  • 2.1.3

Commands

  • 10x_bam2fastq

  • SAMstats

  • SAMstatsParallel

  • ace2sam

  • aggregate_scores_in_intervals.py

  • align_print_template.py

  • alignmentSieve

  • annotate.py

  • annotateBed

  • axt_extract_ranges.py

  • axt_to_fasta.py

  • axt_to_lav.py

  • axt_to_maf.py

  • bamCompare

  • bamCoverage

  • bamPEFragmentSize

  • bamToBed

  • bamToFastq

  • bed12ToBed6

  • bedToBam

  • bedToIgv

  • bed_bigwig_profile.py

  • bed_build_windows.py

  • bed_complement.py

  • bed_count_by_interval.py

  • bed_count_overlapping.py

  • bed_coverage.py

  • bed_coverage_by_interval.py

  • bed_diff_basewise_summary.py

  • bed_extend_to.py

  • bed_intersect.py

  • bed_intersect_basewise.py

  • bed_merge_overlapping.py

  • bed_rand_intersect.py

  • bed_subtract_basewise.py

  • bedpeToBam

  • bedtools

  • bigwigCompare

  • blast2sam.pl

  • bnMapper.py

  • bowtie2sam.pl

  • bwa

  • chardetect

  • closestBed

  • clusterBed

  • complementBed

  • compress

  • computeGCBias

  • computeMatrix

  • computeMatrixOperations

  • correctGCBias

  • coverageBed

  • createDiff

  • cutadapt

  • cygdb

  • cython

  • cythonize

  • deeptools

  • div_snp_table_chr.py

  • download_metaseq_example_data.py

  • estimateReadFiltering

  • estimateScaleFactor

  • expandCols

  • export2sam.pl

  • faidx

  • fastaFromBed

  • find_in_sorted_file.py

  • flankBed

  • gene_fourfold_sites.py

  • genomeCoverageBed

  • getOverlap

  • getSeq_genome_wN

  • getSeq_genome_woN

  • get_objgraph

  • get_scores_in_intervals.py

  • gffutils-cli

  • groupBy

  • gsl-config

  • gsl-histogram

  • gsl-randist

  • idr

  • int_seqs_to_char_strings.py

  • interpolate_sam.pl

  • intersectBed

  • intersection_matrix.py

  • interval_count_intersections.py

  • interval_join.py

  • intron_exon_reads.py

  • jsondiff

  • lav_to_axt.py

  • lav_to_maf.py

  • line_select.py

  • linksBed

  • lzop_build_offset_table.py

  • mMK_bitset.py

  • macs2

  • maf_build_index.py

  • maf_chop.py

  • maf_chunk.py

  • maf_col_counts.py

  • maf_col_counts_all.py

  • maf_count.py

  • maf_covered_ranges.py

  • maf_covered_regions.py

  • maf_div_sites.py

  • maf_drop_overlapping.py

  • maf_extract_chrom_ranges.py

  • maf_extract_ranges.py

  • maf_extract_ranges_indexed.py

  • maf_filter.py

  • maf_filter_max_wc.py

  • maf_gap_frequency.py

  • maf_gc_content.py

  • maf_interval_alignibility.py

  • maf_limit_to_species.py

  • maf_mapping_word_frequency.py

  • maf_mask_cpg.py

  • maf_mean_length_ungapped_piece.py

  • maf_percent_columns_matching.py

  • maf_percent_identity.py

  • maf_print_chroms.py

  • maf_print_scores.py

  • maf_randomize.py

  • maf_region_coverage_by_src.py

  • maf_select.py

  • maf_shuffle_columns.py

  • maf_species_in_all_files.py

  • maf_split_by_src.py

  • maf_thread_for_species.py

  • maf_tile.py

  • maf_tile_2.py

  • maf_tile_2bit.py

  • maf_to_axt.py

  • maf_to_concat_fasta.py

  • maf_to_fasta.py

  • maf_to_int_seqs.py

  • maf_translate_chars.py

  • maf_truncate.py

  • maf_word_frequency.py

  • makeBAM.sh

  • makeDiff.sh

  • makeFastq.sh

  • make_unique

  • makepBAM_genome.sh

  • makepBAM_transcriptome.sh

  • mapBed

  • maq2sam-long

  • maq2sam-short

  • maskFastaFromBed

  • mask_quality.py

  • mergeBed

  • metaseq-cli

  • multiBamCov

  • multiBamSummary

  • multiBigwigSummary

  • multiIntersectBed

  • nib_chrom_intervals_to_fasta.py

  • nib_intervals_to_fasta.py

  • nib_length.py

  • novo2sam.pl

  • nucBed

  • one_field_per_line.py

  • out_to_chain.py

  • pairToBed

  • pairToPair

  • pbam2bam

  • pbam_mapped_transcriptome

  • pbt_plotting_example.py

  • peak_pie.py

  • plot-bamstats

  • plotCorrelation

  • plotCoverage

  • plotEnrichment

  • plotFingerprint

  • plotHeatmap

  • plotPCA

  • plotProfile

  • prefix_lines.py

  • pretty_table.py

  • print_unique

  • psl2sam.pl

  • py.test

  • pybabel

  • pybedtools

  • pygmentize

  • pytest

  • python-argcomplete-check-easy-install-script

  • python-argcomplete-tcsh

  • qv_to_bqv.py

  • randomBed

  • random_lines.py

  • register-python-argcomplete

  • sam2vcf.pl

  • samtools

  • samtools.pl

  • seq_cache_populate.pl

  • shiftBed

  • shuffleBed

  • slopBed

  • soap2sam.pl

  • sortBed

  • speedtest.py

  • subtractBed

  • table_add_column.py

  • table_filter.py

  • tagBam

  • tfloc_summary.py

  • ucsc_gene_table_to_intervals.py

  • undill

  • unionBedGraphs

  • varfilter.py

  • venn_gchart.py

  • venn_mpl.py

  • wgsim

  • wgsim_eval.pl

  • wiggle_to_array_tree.py

  • wiggle_to_binned_array.py

  • wiggle_to_chr_binned_array.py

  • wiggle_to_simple.py

  • windowBed

  • windowMaker

  • zoom2sam.pl

Module

You can load the modules by:

module load biocontainers
module load atac-seq-pipeline

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
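
The shebang rule in the warning above can be checked before a job is submitted. The following is a minimal sketch, not part of the cluster tooling; the function name check_shebang is a hypothetical helper introduced here for illustration.

```shell
#!/bin/bash
# check_shebang FILE (hypothetical helper): succeed only if FILE starts
# with a bash shebang, otherwise print a warning and return 1.
check_shebang() {
    local first_line
    IFS= read -r first_line < "$1" || true
    case "$first_line" in
        '#!/bin/bash'*) return 0 ;;
        *) echo "warning: $1 starts with '$first_line', expected #!/bin/bash" >&2
           return 1 ;;
    esac
}
```

For example, check_shebang myjob.sh && sbatch myjob.sh submits the script only when the first line is a bash shebang (myjob.sh is a placeholder name).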

To run atac-seq-pipeline on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=atac-seq-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers atac-seq-pipeline

Ataqv

Introduction

Ataqv is a toolkit for measuring and comparing ATAC-seq results, made in the Parker lab at the University of Michigan.

For more information, please check its website: https://biocontainers.pro/tools/ataqv and its home page on GitHub.

Versions

  • 1.3.0

Commands

  • ataqv

Module

You can load the modules by:

module load biocontainers
module load ataqv

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ataqv on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ataqv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ataqv

ataqv --peak-file sample_1_peaks.broadPeak \
    --name sample_1 --metrics-file sample_1.ataqv.json.gz \
    --excluded-region-file hg19.blacklist.bed.gz \
    --tss-file hg19.tss.refseq.bed.gz \
    --ignore-read-groups human sample_1.md.bam \
    > sample_1.ataqv.out

ataqv --peak-file sample_2_peaks.broadPeak \
    --name sample_2 --metrics-file sample_2.ataqv.json.gz \
    --excluded-region-file hg19.blacklist.bed.gz \
    --tss-file hg19.tss.refseq.bed.gz \
    --ignore-read-groups human sample_2.md.bam \
    > sample_2.ataqv.out

ataqv --peak-file sample_3_peaks.broadPeak \
    --name sample_3 --metrics-file sample_3.ataqv.json.gz \
    --excluded-region-file hg19.blacklist.bed.gz \
    --tss-file hg19.tss.refseq.bed.gz \
    --ignore-read-groups human sample_3.md.bam \
    > sample_3.ataqv.out

mkarv my_fantastic_experiment sample_1.ataqv.json.gz sample_2.ataqv.json.gz sample_3.ataqv.json.gz
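
The three ataqv calls above differ only in the sample name, so they can be folded into a loop. The sketch below reuses exactly the options from the job script; the function names run_ataqv and run_all are hypothetical helpers, and the file naming pattern is assumed to match the example.

```shell
#!/bin/bash
# run_ataqv SAMPLE (hypothetical helper): same ataqv options as in the
# job script above, with the sample name substituted into each file name.
run_ataqv() {
    local sample="$1"
    ataqv --peak-file "${sample}_peaks.broadPeak" \
        --name "${sample}" --metrics-file "${sample}.ataqv.json.gz" \
        --excluded-region-file hg19.blacklist.bed.gz \
        --tss-file hg19.tss.refseq.bed.gz \
        --ignore-read-groups human "${sample}.md.bam" \
        > "${sample}.ataqv.out"
}

# run_all SAMPLE... (hypothetical helper): process each sample, then
# pass all metrics files to mkarv in one call.
run_all() {
    local sample metrics=()
    for sample in "$@"; do
        run_ataqv "${sample}" || return 1
        metrics+=("${sample}.ataqv.json.gz")
    done
    mkarv my_fantastic_experiment "${metrics[@]}"
}
```

In the job script, after the module is loaded, run_all sample_1 sample_2 sample_3 would reproduce the commands shown above.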

aTRAM

Introduction

aTRAM (automated target restricted assembly method) is an iterative assembler that performs reference-guided local de novo assemblies using a variety of available methods.

Detailed usage can be found here: https://bioinformaticshome.com/tools/wga/descriptions/aTRAM.html

Versions

  • 2.4.3

Commands

  • atram.py

  • atram_preprocessor.py

  • atram_stitcher.py

Module

You can load the modules by:

module load biocontainers
module load atram/2.4.3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run aTRAM on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=atram
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers atram/2.4.3

atram_preprocessor.py --blast-db=atram_db  \
                      --end-1=data/tutorial_end_1.fasta.gz \
                      --end-2=data/tutorial_end_2.fasta.gz \
                      --gzip
atram.py --query=tutorial-query.pep.fasta  \
         --blast-db=atram_db \
         --output=output \
         --assembler=velvet

Atropos

Introduction

Atropos is a tool for specific, sensitive, and speedy trimming of NGS reads.

For more information, please check its website: https://biocontainers.pro/tools/atropos and its home page on GitHub.

Versions

  • 1.1.17

  • 1.1.31

Commands

  • atropos

Module

You can load the modules by:

module load biocontainers
module load atropos

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Atropos on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=atropos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers atropos

atropos --threads 4  \
    -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA \
    -o trimmed1.fq.gz -p trimmed2.fq.gz \
    -pe1 SRR13176582_1.fastq -pe2 SRR13176582_2.fastq
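
The job above requests 4 tasks with -n 4 and repeats that number in --threads 4. A small sketch of keeping the two in sync follows; it assumes Slurm exports SLURM_NTASKS inside the job (it does when -n is given), and slurm_threads is a hypothetical helper name.

```shell
#!/bin/bash
# slurm_threads (hypothetical helper): print the task count Slurm granted
# to the job, or 1 when run outside a Slurm job.
slurm_threads() {
    echo "${SLURM_NTASKS:-1}"
}
```

The trimming command could then start with atropos --threads "$(slurm_threads)" so that changing -n in the header is enough.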

Augur

Introduction

Augur is the bioinformatics toolkit that Nextstrain uses to track evolution from sequence and serological data.

For more information, please check its website: https://biocontainers.pro/tools/augur and its home page on GitHub.

Versions

  • 14.0.0

  • 15.0.0

Commands

  • augur

Module

You can load the modules by:

module load biocontainers
module load augur

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Augur on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=augur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers augur

mkdir -p results
augur index --sequences zika-tutorial/data/sequences.fasta \
            --output results/sequence_index.tsv

augur filter --sequences zika-tutorial/data/sequences.fasta \
             --sequence-index results/sequence_index.tsv \
             --metadata  zika-tutorial/data/metadata.tsv \
             --exclude zika-tutorial/config/dropped_strains.txt \
             --output results/filtered.fasta \
             --group-by country year month \
             --sequences-per-group 20 \
             --min-date 2012

augur align --sequences results/filtered.fasta \
            --reference-sequence zika-tutorial/config/zika_outgroup.gb \
            --output results/aligned.fasta \
            --fill-gaps

augur tree --alignment results/aligned.fasta \
           --output results/tree_raw.nwk

augur refine --tree results/tree_raw.nwk \
             --alignment results/aligned.fasta \
             --metadata  zika-tutorial/data/metadata.tsv \
             --output-tree results/tree.nwk \
             --output-node-data results/branch_lengths.json \
             --timetree \
             --coalescent opt \
             --date-confidence \
             --date-inference marginal \
             --clock-filter-iqd 4
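
Each Augur step above reads the output of the previous one, so it may help to stop the job at the first failure instead of running later steps on missing files. The sketch below is one possible wrapper; run_step is a hypothetical helper, not an Augur feature.

```shell
#!/bin/bash
# run_step NAME CMD... (hypothetical helper): log the step name to stderr,
# run the command, and return non-zero with a message if it fails.
run_step() {
    local name="$1"
    shift
    echo "[$(date +%T)] starting ${name}" >&2
    if ! "$@"; then
        echo "step '${name}' failed" >&2
        return 1
    fi
}
```

A step would then be written as run_step tree augur tree --alignment results/aligned.fasta --output results/tree_raw.nwk || exit 1.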

AUGUSTUS

Introduction

AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences.

For more information, please check its website: https://bioinf.uni-greifswald.de/augustus/.

Versions

  • 3.4.0

  • 3.5.0

Commands

  • aln2wig

  • augustus

  • bam2wig

  • bam2wig-dist

  • consensusFinder

  • curve2hints

  • etraining

  • fastBlockSearch

  • filterBam

  • getSeq

  • getSeq-dist

  • homGeneMapping

  • joingenes

  • prepareAlign

Module

You can load the modules by:

module load biocontainers
module load augustus/3.4.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run AUGUSTUS on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=AUGUSTUS
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers augustus/3.4.0

augustus --species=botrytis_cinerea genome.fasta > annotation.gff

Bactopia

Introduction

Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!

For more information, please check:

Versions

  • 2.0.3

  • 2.1.1

  • 2.2.0

  • 3.0.0

Commands

  • bactopia

Module

You can load the modules by:

module load biocontainers
module load bactopia

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bactopia on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bactopia
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bactopia

bactopia datasets \
--ariba "vfdb_core,card" \
--species "Staphylococcus aureus" \
--include_genus \
--limit 100 \
--cpus 12

bactopia --accession SRX4563634 \
--datasets datasets/ \
--species "Staphylococcus aureus" \
--coverage 100 \
--genome_size median \
--outdir ena-single-sample \
--max_cpus 12

Bali-phy

Introduction

Bali-phy is a tool for Bayesian co-estimation of phylogenies and multiple alignments via MCMC.

For more information, please check:

Versions

  • 3.6.0

Commands

  • bali-phy

Module

You can load the modules by:

module load biocontainers
module load bali-phy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bali-phy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bali-phy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bali-phy

bali-phy examples/sequences/ITS/ITS1.fasta 5.8S.fasta ITS2.fasta --test
bali-phy examples/sequences/5S-rRNA/5d-clustalw.fasta -S 'gtr+Rates.gamma[4]+inv' -n 5d-free

Bamgineer

Introduction

Bamgineer is a tool that can be used to introduce user-defined haplotype-phased allele-specific copy number variations (CNV) into an existing Binary Alignment Map (BAM) file, with demonstrated applicability to simulate somatic cancer CNVs in phased whole-genome sequencing datasets.

For more information, please check its Docker hub: https://hub.docker.com/r/suluxan/bamgineer-v2 and its home page on GitHub.

Versions

  • 1.1

Commands

  • simulate.py

Module

You can load the modules by:

module load biocontainers
module load bamgineer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bamgineer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamgineer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bamgineer

simulate.py -config inputs/config.cfg \
            -splitbamdir splitbams \
            -cnv_bed inputs/cnv.bed \
            -vcf inputs/normal_het.vcf \
            -exons inputs/exons.bed \
            -outbam tumour.bam \
            -results outputs \
            -cancertype LUAC1

Bamliquidator

Introduction

Bamliquidator is a set of tools for analyzing the density of short DNA sequence read alignments in the BAM file format.

For more information, please check its Docker hub: https://hub.docker.com/r/bioliquidator/bamliquidator/ and its home page on GitHub.

Versions

  • 1.5.2

Commands

  • bamliquidator

  • bamliquidator_bins

  • bamliquidator_regions

  • bamliquidatorbatch

Module

You can load the modules by:

module load biocontainers
module load bamliquidator

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bamliquidator on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamliquidator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bamliquidator

Bam-readcount

Introduction

Bam-readcount is a utility that runs on a BAM or CRAM file and generates low-level information about sequencing data at specific nucleotide positions.

For more information, please check its Docker hub: https://hub.docker.com/r/mgibio/bam-readcount and its home page on GitHub.

Versions

  • 1.0.0

Commands

  • bam-readcount

Module

You can load the modules by:

module load biocontainers
module load bam-readcount

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bam-readcount on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bam-readcount
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bam-readcount

bam-readcount -f Homo_sapiens.GRCh38.dna.primary_assembly.fa Aligned.sortedByCoord.out.bam

Bamsurgeon

Introduction

Bamsurgeon is a set of tools for adding mutations to .bam files, used for testing mutation callers.

For more information, please check its Docker hub: https://hub.docker.com/r/lethalfang/bamsurgeon and its home page on GitHub.

Versions

  • 1.2

Commands

  • addindel.py

  • addsnv.py

  • addsv.py

Module

You can load the modules by:

module load biocontainers
module load bamsurgeon

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bamsurgeon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamsurgeon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bamsurgeon

addsv.py -p 1 -v test_sv.txt -f testregion_realign.bam \
    -r reference.fasta -o testregion_sv_mut.bam \
    --aligner mem --keepsecondary --seed 1234 \
    --inslib test_inslib.fa

BamTools

Introduction

BamTools is a programmer API and an end-user toolkit for handling BAM files. This container provides a toolkit-only version (no API to build against).

For more information, please check its website: https://biocontainers.pro/tools/bamtools and its home page on GitHub.

Versions

  • 2.5.1

Commands

  • bamtools

Module

You can load the modules by:

module load biocontainers
module load bamtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BamTools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bamtools

bamtools convert -format fastq -in in.bam -out out.fastq

Bamutil

Introduction

Bamutil is a collection of programs for working on SAM/BAM files.

For more information, please check its website: https://biocontainers.pro/tools/bamutil and its home page on GitHub.

Versions

  • 1.0.15

Commands

  • bam

Module

You can load the modules by:

module load biocontainers
module load bamutil

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bamutil on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamutil
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bamutil

bam validate --params --in test/testFiles/testInvalid.sam --refFile test/testFilesLibBam/chr1_partial.fa --v --noph 2> results/validateInvalid.txt

bam convert --params --in test/testFiles/testFilter.bam --out results/convertBam.sam --noph 2> results/convertBam.log

bam splitChromosome --in test/testFile/sortedBam1.bam --out results/splitSortedBam --noph 2> results/splitChromosome.txt

bam stats --basic --in test/testFiles/testFilter.sam --noph 2> results/basicStats.txt

bam gapInfo --in test/testFiles/testGapInfo.sam --out results/gapInfo.txt --noph 2> results/gapInfo.log

bam findCigars --in test/testFiles/testRevert.sam --out results/cigarNonM.sam --nonM --noph 2> results/cigarNonM.log

Barrnap

Introduction

Barrnap: BAsic Rapid Ribosomal RNA Predictor.

For more information, please check its website: https://biocontainers.pro/tools/barrnap and its home page on GitHub.

Versions

  • 0.9.4

Commands

  • barrnap

Module

You can load the modules by:

module load biocontainers
module load barrnap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Barrnap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=barrnap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers barrnap

barrnap --kingdom bac -o bac_16s.fasta < bac_genome.fasta > bac_16s.gff3
barrnap --kingdom euk -o euk_16s.fasta < euk_genome.fasta > euk_16s.gff3
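
The GFF3 files written above can be summarized with standard shell tools. The sketch below tallies predicted features by name; it assumes the ninth GFF3 column carries a Name=... tag (for example Name=16S_rRNA), as Barrnap output typically does, and count_rrna is a hypothetical helper name.

```shell
#!/bin/bash
# count_rrna GFF (hypothetical helper): print one "name<TAB>count" line
# per distinct Name= tag found in column 9 of a GFF3 file.
count_rrna() {
    awk -F'\t' '
        /^#/ || NF < 9 { next }            # skip headers and short lines
        {
            name = "unknown"               # used when no Name= tag exists
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++)
                if (attrs[i] ~ /^Name=/) name = substr(attrs[i], 6)
            count[name]++
        }
        END { for (k in count) print k "\t" count[k] }
    ' "$1" | sort
}
```

For example, count_rrna bac_16s.gff3 would list how many features of each rRNA type were predicted in the bacterial genome.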

Basenji

Introduction

Basenji is a tool for sequential regulatory activity predictions with deep convolutional neural networks.

For more information, please check its website: https://biocontainers.pro/tools/basenji and its home page on GitHub.

Versions

  • 0.5.1

Commands

  • akita_data.py

  • akita_data_read.py

  • akita_data_write.py

  • akita_predict.py

  • akita_sat_plot.py

  • akita_sat_vcf.py

  • akita_scd.py

  • akita_scd_multi.py

  • akita_test.py

  • akita_train.py

  • bam_cov.py

  • basenji_annot_chr.py

  • basenji_bench_classify.py

  • basenji_bench_gtex.py

  • basenji_bench_gtex_cmp.py

  • basenji_bench_phylop.py

  • basenji_bench_phylop_folds.py

  • basenji_cmp.py

  • basenji_data.py

  • basenji_data2.py

  • basenji_data_align.py

  • basenji_data_gene.py

  • basenji_data_hic_read.py

  • basenji_data_hic_write.py

  • basenji_data_read.py

  • basenji_data_write.py

  • basenji_fetch_app.py

  • basenji_fetch_app1.py

  • basenji_fetch_app2.py

  • basenji_fetch_norm.py

  • basenji_fetch_vcf.py

  • basenji_gtex_folds.py

  • basenji_hdf5_genes.py

  • basenji_hidden.py

  • basenji_map.py

  • basenji_map_genes.py

  • basenji_map_seqs.py

  • basenji_motifs.py

  • basenji_motifs_denovo.py

  • basenji_norm_h5.py

  • basenji_predict.py

  • basenji_predict_bed.py

  • basenji_predict_bed_multi.py

  • basenji_sad.py

  • basenji_sad_multi.py

  • basenji_sad_norm.py

  • basenji_sad_ref.py

  • basenji_sad_ref_multi.py

  • basenji_sad_table.py

  • basenji_sat_bed.py

  • basenji_sat_bed_multi.py

  • basenji_sat_folds.py

  • basenji_sat_plot.py

  • basenji_sat_plot2.py

  • basenji_sat_vcf.py

  • basenji_sed.py

  • basenji_sed_multi.py

  • basenji_sedg.py

  • basenji_test.py

  • basenji_test_folds.py

  • basenji_test_genes.py

  • basenji_test_reps.py

  • basenji_test_specificity.py

  • basenji_train.py

  • basenji_train1.py

  • basenji_train2.py

  • basenji_train_folds.py

  • basenji_train_hic.py

  • basenji_train_reps.py

  • save_model.py

  • sonnet_predict_bed.py

  • sonnet_sad.py

  • sonnet_sad_multi.py

  • sonnet_sat_bed.py

  • sonnet_sat_vcf.py

  • tfr_bw.py

  • tfr_hdf5.py

  • tfr_qc.py

  • upgrade_tf1.py

Module

You can load the modules by:

module load biocontainers
module load basenji

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Basenji on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=basenji
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers basenji

Bayescan

Introduction

BayeScan aims at identifying candidate loci under natural selection from genetic data, using differences in allele frequencies between populations.

For more information, please check:

Versions

  • 2.1

Commands

  • bayescan

Module

You can load the modules by:

module load biocontainers
module load bayescan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bayescan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bayescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bayescan
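
The job script above loads the module but does not run an analysis. A minimal sketch of a run follows; the input file name and output directory are placeholders, run_bayescan is a hypothetical helper, and the -od and -threads options are taken from the BayeScan 2.1 command-line help as best recalled, so please confirm them with bayescan -help before relying on them.

```shell
#!/bin/bash
# run_bayescan INPUT OUTDIR (hypothetical helper): run BayeScan on a
# genotype file, writing results to OUTDIR and using the Slurm task
# count as the thread count (1 outside a Slurm job).
run_bayescan() {
    local input="$1" outdir="$2"
    mkdir -p "${outdir}"
    bayescan "${input}" -od "${outdir}" -threads "${SLURM_NTASKS:-1}"
}
```

In the job script this would be called as, for example, run_bayescan my_snps.txt results.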

Bazam

Introduction

Bazam is a tool to extract paired reads in FASTQ format from coordinate-sorted BAM files.

For more information, please check its Docker hub: https://hub.docker.com/r/dockanomics/bazam and its home page: https://github.com/ssadedin/bazam.

Versions

  • 1.0.1

Commands

  • bazam

Module

You can load the modules by:

module load biocontainers
module load bazam

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bazam on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bazam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bazam
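
The job script above stops after loading the module. As a minimal sketch, the basic extraction described in the introduction could look like the following; the file names are placeholders, bam_to_fastq is a hypothetical helper, and the -bam option with FASTQ written to standard output is assumed from the Bazam README.

```shell
#!/bin/bash
# bam_to_fastq BAM FASTQ (hypothetical helper): extract read pairs from a
# coordinate-sorted BAM file and write them to a single FASTQ file.
bam_to_fastq() {
    local bam="$1" fastq="$2"
    bazam -bam "${bam}" > "${fastq}"
}
```

For example, bam_to_fastq sample.bam sample.fastq inside the job script.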

Bbmap

Introduction

Bbmap is a short-read aligner that is bundled with various other bioinformatics tools.

For more information, please check its website: https://biocontainers.pro/tools/bbmap and its home page on SourceForge.

Versions

  • 38.93

  • 38.96

Commands

  • addadapters.sh

  • a_sample_mt.sh

  • bbcountunique.sh

  • bbduk.sh

  • bbest.sh

  • bbfakereads.sh

  • bbmap.sh

  • bbmapskimmer.sh

  • bbmask.sh

  • bbmerge-auto.sh

  • bbmergegapped.sh

  • bbmerge.sh

  • bbnorm.sh

  • bbqc.sh

  • bbrealign.sh

  • bbrename.sh

  • bbsketch.sh

  • bbsplitpairs.sh

  • bbsplit.sh

  • bbstats.sh

  • bbversion.sh

  • bbwrap.sh

  • calcmem.sh

  • calctruequality.sh

  • callpeaks.sh

  • callvariants2.sh

  • callvariants.sh

  • clumpify.sh

  • commonkmers.sh

  • comparesketch.sh

  • comparevcf.sh

  • consect.sh

  • countbarcodes.sh

  • countgc.sh

  • countsharedlines.sh

  • crossblock.sh

  • crosscontaminate.sh

  • cutprimers.sh

  • decontaminate.sh

  • dedupe2.sh

  • dedupebymapping.sh

  • dedupe.sh

  • demuxbyname.sh

  • diskbench.sh

  • estherfilter.sh

  • explodetree.sh

  • filterassemblysummary.sh

  • filterbarcodes.sh

  • filterbycoverage.sh

  • filterbyname.sh

  • filterbysequence.sh

  • filterbytaxa.sh

  • filterbytile.sh

  • filterlines.sh

  • filtersam.sh

  • filtersubs.sh

  • filtervcf.sh

  • fungalrelease.sh

  • fuse.sh

  • getreads.sh

  • gi2ancestors.sh

  • gi2taxid.sh

  • gitable.sh

  • grademerge.sh

  • gradesam.sh

  • idmatrix.sh

  • idtree.sh

  • invertkey.sh

  • kcompress.sh

  • khist.sh

  • kmercountexact.sh

  • kmercountmulti.sh

  • kmercoverage.sh

  • loadreads.sh

  • loglog.sh

  • makechimeras.sh

  • makecontaminatedgenomes.sh

  • makepolymers.sh

  • mapPacBio.sh

  • matrixtocolumns.sh

  • mergebarcodes.sh

  • mergeOTUs.sh

  • mergesam.sh

  • msa.sh

  • mutate.sh

  • muxbyname.sh

  • normandcorrectwrapper.sh

  • partition.sh

  • phylip2fasta.sh

  • pileup.sh

  • plotgc.sh

  • postfilter.sh

  • printtime.sh

  • processfrag.sh

  • processspeed.sh

  • randomreads.sh

  • readlength.sh

  • reducesilva.sh

  • reformat.sh

  • removebadbarcodes.sh

  • removecatdogmousehuman.sh

  • removehuman2.sh

  • removehuman.sh

  • removemicrobes.sh

  • removesmartbell.sh

  • renameimg.sh

  • rename.sh

  • repair.sh

  • replaceheaders.sh

  • representative.sh

  • rqcfilter.sh

  • samtoroc.sh

  • seal.sh

  • sendsketch.sh

  • shred.sh

  • shrinkaccession.sh

  • shuffle.sh

  • sketchblacklist.sh

  • sketch.sh

  • sortbyname.sh

  • splitbytaxa.sh

  • splitnextera.sh

  • splitsam4way.sh

  • splitsam6way.sh

  • splitsam.sh

  • stats.sh

  • statswrapper.sh

  • streamsam.sh

  • summarizecrossblock.sh

  • summarizemerge.sh

  • summarizequast.sh

  • summarizescafstats.sh

  • summarizeseal.sh

  • summarizesketch.sh

  • synthmda.sh

  • tadpipe.sh

  • tadpole.sh

  • tadwrapper.sh

  • taxonomy.sh

  • taxserver.sh

  • taxsize.sh

  • taxtree.sh

  • testfilesystem.sh

  • testformat2.sh

  • testformat.sh

  • tetramerfreq.sh

  • textfile.sh

  • translate6frames.sh

  • unicode2ascii.sh

  • webcheck.sh

Module

You can load the modules by:

module load biocontainers
module load bbmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bbmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bbmap

stats.sh in=SRR11234553_1.fastq > stats_out.txt
statswrapper.sh *.fastq > statswrapper_out.txt
pileup.sh in=map1.sam out=pileup_out.txt
readlength.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq > readlength_out.txt
kmercountexact.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq out=kmer_test.out khist=kmer.khist peaks=kmer.peak
bbmask.sh in=SRR11234553_1.fastq out=test.mark sam=map1.sam

Bbtools

Introduction

BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data.

Versions

  • 39.00

Commands

  • Xcalcmem.sh

  • a_sample_mt.sh

  • addadapters.sh

  • addssu.sh

  • adjusthomopolymers.sh

  • alltoall.sh

  • analyzeaccession.sh

  • analyzegenes.sh

  • analyzesketchresults.sh

  • applyvariants.sh

  • bbcms.sh

  • bbcountunique.sh

  • bbduk.sh

  • bbest.sh

  • bbfakereads.sh

  • bbmap.sh

  • bbmapskimmer.sh

  • bbmask.sh

  • bbmerge-auto.sh

  • bbmerge.sh

  • bbnorm.sh

  • bbrealign.sh

  • bbrename.sh

  • bbsketch.sh

  • bbsplit.sh

  • bbsplitpairs.sh

  • bbstats.sh

  • bbversion.sh

  • bbwrap.sh

  • bloomfilter.sh

  • calcmem.sh

  • calctruequality.sh

  • callgenes.sh

  • callpeaks.sh

  • callvariants.sh

  • callvariants2.sh

  • clumpify.sh

  • commonkmers.sh

  • comparegff.sh

  • comparesketch.sh

  • comparessu.sh

  • comparevcf.sh

  • consect.sh

  • consensus.sh

  • countbarcodes.sh

  • countgc.sh

  • countsharedlines.sh

  • crossblock.sh

  • crosscontaminate.sh

  • cutgff.sh

  • cutprimers.sh

  • decontaminate.sh

  • dedupe.sh

  • dedupe2.sh

  • dedupebymapping.sh

  • demuxbyname.sh

  • diskbench.sh

  • estherfilter.sh

  • explodetree.sh

  • fetchproks.sh

  • filterassemblysummary.sh

  • filterbarcodes.sh

  • filterbycoverage.sh

  • filterbyname.sh

  • filterbysequence.sh

  • filterbytaxa.sh

  • filterbytile.sh

  • filterlines.sh

  • filterqc.sh

  • filtersam.sh

  • filtersilva.sh

  • filtersubs.sh

  • filtervcf.sh

  • fixgaps.sh

  • fungalrelease.sh

  • fuse.sh

  • gbff2gff.sh

  • getreads.sh

  • gi2ancestors.sh

  • gi2taxid.sh

  • gitable.sh

  • grademerge.sh

  • gradesam.sh

  • icecreamfinder.sh

  • icecreamgrader.sh

  • icecreammaker.sh

  • idmatrix.sh

  • idtree.sh

  • invertkey.sh

  • kapastats.sh

  • kcompress.sh

  • keepbestcopy.sh

  • khist.sh

  • kmercountexact.sh

  • kmercountmulti.sh

  • kmercoverage.sh

  • kmerfilterset.sh

  • kmerlimit.sh

  • kmerlimit2.sh

  • kmerposition.sh

  • kmutate.sh

  • lilypad.sh

  • loadreads.sh

  • loglog.sh

  • makechimeras.sh

  • makecontaminatedgenomes.sh

  • makepolymers.sh

  • mapPacBio.sh

  • matrixtocolumns.sh

  • mergeOTUs.sh

  • mergebarcodes.sh

  • mergepgm.sh

  • mergeribo.sh

  • mergesam.sh

  • mergesketch.sh

  • mergesorted.sh

  • msa.sh

  • mutate.sh

  • muxbyname.sh

  • partition.sh

  • phylip2fasta.sh

  • pileup.sh

  • plotflowcell.sh

  • plotgc.sh

  • postfilter.sh

  • printtime.sh

  • processfrag.sh

  • processhi-c.sh

  • processspeed.sh

  • randomgenome.sh

  • randomreads.sh

  • readlength.sh

  • readqc.sh

  • reducesilva.sh

  • reformat.sh

  • reformatpb.sh

  • removebadbarcodes.sh

  • removecatdogmousehuman.sh

  • removehuman.sh

  • removehuman2.sh

  • removemicrobes.sh

  • removesmartbell.sh

  • rename.sh

  • renameimg.sh

  • repair.sh

  • replaceheaders.sh

  • representative.sh

  • rqcfilter.sh

  • rqcfilter2.sh

  • runhmm.sh

  • samtoroc.sh

  • seal.sh

  • sendsketch.sh

  • shred.sh

  • shrinkaccession.sh

  • shuffle.sh

  • shuffle2.sh

  • sketch.sh

  • sketchblacklist.sh

  • sketchblacklist2.sh

  • sortbyname.sh

  • splitbytaxa.sh

  • splitnextera.sh

  • splitribo.sh

  • splitsam.sh

  • splitsam4way.sh

  • splitsam6way.sh

  • stats.sh

  • statswrapper.sh

  • streamsam.sh

  • subsketch.sh

  • summarizecontam.sh

  • summarizecoverage.sh

  • summarizecrossblock.sh

  • summarizemerge.sh

  • summarizequast.sh

  • summarizescafstats.sh

  • summarizeseal.sh

  • summarizesketch.sh

  • synthmda.sh

  • tadpipe.sh

  • tadpole.sh

  • tadwrapper.sh

  • taxonomy.sh

  • taxserver.sh

  • taxsize.sh

  • taxtree.sh

  • testfilesystem.sh

  • testformat.sh

  • testformat2.sh

  • tetramerfreq.sh

  • textfile.sh

  • translate6frames.sh

  • unicode2ascii.sh

  • unzip.sh

  • vcf2gff.sh

  • webcheck.sh

Module

You can load the modules by:

module load biocontainers
module load bbtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bbtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bbtools
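
The job script above loads the module without running a tool. As one illustration, adapter trimming with bbduk.sh (listed under Commands) might be sketched as below. The parameter values follow the commonly cited BBDuk adapter-trimming settings; the file names and the helper name trim_adapters are placeholders introduced here.

```shell
#!/bin/bash
# trim_adapters R1 R2 ADAPTERS (hypothetical helper): right-trim adapter
# sequence from a read pair; outputs go to the current directory with a
# "trimmed_" prefix on the original file names.
trim_adapters() {
    local r1="$1" r2="$2" adapters="$3"
    bbduk.sh in1="${r1}" in2="${r2}" \
        out1="trimmed_${r1##*/}" out2="trimmed_${r2##*/}" \
        ref="${adapters}" ktrim=r k=23 mink=11 hdist=1 tpe tbo
}
```

For example, trim_adapters sample_1.fastq sample_2.fastq adapters.fa, where adapters.fa is a FASTA file of adapter sequences that you supply.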

Bcftools

Introduction

Bcftools is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.

For more information, please check its website: https://biocontainers.pro/tools/bcftools and its home page on GitHub.

Versions

  • 1.13

  • 1.14

  • 1.17

Commands

  • bcftools

  • color-chrs.pl

  • guess-ploidy.py

  • plot-roh.py

  • plot-vcfstats

  • run-roh.pl

  • vcfutils.pl

Module

You can load the modules by:

module load biocontainers
module load bcftools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bcftools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bcftools

bcftools query -f '%CHROM %POS %REF %ALT\n' file.bcf
bcftools polysomy -v -o outdir/ file.vcf

# Variant calling
bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ob -o calls.bcf

Bcl2fastq

Introduction

bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis.

Versions

  • 2.20.0

Commands

  • bcl2fastq

Module

You can load the modules by:

module load biocontainers
module load bcl2fastq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bcl2fastq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcl2fastq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bcl2fastq

Beagle

Introduction

Beagle is a software package for phasing genotypes and for imputing ungenotyped markers. Start it with: beagle [java options] [arguments]. Note: Bref is not installed in this container.

For more information, please check its website: https://biocontainers.pro/tools/beagle and its home page: https://faculty.washington.edu/browning/beagle/beagle.html.

Versions

  • 5.1_24Aug19.3e8

Commands

  • beagle

Module

You can load the modules by:

module load biocontainers
module load beagle

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Beagle on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=beagle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers beagle

beagle gt=test.vcf.gz out=test.out

BEAST 2

Introduction

BEAST 2 is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences.

For more information, please check its website: https://biocontainers.pro/tools/beast2 and its home page: https://www.beast2.org.

Versions

  • 2.6.3

  • 2.6.4

  • 2.6.6

Commands

  • applauncher

  • beast

  • beauti

  • densitree

  • loganalyser

  • logcombiner

  • packagemanager

  • treeannotator

Module

You can load the modules by:

module load biocontainers
module load beast2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BEAST 2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=beast2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers beast2

beast -threads 4 -prefix input input.xml

Bedops

Introduction

Bedops is a software package for manipulating and analyzing genomic interval data.

For more information, please check its website: https://biocontainers.pro/tools/bedops and its home page: https://bedops.readthedocs.io/en/latest/.

Versions

  • 2.4.39

Commands

  • bam2bed

  • bam2bed-float128

  • bam2bed_gnuParallel

  • bam2bed_gnuParallel-float128

  • bam2bed_gnuParallel-megarow

  • bam2bed_gnuParallel-typical

  • bam2bed-megarow

  • bam2bed_sge

  • bam2bed_sge-float128

  • bam2bed_sge-megarow

  • bam2bed_sge-typical

  • bam2bed_slurm

  • bam2bed_slurm-float128

  • bam2bed_slurm-megarow

  • bam2bed_slurm-typical

  • bam2bed-typical

  • bam2starch

  • bam2starch-float128

  • bam2starch_gnuParallel

  • bam2starch_gnuParallel-float128

  • bam2starch_gnuParallel-megarow

  • bam2starch_gnuParallel-typical

  • bam2starch-megarow

  • bam2starch_sge

  • bam2starch_sge-float128

  • bam2starch_sge-megarow

  • bam2starch_sge-typical

  • bam2starch_slurm

  • bam2starch_slurm-float128

  • bam2starch_slurm-megarow

  • bam2starch_slurm-typical

  • bam2starch-typical

  • bedextract

  • bedextract-float128

  • bedextract-megarow

  • bedextract-typical

  • bedmap

  • bedmap-float128

  • bedmap-megarow

  • bedmap-typical

  • bedops

  • bedops-float128

  • bedops-megarow

  • bedops-typical

  • closest-features

  • closest-features-float128

  • closest-features-megarow

  • closest-features-typical

  • convert2bed

  • convert2bed-float128

  • convert2bed-megarow

  • convert2bed-typical

  • gff2bed

  • gff2bed-float128

  • gff2bed-megarow

  • gff2bed-typical

  • gff2starch

  • gff2starch-float128

  • gff2starch-megarow

  • gff2starch-typical

  • gtf2bed

  • gtf2bed-float128

  • gtf2bed-megarow

  • gtf2bed-typical

  • gtf2starch

  • gtf2starch-float128

  • gtf2starch-megarow

  • gtf2starch-typical

  • gvf2bed

  • gvf2bed-float128

  • gvf2bed-megarow

  • gvf2bed-typical

  • gvf2starch

  • gvf2starch-float128

  • gvf2starch-megarow

  • gvf2starch-typical

  • psl2bed

  • psl2bed-float128

  • psl2bed-megarow

  • psl2bed-typical

  • psl2starch

  • psl2starch-float128

  • psl2starch-megarow

  • psl2starch-typical

  • rmsk2bed

  • rmsk2bed-float128

  • rmsk2bed-megarow

  • rmsk2bed-typical

  • rmsk2starch

  • rmsk2starch-float128

  • rmsk2starch-megarow

  • rmsk2starch-typical

  • sam2bed

  • sam2bed-float128

  • sam2bed-megarow

  • sam2bed-typical

  • sam2starch

  • sam2starch-float128

  • sam2starch-megarow

  • sam2starch-typical

  • sort-bed

  • sort-bed-float128

  • sort-bed-megarow

  • sort-bed-typical

  • starch

  • starchcat

  • starchcat-float128

  • starchcat-megarow

  • starchcat-typical

  • starchcluster_gnuParallel

  • starchcluster_gnuParallel-float128

  • starchcluster_gnuParallel-megarow

  • starchcluster_gnuParallel-typical

  • starchcluster_sge

  • starchcluster_sge-float128

  • starchcluster_sge-megarow

  • starchcluster_sge-typical

  • starchcluster_slurm

  • starchcluster_slurm-float128

  • starchcluster_slurm-megarow

  • starchcluster_slurm-typical

  • starch-diff

  • starch-diff-float128

  • starch-diff-megarow

  • starch-diff-typical

  • starch-float128

  • starch-megarow

  • starchstrip

  • starchstrip-float128

  • starchstrip-megarow

  • starchstrip-typical

  • starch-typical

  • switch-BEDOPS-binary-type

  • unstarch

  • unstarch-float128

  • unstarch-megarow

  • unstarch-typical

  • update-sort-bed-migrate-candidates

  • update-sort-bed-migrate-candidates-float128

  • update-sort-bed-migrate-candidates-megarow

  • update-sort-bed-migrate-candidates-typical

  • update-sort-bed-slurm

  • update-sort-bed-slurm-float128

  • update-sort-bed-slurm-megarow

  • update-sort-bed-slurm-typical

  • update-sort-bed-starch-slurm

  • update-sort-bed-starch-slurm-float128

  • update-sort-bed-starch-slurm-megarow

  • update-sort-bed-starch-slurm-typical

  • vcf2bed

  • vcf2bed-float128

  • vcf2bed-megarow

  • vcf2bed-typical

  • vcf2starch

  • vcf2starch-float128

  • vcf2starch-megarow

  • vcf2starch-typical

  • wig2bed

  • wig2bed-float128

  • wig2bed-megarow

  • wig2bed-typical

  • wig2starch

  • wig2starch-float128

  • wig2starch-megarow

  • wig2starch-typical

Module

You can load the modules by:

module load biocontainers
module load bedops

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bedops on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedops
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bedops

bedops -m 001.merge.001.test > 001.merge.001.observed
bedops -c 001.merge.001.test > 001.complement.001.observed
bedops -i 001.intersection.001a.test 001.intersection.001b.test > 001.intersection.001.observed

Bedtools

Introduction

Bedtools is an extensive suite of utilities for genome arithmetic and comparing genomic features in BED format.

For more information, please check its website: https://biocontainers.pro/tools/bedtools and its home page on Github.

Versions

  • 2.30.0

  • 2.31.0

Commands

  • annotateBed

  • bamToBed

  • bamToFastq

  • bed12ToBed6

  • bedpeToBam

  • bedToBam

  • bedToIgv

  • bedtools

  • closestBed

  • clusterBed

  • complementBed

  • coverageBed

  • expandCols

  • fastaFromBed

  • flankBed

  • genomeCoverageBed

  • getOverlap

  • groupBy

  • intersectBed

  • linksBed

  • mapBed

  • maskFastaFromBed

  • mergeBed

  • multiBamCov

  • multiIntersectBed

  • nucBed

  • pairToBed

  • pairToPair

  • randomBed

  • shiftBed

  • shuffleBed

  • slopBed

  • sortBed

  • subtractBed

  • tagBam

  • unionBedGraphs

  • windowBed

  • windowMaker

Module

You can load the modules by:

module load biocontainers
module load bedtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bedtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bedtools

bedtools intersect -a a.bed -b b.bed
bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed

Bioawk

Introduction

Bioawk is an extension to Brian Kernighan’s awk, adding the support of several common biological data formats, including optionally gzip’ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.

For more information, please check its website: https://biocontainers.pro/tools/bioawk and its home page on Github.

Versions

  • 1.0

Commands

  • bioawk

Module

You can load the modules by:

module load biocontainers
module load bioawk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bioawk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioawk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bioawk

bioawk -c fastx '{print ">"$name;print revcomp($seq)}' seq.fa.gz

Biobambam

Introduction

Biobambam is a collection of tools for early stage alignment file processing.

For more information, please check its website: https://biocontainers.pro/tools/biobambam and its home page on Gitlab.

Versions

  • 2.0.183

Commands

  • bam12auxmerge

  • bam12split

  • bam12strip

  • bamadapterclip

  • bamadapterfind

  • bamalignfrac

  • bamauxmerge

  • bamauxmerge2

  • bamauxsort

  • bamcat

  • bamchecksort

  • bamclipXT

  • bamclipreinsert

  • bamcollate2

  • bamdepth

  • bamdepthintersect

  • bamdifference

  • bamdownsamplerandom

  • bamexplode

  • bamexploderef

  • bamfastcat

  • bamfastexploderef

  • bamfastnumextract

  • bamfastsplit

  • bamfeaturecount

  • bamfillquery

  • bamfilteraux

  • bamfiltereofblocks

  • bamfilterflags

  • bamfilterheader

  • bamfilterheader2

  • bamfilterk

  • bamfilterlength

  • bamfiltermc

  • bamfilternames

  • bamfilterrefid

  • bamfilterrg

  • bamfixmateinformation

  • bamfixpairinfo

  • bamflagsplit

  • bamindex

  • bamintervalcomment

  • bamintervalcommenthist

  • bammapdist

  • bammarkduplicates

  • bammarkduplicates2

  • bammarkduplicatesopt

  • bammaskflags

  • bammdnm

  • bammerge

  • bamnumericalindex

  • bamnumericalindexstats

  • bamrank

  • bamranksort

  • bamrecalculatecigar

  • bamrecompress

  • bamrefextract

  • bamrefinterval

  • bamreheader

  • bamreplacechecksums

  • bamreset

  • bamscrapcount

  • bamseqchksum

  • bamsormadup

  • bamsort

  • bamsplit

  • bamsplitdiv

  • bamstreamingmarkduplicates

  • bamtofastq

  • bamvalidate

  • bamzztoname

Module

You can load the modules by:

module load biocontainers
module load biobambam

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Biobambam on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=biobambam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers biobambam

bammarkduplicates I=Aligned.sortedByCoord.out.bam O=out.bam D=duplicate_out

bamsort I=Aligned.sortedByCoord.out.bam O=sorted.bam sortthreads=8

bamtofastq filename=Aligned.sortedByCoord.out.bam outputdir=fastq_out

Bioconvert

Introduction

Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.

For more information, please check its website: https://biocontainers.pro/tools/bioconvert and its home page: https://bioconvert.readthedocs.io/en/master/.

Versions

  • 0.4.3

  • 0.5.2

  • 0.6.1

  • 0.6.2

Commands

  • bioconvert

Module

You can load the modules by:

module load biocontainers
module load bioconvert

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bioconvert on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioconvert
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bioconvert

bioconvert fastq2fasta input.fastq output.fa

Biopython

Introduction

Biopython is a set of freely available tools for biological computation written in Python.

For more information, please check its website: https://biocontainers.pro/tools/biopython and its home page: https://biopython.org.

Versions

  • 1.70-np112py27

  • 1.70-np112py36

  • 1.78

Commands

  • easy_install

  • f2py

  • f2py3

  • idle3

  • pip

  • pip3

  • pydoc

  • pydoc3

  • python

  • python3

  • python3-config

  • python3.9

  • python3.9-config

  • wheel

Module

You can load the modules by:

module load biocontainers
module load biopython

Interactive job

To run biopython interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers biopython
(base) UserID@bell-a008:~ $ python
Python 3.9.1 |  packaged by conda-forge |  (default, Jan 26 2021, 01:34:10)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> with open("input.gb") as input_handle:
...     for record in SeqIO.parse(input_handle, "genbank"):
...         print(record)

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Biopython on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=biopython
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers biopython

python script.py

Bismark

Introduction

Bismark is a tool to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step.

For more information, please check its website: https://biocontainers.pro/tools/bismark and its home page on Github.

Versions

  • 0.23.0

  • 0.24.0

Commands

  • bismark

  • bam2nuc

  • bismark2bedGraph

  • bismark2report

  • bismark2summary

  • bismark_genome_preparation

  • bismark_methylation_extractor

  • copy_bismark_files_for_release.pl

  • coverage2cytosine

  • deduplicate_bismark

  • filter_non_conversion

  • methylation_consistency

Dependencies

Bowtie 2 v2.4.2, Samtools v1.12, and HISAT2 v2.2.1 are included in the container image, so users do not need to provide the dependency paths in the bismark parameters.

Module

You can load the modules by:

module load biocontainers
module load bismark

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bismark on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bismark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bismark

bismark_genome_preparation --bowtie2 data/ref_genome

bismark --multicore 12 --genome data/ref_genome seq.fastq

Blasr

Introduction

Blasr is a read mapping program that maps reads to positions in a genome by clustering short exact matches between the read and the genome, and scoring clusters using alignment.

For more information, please check its website: https://biocontainers.pro/tools/blasr and its home page on Github.

Versions

  • 5.3.5

Commands

  • blasr

Module

You can load the modules by:

module load biocontainers
module load blasr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Blasr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=blasr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers blasr

blasr reads.bas.h5  ecoli_K12.fasta -sam

BLAST

Introduction

BLAST (Basic Local Alignment Search Tool) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.

Versions

  • 2.11.0

  • 2.13.0

Commands

  • blastn

  • blastp

  • blastx

  • blast_formatter

  • amino-acid-composition

  • between-two-genes

  • blastdbcheck

  • blastdbcmd

  • blastdb_aliastool

  • cleanup-blastdb-volumes.py

  • deltablast

  • dustmasker

  • eaddress

  • eblast

  • get_species_taxids.sh

  • legacy_blast.pl

  • makeblastdb

  • makembindex

  • makeprofiledb

  • psiblast

  • rpsblast

  • rpstblastn

  • run-ncbi-converter

  • segmasker

  • tblastn

  • tblastx

  • update_blastdb.pl

  • windowmasker

Module

You can load the modules by:

module load biocontainers
module load blast

BLAST Databases

Local copies of the BLAST databases can be found in the directory /depot/itap/datasets/blast/latest/. The environment variable BLASTDB is also set to /depot/itap/datasets/blast/latest/. To use the cdd_delta, env_nr, env_nt, nr, nt, pataa, patnt, pdbnt, refseq_protein, refseq_rna, swissprot, or tsa_nt databases, you do not need to provide the database path; just pass the database name, for example -db nr.
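
Before submitting a long search, it can help to confirm that a database name resolves under the BLASTDB directory. This is a minimal sketch under stated assumptions: has_blast_db is a hypothetical helper, and it only checks that some file named <db>.* exists, which is a rough proxy rather than a full validity check.

```shell
#!/bin/bash
# Hypothetical helper: succeed if any file named <db>.* exists in the
# BLASTDB directory (falls back to the path given in this section).
has_blast_db() {
    local dir="${BLASTDB:-/depot/itap/datasets/blast/latest}"
    compgen -G "$dir/$1.*" > /dev/null
}

if has_blast_db nr; then
    echo "nr found; use: -db nr"
else
    echo "nr not found under ${BLASTDB:-/depot/itap/datasets/blast/latest}" >&2
fi
```

For a more thorough check, the blastdbcheck command listed above can be run against the database after loading the blast module.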

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BLAST on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers blast

blastp -query protein.fasta -db nr -out test_out -num_threads 4
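
One way to keep -num_threads in step with whatever #SBATCH -n requests is to read the value from the job environment. This is a sketch that assumes Slurm sets SLURM_NTASKS inside the job; outside a job it falls back to 1. The blast_cmd array name is illustrative only.

```shell
#!/bin/bash
# SLURM_NTASKS is set by Slurm when -n/--ntasks is requested; default to 1 elsewhere.
threads="${SLURM_NTASKS:-1}"

# Build the command as an array so it can be inspected before running.
blast_cmd=(blastp -query protein.fasta -db nr -out test_out -num_threads "$threads")
echo "${blast_cmd[@]}"

# Inside a real job, run it with: "${blast_cmd[@]}"
```

The same pattern applies to other tools in this guide that take a thread count, so the job script needs editing in only one place when the core request changes.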

BlobTools

Introduction

BlobTools is a modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.

Detailed usage can be found here: https://github.com/DRL/blobtools

Versions

  • 1.1.1

Commands

  • blobtools

Module

You can load the modules by:

module load biocontainers
module load blobtools/1.1.1

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run blobtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blobtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers blobtools/1.1.1

blobtools create -i example/assembly.fna -b example/mapping_1.sorted.bam -t example/blast.out -o test && \
blobtools view -i test.blobDB.json && \
blobtools plot -i test.blobDB.json

Bmge

Introduction

Bmge is a program that selects regions in a multiple sequence alignment that are suited for phylogenetic inference.

For more information, please check its website: https://biocontainers.pro/tools/bmge and its home page: https://bioweb.pasteur.fr/packages/pack@BMGE@1.12.

Versions

  • 1.12

Commands

  • bmge

Module

You can load the modules by:

module load biocontainers
module load bmge

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bmge on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bmge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bmge

bmge -i seq.fa -t AA -o out.phy

Bowtie

Introduction

Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).

For more information, please check its website: https://biocontainers.pro/tools/bowtie and its home page: http://bowtie-bio.sourceforge.net/.

Versions

  • 1.3.1

Commands

  • bowtie

  • bowtie-build

  • bowtie-inspect

Module

You can load the modules by:

module load biocontainers
module load bowtie

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bowtie on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bowtie

bowtie-build ref.fasta ref
bowtie -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam

Bowtie 2

Introduction

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.

For more information, please check its website: https://biocontainers.pro/tools/bowtie2 and its home page on Github.

Versions

  • 2.4.2

  • 2.5.1

Commands

  • bowtie2

  • bowtie2-build

  • bowtie2-inspect

Module

You can load the modules by:

module load biocontainers
module load bowtie2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bowtie 2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bowtie2

bowtie2-build ref.fasta ref
bowtie2 -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam

Bracken

Introduction

Bracken (Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.

Detailed usage can be found here: https://github.com/jenniferlu717/Bracken

Note

The bracken container image also includes kraken2. As a result, loading bracken/2.6.1-py37 automatically makes kraken2 version 2.1.1 available. To avoid conflicts, do not load the kraken2 module together with the bracken module.

Versions

  • 2.6.1

  • 2.7

Commands

  • bracken

  • bracken-build

  • combine_bracken_outputs.py

  • kraken2

  • kraken2-build

  • kraken2-inspect

  • est_abundance.py

  • generate_kmer_distribution.py

Module

You can load the modules by:

module load biocontainers
module load bracken/2.6.1-py37

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bracken on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=bracken
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bracken/2.6.1-py37

DATABASE=minikraken2_v2_8GB_201904_UPDATE
kraken2 --threads 24 --report kraken2.report --db $DATABASE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
bracken -d $DATABASE -i kraken2.report -o bracken_output -w bracken.report

BRAKER

Introduction

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET and AUGUSTUS in novel eukaryotic genomes.

For more information, please check its GitHub repository: https://github.com/Gaius-Augustus/BRAKER.

Versions

  • 2.1.6

Commands

  • braker.pl

Helper command

Note

Since BRAKER is a pipeline that trains AUGUSTUS, i.e. writes species-specific parameter files, BRAKER needs write access to the AUGUSTUS configuration directory that contains such files. This installation comes with a stub of AUGUSTUS configuration files, but you must copy them out of the container into a location where you have write permissions.

A helper command copy_augustus_config is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:

$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
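
Because the copy is only needed once, the procedure above can be made safe to repeat in every job. The following is a minimal sketch, assuming the config directory is created as augustus/config by copy_augustus_config as shown above; the config_root variable and the fallback to a temporary directory when RCAC_SCRATCH is unset are illustrative assumptions.

```shell
#!/bin/bash
# Use scratch space when available; the fallback is for illustration only.
config_root="${RCAC_SCRATCH:-${TMPDIR:-/tmp}}/augustus"

# Copy the stub configuration only if it is not already in place.
if [ ! -d "$config_root/config" ]; then
    mkdir -p "$config_root"
    if command -v copy_augustus_config > /dev/null 2>&1; then
        copy_augustus_config "$config_root"
    else
        echo "copy_augustus_config not found; load the module first" >&2
    fi
fi

export AUGUSTUS_CONFIG_PATH="$config_root/config"
echo "AUGUSTUS_CONFIG_PATH=$AUGUSTUS_CONFIG_PATH"
```

Skipping the copy when the directory exists also avoids overwriting species parameters written by earlier training runs.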

Module

You can load the modules by:

module load biocontainers
module load braker2/2.1.6

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BRAKER on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=BRAKER2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers braker2/2.1.6

# The augustus config step is only required the first time you use BRAKER2
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config

braker.pl --genome genome.fa --bam RNAseq.bam --softmasking --cores 24

Brass

Introduction

Brass is used to analyze one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.

For more information, please check its website: https://quay.io/repository/wtsicgp/brass and its home page on Github.

Versions

  • 6.3.4

Commands

  • brass-assemble

  • brass_bedpe2vcf.pl

  • brass_foldback_reads.pl

  • brass-group

  • brassI_filter.pl

  • brassI_np_in.pl

  • brassI_pre_filter.pl

  • brassI_prep_bam.pl

  • brass.pl

Module

You can load the modules by:

module load biocontainers
module load brass

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Brass on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=brass
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers brass

brass.pl -c 4 -o myout -t tumour.bam -n normal.bam

Breseq

Introduction

Breseq is a computational pipeline for the analysis of short-read re-sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/breseq and its home page on Github.

Versions

  • 0.36.1

Commands

  • breseq

Module

You can load the modules by:

module load biocontainers
module load breseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Breseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=breseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers breseq
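
The job script above loads the module but does not run a command. A possible invocation is sketched below; the file names reference.gbk, reads.fastq and breseq_out are hypothetical placeholders, and the input check is an added convenience rather than part of breseq.

```shell
#!/bin/bash
# Hypothetical input and output names; replace with your own data.
reference=reference.gbk
reads=reads.fastq
outdir=breseq_out

# Stop early if an input file is missing or unreadable.
missing=0
for f in "$reference" "$reads"; do
    [ -r "$f" ] || { echo "missing input: $f" >&2; missing=1; }
done

if [ "$missing" -eq 0 ]; then
    # -r: reference sequence, -o: output directory, -j: number of processors
    breseq -j "${SLURM_NTASKS:-1}" -r "$reference" -o "$outdir" "$reads"
fi
```

Checking inputs first avoids spending queue time on a job that would fail immediately.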

BUSCO

Introduction

BUSCO (Benchmarking sets of Universal Single-Copy Orthologs) provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs.

Detailed information can be found here: https://gitlab.com/ezlab/busco/

Versions

  • 5.2.2

  • 5.3.0

  • 5.4.1

  • 5.4.3

  • 5.4.4

  • 5.4.5

Commands

  • busco

  • generate_plot.py

Helper command

Note

Augustus is a gene prediction program for eukaryotes that is required by BUSCO. Augustus requires a writable configuration directory. This installation comes with a stub of AUGUSTUS configuration files, but you must copy them out of the container into a location where you have write permissions.

A helper command copy_augustus_config is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:

$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config

Module

You can load the modules by:

module load biocontainers
module load busco

Example job for prokaryotic genomes

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BUSCO on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers busco

## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets

## run the evaluation
busco -f -c 12 -l actinobacteria_class_odb10  -i bacteria_genome.fasta -o busco_out -m genome

## generate a simple summary plot
generate_plot.py -wd busco_out

Example job for eukaryotic genomes

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BUSCO on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers busco

## The Augustus config step is only required the first time you use BUSCO
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus

## This is required for eukaryotic genomes
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config

## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets

## run the evaluation
busco -f -c 12 -l fungi_odb10 -i fungi_protein.fasta -o busco_out_protein  -m protein
busco -f -c 12 --augustus -l fungi_odb10 -i fungi_genome.fasta -o busco_out_genome  -m genome

## generate a simple summary plot
generate_plot.py -wd busco_out_protein
generate_plot.py -wd busco_out_genome

Bustools

Introduction

Bustools is a program for manipulating BUS files for single cell RNA-Seq datasets.

For more information, please check its website: https://biocontainers.pro/tools/bustools and its home page on GitHub.

Versions

  • 0.41.0

Commands

  • bustools

Module

You can load the modules by:

module load biocontainers
module load bustools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Bustools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bustools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bustools

bustools capture -s -o cDNA_capture.bus -c cDNA_transcripts.to_capture.txt -e matrix.ec -t transcripts.txt output.correct.sort.bus
bustools count -o u -g cDNA_introns_t2g.txt -e matrix.ec -t transcripts.txt --genecounts cDNA_capture.bus

BWA

Introduction

BWA (Burrows-Wheeler Aligner) is a fast, accurate, memory-efficient aligner for short and long sequencing reads.

For more information, please check its website: https://biocontainers.pro/tools/bwa and its home page: http://bio-bwa.sourceforge.net.

Versions

  • 0.7.17

Commands

  • bwa

  • qualfa2fq.pl

  • xa2multi.pl

Module

You can load the modules by:

module load biocontainers
module load bwa

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BWA on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bwa

bwa index ref.fasta
bwa mem ref.fasta input.fq > test.sam
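
The example above aligns single-end reads on one thread. For paired-end data, bwa mem accepts two FASTQ files, and its -t option sets the number of threads. The sketch below uses hypothetical file names and a hypothetical wrapper name, and assumes #SBATCH -n has been raised to the desired thread count (Slurm sets SLURM_NTASKS when -n is given).

```shell
#!/bin/bash
# Hypothetical wrapper: index the reference if needed, then align
# paired-end reads with as many threads as Slurm tasks.
bwa_mem_paired() {
    local ref="$1" r1="$2" r2="$3" out="$4"
    local threads="${SLURM_NTASKS:-1}"
    # bwa index writes several files next to the reference, including <ref>.bwt
    [ -e "$ref.bwt" ] || bwa index "$ref"
    bwa mem -t "$threads" "$ref" "$r1" "$r2" > "$out"
}

# Example call inside the job script:
# bwa_mem_paired ref.fasta reads_1.fq reads_2.fq paired.sam
```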

Bwameth

Introduction

Bwameth is a tool for fast and accurate alignment of BS-Seq reads.

For more information, please check:

Versions

  • 0.2.5

Commands

  • bwameth.py

Module

You can load the modules by:

module load biocontainers
module load bwameth

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run bwameth on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwameth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers bwameth
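
The job script above loads the module but does not run an alignment. A typical bwameth run indexes the reference once and then aligns the reads against it. The sketch below follows the usage shown in the bwameth README; the file names and the wrapper name run_bwameth are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: build the bisulfite index, then align paired reads.
run_bwameth() {
    local ref="$1" r1="$2" r2="$3" out="$4"
    bwameth.py index "$ref"
    bwameth.py --threads "${SLURM_NTASKS:-1}" --reference "$ref" "$r1" "$r2" > "$out"
}

# Example call inside the job script:
# run_bwameth ref.fa reads_R1.fastq.gz reads_R2.fastq.gz aligned.sam
```

The index step only needs to be run once per reference, so it can be dropped from later jobs.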

Cactus

Introduction

Cactus is a reference-free whole-genome multiple alignment program.

For more information, please check its website: https://biocontainers.pro/tools/cactus and its home page on GitHub.

Versions

  • 2.0.5

  • 2.2.1

  • 2.2.3-gpu

  • 2.2.3

  • 2.4.0-gpu

  • 2.4.0

Commands

  • cactus

  • cactus-align

  • cactus-align-batch

  • cactus-blast

  • cactus-graphmap

  • cactus-graphmap-join

  • cactus-graphmap-split

  • cactus-minigraph

  • cactus-prepare

  • cactus-prepare-toil

  • cactus-preprocess

  • cactus-refmap

  • cactus2hal-stitch.sh

  • cactus2hal.py

  • cactusAPITests

  • cactus_analyseAssembly

  • cactus_barTests

  • cactus_batch_mergeChunks

  • cactus_chain

  • cactus_consolidated

  • cactus_covered_intervals

  • cactus_fasta_fragments.py

  • cactus_fasta_softmask_intervals.py

  • cactus_filterSmallFastaSequences.py

  • cactus_halGeneratorTests

  • cactus_local_alignment.py

  • cactus_makeAlphaNumericHeaders.py

  • cactus_softmask2hardmask

Module

You can load the modules by:

module load biocontainers
module load cactus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cactus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cactus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cactus

wget https://raw.githubusercontent.com/ComparativeGenomicsToolkit/cactus/master/examples/evolverMammals.txt
cactus jobStore evolverMammals.txt evolverMammals.hal

Cafe

Introduction

Cafe is a computational tool for the study of gene family evolution.

For more information, please check its website: https://biocontainers.pro/tools/cafe and its home page on GitHub.

Versions

  • 4.2.1

  • 5.0.0

Commands

  • cafe

Module

You can load the modules by:

module load biocontainers
module load cafe

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cafe on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cafe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cafe

#To get a list of commands just call CAFE with the -h or --help arguments
cafe5 -h

#To estimate lambda with no among-family rate variation, run:
cafe5 -i mammal_gene_families.txt -t mammal_tree.txt

Canu

Introduction

Canu is a single molecule sequence assembler for genomes large and small.

Detailed usage can be found here: https://github.com/marbl/canu

Versions

  • 2.1.1

  • 2.2

Commands

  • canu

Module

You can load the modules by:

module load biocontainers
module load canu/2.2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run canu on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=canu
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers canu/2.2

canu -p Cm -d clavibacter_pacbio genomeSize=3.4m  -pacbio *.fastq

Ccs

Introduction

Pbccs is a tool to generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads).

For more information, please check:

Versions

  • 6.4.0

Commands

  • ccs

Module

You can load the modules by:

module load biocontainers
module load ccs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ccs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ccs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ccs

ccs --all subreads.bam ccs.bam

Cdbtools

Introduction

Cdbtools is a collection of tools used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files.

For more information, please check its website: https://biocontainers.pro/tools/cdbtools and its home page: http://compbio.dfci.harvard.edu/tgi.

Versions

  • 0.99

Commands

  • cdbfasta

  • cdbyank

Module

You can load the modules by:

module load biocontainers
module load cdbtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cdbtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cdbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cdbtools

cdbfasta genome.fa
cdbyank -a 'seq_1' genome.fa.cidx

Cd-hit

Introduction

Cd-hit is a very widely used program for clustering and comparing protein or nucleotide sequences.

For more information, please check its website: https://biocontainers.pro/tools/cd-hit and its home page on GitHub.

Versions

  • 4.8.1

Commands

  • FET.pl

  • cd-hit

  • cd-hit-2d

  • cd-hit-2d-para.pl

  • cd-hit-454

  • cd-hit-clstr_2_blm8.pl

  • cd-hit-div

  • cd-hit-div.pl

  • cd-hit-est

  • cd-hit-est-2d

  • cd-hit-para.pl

  • clstr2tree.pl

  • clstr2txt.pl

  • clstr2xml.pl

  • clstr_cut.pl

  • clstr_list.pl

  • clstr_list_sort.pl

  • clstr_merge.pl

  • clstr_merge_noorder.pl

  • clstr_quality_eval.pl

  • clstr_quality_eval_by_link.pl

  • clstr_reduce.pl

  • clstr_renumber.pl

  • clstr_rep.pl

  • clstr_reps_faa_rev.pl

  • clstr_rev.pl

  • clstr_select.pl

  • clstr_select_rep.pl

  • clstr_size_histogram.pl

  • clstr_size_stat.pl

  • clstr_sort_by.pl

  • clstr_sort_prot_by.pl

  • clstr_sql_tbl.pl

  • clstr_sql_tbl_sort.pl

  • make_multi_seq.pl

  • plot_2d.pl

  • plot_len1.pl

Module

You can load the modules by:

module load biocontainers
module load cd-hit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cd-hit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cd-hit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cd-hit

cd-hit -i Cm_pep.fasta  -o Cmdb90 -c 0.9 -n 5 -M 16000 -T 8

cd-hit-est -i Cm_dna.fasta  -o Cmdb90_nt -c 0.9 -n 5 -M 16000 -T 8
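
cd-hit writes the representative sequences to the file named by -o and the cluster membership to a companion file with a .clstr suffix, in which every cluster begins with a line starting with >Cluster. Counting those header lines is a quick check on how many clusters a run produced. The helper name below is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count clusters in a cd-hit .clstr file.
count_clusters() {
    local clstr="${1:?usage: count_clusters <file.clstr>}"
    grep -c '^>Cluster' "$clstr"
}

# Example calls after the two commands above:
# count_clusters Cmdb90.clstr
# count_clusters Cmdb90_nt.clstr
```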

Cegma

Introduction

CEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a highly reliable set of gene annotations in virtually any eukaryotic genome.

For more information, please check:

Versions

  • 2.5

Commands

  • cegma

Module

You can load the modules by:

module load biocontainers
module load cegma

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cegma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cegma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cegma

cegma --genome genome.fasta -o output

Cellbender

Introduction

Cellbender is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.

For more information, please check its website: https://biocontainers.pro/tools/cellbender and its home page on GitHub.

Versions

  • 0.2.0

  • 0.2.2

Commands

  • cellbender

Module

You can load the modules by:

module load biocontainers
module load cellbender

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cellbender on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellbender
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellbender

cellbender remove-background \
             --input cellranger/test_count/run_count_1kpbmcs/outs/raw_feature_bc_matrix.h5 \
             --output output_cpu.h5 \
             --expected-cells 1000 \
             --total-droplets-included 20000 \
             --fpr 0.01 \
             --epochs 150

Cellphonedb

Introduction

CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions.

For more information, please check:

Versions

  • 2.1.7

Commands

  • cellphonedb

Module

You can load the modules by:

module load biocontainers
module load cellphonedb

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cellphonedb on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellphonedb
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellphonedb

Cellranger

Introduction

Cellranger is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more. Detailed usage can be found here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger.

Versions

  • 6.0.1

  • 6.1.1

  • 6.1.2

  • 7.0.0

  • 7.0.1

  • 7.1.0

Commands

  • cellranger mkfastq

  • cellranger count

  • cellranger aggr

  • cellranger reanalyze

  • cellranger multi

Module

You can load the modules by:

module load biocontainers
module load cellranger

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cellranger on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 48
#SBATCH --job-name=cellranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellranger

cellranger count --id=run_count_1kpbmcs --fastqs=pbmc_1k_v3_fastqs --sample=pbmc_1k_v3 --transcriptome=refdata-gex-GRCh38-2020-A

Cellranger-arc

Introduction

Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell Multiome ATAC + Gene Expression sequencing data to generate a variety of analyses pertaining to gene expression (GEX), chromatin accessibility, and their linkage. Furthermore, since the ATAC and GEX measurements are on the very same cell, we are able to perform analyses that link chromatin accessibility and GEX.

Versions

  • 2.0.2

Commands

  • cellranger-arc

Module

You can load the modules by:

module load biocontainers
module load cellranger-arc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cellranger-arc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-arc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellranger-arc

Cellranger-atac

Introduction

Cellranger-atac is a set of analysis pipelines that process Chromium Single Cell ATAC data.

Versions

  • 2.0.0

  • 2.1.0

Commands

  • cellranger-atac

Module

You can load the modules by:

module load biocontainers
module load cellranger-atac

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cellranger-atac on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --mem=64G
#SBATCH --job-name=cellranger-atac
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellranger-atac

cellranger-atac count --id=sample345 \
                    --reference=refdata-cellranger-arc-GRCh38-2020-A-2.0.0 \
                    --fastqs=runs/HAWT7ADXX/outs/fastq_path \
                    --sample=mysample \
                    --localcores=8 \
                    --localmem=64
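
The --localcores and --localmem values above repeat the numbers given to #SBATCH -n and --mem, so the two can drift apart when the script is edited. A minimal sketch that derives them from the Slurm environment instead, assuming the job was submitted with both -n and --mem (Slurm then sets SLURM_NTASKS and SLURM_MEM_PER_NODE, the latter in megabytes). The function name and the fallback values are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build the --localcores/--localmem options
# from the resources Slurm granted to the job.
slurm_local_args() {
    local cores="${SLURM_NTASKS:-1}"
    local mem_mb="${SLURM_MEM_PER_NODE:-4096}"   # arbitrary fallback outside Slurm
    # --localmem is given in GB, SLURM_MEM_PER_NODE in MB
    echo "--localcores=$cores --localmem=$((mem_mb / 1024))"
}
```

In the job script, $(slurm_local_args) can then take the place of the two hard-coded options at the end of the cellranger-atac count command.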

Cellranger-dna

Introduction

Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.

Versions

  • 1.1.0

Commands

  • cellranger-dna

Module

You can load the modules by:

module load biocontainers
module load cellranger-dna

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cellranger-dna on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-dna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellranger-dna

CellRank

Introduction

CellRank is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data. Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.

Versions

  • 1.5.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load cellrank/1.5.1

Note

The CellRank container also contains scVelo and scanpy. When you want to use CellRank, do not load the scVelo or scanpy modules.

Interactive job

To run CellRank interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 |  packaged by conda-forge |  (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellrank/1.5.1

python script.py

CellRank-krylov

Introduction

CellRank is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data. CellRank-krylov is CellRank installed with extra libraries that give it better performance on large datasets (>15k cells). Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.

Versions

  • 1.5.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load cellrank-krylov/1.5.1

Note

The CellRank container also contains scVelo and scanpy. When you want to use CellRank, do not load the scVelo or scanpy modules.

Interactive job

To run CellRank-krylov interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank-krylov/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 |  packaged by conda-forge |  (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank-krylov
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellrank-krylov/1.5.1

python script.py

cellSNP

Introduction

cellSNP aims to pileup the expressed alleles in single-cell or bulk RNA-seq data, which can be directly used for donor deconvolution in multiplexed single-cell RNA-seq data, particularly with vireo, which assigns cells to donors and detects doublets, even without genotyping reference.

For more information, please check its website: https://biocontainers.pro/tools/cellsnp-lite and its home page on GitHub.

Versions

  • 1.2.2

Commands

  • cellsnp-lite

Module

You can load the modules by:

module load biocontainers
module load cellsnp-lite

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cellSNP on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cellsnp-lite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cellsnp-lite

cellsnp-lite -s sample.bam -b barcode.tsv -O cellsnp_out -p 8 --minMAF 0.1 --minCOUNT 100

Celltypist

Introduction

Celltypist is a tool for semi-automatic cell type annotation.

For more information, please check its website: https://biocontainers.pro/tools/celltypist and its home page on GitHub.

Versions

  • 0.2.0

  • 1.1.0

Commands

  • celltypist

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load celltypist

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Celltypist on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=celltypist
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers celltypist

celltypist --indata demo_2000_cells.h5ad --model Immune_All_Low.pkl --outdir output

Centrifuge

Introduction

Centrifuge is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers.

For more information, please check its website: https://biocontainers.pro/tools/centrifuge and its home page: http://www.ccb.jhu.edu/software/centrifuge/.

Versions

  • 1.0.4_beta

Commands

  • centrifuge

  • centrifuge-BuildSharedSequence.pl

  • centrifuge-RemoveEmptySequence.pl

  • centrifuge-RemoveN.pl

  • centrifuge-build

  • centrifuge-build-bin

  • centrifuge-class

  • centrifuge-compress.pl

  • centrifuge-download

  • centrifuge-inspect

  • centrifuge-inspect-bin

  • centrifuge-kreport

  • centrifuge-sort-nt.pl

  • centrifuge_evaluate.py

  • centrifuge_simulate_reads.py

Module

You can load the modules by:

module load biocontainers
module load centrifuge

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Centrifuge on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=centrifuge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers centrifuge

centrifuge-download -o taxonomy taxonomy
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
cat library/*/*.fna > input-sequences.fna
centrifuge-build -p 8 --conversion-table seqid2taxid.map \
             --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
             input-sequences.fna abv
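
The commands above only build the abv index. Classifying reads against it is a separate step. A minimal sketch with hypothetical input and output names, using centrifuge's -x (index basename), -U (unpaired reads), -S (classification output), --report-file and -p (threads) options. The wrapper name is hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: classify unpaired reads against a Centrifuge index.
classify_reads() {
    local index="$1" reads="$2" prefix="$3"
    centrifuge -p "${SLURM_NTASKS:-1}" -x "$index" -U "$reads" \
        -S "$prefix.classification.txt" --report-file "$prefix.report.txt"
}

# Example call once centrifuge-build has finished:
# classify_reads abv sample.fastq sample
```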

Cfsan-snp-pipeline

Introduction

The CFSAN SNP Pipeline is a Python-based system for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced from samples of interest to food safety.

Versions

  • 2.2.1

Commands

  • cfsan_snp_pipeline

Module

You can load the modules by:

module load biocontainers
module load cfsan-snp-pipeline

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cfsan-snp-pipeline on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cfsan-snp-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cfsan-snp-pipeline

Checkm-genome

Introduction

CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.

For more information, please check:

Versions

  • 1.2.0

  • 1.2.2

Commands

  • checkm

Module

You can load the modules by:

module load biocontainers
module load checkm-genome

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run checkm-genome on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=checkm-genome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers checkm-genome

checkm lineage_wf -t 8 -x fa bins checkm

Chewbbaca

Introduction

chewBBACA is a comprehensive pipeline including a set of functions for the creation and validation of whole genome and core genome MultiLocus Sequence Typing (wg/cgMLST) schemas, providing an allele calling algorithm based on Blast Score Ratio that can be run in multiprocessor settings and a set of functions to visualize and validate allele variation in the loci. chewBBACA performs the schema creation and allele calls on complete or draft genomes resulting from de novo assemblers.

For more information, please check:

Versions

  • 2.8.5

Commands

  • chewBBACA.py

Module

You can load the modules by:

module load biocontainers
module load chewbbaca

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run chewbbaca on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=chewbbaca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers chewbbaca

chewBBACA.py CreateSchema -i complete_genomes/ -o tutorial_schema --ptf Streptococcus_agalactiae.trn --cpu 4
chewBBACA.py AlleleCall -i complete_genomes/ -g tutorial_schema/schema_seed -o results32_wgMLST --cpu 4

Chopper

Introduction

Chopper is a Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long-read sequencing such as PacBio or ONT, filters and trims a fastq file. Filtering is done on average read quality and minimal or maximal read length, with an optional headcrop (start of read) and tailcrop (end of read); the reads that pass the filter are printed.

For more information, please check:

Versions

  • 0.2.0

Commands

  • chopper

Module

You can load the modules by:

module load biocontainers
module load chopper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run chopper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers chopper
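
The job script above ends after loading the module. chopper reads FASTQ on standard input and prints the reads that pass the filters to standard output, so it is normally used in a pipe. The sketch below follows the pattern in the chopper README; the file names, the thresholds (-q for minimum average quality, -l for minimum length) and the wrapper name are hypothetical choices.

```shell
#!/bin/bash
# Hypothetical wrapper: filter a gzipped FASTQ file through chopper.
filter_reads() {
    local input="$1" output="$2"
    gunzip -c "$input" | chopper -q 10 -l 500 | gzip > "$output"
}

# Example call inside the job script:
# filter_reads reads.fastq.gz filtered_reads.fastq.gz
```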

Chromap

Introduction

Chromap is an ultrafast method for aligning and preprocessing high throughput chromatin profiles.

For more information, please check:

Versions

  • 0.2.2

Commands

  • chromap

Module

You can load the modules by:

module load biocontainers
module load chromap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run chromap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chromap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers chromap
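
The job script above loads the module without running chromap. A typical run builds an index for the reference and then maps the reads with one of the presets. The sketch below follows the ATAC-seq usage in the chromap README (-i to build an index, --preset atac, -x index, -r reference, -1/-2 reads, -o output, -t threads). The file names, the index naming scheme and the wrapper name are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: index the reference if needed, then map
# paired-end ATAC-seq reads and write fragments in BED format.
run_chromap_atac() {
    local ref="$1" r1="$2" r2="$3" out="$4"
    local index="${ref%.*}.index"      # e.g. ref.fa -> ref.index
    [ -e "$index" ] || chromap -i -r "$ref" -o "$index"
    chromap --preset atac -t "${SLURM_NTASKS:-1}" -x "$index" -r "$ref" \
        -1 "$r1" -2 "$r2" -o "$out"
}

# Example call inside the job script:
# run_chromap_atac ref.fa read1.fq read2.fq aln.bed
```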

CICERO

Introduction

CICERO (Clipped-reads Extended for RNA Optimization) is an assembly-based algorithm to detect diverse classes of driver gene fusions from RNA-seq.

For more information, please check its home page on GitHub.

Versions

  • 1.8.1

Commands

  • Cicero.sh

Module

You can load the modules by:

module load biocontainers
module load cicero

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run CICERO on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cicero
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cicero

Circexplorer2

Introduction

CIRCexplorer2 is a comprehensive and integrative circular RNA analysis toolset. It is the successor of CIRCexplorer with plenty of new features to facilitate circular RNA identification and characterization.

For more information, please check:

Versions

  • 2.3.8

Commands

  • CIRCexplorer2

  • fast_circ.py

  • fetch_ucsc.py

Module

You can load the modules by:

module load biocontainers
module load circexplorer2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run circexplorer2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circexplorer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers circexplorer2
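
The job script above stops after loading the module. A basic CIRCexplorer2 analysis has two steps: parse extracts back-spliced junctions from an aligner's chimeric output, and annotate matches them against a gene annotation. The sketch below follows the CIRCexplorer2 documentation for STAR output; the input file names and the wrapper name are hypothetical.

```shell
#!/bin/bash
# Hypothetical wrapper: parse STAR chimeric junctions, then annotate
# the resulting back-spliced junctions.
run_circexplorer2() {
    local junctions="$1" annotation="$2" genome="$3"
    CIRCexplorer2 parse -t STAR -b back_spliced_junction.bed "$junctions" > parse.log
    CIRCexplorer2 annotate -r "$annotation" -g "$genome" \
        -b back_spliced_junction.bed -o circularRNA_known.txt > annotate.log
}

# Example call inside the job script:
# run_circexplorer2 Chimeric.out.junction ref_all.txt genome.fa
```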

Circlator

Introduction

Circlator is a tool to circularize genome assemblies.

For more information, please check its Docker hub: https://hub.docker.com/r/sangerpathogens/circlator and its home page on GitHub.

Versions

  • 1.5.5

Commands

  • circlator

  • python3

Module

You can load the modules by:

module load biocontainers
module load circlator

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Circlator on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circlator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers circlator

circlator minimus2  minimus2_test_run_minimus2.in.fa  minimus2_test

Circompara2

Introduction

CirComPara2 is a computational pipeline to detect, quantify, and correlate expression of linear and circular RNAs from RNA-seq data that combines multiple circRNA-detection methods.

For more information, please check:

Versions

  • 0.1.2.1

Commands

  • python

  • Rscript

  • circompara2

  • CIRCexplorer2

  • CIRCexplorer_compare.R

  • CIRI.pl

  • DCC

  • DCC_patch_CombineCounts.py

  • QRE_finder.py

  • STAR

  • bedtools

  • bowtie

  • bowtie-build

  • bowtie-inspect

  • bowtie2

  • bowtie2-build

  • bowtie2-inspect

  • bwa

  • ccp_circrna_expression.R

  • cfinder_compare.R

  • chimoutjunc_to_bed.py

  • ciri_compare.R

  • collect_read_stats.R

  • convert_circrna_collect_tables.py

  • cuffcompare

  • cuffdiff

  • cufflinks

  • cuffmerge

  • cuffnorm

  • cuffquant

  • dcc_compare.R

  • dcc_fix_strand.R

  • fasta_len.py

  • fastq_rev_comp.py

  • fastqc

  • filterCirc.awk

  • filterSpliceSiteCircles.pl

  • filter_and_cast_circexp.R

  • filter_fastq_reads.py

  • filter_findcirc_res.R

  • filter_segemehl.R

  • find_circ.py

  • findcirc_compare.R

  • gene_annotation.R

  • get_ce2_bwa_bks_reads.R

  • get_ce2_bwa_circ_reads.py

  • get_ce2_segemehl_bks_reads.R

  • get_ce2_star_bks_reads.R

  • get_ce2_th_bks_reads.R

  • get_circompara_counts.R

  • get_circrnaFinder_bks_reads.R

  • get_ciri_bks_reads.R

  • get_dcc_bks_reads.R

  • get_findcirc_bks_reads.R

  • get_gene_expression_files.R

  • get_stringtie_rawcounts.R

  • gffread

  • gtfToGenePred

  • gtf_collapse_features.py

  • gtf_to_sam

  • haarz.x

  • hisat2

  • hisat2-build

  • htseq-count

  • install_R_libs.R

  • nrForwardSplicedReads.pl

  • parallel

  • pip

  • postProcessStarAlignment.pl

  • samtools

  • samtools_v0

  • scons

  • segemehl.x

  • split_start_end_gtf.py

  • starCirclesToBed.pl

  • stringtie

  • testrealign_compare.R

  • tophat2

  • trim_read_header.py

  • trimmomatic-0.39.jar

  • unmapped2anchors.py

  • cf_filterChimout.awk

  • circompara

  • get_unmapped_reads_from_bam.sh

  • install_circompara

  • make_circrna_html

  • make_indexes

Module

You can load the modules by:

module load biocontainers
module load circompara2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run circompara2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circompara2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers circompara2

Circos

Introduction

Circos is a software package for visualizing data and information.

For more information, please check its website: https://biocontainers.pro/tools/circos and its home page: http://circos.ca.

Versions

  • 0.69.8

Commands

  • circos

Module

You can load the modules by:

module load biocontainers
module load circos

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Circos on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers circos

circos -conf circos.conf

Ciri2

Introduction

CIRI2: Circular RNA identification based on multiple seed matching

For more information, please check:

Versions

  • 2.0.6

Commands

  • CIRI2.pl

Module

You can load the modules by:

module load biocontainers
module load ciri2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ciri2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ciri2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ciri2

CIRIquant

Introduction

CIRIquant is a comprehensive analysis pipeline for circRNA detection and quantification in RNA-Seq data.

For more information, please check its Docker hub: https://hub.docker.com/r/mortreux/ciriquant and its home page on GitHub.

Versions

  • 1.1.2

Commands

  • CIRIquant

Module

You can load the modules by:

module load biocontainers
module load ciriquant

config.yml

All required dependencies are installed within the CIRIquant container image, but users still need to provide the paths to these executables in config.yml. Use the config.yml below as an example:

name: hg38
tools:
   bwa: /bin/bwa
   hisat2: /bin/hisat2
   stringtie: /bin/stringtie
   samtools: /usr/local/bin/samtools
reference:
   fasta: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
   gtf:  reference/Homo_sapiens.GRCh38.105.gtf
   bwa_index: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
   hisat_index: reference/hg38_hisat2
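
Because the reference entries are relative paths, it can help to list them and confirm that they resolve from the directory where the job runs. The following is a minimal sketch, assuming the flat two-level layout of the example config.yml; list_reference_paths is a hypothetical helper, and the awk parsing is not a general YAML parser.

```shell
# list_reference_paths (hypothetical helper): print the values found under
# the top-level "reference:" key of a config.yml laid out like the example.
list_reference_paths() {
    awk '
        /^reference:/ { in_ref = 1; next }  # start of the reference block
        /^[^ \t]/     { in_ref = 0 }        # any other top-level key ends it
        in_ref && NF >= 2 { print $2 }      # "key: value" line: print value
    ' "$1"
}
```

Running list_reference_paths config.yml prints one path per line. The bwa_index and hisat_index entries are typically index prefixes rather than single files, so a plain file-existence test may not be appropriate for them.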

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run CIRIquant on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 64
#SBATCH --job-name=ciriquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ciriquant

CIRIquant -t 64 -1 SRR12095148_1.fastq -2 SRR12095148_2.fastq --config config.yml -o Output -p test

Clair3

Introduction

Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.

For more information, please check:

Versions

  • 0.1-r11

  • 0.1-r12

Commands

  • run_clair3.sh

Module

You can load the modules by:

module load biocontainers
module load clair3

Model_path

Note

The pre-trained models are located in /opt/models/. Pass the chosen model as --model_path="/opt/models/MODEL_NAME".

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clair3 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=clair3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clair3

run_clair3.sh \
      --bam_fn=input.bam \
      --ref_fn=ref.fasta \
      --threads=12 \
      --platform=ont \
      --model_path="/opt/models/ont" \
      --output=output
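
To keep --threads in step with the number of tasks requested from Slurm, and to build --model_path from a model name, the two arguments can be derived rather than hard-coded. This is a sketch under stated assumptions: Slurm sets SLURM_NTASKS inside the job (the code falls back to 1 when it is unset), the models live under /opt/models/, and clair3_args is a hypothetical helper name.

```shell
# clair3_args (hypothetical helper): print the --threads and --model_path
# arguments, one per line, for the model name given as $1.
clair3_args() {
    model_name="$1"
    # SLURM_NTASKS is set by Slurm inside a job; fall back to 1 elsewhere.
    threads="${SLURM_NTASKS:-1}"
    printf '%s\n' "--threads=${threads}" "--model_path=/opt/models/${model_name}"
}
```

Inside the job script this could be used as run_clair3.sh --bam_fn=input.bam --ref_fn=ref.fasta --platform=ont --output=output $(clair3_args ont), leaving the command substitution unquoted so that each argument becomes a separate word.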

Clairvoyante

Introduction

Clairvoyante is a deep neural network based variant caller.

For more information, please check:

Versions

  • 1.02

Commands

  • clairvoyante.py

Module

You can load the modules by:

module load biocontainers
module load clairvoyante

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clairvoyante on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clairvoyante
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clairvoyante

cd training
clairvoyante.py callVarBam \
   --chkpnt_fn ../trainedModels/fullv3-illumina-novoalign-hg001+hg002-hg38/learningRate1e-3.epoch500 \
   --bam_fn ../testingData/chr21/chr21.bam \
   --ref_fn ../testingData/chr21/chr21.fa \
   --bed_fn ../testingData/chr21/chr21.bed \
   --call_fn chr21_calls.vcf \
   --ctgName chr21

Clearcnv

Introduction

ClearCNV: CNV calling from NGS panel data in the presence of ambiguity and noise.

For more information, please check:

Versions

  • 0.306

Commands

  • clearCNV

Module

You can load the modules by:

module load biocontainers
module load clearcnv

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clearcnv on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clearcnv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clearcnv

Clever-toolkit

Introduction

Clever-toolkit is a collection of tools to discover and genotype structural variations in genomes from paired-end sequencing reads. The main software is written in C++ with some auxiliary scripts in Python.

Versions

  • 2.4

Commands

  • clever

  • laser

  • bam-to-alignment-priors

  • split-priors-by-chromosome

  • clever-core

  • postprocess-predictions

  • evaluate-sv-predictions

  • split-reads

  • laser-core

  • laser-recalibrate

  • genotyper

  • insert-length-histogram

  • add-score-tags-to-bam

  • bam2fastq

  • remove-redundant-variations

  • precompute-distributions

  • extract-bad-reads

  • filter-variations

  • merge-to-vcf

  • multiline-to-xa

  • filter-bam

  • read-group-stats

Module

You can load the modules by:

module load biocontainers
module load clever-toolkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clever-toolkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clever-toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clever-toolkit

cat mapped.bam | bam2fastq output_1.fq output_2.fq

Clonalframeml

Introduction

ClonalFrameML is a software package that performs efficient inference of recombination in bacterial genomes.

For more information, please check:

Versions

  • 1.11

Commands

  • ClonalFrameML

Module

You can load the modules by:

module load biocontainers
module load clonalframeml

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clonalframeml on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clonalframeml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clonalframeml

Clust

Introduction

Clust is a fully automated method for identification of clusters (groups) of genes that are consistently co-expressed (well-correlated) in one or more heterogeneous datasets from one or multiple species.

For more information, please check:

Versions

  • 1.17.0

Commands

  • clust

Module

You can load the modules by:

module load biocontainers
module load clust

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run clust on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clust
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clust

Clustalw

Introduction

Clustalw is a general purpose multiple alignment program for DNA or proteins.

For more information, please check its website: https://biocontainers.pro/tools/clustalw and its home page: http://www.clustal.org/clustal2/.

Versions

  • 2.1

Commands

  • clustalw

Module

You can load the modules by:

module load biocontainers
module load clustalw

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Clustalw on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clustalw
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers clustalw

clustalw -tree -align -infile=seq.faa

CNVkit

Introduction

CNVkit is a command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from high-throughput sequencing.

For more information, please check its website: https://biocontainers.pro/tools/cnvkit and its home page on GitHub.

Versions

  • 0.9.9-py

Commands

  • cnvkit.py

  • cnv_annotate.py

  • cnv_expression_correlate.py

  • cnv_updater.py

Module

You can load the modules by:

module load biocontainers
module load cnvkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run CNVkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cnvkit

cnvkit.py batch *Tumor.bam --normal *Normal.bam \
                --targets my_baits.bed --fasta hg19.fasta \
                --access data/access-5kb-mappable.hg19.bed \
                --output-reference my_reference.cnn \
                --output-dir example/

Cnvnator

Introduction

Cnvnator is a tool for discovery and characterization of copy number variation (CNV) in population genome sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/cnvnator and its home page on GitHub.

Versions

  • 0.4.1

Commands

  • cnvnator

  • cnvnator2VCF.pl

  • plotbaf.py

  • plotcircular.py

  • plotrdbaf.py

  • pytools.py

Module

You can load the modules by:

module load biocontainers
module load cnvnator

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cnvnator on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvnator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cnvnator

cnvnator -root file.root -tree file.bam -chrom $(seq 1 22) X Y

plotcircular.py file.root
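
In the cnvnator call above, $(seq 1 22) X Y expands to the 24 chromosome names 1 through 22, X, and Y. The sketch below shows that expansion on its own and, as an assumption about the input data, a variant with a chr prefix for references whose sequence names look like chr1. The variable names are illustrative.

```shell
# Autosomes 1..22 followed by the sex chromosomes, separated by spaces.
chroms="$(seq 1 22 | tr '\n' ' ')X Y"
echo "$chroms"

# Assumption: some references name sequences chr1, chr2, ...; add the prefix.
chr_chroms=""
for c in $chroms; do
    chr_chroms="${chr_chroms}chr${c} "
done
echo "$chr_chroms"
```

Either list can be passed unquoted after -chrom (for example, -chrom $chroms) so that each name becomes a separate argument.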

Coinfinder

Introduction

Coinfinder is an algorithm and software tool that detects genes which associate and dissociate with other genes more often than expected by chance in pangenomes.

For more information, please check:

Versions

  • 1.2.0

Commands

  • coinfinder

Module

You can load the modules by:

module load biocontainers
module load coinfinder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run coinfinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coinfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers coinfinder

coinfinder -i coinfinder-manuscript/gene_presence_absence.csv \
    -I -p coinfinder-manuscript/core-gps_fasttree.newick \
    -o output

CONCOCT

Introduction

CONCOCT: Clustering cONtigs with COverage and ComposiTion.

Detailed usage can be found here: https://github.com/BinPro/CONCOCT

Versions

  • 1.1.0

Commands

  • concoct

  • concoct_refine

  • concoct_coverage_table.py

  • cut_up_fasta.py

  • extract_fasta_bins.py

  • merge_cutup_clustering.py

Module

You can load the modules by:

module load biocontainers
module load concoct/1.1.0-py38

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run concoct on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=concoct
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers concoct/1.1.0-py38

cut_up_fasta.py final.contigs.fa -c 10000 -o 0 --merge_last -b contigs_10K.bed > contigs_10K.fa
concoct_coverage_table.py contigs_10K.bed SRR1976948_sorted.bam > coverage_table.tsv
concoct --composition_file contigs_10K.fa --coverage_file coverage_table.tsv -b concoct_output/
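
The commands list also includes merge_cutup_clustering.py and extract_fasta_bins.py, which the example job does not use. Following the workflow described in the CONCOCT README, they map the clustering of the 10 kb pieces back to the original contigs and write one FASTA file per bin. The sketch below wraps those steps in a function; concoct_postprocess is a hypothetical name, and clustering_gt1000.csv is assumed to be the file that concoct wrote with its default length threshold.

```shell
# concoct_postprocess (hypothetical helper)
#   $1: original contigs FASTA, $2: CONCOCT output directory
concoct_postprocess() {
    contigs="$1"
    outdir="${2%/}"
    # Merge the clustering of cut-up pieces back to the original contigs.
    merge_cutup_clustering.py "$outdir/clustering_gt1000.csv" \
        > "$outdir/clustering_merged.csv" || return 1
    # Write one FASTA file per bin.
    mkdir -p "$outdir/fasta_bins" || return 1
    extract_fasta_bins.py "$contigs" "$outdir/clustering_merged.csv" \
        --output_path "$outdir/fasta_bins"
}
```

In the job above this would be called as concoct_postprocess final.contigs.fa concoct_output/ after the concoct step.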

Control-freec

Introduction

Control-freec is a tool for detection of copy-number changes and allelic imbalances (including LOH) using deep-sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/control-freec and its home page on GitHub.

Versions

  • 11.6

Commands

  • freec

Module

You can load the modules by:

module load biocontainers
module load control-freec

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Control-freec on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=control-freec
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers control-freec

freec -conf config_chr19.txt

Cooler

Introduction

Cooler is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.

For more information, please check its website: https://biocontainers.pro/tools/cooler and its home page on GitHub.

Versions

  • 0.8.11

Commands

  • cooler

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load cooler

Interactive job

To run Cooler interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cooler
(base) UserID@bell-a008:~ $ python
Python 3.9.7 |  packaged by conda-forge |  (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cooler

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cooler batch jobs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cooler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cooler

cooler info data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -f bin-size data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -m data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler tree data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler attrs data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool

Coverm

Introduction

Coverm is a configurable, easy to use and fast DNA read coverage and relative abundance calculator focused on metagenomics applications.

For more information, please check its website: https://biocontainers.pro/tools/coverm and its home page on GitHub.

Versions

  • 0.6.1

Commands

  • coverm

Module

You can load the modules by:

module load biocontainers
module load coverm

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Coverm on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coverm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers coverm

coverm genome --genome-fasta-files xcc.fasta --coupled SRR11234553_1.fastq SRR11234553_2.fastq

Covgen

Introduction

Covgen creates a target-specific exome_full192.coverage.txt file required by MutSig.

For more information, please check:

Versions

  • 1.0.2

Commands

  • CovGen

Module

You can load the modules by:

module load biocontainers
module load covgen

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run covgen on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=covgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers covgen

Cramino

Introduction

Cramino is a tool for quick quality assessment of cram and bam files, intended for long read sequencing.

For more information, please check:

Versions

  • 0.9.6

Commands

  • cramino

Module

You can load the modules by:

module load biocontainers
module load cramino

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cramino on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cramino
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cramino

CRISPRCasFinder

Introduction

CRISPRCasFinder enables the easy detection of CRISPRs and cas genes in user-submitted sequence data. It is an updated, improved, and integrated version of CRISPRFinder and CasFinder.

Detailed usage can be found here: https://github.com/dcouvin/CRISPRCasFinder

Versions

  • 4.2.20

Commands

  • CRISPRCasFinder.pl

Module

You can load the modules by:

module load biocontainers
module load crisprcasfinder/4.2.20

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run CRISPRCasFinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=CRISPRCasFinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers crisprcasfinder/4.2.20

CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -cf CasFinder-2.0.3 -def G -keep

Crispresso2

Introduction

CRISPResso2 is a software pipeline designed to enable rapid and intuitive interpretation of genome editing experiments.

For more information, please check:

Versions

  • 2.2.10

  • 2.2.11a

  • 2.2.8

  • 2.2.9

Commands

  • CRISPResso

  • CRISPRessoAggregate

  • CRISPRessoBatch

  • CRISPRessoCompare

  • CRISPRessoPooled

  • CRISPRessoPooledWGSCompare

  • CRISPRessoWGS

Module

You can load the modules by:

module load biocontainers
module load crispresso2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run crispresso2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispresso2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers crispresso2

CRISPResso --fastq_r1 nhej.r1.fastq.gz --fastq_r2 nhej.r2.fastq.gz -n nhej --amplicon_seq \
    AATGTCCCCCAATGGGAAGTTCATCTGGCACTGCCCACAGGTGAGGAGGTCATGATCCCCTTCTGGAGCTCCCAACGGGCCGTGGTCTGGTTCATCATCTGTAAGAATGGCTTCAAGAGGCTCGGCTGTGGTT

Crispritz

Introduction

Crispritz is a software package containing 5 different tools for performing predictive analysis and result assessment on CRISPR/Cas experiments.

For more information, please check its website: https://biocontainers.pro/tools/crispritz and its home page on GitHub.

Versions

  • 2.6.5

Commands

  • crispritz.py

Module

You can load the modules by:

module load biocontainers
module load crispritz

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Crispritz on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispritz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers crispritz

crispritz.py add-variants hg38_1000genomeproject_vcf/ hg38_ref/ &> output.redirect.out

crispritz.py index-genome hg38_ref hg38_ref/ 20bp-NGG-SpCas9.txt -bMax 2 &> output.redirect.out

crispritz.py search hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38 -mm 4 -t -scores hg38_ref/ &> output.redirect.out

crispritz.py search genome_library/NGG_2_hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38.bulges -index -mm 4 -bDNA 1 -bRNA 1 -t &> output.redirect.out

crispritz.py annotate-results emx1.hg38.targets.txt hg38Annotation.bed emx1.hg38 &> output.redirect.out

Crossmap

Introduction

Crossmap is a program for genome coordinates conversion between different assemblies.

Versions

  • 0.6.3

Commands

  • CrossMap.py

Module

You can load the modules by:

module load biocontainers
module load crossmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Crossmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crossmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers crossmap

CrossMap.py bed GRCh37_to_GRCh38.chain.gz test.bed

cross_match

Introduction

cross_match is a general purpose utility for comparing any two DNA sequence sets using a ‘banded’ version of swat.

For more information, please check its home page: http://www.phrap.org/phredphrapconsed.html#block_phrap.

Versions

  • 1.090518

Commands

  • cross_match

Module

You can load the modules by:

module load biocontainers
module load cross_match

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cross_match on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cross_match
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cross_match

Csvkit

Introduction

csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.

For more information, please check:

Versions

  • 1.1.1

Commands

  • csvclean

  • csvcut

  • csvformat

  • csvgrep

  • csvjoin

  • csvjson

  • csvlook

  • csvpy

  • csvsort

  • csvsql

  • csvstack

  • csvstat

  • in2csv

  • sql2csv

Module

You can load the modules by:

module load biocontainers
module load csvkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run csvkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers csvkit

Csvtk

Introduction

Csvtk is a cross-platform, efficient and practical CSV/TSV toolkit.

For more information, please check its website: https://biocontainers.pro/tools/csvtk and its home page on GitHub.

Versions

  • 0.23.0

  • 0.25.0

Commands

  • csvtk

Module

You can load the modules by:

module load biocontainers
module load csvtk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Csvtk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers csvtk

cat data.csv \
 |  csvtk summary --ignore-non-digits --fields f4:sum,f5:sum --groups f1,f2 \
 |  csvtk pretty
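
The pipeline above sums columns f4 and f5 within each (f1, f2) group. To illustrate the same grouping logic without csvtk, here is a rough awk equivalent. It assumes a comma-separated file with a header row whose first five columns are f1 through f5 in that order, and no quoted fields; group_sums is a hypothetical helper, and its output formatting will not match that of csvtk.

```shell
# group_sums (hypothetical helper): sum columns 4 and 5 per (col1, col2) group.
group_sums() {
    awk -F, '
        NR == 1 { next }                    # skip the header row
        {
            key = $1 "," $2                 # group by the first two columns
            sum4[key] += $4
            sum5[key] += $5
        }
        END {
            for (key in sum4) print key "," sum4[key] "," sum5[key]
        }
    ' "$1" | sort                           # awk iteration order is unspecified
}
```

Calling group_sums data.csv prints one line per group in the form f1,f2,sum of f4,sum of f5.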

Cutadapt

Introduction

Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.

For more information, please check its website: https://biocontainers.pro/tools/cutadapt and its home page: https://cutadapt.readthedocs.io/en/stable/.

Versions

  • 3.4

  • 3.7

Commands

  • cutadapt

Module

You can load the modules by:

module load biocontainers
module load cutadapt

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cutadapt on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cutadapt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cutadapt

cutadapt -a AACCGGTT -o output.fastq input.fastq

Cuttlefish

Introduction

Cuttlefish is a fast, parallel, and memory-efficient tool to construct the compacted de Bruijn graph from sequencing reads or reference sequences. It is highly scalable in terms of the size of the input data.

For more information, please check:

Versions

  • 2.1.1

Commands

  • cuttlefish

Module

You can load the modules by:

module load biocontainers
module load cuttlefish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run cuttlefish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cuttlefish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cuttlefish

Cyvcf2

Introduction

Cyvcf2 is a Cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.

For more information, please check its website: https://biocontainers.pro/tools/cyvcf2 and its home page on Github.

Versions

  • 0.30.14

Commands

  • cyvcf2

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load cyvcf2

Interactive job

To run Cyvcf2 interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n1 -t1:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cyvcf2
(base) UserID@bell-a008:~ $ python
Python 3.7.12 |  packaged by conda-forge |  (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cyvcf2 import VCF

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Cyvcf2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cyvcf2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers cyvcf2

cyvcf2 --help
# General form: cyvcf2 [OPTIONS] <vcf_file>
# input.vcf.gz is a placeholder for your own VCF file
cyvcf2 input.vcf.gz

Das_tool

Introduction

DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.

For more information, please check:

Versions

  • 1.1.6

Commands

  • DAS_Tool

  • Contigs2Bin_to_Fasta.sh

  • Fasta_to_Contig2Bin.sh

  • get_species_taxids.sh

Module

You can load the modules by:

module load biocontainers
module load das_tool

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run das_tool on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=das_tool
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers das_tool

DAS_Tool -i sample.human.gut_concoct_contigs2bin.tsv,\
    sample.human.gut_maxbin2_contigs2bin.tsv,\
    sample.human.gut_metabat_contigs2bin.tsv,\
    sample.human.gut_tetraESOM_contigs2bin.tsv \
    -l concoct,maxbin,metabat,tetraESOM \
    -c sample.human.gut_contigs.fa \
    -o DASToolRun2 \
    --proteins DASToolRun1_proteins.faa \
    --write_bin_evals \
    --threads 4 \
    --score_threshold 0.6

Dbg2olc

Introduction

Dbg2olc is used for efficient assembly of large genomes from the long, error-prone reads produced by third-generation sequencing technologies.

For more information, please check its website: https://biocontainers.pro/tools/dbg2olc and its home page on Github.

Versions

  • 20180222

  • 20200723

Commands

  • AssemblyStatistics

  • DBG2OLC

  • RunSparcConsensus.txt

  • SelectLongestReads

  • SeqIO.py

  • Sparc

  • SparseAssembler

  • split_and_run_sparc.sh

  • split_and_run_sparc.sh.bak

  • split_reads_by_backbone.py

Module

You can load the modules by:

module load biocontainers
module load dbg2olc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Dbg2olc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dbg2olc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dbg2olc

SelectLongestReads sum 600000000 longest 0 o TEST.fq f SRR1976948.abundtrim.subset.pe.fq

Deconseq

Introduction

DeconSeq: DECONtamination of SEQuence data using a modified version of BWA-SW. The DeconSeq tool can be used to automatically detect and efficiently remove sequence contamination from genomic and metagenomic datasets. It is easily configurable and provides a user-friendly interface.

For more information, please check:

Versions

  • 0.4.3

Commands

  • bwa64

  • deconseq.pl

  • splitFasta.pl

Module

You can load the modules by:

module load biocontainers
module load deconseq

Helper command

Note

Users need to specify the database information in DeconSeqConfig.pm. In addition, the current deconseq module in biocontainers requires the executables bwa64, deconseq.pl, and splitFasta.pl to be copied to your current directory. This step only needs to be done once.

A helper command, copy_DeconSeqConfig, is provided to copy the configuration file DeconSeqConfig.pm and the executables to your current directory. Run copy_DeconSeqConfig, then modify DeconSeqConfig.pm as needed:

copy_DeconSeqConfig
nano DeconSeqConfig.pm # modify database information as needed

For detailed information about how to configure DeconSeqConfig.pm, please check its online manual (https://sourceforge.net/projects/deconseq/files/).
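Because the copy step only needs to happen once per working directory, a job script can guard it. The following is a minimal sketch, assuming copy_DeconSeqConfig places DeconSeqConfig.pm, bwa64, deconseq.pl, and splitFasta.pl in the current directory as described above; ensure_deconseq_files is a hypothetical helper name.

```shell
# Hypothetical helper: run copy_DeconSeqConfig only when one of the files
# it is described as providing is missing from the current directory.
ensure_deconseq_files() {
    for f in DeconSeqConfig.pm bwa64 deconseq.pl splitFasta.pl; do
        if [ ! -e "$f" ]; then
            echo "missing $f; running copy_DeconSeqConfig" >&2
            copy_DeconSeqConfig
            return $?
        fi
    done
    echo "DeconSeq files already present; nothing to copy" >&2
}
```

Call ensure_deconseq_files after loading the deconseq module. Whether re-running the helper overwrites an edited DeconSeqConfig.pm is not stated here, so keeping a backup of your edited copy is a reasonable precaution.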

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run deconseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deconseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deconseq

bwa64 index -p hg38_db -a bwtsw Homo_sapiens.GRCh38.dna.fa
bwa64 index -p m39_db -a bwtsw GRCm38.p4.genome.fa
deconseq.pl -f input.fastq -dbs hg38_db -dbs_retain m39_db

Deepbgc

Introduction

Deepbgc is a tool for biosynthetic gene cluster (BGC) detection and classification using deep learning.

For more information, please check its website: https://biocontainers.pro/tools/deepbgc and its home page on Github.

Versions

  • 0.1.26

  • 0.1.30

Commands

  • deepbgc

Module

You can load the modules by:

module load biocontainers
module load deepbgc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Deepbgc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepbgc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deepbgc

export DEEPBGC_DOWNLOADS_DIR=$PWD
deepbgc download
deepbgc pipeline genome.fa  -o output

Deepconsensus

Introduction

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.

For more information, please check:

Versions

  • 0.2.0

Commands

  • deepconsensus

  • ccs

  • actc

Module

You can load the modules by:

module load biocontainers
module load deepconsensus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run deepconsensus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deepconsensus

deepconsensus run \
    --subreads_to_ccs=subreads_to_ccs.bam  \
    --ccs_fasta=ccs.fasta \
    --checkpoint=checkpoint-50 \
    --output=output.fastq \
    --batch_zmws=100

Deepsignal2

Introduction

Deepsignal2 is a deep-learning method for detecting DNA methylation state from Oxford Nanopore sequencing reads.

For more information, please check its home page on Github.

Versions

  • 0.1.2

Commands

  • deepsignal2

  • call_modification_frequency.py

  • combine_call_mods_freq_files.py

  • combine_two_strands_frequency.py

  • concat_two_files.py

  • evaluate_mods_call.py

  • filter_samples_by_label.py

  • filter_samples_by_positions.py

  • gff_reader.py

  • randsel_file_rows.py

  • shuffle_a_big_file.py

  • split_freq_file_by_5mC_motif.py

  • txt_formater.py

Module

You can load the modules by:

module load biocontainers
module load deepsignal2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Deepsignal2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepsignal2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deepsignal2

DeepTools

Introduction

DeepTools is a collection of user-friendly tools for normalization and visualization of deep-sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/deeptools and its home page on Github.

Versions

  • 3.5.1-py

Commands

  • alignmentSieve

  • bamCompare

  • bamCoverage

  • bamPEFragmentSize

  • bigwigCompare

  • computeGCBias

  • computeMatrix

  • computeMatrixOperations

  • correctGCBias

  • deeptools

  • estimateReadFiltering

  • estimateScaleFactor

  • multiBamSummary

  • multiBigwigSummary

  • plotCorrelation

  • plotCoverage

  • plotEnrichment

  • plotFingerprint

  • plotHeatmap

  • plotPCA

  • plotProfile

Module

You can load the modules by:

module load biocontainers
module load deeptools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run DeepTools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deeptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deeptools

# -p matches the 4 tasks requested with #SBATCH -n 4
bamCoverage  --normalizeUsing CPM -p 4  \
     --effectiveGenomeSize  11000000  \
     -b WT_coord_sorted.bam  \
     -o WT_coord_sorted.bw
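The value passed to bamCoverage with -p is a processor count and is best kept equal to the number of tasks requested with #SBATCH -n. The following is a minimal sketch for keeping the two in sync, assuming Slurm exports SLURM_NTASKS inside the job (it does when -n is given) and falling back to 1 elsewhere; job_threads is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of tasks Slurm granted to this job,
# or 1 when SLURM_NTASKS is unset (for example, when testing outside a job).
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

threads=$(job_threads)
echo "using $threads thread(s)"
# Inside the job script the value would replace the hard-coded number, e.g.:
#   bamCoverage --normalizeUsing CPM -p "$threads" ...
```

The same variable can be reused for the thread options of other tools in this guide, so that changing #SBATCH -n is the only edit needed.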

Deepvariant

Introduction

DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.

For more information, please check:

Versions

  • 1.0.0

  • 1.1.0

Commands

  • call_variants

  • get-pip.py

  • make_examples

  • model_eval

  • model_train

  • postprocess_variants

  • run-prereq.sh

  • run_deepvariant

  • run_deepvariant.py

  • settings.sh

  • show_examples

  • vcf_stats_report

Module

You can load the modules by:

module load biocontainers
module load deepvariant

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run deepvariant on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers deepvariant

INPUT_DIR="${PWD}/quickstart-testdata"
DATA_HTTP_DIR="https://storage.googleapis.com/deepvariant/quickstart-testdata"
mkdir -p ${INPUT_DIR}
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam.bai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.bed
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz.tbi
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.gzi

run_deepvariant --model_type=WGS \
    --ref="${INPUT_DIR}"/ucsc.hg19.chr20.unittest.fasta \
    --reads="${INPUT_DIR}"/NA12878_S1.chr20.10_10p1mb.bam \
    --regions "chr20:10,000,000-10,010,000" \
    --output_vcf="output/output.vcf.gz" \
    --output_gvcf="output/output.g.vcf.gz" \
    --intermediate_results_dir "output/intermediate_results_dir" \
    --num_shards=4
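The ten wget calls in the script above can also be written as a loop that skips files already present, so that a resubmitted job does not download the test data again. The following is a minimal sketch that reuses the file names and URL from the script; quickstart_files and fetch_quickstart are hypothetical helper names, and -nc (no-clobber) is a standard wget option.

```shell
DATA_HTTP_DIR="https://storage.googleapis.com/deepvariant/quickstart-testdata"
INPUT_DIR="${PWD}/quickstart-testdata"

# Print the quickstart file names used by the job script, one per line.
quickstart_files() {
    cat <<'EOF'
NA12878_S1.chr20.10_10p1mb.bam
NA12878_S1.chr20.10_10p1mb.bam.bai
test_nist.b37_chr20_100kbp_at_10mb.bed
test_nist.b37_chr20_100kbp_at_10mb.vcf.gz
test_nist.b37_chr20_100kbp_at_10mb.vcf.gz.tbi
ucsc.hg19.chr20.unittest.fasta
ucsc.hg19.chr20.unittest.fasta.fai
ucsc.hg19.chr20.unittest.fasta.gz
ucsc.hg19.chr20.unittest.fasta.gz.fai
ucsc.hg19.chr20.unittest.fasta.gz.gzi
EOF
}

# Download each file into INPUT_DIR unless it is already there (-nc).
fetch_quickstart() {
    mkdir -p "${INPUT_DIR}"
    quickstart_files | while IFS= read -r name; do
        wget -nc -P "${INPUT_DIR}" "${DATA_HTTP_DIR}/${name}"
    done
}
```

In a job script, fetch_quickstart would be called once before run_deepvariant; the block itself only defines the helpers and downloads nothing until that call is made.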

Delly

Introduction

Delly is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/delly and its home page on Github.

Versions

  • 0.9.1

  • 1.0.3

  • 1.1.3

  • 1.1.5

  • 1.1.6

Commands

  • delly

Module

You can load the modules by:

module load biocontainers
module load delly

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Delly on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=delly
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers delly

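# Call structural variants; results are written to delly.bcf
delly call -x hg19.excl -o delly.bcf -g hg19.fa input.bam
# Somatic pre-filtering of a separate call set, t1.bcf (not produced by the command above)
delly filter -f somatic -o t1.pre.bcf -s samples.tsv t1.bcf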

Dendropy

Introduction

DendroPy is a Python library for phylogenetic computing. It provides classes and functions for the simulation, processing, and manipulation of phylogenetic trees and character matrices, and supports the reading and writing of phylogenetic data in a range of formats, such as NEXUS, NEWICK, NeXML, Phylip, FASTA, etc. Application scripts for performing some useful phylogenetic operations, such as data conversion and tree posterior distribution summarization, are also distributed and installed as part of the library. DendroPy can thus function as a stand-alone library for phylogenetics, a component of more complex multi-library phyloinformatic pipelines, or as a scripting “glue” that assembles and drives such pipelines.

For more information, please check:

Versions

  • 4.5.2

Commands

  • python

  • python3

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load dendropy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run dendropy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dendropy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dendropy

Diamond

Introduction

Diamond is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data. The key features are:

  • Pairwise alignment of proteins and translated DNA at 100x-10,000x the speed of BLAST.

  • Frameshift alignments for long read analysis.

  • Low resource requirements and suitable for running on standard desktops or laptops.

  • Various output formats, including BLAST pairwise, tabular and XML, as well as taxonomic classification.

Details about its usage can be found here: https://github.com/bbuchfink/diamond

Versions

  • 2.0.13

  • 2.0.14

  • 2.0.15

  • 2.1.6

Commands

  • diamond makedb

  • diamond prepdb

  • diamond blastp

  • diamond blastx

  • diamond view

  • diamond version

  • diamond dbinfo

  • diamond help

  • diamond test

Module

You can load the modules by:

module load biocontainers
module load diamond/2.0.14

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run diamond on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=diamond
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers diamond/2.0.14

diamond makedb  --in uniprot_sprot.fasta -d uniprot_sprot
diamond blastp -p 24 -q test.faa -d uniprot_sprot  --very-sensitive -o blastp_output.txt
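The tabular output mentioned in the feature list is convenient for post-processing with standard tools. The following is a minimal sketch, assuming blastp_output.txt is in DIAMOND's default 12-column BLAST tabular layout (query ID in column 1, bit score in column 12); best_hits is a hypothetical helper name.

```shell
# Hypothetical helper: for each query, keep the line with the highest
# bit score (column 12) from a tab-separated BLAST tabular file.
best_hits() {
    awk -F '\t' '
        !($1 in best) || $12 + 0 > score[$1] { best[$1] = $0; score[$1] = $12 + 0 }
        END { for (q in best) print best[q] }
    ' "$1" | sort
}
```

For example, best_hits blastp_output.txt > blastp_best.txt writes one line per query, sorted by query ID. If a different output format was requested from diamond, the column numbers would need to be adjusted.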

Dnaapler

Introduction

dnaapler is a simple Python program that takes a single nucleotide input sequence (in FASTA format), finds the desired start gene using blastx against an amino acid sequence database, checks that the start codon of this gene is found, and, if so, reorients the chromosome to begin with this gene on the forward strand.

For more information, please check:

Versions

  • 0.1.0

Commands

  • dnaapler

Module

You can load the modules by:

module load biocontainers
module load dnaapler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run dnaapler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaapler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dnaapler

Dnaio

Introduction

Dnaio is a Python 3.7+ library for very efficient parsing and writing of FASTQ and also FASTA files.

For more information, please check its website: https://biocontainers.pro/tools/dnaio and its home page on Github.

Versions

  • 0.8.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load dnaio

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Dnaio on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dnaio

python dnaio_test.py

Dragonflye

Introduction

Dragonflye is a pipeline that aims to make assembling Oxford Nanopore reads quick and easy.

For more information, please check:

Versions

  • 1.0.13

  • 1.0.14

Commands

  • dragonflye

Module

You can load the modules by:

module load biocontainers
module load dragonflye

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run dragonflye on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=dragonflye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dragonflye

dragonflye --cpus 8 \
     --outdir output \
     --reads SRR18498195.fastq

Drep

Introduction

Drep is a Python program for rapidly comparing large numbers of genomes.

For more information, please check its website: https://biocontainers.pro/tools/drep and its home page on Github.

Versions

  • 3.2.2

Commands

  • dRep

Module

You can load the modules by:

module load biocontainers
module load drep

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Drep on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers drep

dRep compare compare_out -g tests/genomes/*
dRep dereplicate dereplicate_out -g tests/genomes/*

Dropest

Introduction

Dropest is a pipeline for initial analysis of droplet-based single-cell RNA-seq data.

For more information, please check its website: https://biocontainers.pro/tools/dropest and its home page on Github.

Versions

  • 0.8.6

Commands

  • dropest

  • droptag

  • dropReport.Rsc

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load dropest

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Dropest on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dropest
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dropest

dropest -f -c 10x.xml  -C 1200 neurons_900_possorted_genome_bam.bam

Drop-seq

Introduction

Drop-seq is a collection of Java tools for analyzing Drop-seq data.

For more information, please check:

Versions

  • 2.5.2

Commands

  • AssignCellsToSamples

  • BamTagHistogram

  • BamTagOfTagCounts

  • BaseDistributionAtReadPosition

  • BipartiteRabiesVirusCollapse

  • CensusSeq

  • CollapseBarcodesInPlace

  • CollapseTagWithContext

  • CompareDropSeqAlignments

  • ComputeUMISharing

  • ConvertTagToReadGroup

  • ConvertToRefFlat

  • CountUnmatchedSampleIndices

  • CreateIntervalsFiles

  • CreateMetaCells

  • CreateSnpIntervalFromVcf

  • CsiAnalysis

  • DetectBeadSubstitutionErrors

  • DetectBeadSynthesisErrors

  • DetectDoublets

  • DigitalExpression

  • DownsampleBamByTag

  • DownsampleTranscriptsAndQuantiles

  • Drop-seq_Alignment_Cookbook.pdf

  • Drop-seq_alignment.sh

  • FilterBam

  • FilterBamByGeneFunction

  • FilterBamByTag

  • FilterDge

  • FilterGtf

  • FilterValidRabiesBarcodes

  • GatherGeneGCLength

  • GatherMolecularBarcodeDistributionByGene

  • GatherReadQualityMetrics

  • GenotypeSperm

  • MaskReferenceSequence

  • MergeDgeSparse

  • PolyATrimmer

  • ReduceGtf

  • RollCall

  • SelectCellsByNumTranscripts

  • SignTest

  • SingleCellRnaSeqMetricsCollector

  • SpermSeqMarkDuplicates

  • SplitBamByCell

  • TagBam

  • TagBamWithReadSequenceExtended

  • TagReadWithGeneExonFunction

  • TagReadWithGeneFunction

  • TagReadWithInterval

  • TagReadWithRabiesBarcodes

  • TrimStartingSequence

  • ValidateAlignedSam

  • ValidateReference

  • create_Drop-seq_reference_metadata.sh

Module

You can load the modules by:

module load biocontainers
module load drop-seq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run drop-seq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drop-seq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers drop-seq

Dsuite

Introduction

Dsuite is a fast C++ implementation, allowing genome scale calculations of the D and f4-ratio statistics across all combinations of tens or hundreds of populations or species directly from a variant call format (VCF) file.

For more information, please check its home page on Github.

Versions

  • 0.4.r43

  • 0.5.r44

Commands

  • Dsuite

  • dtools.py

  • DtriosParallel

Module

You can load the modules by:

module load biocontainers
module load dsuite

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Dsuite on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=dsuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers dsuite

Dsuite Dtrios -c -n no_geneflow -t simulated_tree_no_geneflow.nwk chr1_no_geneflow.vcf.gz species_sets.txt

easySFS

Introduction

easySFS is a tool for the effective selection of population size projection for construction of the site frequency spectrum.

For more information, please check its home page on Github.

Versions

  • 1.0

Commands

  • easySFS.py

Module

You can load the modules by:

module load biocontainers
module load easysfs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run easySFS on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=easysfs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers easysfs

easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt --preview -a
easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt -a --proj=7,7

Edta

Introduction

Edta is developed for automated whole-genome de novo transposable element (TE) annotation and for benchmarking the annotation performance of TE libraries.

For more information, please check its website: https://biocontainers.pro/tools/edta and its home page on Github.

Note: When running EDTA, invoke the script directly, like this:

EDTA.pl [OPTIONS]

Do NOT call it as ‘perl EDTA.pl’.

Versions

  • 1.9.6

  • 2.0.0

Commands

  • EDTA.pl

  • EDTA_processI.pl

  • EDTA_raw.pl

  • FET.pl

  • bdf2gdfont.pl

  • buildRMLibFromEMBL.pl

  • buildSummary.pl

  • calcDivergenceFromAlign.pl

  • cd-hit-2d-para.pl

  • cd-hit-clstr_2_blm8.pl

  • cd-hit-div.pl

  • cd-hit-para.pl

  • check_result.pl

  • clstr2tree.pl

  • clstr2txt.pl

  • clstr2xml.pl

  • clstr_cut.pl

  • clstr_list.pl

  • clstr_list_sort.pl

  • clstr_merge.pl

  • clstr_merge_noorder.pl

  • clstr_quality_eval.pl

  • clstr_quality_eval_by_link.pl

  • clstr_reduce.pl

  • clstr_renumber.pl

  • clstr_rep.pl

  • clstr_reps_faa_rev.pl

  • clstr_rev.pl

  • clstr_select.pl

  • clstr_select_rep.pl

  • clstr_size_histogram.pl

  • clstr_size_stat.pl

  • clstr_sort_by.pl

  • clstr_sort_prot_by.pl

  • clstr_sql_tbl.pl

  • clstr_sql_tbl_sort.pl

  • convert_MGEScan3.0.pl

  • convert_ltr_struc.pl

  • convert_ltrdetector.pl

  • createRepeatLandscape.pl

  • down_tRNA.pl

  • dupliconToSVG.pl

  • filter_rt.pl

  • genome_plot.pl

  • genome_plot2.pl

  • genome_plot_svg.pl

  • getRepeatMaskerBatch.pl

  • legacy_blast.pl

  • lib-test.pl

  • make_multi_seq.pl

  • maskFile.pl

  • plot_2d.pl

  • plot_len1.pl

  • rmOut2Fasta.pl

  • rmOutToGFF3.pl

  • rmToUCSCTables.pl

  • update_blastdb.pl

  • viewMSA.pl

  • wublastToCrossmatch.pl

Module

You can load the modules by:

module load biocontainers
module load edta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Edta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=edta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers edta

EDTA.pl --genome genome.fa --cds genome.cds.fa --curatedlib EDTA/database/rice6.9.5.liban --exclude genome.exclude.bed --overwrite 1 --sensitive 1 --anno 1 --evaluate 1 --threads 10

Eggnog-mapper

Introduction

Eggnog-mapper is a tool for fast functional annotation of novel sequences.

For more information, please check its website: https://biocontainers.pro/tools/eggnog-mapper and its home page on Github.

Versions

  • 2.1.7

Commands

  • create_dbs.py

  • download_eggnog_data.py

  • emapper.py

  • hmm_mapper.py

  • hmm_server.py

  • hmm_worker.py

  • vba_extract.py

Module

You can load the modules by:

module load biocontainers
module load eggnog-mapper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Eggnog-mapper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=eggnog-mapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers eggnog-mapper

emapper.py -i proteins.faa --cpu 24 -o protein.out
emapper.py -m diamond --itype CDS -i cDNA.fasta -o cdna.out --cpu 24

Emboss

Introduction

Emboss is “The European Molecular Biology Open Software Suite”.

For more information, please check its website: https://biocontainers.pro/tools/emboss and its home page: http://emboss.open-bio.org.

Versions

  • 6.6.0

Commands

  • aaindexextract

  • abiview

  • acdc

  • acdgalaxy

  • acdlog

  • acdpretty

  • acdtable

  • acdtrace

  • acdvalid

  • aligncopy

  • aligncopypair

  • antigenic

  • assemblyget

  • backtranambig

  • backtranseq

  • banana

  • biosed

  • btwisted

  • cachedas

  • cachedbfetch

  • cacheebeyesearch

  • cacheensembl

  • cai

  • chaos

  • charge

  • checktrans

  • chips

  • cirdna

  • codcmp

  • codcopy

  • coderet

  • compseq

  • cons

  • consambig

  • cpgplot

  • cpgreport

  • cusp

  • cutgextract

  • cutseq

  • dan

  • dbiblast

  • dbifasta

  • dbiflat

  • dbigcg

  • dbtell

  • dbxcompress

  • dbxedam

  • dbxfasta

  • dbxflat

  • dbxgcg

  • dbxobo

  • dbxreport

  • dbxresource

  • dbxstat

  • dbxtax

  • dbxuncompress

  • degapseq

  • density

  • descseq

  • diffseq

  • distmat

  • dotmatcher

  • dotpath

  • dottup

  • dreg

  • drfinddata

  • drfindformat

  • drfindid

  • drfindresource

  • drget

  • drtext

  • edamdef

  • edamhasinput

  • edamhasoutput

  • edamisformat

  • edamisid

  • edamname

  • edialign

  • einverted

  • embossdata

  • embossupdate

  • embossversion

  • emma

  • emowse

  • entret

  • epestfind

  • eprimer3

  • eprimer32

  • equicktandem

  • est2genome

  • etandem

  • extractalign

  • extractfeat

  • extractseq

  • featcopy

  • featmerge

  • featreport

  • feattext

  • findkm

  • freak

  • fuzznuc

  • fuzzpro

  • fuzztran

  • garnier

  • geecee

  • getorf

  • godef

  • goname

  • helixturnhelix

  • hmoment

  • iep

  • infoalign

  • infoassembly

  • infobase

  • inforesidue

  • infoseq

  • isochore

  • jaspextract

  • jaspscan

  • jembossctl

  • lindna

  • listor

  • makenucseq

  • makeprotseq

  • marscan

  • maskambignuc

  • maskambigprot

  • maskfeat

  • maskseq

  • matcher

  • megamerger

  • merger

  • msbar

  • mwcontam

  • mwfilter

  • needle

  • needleall

  • newcpgreport

  • newcpgseek

  • newseq

  • nohtml

  • noreturn

  • nospace

  • notab

  • notseq

  • nthseq

  • nthseqset

  • octanol

  • oddcomp

  • ontocount

  • ontoget

  • ontogetcommon

  • ontogetdown

  • ontogetobsolete

  • ontogetroot

  • ontogetsibs

  • ontogetup

  • ontoisobsolete

  • ontotext

  • palindrome

  • pasteseq

  • patmatdb

  • patmatmotifs

  • pepcoil

  • pepdigest

  • pepinfo

  • pepnet

  • pepstats

  • pepwheel

  • pepwindow

  • pepwindowall

  • plotcon

  • plotorf

  • polydot

  • preg

  • prettyplot

  • prettyseq

  • primersearch

  • printsextract

  • profit

  • prophecy

  • prophet

  • prosextract

  • pscan

  • psiphi

  • rebaseextract

  • recoder

  • redata

  • refseqget

  • remap

  • restover

  • restrict

  • revseq

  • runJemboss.sh

  • seealso

  • seqcount

  • seqmatchall

  • seqret

  • seqretsetall

  • seqretsplit

  • seqxref

  • seqxrefget

  • servertell

  • showalign

  • showdb

  • showfeat

  • showorf

  • showpep

  • showseq

  • showserver

  • shuffleseq

  • sigcleave

  • silent

  • sirna

  • sixpack

  • sizeseq

  • skipredundant

  • skipseq

  • splitsource

  • splitter

  • stretcher

  • stssearch

  • supermatcher

  • syco

  • taxget

  • taxgetdown

  • taxgetrank

  • taxgetspecies

  • taxgetup

  • tcode

  • textget

  • textsearch

  • tfextract

  • tfm

  • tfscan

  • tmap

  • tranalign

  • transeq

  • trimest

  • trimseq

  • trimspace

  • twofeat

  • union

  • urlget

  • variationget

  • vectorstrip

  • water

  • whichdb

  • wobble

  • wordcount

  • wordfinder

  • wordmatch

  • wossdata

  • wossinput

  • wossname

  • wossoperation

  • wossoutput

  • wossparam

  • wosstopic

  • xmlget

  • xmltext

  • yank

Module

You can load the modules by:

module load biocontainers
module load emboss

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
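
The shebang rule in the warning above can be checked mechanically before a script is submitted. The sketch below is a hypothetical helper, not part of Slurm or the biocontainers modules; the function name check_shebang and the file name myjob.sh are assumptions made for illustration.

```shell
# Hypothetical helper: succeed only if the first line of a job script
# is a bash shebang, as the warning above recommends.
check_shebang() {
    local script="$1" first_line
    # Read the first line; also accept a file without a trailing newline.
    IFS= read -r first_line < "$script" || [ -n "$first_line" ] || return 1
    case "$first_line" in
        '#!/bin/bash'*) return 0 ;;
        *) echo "warning: $script starts with '$first_line'; use #!/bin/bash" >&2
           return 1 ;;
    esac
}
```

Used as check_shebang myjob.sh && sbatch myjob.sh, it would submit only scripts that start with a bash shebang.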

To run Emboss on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=emboss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers emboss
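
The job script above loads the module but does not call an EMBOSS program. As a minimal sketch, the function below wraps transeq, one of the commands listed above, to translate nucleotide sequences into protein. The function name and the file names are placeholders, and the call assumes the usual -sequence and -outseq qualifiers; transeq -help shows what the installed version accepts.

```shell
# Translate nucleotide sequences to protein with EMBOSS transeq.
# $1: input nucleotide FASTA (placeholder)
# $2: output protein FASTA (placeholder)
translate_sequences() {
    local nucleotide_fasta="$1" protein_fasta="$2"
    transeq -sequence "$nucleotide_fasta" -outseq "$protein_fasta"
}
```

Appending translate_sequences genes.fasta proteins.fasta after the module lines would run it inside the job.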

Ensembl-vep

Introduction

Ensembl-vep (Ensembl Variant Effect Predictor) predicts the functional effects of genomic variants.

For more information, please check:

Versions

  • 106.1

  • 107.0

  • 108.2

Commands

  • vep

  • haplo

  • variant_recoder

Module

You can load the modules by:

module load biocontainers
module load ensembl-vep

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ensembl-vep on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ensembl-vep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ensembl-vep

haplo -i bos_taurus_UMD3.1.vcf -o out.txt

Epic2

Introduction

Epic2 is an ultraperformant ChIP-seq broad domain finder based on SICER.

For more information, please check its website: https://biocontainers.pro/tools/epic2 and its home page on Github.

Versions

  • 0.0.51

  • 0.0.52

Commands

  • epic2

  • epic2-bw

  • epic2-df

Module

You can load the modules by:

module load biocontainers
module load epic2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Epic2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=epic2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers epic2

epic2 -t /examples/test.bed.gz \
  -c /examples/control.bed.gz \
  > deleteme.txt

Evidencemodeler

Introduction

Evidencemodeler is software that combines ab initio gene predictions with protein and transcript alignments into weighted consensus gene structures.

For more information, please check its website: https://biocontainers.pro/tools/evidencemodeler and its home page on Github.

Versions

  • 1.1.1

Commands

  • evidence_modeler.pl

  • BPbtab.pl

  • EVMLite.pl

  • EVM_to_GFF3.pl

  • convert_EVM_outputs_to_GFF3.pl

  • create_weights_file.pl

  • execute_EVM_commands.pl

  • extract_complete_proteins.pl

  • gff3_file_to_proteins.pl

  • gff3_gene_prediction_file_validator.pl

  • gff_range_retriever.pl

  • partition_EVM_inputs.pl

  • recombine_EVM_partial_outputs.pl

  • summarize_btab_tophits.pl

  • write_EVM_commands.pl

Module

You can load the modules by:

module load biocontainers
module load evidencemodeler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Evidencemodeler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=evidencemodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers evidencemodeler


evidence_modeler.pl --genome genome.fasta \
                   --weights weights.txt \
                   --gene_predictions gene_predictions.gff3 \
                   --protein_alignments protein_alignments.gff3 \
                   --transcript_alignments transcript_alignments.gff3 \
                 > evm.out

Exonerate

Introduction

Exonerate is a generic tool for pairwise sequence comparison/alignment.

For more information, please check its home page: https://www.ebi.ac.uk/about/vertebrate-genomics/software/exonerate.

Versions

  • 2.4.0

Commands

  • exonerate

Module

You can load the modules by:

module load biocontainers
module load exonerate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Exonerate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=exonerate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers exonerate

exonerate  -m genome2genome  cms.fasta cmm.fasta > cm_vs_cs.out

Expansionhunter

Introduction

Expansion Hunter: a tool for estimating repeat sizes.

For more information, please check:

Versions

  • 4.0.2

Commands

  • ExpansionHunter

Module

You can load the modules by:

module load biocontainers
module load expansionhunter

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run expansionhunter on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=expansionhunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers expansionhunter
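
The job script above stops after loading the module. A minimal sketch of an ExpansionHunter call is given below, assuming the --reads, --reference, --variant-catalog, and --output-prefix options documented for recent ExpansionHunter releases. The function name and all file names are placeholders, and ExpansionHunter --help on the cluster is the authority on the installed version.

```shell
# Estimate repeat sizes for the loci described in a variant catalog.
# $1: aligned reads (BAM/CRAM), $2: reference FASTA,
# $3: variant catalog (JSON), $4: prefix for the output files
run_expansionhunter() {
    local reads="$1" reference="$2" catalog="$3" prefix="$4"
    ExpansionHunter --reads "$reads" \
                    --reference "$reference" \
                    --variant-catalog "$catalog" \
                    --output-prefix "$prefix"
}
```

In the job script it could be called as run_expansionhunter sample.bam reference.fa variant_catalog.json sample_out.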

Fasta3

Introduction

Fasta3 is a suite of programs for searching nucleotide or protein databases with a query sequence.

For more information, please check its website: https://biocontainers.pro/tools/fasta3 and its home page on Github.

Versions

  • 36.3.8

Commands

  • fasta36

  • fastf36

  • fastm36

  • fasts36

  • fastx36

  • fasty36

  • ggsearch36

  • glsearch36

  • lalign36

  • ssearch36

  • tfastf36

  • tfastm36

  • tfasts36

  • tfastx36

  • tfasty36

Module

You can load the modules by:

module load biocontainers
module load fasta3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fasta3 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasta3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fasta3

fasta36 input.fasta genome.fasta

FastANI

Introduction

FastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI).

For more information, please check its website: https://biocontainers.pro/tools/fastani and its home page on Github.

Versions

  • 1.32

  • 1.33

Commands

  • fastANI

Module

You can load the modules by:

module load biocontainers
module load fastani

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run FastANI on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastani

fastANI -q cmm.fasta -r cms.fasta -o cm_cs_out

fastANI -q cmm.fasta -r cms.fasta  --visualize -o cm_cs_visualize_out
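
fastANI writes tab-delimited rows containing the query genome, the reference genome, the ANI value, the count of bidirectional fragment mappings, and the total number of query fragments. The sketch below filters such a file by ANI. The function name is hypothetical, and the default cutoff of 95 is only a commonly used species-level threshold, taken here as an assumption.

```shell
# Print query, reference and ANI for rows whose ANI (column 3)
# is at least the given threshold (default 95).
filter_ani() {
    local results="$1" min_ani="${2:-95}"
    awk -F'\t' -v min="$min_ani" '
        BEGIN { OFS = "\t" }
        ($3 + 0) >= (min + 0) { print $1, $2, $3 }
    ' "$results"
}
```

After the job above finishes, filter_ani cm_cs_out 95 would list the genome pairs that meet the cutoff.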

Fastp

Introduction

Fastp is an ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging, etc).

For more information, please check its website: https://biocontainers.pro/tools/fastp and its home page on Github.

Versions

  • 0.20.1

  • 0.23.2

Commands

  • fastp

Module

You can load the modules by:

module load biocontainers
module load fastp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fastp on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastp

fastp -i input_1.fastq  -I input_2.fastq -o out.R1.fq.gz -O out.R2.fq.gz

FastQC

Introduction

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

For more information, please check its website: https://biocontainers.pro/tools/fastqc and its home page: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.

Versions

  • 0.11.9

Commands

  • fastqc

Module

You can load the modules by:

module load biocontainers
module load fastqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run FastQC on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=fastqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastqc

# FastQC does not create the output directory, so create it first.
mkdir -p fastqc_out
fastqc -o fastqc_out -t 4 FASTQ1 FASTQ2

Fastq_pair

Introduction

Fastq_pair is used to match up paired end fastq files quickly and efficiently.

For more information, please check its website: https://biocontainers.pro/tools/fastq_pair and its home page on Github.

Versions

  • 1.0

Commands

  • fastq_pair

Module

You can load the modules by:

module load biocontainers
module load fastq_pair

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fastq_pair on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq_pair
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastq_pair

fastq_pair seq_1.fastq  seq_2.fastq
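
fastq_pair indexes reads in a hash table, and its documentation suggests sizing that table (the -t option) close to the number of reads in the file. The sketch below counts reads so the value can be passed along. It assumes plain, uncompressed four-line FASTQ records; the function name is hypothetical, and the -t option should be confirmed against the installed version's help output.

```shell
# Count reads in an uncompressed FASTQ file (4 lines per record).
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}
```

In the job script this could be used as fastq_pair -t "$(count_fastq_reads seq_1.fastq)" seq_1.fastq seq_2.fastq.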

Fastq-scan

Introduction

Fastq-scan reads a FASTQ from STDIN and outputs summary statistics (read lengths, per-read qualities, per-base qualities) in JSON format.

For more information, please check:

Versions

  • 1.0.0

Commands

  • fastq-scan

Module

You can load the modules by:

module load biocontainers
module load fastq-scan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run fastq-scan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq-scan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastq-scan

cat example-q33.fq | fastq-scan -g 150000

Fastspar

Introduction

Fastspar is a tool for rapid and scalable correlation estimation for compositional data.

For more information, please check its website: https://biocontainers.pro/tools/fastspar and its home page on Github.

Versions

  • 1.0.0

Commands

  • fastspar

  • fastspar_bootstrap

  • fastspar_pvalues

  • fastspar_reduce

Module

You can load the modules by:

module load biocontainers
module load fastspar

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fastspar on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastspar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastspar
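
The job script above does not include a fastspar command. As a minimal sketch, the function below estimates correlations and covariances from an OTU table, assuming the --otu_table, --correlation, and --covariance options described in the FastSpar documentation. The function name and the file names are placeholders.

```shell
# Run FastSpar on an OTU table and write correlation/covariance matrices.
# $1: OTU count table (TSV), $2: prefix for the two output files
run_fastspar() {
    local otu_table="$1" out_prefix="$2"
    fastspar --otu_table "$otu_table" \
             --correlation "${out_prefix}_correlation.tsv" \
             --covariance "${out_prefix}_covariance.tsv"
}
```

A call such as run_fastspar otu_table.tsv median would write median_correlation.tsv and median_covariance.tsv.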

fastStructure

Introduction

fastStructure is an algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python2.x.

Note: the programs "structure.py", "chooseK.py", and "distruct.py" are standalone executables and should be called directly by name ("structure.py", etc.). Do NOT invoke them as "python structure.py" or "python /usr/local/bin/structure.py"; this will not work.

Note: This container lacks X11 libraries, so GUI plots with "distruct.py" do not work. Instead, tell the underlying Matplotlib to use a non-interactive plotting backend that writes to a file. The easiest and most flexible way is to set the MPLBACKEND environment variable: env MPLBACKEND="svg" distruct.py --output myplot.svg ...

Available backends in this container:

Backend   Filetypes   Description
agg       png         raster graphics (high-quality PNG output)
ps        ps eps      vector graphics (PostScript output)
pdf       pdf         vector graphics (Portable Document Format)
svg       svg         vector graphics (Scalable Vector Graphics)

Default: MPLBACKEND="agg" (PNG output).
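
The backend list above maps directly onto output file extensions, so the backend can be chosen from the name of the plot file. The function below is a hypothetical helper written for illustration; it encodes only the four backends listed for this container.

```shell
# Pick a Matplotlib backend from the extension of the output file,
# following the backend list above.
backend_for() {
    case "${1##*.}" in
        png)    echo agg ;;
        ps|eps) echo ps ;;
        pdf)    echo pdf ;;
        svg)    echo svg ;;
        *)      echo "unsupported output type: $1" >&2
                return 1 ;;
    esac
}
```

For an output file stored in a variable out, the plotting command could then be prefixed with env MPLBACKEND="$(backend_for "$out")".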

For more information, please check its website: https://biocontainers.pro/tools/faststructure and its home page on Github.

Versions

  • 1.0-py27

Commands

  • structure.py

  • chooseK.py

  • distruct.py

Module

You can load the modules by:

module load biocontainers
module load faststructure

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run fastStructure on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=faststructure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers faststructure
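
The job script above does not call any of the fastStructure programs. A minimal sketch of a structure.py run is shown below. It assumes the -K, --input, and --output options from the fastStructure documentation, where --input is the path prefix of the genotype files; the function name and the prefixes are placeholders. Note that structure.py is called directly by name, as required above.

```shell
# Fit a fastStructure model with K populations.
# $1: number of populations K
# $2: path prefix of the input genotype files
# $3: path prefix for the output files
run_faststructure() {
    local k="$1" input_prefix="$2" output_prefix="$3"
    structure.py -K "$k" --input="$input_prefix" --output="$output_prefix"
}
```

For example, run_faststructure 3 genotypes genotypes_out would fit a model with three populations.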

FastTree

Introduction

FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.

Detailed usage can be found here: http://www.microbesonline.org/fasttree/

Versions

  • 2.1.10

  • 2.1.11

Commands

  • fasttree

  • FastTree

  • FastTreeMP

Note

fasttree and FastTree are the same program and run on a single CPU only. To use multiple CPUs, use FastTreeMP and set OMP_NUM_THREADS to the number of cores you requested.

Module

You can load the modules by:

module load biocontainers
module load fasttree

Example job using single CPU

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run FastTree on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fasttree

FastTree alignmentfile > treefile

Example job using multiple CPUs

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run FastTreeMP on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=FastTreeMP
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fasttree

export OMP_NUM_THREADS=24

FastTreeMP alignmentfile > treefile
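
In the script above the core count appears twice, once in #SBATCH -n 24 and once in the export line, and the two can drift apart when one is edited. A small sketch of deriving the thread count from Slurm instead is shown below. SLURM_NTASKS is the variable Slurm sets inside a job when -n is given; the function name and the fallback to 1 outside a job are assumptions.

```shell
# Set OMP_NUM_THREADS from the task count Slurm granted to the job,
# falling back to a single thread when run outside a job.
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_NTASKS:-1}"
}

set_omp_threads
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

With this in place of the hard-coded export, changing -n in the header is enough to change the number of threads FastTreeMP uses.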

FASTX-Toolkit

Introduction

FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.

For more information, please check its website: https://biocontainers.pro/tools/fastx_toolkit and its home page on Github.

Versions

  • 0.0.14

Commands

  • fasta_clipping_histogram.pl

  • fasta_formatter

  • fasta_nucleotide_changer

  • fastq_masker

  • fastq_quality_boxplot_graph.sh

  • fastq_quality_converter

  • fastq_quality_filter

  • fastq_quality_trimmer

  • fastq_to_fasta

  • fastx_artifacts_filter

  • fastx_barcode_splitter.pl

  • fastx_clipper

  • fastx_collapser

  • fastx_nucleotide_distribution_graph.sh

  • fastx_nucleotide_distribution_line_graph.sh

  • fastx_quality_stats

  • fastx_renamer

  • fastx_reverse_complement

  • fastx_trimmer

  • fastx_uncollapser

Module

You can load the modules by:

module load biocontainers
module load fastx_toolkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run FASTX-Toolkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastx_toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fastx_toolkit
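
The job script above loads the toolkit without running a tool. As a minimal sketch, the function below wraps fastq_quality_filter, one of the commands listed above, keeping reads in which at least 80% of the bases have a quality of 20 or higher. The thresholds, the function name, and the file names are illustrative; the -Q33 flag assumes Phred+33 encoded qualities and should be dropped for older Phred+64 data.

```shell
# Keep reads where at least 80% of bases have quality >= 20.
# $1: input FASTQ, $2: filtered output FASTQ
quality_filter() {
    local in_fastq="$1" out_fastq="$2"
    fastq_quality_filter -Q33 -q 20 -p 80 -i "$in_fastq" -o "$out_fastq"
}
```

A call such as quality_filter input.fastq filtered.fastq could follow the module lines in the job script.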

Filtlong

Introduction

Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.

For more information, please check its website: https://biocontainers.pro/tools/filtlong and its home page on Github.

Versions

  • 0.2.1

Commands

  • filtlong

Module

You can load the modules by:

module load biocontainers
module load filtlong

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Filtlong on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=filtlong
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers filtlong
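
The job script above does not include a filtlong command. The sketch below follows the pattern from the Filtlong documentation: discard reads shorter than 1 kb, keep the best 90% of the remaining bases, and compress the result. The function name, thresholds, and file names are placeholders to adapt.

```shell
# Filter long reads by length and quality, writing gzip-compressed FASTQ.
# $1: input reads (FASTQ, optionally gzipped), $2: output .fastq.gz
filter_long_reads() {
    local in_reads="$1" out_reads="$2"
    filtlong --min_length 1000 --keep_percent 90 "$in_reads" | gzip > "$out_reads"
}
```

In the job script it could be called as filter_long_reads input.fastq.gz output.fastq.gz.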

Flye

Introduction

Flye: Fast and accurate de novo assembler for single molecule sequencing reads.

For more information, please check its website: https://biocontainers.pro/tools/flye and its home page on Github.

Versions

  • 2.9.1

  • 2.9.2

  • 2.9

Commands

  • flye

Module

You can load the modules by:

module load biocontainers
module load flye

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Flye on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=flye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers flye

flye --pacbio-raw E.coli_PacBio_40x.fasta --out-dir out_pacbio --threads 12
flye --nano-raw Loman_E.coli_MAP006-1_2D_50x.fasta --out-dir out_nano --threads 12

Fq

Introduction

Fq is a command line utility for manipulating Illumina-generated FastQ files.

For more information, please check:

Versions

  • 0.10.0

Commands

  • fq

Module

You can load the modules by:

module load biocontainers
module load fq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run fq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fq
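
The job script above does not call fq. A minimal sketch of validating a pair of FASTQ files with the lint subcommand is given below, assuming fq lint accepts the two mate files as positional arguments. The function name and the file names are placeholders, and fq lint --help shows the options of the installed version.

```shell
# Validate a pair of FASTQ files with fq lint.
# $1: read 1 FASTQ, $2: read 2 FASTQ
lint_pair() {
    local r1="$1" r2="$2"
    fq lint "$r1" "$r2"
}
```

A call such as lint_pair sample_R1.fastq.gz sample_R2.fastq.gz could follow the module lines in the job script.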

Fraggenescan

Introduction

Fraggenescan is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.

For more information, please check its website: https://biocontainers.pro/tools/fraggenescan and its home page on Github.

Versions

  • 1.31

Commands

  • FragGeneScan

  • run_FragGeneScan.pl

Module

You can load the modules by:

module load biocontainers
module load fraggenescan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fraggenescan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fraggenescan

run_FragGeneScan.pl -genome=example/NC_000913-454.fna -out=example/NC_000913-454 -complete=0 -train=454_10

Fraggenescanrs

Introduction

FragGeneScanRs is a better and faster Rust implementation of the FragGeneScan gene prediction model for short and error-prone reads. Its command line interface is backward compatible and adds extra features for more flexible usage. Compared to the original C implementation, shotgun metagenomic reads are processed up to 22 times faster using a single thread, with better scaling for multithreaded execution.

For more information, please check:

Versions

  • 1.1.0

Commands

  • FragGeneScanRs

Module

You can load the modules by:

module load biocontainers
module load fraggenescanrs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run fraggenescanrs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescanrs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fraggenescanrs
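
The job script above does not include a FragGeneScanRs command. As a minimal sketch, the function below reads nucleotide sequences on standard input and writes the predicted proteins to a file, using the 454_10 training model. The function name is hypothetical, and the model should be chosen to match the sequencing technology and error rate of the reads.

```shell
# Predict genes in short reads with FragGeneScanRs.
# $1: input nucleotide FASTA, $2: output protein FASTA
predict_genes() {
    local reads_fna="$1" proteins_faa="$2"
    FragGeneScanRs -t 454_10 < "$reads_fna" > "$proteins_faa"
}
```

For example, predict_genes example/NC_000913-454.fna example/NC_000913-454.faa could follow the module lines in the job script.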

Freebayes

Introduction

Freebayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

For more information, please check its website: https://biocontainers.pro/tools/freebayes and its home page on Github.

Versions

  • 1.3.5

  • 1.3.6

Commands

  • freebayes

  • freebayes-parallel

Module

You can load the modules by:

module load biocontainers
module load freebayes

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Freebayes on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freebayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers freebayes

freebayes -f ref.fa aln.cram >var.vcf

Freyja

Introduction

Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational “barcodes” derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem.

For more information, please check:

Versions

  • 1.3.11

  • 1.4.2

Commands

  • freyja

Module

You can load the modules by:

module load biocontainers
module load freyja

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run freyja on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freyja
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers freyja

Fseq

Introduction

Fseq is a feature density estimator for high-throughput sequence tags.

For more information, please check its home page: https://fureylab.web.unc.edu/software/fseq/.

Versions

  • 2.0.3

Commands

  • fseq2

Module

You can load the modules by:

module load biocontainers
module load fseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Fseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fseq

Ftp

Introduction

A File Transfer Protocol client (FTP client) is a software utility that establishes a connection between a host computer and a remote server, typically an FTP server.

For more information, please check:

Versions

  • 0.17

Commands

  • ftp

Module

You can load the modules by:

module load biocontainers
module load ftp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ftp on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ftp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ftp

Funannotate

Introduction

Funannotate is a genome prediction, annotation, and comparison software package.

For more information, please check its Docker hub: https://hub.docker.com/r/nextgenusfs/funannotate and its home page on Github.

Versions

  • 1.8.10

  • 1.8.13

Commands

  • funannotate

Module

You can load the modules by:

module load biocontainers
module load funannotate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Funannotate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=funannotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers funannotate

funannotate clean -i genome.fa -o genome_cleaned.fa
funannotate sort -i genome_cleaned.fa -o genome_cleaned_sorted.fa
funannotate predict -i genome_cleaned_sorted.fa -o predict_out --species "arabidopsis" --rna_bam  RNAseq.bam --cpus 12

Fwdpy11

Introduction

Fwdpy11 is a Python package for forward-time population genetic simulation.

For more information, please check:

Versions

  • 0.18.1

Commands

  • python3

  • python

Module

You can load the modules by:

module load biocontainers
module load fwdpy11

Interactive job

To run fwdpy11 interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers fwdpy11
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fwdpy11
>>> pop = fwdpy11.DiploidPopulation(100, 1000.0)
>>> print(f"N = {pop.N}, L = {pop.tables.genome_length}")

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run fwdpy11 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fwdpy11
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers fwdpy11

python script.py

Gadma

Introduction

GADMA is a command-line tool. Its basic pipeline consists of a series of launches of the genetic algorithm followed by local search optimization, and it infers demographic history from the Allele Frequency Spectrum of multiple populations (up to three).

For more information, please check:

Versions

  • 2.0.0rc21

Commands

  • gadma

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load gadma

Interactive job

To run GADMA interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers gadma
(base) UserID@bell-a008:~ $ python
Python 3.8.13 |  packaged by conda-forge |  (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gadma import *

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gadma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gadma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gadma

gadma -p params_file

Gambit

Introduction

GAMBIT (Genomic Approximation Method for Bacterial Identification and Tracking) is a tool for rapid taxonomic identification of microbial pathogens.

For more information, please check:

Versions

  • 0.5.0

Commands

  • gambit

Module

You can load the modules by:

module load biocontainers
module load gambit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gambit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gambit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gambit

gambit -d database query -o results.csv *.fasta

Gamma

Introduction

GAMMA (Gene Allele Mutation Microbial Assessment) is a command line tool that finds gene matches in microbial genomic data using protein coding (rather than nucleotide) identity, and then translates and annotates the match by providing the type (i.e., mutant, truncation, etc.) and a translated description (i.e., Y190S mutant, truncation at residue 110, etc.). Because microbial gene families often have multiple alleles and existing databases are rarely exhaustive, GAMMA is helpful in both identifying and explaining how unique alleles differ from their closest known matches.

For more information, please check:

Versions

  • 1.4

  • 2.2

Commands

  • GAMMA-S.py

  • GAMMA.py

Module

You can load the modules by:

module load biocontainers
module load gamma

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gamma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gamma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gamma

GAMMA.py DHQP1701672_complete_genome.fasta ResFinderDB_Combined_05-06-20.fsa GAMMA_Test

Gangstr

Introduction

GangSTR is a tool for genome-wide profiling of tandem repeats from short reads. A key advantage of GangSTR over existing genome-wide TR tools (e.g. lobSTR or hipSTR) is that it can handle repeats that are longer than the read length. GangSTR takes aligned reads (BAM) and a set of repeats in the reference genome as input and outputs a VCF file containing genotypes for each locus.

For more information, please check:

Versions

  • 2.5.0

Commands

  • GangSTR

Module

You can load the modules by:

module load biocontainers
module load gangstr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gangstr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gangstr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gangstr
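
The job script above stops after loading the module. As a hedged sketch of a call that could follow it, the helper below passes the inputs named in the introduction (a BAM file, a reference, and a set of repeat regions) plus an output prefix. The function name run_gangstr, the DRY_RUN switch, and all file names are illustrative assumptions; the option names follow the GangSTR README as best recalled and should be confirmed with GangSTR --help.

```shell
# Hypothetical wrapper: arguments are BAM file, reference FASTA,
# repeat regions BED, and output prefix (all placeholders).
run_gangstr() {
    local bam="$1" ref="$2" regions="$3" out_prefix="$4"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "GangSTR --bam $bam --ref $ref --regions $regions --out $out_prefix"
    else
        GangSTR --bam "$bam" --ref "$ref" --regions "$regions" --out "$out_prefix"
    fi
}
```

Setting DRY_RUN=1 before calling run_gangstr prints the command line, which is a cheap way to review it before spending allocation time.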

Gapfiller

Introduction

GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length.

For more information, please check:

Versions

  • 2.1.2

Commands

  • GapFiller

Module

You can load the modules by:

module load biocontainers
module load gapfiller

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gapfiller on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapfiller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gapfiller

Gapit

Introduction

GAPIT is a Genome Association and Prediction Integrated Tool.

For more information, please check:

Versions

  • 3.3

Commands

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load gapit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gapit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gapit

GATK

Introduction

GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery.

For more information, please check its website: https://biocontainers.pro/tools/gatk and its home page: https://www.broadinstitute.org/gatk/.

Versions

  • 3.8

Commands

  • gatk3

Module

You can load the modules by:

module load biocontainers
module load gatk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run GATK on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gatk

gatk3 -T HaplotypeCaller \
    -nct 24 -R hg38.fa \
    -I 19P0126636WES.sorted.bam \
    -o 19P0126636WES.HC.vcf

GATK4

Introduction

GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery. Detailed usage can be found here: https://www.broadinstitute.org/gatk/.

Versions

  • 4.2.0

  • 4.2.5.0

  • 4.2.6.1

  • 4.3.0.0

Commands

  • gatk

Module

You can load the modules by:

module load biocontainers
module load gatk4/4.2.5.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gatk4 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gatk4/4.2.5.0

gatk --java-options "-Xmx12G -XX:ParallelGCThreads=24" HaplotypeCaller \
    -R hg38.fa \
    -I 19P0126636WES.sorted.bam \
    -O 19P0126636WES.HC.vcf \
    --sample-name 19P0126636
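
The example hard-codes 24 both in #SBATCH -n and in -XX:ParallelGCThreads, so the two values can drift apart when the script is edited. A minimal sketch of deriving the JVM options from the allocation instead: gatk_java_opts is a hypothetical helper name, SLURM_NTASKS is the variable Slurm sets when -n/--ntasks is given, and the fallback of 1 outside a job is an assumption.

```shell
# Hypothetical helper: build the --java-options string from the Slurm
# allocation. The optional argument is the heap size (default 12G, as in
# the example above).
gatk_java_opts() {
    local mem="${1:-12G}"
    # Use the task count granted by Slurm; fall back to 1 outside a job.
    local threads="${SLURM_NTASKS:-1}"
    echo "-Xmx${mem} -XX:ParallelGCThreads=${threads}"
}
```

Inside the job script this could be used as gatk --java-options "$(gatk_java_opts)" HaplotypeCaller, followed by the same arguments as in the example.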

Gemma

Introduction

Gemma is a software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets.

For more information, please check its website: https://biocontainers.pro/tools/gemma and its home page on GitHub.

Versions

  • 0.98.3

Commands

  • gemma

Module

You can load the modules by:

module load biocontainers
module load gemma

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gemma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gemma

gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt \
    -gk -o mouse_hs1940

gemma -g ./example/mouse_hs1940.geno.txt.gz \
    -p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt \
    -k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm

Gemoma

Introduction

Gene Model Mapper (GeMoMa) is a homology-based gene prediction program. GeMoMa uses the annotation of protein-coding genes in a reference genome to infer the annotation of protein-coding genes in a target genome, relying on the conservation of amino acid sequences and intron positions. In addition, GeMoMa can incorporate RNA-seq evidence for splice site prediction.

For more information, please check:

Versions

  • 1.7.1

Commands

  • GeMoMa

Module

You can load the modules by:

module load biocontainers
module load gemoma

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gemoma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemoma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gemoma

GeneMark-ES/ET/EP

Introduction

GeneMark-ES/ET/EP contains GeneMark-ES, GeneMark-ET and GeneMark-EP+ algorithms.

Versions

  • 4.68

  • 4.69

Commands

  • bed_to_gff.pl

  • bp_seq_select.pl

  • build_mod.pl

  • calc_introns_from_gtf.pl

  • change_path_in_perl_scripts.pl

  • compare_intervals_exact.pl

  • gc_distr.pl

  • get_below_gc.pl

  • get_sequence_from_GTF.pl

  • gmes_petap.pl

  • hc_exons2hints.pl

  • histogram.pl

  • make_nt_freq_mat.pl

  • parse_ET.pl

  • parse_by_introns.pl

  • parse_gibbs.pl

  • parse_set.pl

  • predict_genes.pl

  • reformat_gff.pl

  • rescale_gff.pl

  • rnaseq_introns_to_gff.pl

  • run_es.pl

  • run_hmm_pbs.pl

  • scan_for_bp.pl

  • star_to_gff.pl

  • verify_evidence_gmhmm.pl

Academic license

To use GeneMark, users need to download the license file themselves.

Go to the GeneMark website: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic and LINUX 64 next to it, fill out the form, then click “I agree”. On the next page, right-click and copy the link address for the 64-bit license. Paste the link address into the commands below:

cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
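
After running the commands above, it can help to confirm that the key file is actually in place before submitting a job. A minimal sketch, assuming the key is expected at .gm_key in the home directory as shown above; gm_key_installed is a hypothetical helper name, not a GeneMark command.

```shell
# Hypothetical helper: succeed only if a non-empty .gm_key exists.
# The optional argument is the directory to check (default: $HOME).
gm_key_installed() {
    local dir="${1:-$HOME}"
    if [ -s "${dir}/.gm_key" ]; then
        echo "found ${dir}/.gm_key"
    else
        echo "missing ${dir}/.gm_key" >&2
        return 1
    fi
}
```

Calling gm_key_installed at the top of a job script, before gmes_petap.pl, makes a missing or empty key show up as an explicit message.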

Module

You can load the modules by:

module load biocontainers
module load genemark/4.68

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run GeneMark on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=genemark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genemark/4.68

gmes_petap.pl --ES  --cores 24 --sequence scaffolds.fasta

Genemarks-2

Introduction

GeneMarkS-2 combines GeneMark.hmm (prokaryotic) and GeneMark (prokaryotic) with a self-training procedure that determines parameters of the models of both GeneMark.hmm and GeneMark.

For more information, please check:

Home page: http://opal.biology.gatech.edu/GeneMark/

Note: users need to download their own license key from the GeneMark website and copy the key file “gm_key” into their home directory: cp gm_key ~/.gm_key

Versions

  • 1.14_1.25

Commands

  • gms2.pl

  • biogem

  • compp

  • gmhmmp2

Module

You can load the modules by:

module load biocontainers
module load genemarks-2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run genemarks-2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genemarks-2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genemarks-2

Genmap

Introduction

GenMap: Ultra-fast Computation of Genome Mappability.

For more information, please check:

Versions

  • 1.3.0

Commands

  • genmap

Module

You can load the modules by:

module load biocontainers
module load genmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run genmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genmap

export TMPDIR=$PWD/tmp
mkdir -p "$TMPDIR"    # make sure the temporary directory exists
genmap index -F ~/.local/share/genomes/hg38/hg38.fa  -I hg38_index
genmap map -K 64 -E 2 -I hg38_index -O map_output_hg38 -t -w -bg

Genomedata

Introduction

Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint.

For more information, please check:

Versions

  • 1.5.0

Commands

  • python

  • python3

  • genomeCoverageBed

  • genomedata-close-data

  • genomedata-erase-data

  • genomedata-hardmask

  • genomedata-histogram

  • genomedata-info

  • genomedata-load

  • genomedata-load-assembly

  • genomedata-load-data

  • genomedata-load-seq

  • genomedata-open-data

  • genomedata-query

  • genomedata-report

Module

You can load the modules by:

module load biocontainers
module load genomedata

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run genomedata on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomedata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genomedata

Genomepy

Introduction

Genomepy is designed to provide a simple and straightforward way to download and use genomic data.

For more information, please check its website: https://biocontainers.pro/tools/genomepy and its home page on GitHub.

Versions

  • 0.12.0

  • 0.14.0

Commands

  • genomepy

Module

You can load the modules by:

module load biocontainers
module load genomepy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Genomepy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomepy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genomepy

Genomescope2

Introduction

Genomescope2: Reference-free profiling of polyploid genomes.

For more information, please check its website: https://biocontainers.pro/tools/genomescope2 and its home page on GitHub.

Versions

  • 2.0

Commands

  • genomescope2

Module

You can load the modules by:

module load biocontainers
module load genomescope2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Genomescope2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomescope2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genomescope2

wget https://raw.githubusercontent.com/schatzlab/genomescope/master/analysis/real_data/ara_F1_21.hist

genomescope2 -i ara_F1_21.hist -o output -k 21

Genomicconsensus

Introduction

Genomicconsensus is the current PacBio consensus and variant calling suite.

For more information, please check its website: https://biocontainers.pro/tools/genomicconsensus and its home page on GitHub.

Versions

  • 2.3.3

Commands

  • quiver

  • arrow

  • variantCaller

Module

You can load the modules by:

module load biocontainers
module load genomicconsensus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Genomicconsensus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomicconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genomicconsensus

quiver -j12 out.aligned_subreads.bam \
    -r All4mer.V2.01_Insert-changed.fa  \
    -o consensus.fasta -o consensus.fastq

Genrich

Introduction

Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment.

For more information, please check its website: https://biocontainers.pro/tools/genrich and its home page on GitHub.

Versions

  • 0.6.1

Commands

  • Genrich

Module

You can load the modules by:

module load biocontainers
module load genrich

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Genrich on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genrich
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers genrich

Genrich  -t sample.bam  -o sample.narrowPeak  -v

Getorganelle

Introduction

GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes.

For more information, please check:

Versions

  • 1.7.7.0

Commands

  • get_organelle_config.py

  • get_organelle_from_assembly.py

  • get_organelle_from_reads.py

  • slim_graph.py

  • summary_get_organelle_output.py

Module

You can load the modules by:

module load biocontainers
module load getorganelle

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run getorganelle on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=getorganelle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers getorganelle

Gfaffix

Introduction

GFAffix identifies walk-preserving shared affixes in variation graphs and collapses them into a non-redundant graph structure.

For more information, please check:

Versions

  • 0.1.4

Commands

  • gfaffix

Module

You can load the modules by:

module load biocontainers
module load gfaffix

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gfaffix on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfaffix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gfaffix

Gfastats

Introduction

gfastats is a single fast and exhaustive tool for summary statistics and simultaneous fa (fasta, fastq, gfa [.gz]) genome assembly file manipulation. gfastats also allows seamless fasta<>fastq<>gfa[.gz] conversion. It has been tested in genomes even >100Gbp.

For more information, please check:

Versions

  • 1.2.3

  • 1.3.6

Commands

  • gfastats

Module

You can load the modules by:

module load biocontainers
module load gfastats

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gfastats on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfastats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gfastats

gfastats input.fasta -o gfa

Gfatools

Introduction

gfatools is a set of tools for manipulating sequence graphs in the GFA or the rGFA format. It implements parsing, subgraph extraction, and conversion to FASTA/BED.

For more information, please check:

Versions

  • 0.5

Commands

  • gfatools

Module

You can load the modules by:

module load biocontainers
module load gfatools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gfatools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gfatools

# Extract a subgraph
gfatools view -l MTh4502 -r 1 test/MT.gfa > sub.gfa

# Convert GFA to segment FASTA
gfatools gfa2fa test/MT.gfa > MT-seg.fa

# Convert rGFA to stable FASTA or BED
gfatools gfa2fa -s test/MT.gfa > MT.fa
gfatools gfa2bed -m test/MT.gfa > MT.bed

Gffcompare

Introduction

Gffcompare is used to compare, merge, annotate and estimate accuracy of one or more GFF files.

For more information, please check its website: https://biocontainers.pro/tools/gffcompare and its home page: https://ccb.jhu.edu/software/stringtie/gffcompare.shtml.

Versions

  • 0.11.2

Commands

  • gffcompare

Module

You can load the modules by:

module load biocontainers
module load gffcompare

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gffcompare on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffcompare
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gffcompare

gffcompare -r annotation.gff transcripts.gtf

Gffread

Introduction

Gffread is used to validate, filter, convert and perform various other operations on GFF files.

For more information, please check its website: https://biocontainers.pro/tools/gffread and its home page: http://ccb.jhu.edu/software/stringtie/gff.shtml.

Versions

  • 0.12.7

Commands

  • gffread

Module

You can load the modules by:

module load biocontainers
module load gffread

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gffread on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gffread

gffread -E annotation.gff -o ann_simple.gff

gffread annotation.gff -T -o annotation.gtf

gffread -w transcripts.fa -g genome.fa annotation.gff

Gffutils

Introduction

gffutils is a Python package for working with and manipulating the GFF and GTF format files typically used for genomic annotations.

For more information, please check:

Versions

  • 0.11.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load gffutils

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gffutils on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gffutils

Gimmemotifs

Introduction

GimmeMotifs is a suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments.

For more information, please check:

Versions

  • 0.17.1

Commands

  • gimme

Module

You can load the modules by:

module load biocontainers
module load gimmemotifs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gimmemotifs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gimmemotifs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gimmemotifs

gimme motifs ENCFF407IVS.bed ENCFF407IVS_motifs \
    -g ~/.local/share/genomes/hg38/hg38.fa --denovo

Glimmer

Introduction

Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.

For more information, please check its website: https://biocontainers.pro/tools/glimmer and its home page: http://ccb.jhu.edu/software/glimmer/index.shtml.

Versions

  • 3.02

Commands

  • anomaly

  • build-fixed

  • build-icm

  • entropy-profile

  • entropy-score

  • extract

  • g3-from-scratch.csh

  • g3-from-training.csh

  • g3-iterated.csh

  • get-motif-counts.awk

  • glim-diff.awk

  • glimmer3

  • long-orfs

  • match-list-col.awk

  • multi-extract

  • not-acgt.awk

  • score-fixed

  • start-codon-distrib

  • test

  • uncovered

  • upstream-coords.awk

  • window-acgt

Module

You can load the modules by:

module load biocontainers
module load glimmer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Glimmer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers glimmer

long-orfs -n -t 1.15 scaffolds.fasta run1.longorfs
extract -t scaffolds.fasta run1.longorfs > run1.train
build-icm -r run1.icm < run1.train
glimmer3 scaffolds.fasta run1.icm cm

Glimmerhmm

Introduction

Glimmerhmm is a new gene finder based on a Generalized Hidden Markov Model (GHMM).

For more information, please check its website: https://biocontainers.pro/tools/glimmerhmm and its home page: https://ccb.jhu.edu/software/glimmerhmm/.

Versions

  • 3.0.4

Commands

  • glimmerhmm

  • glimmhmm.pl

  • trainGlimmerHMM

Module

You can load the modules by:

module load biocontainers
module load glimmerhmm

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Glimmerhmm on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmerhmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers glimmerhmm

trainGlimmerHMM Asperg.fasta Asperg.cds -d Asperg
glimmerhmm Asperg.fasta -d Asperg -o Asperg_glimmerhmm_out

Glnexus

Introduction

Glnexus: Scalable gVCF merging and joint variant calling for population sequencing projects.

For more information, please check:

Versions

  • 1.4.1

Commands

  • glnexus_cli

Module

You can load the modules by:

module load biocontainers
module load glnexus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run glnexus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glnexus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers glnexus

glnexus_cli --config DeepVariant \
    --bed ALDH2.bed \
    dv_1000G_ALDH2_gvcf/*.g.vcf.gz \
    > dv_1000G_ALDH2.bcf

Gmap

Introduction

Gmap is a genomic mapping and alignment program for mRNA and EST sequences.

For more information, please check its website: https://biocontainers.pro/tools/gmap and its home page: http://research-pub.gene.com/gmap/.

Versions

  • 2021.05.27

  • 2021.08.25

Commands

  • atoiindex

  • cmetindex

  • cpuid

  • dbsnp_iit

  • ensembl_genes

  • fa_coords

  • get-genome

  • gff3_genes

  • gff3_introns

  • gff3_splicesites

  • gmap

  • gmap.avx2

  • gmap_build

  • gmap_cat

  • gmapindex

  • gmapl

  • gmapl.avx2

  • gmapl.nosimd

  • gmap.nosimd

  • gmap_process

  • gsnap

  • gsnap.avx2

  • gsnapl

  • gsnapl.avx2

  • gsnapl.nosimd

  • gsnap.nosimd

  • gtf_genes

  • gtf_introns

  • gtf_splicesites

  • gtf_transcript_splicesites

  • gvf_iit

  • iit_dump

  • iit_get

  • iit_store

  • indexdb_cat

  • md_coords

  • psl_genes

  • psl_introns

  • psl_splicesites

  • sam_sort

  • snpindex

  • trindex

  • vcf_iit

Module

You can load the modules by:

module load biocontainers
module load gmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=gmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gmap

gmap_build -d Cmm -D Cmm genome.fasta
gmap -d Cmm -t 4 -D ./Cmm  cdna.fasta > gmap_out.txt

gmap_build -d GRCh38 -D GRCh38 Homo_sapiens.GRCh38.dna.primary_assembly.fa
gsnap -d GRCh38 -D ./GRCh38 --nthreads=4  SRR16956239_1.fastq SRR16956239_2.fastq > gsnap_out.txt

goatools

Introduction

Goatools is a Python library for gene ontology analyses. Detailed information about its usage can be found here: https://github.com/tanghaibao/goatools

Versions

  • 1.1.12

  • 1.2.3

Commands

  • python

  • python3

  • compare_gos.py

  • fetch_associations.py

  • find_enrichment.py

  • go_plot.py

  • map_to_slim.py

  • ncbi_gene_results_to_python.py

  • plot_go_term.py

  • prt_terms.py

  • runxlrd.py

  • vba_extract.py

  • wr_hier.py

  • wr_sections.py

Module

You can load the modules by:

module load biocontainers
module load goatools/1.1.12

Interactive job

To run goatools interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers goatools/1.1.12
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from goatools.base import download_go_basic_obo
>>> obo_fname = download_go_basic_obo()

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit an sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=goatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers goatools/1.1.12

python script.py

find_enrichment.py --pval=0.05 --indent data/study data/population data/association

go_plot.py --go_file=tests/data/go_plot/go_heartjogging6.txt -r -o heartjogging6_r1.png

Graphlan

Introduction

Graphlan is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees.

For more information, please check its website: https://biocontainers.pro/tools/graphlan and its home page: https://huttenhower.sph.harvard.edu/graphlan/.

Versions

  • 1.1.3

Commands

  • graphlan.py

  • graphlan_annotate.py

Module

You can load the modules by:

module load biocontainers
module load graphlan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Graphlan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers graphlan

graphlan_annotate.py hmptree.xml hmptree.annot.xml --annot annot.txt

graphlan.py hmptree.annot.xml hmptree.png --dpi 150 --size 14

Graphmap

Introduction

Graphmap is a novel mapper targeted at aligning long, error-prone third-generation sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/graphmap and its home page on Github.

Versions

  • 0.6.3

Commands

  • graphmap2

Module

You can load the modules by:

module load biocontainers
module load graphmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Graphmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers graphmap

Gridss

Introduction

Gridss is a modular software suite containing tools useful for the detection of genomic rearrangements.

For more information, please check its Docker hub: https://hub.docker.com/r/gridss/gridss and its home page on Github.

Versions

  • 2.13.2

Commands

  • R

  • Rscript

  • gridss

  • gridss_annotate_vcf_kraken2

  • gridss_annotate_vcf_repeatmasker

  • gridss_extract_overlapping_fragments

  • gridss_somatic_filter

  • gridsstools

  • virusbreakend

  • virusbreakend-build

Module

You can load the modules by:

module load biocontainers
module load gridss

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gridss on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gridss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gridss

Gseapy

Introduction

Gseapy is a Python wrapper for GSEA and Enrichr.

For more information, please check its website: https://biocontainers.pro/tools/gseapy and its home page: https://gseapy.readthedocs.io/en/latest/introduction.html.

Versions

  • 0.10.8

Commands

  • gseapy

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load gseapy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gseapy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gseapy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gseapy

gseapy ssgsea -d ./data/testSet_rand1200.gct \
            -g data/temp.gmt \
            -o test/ssgsea_report2  \
            -p 4 --no-plot --no-scale
gseapy replot -i data -o test/replot_test

GTDB-Tk

Introduction

GTDB-Tk is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB). It is designed to work with recent advances that allow hundreds or thousands of metagenome-assembled genomes (MAGs) to be obtained directly from environmental samples. It can also be applied to isolate and single-cell genomes.

GTDB-Tk reference data (R202) has been downloaded for users.

Versions

  • 1.7.0

  • 2.1.0

Commands

  • gtdbtk

Module

You can load the modules by:

module load biocontainers
module load gtdbtk/1.7.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run GTDB-Tk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gtdbtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gtdbtk/1.7.0

gtdbtk identify --genome_dir genomes --out_dir identify --extension gz --cpus 8
gtdbtk align --identify_dir identify --out_dir align --cpus 8
gtdbtk classify --genome_dir genomes --align_dir align --out_dir classify --extension gz --cpus 8
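The script above requests 24 tasks (-n 24) but passes --cpus 8 to each step. One way to keep the two in sync is to read the task count that Slurm exports as SLURM_NTASKS. This is only a sketch: the helper name pick_cpus and the fallback value of 1 are illustrative choices, and the gtdbtk lines are shown as comments reusing the arguments from the example.

```shell
# Hypothetical helper: use the Slurm task count, or 1 when run outside a job
# (the fallback value is an assumption).
pick_cpus() {
    echo "${SLURM_NTASKS:-1}"
}

cpus=$(pick_cpus)
echo "Using ${cpus} CPU(s)"

# Same commands as in the example job, with the hard-coded 8 replaced:
# gtdbtk identify --genome_dir genomes --out_dir identify --extension gz --cpus "${cpus}"
# gtdbtk align --identify_dir identify --out_dir align --cpus "${cpus}"
# gtdbtk classify --genome_dir genomes --align_dir align --out_dir classify --extension gz --cpus "${cpus}"
```

Whether using all requested cores is faster depends on the step and on available memory, so the hard-coded value may still be the better choice for some datasets.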

Gubbins

Introduction

Gubbins is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.

For more information, please check its website: https://biocontainers.pro/tools/gubbins and its home page on Github.

Versions

  • 3.2.0

  • 3.3

Commands

  • extract_gubbins_clade.py

  • generate_ska_alignment.py

  • gubbins_alignment_checker.py

  • mask_gubbins_aln.py

  • run_gubbins.py

  • sumlabels.py

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load gubbins

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Gubbins on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gubbins
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers gubbins

run_gubbins.py --prefix ST239 ST239.aln

Guppy

Introduction

Guppy is a data processing toolkit that contains the Oxford Nanopore Technologies’ basecalling algorithms, and several bioinformatic post-processing features.

For more information, please check its Docker hub: https://hub.docker.com/r/genomicpariscentre/guppy and its home page: https://community.nanoporetech.com.

Versions

  • 6.0.1

  • 6.5.7

Commands

  • guppy_aligner

  • guppy_barcoder

  • guppy_basecall_server

  • guppy_basecaller

  • guppy_basecaller_duplex

  • guppy_basecaller_supervisor

  • guppy_basecall_client

Module

You can load the modules by:

module load biocontainers
module load guppy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Guppy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=guppy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers guppy

guppy_basecaller --compress_fastq -i data/fast5_tiny/ \
    -s basecall_tiny/ --cpu_threads_per_caller 12 \
    --num_callers 1 -c dna_r9.4.1_450bps_hac.cfg
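In the example, one caller uses all 12 requested cores. If more callers are wanted, the cores can be divided among them so the product stays within the allocation. A minimal sketch, assuming an even split is acceptable; the helper threads_per_caller is hypothetical and the fallback of 12 simply mirrors the -n 12 request above.

```shell
# Hypothetical helper: split the cores of a job evenly among basecallers
# (integer division, so any remainder is left unused).
threads_per_caller() {
    local total="$1" callers="$2"
    echo $(( total / callers ))
}

total="${SLURM_NTASKS:-12}"   # fallback value is an assumption
callers=1
echo "callers=${callers} threads_per_caller=$(threads_per_caller "$total" "$callers")"
```

The two numbers would then be passed to --num_callers and --cpu_threads_per_caller in place of the fixed values.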

Hail

Introduction

Hail is an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data.

For more information, please check:

Versions

  • 0.2.94

  • 0.2.98

Commands

  • python3

Module

You can load the modules by:

module load biocontainers
module load hail

Interactive job

To run Hail interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers hail
(base) UserID@bell-a008:~ $ python3
Python 3.7.13 (default, Apr 24 2022, 01:05:22)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> print(hl.citation())
Hail Team. Hail 0.2.94-f0b38d6c436f. https://github.com/hail-is/hail/commit/f0b38d6c436f.

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run hail on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hail
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hail
python3 script.py

Hap.py

Introduction

Hap.py is a tool to compare diploid genotypes at haplotype level.

For more information, please check:

Versions

  • 0.3.9

Commands

  • bamstats.py

  • cnx.py

  • ftx.py

  • guess-ploidy.py

  • hap.py

  • ovc.py

  • plot-roh.py

  • pre.py

  • qfy.py

  • som.py

  • varfilter.py

Module

You can load the modules by:

module load biocontainers
module load hap.py

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run hap.py on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hap.py
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hap.py

hap.py  \
  example/happy/PG_NA12878_chr21.vcf.gz \
  example/happy/NA12878_chr21.vcf.gz \
  -f example/happy/PG_Conf_chr21.bed.gz \
  -r example/chr21.fa \
  -o test

Helen

Introduction

HELEN is a multi-task RNN polisher which operates on images produced by MarginPolish.

For more information, please check:

Versions

  • 1.0

Commands

  • helen

Module

You can load the modules by:

module load biocontainers
module load helen

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run helen on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=helen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers helen

helen polish \
    --image_dir mp_output \
    --model_path "helen_models/HELEN_r941_guppy344_microbial.pkl" \
    --threads 32 \
    --output_dir "helen_output/" \
    --output_prefix Staph_Aur_draft_helen

Hicexplorer

Introduction

Hicexplorer is a set of tools to process, normalize and visualize Hi-C data.

For more information, please check its website: https://biocontainers.pro/tools/hicexplorer and its home page: https://hicexplorer.readthedocs.io/en/latest/#.

Versions

  • 3.7.2

Commands

  • chicAggregateStatistic

  • chicDifferentialTest

  • chicExportData

  • chicPlotViewpoint

  • chicQualityControl

  • chicSignificantInteractions

  • chicViewpoint

  • chicViewpointBackgroundModel

  • hicAdjustMatrix

  • hicAggregateContacts

  • hicAverageRegions

  • hicBuildMatrix

  • hicCompareMatrices

  • hicCompartmentalization

  • hicConvertFormat

  • hicCorrectMatrix

  • hicCorrelate

  • hicCreateThresholdFile

  • hicDetectLoops

  • hicDifferentialTAD

  • hicexplorer

  • hicFindEnrichedContacts

  • hicFindRestSite

  • hicFindTADs

  • hicHyperoptDetectLoops

  • hicHyperoptDetectLoopsHiCCUPS

  • hicInfo

  • hicInterIntraTAD

  • hicMergeDomains

  • hicMergeLoops

  • hicMergeMatrixBins

  • hicMergeTADbins

  • hicNormalize

  • hicPCA

  • hicPlotAverageRegions

  • hicPlotDistVsCounts

  • hicPlotMatrix

  • hicPlotSVL

  • hicPlotTADs

  • hicPlotViewpoint

  • hicQC

  • hicQuickQC

  • hicSumMatrices

  • hicTADClassifier

  • hicTrainTADClassifier

  • hicTransform

  • hicValidateLocations

Module

You can load the modules by:

module load biocontainers
module load hicexplorer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Hicexplorer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hicexplorer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hicexplorer

Hic-pro

Introduction

Hicpro is an optimized and flexible pipeline for Hi-C data processing.

For more information, please check:

Versions

  • 3.0.0

  • 3.1.0

Commands

  • HiC-Pro

  • digest_genome.py

  • extract_snps.py

  • hicpro2fithic.py

  • hicpro2higlass.sh

  • hicpro2juicebox.sh

  • make_viewpoints.py

  • sparseToDense.py

  • split_reads.py

  • split_sparse.py

Module

You can load the modules by:

module load biocontainers
module load hic-pro

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run hic-pro on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hic-pro
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hic-pro

Hifiasm

Introduction

Hifiasm is a fast haplotype-resolved de novo assembler for PacBio HiFi reads.

For more information, please check its website: https://biocontainers.pro/tools/hifiasm and its home page on Github.

Versions

  • 0.16.0

  • 0.18.5

Commands

  • hifiasm

Module

You can load the modules by:

module load biocontainers
module load hifiasm

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Hifiasm on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hifiasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hifiasm

HISAT2

Introduction

HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.

For more information, please check its website: https://biocontainers.pro/tools/hisat2 and its home page on Github.

Versions

  • 2.2.1

Commands

  • extract_exons.py

  • extract_splice_sites.py

  • hisat2

  • hisat2-align-l

  • hisat2-align-s

  • hisat2-build

  • hisat2-build-l

  • hisat2-build-s

  • hisat2-inspect

  • hisat2-inspect-l

  • hisat2-inspect-s

  • hisat2_extract_exons.py

  • hisat2_extract_snps_haplotypes_UCSC.py

  • hisat2_extract_snps_haplotypes_VCF.py

  • hisat2_extract_splice_sites.py

  • hisat2_read_statistics.py

  • hisat2_simulate_reads.py

Module

You can load the modules by:

module load biocontainers
module load hisat2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run HISAT2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hisat2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hisat2

hisat2-build genome.fa genome

# for single-end FASTA reads DNA alignment
hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment

# for paired-end FASTQ reads alignment
hisat2 -x genome -1 reads_1.fq -2 reads_2.fq -S output.sam
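With several paired-end samples, the mate file name can be derived from the first read file instead of typed by hand, which avoids mismatched names. A minimal sketch, assuming files are named sample_1.fq and sample_2.fq; the helper mate_of is hypothetical and the hisat2 call is left as a comment.

```shell
# Hypothetical helper: map <sample>_1.fq to <sample>_2.fq via suffix removal.
mate_of() {
    local r1="$1"
    echo "${r1%_1.fq}_2.fq"
}

# A literal list is used here; in practice it could be the set of *_1.fq files.
for r1 in reads_1.fq; do
    r2=$(mate_of "$r1")
    sample="${r1%_1.fq}"
    echo "${sample}: ${r1} ${r2}"
    # hisat2 -x genome -1 "$r1" -2 "$r2" -S "${sample}.sam"
done
```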

Hmmer

Introduction

Hmmer is used for searching sequence databases for sequence homologs, and for making sequence alignments.

For more information, please check its website: https://biocontainers.pro/tools/hmmer and its home page: http://hmmer.org.

Versions

  • 3.3.2

Commands

  • alimask

  • easel

  • esl-afetch

  • esl-alimanip

  • esl-alimap

  • esl-alimask

  • esl-alimerge

  • esl-alipid

  • esl-alirev

  • esl-alistat

  • esl-compalign

  • esl-compstruct

  • esl-construct

  • esl-histplot

  • esl-mask

  • esl-mixdchlet

  • esl-reformat

  • esl-selectn

  • esl-seqrange

  • esl-seqstat

  • esl-sfetch

  • esl-shuffle

  • esl-ssdraw

  • esl-translate

  • esl-weight

  • hmmalign

  • hmmbuild

  • hmmconvert

  • hmmemit

  • hmmfetch

  • hmmlogo

  • hmmpgmd

  • hmmpgmd_shard

  • hmmpress

  • hmmscan

  • hmmsearch

  • hmmsim

  • hmmstat

  • jackhmmer

  • makehmmerdb

  • nhmmer

  • nhmmscan

  • phmmer

Module

You can load the modules by:

module load biocontainers
module load hmmer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Hmmer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hmmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hmmer

hmmsearch Nramp.hmm protein.fa > out

HOMMER

Introduction

HOMER (Hypergeometric Optimization of Motif EnRichment), provided here as the hommer module, is a suite of tools for motif discovery and next-generation sequencing analysis. Details about its usage can be found on the HOMER website.

Versions

  • 4.11

Commands

  • addDataHeader.pl

  • addData.pl

  • addGeneAnnotation.pl

  • addInternalData.pl

  • addOligos.pl

  • adjustPeakFile.pl

  • adjustRedunGroupFile.pl

  • analyzeChIP-Seq.pl

  • analyzeRepeats.pl

  • analyzeRNA.pl

  • annotateInteractions.pl

  • annotatePeaks.pl

  • annotateRelativePosition.pl

  • annotateTranscripts.pl

  • assignGeneWeights.pl

  • assignTSStoGene.pl

  • batchAnnotatePeaksHistogram.pl

  • batchFindMotifsGenome.pl

  • batchFindMotifs.pl

  • batchMakeHiCMatrix.pl

  • batchMakeMultiWigHub.pl

  • batchMakeTagDirectory.pl

  • batchParallel.pl

  • bed2DtoUCSCbed.pl

  • bed2pos.pl

  • bed2tag.pl

  • blat2gtf.pl

  • bridgeResult2Cytoscape.pl

  • changeNewLine.pl

  • checkPeakFile.pl

  • checkTagBias.pl

  • chopify.pl

  • chopUpBackground.pl

  • chopUpPeakFile.pl

  • cleanUpPeakFile.pl

  • cleanUpSequences.pl

  • cluster2bedgraph.pl

  • cluster2bed.pl

  • combineGO.pl

  • combineHubs.pl

  • compareMotifs.pl

  • condenseBedGraph.pl

  • cons2fasta.pl

  • conservationAverage.pl

  • conservationPerLocus.pl

  • convertCoordinates.pl

  • convertIDs.pl

  • convertOrganismID.pl

  • duplicateCol.pl

  • eland2tags.pl

  • fasta2tab.pl

  • fastq2fasta.pl

  • filterListBy.pl

  • filterTADsAndCPs.pl

  • filterTADsAndLoops.pl

  • findcsRNATSS.pl

  • findGO.pl

  • findGOtxt.pl

  • findHiCCompartments.pl

  • findHiCDomains.pl

  • findHiCInteractionsByChr.pl

  • findKnownMotifs.pl

  • findMotifsGenome.pl

  • findMotifs.pl

  • findRedundantBLAT.pl

  • findTADsAndLoops.pl

  • findTopMotifs.pl

  • flipPC1toMatch.pl

  • freq2group.pl

  • genericConvertIDs.pl

  • GenomeOntology.pl

  • getChrLengths.pl

  • getConservedRegions.pl

  • getDifferentialBedGraph.pl

  • getDifferentialPeaksReplicates.pl

  • getDiffExpression.pl

  • getDistalPeaks.pl

  • getFocalPeaks.pl

  • getGenesInCategory.pl

  • getGWASoverlap.pl

  • getHiCcorrDiff.pl

  • getHomerQCstats.pl

  • getLikelyAdapters.pl

  • getMappingStats.pl

  • getPartOfPromoter.pl

  • getPos.pl

  • getRandomReads.pl

  • getSiteConservation.pl

  • getTopPeaks.pl

  • gff2pos.pl

  • go2cytoscape.pl

  • groupSequences.pl

  • joinFiles.pl

  • loadGenome.pl

  • loadPromoters.pl

  • makeBigBedMotifTrack.pl

  • makeBigWig.pl

  • makeBinaryFile.pl

  • makeHiCWashUfile.pl

  • makeMetaGeneProfile.pl

  • makeMultiWigHub.pl

  • map-fastq.pl

  • merge2Dbed.pl

  • mergeData.pl

  • motif2Jaspar.pl

  • motif2Logo.pl

  • parseGTF.pl

  • pos2bed.pl

  • preparseGenome.pl

  • prepForR.pl

  • profile2seq.pl

  • qseq2fastq.pl

  • randomizeGroupFile.pl

  • randomizeMotifs.pl

  • randRemoveBackground.pl

  • removeAccVersion.pl

  • removeBadSeq.pl

  • removeOutOfBoundsReads.pl

  • removePoorSeq.pl

  • removeRedundantPeaks.pl

  • renamePeaks.pl

  • resizePosFile.pl

  • revoppMotif.pl

  • rotateHiCmatrix.pl

  • runHiCpca.pl

  • sam2spliceJunc.pl

  • scanMotifGenomeWide.pl

  • scrambleFasta.pl

  • selectRepeatBg.pl

  • seq2profile.pl

  • SIMA.pl

  • subtractBedGraphsDirectory.pl

  • subtractBedGraphs.pl

  • tab2fasta.pl

  • tag2bed.pl

  • tag2pos.pl

  • tagDir2bed.pl

  • tagDir2hicFile.pl

  • tagDir2HiCsummary.pl

  • zipHomerResults.pl

Database

Selected databases have been downloaded for users.

  • ORGANISMS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human and fly

  • PROMOTERS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human and fly

  • GENOMES: hg19, hg38, mm10, ce11, dm6, rn6, danRer11, tair10, and sacCer3

Module

You can load the modules by:

module load biocontainers
module load hommer/4.11

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run HOMMER on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=hommer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hommer/4.11

configureHomer.pl -list   ## Check the installed database.
findMotifs.pl mouse_geneid.txt mouse motif_out_mouse
findMotifs.pl geneid.txt human motif_out

Homopolish

Introduction

Homopolish is a genome polisher originally developed for Nanopore and subsequently extended for PacBio CLR. It generates a high-quality genome (>Q50) for viruses, bacteria, and fungi. Nanopore/PacBio systematic errors are corrected by retrieving homologs from closely related genomes and polishing with an SVM. When paired with Racon and Medaka, the genome quality can reach Q50-90 (>99.999%) on Nanopore R9.4/10.3 flowcells (Guppy >3.4). For PacBio CLR, Homopolish also improves the majority of Flye-assembled genomes to Q90.

For more information, please check:

Versions

  • 0.4.1

Commands

  • homopolish

Module

You can load the modules by:

module load biocontainers
module load homopolish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run homopolish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=homopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers homopolish

How_are_we_stranded_here

Introduction

How_are_we_stranded_here is a Python package for testing strandedness of RNA-Seq fastq files.

For more information, please check its website: https://biocontainers.pro/tools/how_are_we_stranded_here and its home page on Github.

Versions

  • 1.0.1

Commands

  • check_strandedness

Module

You can load the modules by:

module load biocontainers
module load how_are_we_stranded_here

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run How_are_we_stranded_here on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=how_are_we_stranded_here
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers how_are_we_stranded_here

check_strandedness --gtf Homo_sapiens.GRCh38.105.gtf \
    --transcripts Homo_sapiens.GRCh38.cds.all.fa \
    --reads_1 seq_1.fastq  --reads_2 seq_2.fastq

HTSeq

Introduction

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

For more information, please check its website: https://biocontainers.pro/tools/htseq and its home page on Github.

Versions

  • 0.13.5

  • 1.99.2

  • 2.0.1

  • 2.0.2

  • 2.0.2-py310

Commands

  • htseq-count

  • htseq-count-barcodes

  • htseq-qa

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load htseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run HTSeq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers htseq

python -m HTSeq.scripts.count \
       -f bam input.bam ref.gtf \
       > test.out

Htslib

Introduction

Htslib is a C library for high-throughput sequencing data formats.

For more information, please check its website: https://biocontainers.pro/tools/htslib and its home page on Github.

Versions

  • 1.14

  • 1.15

  • 1.16

  • 1.17

Commands

  • bgzip

  • htsfile

  • tabix

Module

You can load the modules by:

module load biocontainers
module load htslib

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Htslib on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htslib
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers htslib

tabix sorted.gff.gz chr1:10,000,000-20,000,000
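A region query like the one above needs a position-sorted, bgzip-compressed, and indexed file. The sketch below shows the sorting step on a tiny made-up GFF and only runs bgzip and tabix if they are on the PATH; the file names and records are invented for illustration.

```shell
work=$(mktemp -d)
tab=$(printf '\t')

# Tiny made-up GFF with records out of order (tab-separated).
printf 'chr1\tsrc\tgene\t15000000\t15000100\t.\t+\t.\tID=g2\n' >  "$work/in.gff"
printf 'chr1\tsrc\tgene\t500\t900\t.\t+\t.\tID=g1\n'           >> "$work/in.gff"

# Sort by sequence name, then numerically by start coordinate.
sort -t "$tab" -k1,1 -k4,4n "$work/in.gff" > "$work/sorted.gff"
cut -f9 "$work/sorted.gff"

# Compress and index only when the htslib tools are available.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -c "$work/sorted.gff" > "$work/sorted.gff.gz"
    tabix -p gff "$work/sorted.gff.gz"
    tabix "$work/sorted.gff.gz" chr1:10,000,000-20,000,000
fi
```

A real GFF may also carry header lines starting with #, which would need to be kept at the top rather than sorted with the records.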

Htstream

Introduction

Htstream is a quality control and processing pipeline for High Throughput Sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/htstream and its home page on Github.

Versions

  • 1.3.3

Commands

  • hts_AdapterTrimmer

  • hts_CutTrim

  • hts_LengthFilter

  • hts_NTrimmer

  • hts_Overlapper

  • hts_PolyATTrim

  • hts_Primers

  • hts_QWindowTrim

  • hts_SeqScreener

  • hts_Stats

  • hts_SuperDeduper

Module

You can load the modules by:

module load biocontainers
module load htstream

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Htstream on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htstream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers htstream

HUMAnN 3

Introduction

HUMAnN 3.0 is the next iteration of HUMAnN, the HMP Unified Metabolic Analysis Network. HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data.

For more information, please check its website: https://huttenhower.sph.harvard.edu/humann/

Versions

  • 3.0.0

  • 3.6

Commands

  • humann

  • humann3

  • humann3_databases

  • humann_barplot

  • humann_benchmark

  • humann_build_custom_database

  • humann_config

  • humann_databases

  • humann_genefamilies_genus_level

  • humann_infer_taxonomy

  • humann_join_tables

  • humann_reduce_table

  • humann_regroup_table

  • humann_rename_table

  • humann_renorm_table

  • humann_split_stratified_table

  • humann_split_table

  • humann_test

  • humann_unpack_pathways

Database

Full ChocoPhlAn, UniRef90, EC-filtered UniRef90, UniRef50, EC-filtered UniRef50, and utility_mapping databases have been downloaded for users.

Module

You can load the modules by:

module load biocontainers
module load humann/3.0.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run HUMAnN3 on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=humann
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers humann/3.0.0
# Check the database and config by:
humann_config --print

humann --threads 24 --input examples/demo.fastq --output demo_output --metaphlan-options "--bowtie2db /depot/itap/datasets/metaphlan"

Hyphy

Introduction

Hyphy is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning.

For more information, please check its website: https://biocontainers.pro/tools/hyphy and its home page on Github.

Versions

  • 2.5.36

Commands

  • hyphy

Module

You can load the modules by:

module load biocontainers
module load hyphy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Hyphy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hyphy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hyphy

Hypo

Introduction

HyPo, a Hybrid Polisher, utilises short as well as long reads within a single run to polish a long-read assembly of small and large genomes. It exploits unique genomic k-mers to selectively polish segments of contigs using partial order alignment of selected read segments. As demonstrated on human genome assemblies, HyPo generates significantly more accurate polished assemblies in about one-third of the time and with about half the memory of contemporary, widely used polishers such as Racon.

For more information, please check:

Versions

  • 1.0.3

Commands

  • hypo

Module

You can load the modules by:

module load biocontainers
module load hypo

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run hypo on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hypo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers hypo

Idba

Introduction

Idba is a practical iterative De Bruijn Graph De Novo Assembler for sequence assembly in bioinformatics.

For more information, please check its website: https://biocontainers.pro/tools/idba and its home page: https://i.cs.hku.hk/~alse/hkubrg/projects/idba/index.html.

Versions

  • 1.1.3

Commands

  • fa2fq

  • filter_blat

  • filter_contigs

  • filterfa

  • fq2fa

  • idba

  • idba_hybrid

  • idba_tran

  • idba_tran_test

  • idba_ud

  • parallel_blat

  • parallel_rna_blat

  • print_graph

  • raw_n50

  • run-unittest.py

  • sample_reads

  • scaffold

  • scan.py

  • shuffle_reads

  • sim_reads

  • sim_reads_tran

  • sort_psl

  • sort_reads

  • split_fa

  • split_fq

  • split_scaffold

  • test

  • validate_blat

  • validate_blat_parallel

  • validate_component

  • validate_contigs_blat

  • validate_contigs_mummer

  • validate_reads_blat

  • validate_rna

Module

You can load the modules by:

module load biocontainers
module load idba

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Idba on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=idba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers idba

fq2fa --paired --filter SRR1977249.abundtrim.subset.pe.fq SRR1977249.abundtrim.subset.pe.fa
idba_ud  -r SRR1977249.abundtrim.subset.pe.fa -o output

IGV

Introduction

IGV (Integrative Genomics Viewer) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data.

For more information, please check its home page: http://www.broadinstitute.org/software/igv/home.

Versions

  • 2.11.9

  • 2.12.3

Commands

  • igv_hidpi.sh

  • igv.sh

Module

You can load the modules by:

module load biocontainers
module load igv

Interactive job

Since IGV requires a GUI, it is recommended to run it within ThinLinc:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
 salloc: Granted job allocation 12345869
 salloc: Waiting for resource configuration
 salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module --force purge
(base) UserID@bell-a008:~ $ ml biocontainers igv
(base) UserID@bell-a008:~ $ igv.sh

Impute2

Introduction

Impute2 is a genotype imputation and haplotype phasing program.

For more information, please check its website: https://biocontainers.pro/tools/impute2 and its home page: https://mathgen.stats.ox.ac.uk/impute/impute_v2.html#home.

Versions

  • 2.3.2

Commands

  • impute2

Module

You can load the modules by:

module load biocontainers
module load impute2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Impute2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=impute2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers impute2

impute2 \
    -m Example/example.chr22.map \
    -h Example/example.chr22.1kG.haps \
    -l Example/example.chr22.1kG.legend \
    -g Example/example.chr22.study.gens \
    -strand_g Example/example.chr22.study.strand \
    -int 20.4e6 20.5e6 \
    -Ne 20000 \
    -o example.chr22.one.phased.impute2
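
A command with this many input files fails late if one path is mistyped. As a hedged sketch, the job script could verify that its inputs exist before starting the run; require_files is a hypothetical helper name introduced here for illustration and is not part of Impute2 or the module.

```shell
#!/bin/bash
# require_files FILE...: report every missing file and return non-zero if any is missing.
# (require_files is a hypothetical helper, not part of Impute2.)
require_files() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "Missing input file: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Possible use in the job script, before the impute2 command:
# require_files Example/example.chr22.map \
#     Example/example.chr22.1kG.haps \
#     Example/example.chr22.1kG.legend \
#     Example/example.chr22.study.gens \
#     Example/example.chr22.study.strand || exit 1
```

Checking all files in one pass reports every missing path at once instead of stopping at the first one.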

Infernal

Introduction

Infernal (“INFERence of RNA ALignment”) is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence.

For more information, please check its website: https://biocontainers.pro/tools/infernal and its home page: http://eddylab.org/infernal/.

Versions

  • 1.1.4

Commands

  • cmalign

  • cmbuild

  • cmcalibrate

  • cmconvert

  • cmemit

  • cmfetch

  • cmpress

  • cmscan

  • cmsearch

  • cmstat

Module

You can load the modules by:

module load biocontainers
module load infernal

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run infernal on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=infernal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers infernal

Instrain

Introduction

Instrain is a Python program for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous/non-synonymous identification.

For more information, please check its website: https://biocontainers.pro/tools/instrain and its home page on Github.

Versions

  • 1.5.7

  • 1.6.3

Commands

  • inStrain

Module

You can load the modules by:

module load biocontainers
module load instrain

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Instrain on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=instrain
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers instrain

Intarna

Introduction

Intarna is a general and fast approach to the prediction of RNA-RNA interactions incorporating both the accessibility of interacting sites as well as the existence of a user-definable seed interaction.

For more information, please check its website: https://biocontainers.pro/tools/intarna and its home page on Github.

Versions

  • 3.3.1

Commands

  • IntaRNA

Module

You can load the modules by:

module load biocontainers
module load intarna

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Intarna on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=intarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers intarna

IntaRNA -t CCCCCCCCGGGGGGGGGGGGGG -q AAAACCCCCCCUUUU

InterProScan

Introduction

InterPro is a database which integrates together predictive information about proteins’ function from a number of partner resources, giving an overview of the families that a protein belongs to and the domains and sites it contains.

Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the software package InterProScan to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against all of the required member database’s signatures and the results are then output in a variety of formats.

Versions

  • 5.54_87.0

  • 5.61-93.0

Commands

  • interproscan.sh

Database

The latest version of the database has been downloaded and set up in /depot/itap/datasets/interproscan-5.54-87.0/data.

Module

You can load the modules by:

module load biocontainers
module load interproscan/5.54_87.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run InterProScan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=interproscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers interproscan/5.54_87.0

interproscan.sh -cpu 24 -i test_proteins.fasta
interproscan.sh -cpu 24 -t n -i test_nt_seqs.fasta

IQ-TREE

Introduction

IQ-TREE is efficient software for phylogenomic inference by maximum likelihood.

For more information, please check its website: https://biocontainers.pro/tools/iqtree and its home page: http://www.iqtree.org.

Versions

  • 1.6.12

  • 2.1.2

  • 2.2.0_beta

  • 2.2.2.2

Commands

  • iqtree

Module

You can load the modules by:

module load biocontainers
module load iqtree

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run IQ-TREE on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers iqtree

iqtree -s input.phy -m GTR+I+G > test.out

Iqtree2

Introduction

IQ-TREE is efficient software for phylogenomic inference by maximum likelihood.

For more information, please check:

Versions

  • 2.2.2.6

Commands

  • iqtree2

Module

You can load the modules by:

module load biocontainers
module load iqtree2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run iqtree2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers iqtree2

Ismapper

Introduction

ISMapper searches for IS positions in sequence data using paired end Illumina short reads, an IS query/queries of interest and a reference genome. ISMapper reports the IS positions it has found in each isolate, relative to the provided reference genome.

For more information, please check:

Versions

  • 2.0.2

Commands

  • ismap

Module

You can load the modules by:

module load biocontainers
module load ismapper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ismapper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ismapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ismapper

Isoquant

Introduction

IsoQuant is a tool for the genome-based analysis of long RNA reads, such as PacBio or Oxford Nanopore reads. IsoQuant reconstructs and quantifies transcript models with high precision and decent recall. If the reference annotation is given, IsoQuant also assigns reads to the annotated isoforms based on their intron and exon structure. IsoQuant further performs annotated gene, isoform, exon and intron quantification. If reads are grouped (e.g. according to cell type), counts are reported according to the provided grouping.

For more information, please check:

Versions

  • 3.1.2

Commands

  • isoquant.py

Module

You can load the modules by:

module load biocontainers
module load isoquant

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run isoquant on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers isoquant

isoquant.py --reference chr9.4M.fa.gz \
    --genedb chr9.4M.gtf.gz \
    --fastq  chr9.4M.ont.sim.fq.gz \
    --data_type nanopore -o test_ont

Isoseq3

Introduction

Isoseq3 - Scalable De Novo Isoform Discovery.

For more information, please check its website: https://biocontainers.pro/tools/isoseq3 and its home page on Github.

Versions

  • 3.4.0

  • 3.7.0

  • 3.8.2

Commands

  • isoseq3

Module

You can load the modules by:

module load biocontainers
module load isoseq3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Isoseq3 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoseq3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers isoseq3

isoseq3 --version

isoseq3 refine --require-polya \
    alz.demult.5p--3p.bam \
    primers.fasta alz.flnc.bam

isoseq3 cluster alz.flnc.bam \
    alz.polished.bam --verbose --use-qvs
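
In the script above, the cluster step reads the alz.flnc.bam file written by the refine step, so it is only meaningful when refine succeeded. One possible way to stop at the first failing step is sketched below; run_step is a hypothetical wrapper name and is not part of isoseq3 or Slurm.

```shell
#!/bin/bash
# run_step LABEL COMMAND...: run one pipeline step and return non-zero if it fails.
# (run_step is a hypothetical wrapper, not part of isoseq3.)
run_step() {
    local label="$1"
    shift
    echo "starting: $label"
    if ! "$@"; then
        echo "step failed: $label" >&2
        return 1
    fi
    echo "finished: $label"
}

# Possible use in the job script; the second step runs only if the first succeeds:
# run_step refine isoseq3 refine --require-polya \
#     alz.demult.5p--3p.bam primers.fasta alz.flnc.bam &&
# run_step cluster isoseq3 cluster alz.flnc.bam \
#     alz.polished.bam --verbose --use-qvs
```

The labelled messages end up in the job's .out and .err files, which makes it easier to see which step a failed job reached.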

Ivar

Introduction

Ivar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.

For more information, please check:

Versions

  • 1.3.1

  • 1.4.2

Commands

  • ivar

Module

You can load the modules by:

module load biocontainers
module load ivar

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ivar on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ivar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ivar

Jcvi

Introduction

Jcvi is a collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.

For more information, please check:

Versions

  • 1.2.7

  • 1.3.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load jcvi

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run jcvi on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=jcvi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers jcvi

python -m jcvi.formats.fasta format Vvinifera_145_Genoscope.12X.cds.fa.gz grape.cds
python -m jcvi.formats.fasta format Ppersica_298_v2.1.cds.fa.gz peach.cds
python -m jcvi.formats.gff bed --type=mRNA --key=Name --primary_only Vvinifera_145_Genoscope.12X.gene.gff3.gz -o grape.bed
python -m jcvi.compara.catalog ortholog grape peach --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
rm grape.peach.last.filtered
python -m jcvi.compara.catalog ortholog grape peach --cscore=.99 --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
python -m jcvi.compara.synteny depth --histogram grape.peach.anchors
python -m jcvi.graphics.grabseeds seeds test-data/test.JPG

Kaiju

Introduction

Kaiju is a tool for fast taxonomic classification of metagenomic sequencing reads using a protein reference database.

For more information, please check its website: https://biocontainers.pro/tools/kaiju and its home page on Github.

Versions

  • 1.8.2

Commands

  • kaiju

  • kaiju-addTaxonNames

  • kaiju-convertMAR.py

  • kaiju-convertNR

  • kaiju-excluded-accessions.txt

  • kaiju-gbk2faa.pl

  • kaiju-makedb

  • kaiju-mergeOutputs

  • kaiju-mkbwt

  • kaiju-mkfmi

  • kaiju-multi

  • kaiju-taxonlistEuk.tsv

  • kaiju2krona

  • kaiju2table

  • kaijup

  • kaijux

Module

You can load the modules by:

module load biocontainers
module load kaiju

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Kaiju on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kaiju
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kaiju

kaiju -t kaijudb/nodes.dmp \
    -f kaijudb/refseq/kaiju_db_refseq.fmi \
    -i input_1.fastq -j input_2.fastq \
    -z 24
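
The thread count passed with -z above is typed separately from the task count requested with #SBATCH -n, so the two can drift apart when one is edited. A minimal sketch of deriving it from the job environment follows. It assumes that Slurm exports SLURM_NTASKS inside the job when -n is given; thread_count is a hypothetical helper name.

```shell
#!/bin/bash
# thread_count: print the number of tasks Slurm granted (-n), or 1 outside a job.
# (thread_count is a hypothetical helper; SLURM_NTASKS is set by Slurm when -n is used.)
thread_count() {
    echo "${SLURM_NTASKS:-1}"
}

# Assemble the command as text first so it can be inspected in the job log.
kaiju_cmd="kaiju -t kaijudb/nodes.dmp -f kaijudb/refseq/kaiju_db_refseq.fmi -i input_1.fastq -j input_2.fastq -z $(thread_count)"
echo "$kaiju_cmd"
```

The same pattern would apply to the other multi-threaded examples in this guide that repeat the value of -n on the command line.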

Kakscalculator2

Introduction

kakscalculator2 is a toolkit that incorporates gamma series methods and sliding window strategies.

For more information, please check:

Versions

  • 2.0.1

Commands

  • KaKs_Calculator

Module

You can load the modules by:

module load biocontainers
module load kakscalculator2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kakscalculator2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kakscalculator2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kakscalculator2

KaKs_Calculator -i example.axt -o example.axt.kaks -m YN

Kallisto

Introduction

Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.

Detailed usage can be found here: https://github.com/pachterlab/kallisto

Versions

  • 0.46.2

  • 0.48.0

Commands

  • kallisto

Module

You can load the modules by:

module load biocontainers
module load kallisto/0.48.0

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kallisto on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kallisto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kallisto/0.48.0

kallisto index -i transcripts.idx Homo_sapiens.GRCh38.cds.all.fa.gz
kallisto quant -t 24 -i transcripts.idx -o output -b 100  SRR11614709_1.fastq  SRR11614709_2.fastq

Kentutils

Introduction

Kentutils: UCSC command line bioinformatic utilities.

For more information, please check:

Versions

  • 302.1.0

Commands

  • addCols

  • ameme

  • autoDtd

  • autoSql

  • autoXml

  • ave

  • aveCols

  • axtChain

  • axtSort

  • axtSwap

  • axtToMaf

  • axtToPsl

  • bedClip

  • bedCommonRegions

  • bedCoverage

  • bedExtendRanges

  • bedGeneParts

  • bedGraphPack

  • bedGraphToBigWig

  • bedIntersect

  • bedItemOverlapCount

  • bedPileUps

  • bedRemoveOverlap

  • bedRestrictToPositions

  • bedSort

  • bedToBigBed

  • bedToExons

  • bedToGenePred

  • bedToPsl

  • bedWeedOverlapping

  • bigBedInfo

  • bigBedNamedItems

  • bigBedSummary

  • bigBedToBed

  • bigWigAverageOverBed

  • bigWigCat

  • bigWigCorrelate

  • bigWigInfo

  • bigWigMerge

  • bigWigSummary

  • bigWigToBedGraph

  • bigWigToWig

  • blastToPsl

  • blastXmlToPsl

  • calc

  • catDir

  • catUncomment

  • chainAntiRepeat

  • chainFilter

  • chainMergeSort

  • chainNet

  • chainPreNet

  • chainSort

  • chainSplit

  • chainStitchId

  • chainSwap

  • chainToAxt

  • chainToPsl

  • checkAgpAndFa

  • checkCoverageGaps

  • checkHgFindSpec

  • checkTableCoords

  • chopFaLines

  • chromGraphFromBin

  • chromGraphToBin

  • colTransform

  • countChars

  • crTreeIndexBed

  • crTreeSearchBed

  • dbSnoop

  • dbTrash

  • estOrient

  • faCmp

  • faCount

  • faFilter

  • faFilterN

  • faFrag

  • faNoise

  • faOneRecord

  • faPolyASizes

  • faRandomize

  • faRc

  • faSize

  • faSomeRecords

  • faSplit

  • faToFastq

  • faToTab

  • faToTwoBit

  • faTrans

  • fastqToFa

  • featureBits

  • fetchChromSizes

  • findMotif

  • gapToLift

  • genePredCheck

  • genePredHisto

  • genePredSingleCover

  • genePredToBed

  • genePredToFakePsl

  • genePredToGtf

  • genePredToMafFrames

  • gfClient

  • gfServer

  • gff3ToGenePred

  • gff3ToPsl

  • gmtime

  • gtfToGenePred

  • headRest

  • hgFindSpec

  • hgGcPercent

  • hgLoadBed

  • hgLoadOut

  • hgLoadWiggle

  • hgTrackDb

  • hgWiggle

  • hgsql

  • hgsqldump

  • htmlCheck

  • hubCheck

  • ixIxx

  • lavToAxt

  • lavToPsl

  • ldHgGene

  • liftOver

  • liftOverMerge

  • liftUp

  • linesToRa

  • linux.x86_64

  • localtime

  • mafAddIRows

  • mafAddQRows

  • mafCoverage

  • mafFetch

  • mafFilter

  • mafFrag

  • mafFrags

  • mafGene

  • mafMeFirst

  • mafOrder

  • mafRanges

  • mafSpeciesList

  • mafSpeciesSubset

  • mafSplit

  • mafSplitPos

  • mafToAxt

  • mafToPsl

  • mafsInRegion

  • makeTableList

  • maskOutFa

  • mktime

  • mrnaToGene

  • netChainSubset

  • netClass

  • netFilter

  • netSplit

  • netSyntenic

  • netToAxt

  • netToBed

  • newProg

  • nibFrag

  • nibSize

  • oligoMatch

  • overlapSelect

  • paraFetch

  • paraSync

  • positionalTblCheck

  • pslCDnaFilter

  • pslCat

  • pslCheck

  • pslDropOverlap

  • pslFilter

  • pslHisto

  • pslLiftSubrangeBlat

  • pslMap

  • pslMrnaCover

  • pslPairs

  • pslPartition

  • pslPretty

  • pslRecalcMatch

  • pslReps

  • pslSelect

  • pslSort

  • pslStats

  • pslSwap

  • pslToBed

  • pslToChain

  • pslToPslx

  • pslxToFa

  • qaToQac

  • qacAgpLift

  • qacToQa

  • qacToWig

  • raSqlQuery

  • raToLines

  • raToTab

  • randomLines

  • rmFaDups

  • rowsToCols

  • sizeof

  • spacedToTab

  • splitFile

  • splitFileByColumn

  • sqlToXml

  • stringify

  • subChar

  • subColumn

  • tailLines

  • tdbQuery

  • textHistogram

  • tickToDate

  • toLower

  • toUpper

  • trfBig

  • twoBitDup

  • twoBitInfo

  • twoBitMask

  • twoBitToFa

  • validateFiles

  • validateManifest

  • wigCorrelate

  • wigEncode

  • wigToBigWig

  • wordLine

  • xmlCat

  • xmlToSql

Module

You can load the modules by:

module load biocontainers
module load kentutils

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kentutils on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kentutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kentutils

Khmer

Introduction

Khmer is a tool for k-mer counting, filtering, and graph traversal FTW!

For more information, please check its website: https://biocontainers.pro/tools/khmer and its home page on Github.

Versions

  • 3.0.0a3

Commands

  • abundance-dist.py

  • abundance-dist-single.py

  • annotate-partitions.py

  • count-median.py

  • cygdb

  • cython

  • cythonize

  • do-partition.py

  • extract-long-sequences.py

  • extract-paired-reads.py

  • extract-partitions.py

  • fastq-to-fasta.py

  • filter-abund.py

  • filter-abund-single.py

  • filter-stoptags.py

  • find-knots.py

  • interleave-reads.py

  • load-graph.py

  • load-into-counting.py

  • make-initial-stoptags.py

  • merge-partitions.py

  • normalize-by-median.py

  • partition-graph.py

  • readstats.py

  • sample-reads-randomly.py

  • screed

  • split-paired-reads.py

  • trim-low-abund.py

  • unique-kmers.py

Module

You can load the modules by:

module load biocontainers
module load khmer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Khmer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=khmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers khmer

Kissde

Introduction

kissDE is an R package, similar to DESeq, that works on pairs of variants and tests whether a variant is enriched in one condition. It was developed to work easily with KisSplice output, but it can also work with a simple table of counts obtained by any other means. It requires at least two replicates per condition and at least two conditions.

For more information, please check:

Versions

  • 1.15.3

Commands

  • R

  • Rscript

  • kissDE.R

Module

You can load the modules by:

module load biocontainers
module load kissde

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kissde on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissde
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kissde

Kissplice

Introduction

KisSplice is software for analysing RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that identifies SNPs, indels and alternative splicing events. It can deal with an arbitrary number of biological conditions, and will quantify each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5Gb for 100M reads.

For more information, please check:

Versions

  • 2.6.2

Commands

  • kissplice

Module

You can load the modules by:

module load biocontainers
module load kissplice

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kissplice on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kissplice

Kissplice2refgenome

Introduction

KisSplice can also be used when a reference (annotated) genome is available, in order to annotate the variants found and help prioritize cases to validate experimentally. In this case, the results of KisSplice are mapped to the reference genome, using for instance STAR, and the mapping results are analysed using KisSplice2RefGenome.

For more information, please check:

Versions

  • 2.0.8

Commands

  • kissplice2refgenome

Module

You can load the modules by:

module load biocontainers
module load kissplice2refgenome

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kissplice2refgenome on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice2refgenome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kissplice2refgenome

Kma

Introduction

KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend.

For more information, please check:

Versions

  • 1.4.3

Commands

  • kma

  • kma_index

  • kma_shm

  • kma_update

Module

You can load the modules by:

module load biocontainers
module load kma

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kma on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kma

Kmc

Introduction

Kmc is a tool for efficient k-mer counting and filtering of reads based on k-mer content.

For more information, please check its website: https://biocontainers.pro/tools/kmc and its home page on Github.

Versions

  • 3.2.1

Commands

  • kmc

  • kmc_dump

  • kmc_tools

Module

You can load the modules by:

module load biocontainers
module load kmc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Kmc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kmc

kmc -k27 seq.fastq 27mers .

Kmergenie

Introduction

KmerGenie estimates the best k-mer length for genome de novo assembly.

For more information, please check:

Versions

  • 1.7051

Commands

  • kmergenie

Module

You can load the modules by:

module load biocontainers
module load kmergenie

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kmergenie on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmergenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kmergenie

Jellyfish

Introduction

Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence.

For more information, please check its website: https://biocontainers.pro/tools/kmer-jellyfish and its home page: http://www.genome.umd.edu/jellyfish.html.
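
To make the definition of a k-mer above concrete, the toy sketch below counts every substring of length k in a single short sequence using awk. It only illustrates the idea and is not how Jellyfish works internally: it reads one sequence given as an argument, does not parse FASTA or FASTQ, and does not merge reverse complements. The function name count_kmers is hypothetical.

```shell
#!/bin/bash
# count_kmers K SEQUENCE: print "kmer count" pairs for SEQUENCE, sorted by k-mer.
# (count_kmers is a hypothetical teaching helper, unrelated to the jellyfish binary.)
count_kmers() {
    local k="$1"
    local sequence="$2"
    printf '%s\n' "$sequence" | awk -v k="$k" '
        {
            # Slide a window of width k across the sequence.
            for (i = 1; i + k - 1 <= length($0); i++) {
                counts[substr($0, i, k)]++
            }
        }
        END {
            for (kmer in counts) print kmer, counts[kmer]
        }' | sort
}

count_kmers 3 ACGACGT
```

For the sequence ACGACGT and k = 3, the windows are ACG, CGA, GAC, ACG, and CGT, so ACG is counted twice and the other three k-mers once each.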

Versions

  • 2.3.0

Commands

  • jellyfish

Module

You can load the modules by:

module load biocontainers
module load kmer-jellyfish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Jellyfish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=kmer-jellyfish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kmer-jellyfish

jellyfish count -m 16 -s 100M -t 12 \
     -o mer_counts -c 7  input.fastq

KneadData

Introduction

KneadData is a tool designed to perform quality control on metagenomic and metatranscriptomic sequencing data, especially data from microbiome experiments. In these experiments, samples are typically taken from a host in hopes of learning something about the microbial community on the host.

Detailed usage can be found here: https://huttenhower.sph.harvard.edu/kneaddata/

Versions

  • 0.10.0

Commands

  • kneaddata

  • kneaddata_bowtie2_discordant_pairs

  • kneaddata_build_database

  • kneaddata_database

  • kneaddata_read_count_table

  • kneaddata_test

  • kneaddata_trf_parallel

Module

You can load the modules by:

module load biocontainers
module load kneaddata

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kneaddata on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kneaddata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kneaddata

kneaddata --input examples/demo.fastq --reference-db examples/demo_db --output kneaddata_demo_output --threads 24 --processes 24

Kover

Introduction

Kover is an out-of-core implementation of rule-based machine learning algorithms that has been tailored for genomic biomarker discovery.

For more information, please check:

Versions

  • 2.0.6

Commands

  • kover

Module

You can load the modules by:

module load biocontainers
module load kover

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kover on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kover
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kover

Kraken2

Introduction

Kraken2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer.

Detailed usage can be found here: https://ccb.jhu.edu/software/kraken2/
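The lowest common ancestor idea can be illustrated on a toy taxonomy. The sketch below is not Kraken2 code: the taxon names and the lineage and lca helpers are hypothetical, and it handles only two taxa at a time (for more genomes the pairwise result can be folded over the list). It needs bash 4 or newer for associative arrays.

```shell
#!/bin/bash
# Toy taxonomy (hypothetical names): each taxon maps to its parent.
declare -A parent=(
    [Escherichia_coli]=Escherichia
    [Escherichia]=Enterobacteriaceae
    [Salmonella_enterica]=Salmonella
    [Salmonella]=Enterobacteriaceae
    [Enterobacteriaceae]=root
)

# Print the path from a taxon up to the root, one taxon per line.
lineage() {
    local node=$1
    while [[ -n $node ]]; do
        printf '%s\n' "$node"
        node=${parent[$node]:-}
    done
}

# Lowest common ancestor of two taxa: the first taxon on the path of $1
# that also lies on the path of $2.
lca() {
    local a b_path
    b_path=$(lineage "$2")
    while IFS= read -r a; do
        if grep -qxF -- "$a" <<< "$b_path"; then
            printf '%s\n' "$a"
            return 0
        fi
    done < <(lineage "$1")
    return 1
}

lca Escherichia_coli Salmonella_enterica   # prints Enterobacteriaceae
lca Escherichia_coli Escherichia           # prints Escherichia
```

A k-mer found in both species genomes would therefore be assigned to the family, while a k-mer unique to one genome keeps the species-level label.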

Versions

  • 2.1.2_fixftp

  • 2.1.2

  • 2.1.3

Commands

  • kraken2

  • kraken2-build

  • kraken2-inspect

Module

You can load the modules by:

module load biocontainers
module load kraken2/2.1.2

Download database

Note

There is a known bug in rsync_from_ncbi.pl (https://github.com/DerrickWood/kraken2/issues/292). When users download and build databases with kraken2-build --download-library, the command fails with the error rsync_from_ncbi.pl: unexpected FTP path(new server?). We modified rsync_from_ncbi.pl to fix the bug and created a new module with the suffix _fixftp. Please use this corrected module to download the library.

To download databases, please use the below command:

module load biocontainers
module load kraken2/2.1.2_fixftp

kraken2-build --download-library archaea --db archaea

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run kraken2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kraken2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers kraken2/2.1.2

kraken2 --threads 24 --report kraken2.report --db minikraken2_v2_8GB_201904_UPDATE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
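With --paired, Kraken2 replaces the # in the --classified-out file name with _1 and _2, one file per mate. The sketch below only mimics that naming rule so the expected output files can be listed in advance; expand_mates is a hypothetical helper and does not call Kraken2.

```shell
#!/bin/bash
# expand_mates is a hypothetical helper: it prints the two file names that
# a "#" pattern stands for in paired mode.
expand_mates() {
    local pattern=$1 mate
    for mate in 1 2; do
        # Replace the first "#" with "_1" or "_2".
        sed "s/#/_${mate}/" <<< "$pattern"
    done
}

expand_mates 'cseqs#.fq'
# cseqs_1.fq
# cseqs_2.fq
```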

KrakenTools

Introduction

KrakenTools provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.

Detailed usage can be found here: https://github.com/jenniferlu717/KrakenTools

Versions

  • 1.2

Commands

  • alpha_diversity.py

  • beta_diversity.py

  • combine_kreports.py

  • combine_mpa.py

  • extract_kraken_reads.py

  • filter_bracken.out.py

  • fix_unmapped.py

  • kreport2krona.py

  • kreport2mpa.py

  • make_kreport.py

  • make_ktaxonomy.py

Module

You can load the modules by:

module load biocontainers
module load krakentools/1.2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run krakentools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=krakentools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers krakentools/1.2

extract_kraken_reads.py -k myfile.kraken -t 2 -s1 SRR5043021_1.fastq -s2 SRR5043021_2.fastq -o extracted1.fq -o2 extracted2.fq

Lambda

Introduction

Lambda is a local aligner optimized for many query sequences and searches in protein space.

For more information, please check its website: https://biocontainers.pro/tools/lambda and its home page: http://seqan.github.io/lambda/.

Versions

  • 2.0.0

Commands

  • lambda2

Module

You can load the modules by:

module load biocontainers
module load lambda

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Lambda on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lambda
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lambda

lambda2 mkindexp -d uniprot_sprot.fasta

lambda2 searchp \
    -q proteins.fasta \
    -i uniprot_sprot.fasta.lambda

Last

Introduction

Last is used to find & align related regions of sequences.

For more information, please check its website: https://biocontainers.pro/tools/last and its home page on Gitlab.

Versions

  • 1268

  • 1356

  • 1411

  • 1418

Commands

  • last-dotplot

  • last-map-probs

  • last-merge-batches

  • last-pair-probs

  • last-postmask

  • last-split

  • last-split5

  • last-train

  • lastal

  • lastal5

  • lastdb

  • lastdb5

Module

You can load the modules by:

module load biocontainers
module load last

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Last on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=last
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers last

lastdb humdb humanMito.fa
lastal humdb fuguMito.fa > myalns.maf

Lastz

Introduction

LASTZ is a pairwise DNA sequence aligner.

For more information, please check:

Versions

  • 1.04.15

Commands

  • lastz

  • lastz_32

  • lastz_D

Module

You can load the modules by:

module load biocontainers
module load lastz

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run lastz on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lastz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lastz

lastz cmc_CFBP8216.fasta cmp_LPPA982.fasta \
     --notransition --step=20 --nogapped \
     --format=maf > cmc_vs_cmp.maf

Ldhat

Introduction

LDhat is a package written in the C and C++ languages for the analysis of recombination rates from population genetic data.

For more information, please check:

Versions

  • 2.2a

Commands

  • convert

  • pairwise

  • interval

  • rhomap

  • fin

Module

You can load the modules by:

module load biocontainers
module load ldhat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ldhat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldhat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ldhat

Ldjump

Introduction

LDJump is an R package to estimate variable recombination rates from population genetic data.

For more information, please check:

Versions

  • 0.3.1

Commands

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load ldjump

Note

A full path to the Phi file of PhiPack needs to be provided as follows pathPhi = "/opt/PhiPack/Phi". In order to use LDhat to quickly calculate some of the summary statistics, please set pathLDhat = "/opt/LDhat/".

Interactive job

To run interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers ldjump
(base) UserID@bell-a008:~ $ R

R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> library(LDJump)
> LDJump(seqFullPath, alpha = 0.05, segLength = 1000, pathLDhat = "/opt/LDhat/", pathPhi = "/opt/PhiPack/Phi", format = "fasta", refName = NULL,
   start = NULL, constant = F, status = T, cores = 1, accept = F, demography = F, out = "")

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ldjump on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldjump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ldjump
Rscript script.R
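The batch job calls script.R without showing its contents. As a rough sketch, the file could be generated as follows; it reuses the LDJump call and container paths from the interactive session above, and the FASTA path is a placeholder that must be replaced.

```shell
#!/bin/bash
# Write a minimal script.R for the batch job.
# The FASTA path below is a placeholder (assumption); use your own file.
cat > script.R <<'EOF'
library(LDJump)

# Hypothetical input: full path to the sequence file.
seqFullPath <- "/path/to/sequences.fasta"

LDJump(seqFullPath, alpha = 0.05, segLength = 1000,
       pathLDhat = "/opt/LDhat/", pathPhi = "/opt/PhiPack/Phi",
       format = "fasta", refName = NULL, start = NULL,
       constant = FALSE, status = TRUE, cores = 1,
       accept = FALSE, demography = FALSE, out = "")
EOF
```

The quoted 'EOF' delimiter keeps the shell from expanding anything inside the R code.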

Ldsc

Introduction

ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics.

For more information, please check:

Versions

  • 1.0.1

Commands

  • ldsc.py

  • munge_sumstats.py

Module

You can load the modules by:

module load biocontainers
module load ldsc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ldsc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldsc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ldsc

Liftoff

Introduction

Liftoff is an accurate GFF3/GTF lift over pipeline.

For more information, please check its website: https://biocontainers.pro/tools/liftoff and its home page on Github.

Versions

  • 1.6.3

Commands

  • liftoff

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load liftoff

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Liftoff on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftoff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers liftoff

liftoff -g reference.gff3 -o target.gff3 \
    -chroms chr_pairs.txt target.fasta reference.fa

Liftofftools

Introduction

LiftoffTools is a toolkit to compare genes lifted between genome assemblies. Specifically it is designed to compare genes lifted over using Liftoff although it is also compatible with other lift-over tools such as UCSC liftOver as long as the feature IDs are the same. LiftoffTools provides 3 different modules. The first identifies variants in protein-coding genes and their effects on the gene. The second compares the gene synteny, and the third clusters genes into groups of paralogs to evaluate gene copy number gain and loss. The input for all modules is the reference genome assembly (FASTA), target genome assembly (FASTA), reference annotation (GFF/GTF), and target annotation (GFF/GTF).

For more information, please check:

Versions

  • 0.4.4

Commands

  • liftofftools

Module

You can load the modules by:

module load biocontainers
module load liftofftools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run liftofftools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftofftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers liftofftools

Lima

Introduction

Lima is the standard tool to identify barcode and primer sequences in PacBio single-molecule sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/lima and its home page: https://lima.how.

Versions

  • 2.2.0

Commands

  • lima

Module

You can load the modules by:

module load biocontainers
module load lima

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Lima on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lima
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lima

lima --version
lima --isoseq --dump-clips \
    --peek-guess -j 12 \
    alz.ccs.bam primers.fasta \
    alz.demult.bam

Lofreq

Introduction

Lofreq is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/lofreq and its home page on Github.

Versions

  • 2.1.5

Commands

  • lofreq

Module

You can load the modules by:

module load biocontainers
module load lofreq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Lofreq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lofreq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lofreq

lofreq  call -f ref.fa -o vars.vcf out_sorted.bam

lofreq call-parallel --pp-threads 8 \
     -f ref.fa -o vars_parallel.vcf out_sorted.bam

Longphase

Introduction

LongPhase is an ultra-fast program for simultaneously co-phasing SNPs and SVs by using Nanopore and PacBio long reads. It is capable of producing nearly chromosome-scale haplotype blocks by using Nanopore ultra-long reads without the need for additional trios, chromosome conformation, and strand-seq data. On an 8-core machine, LongPhase can finish phasing a human genome in 10-20 minutes.

For more information, please check:

Versions

  • 1.4

Commands

  • longphase

Module

You can load the modules by:

module load biocontainers
module load longphase

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run longphase on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=longphase
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers longphase

longphase phase \
    -s SNP.vcf \
    --sv-file SV.vcf \
    -b alignment.bam \
    -r reference.fasta \
    -t 8 \
    -o phased_prefix \
    --ont # or --pb for PacBio Hifi

Longqc

Introduction

LongQC is a tool for the data quality control of the PacBio and ONT long reads.

For more information, please check:

Versions

  • 1.2.0c

Commands

  • longQC.py

Module

You can load the modules by:

module load biocontainers
module load longqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run longqc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=longqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers longqc

longQC.py sampleqc -x pb-rs2 -o out_dir seq.fastq

Lra

Introduction

Lra is a sequence alignment program that aligns long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies.

For more information, please check its website: https://biocontainers.pro/tools/lra and its home page on Github.

Versions

  • 1.3.2

Commands

  • lra

Module

You can load the modules by:

module load biocontainers
module load lra

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Lra on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lra
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lra

lra index genome.fasta

lra align genome.fasta input.fastq -t 12 -p s > output.sam

Ltr_finder

Introduction

LTR_Finder is an efficient program for finding full-length LTR retrotransposons in genome sequences.

For more information, please check:

Versions

  • 1.07

Commands

  • ltr_finder

  • check_result.pl

  • down_tRNA.pl

  • filter_rt.pl

  • genome_plot.pl

  • genome_plot2.pl

  • genome_plot_svg.pl

Module

You can load the modules by:

module load biocontainers
module load ltr_finder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ltr_finder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltr_finder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ltr_finder

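# The output directory must exist before results are written into it.
mkdir -p test

# Save the ltr_finder report with tee and pass the same output on to
# genome_plot.pl (a plain ">" redirect would leave the pipe empty).
ltr_finder 3ds_72.fa -P 3ds_72 -w2 \
    | tee test/3ds_72_result.txt \
    | genome_plot.pl test/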

Ltrpred

Introduction

LTRpred(ict): de novo annotation of young and intact retrotransposons.

For more information, please check:

Versions

  • 1.1.0

Commands

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load ltrpred

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ltrpred on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltrpred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ltrpred

Lumpy-sv

Introduction

Lumpy-sv is a general probabilistic framework for structural variant discovery.

For more information, please check its website: https://biocontainers.pro/tools/lumpy-sv and its home page on Github.

Versions

  • 0.3.1

Commands

  • lumpy

  • lumpyexpress

Module

You can load the modules by:

module load biocontainers
module load lumpy-sv

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Lumpy-sv on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lumpy-sv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lumpy-sv

lumpy -mw 4 -tt 0.0 -pe \
bam_file:AL87.discordant.sort.bam,histo_file:AL87.histo,mean:429,stdev:84,read_length:83,min_non_overlap:83,discordant_z:4,back_distance:1,weight:1,id:1,min_mapping_threshold:20 \
-sr bam_file:AL87.sr.sort.bam,back_distance:1,weight:1,id:2,min_mapping_threshold:20

Lyveset

Introduction

Lyveset is a method of using hqSNPs to create a phylogeny, especially for outbreak investigations.

For more information, please check:

Versions

  • 2.0.1

Commands

  • applyFstToTree.pl

  • cladeDistancesFromTree.pl

  • clusterPairwise.pl

  • convertAlignment.pl

  • downloadDataset.pl

  • errorProneRegions.pl

  • filterMatrix.pl

  • filterVcf.pl

  • genomeDist.pl

  • launch_bwa.pl

  • launch_set.pl

  • launch_smalt.pl

  • launch_snap.pl

  • launch_snpeff.pl

  • launch_varscan.pl

  • makeRegions.pl

  • matrixToAlignment.pl

  • pairwiseDistances.pl

  • pairwiseTo2d.pl

  • removeUninformativeSites.pl

  • removeUninformativeSitesFromMatrix.pl

  • run_assembly_isFastqPE.pl

  • run_assembly_metrics.pl

  • run_assembly_readMetrics.pl

  • run_assembly_removeDuplicateReads.pl

  • run_assembly_shuffleReads.pl

  • run_assembly_trimClean.pl

  • set_bayesHammer.pl

  • set_diagnose.pl

  • set_diagnose_msa.pl

  • set_downloadTestData.pl

  • set_findCliffs.pl

  • set_findPhages.pl

  • set_indexCase.pl

  • set_manage.pl

  • set_processPooledVcf.pl

  • set_samtools_depth.pl

  • set_test.pl

  • shuffleSplitReads.pl

  • snpDistribution.pl

  • vcfToAlignment.pl

  • vcfutils.pl

Module

You can load the modules by:

module load biocontainers
module load lyveset

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run lyveset on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lyveset
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers lyveset

set_test.pl lambda
set_manage.pl --create setTest

Macrel

Introduction

Macrel is a pipeline to mine antimicrobial peptides (AMPs) from (meta)genomes.

For more information, please check:

Versions

  • 1.2.0

Commands

  • macrel

Module

You can load the modules by:

module load biocontainers
module load macrel

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run macrel on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macrel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers macrel

MACS2

Introduction

MACS2 is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.

For more information, please check its website: https://biocontainers.pro/tools/macs2 and its home page on Github.

Versions

  • 2.2.7.1

Commands

  • macs2

Module

You can load the modules by:

module load biocontainers
module load macs2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MACS2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers macs2

macs2 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01

Macs3

Introduction

MACS3 is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.

For more information, please check its Docker hub: https://hub.docker.com/r/lbmc/macs3/3.0.0a6 and its home page on Github.

Versions

  • 3.0.0a6

Commands

  • macs3

Module

You can load the modules by:

module load biocontainers
module load macs3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Macs3 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers macs3

macs3 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01

MAFFT

Introduction

MAFFT is a multiple alignment program for amino acid or nucleotide sequences.

For more information, please check its website: https://biocontainers.pro/tools/mafft and its home page: https://mafft.cbrc.jp/alignment/software/.

Versions

  • 7.475

  • 7.490

Commands

  • einsi

  • fftns

  • fftnsi

  • ginsi

  • linsi

  • mafft

  • mafft-distance

  • mafft-einsi

  • mafft-fftns

  • mafft-fftnsi

  • mafft-ginsi

  • mafft-homologs.rb

  • mafft-linsi

  • mafft-nwns

  • mafft-nwnsi

  • mafft-profile

  • mafft-qinsi

  • mafft-sparsecore.rb

  • mafft-xinsi

  • nwns

  • nwnsi

Module

You can load the modules by:

module load biocontainers
module load mafft

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MAFFT on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mafft
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mafft

Mageck

Introduction

Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from the recent genome-scale CRISPR-Cas9 knockout screens (or GeCKO) technology.

For more information, please check:

Versions

  • 0.5.9.5

Commands

  • mageck

  • mageckGSEA

  • RRA

Module

You can load the modules by:

module load biocontainers
module load mageck

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mageck on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mageck
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mageck


mageck count -l library.txt -n demo \
     --sample-label L1,CTRL \
     --fastq test1.fastq test2.fastq

mageck test -k demo.count.txt \
     -t L1 -c CTRL -n demo

Magicblast

Introduction

Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.

For more information, please check:

Versions

  • 1.5.0

Commands

  • magicblast

Module

You can load the modules by:

module load biocontainers
module load magicblast

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run magicblast on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=magicblast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers magicblast

MAKER

Introduction

MAKER is a popular genome annotation pipeline for both prokaryotic and eukaryotic genomes. This guide describes best practices for running MAKER on RCAC clusters. For detailed information about MAKER, see its official website (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018).

Versions

  • 2.31.11

  • 3.01.03

Commands

  • cegma2zff

  • chado2gff3

  • compare

  • cufflinks2gff3

  • evaluator

  • fasta_merge

  • fasta_tool

  • genemark_gtf2gff3

  • gff3_merge

  • iprscan2gff3

  • iprscan_wrap

  • ipr_update_gff

  • maker

  • maker2chado

  • maker2eval_gtf

  • maker2jbrowse

  • maker2wap

  • maker2zff

  • maker_functional

  • maker_functional_fasta

  • maker_functional_gff

  • maker_map_ids

  • map2assembly

  • map_data_ids

  • map_fasta_ids

  • map_gff_ids

  • tophat2gff3

Module

You can load the modules by:

module load biocontainers
module load maker/2.31.11 # OR maker/3.01.03

Note

Dfam release 3.5 (October 2021), downloaded from the Dfam website (https://www.dfam.org/home) and required by RepeatMasker, has been set up for users. The RepeatMasker library is stored in /depot/itap/datasets/Maker/RepeatMasker/Libraries.

Prerequisites

  1. After loading MAKER modules, users can create MAKER control files with the following command:

    maker -CTL
    

    This will generate three files:

  • maker_opts.ctl (required to be modified)

  • maker_exe.ctl (do not need to modify this file)

  • maker_bopts.ctl (optionally modify this file)

  2. In maker_opts.ctl, set model_org according to whether RepeatMasker is used:

  • If not using RepeatMasker, modify model_org=all to model_org=

  • If using RepeatMasker, modify model_org=all to an appropriate family/genus/species.
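The model_org setting in maker_opts.ctl can be edited by hand or scripted. Below is a minimal sketch, assuming a helper named set_model_org (hypothetical, not shipped with MAKER) that rewrites only the value and leaves the trailing comment on the line untouched. An empty value is used when RepeatMasker is not wanted, and a taxon name when it is; the value must not contain the characters | or &.

```shell
#!/bin/bash
# set_model_org is a hypothetical helper that rewrites the model_org line of
# a MAKER options file. It writes to a temporary file and then replaces the
# original, so it does not depend on a particular sed -i flavor.
set_model_org() {
    local ctl=$1 value=$2
    sed "s|^model_org=[^[:space:]]*|model_org=${value}|" "$ctl" > "$ctl.tmp" \
        && mv "$ctl.tmp" "$ctl"
}

# Apply it only if the control file produced by "maker -CTL" is present.
if [ -f maker_opts.ctl ]; then
    set_model_org maker_opts.ctl ""    # or a taxon name instead of ""
fi
```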

Example job non-mpi

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MAKER on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MAKER
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers maker/2.31.11  # or maker/3.01.03

maker -c 24

Example job mpi

To use MAKER in MPI mode, we cannot use the maker modules. Instead we have to use the singularity image files stored in /apps/biocontainers/images:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 5:00:00
#SBATCH -N 2
#SBATCH -n 24
#SBATCH -c 8
#SBATCH --job-name=MAKER_mpi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --mail-user=UserID@purdue.edu
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

## MAKER2
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_2.31.11.sif maker -c 8

## MAKER3
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_3.01.03.sif maker -c 8
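The MPI job above requests 24 tasks with 8 CPUs each across 2 nodes, and every rank is started with maker -c 8. The arithmetic behind that request can be sketched as follows; the variable names are our own, and the per-node figure assumes the ranks are spread evenly over the nodes.

```shell
#!/bin/bash
# Values taken from the #SBATCH lines of the MPI job.
nodes=2            # -N
ntasks=24          # -n, also passed to mpirun -n
cpus_per_task=8    # -c, also passed to maker -c

total_cores=$((ntasks * cpus_per_task))
# Round up in case the total does not divide evenly by the node count.
cores_per_node=$(( (total_cores + nodes - 1) / nodes ))

echo "MPI ranks:      $ntasks"
echo "cores in total: $total_cores"
echo "cores per node: $cores_per_node"
```

With these values the job occupies 192 cores, or 96 per node, so each node must offer at least that many cores for the request to be satisfiable.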

Manta

Introduction

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads.

For more information, please check:

Versions

  • 1.6.0

Commands

  • configManta.py

  • python

Module

You can load the modules by:

module load biocontainers
module load manta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run manta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=manta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers manta

configManta.py --normalBam=HCC1954.NORMAL.30x.compare.COST16011_region.bam \
    --tumorBam=G15512.HCC1954.1.COST16011_region.bam \
    --referenceFasta=Homo_sapiens_assembly19.COST16011_region.fa \
    --region=8:107652000-107655000 \
    --region=11:94974000-94989000 \
    --exome --runDir="MantaDemoAnalysis"

python MantaDemoAnalysis/runWorkflow.py

Mapcaller

Introduction

Mapcaller is an efficient and versatile approach for short-read mapping and variant identification using high-throughput sequenced data.

For more information, please check its website: https://biocontainers.pro/tools/mapcaller and its home page on Github.

Versions

  • 0.9.9.41

Commands

  • MapCaller

Module

You can load the modules by:

module load biocontainers
module load mapcaller

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mapcaller on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mapcaller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mapcaller

MapCaller index ref.fasta ref

MapCaller -t 12 -i ref -f input_1.fastq  -f2 input_2.fastq  -vcf out.vcf

Mapdamage2

Introduction

mapDamage2 is a computational framework written in Python and R, which tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.

For more information, please check:

Versions

  • 2.2.1

Commands

  • mapDamage

Module

You can load the modules by:

module load biocontainers
module load mapdamage2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mapdamage2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mapdamage2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mapdamage2

Marginpolish

Introduction

MarginPolish is a graph-based assembly polisher. It iteratively finds multiple probable alignment paths for run-length-encoded reads and uses these to generate a refined sequence. It takes as input a FASTA assembly and an indexed BAM (ONT reads aligned to the assembly), and it produces a polished FASTA assembly.

Versions

  • 0.1.3

Commands

  • marginpolish

Module

You can load the modules by:

module load biocontainers
module load marginpolish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run marginpolish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=marginpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers marginpolish

marginpolish \
    Reads_to_assembly_StaphAur.bam \
    Draft_assembly_StaphAur.fasta \
    helen_modles/MP_r941_guppy344_microbial.json \
    -t 32 \
    -o mp_output/mp_images \
    -f

Mash

Introduction

Mash is a fast sequence distance estimator that uses MinHash.

For more information, please check its website: https://biocontainers.pro/tools/mash and its home page on Github.

Versions

  • 2.3

Commands

  • mash

Module

You can load the modules by:

module load biocontainers
module load mash

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mash on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mash

mash dist genome1.fasta genome2.fasta

Mashmap

Introduction

Mashmap is a fast approximate aligner for long DNA sequences.

For more information, please check its website: https://biocontainers.pro/tools/mashmap and its home page on Github.

Versions

  • 2.0-pl5321

Commands

  • mashmap

Module

You can load the modules by:

module load biocontainers
module load mashmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mashmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mashmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mashmap

mashmap -r ref.fasta -t 12 -q input.fasta

Mashtree

Introduction

Mashtree is a tool to create a tree using Mash distances.

For more information, please check its website: https://biocontainers.pro/tools/mashtree and its home page on Github.

Versions

  • 1.2.0

Commands

  • mashtree

  • mashtree_bootstrap.pl

  • mashtree_cluster.pl

  • mashtree_init.pl

  • mashtree_jackknife.pl

  • mashtree_wrapper_deprecated.pl

Module

You can load the modules by:

module load biocontainers
module load mashtree

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mashtree on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mashtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mashtree

Masurca

Introduction

The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, the QuORUM error corrector for Illumina data, the POLCA genome polishing software, a chromosome scaffolder, the jellyfish mer counter, and the MUMmer aligner.

For more information, please check:

Versions

  • 4.0.9

  • 4.1.0

Commands

  • masurca

  • build_human_reference.sh

  • chromosome_scaffolder.sh

  • close_gaps.sh

  • close_scaffold_gaps.sh

  • correct_with_k_unitigs.sh

  • deduplicate_contigs.sh

  • deduplicate_unitigs.sh

  • eugene.sh

  • extract_chrM.sh

  • filter_library.sh

  • final_polish.sh

  • fix_unitigs.sh

  • fragScaff.sh

  • mega_reads_assemble_cluster.sh

  • mega_reads_assemble_cluster2.sh

  • mega_reads_assemble_polish.sh

  • mega_reads_assemble_ref.sh

  • parallel_delta-filter.sh

  • polca.sh

  • polish_with_illumina_assembly.sh

  • recompute_astat_superreads.sh

  • recompute_astat_superreads_CA8.sh

  • reconcile_alignments.sh

  • refine.sh

  • resolve_trio.sh

  • run_ECR.sh

  • samba.sh

  • splitScaffoldsAtNs.sh

Module

You can load the modules by:

module load biocontainers
module load masurca

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run masurca on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=masurca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers masurca

Mauve

Introduction

Mauve is a system for constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion.

For more information, please check its website: https://biocontainers.pro/tools/mauve and its home page: http://darlinglab.org/mauve/.

Versions

  • 2.4.0

Commands

  • mauveAligner

  • progressiveMauve

Module

You can load the modules by:

module load biocontainers
module load mauve

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mauve on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mauve
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mauve

mauveAligner seqs.fasta --output=mauveAligner_output

progressiveMauve --output=threeway.xmfa \
    --output-guide-tree=threeway.tree \
    --backbone-output=threeway.backbone genome1.gbk genome2.gbk genome3.gbk

Maxbin2

Introduction

Maxbin2 is software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.

For more information, please check:

Versions

  • 2.2.7

Commands

  • run_MaxBin.pl

  • run_FragGeneScan.pl

Module

You can load the modules by:

module load biocontainers
module load maxbin2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run maxbin2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxbin2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers maxbin2

run_MaxBin.pl -contig subset_assembly.fa \
     -abund_list abundance.list -max_iteration 5 -out mbin

Maxquant

Introduction

Maxquant is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data.

For more information, please check home page: https://www.maxquant.org.

Versions

  • 2.1.0.0

  • 2.1.3.0

  • 2.1.4.0

  • 2.3.1.0

Commands

  • MaxQuantGui.exe

  • MaxQuantCmd.exe

Module

You can load the modules by:

module load biocontainers
module load maxquant

GUI

To run Maxquant with GUI, it is recommended to run within ThinLinc:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers maxquant
(base) UserID@bell-a008:~ $ MaxQuantGui.exe

CMD job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Maxquant without GUI on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers maxquant

MaxQuantCmd.exe mqpar.xml
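
MaxQuantCmd.exe takes its settings from mqpar.xml, so the thread count is set in that file rather than on the command line. The following is a minimal sketch, assuming your mqpar.xml contains a single-line <numThreads> element; this is an assumption about the parameter file layout and should be checked against your own file. The function name set_maxquant_threads is hypothetical.

```shell
# Hypothetical helper: set the <numThreads> element of an mqpar.xml file.
# Assumes the element sits on one line, e.g. <numThreads>1</numThreads>.
set_maxquant_threads() {
    mqpar="$1"
    # Use the second argument if given, else the Slurm task count, else 1.
    threads="${2:-${SLURM_NTASKS:-1}}"
    # A backup of the original file is written to <mqpar>.bak.
    sed -i.bak "s|<numThreads>[0-9]*</numThreads>|<numThreads>${threads}</numThreads>|" "$mqpar"
}
```

If more than one thread is requested this way, the -n value in the job script should be raised to match.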

Mcl

Introduction

Mcl is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs.

For more information, please check its website: https://biocontainers.pro/tools/mcl and its home page: http://micans.org/mcl/.

Versions

  • 14.137-pl5262

Commands

  • clm

  • clmformat

  • clxdo

  • mcl

  • mclblastline

  • mclcm

  • mclpipeline

  • mcx

  • mcxarray

  • mcxassemble

  • mcxdeblast

  • mcxdump

  • mcxi

  • mcxload

  • mcxmap

  • mcxrand

  • mcxsubs

Module

You can load the modules by:

module load biocontainers
module load mcl

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mcl on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mcl

Mcscanx

Introduction

The MCScanX package has two major components: a modified version of the MCScan algorithm that lets users run MCScan more conveniently and view multiple alignments of syntenic blocks more clearly, and a variety of downstream analysis tools for conducting different biological analyses based on the synteny data generated by the modified MCScan algorithm.

For more information, please check:

Versions

  • default

Commands

  • MCScanX

  • MCScanX_h

  • duplicate_gene_classifier

  • add_ka_and_ks_to_collinearity

  • add_kaks_to_synteny

  • detect_collinearity_within_gene_families

  • detect_synteny_within_gene_families

  • group_collinear_genes

  • group_syntenic_genes

  • origin_enrichment_analysis

Module

You can load the modules by:

module load biocontainers
module load mcscanx

Helper command

Note

To conduct downstream analyses, users need to copy the folder downstream_analyses from the container to the host system.

A helper command, copy_downstream_analyses, is provided to simplify the task. Follow the procedure below to copy downstream_analyses into a target directory:

$ copy_downstream_analyses $PWD # this will copy the downstream_analyses into the current directory.

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mcscanx on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcscanx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mcscanx

## Run MCScanX
MCScanX Result/merge
## Copy downstream_analyses
copy_downstream_analyses $PWD
## Downstream analyses
java circle_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_circ.ctl -o ../Result/merge_circle.png
java dot_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dot.png
java dual_synteny_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dual_synteny.png
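
In the job above, MCScanX Result/merge passes a file prefix rather than a single file. To our understanding of the MCScanX documentation, the program then reads Result/merge.blast and Result/merge.gff. The following sketch checks for both files before the run; the function name check_mcscanx_inputs is hypothetical.

```shell
# Hypothetical pre-flight check: MCScanX is given a path prefix and is
# assumed to read <prefix>.blast and <prefix>.gff. Report missing inputs.
check_mcscanx_inputs() {
    prefix="$1"
    status=0
    for ext in blast gff; do
        if [ ! -f "${prefix}.${ext}" ]; then
            echo "missing: ${prefix}.${ext}" >&2
            status=1
        fi
    done
    # Non-zero return value if at least one input file is missing.
    return "$status"
}
```

A possible use in the job script is check_mcscanx_inputs Result/merge && MCScanX Result/merge.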

Medaka

Introduction

Medaka is a tool to create consensus sequences and variant calls from nanopore sequencing data.

For more information, please check its Docker hub page: https://hub.docker.com/r/ontresearch/medaka and its home page on Github.

Versions

  • 1.6.0

Commands

  • medaka

  • medaka_consensus

  • medaka_counts

  • medaka_data_path

  • medaka_haploid_variant

  • medaka_version_report

Module

You can load the modules by:

module load biocontainers
module load medaka

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Medaka on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=medaka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers medaka

Megadepth

Introduction

Megadepth is an efficient tool for extracting coverage related information from RNA and DNA-seq BAM and BigWig files.

For more information, please check its website: https://biocontainers.pro/tools/megadepth and its home page on Github.

Versions

  • 1.2.0

Commands

  • megadepth

Module

You can load the modules by:

module load biocontainers
module load megadepth

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Megadepth on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megadepth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers megadepth

megadepth sorted.bam

Megahit

Introduction

Megahit is an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.

For more information, please check its website: https://biocontainers.pro/tools/megahit and its home page on Github.

Versions

  • 1.2.9

Commands

  • megahit

Module

You can load the modules by:

module load biocontainers
module load megahit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Megahit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=megahit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers megahit

megahit --12 SRR1976948.abundtrim.subset.pe.fq.gz,SRR1977249.abundtrim.subset.pe.fq.gz -o combined
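
The --12 option above receives its input files as one comma-separated list. When there are many files, the list can be built from the file names instead of typed by hand. The following is a minimal sketch; the function name join_with_commas is hypothetical.

```shell
# Hypothetical helper: join all arguments with commas, for options that
# take a comma-separated file list.
join_with_commas() {
    # Run in a subshell so the changed IFS does not leak to the caller;
    # "$*" joins the arguments using the first character of IFS.
    (IFS=,; printf '%s\n' "$*")
}
```

A possible use is megahit --12 "$(join_with_commas *.pe.fq.gz)" -o combined, assuming all matching files in the current directory belong to the same assembly.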

Megan

Introduction

Megan is a computer program that allows optimized analysis of large metagenomic datasets. Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample.

Versions

  • 6.21.7

Commands

  • MEGAN

  • blast2lca

  • blast2rma

  • daa2info

  • daa2rma

  • daa-meganizer

  • gc-assembler

  • rma2info

  • sam2rma

  • references-annotator

Module

You can load the modules by:

module load biocontainers
module load megan

GUI

To run MEGAN with GUI, it is recommended to run within ThinLinc:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers megan
(base) UserID@bell-a008:~ $ MEGAN

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Megan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers megan

Meme

Introduction

Meme is a collection of tools for the discovery and analysis of sequence motifs.

For more information, please check its website: https://biocontainers.pro/tools/meme and its home page: https://meme-suite.org/meme/.

Versions

  • 5.3.3

  • 5.4.1

  • 5.5.0

Commands

  • ame

  • centrimo

  • dreme

  • dust

  • fimo

  • glam2

  • glam2scan

  • gomo

  • mast

  • mcast

  • meme

  • meme-chip

  • momo

  • purge

  • spamo

  • tomtom

Module

You can load the modules by:

module load biocontainers
module load meme

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Meme on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meme
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers meme

meme seq.fasta -dna -mod oops -pal

meme-chip Klf1.fna -o memechip_klf1_out

Memes

Introduction

memes is an R interface to the MEME Suite family of tools, which provides several utilities for performing motif analysis on DNA, RNA, and protein sequences. memes works by detecting a local install of the MEME suite, running the commands, then importing the results directly into R.

For more information, please check:

Versions

  • 1.1.2

Commands

  • R

Module

You can load the modules by:

module load biocontainers
module load memes

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run memes on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=memes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers memes

Meraculous

Introduction

Meraculous is a whole genome assembler for Next Generation Sequencing data, geared for large genomes. Its hybrid k-mer/read-based approach capitalizes on the high accuracy of Illumina sequence by eschewing an explicit error correction step, which its authors argue is redundant with the assembly process. Meraculous achieves high performance with large datasets by utilizing lightweight data structures and multi-threaded parallelization, allowing it to assemble human-sized genomes on a high-CPU cluster in under a day. The process pipeline implements a highly transparent and portable model of job control and monitoring where different assembly stages can be executed and re-executed separately or in unison on a wide variety of architectures.

For more information, please check:

Versions

  • 2.2.6

Commands

  • run_meraculous.sh

  • blastMapAnalyzer2.pl

  • bmaToLinks.pl

  • _bubbleFinder2.pl

  • bubblePopper.pl

  • bubbleScout.pl

  • contigBias.pl

  • divide_it.pl

  • fasta_splitter.pl

  • findDMin2.pl

  • gapDivider.pl

  • gapPlacer.pl

  • haplotyper.Naive.pl

  • haplotyper.pl

  • histogram2.pl

  • kmerHistAnalyzer.pl

  • loadBalanceMers.pl

  • meraculous4h.pl

  • meraculous.pl

  • N50.pl

  • _oNo4.pl

  • oNo7.pl

  • optimize2.pl

  • randomList2.pl

  • scaffold2contig.pl

  • scaffReportToFasta.pl

  • screen_list2.pl

  • spanner.pl

  • splinter.pl

  • splinter_scaffolds.pl

  • split_and_validate_reads.pl

  • test_dependencies.pl

  • unique.pl

Module

You can load the modules by:

module load biocontainers
module load meraculous

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run meraculous on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meraculous
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers meraculous

Merqury

Introduction

Merqury is a tool to evaluate genome assemblies with k-mers and more.

For more information, please check:

Versions

  • 1.3

Commands

  • merqury.sh

Module

You can load the modules by:

module load biocontainers
module load merqury

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run merqury on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=merqury
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers merqury

merqury.sh F1.k18.meryl col0.hapmer.meryl cvi0.hapmer.meryl \
    athal_COL.fasta athal_CVI.fasta test

Meryl

Introduction

Meryl is a genomic k-mer counter (and sequence utility) with nice features.

For more information, please check its website: https://biocontainers.pro/tools/meryl and its home page on Github.

Versions

  • 1.3

Commands

  • meryl

  • meryl-analyze

  • meryl-import

  • meryl-lookup

  • meryl-simple

Module

You can load the modules by:

module load biocontainers
module load meryl

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Meryl on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meryl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers meryl

meryl count k=42 data/ec.fna.gz output ec.meryl

Metabat

Introduction

Metabat is a robust statistical framework for reconstructing genomes from metagenomic data.

For more information, please check its Docker hub page: https://hub.docker.com/r/metabat/metabat and its home page: https://bitbucket.org/berkeleylab/metabat/src/master/

Versions

  • 2.15-5

Commands

  • aggregateBinDepths.pl

  • aggregateContigOverlapsByBin.pl

  • contigOverlaps

  • jgi_summarize_bam_contig_depths

  • merge_depths.pl

  • metabat

  • metabat1

  • metabat2

  • runMetaBat.sh

Module

You can load the modules by:

module load biocontainers
module load metabat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Metabat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=metabat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers metabat

metabat2 -m 10000 \
    -t 24 \
    -i contig.fasta \
    -o metabat2_output \
    -a depth.txt

Metachip

Introduction

Metachip is a pipeline for Horizontal gene transfer (HGT) identification.

For more information, please check:

Versions

  • 1.10.12

Commands

  • MetaCHIP

Module

You can load the modules by:

module load biocontainers
module load metachip

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run metachip on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metachip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers metachip

MetaPhlAn 3

Introduction

MetaPhlAn (Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:

  • up to 25,000 reads-per-second (on one CPU) analysis speed (orders of magnitude faster compared to existing methods);

  • unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;

  • accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);

  • species-level resolution for bacteria, archaea, eukaryotes and viruses;

  • extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.

For more information, please check its user guide at: https://huttenhower.sph.harvard.edu/metaphlan/

Versions

  • 3.0.14

  • 3.0.9

  • 4.0.2

Commands

  • metaphlan

Database

The latest version of the database (mpa_v30) has been downloaded and built in /depot/itap/datasets/metaphlan/.

Module

You can load the modules by:

module load biocontainers
module load metaphlan/3.0.14

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MetaPhlAn on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MetaPhlAn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers metaphlan/3.0.14

DATABASE=/depot/itap/datasets/metaphlan/
metaphlan SRR11234553_1.fastq,SRR11234553_2.fastq --input_type fastq --nproc 24 -o profiled_metagenome.txt --bowtie2db $DATABASE  --bowtie2out metagenome.bowtie2.bz2

Metaseq

Introduction

Metaseq is a Python package for integrative genome-wide analysis. It has been used to reveal relationships between chromatin insulators and associated nuclear mRNA.

For more information, please check:

Versions

  • 0.5.6

Commands

  • python

  • python2

Module

You can load the modules by:

module load biocontainers
module load metaseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run metaseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers metaseq

Methyldackel

Introduction

MethylDackel (formerly named PileOMeth, which was a temporary name derived due to it using a PILEup to extract METHylation metrics) will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.

For more information, please check:

Versions

  • 0.6.1

Commands

  • MethylDackel

Module

You can load the modules by:

module load biocontainers
module load methyldackel

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run methyldackel on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=methyldackel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers methyldackel

MethylDackel extract chgchh.fa chgchh_aln.bam

Metilene

Introduction

Metilene is a versatile tool to study the effect of epigenetic modifications in differentiation/development, tumorigenesis, and systems biology on a global, genome-wide level.

For more information, please check:

Versions

  • 0.2.8

Commands

  • metilene

  • metilene_input.pl

  • metilene_output.pl

  • metilene_output.R

Module

You can load the modules by:

module load biocontainers
module load metilene

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run metilene on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metilene
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers metilene

metilene -a g1 -b g2 methylation-file

Mhm2

Introduction

MetaHipMer is a de novo metagenome short-read assembler. Version 2 (MHM2) is written entirely in UPC++ and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.

For more information, please check:

Versions

  • 2.0.0

Commands

  • mhm2.py

Module

You can load the modules by:

module load biocontainers
module load mhm2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mhm2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mhm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mhm2

mhm2.py -r input_1.fastq,input_2.fastq

MicrobeDMM

Introduction

MicrobeDMM is a suite of programs used for empirical Bayes fitting of DMM models.

For more information, please check its home page: https://code.google.com/archive/p/microbedmm.

Versions

  • 1.0

Commands

  • DirichletMixtureGHPFit

Module

You can load the modules by:

module load biocontainers
module load microbedmm

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MicrobeDMM on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=microbedmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers microbedmm

Minialign

Introduction

Minialign is a little bit fast and moderately accurate nucleotide sequence alignment tool designed for PacBio and Nanopore long reads.

For more information, please check its website: https://biocontainers.pro/tools/minialign and its home page on GitHub.

Versions

  • 0.5.3

Commands

  • minialign

Module

You can load the modules by:

module load biocontainers
module load minialign

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Minialign on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minialign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers minialign

minialign -d index.mai genome.fasta
minialign -l index.mai input.fastq > out.sam

Miniasm

Introduction

Miniasm is a very fast OLC-based de novo assembler for noisy long reads.

For more information, please check its website: https://biocontainers.pro/tools/miniasm and its home page on GitHub.

Versions

  • 0.3_r179

Commands

  • miniasm

  • minidot

Module

You can load the modules by:

module load biocontainers
module load miniasm

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Miniasm on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=miniasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers miniasm

miniasm -f Elysia_ont_test.fq  Elysia_reads.paf.gz \
     > Elysia_reads.gfa

Minimap2

Introduction

Minimap2 is a versatile pairwise aligner for genomic and spliced nucleotide sequences.

For more information, please check its website: https://biocontainers.pro/tools/minimap2 and its home page on GitHub.

Versions

  • 2.22

  • 2.24

  • 2.26

Commands

  • minimap2

  • paftools.js

  • k8

Module

You can load the modules by:

module load biocontainers
module load minimap2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Minimap2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minimap2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers minimap2

minimap2 -ax sr Wuhan-Hu-1.fasta \
    seq_1.fastq seq_2.fastq \
    > aln.sam

Minipolish

Introduction

Minipolish is a tool for Racon polishing of miniasm assemblies.

For more information, please check:

Versions

  • 0.1.3

Commands

  • minipolish

Module

You can load the modules by:

module load biocontainers
module load minipolish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run minipolish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=minipolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers minipolish

minipolish -t 8 long_reads.fastq.gz assembly.gfa > polished.gfa

Miniprot

Introduction

Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. Miniprot is similar to GeneWise and Exonerate in functionality but it can map proteins to whole genomes and is much faster at the residue alignment step.

For more information, please check:

Versions

  • 0.3

  • 0.7

Commands

  • miniprot

Module

You can load the modules by:

module load biocontainers
module load miniprot

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run miniprot on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=miniprot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers miniprot
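
The job script above stops after loading the module. The lines below are a minimal sketch of an alignment step that could follow it. The file names genome.fna, proteins.faa, and aln.gff are hypothetical placeholders, and the -t (threads) and --gff (GFF3 output) options are assumed from the miniprot usage message, so please confirm them with miniprot -h for the version you load.

```shell
# Hypothetical file names; replace them with your own data.
genome=genome.fna            # target genome (FASTA)
proteins=proteins.faa        # query proteins (FASTA)
out=aln.gff                  # alignments written in GFF3 format
threads=${SLURM_NTASKS:-1}   # follow the -n value requested from Slurm

# Run only when the tool and both inputs are present.
if command -v miniprot >/dev/null 2>&1 && [ -f "$genome" ] && [ -f "$proteins" ]; then
    miniprot -t "$threads" --gff "$genome" "$proteins" > "$out"
else
    echo "miniprot or the input files were not found; nothing was run." >&2
fi
```

To use more than one thread, raise the #SBATCH -n value in the script header accordingly.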

miRDeep2

Introduction

miRDeep2 discovers active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, …).

For more information, please check its website: https://biocontainers.pro/tools/mirdeep2 and its home page on GitHub.

Versions

  • 2.0.1.3

Commands

  • bwa_sam_converter.pl

  • clip_adapters.pl

  • collapse_reads_md.pl

  • convert_bowtie_output.pl

  • excise_precursors_iterative_final.pl

  • excise_precursors.pl

  • extract_miRNAs.pl

  • fastaparse.pl

  • fastaselect.pl

  • fastq2fasta.pl

  • find_read_count.pl

  • geo2fasta.pl

  • get_mirdeep2_precursors.pl

  • illumina_to_fasta.pl

  • make_html2.pl

  • make_html.pl

  • mapper.pl

  • mirdeep2bed.pl

  • miRDeep2_core_algorithm.pl

  • miRDeep2.pl

  • parse_mappings.pl

  • perform_controls.pl

  • permute_structure.pl

  • prepare_signature.pl

  • quantifier.pl

  • remove_white_space_in_id.pl

  • rna2dna.pl

  • samFLAGinfo.pl

  • sam_reads_collapse.pl

  • sanity_check_genome.pl

  • sanity_check_mapping_file.pl

  • sanity_check_mature_ref.pl

  • sanity_check_reads_ready_file.pl

  • select_for_randfold.pl

  • survey.pl

Module

You can load the modules by:

module load biocontainers
module load mirdeep2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run miRDeep2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirdeep2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mirdeep2
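
The job script above only loads the module. A typical miRDeep2 run has two steps, mapping with mapper.pl and detection with miRDeep2.pl, and the sketch below shows one way they might be chained. All file names are hypothetical placeholders, and a bowtie index of the genome with the prefix genome_index is assumed to exist already. The option letters follow the miRDeep2 tutorial; check mapper.pl -h before relying on them.

```shell
# Hypothetical file names; replace them with your own data.
reads=reads.fa                   # adapter-clipped reads (FASTA)
genome=genome.fa                 # reference genome (FASTA)
index=genome_index               # prefix of an existing bowtie index of $genome
mature_ref=mature_ref.fa         # known mature miRNAs of this species
mature_other=mature_other.fa     # known mature miRNAs of related species
hairpin_ref=hairpin_ref.fa       # known precursors of this species
collapsed=reads_collapsed.fa     # collapsed reads written by mapper.pl
arf=reads_collapsed_vs_genome.arf   # read mappings written by mapper.pl

if command -v mapper.pl >/dev/null 2>&1 && [ -f "$reads" ] && [ -f "$genome" ]; then
    # Step 1: clean the reads (-j), drop reads shorter than 18 nt (-l),
    # collapse duplicates (-m), and map them to the genome index (-p).
    mapper.pl "$reads" -c -j -l 18 -m -p "$index" -s "$collapsed" -t "$arf" -v
    # Step 2: detect known and novel miRNAs from the mapped reads.
    miRDeep2.pl "$collapsed" "$genome" "$arf" \
        "$mature_ref" "$mature_other" "$hairpin_ref" 2> report.log
else
    echo "mapper.pl or the input files were not found; nothing was run." >&2
fi
```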

Mirtop

Introduction

Mirtop is a command-line tool for annotating miRNAs and isomiRs with a standard naming convention.

For more information, please check:

Versions

  • 0.4.25

Commands

  • mirtop

Module

You can load the modules by:

module load biocontainers
module load mirtop

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mirtop on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirtop
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mirtop

mirtop gff --format prost --sps hsa \
    --hairpin examples/annotate/hairpin.fa \
    --gtf examples/annotate/hsa.gff3 \
    -o test_out \
    examples/prost/prost.example.txt

Mitofinder

Introduction

Mitofinder is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data.

For more information, please check its website: https://cloud.sylabs.io/library/remiallio/default/mitofinder and its home page on GitHub.

Versions

  • 1.4.1

Commands

  • mitofinder

Module

You can load the modules by:

module load biocontainers
module load mitofinder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mitofinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mitofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mitofinder

mitofinder -j Aphaenogaster_megommata_SRR1303315 \
           -1 Aphaenogaster_megommata_SRR1303315_R1_cleaned.fastq.gz \
           -2 Aphaenogaster_megommata_SRR1303315_R2_cleaned.fastq.gz \
           -r reference.gb -o 5 -p 5 -m 10

Mlst

Introduction

Mlst is used to scan contig files against traditional PubMLST typing schemes.

For more information, please check:

Versions

  • 2.22.0

  • 2.23.0

Commands

  • mlst

Module

You can load the modules by:

module load biocontainers
module load mlst

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mlst on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mlst
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mlst

mlst contigs.fa
mlst genome.gbk.gz

Mmseqs2

Introduction

Mmseqs2 is a software suite to search and cluster huge protein and nucleotide sequence sets.

For more information, please check its website: https://biocontainers.pro/tools/mmseqs2 and its home page on GitHub.

Versions

  • 13.45111

  • 14.7e284

Commands

  • mmseqs

Module

You can load the modules by:

module load biocontainers
module load mmseqs2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mmseqs2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mmseqs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mmseqs2

mmseqs createdb examples/DB.fasta targetDB
mmseqs createtaxdb targetDB tmp
mmseqs createindex targetDB tmp
mmseqs easy-taxonomy examples/QUERY.fasta targetDB alnRes tmp

Mob_suite

Introduction

MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies.

For more information, please check:

Versions

  • 3.0.3

Commands

  • mob_cluster

  • mob_init

  • mob_recon

  • mob_typer

Module

You can load the modules by:

module load biocontainers
module load mob_suite

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mob_suite on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mob_suite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mob_suite
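
The job script above only loads the module. As a minimal sketch, plasmid reconstruction from a draft assembly with mob_recon could be added as follows. The names assembly.fasta and mob_recon_out are hypothetical placeholders, and the option names are assumed from the MOB-suite command-line help, so please verify them with mob_recon --help.

```shell
# Hypothetical names; replace them with your own data.
assembly=assembly.fasta      # draft assembly (FASTA)
outdir=mob_recon_out         # directory that will hold the results
threads=${SLURM_NTASKS:-1}   # follow the -n value requested from Slurm

if command -v mob_recon >/dev/null 2>&1 && [ -f "$assembly" ]; then
    # Separate plasmid contigs from chromosomal contigs and type the plasmids.
    mob_recon --infile "$assembly" --outdir "$outdir" --num_threads "$threads"
else
    echo "mob_recon or $assembly was not found; nothing was run." >&2
fi
```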

Modbam2bed

Introduction

Modbam2bed is a program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.

For more information, please check:

Versions

  • 0.9.1

Commands

  • modbam2bed

Module

You can load the modules by:

module load biocontainers
module load modbam2bed

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run modbam2bed on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modbam2bed
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers modbam2bed
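
The job script above only loads the module. The sketch below shows how a modified-base BAM file might be summarized into a bedMethyl file. The file names are hypothetical placeholders, and the options (-e for extended output, -m for the modified base, --cpg to restrict to CpG sites, -t for threads) are assumed from the modbam2bed help text; confirm them with modbam2bed --help.

```shell
# Hypothetical file names; replace them with your own data.
reference=reference.fasta    # reference the reads were aligned to
bam=sample.bam               # sorted, indexed BAM with modified-base tags
bed=sample.cpg.bed           # bedMethyl output
threads=${SLURM_NTASKS:-1}   # follow the -n value requested from Slurm

if command -v modbam2bed >/dev/null 2>&1 && [ -f "$reference" ] && [ -f "$bam" ]; then
    # modbam2bed writes bedMethyl records to standard output.
    modbam2bed -e -m 5mC --cpg -t "$threads" "$reference" "$bam" > "$bed"
else
    echo "modbam2bed or the input files were not found; nothing was run." >&2
fi
```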

Modeltest-ng

Introduction

ModelTest-NG is a tool for selecting the best-fit model of evolution for DNA and protein alignments. ModelTest-NG supersedes jModelTest and ProtTest in one single tool, with graphical and command console interfaces.

For more information, please check:

Versions

  • 0.1.7

Commands

  • modeltest-ng

  • modeltest-ng-mpi

Module

You can load the modules by:

module load biocontainers
module load modeltest-ng

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run modeltest-ng on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modeltest-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers modeltest-ng
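
The job script above only loads the module. A minimal sketch of a model-selection run on a nucleotide alignment is shown below. The names alignment.fasta and modeltest_out are hypothetical placeholders, and the options (-i input, -d data type, -p processes, -o output prefix) are assumed from the ModelTest-NG help; check modeltest-ng --help for the installed version.

```shell
# Hypothetical names; replace them with your own data.
alignment=alignment.fasta    # multiple sequence alignment
datatype=nt                  # nt for DNA, aa for protein
prefix=modeltest_out         # prefix for the output files
threads=${SLURM_NTASKS:-1}   # follow the -n value requested from Slurm

if command -v modeltest-ng >/dev/null 2>&1 && [ -f "$alignment" ]; then
    modeltest-ng -i "$alignment" -d "$datatype" -p "$threads" -o "$prefix"
else
    echo "modeltest-ng or $alignment was not found; nothing was run." >&2
fi
```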

Momi

Introduction

momi (MOran Models for Inference) is a Python package that computes the expected sample frequency spectrum (SFS), a statistic commonly used in population genetics, and uses it to fit demographic history.

For more information, please check:

Versions

  • 2.1.19

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load momi

Interactive job

To run momi interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers momi
(base) UserID@bell-a008:~ $ python
Python 3.9.7 (default, Sep 16 2021, 13:09:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import momi
>>> import logging
>>> logging.basicConfig(level=logging.INFO,
                 filename="tutorial.log")
>>> model = momi.DemographicModel(N_e=1.2e4, gen_time=29,
                           muts_per_gen=1.25e-8)

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run momi on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=momi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers momi

python python.py

Mothur

Introduction

Mothur is an open source software package for bioinformatics data processing. The package is frequently used in the analysis of DNA from uncultured microbes.

Detailed information about Mothur can be found here: https://mothur.org

Versions

  • 1.46.0

  • 1.47.0

  • 1.48.0

Commands

  • mothur

Module

You can load the modules by:

module load biocontainers
module load mothur/1.47.0

Interactive job

To run mothur interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers mothur/1.47.0
(base) UserID@bell-a008:~ $ mothur
Linux version

Using ReadLine,Boost,HDF5,GSL
mothur v.1.47.0
Last updated: 1/21/22
by
Patrick D. Schloss

Department of Microbiology & Immunology

University of Michigan
http://www.mothur.org

When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.

Distributed under the GNU General Public License

Type 'help()' for information on the commands that are available

For questions and analysis support, please visit our forum at https://forum.mothur.org

Type 'quit()' to exit program

[NOTE]: Setting random seed to 19760620.

Interactive Mode

mothur > align.seqs(help)
mothur > quit()

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a batch job with sbatch on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=mothur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mothur/1.47.0

mothur batch_file

Motus

Introduction

The mOTU profiler is a computational tool that estimates relative taxonomic abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data.

For more information, please check:

Versions

  • 3.0.3

Commands

  • motus

Module

You can load the modules by:

module load biocontainers
module load motus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run motus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=motus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers motus
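
The job script above only loads the module. As a minimal sketch, a taxonomic profile of one paired-end sample could be generated as follows. The read file names and the sample name are hypothetical placeholders, and the options (-f and -r for the read files, -n for the sample name, -t for threads, -o for the output file) are assumed from the motus profile help; confirm them by running motus profile without arguments.

```shell
# Hypothetical names; replace them with your own data.
fwd=sample_1.fastq.gz        # forward reads
rev=sample_2.fastq.gz        # reverse reads
sample=sample                # name written to the profile header
profile=sample.motus         # output profile
threads=${SLURM_NTASKS:-1}   # follow the -n value requested from Slurm

if command -v motus >/dev/null 2>&1 && [ -f "$fwd" ] && [ -f "$rev" ]; then
    motus profile -f "$fwd" -r "$rev" -n "$sample" -t "$threads" -o "$profile"
else
    echo "motus or the read files were not found; nothing was run." >&2
fi
```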

MrBayes

Introduction

MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.

MrBayes is available both in a serial version (‘mb’) and in a parallel version (‘mb-mpi’) that uses MPI instructions to distribute computations across several processors or processor cores. The serial version does not support multi-threading, which means that you will not be able to utilize more than one core on a multi-core machine for a single MrBayes analysis. If you want to utilize all cores, you need to run the MPI version of MrBayes.

Note: ‘mb-mpi’ in this version of the container does not run across multiple nodes (only within a node). This is a bug in the container (upstream).

For more information, please check its website: https://biocontainers.pro/tools/mrbayes and its home page: http://mrbayes.net.

Versions

  • 3.2.7

Commands

  • mb

  • mb-mpi

  • mpirun

  • mpiexec

Module

You can load the modules by:

module load biocontainers
module load mrbayes

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run MrBayes on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mrbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mrbayes
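
The job script above only loads the module. The sketch below picks the serial or the MPI executable according to the number of tasks Slurm granted, staying within one node as the note above requires. The file name analysis.nex is a hypothetical placeholder for a NEXUS file that ends with its own mrbayes command block, so that the run needs no interactive input.

```shell
# Hypothetical file name; replace it with your own NEXUS file.
nexus=analysis.nex
ntasks=${SLURM_NTASKS:-1}    # follow the -n value requested from Slurm

if ! command -v mb >/dev/null 2>&1 || [ ! -f "$nexus" ]; then
    echo "mb or $nexus was not found; nothing was run." >&2
elif [ "$ntasks" -gt 1 ]; then
    # Parallel run on a single node (keep #SBATCH -N 1).
    mpirun -np "$ntasks" mb-mpi "$nexus"
else
    # Serial run on one core.
    mb "$nexus"
fi
```

With the header shown above (#SBATCH -n 1) this runs the serial mb; raise -n to use mb-mpi. MrBayes generally cannot make use of more MPI processes than the total number of chains in the analysis.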

Multiqc

Introduction

Multiqc is a reporting tool that parses summary statistics from results and log files generated by other bioinformatics tools.

For more information, please check its website: https://biocontainers.pro/tools/multiqc and its home page: https://multiqc.info.

Versions

  • 1.11

Commands

  • multiqc

Module

You can load the modules by:

module load biocontainers
module load multiqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Multiqc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=multiqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers multiqc

multiqc fastqc_out -o multiqc_out

Mummer4

Introduction

Mummer4 is a versatile alignment tool for DNA and protein sequences.

For more information, please check its website: https://biocontainers.pro/tools/mummer4 and its home page on GitHub.

Versions

  • 4.0.0rc1-pl5262

Commands

  • annotate

  • combineMUMs

  • delta-filter

  • delta2vcf

  • dnadiff

  • exact-tandems

  • mummer

  • mummerplot

  • nucmer

  • promer

  • repeat-match

  • show-aligns

  • show-coords

  • show-diff

  • show-snps

  • show-tiling

Module

You can load the modules by:

module load biocontainers
module load mummer4

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Mummer4 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mummer4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mummer4

mummer -mum -b -c H_pylori26695_Eslice.fasta H_pyloriJ99_Eslice.fasta > mummer.mums

Muscle

Introduction

Muscle is a modified progressive alignment algorithm which has comparable accuracy to MAFFT, but faster performance.

For more information, please check its website: https://biocontainers.pro/tools/muscle and its home page: http://www.drive5.com/muscle/muscle_userguide3.8.html.

Versions

  • 3.8.1551

  • 5.1

Commands

  • muscle

Module

You can load the modules by:

module load biocontainers
module load muscle

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Muscle on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=muscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers muscle

muscle -align seqs2.fasta  -output seqs.afa

Mutmap

Introduction

MutMap is a powerful and efficient method to identify agronomically important loci in crop plants.

For more information, please check:

Versions

  • 2.3.3

Commands

  • mutmap

  • mutplot

Module

You can load the modules by:

module load biocontainers
module load mutmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mutmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mutmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mutmap
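
The job script above only loads the module. A minimal sketch of a MutMap run from paired-end FASTQ files is shown below. All file names and the bulk size of 20 are hypothetical placeholders, and the options (-r reference, -c cultivar reads, -b bulk reads, -n number of individuals in the bulk, -t threads, -o output directory) are assumed from the MutMap help; confirm them with mutmap -h.

```shell
# Hypothetical names and values; replace them with your own data.
reference=reference.fasta
cultivar=cultivar.1.fastq,cultivar.2.fastq   # paired reads, comma-separated
bulk=bulk.1.fastq,bulk.2.fastq               # paired reads, comma-separated
n_bulk=20                                    # individuals in the mutant bulk
outdir=mutmap_out                            # output directory
threads=${SLURM_NTASKS:-1}                   # follow the -n value from Slurm

if command -v mutmap >/dev/null 2>&1 && [ -f "$reference" ]; then
    mutmap -r "$reference" -c "$cultivar" -b "$bulk" \
        -n "$n_bulk" -t "$threads" -o "$outdir"
else
    echo "mutmap or $reference was not found; nothing was run." >&2
fi
```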

Mykrobe

Introduction

Mykrobe analyses the whole genome of a bacterial sample, all within a couple of minutes, and predicts which drugs the infection is resistant to.

For more information, please check:

Versions

  • 0.11.0

Commands

  • mykrobe

Module

You can load the modules by:

module load biocontainers
module load mykrobe

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run mykrobe on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mykrobe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers mykrobe
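
The job script above only loads the module. As a minimal sketch, a resistance prediction for one sample could be run as follows. The sample name and file names are hypothetical placeholders, the species tb is only an example, and the options are assumed from the mykrobe predict help; confirm them with mykrobe predict --help.

```shell
# Hypothetical names; replace them with your own data.
sample=my_sample             # label written to the report
species=tb                   # example species panel
reads=reads.fq.gz            # sequencing reads (FASTQ, may be gzipped)
report=my_sample.json        # prediction report

if command -v mykrobe >/dev/null 2>&1 && [ -f "$reads" ]; then
    mykrobe predict --sample "$sample" --species "$species" \
        --output "$report" --format json --seq "$reads"
else
    echo "mykrobe or $reads was not found; nothing was run." >&2
fi
```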

N50

Introduction

N50 is a command-line tool to calculate assembly metrics.

For more information, please check:

Versions

  • 1.5.6

Commands

  • n50

Module

You can load the modules by:

module load biocontainers
module load n50

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run n50 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=n50
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers n50
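
The job script above only loads the module. The sketch below computes the metrics for one assembly and saves them as a table. The file names are hypothetical placeholders, and the --format tsv option is an assumption based on the n50 help text; if it is rejected, run n50 with the FASTA file as its only argument.

```shell
# Hypothetical file names; replace them with your own data.
assembly=contigs.fa          # assembly to summarize (FASTA)
stats=contigs.n50.tsv        # table with the computed metrics

if command -v n50 >/dev/null 2>&1 && [ -f "$assembly" ]; then
    n50 --format tsv "$assembly" > "$stats"
else
    echo "n50 or $assembly was not found; nothing was run." >&2
fi
```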

Nanofilt

Introduction

Nanofilt is a tool for filtering and trimming of Oxford Nanopore Sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/nanofilt and its home page on GitHub.

Versions

  • 2.8.0

Commands

  • NanoFilt

Module

You can load the modules by:

module load biocontainers
module load nanofilt

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nanofilt on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanofilt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nanofilt

NanoFilt -q 12 --headcrop 75 reads.fastq |  gzip > trimmed-reads.fastq.gz

Nanolyse

Introduction

Nanolyse is a tool to remove reads mapping to the lambda phage genome from a fastq file.

For more information, please check its website: https://biocontainers.pro/tools/nanolyse and its home page on GitHub.

Versions

  • 1.2.0

Commands

  • NanoLyse

Module

You can load the modules by:

module load biocontainers
module load nanolyse

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nanolyse on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanolyse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nanolyse

gunzip -c reads.fastq.gz |  NanoLyse |  gzip > reads_without_lambda.fastq.gz

Nanoplot

Introduction

Nanoplot is a plotting tool for long read sequencing data and alignments.

For more information, please check its website: https://biocontainers.pro/tools/nanoplot and its home page on GitHub.

Versions

  • 1.39.0

Commands

  • NanoPlot

Module

You can load the modules by:

module load biocontainers
module load nanoplot

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nanoplot on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=nanoplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nanoplot

NanoPlot --summary sequencing_summary.txt --loglength -o summary-plots-log-transformed
NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots dot --legacy hex
NanoPlot -t 12 --color yellow --bam alignment1.bam alignment2.bam alignment3.bam --downsample 10000 -o bamplots_downsampled

Nanopolish

Introduction

Nanopolish is a software package for signal-level analysis of Oxford Nanopore sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/nanopolish and its home page on GitHub.

Versions

  • 0.13.2

  • 0.14.0

Commands

  • nanopolish

Module

You can load the modules by:

module load biocontainers
module load nanopolish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nanopolish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nanopolish

nanopolish index -d fast5_files/ reads.fasta

nanopolish variants --consensus \
    -o polished.vcf -w "tig00000001:200000-202000" \
     -r reads.fasta -b reads.sorted.bam  -g draft.fa

Ncbi-amrfinderplus

Introduction

Ncbi-amrfinderplus and the accompanying database identify acquired antimicrobial resistance genes in bacterial protein and/or assembled nucleotide sequences as well as known resistance-associated point mutations for several taxa.

For more information, please check:

Versions

  • 3.10.30

  • 3.10.42

  • 3.11.2

Commands

  • amrfinder

Module

You can load the modules by:

module load biocontainers
module load ncbi-amrfinderplus

Note

The AMRFinderPlus database has been set up for users. You can check the database version with amrfinder -V. RCAC will keep the database updated; if you notice that it is out of date, please contact us so that we can update it.

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ncbi-amrfinderplus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-amrfinderplus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ncbi-amrfinderplus

# Protein AMRFinder with no genomic coordinates
amrfinder -p test_prot.fa

# Translated nucleotide AMRFinder (will not use HMMs)
amrfinder -n test_dna.fa

# Protein AMRFinder using GFF to get genomic coordinates and 'plus' genes
amrfinder -p test_prot.fa -g test_prot.gff --plus

# Protein AMRFinder with Escherichia protein point mutations
amrfinder -p test_prot.fa -O Escherichia

# Full AMRFinderPlus search combining results
amrfinder -p test_prot.fa -g test_prot.gff -n test_dna.fa -O Escherichia --plus

Ncbi-datasets

Introduction

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or NCBI Datasets web interface.

For more information, please check:

Versions

  • 14.3.0

Commands

  • datasets

  • dataformat

Module

You can load the modules by:

module load biocontainers
module load ncbi-datasets

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ncbi-datasets on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-datasets
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ncbi-datasets
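
The job script above only loads the module. A minimal sketch of a genome download is shown below. The accession GCF_000005845.2 (Escherichia coli K-12 MG1655) and the archive name are used purely as examples, and the --include and --filename options are assumed from the datasets help for version 14; confirm them with datasets download genome accession --help. The sketch also assumes that the node running the job can reach the NCBI servers.

```shell
# Example values; replace them with the accession you need.
accession=GCF_000005845.2    # example RefSeq assembly accession
archive=ecoli_k12.zip        # hypothetical name for the downloaded archive

if command -v datasets >/dev/null 2>&1; then
    # Fetch the genome sequence and its GFF3 annotation as one zip archive.
    datasets download genome accession "$accession" \
        --include genome,gff3 --filename "$archive"
else
    echo "datasets was not found; nothing was downloaded." >&2
fi
```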

Ncbi-genome-download

Introduction

Ncbi-genome-download is a script to download genomes from the NCBI FTP servers.

For more information, please check its website: https://biocontainers.pro/tools/ncbi-genome-download and its home page on Github.

Versions

  • 0.3.1

Commands

  • ncbi-genome-download

Module

You can load the modules by:

module load biocontainers
module load ncbi-genome-download

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ncbi-genome-download on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=ncbi-genome-download
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ncbi-genome-download

ncbi-genome-download bacteria,viral --parallel 4
ncbi-genome-download --genera "Streptomyces coelicolor,Escherichia coli" bacteria
ncbi-genome-download --species-taxids 562 bacteria
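
The `--parallel 4` value above mirrors the `-n 4` request in the job header. One way to keep the two in sync is sketched below, assuming Slurm exports SLURM_NTASKS inside the job (it normally does when `-n` is given). The helper name job_tasks and the fallback of 1 are illustrative choices.

```shell
# Print the task count Slurm granted to this job, or 1 when run outside Slurm.
job_tasks() {
    local n="${SLURM_NTASKS:-1}"
    # Fall back to 1 if the value is empty, zero, or not a plain integer.
    case "$n" in
        ''|*[!0-9]*|0) n=1 ;;
    esac
    echo "$n"
}
```

The download line could then read `ncbi-genome-download bacteria,viral --parallel "$(job_tasks)"`, so changing `-n` in the header is enough.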

Ncbi-table2asn

Introduction

table2asn is a command-line program that creates sequence records for submission to GenBank. It uses many of the same functions as Genome Workbench but is driven generally by data files, and the records it produces do not necessarily require additional manual editing before submission to GenBank.

Versions

  • 1.26.678

Commands

  • table2asn

Module

You can load the modules by:

module load biocontainers
module load ncbi-table2asn

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ncbi-table2asn on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-table2asn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ncbi-table2asn

Neusomatic

Introduction

NeuSomatic is based on deep convolutional neural networks for accurate somatic mutation detection. With properly trained models, it can robustly perform across sequencing platforms, strategies, and conditions. NeuSomatic summarizes and augments sequence alignments in a novel way and incorporates multi-dimensional features to capture variant signals effectively. It is both a universal and an accurate somatic mutation detection method.

Versions

  • 0.2.1

Commands

  • call.py

  • dataloader.py

  • extract_postprocess_targets.py

  • filter_candidates.py

  • generate_dataset.py

  • long_read_indelrealign.py

  • merge_post_vcfs.py

  • merge_tsvs.py

  • network.py

  • postprocess.py

  • preprocess.py

  • resolve_scores.py

  • resolve_variants.py

  • scan_alignments.py

  • split_bed.py

  • train.py

  • utils.py

Module

You can load the modules by:

module load biocontainers
module load neusomatic

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run neusomatic on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=neusomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers neusomatic

Nextalign

Introduction

Nextalign is a command-line tool for aligning viral genome sequences.

Versions

  • 1.10.3

Commands

  • nextalign

Module

You can load the modules by:

module load biocontainers
module load nextalign

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nextalign on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nextalign

nextalign \
    --sequences data/sars-cov-2/sequences.fasta \
    --reference data/sars-cov-2/reference.fasta \
    --genemap data/sars-cov-2/genemap.gff \
    --genes E,M,N,ORF1a,ORF1b,ORF3a,ORF6,ORF7a,ORF7b,ORF8,ORF9b,S \
    --output-dir output/ \
    --output-basename nextalign
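
The `--reference` file in the command above is, to our understanding, expected to hold a single sequence. This is an assumption worth confirming against the Nextalign documentation. A sketch of a pre-flight check that counts FASTA header lines (the function name count_fasta_records is hypothetical):

```shell
# Count FASTA records by counting header lines that start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}
```

A job script could then guard the alignment with `[ "$(count_fasta_records data/sars-cov-2/reference.fasta)" -eq 1 ] || exit 1`.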

Nextclade

Introduction

Nextclade is a tool that identifies differences between your sequences and a reference sequence, uses these differences to assign your sequences to clades, and reports potential sequence quality issues in your data.

Versions

  • 1.10.3

Commands

  • nextclade

Module

You can load the modules by:

module load biocontainers
module load nextclade

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nextclade on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextclade
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nextclade

mkdir -p data
nextclade dataset get --name 'sars-cov-2' --output-dir 'data/sars-cov-2'

nextclade \
    --in-order \
    --input-fasta data/sars-cov-2/sequences.fasta \
    --input-dataset data/sars-cov-2 \
    --output-tsv output/nextclade.tsv \
    --output-tree output/nextclade.auspice.json \
    --output-dir output/ \
    --output-basename nextclade
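
Once the job finishes, output/nextclade.tsv can be summarized with standard tools. The sketch below tallies sequences per clade. It assumes the file is tab-delimited and that its header row contains a column named `clade`; the function name clade_counts is hypothetical.

```shell
# Tally sequences per clade from a Nextclade TSV.
# Assumes a tab-delimited file whose header row has a column named "clade".
clade_counts() {
    awk -F'\t' '
        NR == 1 {
            # Locate the clade column by name rather than by position.
            for (i = 1; i <= NF; i++) if ($i == "clade") col = i
            if (!col) { print "no clade column found" > "/dev/stderr"; exit 1 }
            next
        }
        { n[$col]++ }
        END { for (c in n) printf "%s\t%d\n", c, n[c] }
    ' "$1" | sort
}
```

Calling `clade_counts output/nextclade.tsv` prints one line per clade with its count, sorted by clade name.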

Nextdenovo

Introduction

NextDenovo is a string graph-based de novo assembler for long reads (CLR, HiFi and ONT). It uses a “correct-then-assemble” strategy similar to canu (with no correction step for PacBio HiFi reads), but requires significantly less computing resources and storage. After assembly, the per-base accuracy is about 98-99.8%; to further improve single-base accuracy, try NextPolish.

Versions

  • 2.5.2

Commands

  • nextDenovo

Module

You can load the modules by:

module load biocontainers
module load nextdenovo

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run nextdenovo on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextdenovo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nextdenovo

Nextflow

Introduction

Nextflow is a bioinformatics workflow manager that enables the development of portable and reproducible workflows.

For more information, please check its website: https://biocontainers.pro/tools/nextflow and its home page on Github.

Versions

  • 21.10.0

Commands

  • nextflow

Module

You can load the modules by:

module load biocontainers
module load nextflow

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Nextflow on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextflow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nextflow

Nextpolish

Introduction

NextPolish is used to fix base errors (SNV/Indel) in genomes generated from noisy long reads. It can be used with short-read data only, long-read data only, or a combination of both. It contains two core modules and uses a stepwise approach to correct erroneous bases in the reference genome. To correct or assemble raw third-generation sequencing (TGS) long reads with approximately 10-15% sequencing errors, please use NextDenovo.

Versions

  • 1.4.1

Commands

  • nextPolish

Module

You can load the modules by:

module load biocontainers
module load nextpolish

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run nextpolish on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers nextpolish

Ngs-bits

Introduction

Ngs-bits - Short-read sequencing tools.

For more information, please check its website: https://biocontainers.pro/tools/ngs-bits and its home page on Github.

Versions

  • 2022_04

Commands

  • SampleAncestry

  • SampleDiff

  • SampleGender

  • SampleOverview

  • SampleSimilarity

  • SeqPurge

  • CnvHunter

  • RohHunter

  • UpdHunter

  • CfDnaQC

  • MappingQC

  • NGSDImportQC

  • ReadQC

  • SomaticQC

  • VariantQC

  • TrioMaternalContamination

  • BamCleanHaloplex

  • BamClipOverlap

  • BamDownsample

  • BamFilter

  • BamToFastq

  • BedAdd

  • BedAnnotateFreq

  • BedAnnotateFromBed

  • BedAnnotateGC

  • BedAnnotateGenes

  • BedChunk

  • BedCoverage

  • BedExtend

  • BedGeneOverlap

  • BedHighCoverage

  • BedInfo

  • BedIntersect

  • BedLiftOver

  • BedLowCoverage

  • BedMerge

  • BedReadCount

  • BedShrink

  • BedSort

  • BedSubtract

  • BedToFasta

  • BedpeAnnotateBreakpointDensity

  • BedpeAnnotateCnvOverlap

  • BedpeAnnotateCounts

  • BedpeAnnotateFromBed

  • BedpeFilter

  • BedpeGeneAnnotation

  • BedpeSort

  • BedpeToBed

  • FastqAddBarcode

  • FastqConcat

  • FastqConvert

  • FastqDownsample

  • FastqExtract

  • FastqExtractBarcode

  • FastqExtractUMI

  • FastqFormat

  • FastqList

  • FastqMidParser

  • FastqToFasta

  • FastqTrim

  • VcfAnnotateFromBed

  • VcfAnnotateFromBigWig

  • VcfAnnotateFromVcf

  • VcfBreakMulti

  • VcfCalculatePRS

  • VcfCheck

  • VcfExtractSamples

  • VcfFilter

  • VcfLeftNormalize

  • VcfSort

  • VcfStreamSort

  • VcfToBedpe

  • VcfToTsv

  • SvFilterAnnotations

  • NGSDExportGenes

  • GenePrioritization

  • GenesToApproved

  • GenesToBed

  • GraphStringDb

  • PhenotypeSubtree

  • PhenotypesToGenes

  • PERsim

  • FastaInfo

Module

You can load the modules by:

module load biocontainers
module load ngs-bits

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ngs-bits on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngs-bits
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ngs-bits

SeqPurge -in1 input1_1.fastq input2_1.fastq \
     -in2 input1_2.fastq input2_2.fastq \
     -out1 R1.fastq.gz -out2 R2.fastq.gz
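
SeqPurge takes forward reads through `-in1` and reverse reads through `-in2`, so the two lists are expected to pair up file by file. A small sketch for deriving each mate's name, assuming the `<sample>_1.fastq` / `<sample>_2.fastq` naming used in this example (the helper mate_of is hypothetical):

```shell
# Print the expected reverse-read file name for a forward-read file,
# assuming "<sample>_1.fastq" / "<sample>_2.fastq" naming (optionally gzipped).
mate_of() {
    case "$1" in
        *_1.fastq)    echo "${1%_1.fastq}_2.fastq" ;;
        *_1.fastq.gz) echo "${1%_1.fastq.gz}_2.fastq.gz" ;;
        *) echo "unrecognized forward-read name: $1" >&2; return 1 ;;
    esac
}
```

Building the `-in2` list by calling mate_of on each `-in1` entry keeps the two lists in the same order and avoids listing a reverse file twice.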

Ngsld

Introduction

ngsLD is a program to estimate pairwise linkage disequilibrium (LD) while taking the uncertainty of genotype assignment into account. It does so by avoiding genotype calling and using genotype likelihoods or posterior probabilities.

Versions

  • 1.1.1

Commands

  • ngsLD

Module

You can load the modules by:

module load biocontainers
module load ngsld

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ngsld on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsld
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ngsld

Ngsutils

Introduction

Ngsutils is a suite of software tools for working with next-generation sequencing datasets.

For more information, please check its website: https://biocontainers.pro/tools/ngsutils and its home page: http://ngsutils.org.

Versions

  • 0.5.9

Commands

  • ngsutils

  • bamutils

  • bedutils

  • fastqutils

  • gtfutils

Module

You can load the modules by:

module load biocontainers
module load ngsutils

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ngsutils on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ngsutils

bamutils filter \
    input.bam \
    MQ10filtered.bam  \
    -mapped \
    -noqcfail \
    -gte MAPQ 10

bamutils stats \
   -gtf genome.gtf MQ10filtered.bam \
   > MQ10filtered_bamstats

OrthoFinder

Introduction

OrthoFinder: phylogenetic orthology inference for comparative genomics

Detailed usage can be found here: https://github.com/davidemms/OrthoFinder

Versions

  • 2.5.2

  • 2.5.4

  • 2.5.5

Commands

  • orthofinder

Module

You can load the modules by:

module load biocontainers
module load orthofinder/2.5.4

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run orthofinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=orthofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers orthofinder/2.5.4

orthofinder -t 24 -f InputData -o output
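
The `-f InputData` argument above points at a directory of protein FASTA files, one per species. Since the job requests 20 hours, it may be worth confirming the directory's contents first. The sketch below counts candidate files. The extension list is an assumption about what OrthoFinder picks up, and count_proteomes is a hypothetical name.

```shell
# Count files in a directory whose extension looks like a protein FASTA.
# The extension list is an assumption; adjust it to match your data.
count_proteomes() {
    local dir="$1" n=0 f
    for f in "$dir"/*.fa "$dir"/*.faa "$dir"/*.fasta "$dir"/*.fas "$dir"/*.pep; do
        # Unmatched globs stay literal and fail the -f test, so they are skipped.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

Orthology inference needs at least two species, so a guard such as `[ "$(count_proteomes InputData)" -ge 2 ] || exit 1` could precede the orthofinder call.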

Paml

Introduction

Paml is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.

For more information, please check its website: https://biocontainers.pro/tools/paml and its home page: http://abacus.gene.ucl.ac.uk/software/paml.html.

Versions

  • 4.9

Commands

  • baseml

  • basemlg

  • chi2

  • codeml

  • evolver

  • infinitesites

  • mcmctree

  • pamp

  • yn00

Module

You can load the modules by:

module load biocontainers
module load paml

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Paml on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=paml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers paml

Panacota

Introduction

Panacota is a software package providing tools for large-scale bacterial comparative genomics.

For more information, please check its website: https://biocontainers.pro/tools/panacota and its home page on Github.

Versions

  • 1.3.1

Commands

  • PanACoTA

Module

You can load the modules by:

module load biocontainers
module load panacota

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Panacota on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panacota
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers panacota

PanACoTA annotate \
    -d Examples/genomes_init \
    -l Examples/input_files/list_genomes.lst \
    -r Examples/2-res-QC -Q

Panaroo

Introduction

Panaroo is an updated pipeline for pangenome investigation.

Versions

  • 1.2.10

Commands

  • panaroo

  • panaroo-extract-gene

  • panaroo-filter-pa

  • panaroo-fmg

  • panaroo-gene-neighbourhood

  • panaroo-img

  • panaroo-integrate

  • panaroo-merge

  • panaroo-msa

  • panaroo-plot-abundance

  • panaroo-qc

  • panaroo-spydrpick

Module

You can load the modules by:

module load biocontainers
module load panaroo

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run panaroo on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panaroo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers panaroo

panaroo -i gff/*.gff -o results --clean-mode strict

Pandaseq

Introduction

Pandaseq is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.

For more information, please check its Docker hub: https://hub.docker.com/r/pipecraft/pandaseq and its home page on Github.

Versions

  • 2.11

Commands

  • pandaseq

Module

You can load the modules by:

module load biocontainers
module load pandaseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pandaseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pandaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pandaseq

pandaseq -f SRR069027_1.fastq -r SRR069027_2.fastq

Pandora

Introduction

Pandora is a tool for bacterial genome analysis using a pangenome reference graph (PanRG). It allows gene presence/absence detection and genotyping of SNPs, indels and longer variants in one or a number of samples.

Versions

  • 0.9.1

Commands

  • pandora

Module

You can load the modules by:

module load biocontainers
module load pandora

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pandora on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pandora
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pandora

pandora index -t 4 GC00006032.fa

Pangolin

Introduction

Pangolin is a software package for assigning SARS-CoV-2 genome sequences to global lineages.

For more information, please check its website: https://biocontainers.pro/tools/pangolin and its home page on Github.

Versions

  • 3.1.20

  • 4.0.6

  • 4.1.2

  • 4.1.3

  • 4.2

Commands

  • pangolin

Module

You can load the modules by:

module load biocontainers
module load pangolin

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pangolin on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pangolin
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pangolin

PanPhlAn

Introduction

PanPhlAn (Pangenome-based Phylogenomic Analysis) is a strain-level metagenomic profiling tool for identifying the gene composition and in-vivo transcriptional activity of individual strains in metagenomic samples.

For more information, please check its home page: http://segatalab.cibio.unitn.it/tools/panphlan/.

Versions

  • 3.1

Commands

  • panphlan_download_pangenome.py

  • panphlan_map.py

  • panphlan_profiling.py

Module

You can load the modules by:

module load biocontainers
module load panphlan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run PanPhlAn on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers panphlan

Clara Parabricks

Introduction

NVIDIA’s Clara Parabricks brings next generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google’s DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic and RNA workflows.

Versions

  • 4.0.0-1

Commands

  • pbrun

Module

You can load the modules by:

module load biocontainers
module load parabricks

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Note

Because Clara Parabricks depends on NVIDIA GPUs, it is deployed only on Scholar, Gilbreth, and ACCESS Anvil.

To run Clara Parabricks on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --gpus=1
#SBATCH --job-name=parabricks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers parabricks

pbrun haplotypecaller \
  --ref  FVZG01.1.fsa_nt \
  --in-bam output.bam \
  --out-variants variants.vcf
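
Because pbrun needs a GPU, it can help to confirm that one was actually allocated before the run starts. The sketch below counts devices listed in CUDA_VISIBLE_DEVICES. It assumes the scheduler sets that variable for GPU jobs, which is common but site-dependent; the function name visible_gpus is hypothetical.

```shell
# Count the GPUs visible to the job from CUDA_VISIBLE_DEVICES (e.g. "0,1" -> 2).
# Prints 0 when the variable is unset or empty.
visible_gpus() {
    local devs="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devs" ]; then
        echo 0
        return
    fi
    # Split on commas and count the fields.
    echo "$devs" | awk -F',' '{ print NF }'
}
```

A guard such as `[ "$(visible_gpus)" -ge 1 ] || exit 1` placed before the pbrun command would end the job immediately if no GPU is visible.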

Parallel-fastq-dump

Introduction

Parallel-fastq-dump is a parallel wrapper for fastq-dump.

For more information, please check its website: https://biocontainers.pro/tools/parallel-fastq-dump and its home page on Github.

Versions

  • 0.6.7

Commands

  • parallel-fastq-dump

Module

You can load the modules by:

module load biocontainers
module load parallel-fastq-dump

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Parallel-fastq-dump on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=parallel-fastq-dump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers parallel-fastq-dump

parallel-fastq-dump -s SRR11941281/SRR11941281.sra \
    --split-files --threads 4 --gzip
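
With `--split-files --gzip`, a paired-end run is expected to produce one compressed FASTQ file per read. The sketch below derives the expected names from the .sra path so a job can verify them afterwards. It assumes fastq-dump's usual `<accession>_<read>.fastq.gz` naming and a paired-end library; expected_fastqs is a hypothetical helper.

```shell
# Print the file names expected from a paired-end run with --split-files --gzip.
# Assumes fastq-dump's usual "<accession>_<read>.fastq.gz" naming.
expected_fastqs() {
    local acc
    acc=$(basename "$1" .sra)
    echo "${acc}_1.fastq.gz"
    echo "${acc}_2.fastq.gz"
}
```

After the dump, looping over `expected_fastqs SRR11941281/SRR11941281.sra` and testing each name with `[ -s "$f" ]` gives a quick completeness check.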

Parliament2

Introduction

Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants cover large deletion events that are called as Deletions of a region, Insertions of a sequence into a region, Duplications of a region, Inversions of a region, or Translocations between two regions in the genome.

Versions

  • 0.1.11

Commands

  • parliament2.py

Module

You can load the modules by:

module load biocontainers
module load parliament2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run parliament2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=parliament2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers parliament2

Parsnp

Introduction

Parsnp is used to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours.

For more information, please check its website: https://biocontainers.pro/tools/parsnp and its home page on Github.

Versions

  • 1.6.2

Commands

  • parsnp

Module

You can load the modules by:

module load biocontainers
module load parsnp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Parsnp on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=parsnp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers parsnp

parsnp -g examples/mers_virus/ref/England1.gbk \
     -d examples/mers_virus/genomes/*.fna -c -p 8

Pasapipeline

Introduction

PASA, acronym for Program to Assemble Spliced Alignments (and pronounced ‘pass-uh’), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.

Versions

  • 2.5.2-devb

Commands

  • pasa

  • Launch_PASA_pipeline.pl

  • GMAP_multifasta_processor.pl

  • blat_to_btab.pl

  • blat_to_cdna_clusters.pl

  • blat_top_hit_extractor.pl

  • ensure_single_valid_alignment_per_cdna_per_cluster.pl

  • errors_to_newalign_btabs.pl

  • extract_FL_transdecoder_entries.pl

  • get_failed_transcripts.pl

  • gmap_to_btab.pl

  • import_GMAP_gff3.pl

  • pasa_alignment_assembler_textprocessor.pl

  • pasa_asmbls_to_training_set.extract_reference_orfs.pl

  • polyCistronAnalyzer.pl

  • process_BLAT_alignments.pl

  • process_GMAP_alignments_gff3_chimeras_ok.pl

  • process_PBLAT_alignments.pl

  • process_minimap2_alignments.pl

  • pslx_to_gff3.pl

  • run_spliced_aligners.pl

  • sim4_to_btab.pl

  • Annotation_store_preloader.dbi

  • Load_Current_Gene_Annotations.dbi

  • PASA_transcripts_and_assemblies_to_GFF3.dbi

  • UTR_category_analysis.dbi

  • __drop_many_mysql_dbs.dbi

  • alignment_assembly_to_gene_models.dbi

  • alt_splice_AAT_alignment_generator.dbi

  • assemble_clusters.dbi

  • assembly_db_loader.dbi

  • assign_clusters_by_gene_intergene_overlap.dbi

  • assign_clusters_by_stringent_alignment_overlap.dbi

  • build_comprehensive_transcriptome.dbi

  • build_comprehensive_transcriptome.tabix.dbi

  • cDNA_annotation_comparer.dbi

  • cDNA_annotation_updater.dbi

  • classify_alt_splice_as_UTR_or_protein.dbi

  • classify_alt_splice_isoforms.dbi

  • classify_alt_splice_isoforms_per_subcluster.dbi

  • comprehensive_alt_splice_report.dbi

  • compute_gene_coverage_by_incorporated_PASA_assemblies.dbi

  • create_mysql_cdnaassembly_db.dbi

  • create_sqlite_cdnaassembly_db.dbi

  • describe_alignment_assemblies.dbi

  • describe_alignment_assemblies_cgi_convert.dbi

  • drop_mysql_db_if_exists.dbi

  • dump_annot_store.dbi

  • dump_valid_annot_updates.dbi

  • extract_regions_for_probe_design.dbi

  • extract_skipped_exons.dbi

  • extract_transcript_alignment_clusters.dbi

  • find_FL_equivalent_support.dbi

  • find_alternate_internal_exons.dbi

  • get_antisense_transcripts.dbi

  • import_custom_alignments.dbi

  • import_spliced_alignments.dbi

  • invalidate_RNA-Seq_assembly_artifacts.dbi

  • invalidate_single_exon_ESTs.dbi

  • mapPolyAsites_to_genes.dbi

  • pasa_asmbl_genes_to_GFF3.dbi

  • pasa_asmbls_to_training_set.dbi

  • polyA_site_summarizer.dbi

  • polyA_site_transcript_mapper.dbi

  • populate_alignments_via_btab.dbi

  • populate_ath1_cdnas.dbi

  • populate_cdna_clusters.dbi

  • populate_mysql_assembly_alignment_field.dbi

  • populate_mysql_assembly_sequence_field.dbi

  • purge_PASA_database.dbi

  • purge_annot_comparisons.dbi

  • reassign_clusters_via_valid_align_coords.dbi

  • reconstruct_FL_isoforms_from_parts.dbi

  • report_alt_splicing_findings.dbi

  • reset_to_prior_to_assembly_build.dbi

  • retrieve_assembly_sequences.dbi

  • set_spliced_orient_transcribed_orient.dbi

  • splicing_events_in_subcluster_context.dbi

  • splicing_variation_to_splicing_event.dbi

  • subcluster_builder.dbi

  • subcluster_loader.dbi

  • test_assemble_clusters.dbi

  • test_mysql_connection.dbi

  • update_alignment_status.dbi

  • update_clusters_coordinates.dbi

  • update_fli_status.dbi

  • update_spliced_orient.dbi

  • upload_cdna_headers.dbi

  • upload_transcript_data.dbi

  • validate_alignments_in_db.dbi

Module

You can load the modules by:

module load biocontainers
module load pasapipeline

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pasapipeline on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasapipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pasapipeline

Pasta

Introduction

PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

Versions

  • 1.8.7

Commands

  • run_pasta.py

  • run_seqtools.py

  • sumlabels.py

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load pasta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pasta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pasta

Pblat

Introduction

pblat is a parallelized version of blat with multi-threading support.

Versions

  • 2.5.1

Commands

  • pblat

Module

You can load the modules by:

module load biocontainers
module load pblat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pblat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pblat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pblat

Pbmm2

Introduction

Pbmm2 is a minimap2 frontend for PacBio native data formats.

For more information, please check its website: https://biocontainers.pro/tools/pbmm2 and its home page on Github.

Versions

  • 1.7.0

Commands

  • pbmm2

Module

You can load the modules by:

module load biocontainers
module load pbmm2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pbmm2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pbmm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pbmm2

pbmm2 --version

pbmm2 align hg38.fa \
    alz.polished.hq.bam alz.aligned.bam \
     -j 12 --preset ISOSEQ --sort \
     --log-level INFO

Pbptyper

Introduction

pbptyper is a tool to identify the Penicillin Binding Protein (PBP) of Streptococcus pneumoniae assemblies.

For more information, please check:

Versions

  • 1.0.4

Commands

  • pbptyper

Module

You can load the modules by:

module load biocontainers
module load pbptyper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pbptyper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pbptyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pbptyper

pbptyper --assembly test/SRR2912551.fna.gz --outdir output

PCAngsd

Introduction

PCAngsd is a program that estimates the covariance matrix and individual allele frequencies for low-depth next-generation sequencing (NGS) data in structured/heterogeneous populations using principal component analysis (PCA) to perform multiple population genetic analyses using genotype likelihoods.

For more information, please check its home page on GitHub.

Versions

  • 1.10

Commands

  • pcangsd

Module

You can load the modules by:

module load biocontainers
module load pcangsd

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run PCAngsd on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pcangsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pcangsd

pcangsd -b pupfish.beagle.gz --inbreedSites \
     --selection -o pup_pca2 --threads 12

Peakranger

Introduction

Peakranger is a multi-purpose software suite for analyzing next-generation sequencing (NGS) data.

For more information, please check its website: https://biocontainers.pro/tools/peakranger and its home page: http://ranger.sourceforge.net.

Versions

  • 1.18

Commands

  • peakranger

Module

You can load the modules by:

module load biocontainers
module load peakranger

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Peakranger on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=peakranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers peakranger

peakranger ccat --format bam  27-1_sorted_MDRD_MQ30filtered.bam 27-4_sorted_MDRD_MQ30filtered.bam \
     ccat_result_with_HTML_report_5kb_region --report \
     --gene_annot_file refGene.txt --plot_region 10000

Pepper_deepvariant

Introduction

PEPPER is a genome inference module based on recurrent neural networks that enables long-read variant calling and nanopore assembly polishing in the PEPPER-Margin-DeepVariant pipeline. This pipeline enables nanopore-based variant calling with DeepVariant.

For more information, please check:

Versions

  • r0.4.1

Commands

  • run_pepper_margin_deepvariant

Module

You can load the modules by:

module load biocontainers
module load pepper_deepvariant

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pepper_deepvariant on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=pepper_deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pepper_deepvariant

BASE=$PWD

# Set up input data
INPUT_DIR="${BASE}/input/data"
REF="GRCh38_no_alt.chr20.fa"
BAM="HG002_ONT_2_GRCh38.chr20.quickstart.bam"

# Set the number of CPUs to use
THREADS=32

# Set up output directory
OUTPUT_DIR="${BASE}/output"
OUTPUT_PREFIX="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20"
OUTPUT_VCF="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20.vcf.gz"
TRUTH_VCF="HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz"
TRUTH_BED="HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed"

# Create local directory structure
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${INPUT_DIR}"

# Download the data to input directory
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam.bai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa.fai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed

# Reuse the variables defined above so paths and thread count are set in one place
run_pepper_margin_deepvariant call_variant \
    -b "${INPUT_DIR}/${BAM}" \
    -f "${INPUT_DIR}/${REF}" -o "${OUTPUT_DIR}" \
    -p "${OUTPUT_PREFIX}" \
    -t "${THREADS}" -r chr20:1000000-1020000 \
    --ont_r9_guppy5_sup --ont

BioPerl

Introduction

BioPerl is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It provides software modules for many of the typical tasks of bioinformatics programming.

For more information, please check its website: https://biocontainers.pro/tools/perl-bioperl.

Versions

  • 1.7.2-pl526

Commands

  • SOAPsh.pl

  • ace.pl

  • bam2bedgraph

  • bamToGBrowse.pl

  • bdf2gdfont.pl

  • bdftogd

  • binhex.pl

  • bp_aacomp.pl

  • bp_biofetch_genbank_proxy.pl

  • bp_bioflat_index.pl

  • bp_biogetseq.pl

  • bp_blast2tree.pl

  • bp_bulk_load_gff.pl

  • bp_chaos_plot.pl

  • bp_classify_hits_kingdom.pl

  • bp_composite_LD.pl

  • bp_das_server.pl

  • bp_dbsplit.pl

  • bp_download_query_genbank.pl

  • bp_extract_feature_seq.pl

  • bp_fast_load_gff.pl

  • bp_fastam9_to_table.pl

  • bp_fetch.pl

  • bp_filter_search.pl

  • bp_find-blast-matches.pl

  • bp_flanks.pl

  • bp_gccalc.pl

  • bp_genbank2gff.pl

  • bp_genbank2gff3.pl

  • bp_generate_histogram.pl

  • bp_heterogeneity_test.pl

  • bp_hivq.pl

  • bp_hmmer_to_table.pl

  • bp_index.pl

  • bp_load_gff.pl

  • bp_local_taxonomydb_query.pl

  • bp_make_mrna_protein.pl

  • bp_mask_by_search.pl

  • bp_meta_gff.pl

  • bp_mrtrans.pl

  • bp_mutate.pl

  • bp_netinstall.pl

  • bp_nexus2nh.pl

  • bp_nrdb.pl

  • bp_oligo_count.pl

  • bp_pairwise_kaks

  • bp_parse_hmmsearch.pl

  • bp_process_gadfly.pl

  • bp_process_sgd.pl

  • bp_process_wormbase.pl

  • bp_query_entrez_taxa.pl

  • bp_remote_blast.pl

  • bp_revtrans-motif.pl

  • bp_search2alnblocks.pl

  • bp_search2gff.pl

  • bp_search2table.pl

  • bp_search2tribe.pl

  • bp_seq_length.pl

  • bp_seqconvert.pl

  • bp_seqcut.pl

  • bp_seqfeature_delete.pl

  • bp_seqfeature_gff3.pl

  • bp_seqfeature_load.pl

  • bp_seqpart.pl

  • bp_seqret.pl

  • bp_seqretsplit.pl

  • bp_split_seq.pl

  • bp_sreformat.pl

  • bp_taxid4species.pl

  • bp_taxonomy2tree.pl

  • bp_translate_seq.pl

  • bp_tree2pag.pl

  • bp_unflatten_seq.pl

  • ccconfig

  • chartex

  • chi2

  • chrom_sizes.pl

  • circo

  • clustalw

  • clustalw2

  • corelist

  • cpan

  • cpanm

  • dbilogstrip

  • dbiprof

  • dbiproxy

  • debinhex.pl

  • enc2xs

  • encguess

  • genomeCoverageBed.pl

  • h2ph

  • h2xs

  • htmltree

  • instmodsh

  • json_pp

  • json_xs

  • lwp-download

  • lwp-dump

  • lwp-mirror

  • lwp-request

  • perl

  • perl5.26.2

  • perlbug

  • perldoc

  • perlivp

  • perlthanks

  • piconv

  • pl2pm

  • pod2html

  • pod2man

  • pod2text

  • pod2usage

  • podchecker

  • podselect

  • prove

  • ptar

  • ptardiff

  • ptargrep

  • shasum

  • splain

  • stag-autoschema.pl

  • stag-db.pl

  • stag-diff.pl

  • stag-drawtree.pl

  • stag-filter.pl

  • stag-findsubtree.pl

  • stag-flatten.pl

  • stag-grep.pl

  • stag-handle.pl

  • stag-itext2simple.pl

  • stag-itext2sxpr.pl

  • stag-itext2xml.pl

  • stag-join.pl

  • stag-merge.pl

  • stag-mogrify.pl

  • stag-parse.pl

  • stag-query.pl

  • stag-splitter.pl

  • stag-view.pl

  • stag-xml2itext.pl

  • stubmaker.pl

  • t_coffee

  • tpage

  • ttree

  • unflatten

  • webtidy

  • xml_grep

  • xml_merge

  • xml_pp

  • xml_spellcheck

  • xml_split

  • xpath

  • xsubpp

  • zipdetails

Module

You can load the modules by:

module load biocontainers
module load perl-bioperl

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run BioPerl on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=perl-bioperl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers perl-bioperl
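
The script above only loads the module. As an illustration of one of the listed commands, the sketch below converts a GenBank file to FASTA with bp_seqconvert.pl, which reads from standard input and writes to standard output. The input name sequences.gb is a hypothetical placeholder.

```shell
# Hypothetical GenBank input; the FASTA name is derived from it.
IN="sequences.gb"
OUT="${IN%.gb}.fasta"

# Convert only when the BioPerl script is on PATH and the input exists.
if command -v bp_seqconvert.pl >/dev/null 2>&1 && [ -f "$IN" ]; then
    bp_seqconvert.pl --from genbank --to fasta < "$IN" > "$OUT"
else
    echo "bp_seqconvert.pl or $IN was not found; nothing was run" >&2
fi
```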

Phast

Introduction

PHAST is a freely available software package for comparative and evolutionary genomics.

For more information, please check its website: https://biocontainers.pro/tools/phast and its home page: http://compgen.cshl.edu/phast/.

Versions

  • 1.5

Commands

  • all_dists

  • base_evolve

  • chooseLines

  • clean_genes

  • consEntropy

  • convert_coords

  • display_rate_matrix

  • dless

  • dlessP

  • draw_tree

  • eval_predictions

  • exoniphy

  • hmm_train

  • hmm_tweak

  • hmm_view

  • indelFit

  • indelHistory

  • maf_parse

  • makeHKY

  • modFreqs

  • msa_diff

  • msa_split

  • msa_view

  • pbsDecode

  • pbsEncode

  • pbsScoreMatrix

  • pbsTrain

  • phast

  • phastBias

  • phastCons

  • phastMotif

  • phastOdds

  • phyloBoot

  • phyloFit

  • phyloP

  • prequel

  • refeature

  • stringiphy

  • treeGen

  • tree_doctor

Module

You can load the modules by:

module load biocontainers
module load phast

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phast on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phast
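
The script above does not call any PHAST program. A common pattern is to fit a phylogenetic model with phyloFit and then score conservation with phastCons; the sketch below shows that sequence under stated assumptions. The alignment name, the tree, and the tuning values (0.25 and 12) are hypothetical placeholders, and the species names in the tree must match those in your alignment.

```shell
# Hypothetical inputs; replace with your own alignment and tree.
ALN="alignment.fa"               # multi-FASTA alignment
TREE="((human,chimp),mouse)"     # names must match the alignment
MODEL="init"                     # phyloFit writes ${MODEL}.mod

if command -v phyloFit >/dev/null 2>&1 && [ -f "$ALN" ]; then
    # Fit a substitution model to the alignment.
    phyloFit --tree "$TREE" --msa-format FASTA --out-root "$MODEL" "$ALN"

    # Compute per-base conservation scores using the fitted model.
    phastCons --msa-format FASTA \
        --target-coverage 0.25 --expected-length 12 \
        "$ALN" "${MODEL}.mod" > scores.wig
else
    echo "phyloFit or $ALN was not found; nothing was run" >&2
fi
```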

Phd2fasta

Introduction

Phd2fasta is a tool to convert Phred ‘phd’ format files to ‘fasta’ format.

For more information, please check its home page: http://www.phrap.org/phredphrapconsed.html.

Versions

  • 0.990622

Commands

  • phd2fasta

Module

You can load the modules by:

module load biocontainers
module load phd2fasta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Phd2fasta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phd2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phd2fasta

Phg

Introduction

Practical Haplotype Graph (PHG) is a general, graph-based, computational framework that can be used with a variety of skim sequencing methods to infer high-density genotypes directly from low-coverage sequence.

For more information, please check:

Versions

  • 1.0

Commands

  • CreateConsensi.sh

  • CreateHaplotypes.sh

  • CreateReferenceIntervals.sh

  • CreateSmallDataSet.sh

  • CreateValidIntervalsFile.sh

  • IndexPangenome.sh

  • LoadAssemblyAnchors.sh

  • LoadGenomeIntervals.sh

  • ParallelAssemblyAnchorsLoad.sh

  • RunLiquibaseUpdates.sh

  • CreateHaplotypesFromBAM.groovy

  • CreateHaplotypesFromFastq.groovy

  • CreateHaplotypesFromGVCF.groovy

Module

You can load the modules by:

module load biocontainers
module load phg

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phg on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phg

Phipack

Introduction

PhiPack provides the PHI test and other tests of recombination.

For more information, please check:

Versions

  • 1.1

Commands

  • Phi

  • Profile

Module

You can load the modules by:

module load biocontainers
module load phipack

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phipack on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phipack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phipack

phrap

Introduction

phrap is a program for assembling shotgun DNA sequence data.

For more information, please check its home page: http://www.phrap.org/phredphrapconsed.html#block_phrap.

Versions

  • 1.090518

Commands

  • phrap

Module

You can load the modules by:

module load biocontainers
module load phrap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phrap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phrap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phrap
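
The script above stops after loading the module. A minimal sketch of an assembly run follows, assuming a hypothetical read file named seqs.fasta. If a quality file named seqs.fasta.qual sits next to it, phrap uses it. The -new_ace option requests an .ace file in addition to the contigs.

```shell
# Hypothetical read file; phrap derives its output names from it.
SEQS="seqs.fasta"
LOG="${SEQS}.phrap.out"

if command -v phrap >/dev/null 2>&1 && [ -f "$SEQS" ]; then
    # Contigs are written to ${SEQS}.contigs and the assembly to ${SEQS}.ace.
    phrap "$SEQS" -new_ace > "$LOG"
else
    echo "phrap or $SEQS was not found; nothing was run" >&2
fi
```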

phred

Introduction

phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.

For more information, please check its home page: http://www.phrap.org/phredphrapconsed.html#block_phred.

Versions

  • 0.071220.c

Commands

  • phred

Module

You can load the modules by:

module load biocontainers
module load phred

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phred on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phred

Phylofisher

Introduction

PhyloFisher is a software package written in Python3 that can be used for the creation, analysis, and visualization of phylogenomic datasets that consist of eukaryotic protein sequences.

For more information, please check:

Versions

  • 1.2.7

  • 1.2.9

Commands

  • aa_comp_calculator.py

  • aa_recoder.py

  • apply_to_db.py

  • astral_runner.py

  • backup_restoration.py

  • bipartition_examiner.py

  • build_database.py

  • config.py

  • edirect.py

  • explore_database.py

  • fast_site_remover.py

  • fast_taxa_remover.py

  • fisher.py

  • forest.py

  • genetic_code_examiner.py

  • gfmix_runner.py

  • heterotachy.py

  • informant.py

  • install_deps.py

  • jp.py

  • mammal_modeler.py

  • matrix_constructor.py

  • prep_final_dataset.py

  • purge.py

  • random_resampler.py

  • rst2html.py

  • rst2html4.py

  • rst2html5.py

  • rst2latex.py

  • rst2man.py

  • rst2odt.py

  • rst2odt_prepstyles.py

  • rst2pseudoxml.py

  • rst2s5.py

  • rst2xetex.py

  • rst2xml.py

  • rstpep2html.py

  • rtc_binner.py

  • runxlrd.py

  • select_orthologs.py

  • select_taxa.py

  • sgt_constructor.py

  • taxon_collapser.py

  • vba_extract.py

  • windowmasker_2.2.22_adapter.py

  • working_dataset_constructor.py

Module

You can load the modules by:

module load biocontainers
module load phylofisher

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phylofisher on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylofisher
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phylofisher

Phylosuite

Introduction

PhyloSuite is an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies.

For more information, please check:

Versions

  • 1.2.3

Commands

  • PhyloSuite.sh

Module

You can load the modules by:

module load biocontainers
module load phylosuite

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run phylosuite on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylosuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers phylosuite

Picard Tools

Introduction

Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF. Detailed usage can be found here: https://broadinstitute.github.io/picard/

Versions

  • 2.25.1

  • 2.26.10

Commands

  • picard

Module

You can load the modules by:

module load biocontainers
module load picard/2.26.10

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run picard on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=picard
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers picard/2.26.10

picard MarkDuplicates -Xmx64g I=19P0126636WES_sorted.bam O=19P0126636WES_sorted_md.bam M=19P0126636WES.sorted.markdup.txt REMOVE_DUPLICATES=true
picard BuildBamIndex -Xmx64g I=19P0126636WES_sorted_md.bam
picard CreateSequenceDictionary -R hg38.fa -O hg38.dict

Picrust2

Introduction

Picrust2 is software for predicting functional abundances based only on marker gene sequences.

For more information, please check its website: https://biocontainers.pro/tools/picrust2 and its home page on GitHub.

Versions

  • 2.4.2

  • 2.5.0

Commands

  • add_descriptions.py

  • convert_table.py

  • hsp.py

  • metagenome_pipeline.py

  • pathway_pipeline.py

  • picrust2_pipeline.py

  • place_seqs.py

  • print_picrust2_config.py

  • run_abundance.py

  • run_sepp.py

  • run_tipp.py

  • run_tipp_tool.py

  • run_upp.py

  • shuffle_predictions.py

  • split_sequences.py

  • sumlabels.py

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load picrust2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Picrust2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=picrust2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers picrust2


place_seqs.py -s ../seqs.fna -o out.tre -p 10 \
          --intermediate intermediate/place_seqs

hsp.py -i 16S -t out.tre -o marker_predicted_and_nsti.tsv.gz -p 10 -n

hsp.py -i EC -t out.tre -o EC_predicted.tsv.gz -p 10

metagenome_pipeline.py -i ../table.biom -m marker_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out

convert_table.py EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
             -c contrib_to_legacy \
             -o EC_metagenome_out/pred_metagenome_contrib.legacy.tsv.gz

pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
                -o pathways_out -p 10

add_descriptions.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC \
                -o EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz


add_descriptions.py -i pathways_out/path_abun_unstrat.tsv.gz -m METACYC \
                -o pathways_out/path_abun_unstrat_descrip.tsv.gz

picrust2_pipeline.py -s chemerin_16S/seqs.fna -i chemerin_16S/table.biom \
    -o picrust2_out_pipeline -p 10

Pilon

Introduction

Pilon is an automated genome assembly improvement and variant detection tool.

For more information, please check its website: https://biocontainers.pro/tools/pilon and its home page on GitHub.

Versions

  • 1.24

Commands

  • pilon.jar

Module

You can load the modules by:

module load biocontainers
module load pilon

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pilon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pilon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pilon

pilon.jar --nostrays \
     --genome scaffolds.fasta \
     --frags out_sorted.bam \
     --vcf --verbose --threads 12 \
     --output pilon_corrected \
     --outdir pilon_outdir

Pindel

Introduction

Pindel is used to detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications, and other structural variants at single-base resolution from next-generation sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/pindel and its home page: http://gmt.genome.wustl.edu/packages/pindel/index.html.

Versions

  • 0.2.5b9

Commands

  • pindel

  • pindel2vcf

Module

You can load the modules by:

module load biocontainers
module load pindel

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pindel on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pindel

pindel -i simulated_config.txt -f simulated_reference.fa -o bamtest -c ALL

pindel -p COLO-829_20-p_ok.txt -f hs_ref_chr20.fa -o colontumor -c 20

pindel2vcf -r hs_ref_chr20.fa -R HUMAN_G1K_V2 -d 20100101 -p colontumor_D -e 5

Pirate

Introduction

Pirate is a pangenome analysis and threshold evaluation toolbox.

For more information, please check its website: https://biocontainers.pro/tools/pirate and its home page on GitHub.

Versions

  • 1.0.4

Commands

  • PIRATE

  • FET.pl

  • PIRATE_to_Rtab.pl

  • PIRATE_to_roary.pl

  • SOAPsh.pl

  • ace.pl

  • analyse_blast_outputs.pl

  • analyse_loci_list.pl

  • annotate_treeWAS_output.pl

  • bamToGBrowse.pl

  • bdf2gdfont.pl

  • binhex.pl

  • bp_aacomp.pl

  • bp_biofetch_genbank_proxy.pl

  • bp_bioflat_index.pl

  • bp_biogetseq.pl

  • bp_blast2tree.pl

  • bp_bulk_load_gff.pl

  • bp_chaos_plot.pl

  • bp_classify_hits_kingdom.pl

  • bp_composite_LD.pl

  • bp_das_server.pl

  • bp_dbsplit.pl

  • bp_download_query_genbank.pl

  • bp_extract_feature_seq.pl

  • bp_fast_load_gff.pl

  • bp_fastam9_to_table.pl

  • bp_fetch.pl

  • bp_filter_search.pl

  • bp_find-blast-matches.pl

  • bp_flanks.pl

  • bp_gccalc.pl

  • bp_genbank2gff.pl

  • bp_genbank2gff3.pl

  • bp_generate_histogram.pl

  • bp_heterogeneity_test.pl

  • bp_hivq.pl

  • bp_hmmer_to_table.pl

  • bp_index.pl

  • bp_load_gff.pl

  • bp_local_taxonomydb_query.pl

  • bp_make_mrna_protein.pl

  • bp_mask_by_search.pl

  • bp_meta_gff.pl

  • bp_mrtrans.pl

  • bp_mutate.pl

  • bp_netinstall.pl

  • bp_nexus2nh.pl

  • bp_nrdb.pl

  • bp_oligo_count.pl

  • bp_parse_hmmsearch.pl

  • bp_process_gadfly.pl

  • bp_process_sgd.pl

  • bp_process_wormbase.pl

  • bp_query_entrez_taxa.pl

  • bp_remote_blast.pl

  • bp_revtrans-motif.pl

  • bp_search2alnblocks.pl

  • bp_search2gff.pl

  • bp_search2table.pl

  • bp_search2tribe.pl

  • bp_seq_length.pl

  • bp_seqconvert.pl

  • bp_seqcut.pl

  • bp_seqfeature_delete.pl

  • bp_seqfeature_gff3.pl

  • bp_seqfeature_load.pl

  • bp_seqpart.pl

  • bp_seqret.pl

  • bp_seqretsplit.pl

  • bp_split_seq.pl

  • bp_sreformat.pl

  • bp_taxid4species.pl

  • bp_taxonomy2tree.pl

  • bp_translate_seq.pl

  • bp_tree2pag.pl

  • bp_unflatten_seq.pl

  • cd-hit-2d-para.pl

  • cd-hit-clstr_2_blm8.pl

  • cd-hit-div.pl

  • cd-hit-para.pl

  • chrom_sizes.pl

  • clstr2tree.pl

  • clstr2txt.pl

  • clstr2xml.pl

  • clstr_cut.pl

  • clstr_list.pl

  • clstr_list_sort.pl

  • clstr_merge.pl

  • clstr_merge_noorder.pl

  • clstr_quality_eval.pl

  • clstr_quality_eval_by_link.pl

  • clstr_reduce.pl

  • clstr_renumber.pl

  • clstr_rep.pl

  • clstr_reps_faa_rev.pl

  • clstr_rev.pl

  • clstr_select.pl

  • clstr_select_rep.pl

  • clstr_size_histogram.pl

  • clstr_size_stat.pl

  • clstr_sort_by.pl

  • clstr_sort_prot_by.pl

  • clstr_sql_tbl.pl

  • clstr_sql_tbl_sort.pl

  • convert_to_distmat.pl

  • convert_to_treeWAS.pl

  • debinhex.pl

  • genomeCoverageBed.pl

  • legacy_blast.pl

  • make_multi_seq.pl

  • pangenome_variants_to_treeWAS.pl

  • paralogs_to_Rtab.pl

  • plot_2d.pl

  • plot_len1.pl

  • stag-autoschema.pl

  • stag-db.pl

  • stag-diff.pl

  • stag-drawtree.pl

  • stag-filter.pl

  • stag-findsubtree.pl

  • stag-flatten.pl

  • stag-grep.pl

  • stag-handle.pl

  • stag-itext2simple.pl

  • stag-itext2sxpr.pl

  • stag-itext2xml.pl

  • stag-join.pl

  • stag-merge.pl

  • stag-mogrify.pl

  • stag-parse.pl

  • stag-query.pl

  • stag-splitter.pl

  • stag-view.pl

  • stag-xml2itext.pl

  • stubmaker.pl

  • subsample_outputs.pl

  • subset_alignments.pl

  • unique_sequences.pl

  • update_blastdb.pl

Module

You can load the modules by:

module load biocontainers
module load pirate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pirate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pirate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pirate
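
The script above does not invoke PIRATE itself. The sketch below shows a basic pangenome run, assuming a hypothetical directory gff_files that holds one annotated GFF3 file per genome. The -i, -o, and -t options name the input directory, output directory, and thread count.

```shell
# Hypothetical directories; replace with your own paths.
GFF_DIR="gff_files"
OUT_DIR="pirate_out"

# Use the task count granted by Slurm (1 outside a job).
THREADS="${SLURM_NTASKS:-1}"

if command -v PIRATE >/dev/null 2>&1 && [ -d "$GFF_DIR" ]; then
    PIRATE -i "$GFF_DIR" -o "$OUT_DIR" -t "$THREADS"
else
    echo "PIRATE or $GFF_DIR was not found; nothing was run" >&2
fi
```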

Piscem

Introduction

piscem is a Rust wrapper for a next-generation index + mapper tool (still currently written in C++17).

For more information, please check:

Versions

  • 0.4.3

Commands

  • piscem

Module

You can load the modules by:

module load biocontainers
module load piscem

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run piscem on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=piscem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers piscem

Pixy

Introduction

pixy is a command-line tool for painlessly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.

For more information, please check:

Versions

  • 1.2.7

Commands

  • pixy

Module

You can load the modules by:

module load biocontainers
module load pixy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pixy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pixy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pixy
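
The script above loads pixy without running it. A minimal sketch of a windowed π and dxy calculation follows. The file names are hypothetical: allsites.vcf.gz stands for a bgzipped, tabix-indexed VCF that includes invariant sites, and populations.txt for a tab-separated file of sample IDs and population labels. The 10 kb window is an arbitrary example value.

```shell
# Hypothetical inputs; replace with your own files.
VCF="allsites.vcf.gz"
POPS="populations.txt"
OUT_DIR="pixy_out"

# Use the task count granted by Slurm (1 outside a job).
THREADS="${SLURM_NTASKS:-1}"

# pixy needs the VCF, its tabix index, and the populations file.
if command -v pixy >/dev/null 2>&1 && [ -f "$VCF" ] && [ -f "${VCF}.tbi" ] && [ -f "$POPS" ]; then
    mkdir -p "$OUT_DIR"
    pixy --stats pi dxy \
        --vcf "$VCF" --populations "$POPS" \
        --window_size 10000 --n_cores "$THREADS" \
        --output_folder "$OUT_DIR"
else
    echo "pixy or its input files were not found; nothing was run" >&2
fi
```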

Plasmidfinder

Introduction

PlasmidFinder identifies plasmids in totally or partially sequenced isolates of bacteria.

Versions

  • 2.1.6

Commands

  • plasmidfinder.py

Module

You can load the modules by:

module load biocontainers
module load plasmidfinder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run plasmidfinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plasmidfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers plasmidfinder

plasmidfinder.py -p test/database \
    -i test/test.fsa -o output -mp blastn -x -q

Platon

Introduction

Platon: identification and characterization of bacterial plasmid contigs from short-read draft assemblies.

For more information, please check:

Versions

  • 1.6

Commands

  • platon

Module

You can load the modules by:

module load biocontainers
module load platon

Note

The environment variable PLATON_DB is set as /depot/itap/datasets/platon/db. This directory contains the required database.

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run platon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=platon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers platon

platon --verbose --threads 4 contigs.fasta

Platypus

Introduction

Platypus is a tool designed for efficient and accurate variant detection in high-throughput sequencing data.

Versions

  • 0.8.1

Commands

  • platypus

Module

You can load the modules by:

module load biocontainers
module load platypus

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Platypus on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=platypus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers platypus
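
The script above ends after loading the module. The sketch below shows a basic variant-calling step with the callVariants subcommand, assuming hypothetical file names for an indexed BAM file and an indexed reference FASTA. The thread count follows SLURM_NTASKS, so raise -n in the job header to use more CPUs.

```shell
# Hypothetical inputs; replace with your own files.
BAM="sample.sorted.bam"
REF="reference.fa"
VCF="sample.variants.vcf"

# Use the task count granted by Slurm (1 outside a job).
THREADS="${SLURM_NTASKS:-1}"

if command -v platypus >/dev/null 2>&1 && [ -f "$BAM" ] && [ -f "$REF" ]; then
    platypus callVariants \
        --bamFiles="$BAM" --refFile="$REF" \
        --output="$VCF" --nCPU="$THREADS"
else
    echo "platypus or its input files were not found; nothing was run" >&2
fi
```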

Plink2

Introduction

Plink2 is a whole genome association analysis toolset.

For more information, please check its website: https://biocontainers.pro/tools/plink2 and its home page on GitHub.

Versions

  • 2.00a2.3

Commands

  • plink2

Module

You can load the modules by:

module load biocontainers
module load plink2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Plink2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plink2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers plink2

plink2 --bfile HapMap_3_r3_1 --freq --out HapMap_3_r3_1_out
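The --bfile option in the job above takes a file prefix rather than a single file; plink2 then looks for the matching .bed, .bim, and .fam files. A small pre-flight check along these lines may save a failed job. The helper name bfile_missing is hypothetical and is not a plink2 command.

```shell
# bfile_missing is a hypothetical helper, not a plink2 command.
# It prints every file of the binary fileset that is absent for the given prefix.
bfile_missing() {
    prefix="$1"
    for ext in bed bim fam; do
        [ -f "${prefix}.${ext}" ] || echo "${prefix}.${ext}"
    done
}

missing=$(bfile_missing HapMap_3_r3_1)
if [ -n "$missing" ]; then
    echo "Missing PLINK input files:" >&2
    echo "$missing" >&2
fi
```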

Plotsr

Introduction

Plotsr generates high-quality visualisation of synteny and structural rearrangements between multiple genomes. For this, it uses the genomic structural annotations between multiple chromosome-level assemblies.

Versions

  • 0.5.4

Commands

  • plotsr

Module

You can load the modules by:

module load biocontainers
module load plotsr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run plotsr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plotsr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers plotsr

plotsr syri.out refgenome qrygenome -H 8 -W 5

Pomoxis

Introduction

Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing. Notably, it includes tools for generating and analysing draft assemblies. Many of these tools are used by the research data analysis group at Oxford Nanopore Technologies.

Versions

  • 0.3.9

Commands

  • assess_assembly

  • catalogue_errors

  • common_errors_from_bam

  • coverage_from_bam

  • coverage_from_fastx

  • fast_convert

  • find_indels

  • intersect_assembly_errors

  • long_fastx

  • mini_align

  • mini_assemble

  • pomoxis_path

  • qscores_from_summary

  • ref_seqs_from_bam

  • reverse_bed

  • split_fastx

  • stats_from_bam

  • subsample_bam

  • summary_from_stats

  • tag_bam

  • trim_alignments

Module

You can load the modules by:

module load biocontainers
module load pomoxis

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pomoxis on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pomoxis
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pomoxis

assess_assembly \
    -i helen_output/Staph_Aur_draft_helen.fa \
    -r truth_assembly_staph_aur.fasta \
    -p polished_assembly_quality \
    -l 50 \
    -t 4 \
    -e \
    -T

Poppunk

Introduction

PopPUNK is a tool for clustering genomes. We refer to the clusters as variable-length-k-mer clusters, or VLKCs. Biologically, these clusters typically represent distinct strains. We refer to subclusters of strains as lineages.

Versions

  • 2.5.0

  • 2.6.0

Commands

  • poppunk

  • poppunk_add_weights.py

  • poppunk_assign

  • poppunk_batch_mst.py

  • poppunk_calculate_rand_indices.py

  • poppunk_calculate_silhouette.py

  • poppunk_easy_run.py

  • poppunk_extract_components.py

  • poppunk_extract_distances.py

  • poppunk_info

  • poppunk_iterate.py

  • poppunk_mandrake

  • poppunk_mst

  • poppunk_references

  • poppunk_visualise

Module

You can load the modules by:

module load biocontainers
module load poppunk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run poppunk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=poppunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers poppunk

Popscle

Introduction

Popscle is a suite of population scale analysis tools for single-cell genomics data.

For more information, please check its Docker Hub page: https://hub.docker.com/r/cumulusprod/popscle and its home page on GitHub.

Versions

  • 0.1b

Commands

  • popscle

Module

You can load the modules by:

module load biocontainers
module load popscle

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Popscle on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=popscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers popscle

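# Placeholder file names; replace them with your own inputs and output prefix.
bam=sample.bam
ref_vcf=reference.vcf
pileup=sample.pileup

popscle dsc-pileup --sam data/$bam --vcf data/$ref_vcf --out data/$pileup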

Pplacer

Introduction

Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. Guppy handles the downstream analysis of placements, and rppr provides utilities for working with reference packages.

For more information, please check its website: https://biocontainers.pro/tools/pplacer and its home page: https://matsen.fhcrc.org/pplacer/.

Versions

  • 1.1.alpha19

Commands

  • pplacer

  • guppy

  • rppr

Module

You can load the modules by:

module load biocontainers
module load pplacer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pplacer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pplacer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pplacer

Prinseq

Introduction

Prinseq is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data.

For more information, please check its website: https://biocontainers.pro/tools/prinseq and its home page: http://prinseq.sourceforge.net.

Versions

  • 0.20.4

Commands

  • prinseq-graphs-noPCA.pl

  • prinseq-graphs.pl

  • prinseq-lite.pl

Module

You can load the modules by:

module load biocontainers
module load prinseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Prinseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prinseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers prinseq

prinseq-lite.pl -verbose -fastq  SRR5043021_1.fastq -fastq2 SRR5043021_2.fastq -graph_data test.gd -out_good null -out_bad null
prinseq-graphs.pl -i test.gd -png_all -o test
prinseq-graphs-noPCA.pl -i test.gd -png_all -o test_noPCA

Prodigal

Introduction

Prodigal is a tool for fast, reliable protein-coding gene prediction in prokaryotic genomes.

For more information, please check its website: https://biocontainers.pro/tools/prodigal and its home page on GitHub.

Versions

  • 2.6.3

Commands

  • prodigal

Module

You can load the modules by:

module load biocontainers
module load prodigal

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Prodigal on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prodigal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers prodigal

prodigal -i genome.fasta -o output.genes -a proteins.faa

Prokka

Introduction

Prokka is a pipeline for rapidly annotating prokaryotic genomes. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and, ultimately, for submission to GenBank/DDBJ/ENA.

Detailed usage can be found here: https://github.com/tseemann/prokka

Versions

  • 1.14.6

Commands

  • prokka

  • prokka-abricate_to_fasta_db

  • prokka-biocyc_to_fasta_db

  • prokka-build_kingdom_dbs

  • prokka-cdd_to_hmm

  • prokka-clusters_to_hmm

  • prokka-genbank_to_fasta_db

  • prokka-genpept_to_fasta_db

  • prokka-hamap_to_hmm

  • prokka-tigrfams_to_hmm

  • prokka-uniprot_to_fasta_db

Module

You can load the modules by:

module load biocontainers
module load prokka

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run prokka on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=prokka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers prokka

prokka --compliant --centre UoN --outdir PRJEB12345 --locustag EHEC --prefix EHEC-Chr1 contigs.fa  --cpus 24
prokka-genbank_to_fasta_db Coccus1.gbk Coccus2.gbk Coccus3.gbk Coccus4.gbk > Coccus.faa
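The job above states the core count twice, once in #SBATCH -n 24 and once in --cpus 24. One way to keep the two in sync, sketched here under the assumption that the script runs inside a SLURM job where -n was requested, is to read the SLURM_NTASKS variable that SLURM sets in that case. The helper name thread_count is hypothetical.

```shell
# thread_count is a hypothetical helper. SLURM sets SLURM_NTASKS inside a job
# when -n/--ntasks is requested; outside a job this falls back to 1.
thread_count() {
    echo "${SLURM_NTASKS:-1}"
}

cpus=$(thread_count)
echo "prokka would be started with --cpus ${cpus}"
```

In a job script, the value could then be passed as --cpus "$cpus" (or --threads "$cpus" for tools that use that flag) instead of a hard-coded number.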

Proteinortho

Introduction

Proteinortho is a tool to detect orthologous genes within different species.

For more information, please check its website: https://biocontainers.pro/tools/proteinortho and its home page on GitLab.

Versions

  • 6.0.33

Commands

  • proteinortho

  • proteinortho2html.pl

  • proteinortho2tree.pl

  • proteinortho2xml.pl

  • proteinortho6.pl

  • proteinortho_cleanupblastgraph

  • proteinortho_clustering

  • proteinortho_compareProteinorthoGraphs.pl

  • proteinortho_do_mcl.pl

  • proteinortho_extract_from_graph.pl

  • proteinortho_ffadj_mcs.py

  • proteinortho_formatUsearch.pl

  • proteinortho_grab_proteins.pl

  • proteinortho_graphMinusRemovegraph

  • proteinortho_history.pl

  • proteinortho_singletons.pl

  • proteinortho_summary.pl

  • proteinortho_treeBuilderCore

Module

You can load the modules by:

module load biocontainers
module load proteinortho

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Proteinortho on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=proteinortho
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers proteinortho

proteinortho6.pl test/C.faa test/E.faa test/L.faa test/M.faa

ProtHint

Introduction

ProtHint is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.

Versions

  • 2.6.0

Commands

  • cds_with_upstream_support.py

  • combine_gff_records.pl

  • count_cds_overlaps.py

  • flag_top_proteins.py

  • gff_from_region_to_contig.pl

  • make_chains.py

  • nucseq_for_selected_genes.pl

  • print_high_confidence.py

  • print_longest_isoform.py

  • proteins_from_gtf.pl

  • prothint.py

  • prothint2augustus.py

  • run_spliced_alignment.pl

  • run_spliced_alignment_pbs.pl

  • select_best_proteins.py

  • select_for_next_iteration.py

  • spalnBatch.sh

  • spaln_to_gff.py

Academic license

ProtHint depends on GeneMark. To use GeneMark, users need to download the license files themselves.

Go to the GeneMark web site: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic and LINUX 64 next to it, fill out the form, then click “I agree”. On the next page, right click and copy the link address for the 64-bit license. Paste the link address into the commands below:

cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
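After these commands, the key should exist as a non-empty file named .gm_key in the home directory. A quick sanity check might look like the sketch below; check_gm_key is a hypothetical helper for illustration only, and it tests only that the file exists and is non-empty, not that the license is valid.

```shell
# check_gm_key is a hypothetical helper used only for this illustration.
check_gm_key() {
    # $1: optional path to the key file; defaults to ~/.gm_key
    key_file="${1:-$HOME/.gm_key}"
    if [ -s "$key_file" ]; then
        echo "Found license key: $key_file"
        return 0
    fi
    echo "Missing or empty license key: $key_file" >&2
    return 1
}

check_gm_key || echo "Download the key as described above before running ProtHint." >&2
```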

Module

You can load the modules by:

module load biocontainers
module load prothint

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ProtHint on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=prothint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers prothint

prothint.py --threads 4 input/genome.fasta input/proteins.fasta --geneSeeds input/genemark.gtf --workdir test

Pullseq

Introduction

Pullseq is a utility program for extracting sequences from a fasta/fastq file.

Versions

  • 1.0.2

Commands

  • pcre-config

  • pcregrep

  • pcretest

  • pullseq

  • seqdiff

Module

You can load the modules by:

module load biocontainers
module load pullseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pullseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pullseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pullseq

Purge_dups

Introduction

purge_dups is designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.

Versions

  • 1.2.6

Commands

  • augustify.py

  • bamToWig.py

  • cleanup-blastdb-volumes.py

  • edirect.py

  • executeTestCGP.py

  • extractAnno.py

  • findRepetitiveProtSeqs.py

  • fix_in_frame_stop_codon_genes.py

  • generate_plot.py

  • getAnnoFastaFromJoingenes.py

  • hist_plot.py

  • pd_config.py

  • run_abundance.py

  • run_purge_dups.py

  • run_sepp.py

  • run_tipp.py

  • run_tipp_tool.py

  • run_upp.py

  • split_sequences.py

  • stringtie2fa.py

  • sumlabels.py

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load purge_dups

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run purge_dups on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=purge_dups
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers purge_dups

Pvactools

Introduction

pVACtools is a cancer immunotherapy tools suite consisting of pVACseq, pVACbind, pVACfuse, pVACvector, and pVACview.

Versions

  • 3.0.1

Commands

  • pvacbind

  • pvacfuse

  • pvacseq

  • pvactools

  • pvacvector

  • pvacview

Module

You can load the modules by:

module load biocontainers
module load pvactools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pvactools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pvactools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pvactools

pvacseq download_example_data .

pvacseq run \
  pvacseq_example_data/input.vcf \
  Test \
  HLA-A*02:01,HLA-B*35:01,DRB1*11:01 \
  MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
  pvacseq_output_data \
  -e1 8,9,10 \
  -e2 15 \
  --iedb-install-directory /opt/iedb

Pyani

Introduction

Pyani is an application and Python module for whole-genome classification of microbes using Average Nucleotide Identity.

For more information, please check its website: https://biocontainers.pro/tools/pyani and its home page on GitHub.

Versions

  • 0.2.11

  • 0.2.12

Commands

  • average_nucleotide_identity.py

  • genbank_get_genomes_by_taxon.py

  • delta_filter_wrapper.py

Module

You can load the modules by:

module load biocontainers
module load pyani

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pyani on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pyani

average_nucleotide_identity.py -i tests/ -o tests/test_ANIm_output -m ANIm -g
average_nucleotide_identity.py -i tests/  -o tests/test_ANIb_output -m ANIb -g
average_nucleotide_identity.py -i tests/ -o tests/test_ANIblastall_output -m ANIblastall -g
average_nucleotide_identity.py -i tests/  -o tests/test_TETRA_output -m TETRA -g
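The four calls above differ only in the method name, so they can also be generated by a loop. The sketch below only prints the command lines so they can be inspected first; pyani_commands is a hypothetical helper, and in a real job script the echo would be replaced by the command itself.

```shell
# pyani_commands is a hypothetical helper that only prints the command lines.
pyani_commands() {
    for method in ANIm ANIb ANIblastall TETRA; do
        echo "average_nucleotide_identity.py -i tests/ -o tests/test_${method}_output -m ${method} -g"
    done
}

pyani_commands
```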

Pybedtools

Introduction

Pybedtools wraps and extends BEDTools and offers feature-level manipulations from within Python.

For more information, please check its website: https://biocontainers.pro/tools/pybedtools and its home page on GitHub.

Versions

  • 0.9.0

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pybedtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pybedtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pybedtools

Pybigwig

Introduction

Pybigwig is a Python extension, written in C, for quick access to bigBed files and for access to and creation of bigWig files.

For more information, please check its website: https://biocontainers.pro/tools/pybigwig and its home page on GitHub.

Versions

  • 0.3.18

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pybigwig

Interactive job

To run pybigwig interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers pybigwig
(base) UserID@bell-a008:~ $ python
Python 3.6.15 |  packaged by conda-forge |  (default, Dec  3 2021, 18:49:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyBigWig
>>> bw = pyBigWig.open("test/test.bw")

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run batch jobs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybigwig
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pybigwig

python script.py

Pychopper

Introduction

Pychopper is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.

Versions

  • 2.5.0

Commands

  • cdna_classifier.py

Module

You can load the modules by:

module load biocontainers
module load pychopper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pychopper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pychopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pychopper

Pycoqc

Introduction

Pycoqc is a tool that computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/pycoqc and its home page on GitHub.

Versions

  • 2.5.2

Commands

  • pycoQC

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pycoqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pycoqc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pycoqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pycoqc

pycoQC \
    -f Albacore-1.2.1_basecall-1D-DNA_sequencing_summary.txt \
    -o Albacore-1.2.1_basecall-1D-DNA.html \
    --quiet

Pyensembl

Introduction

Pyensembl is a Python interface to Ensembl reference genome metadata such as exons and transcripts.

For more information, please check its website: https://biocontainers.pro/tools/pyensembl and its home page on GitHub.

Versions

  • 1.9.4

Commands

  • pyensembl

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pyensembl

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pyensembl on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyensembl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pyensembl

Pyfaidx

Introduction

Pyfaidx is a Python package for random access and indexing of fasta files.

For more information, please check its website: https://biocontainers.pro/tools/pyfaidx and its home page on GitHub.

Versions

  • 0.6.4

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pyfaidx

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pyfaidx on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyfaidx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pyfaidx

Pygenometracks

Introduction

pyGenomeTracks aims to produce high-quality genome browser tracks that are highly customizable.

Versions

  • 3.7

Commands

  • make_tracks_file

  • pyGenomeTracks

Module

You can load the modules by:

module load biocontainers
module load pygenometracks

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pygenometracks on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenometracks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pygenometracks

make_tracks_file --trackFiles domains.bed bigwig.bw -o tracks.ini

pyGenomeTracks --tracks tracks.ini \
   --region chr2:10,000,000-11,000,000 --outFileName nice_image.pdf

Pygenomeviz

Introduction

pyGenomeViz is a genome visualization Python package for comparative genomics, built on matplotlib.

Versions

  • 0.2.2

  • 0.3.2

Commands

  • pgv-download-dataset

  • pgv-mmseqs

  • pgv-mummer

  • pgv-pmauve

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pygenomeviz

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pygenomeviz on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenomeviz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pygenomeviz

Pyranges

Introduction

Pyranges are collections of intervals that support comparison operations (like overlap and intersect) and other methods that are useful for genomic analyses.

For more information, please check its website: https://biocontainers.pro/tools/pyranges and its home page on GitHub.

Versions

  • 0.0.115

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pyranges

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pyranges on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyranges
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pyranges

Pysam

Introduction

Pysam is a Python module that makes it easy to read and manipulate mapped short read sequence data stored in SAM/BAM files.

For more information, please check its website: https://biocontainers.pro/tools/pysam and its home page on GitHub.

Versions

  • 0.18.0

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pysam

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Pysam on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pysam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pysam

Pyvcf3

Introduction

PyVCF3 was created because the official PyVCF repository is no longer maintained and does not accept pull requests.

Versions

  • 1.0.3

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load pyvcf3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pyvcf3 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyvcf3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers pyvcf3

QIIME 2

Introduction

QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.

For more information, please check its website: https://quay.io/repository/qiime2/core and its home page: https://qiime2.org/.

Versions

  • 2021.2

  • 2022.2

  • 2022.8

  • 2022.11

  • 2023.2

  • 2023.5

Commands

  • qiime

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load qiime2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run QIIME 2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qiime2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers qiime2

qiime metadata tabulate \
    --m-input-file rep-seqs.qza \
    --m-input-file taxonomy.qza \
    --o-visualization tabulated-feature-metadata.qzv

Qtlseq

Introduction

Bulked segregant analysis, as implemented in QTL-seq (Takagi et al., 2013), is a powerful and efficient method to identify agronomically important loci in crop plants. QTL-seq was adapted from MutMap to identify quantitative trait loci. It utilizes sequences pooled from two segregating progeny populations with extreme opposite traits (e.g. resistant vs susceptible) and a single whole-genome resequencing of either of the parental cultivars.

Versions

  • 2.2.3

Commands

  • qtlseq

Module

You can load the modules by:

module load biocontainers
module load qtlseq

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run qtlseq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qtlseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers qtlseq

Qualimap

Introduction

Qualimap is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives, such as feature counts.

For more information, please check its website: https://biocontainers.pro/tools/qualimap and its home page: http://qualimap.conesalab.org.

Versions

  • 2.2.1

Commands

  • qualimap

Module

You can load the modules by:

module load biocontainers
module load qualimap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Qualimap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qualimap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers qualimap

Quast

Introduction

Quast is a quality assessment tool for genome assemblies.

Note: To run QUAST, call quast.py or metaquast.py directly, for example: quast.py fastafile [OTHER OPTIONS]. Do not call it as ‘python quast.py’ or ‘python metaquast.py’.

For more information, please check its website: https://biocontainers.pro/tools/quast and its home page on GitHub.

Versions

  • 5.0.2

  • 5.2.0

Commands

  • quast.py

  • metaquast.py

Module

You can load the modules by:

module load biocontainers
module load quast

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Quast on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=quast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers quast

metaquast.py  --gene-finding --threads 8  \
    meta_contigs_1.fasta meta_contigs_2.fasta \
    -r meta_ref_1.fasta,meta_ref_2.fasta,meta_ref_3.fasta \
    -o quast_out_genefinding

QuickMIRSeq

Introduction

QuickMIRSeq is an integrated pipeline for quick and accurate quantification of known miRNAs and isomiRs by jointly processing multiple samples.

For more information, please check its Docker hub: https://hub.docker.com/r/gcfntnu/quickmirseq and its home page on Github.

Versions

  • 1.0

Commands

  • perl

  • QuickMIRSeq-report.sh

Module

You can load the modules by:

module load biocontainers
module load quickmirseq

Note

This module defines program installation directory (note: inside the container!) as environment variable $QuickMIRSeq. Once again, this is not a host path, this path is only available from inside the container.

With the way this module is organized, you should be able to use the variable freely for both the perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config and the $QuickMIRSeq/QuickMIRSeq-report.sh steps as directed by the user guide.

A simple QuickMIRSeq.pl and QuickMIRSeq-report.sh will also work (and can be a backup if the variable expansion somehow does not work for you).

You will also need a run configuration file. You can copy an existing one, take one from the user guide, or, as a last resort, use Singularity to copy the template (in $QuickMIRSeq/run.config.template) from inside the container image. singularity shell may be the easiest way to do the latter.
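As a non-interactive alternative to singularity shell, the template could also be copied with singularity exec. This is only a sketch under stated assumptions: the function name copy_run_config_template is hypothetical, the path of the container image is not given in this guide and must be supplied by you, and $QuickMIRSeq is expected to have been set by loading the module.

```shell
# copy_run_config_template IMAGE [DEST]
# Copies run.config.template out of the container image into DEST
# (default: run.config in the current directory).
copy_run_config_template() {
    local image="$1" dest="${2:-run.config}"
    if [ -z "${QuickMIRSeq:-}" ]; then
        echo "QuickMIRSeq is not set; load the quickmirseq module first" >&2
        return 1
    fi
    # The path in $QuickMIRSeq exists only inside the container,
    # so the file is read there and written on the host side.
    singularity exec "$image" cat "$QuickMIRSeq/run.config.template" > "$dest"
}
```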

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run QuickMIRSeq on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=quickmirseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers quickmirseq

perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config
$QuickMIRSeq/QuickMIRSeq-report.sh

R

Introduction

R is a system for statistical computation and graphics.

This is a plain R-base installation (see https://github.com/rocker-org/rocker/) repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2 and libXt) and their header files.

For more information, please check its Docker hub: https://hub.docker.com/_/r-base and its home page: https://www.r-project.org/.

Versions

  • 4.1.1

Commands

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load r

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run R on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers r

Racon

Introduction

Racon is a consensus module for raw de novo DNA assembly of long uncorrected reads.

For more information, please check its website: https://biocontainers.pro/tools/racon and its home page on Github.

Versions

  • 1.4.20

  • 1.5.0

Commands

  • racon

Module

You can load the modules by:

module load biocontainers
module load racon

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Racon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=racon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers racon
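The job script above loads the module but does not call racon. A minimal sketch of a polishing step is given below. It relies on racon's positional arguments (reads, overlaps, target sequences) and its -t thread option; the function name polish_with_racon and all file names are hypothetical, and the overlaps file is assumed to have been produced beforehand by a separate mapping step.

```shell
# polish_with_racon READS OVERLAPS DRAFT OUT [THREADS]
# Writes the consensus that racon prints on standard output to OUT.
polish_with_racon() {
    local reads="$1" overlaps="$2" draft="$3" out="$4" threads="${5:-1}"
    racon -t "$threads" "$reads" "$overlaps" "$draft" > "$out"
}

# Inside the job script (placeholder file names):
# polish_with_racon reads.fastq overlaps.paf draft.fasta polished.fasta "${SLURM_NTASKS:-1}"
```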

Ragout

Introduction

Ragout is a tool for chromosome-level scaffolding using multiple references.

For more information, please check its website: https://biocontainers.pro/tools/ragout and its home page on Github.

Versions

  • 2.3

Commands

  • ragout

Module

You can load the modules by:

module load biocontainers
module load ragout

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ragout on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ragout

Ragtag

Introduction

Ragtag is a tool for fast reference-guided genome assembly scaffolding.

For more information, please check its website: https://biocontainers.pro/tools/ragtag and its home page on Github.

Versions

  • 2.1.0

Commands

  • ragtag.py

Module

You can load the modules by:

module load biocontainers
module load ragtag

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Ragtag on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragtag
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ragtag

ragtag.py correct ref.fasta query.fasta
ragtag.py patch target.fa query.fa

Rapmap

Introduction

RapMap is a testing ground for ideas in quasi-mapping and selective alignment.

For more information, please check:

Versions

  • 0.6.0

Commands

  • rapmap

Module

You can load the modules by:

module load biocontainers
module load rapmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rapmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rapmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rapmap

Rasusa

Introduction

Rasusa: Randomly subsample sequencing reads to a specified coverage.

For more information, please check:

Versions

  • 0.6.0

  • 0.7.0

Commands

  • rasusa

Module

You can load the modules by:

module load biocontainers
module load rasusa

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rasusa on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rasusa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rasusa

rasusa -i seq_1.fq -i seq_2.fq  \
    --coverage 100 --genome-size 35mb  \
    -o out.r1.fq -o out.r2.fq

Raven-assembler

Introduction

Raven-assembler is a de novo genome assembler for long uncorrected reads.

For more information, please check its website: https://biocontainers.pro/tools/raven-assembler and its home page on Github.

Versions

  • 1.8.1

Commands

  • raven

Module

You can load the modules by:

module load biocontainers
module load raven-assembler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Raven-assembler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raven-assembler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers raven-assembler

raven -t 12 input.fastq

Raxml

Introduction

Raxml (Randomized Axelerated Maximum Likelihood) is a program for the Maximum Likelihood-based inference of large phylogenetic trees.

For more information, please check its website: https://biocontainers.pro/tools/raxml and its home page: https://cme.h-its.org/exelixis/web/software/raxml/.

Versions

  • 8.2.12

Commands

  • raxmlHPC

  • raxmlHPC-AVX2

  • raxmlHPC-PTHREADS

  • raxmlHPC-PTHREADS-AVX2

  • raxmlHPC-PTHREADS-SSE3

  • raxmlHPC-SSE3

Module

You can load the modules by:

module load biocontainers
module load raxml

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Raxml on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 36
#SBATCH --job-name=raxml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers raxml

raxmlHPC-SSE3 -m GTRGAMMA  -p 12345 -s input.fasta -n HPC-SSE3_out -# 20 -T 36
raxmlHPC -m GTRGAMMA  -p 12345 -s input.fasta -n HPC_out -# 20 -T 36
raxmlHPC-AVX2  -m GTRGAMMA  -p 12345 -s input.fasta -n HPC-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS  -m GTRGAMMA  -p 12345 -s input.fasta -n HPC-PTHREADS_out -# 20 -T 36
raxmlHPC-PTHREADS-AVX2  -m GTRGAMMA  -p 12345 -s input.fasta -n HPC-PTHREADS-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS-SSE3  -m GTRGAMMA  -p 12345 -s input.fasta -n HPC-PTHREADS-SSE3_out -# 20 -T 36

Raxml-ng

Introduction

Raxml-ng is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.

For more information, please check its website: https://biocontainers.pro/tools/raxml-ng and its home page on Github.

Versions

  • 1.1.0

Commands

  • raxml-ng

  • raxml-ng-mpi

  • mpirun

  • mpiexec

Module

You can load the modules by:

module load biocontainers
module load raxml-ng

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Raxml-ng on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raxml-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers raxml-ng

raxml-ng --bootstrap --msa alignment.phy \
     --model GTR+G --threads 12 --bs-trees 1000
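The thread count passed to raxml-ng should not exceed the number of tasks requested with #SBATCH -n. One way to keep the two in sync, sketched here with a hypothetical helper named slurm_threads, is to read the SLURM_NTASKS variable that Slurm sets inside a job when -n is given:

```shell
# slurm_threads
# Prints the number of tasks granted by Slurm (-n), or 1 outside a job.
slurm_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Inside the job script the value could be passed on like this:
# raxml-ng --bootstrap --msa alignment.phy \
#      --model GTR+G --threads "$(slurm_threads)" --bs-trees 1000
```

The same pattern applies to other multi-threaded tools in this guide that take a thread count on the command line.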

R-cellchat

Introduction

CellChat: Inference and analysis of cell-cell communication.

For more information, please check:

Versions

  • 1.5.0

Commands

  • R

  • Rscript

  • rstudio

Module

You can load the modules by:

module load biocontainers
module load r-cellchat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run r-cellchat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-cellchat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers r-cellchat

Reapr

Introduction

Reapr is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads.

For more information, please check:

Notes provided by Neelam Jha

https://bioinformaticsonline.com/bookmarks/view/26925/reapr-a-universal-tool-for-genome-assembly-evaluation

Reapr is a tool that tries to find explicit errors in an assembly based on incongruently mapped reads. It relies heavily on detecting span coverage that is too low, or reads that map too far from or too close to each other. The program will also break up contigs/scaffolds at spurious sites to form smaller (but hopefully correct) contigs. Unfortunately, Reapr runs rather slowly.

Reapr is a bit fussy about contig names, but luckily it provides a tool to check whether the names are acceptable before we proceed. The command reapr facheck <assembly.fasta> reports any problems with the contig names; in this case, no output is good output. If you run into any problems, run reapr facheck <assembly.fasta> <renamed_assembly.fasta>, and you will get an assembly file with renamed contigs.

Once the names are ok, we continue:

The first thing Reapr needs is a list of all “perfect” reads, that is, reads that map perfectly to the reference. Reapr is finicky, though, and can’t use libraries with different read lengths, so you’ll have to use assemblies based on the raw data for this. Run the command reapr perfectmap to get information on how to create a perfect mapping file, and create a perfect mapping called <assembler>_perfect.

The next tool we need is reapr smaltmap, which creates a bam file of read-pair mappings. Do the same thing you did with perfectmap and create an output file called <assembler>_smalt.bam.

Finally, we can use the smalt mapping and the perfect mapping to run the reapr pipeline. Run reapr pipeline to get help on how to run it, and then run the pipeline. Store the results in reapr_<assembler>.

Versions

  • 1.0.18

Commands

  • reapr

Module

You can load the modules by:

module load biocontainers
module load reapr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run reapr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reapr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers reapr

reapr facheck Assembly.fasta renamedAssembly.fasta
reapr perfectmap renamedAssembly.fasta reads_1.fastq reads_2.fastq 100 outputPrefix
reapr smaltmap renamedAssembly.fasta reads_1.fastq reads_2.fastq mapped.bam
reapr pipeline renamedAssembly.fasta mapped.bam pipeoutdir outputPrefix

Rebaler

Introduction

Rebaler is a program for conducting reference-based assemblies using long reads.

For more information, please check its website: https://biocontainers.pro/tools/rebaler and its home page on Github.

Versions

  • 0.2.0

Commands

  • rebaler

Module

You can load the modules by:

module load biocontainers
module load rebaler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Rebaler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rebaler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rebaler

Reciprocal Smallest Distance

Introduction

The reciprocal smallest distance (RSD) algorithm accurately infers orthologs between pairs of genomes by considering global sequence alignment and maximum likelihood evolutionary distance between sequences.

For more information, please check its home page on Github.

Versions

  • 1.1.7

Commands

  • rsd_search

  • rsd_blast

  • rsd_format

Module

You can load the modules by:

module load biocontainers
module load reciprocal_smallest_distance

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Reciprocal Smallest Distance on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reciprocal_smallest_distance
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers reciprocal_smallest_distance

rsd_search \
    -q Mycoplasma_genitalium.aa \
    --subject-genome=Mycobacterium_leprae.aa \
    -o Mycoplasma_genitalium.aa_Mycobacterium_leprae.aa_0.8_1e-5.orthologs.txt

rsd_format -g Mycoplasma_genitalium.aa

rsd_blast -v -q Mycoplasma_genitalium.aa \
    --subject-genome=Mycobacterium_leprae.aa \
    --forward-hits q_s.hits --reverse-hits s_q.hits \
    --no-format --evalue 0.1

Recycler

Introduction

Recycler is a tool designed for extracting circular sequences from de novo assembly graphs.

For more information, please check its website: https://biocontainers.pro/tools/recycler and its home page on Github.

Versions

  • 0.7

Commands

  • make_fasta_from_fastg.py

  • get_simple_cycs.py

  • recycle.py

Module

You can load the modules by:

module load biocontainers
module load recycler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Recycler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=recycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers recycler

recycle.py -g test/assembly_graph.fastg \
    -k 55 -b test/test.sort.bam -i True

Regtools

Introduction

Regtools are tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.

For more information, please check:

Versions

  • 1.0.0

Commands

  • regtools

Module

You can load the modules by:

module load biocontainers
module load regtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run regtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=regtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers regtools

RepeatMasker

Introduction

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. Detailed usage can be found here: http://www.repeatmasker.org.

Versions

  • 4.1.2

Commands

  • RepeatMasker

Database

Note

As of May 20, 2019, GIRI has rescinded the working agreement allowing the www.repeatmasker.org website to offer a repeatmasking service utilizing the RepBase RepeatMasker Edition library. As a result, repeatmasker can only offer masking using the open database Dfam, which starting in 3.0 includes consensus sequences in addition to profile hidden Markov models for many transposable element families. Users requiring RepBase will need to purchase a commercial or academic license from GIRI and run RepeatMasker locally.

On our cluster, we have set up Dfam release 3.5 (October 2021), which includes 285,580 repetitive DNA families.

Species name

Note

Since v4.1.1, RepeatMasker has switched to the FamDB format for the Dfam database. Due to this change, RepeatMasker has become stricter about what is acceptable for the -species flag. Commonly used names such as “mammal” and “mouse” will not be accepted. To check for valid names, you can query the database using the python script famdb.py (https://github.com/Dfam-consortium/FamDB).

See famdb.py --help for usage information, and see below for an example that checks the valid name for “mammal” using our copy of the Dfam database:

/depot/itap/datasets/Maker/RepeatMasker/Libraries/famdb.py -i /depot/itap/datasets/Maker/RepeatMasker/Libraries/Dfam.h5 names mammal

Module

You can load the modules by:

module load biocontainers
module load repeatmasker/4.1.2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RepeatMasker on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=repeatmasker
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers repeatmasker/4.1.2

RepeatMasker -pa 24 -species mammals genome.fasta

RepeatModeler

Introduction

RepeatModeler is a de novo transposable element (TE) family identification and modeling package.

For more information, please check its website: https://biocontainers.pro/tools/repeatmodeler and its home page: http://www.repeatmasker.org/RepeatModeler/.

Versions

  • 2.0.2

  • 2.0.3

Commands

  • RepeatModeler

  • BuildDatabase

  • RepeatClassifier

Module

You can load the modules by:

module load biocontainers
module load repeatmodeler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RepeatModeler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatmodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers repeatmodeler
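The job script above does not include an analysis step. A minimal sketch using two of the commands listed for this module is shown below: BuildDatabase creates a sequence database from the genome, and RepeatModeler then runs on that database. The function name model_repeats, the database name, and the genome file name are hypothetical. The -pa option is assumed to be the parallel-job option of the 2.0.2 and 2.0.3 releases listed above; check RepeatModeler -help if you use a different version.

```shell
# model_repeats GENOME DBNAME [JOBS]
# Builds a database from GENOME, then runs RepeatModeler on it.
model_repeats() {
    local genome="$1" db="$2" jobs="${3:-1}"
    BuildDatabase -name "$db" "$genome" &&
        RepeatModeler -database "$db" -pa "$jobs"
}

# Inside the job script (placeholder names):
# model_repeats genome.fasta my_genome_db 4
```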

RepeatScout

Introduction

RepeatScout is a tool to discover repetitive substrings in DNA.

For more information, please check its website: https://biocontainers.pro/tools/repeatscout and its home page on Github.

Versions

  • 1.0.6

Commands

  • RepeatScout

  • build_lmer_table

  • compare-out-to-gff.prl

  • filter-stage-1.prl

  • filter-stage-2.prl

  • merge-lmer-tables.prl

Module

You can load the modules by:

module load biocontainers
module load repeatscout

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RepeatScout on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatscout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers repeatscout

build_lmer_table -l 14 -sequence genome.fasta -freq Final_assembly.freq

RepeatScout -sequence genome.fasta -output Final_assembly_repeats.fasta -freq Final_assembly.freq -l 14

Resfinder

Introduction

ResFinder identifies acquired antimicrobial resistance genes in total or partial sequenced isolates of bacteria.

For more information, please check:

Versions

  • 4.1.5

Commands

  • run_resfinder.py

  • run_batch_resfinder.py

Module

You can load the modules by:

module load biocontainers
module load resfinder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run resfinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=resfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers resfinder

run_resfinder.py -o output -db_res db_resfinder/ \
     -db_res_kma db_resfinder/kma_indexing -db_point db_pointfinder/ \
     -s "Escherichia coli" --acquired --point -ifq data/test_isolate_01_*

Revbayes

Introduction

RevBayes – Bayesian phylogenetic inference using probabilistic graphical models and an interactive language.

For more information, please check:

Versions

  • 1.1.1

Commands

  • rb

  • rb-mpi

Module

You can load the modules by:

module load biocontainers
module load revbayes

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run revbayes on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=revbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers revbayes

rMATS

Introduction

MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns. MATS handles replicate RNA-Seq data from both paired and unpaired study design.

Detailed usage can be found here: http://rnaseq-mats.sourceforge.net

Versions

  • 4.1.1

Commands

  • rmats.py

Module

You can load the modules by:

module load biocontainers
module load rmats

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rmats on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rmats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rmats

rmats.py --b1 SR_b1.txt --b2 SR_b2.txt --gtf Homo_sapiens.GRCh38.105.gtf --od rmats_out_homo --tmp rmats_tmp  -t paired --nthread 10 --readLength 150
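The --b1 and --b2 options each expect a text file containing a comma-separated list of BAM files for one sample group. A minimal sketch for generating such files follows; the function name write_bam_list and the BAM file names in the usage comment are hypothetical.

```shell
# write_bam_list OUT BAM...
# Writes the given BAM paths to OUT as a single comma-separated line.
write_bam_list() {
    local out="$1"
    shift
    local IFS=,
    # "$*" joins the remaining arguments with the first character of IFS.
    printf '%s\n' "$*" > "$out"
}

# Placeholder BAM names; SR_b1.txt and SR_b2.txt match the command above:
# write_bam_list SR_b1.txt cond1_rep1.bam cond1_rep2.bam
# write_bam_list SR_b2.txt cond2_rep1.bam cond2_rep2.bam
```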

rmats2sashimiplot

Introduction

rmats2sashimiplot produces a sashimiplot visualization of rMATS output. rmats2sashimiplot can also produce plots using an annotation file and genomic coordinates. The plotting backend is MISO.

Detailed usage can be found here: https://github.com/Xinglab/rmats2sashimiplot

Versions

  • 2.0.4

Commands

  • rmats2sashimiplot

Module

You can load the modules by:

module load biocontainers
module load rmats2sashimiplot

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rmats2sashimiplot on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=rmats2sashimiplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rmats2sashimiplot

rmats2sashimiplot --s1 sample_1_replicate_1.sam,sample_1_replicate_2.sam,sample_1_replicate_3.sam \
                  --s2 sample_2_replicate_1.sam,sample_2_replicate_2.sam,sample_2_replicate_3.sam \
                  -t SE -e SE.MATS.JC.txt --l1 SampleOne --l2 SampleTwo --exon_s 1 --intron_s 5 \
                  -o test_events_output

RNAIndel

Introduction

RNAIndel calls coding indels from tumor RNA-Seq data and classifies them as somatic, germline, and artifactual. RNAIndel supports GRCh38 and 37.

For more information, please check its Github package: https://github.com/stjude/RNAIndel/pkgs/container/rnaindel and its home page on Github.

Versions

  • 3.0.9

Commands

  • rnaindel

Module

You can load the modules by:

module load biocontainers
module load rnaindel

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RNAIndel on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnaindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rnaindel

RNApeg

Introduction

RNApeg is an RNA junction calling, correction, and quality-control package. RNAIndel supports GRCh38 and 37.

For more information, please check its Github package: https://github.com/stjude/RNApeg/pkgs/container/rnapeg and its home page on Github.

Versions

  • 2.7.1

Commands

  • RNApeg.sh

Module

You can load the modules by:

module load biocontainers
module load rnapeg

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RNApeg on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnapeg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rnapeg

Rnaquast

Introduction

Rnaquast is a quality assessment tool for de novo transcriptome assemblies.

For more information, please check its website: https://biocontainers.pro/tools/rnaquast and its home page: http://cab.spbu.ru/software/rnaquast/.

Versions

  • 2.2.1

Commands

  • rnaQUAST.py

Dependencies for de novo quality assessment and read alignment

Note

When a reference genome and gene database are unavailable, users can also use BUSCO and GeneMarkS-T in the rnaQUAST pipeline. Since GeneMarkS-T requires a license key, users may need to download their own key and put it in their $HOME. rnaQUAST is also capable of calculating various statistics using raw reads (e.g. database coverage by reads). To use this, you will need to use STAR in the pipeline. BUSCO, GeneMarkS-T, and STAR have been installed, and the directories of their executables have been added to $PATH. Users do not need to load these modules. The only module required is rnaquast itself.
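Before submitting a job that relies on GeneMarkS-T, it may help to confirm that the license key is in place. The sketch below assumes the key is stored as .gm_key in your home directory, which is the usual GeneMark convention but is not stated in this guide; the function name has_gm_key is hypothetical.

```shell
# has_gm_key
# Succeeds if a non-empty GeneMark key file exists in the home directory.
# Assumption: the key file is named .gm_key.
has_gm_key() {
    [ -s "${HOME}/.gm_key" ]
}

# Possible use before submitting a job:
# has_gm_key || echo "GeneMarkS-T key not found in $HOME" >&2
```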

Module

You can load the modules by:

module load biocontainers
module load rnaquast

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Rnaquast on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=rnaquast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rnaquast

rnaQUAST.py -t 12 -o output \
     --transcripts Trinity.fasta idba.fasta \
     --reference Saccharomyces_cerevisiae.R64-1-1.75.dna.toplevel.fa \
     --gtf Saccharomyces_cerevisiae.R64-1-1.75.gtf

rnaQUAST.py -t 12 -o output2 \
     --reference reference.fasta \
     --transcripts transcripts.fasta \
     --left_reads left.fastq \
     --right_reads right.fastq \
     --busco fungi_odb10

Roary

Introduction

Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka) and calculates the pan genome.

For more information, please check:

Versions

  • 3.13.0

Commands

  • roary

Module

You can load the modules by:

module load biocontainers
module load roary

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run roary on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=roary
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers roary

roary -f demo -e -n -v gff/*.gff

r-rnaseq

Introduction

r-rnaseq is a customized R module based on R/4.1.1 used for RNAseq analysis.

In the module, we have some packages installed:

  • BiocManager 1.30.16

  • ComplexHeatmap 2.9.4

  • DESeq2 1.34.0

  • edgeR 3.36.0

  • pheatmap 1.0.12

  • limma 3.48.3

  • tibble 3.1.5

  • tidyr 1.1.4

  • readr 2.0.2

  • readxl 1.3.1

  • purrr 0.3.4

  • dplyr 1.0.7

  • stringr 1.4.0

  • forcats 0.5.1

  • ggplot2 3.3.5

  • openxlsx 4.2.5

Versions

  • 4.1.1-1

  • 4.1.1-1-rstudio

Commands

  • R

  • Rscript

  • rstudio (only for the rstudio version)

Module

You can load the modules by:

module load biocontainers
module load r-rnaseq/4.1.1-1
# If you want to use Rstudio, load the rstudio version
module load r-rnaseq/4.1.1-1-rstudio

Install packages

Note

Users can also install the packages they need. The install location depends on the settings in your ~/.Rprofile. A detailed guide to installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.

Interactive job

To run interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-rnaseq/4.1.1-1 # or r-rnaseq/4.1.1-1-rstudio
(base) UserID@bell-a008:~ $ R

R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> library(edgeR)
> library(pheatmap)

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_RNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers r-rnaseq

Rscript RNAseq.R

RStudio

Introduction

RStudio is an integrated development environment (IDE) for the R statistical computation and graphics system.

This is an RStudio IDE together with a plain R-base installation (see https://github.com/rocker-org/rocker/), repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2, and libXt) and their header files. It is intentionally kept separate from the biocontainers’ ‘r’ module for reasons of image size (700MB vs 360MB).

For more information, please check its Docker Hub page: https://hub.docker.com/_/r-base and its home pages: https://www.rstudio.com/products/rstudio/ and https://www.r-project.org/.

Versions

  • 4.1.1

Commands

  • R

  • Rscript

  • rstudio

Module

You can load the modules by:

module load biocontainers
module load r-studio

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RStudio on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-studio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers r-studio

r-scrnaseq

Introduction

r-scrnaseq is a customized R module based on R/4.1.1 or R/4.2.0 used for scRNAseq analysis.

The following packages are installed in the module:

  • BiocManager 1.30.16

  • CellChat 1.6.1

  • ProjecTILs 3.0

  • Seurat 4.1.0

  • SeuratObject 4.0.4

  • SeuratWrappers 0.3.0

  • monocle3 1.0.0

  • SnapATAC 1.0.0

  • SingleCellExperiment 1.14.1, 1.16.0

  • scDblFinder 1.8.0

  • SingleR 1.8.1

  • scCATCH 3.0

  • scMappR 1.0.7

  • rliger 1.0.0

  • schex 1.8.0

  • CoGAPS 3.14.0

  • celldex 1.4.0

  • dittoSeq 1.6.0

  • DropletUtils 1.14.2

  • miQC 1.2.0

  • Nebulosa 1.4.0

  • tricycle 1.2.0

  • pheatmap 1.0.12

  • limma 3.48.3, 3.50.0

  • tibble 3.1.5

  • tidyr 1.1.4

  • readr 2.0.2

  • readxl 1.3.1

  • purrr 0.3.4

  • dplyr 1.0.7

  • stringr 1.4.0

  • forcats 0.5.1

  • ggplot2 3.3.5

  • openxlsx 4.2.5

Versions

  • 4.1.1-1

  • 4.1.1-1-rstudio

  • 4.2.0

  • 4.2.0-rstudio

  • 4.2.3-rstudio

Commands

  • R

  • Rscript

  • rstudio (only for the rstudio version)

Module

You can load the modules by:

module load biocontainers
module load r-scrnaseq
# or module load r-scrnaseq/4.2.0
# If you want to use Rstudio, load the rstudio version
module load r-scrnaseq/4.1.1-1-rstudio
# or module load r-scrnaseq/4.2.0-rstudio

Install packages

Note

Users can also install the packages they need. The install location depends on the settings in your ~/.Rprofile. A detailed guide to installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.

Interactive job

To run interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-scrnaseq/4.2.0 # or r-scrnaseq/4.2.0-rstudio
(base) UserID@bell-a008:~ $ R

R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> library(Seurat)
> library(monocle3)

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_scRNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers r-scrnaseq

Rscript scRNAseq.R

RSEM

Introduction

RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. Further information can be found here: https://deweylab.github.io/RSEM/.

Versions

  • 1.3.3

Commands

  • rsem-bam2readdepth

  • rsem-bam2wig

  • rsem-build-read-index

  • rsem-calculate-credibility-intervals

  • rsem-calculate-expression

  • rsem-control-fdr

  • rsem-extract-reference-transcripts

  • rsem-generate-data-matrix

  • rsem-generate-ngvector

  • rsem-gen-transcript-plots

  • rsem-get-unique

  • rsem-gff3-to-gtf

  • rsem-parse-alignments

  • rsem-plot-model

  • rsem-plot-transcript-wiggles

  • rsem-prepare-reference

  • rsem-preref

  • rsem-refseq-extract-primary-assembly

  • rsem-run-ebseq

  • rsem-run-em

  • rsem-run-gibbs

  • rsem-run-prsem-testing-procedure

  • rsem-sam-validator

  • rsem-scan-for-paired-end-reads

  • rsem-simulate-reads

  • rsem-synthesis-reference-transcripts

  • rsem-tbam2gbam

Dependencies

STAR v2.7.9a, Bowtie v1.2.3, Bowtie2 v2.3.5.1, and HISAT2 v2.2.1 are included in the container image, so users do not need to provide the dependency paths in the RSEM parameters.

Module

You can load the modules by:

module load biocontainers
module load rsem/1.3.3

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run RSEM on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rsem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rsem/1.3.3

rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie  -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie2 Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie2  -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --hisat2-hca  Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_hisat2  -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --star Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_star  -p 24
rsem-calculate-expression --paired-end --star -p 24 SRR12095148_1.fastq SRR12095148_2.fastq  Gh38_star SRR12095148_rsem_expression
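
The job above states the task count twice: once in #SBATCH -n 24 and again in every -p 24. A minimal sketch of deriving the thread count from the allocation instead is shown below. It assumes that Slurm exports SLURM_NTASKS when -n is given; the helper name job_threads is hypothetical.

```shell
# Hypothetical helper: print the number of threads to hand to a tool.
# Assumes Slurm exports SLURM_NTASKS when "-n" is given; falls back to 1
# when the variable is not set (for example, outside of a job).
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}
```

Inside the job script, a call such as `rsem-prepare-reference ... -p "$(job_threads)"` would then follow whatever value is set with -n.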

Rseqc

Introduction

Rseqc is a package that provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data.

For more information, please check its website: https://biocontainers.pro/tools/rseqc and its home page: http://rseqc.sourceforge.net.

Versions

  • 4.0.0

Commands

  • FPKM-UQ.py

  • FPKM_count.py

  • RNA_fragment_size.py

  • RPKM_saturation.py

  • aggregate_scores_in_intervals.py

  • align_print_template.py

  • axt_extract_ranges.py

  • axt_to_fasta.py

  • axt_to_lav.py

  • axt_to_maf.py

  • bam2fq.py

  • bam2wig.py

  • bam_stat.py

  • bed_bigwig_profile.py

  • bed_build_windows.py

  • bed_complement.py

  • bed_count_by_interval.py

  • bed_count_overlapping.py

  • bed_coverage.py

  • bed_coverage_by_interval.py

  • bed_diff_basewise_summary.py

  • bed_extend_to.py

  • bed_intersect.py

  • bed_intersect_basewise.py

  • bed_merge_overlapping.py

  • bed_rand_intersect.py

  • bed_subtract_basewise.py

  • bnMapper.py

  • clipping_profile.py

  • deletion_profile.py

  • div_snp_table_chr.py

  • divide_bam.py

  • find_in_sorted_file.py

  • geneBody_coverage.py

  • geneBody_coverage2.py

  • gene_fourfold_sites.py

  • get_scores_in_intervals.py

  • infer_experiment.py

  • inner_distance.py

  • insertion_profile.py

  • int_seqs_to_char_strings.py

  • interval_count_intersections.py

  • interval_join.py

  • junction_annotation.py

  • junction_saturation.py

  • lav_to_axt.py

  • lav_to_maf.py

  • line_select.py

  • lzop_build_offset_table.py

  • mMK_bitset.py

  • maf_build_index.py

  • maf_chop.py

  • maf_chunk.py

  • maf_col_counts.py

  • maf_col_counts_all.py

  • maf_count.py

  • maf_covered_ranges.py

  • maf_covered_regions.py

  • maf_div_sites.py

  • maf_drop_overlapping.py

  • maf_extract_chrom_ranges.py

  • maf_extract_ranges.py

  • maf_extract_ranges_indexed.py

  • maf_filter.py

  • maf_filter_max_wc.py

  • maf_gap_frequency.py

  • maf_gc_content.py

  • maf_interval_alignibility.py

  • maf_limit_to_species.py

  • maf_mapping_word_frequency.py

  • maf_mask_cpg.py

  • maf_mean_length_ungapped_piece.py

  • maf_percent_columns_matching.py

  • maf_percent_identity.py

  • maf_print_chroms.py

  • maf_print_scores.py

  • maf_randomize.py

  • maf_region_coverage_by_src.py

  • maf_select.py

  • maf_shuffle_columns.py

  • maf_species_in_all_files.py

  • maf_split_by_src.py

  • maf_thread_for_species.py

  • maf_tile.py

  • maf_tile_2.py

  • maf_tile_2bit.py

  • maf_to_axt.py

  • maf_to_concat_fasta.py

  • maf_to_fasta.py

  • maf_to_int_seqs.py

  • maf_translate_chars.py

  • maf_truncate.py

  • maf_word_frequency.py

  • mask_quality.py

  • mismatch_profile.py

  • nib_chrom_intervals_to_fasta.py

  • nib_intervals_to_fasta.py

  • nib_length.py

  • normalize_bigwig.py

  • one_field_per_line.py

  • out_to_chain.py

  • overlay_bigwig.py

  • prefix_lines.py

  • pretty_table.py

  • qv_to_bqv.py

  • random_lines.py

  • read_GC.py

  • read_NVC.py

  • read_distribution.py

  • read_duplication.py

  • read_hexamer.py

  • read_quality.py

  • split_bam.py

  • split_paired_bam.py

  • table_add_column.py

  • table_filter.py

  • tfloc_summary.py

  • tin.py

  • ucsc_gene_table_to_intervals.py

  • wiggle_to_array_tree.py

  • wiggle_to_binned_array.py

  • wiggle_to_chr_binned_array.py

  • wiggle_to_simple.py

Module

You can load the modules by:

module load biocontainers
module load rseqc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Rseqc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rseqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rseqc

# bam_stat.py reads one input file per run, so loop over the BAM files
for bam in *.bam; do bam_stat.py -i "$bam" -q 30; done

run-dbCAN

Introduction

run_dbCAN uses genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes. This is a standalone tool of http://bcb.unl.edu/dbCAN2/. Details about its usage can be found in its Github repository.

Versions

  • 3.0.2

  • 3.0.6

Commands

  • run_dbcan

Database

The latest version of the database has been downloaded and set up, including CAZyDB.09242021.fa, dbCAN-HMMdb-V10.txt, tcdb.fa, tf-1.hmm, tf-2.hmm, and stp.hmm.

Module

You can load the modules by:

module load biocontainers
module load run_dbcan/3.0.2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run run_dbcan on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=run_dbcan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers run_dbcan/3.0.2

run_dbcan protein.faa protein --out_dir test1_dbcan
run_dbcan genome.fasta prok --out_dir test2_dbcan

rush

Introduction

rush is a tool similar to GNU parallel and gargs. rush borrows some ideas from them and has some unique features, e.g., support for custom-defined variables, resuming multi-line commands, and more advanced embedded replacement strings.

For more information, please check its home page on Github.

Versions

  • 0.4.2

Commands

  • rush

Module

You can load the modules by:

module load biocontainers
module load rush

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rush on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rush
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers rush
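
The job script above loads the module but does not run a command. A minimal sketch of one possible command is shown below: rush reads one input line per job from standard input, replaces {} with that line, and -j sets the number of concurrent jobs. The wrapper name compress_parallel is hypothetical, and the sketch assumes that gzip is available inside the rush container.

```shell
# Hypothetical wrapper: compress every file named on the command line,
# running as many gzip processes at once as the job has tasks.
# Assumes gzip is available inside the rush container.
compress_parallel() {
    printf '%s\n' "$@" | rush -j "${SLURM_NTASKS:-1}" 'gzip {}'
}
```

A call such as `compress_parallel *.fastq` only runs jobs concurrently when the allocation has more than one task, so -n in the job script would need to be raised accordingly.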

Sage

Introduction

Sage is a proteomics search engine: a tool that transforms raw mass spectra from proteomics experiments into peptide identifications via database searching and spectral matching. It is also more than just a search engine. Sage includes a variety of advanced features that make it a one-stop shop: retention time prediction, quantification (both isobaric and LFQ), peptide-spectrum match rescoring, and FDR control.

For more information, please check:

Versions

  • 0.8.1

Commands

  • sage

Module

You can load the modules by:

module load biocontainers
module load sage

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run sage on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sage
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sage

Salmon

Introduction

Salmon is a wicked-fast program to produce highly accurate, transcript-level quantification estimates from RNA-seq data.

Detailed usage can be found here: https://github.com/COMBINE-lab/salmon

Versions

  • 1.10.1

  • 1.5.2

  • 1.6.0

  • 1.7.0

  • 1.8.0

  • 1.9.0

Commands

  • salmon index

  • salmon quant

  • salmon alevin

  • salmon swim

  • salmon quantmerge

Module

You can load the modules by:

module load biocontainers
module load salmon

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Salmon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=salmon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers salmon

salmon index -t  Homo_sapiens.GRCh38.cds.all.fa -i salmon_index
salmon quant -i salmon_index -l A -p 24 -1 SRR16956239_1.fastq -2 SRR16956239_2.fastq --validateMappings -o transcripts_quan
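
The salmon quant call above handles a single sample. A minimal sketch of wrapping the same flags for several paired-end samples is shown below. The function name quant_sample and the <sample>_1.fastq / <sample>_2.fastq naming scheme are assumptions, and the index directory salmon_index is the one built in the job above.

```shell
# Hypothetical wrapper: quantify one paired-end sample against an existing
# index. Assumes mates are named <sample>_1.fastq and <sample>_2.fastq.
quant_sample() {
    local r1="$1"
    local stem="${r1%_1.fastq}"
    salmon quant -i salmon_index -l A -p "${SLURM_NTASKS:-1}" \
        -1 "$r1" -2 "${stem}_2.fastq" \
        --validateMappings -o "${stem##*/}_quant"
}
```

It could be called in a loop such as `for r1 in *_1.fastq; do quant_sample "$r1"; done`, which writes one output directory per sample.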

Sambamba

Introduction

Sambamba is a high-performance, highly parallel, robust, and fast tool (and library), written in the D programming language, for working with SAM and BAM files.

For more information, please check its website: https://biocontainers.pro/tools/sambamba and its home page on Github.

Versions

  • 0.8.2

Commands

  • sambamba

Module

You can load the modules by:

module load biocontainers
module load sambamba

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Sambamba on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sambamba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sambamba

sambamba view --reference-info input.bam
sambamba view -c -F "mapping_quality >= 40" input.bam

Samblaster

Introduction

Samblaster is a tool to mark duplicates and extract discordant and split reads from SAM files.

For more information, please check its website: https://biocontainers.pro/tools/samblaster and its home page on Github.

Versions

  • 0.1.26

Commands

  • samblaster

Module

You can load the modules by:

module load biocontainers
module load samblaster

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Samblaster on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samblaster
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers samblaster
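
The job script above loads the module but does not run a command. A minimal sketch of a duplicate-marking step is shown below, using samblaster's -i (input SAM) and -o (output SAM) options. The wrapper name mark_duplicates and the .markdup.sam suffix are hypothetical, and the sketch assumes the input SAM has a header and keeps the alignments of each read together, as in unsorted aligner output.

```shell
# Hypothetical wrapper: mark duplicates in one SAM file.
# Assumes the input has a header and keeps the alignments of each read
# together (for example, unsorted aligner output).
mark_duplicates() {
    local sam="$1"
    samblaster -i "$sam" -o "${sam%.sam}.markdup.sam"
}
```

For example, `mark_duplicates sample.sam` would write sample.markdup.sam next to the input file.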

Samclip

Introduction

Samclip is a tool to filter SAM files for soft and hard clipped alignments.

For more information, please check:

Versions

  • 0.4.0

Commands

  • samclip

Module

You can load the modules by:

module load biocontainers
module load samclip

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run samclip on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samclip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers samclip

samclip --ref test.fna < test.sam > out.sam

Samplot

Introduction

Samplot is a command line tool for rapid, multi-sample structural variant visualization.

For more information, please check its website: https://biocontainers.pro/tools/samplot and its home page on Github.

Versions

  • 1.3.0

Commands

  • samplot

Module

You can load the modules by:

module load biocontainers
module load samplot

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Samplot on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers samplot

samplot plot \
-n NA12878 NA12889 NA12890 \
-b samplot/test/data/NA12878_restricted.bam \
  samplot/test/data/NA12889_restricted.bam \
  samplot/test/data/NA12890_restricted.bam \
-o 4_115928726_115931880.png \
-c chr4 \
-s 115928726 \
-e 115931880 \
-t DEL

Samtools

Introduction

Samtools is a set of utilities for the Sequence Alignment/Map (SAM) format.

For more information, please check its website: https://biocontainers.pro/tools/samtools and its home page on Github.

Versions

  • 1.15

  • 1.16

  • 1.17

  • 1.9

Commands

  • samtools

  • ace2sam

  • htsfile

  • maq2sam-long

  • maq2sam-short

  • tabix

  • wgsim

Module

You can load the modules by:

module load biocontainers
module load samtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Samtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers samtools
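
The job script above loads the module but does not run a command. A minimal sketch of a common sequence (coordinate-sort, index, then summarize with flagstat) is shown below. The wrapper name bam_summary and the output file names are hypothetical.

```shell
# Hypothetical wrapper: coordinate-sort, index, and summarize one BAM file.
# Each step runs only if the previous one succeeded.
bam_summary() {
    local bam="$1"
    local prefix="${bam%.bam}"
    samtools sort -o "${prefix}.sorted.bam" "$bam" &&
    samtools index "${prefix}.sorted.bam" &&
    samtools flagstat "${prefix}.sorted.bam" > "${prefix}.flagstat.txt"
}
```

For example, `bam_summary input.bam` would produce input.sorted.bam, its index, and input.flagstat.txt.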

Scanpy

Introduction

Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference, and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells. Details about its usage can be found here: https://scanpy.readthedocs.io/en/stable/.

Versions

  • 1.8.2

  • 1.9.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load scanpy/1.8.2

Interactive job

To run scanpy interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scanpy/1.8.2
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun  4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> sc.tl.umap(adata, **tool_params)

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scanpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scanpy/1.8.2

python script.py

Scarches

Introduction

scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases.

For more information, please check:

Versions

  • 0.5.3

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load scarches

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run scarches on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scarches
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scarches

Scgen

Introduction

scGen is a generative model to predict single-cell perturbation response across cell types, studies and species.

For more information, please check:

Versions

  • 2.1.0

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load scgen

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run scgen on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scgen

Scirpy

Introduction

Scirpy is a scalable python-toolkit to analyse T cell receptor (TCR) or B cell receptor (BCR) repertoires from single-cell RNA sequencing (scRNA-seq) data. It seamlessly integrates with the popular scanpy library and provides various modules for data import, analysis and visualization.

For more information, please check:

Versions

  • 0.10.1

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load scirpy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run scirpy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scirpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scirpy

scVelo

Introduction

scVelo is a scalable toolkit for RNA velocity analysis in single cells, based on https://doi.org/10.1038/s41587-020-0591-3. Its detailed usage can be found here: https://scvelo.readthedocs.io.

Versions

  • 0.2.4

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load scvelo/0.2.4

Interactive job

To run scVelo interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scvelo/0.2.4
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun  4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scvelo as scv
>>> scv.set_figure_params()

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scvelo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scvelo/0.2.4

python script.py

Scvi-tools

Introduction

scvi-tools (single-cell variational inference tools) is a package for end-to-end analysis of single-cell omics data primarily developed and maintained by the Yosef Lab at UC Berkeley.

For more information, please check:

Versions

  • 0.16.2

Commands

  • python

  • python3

  • R

  • Rscript

Module

You can load the modules by:

module load biocontainers
module load scvi-tools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run scvi-tools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scvi-tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers scvi-tools

Segalign

Introduction

Segalign is a scalable GPU system for pairwise whole genome alignments based on LASTZ’s seed-filter-extend paradigm.

For more information, please check:

Versions

  • 0.1.2

Commands

  • faToTwoBit

  • run_segalign

  • run_segalign_repeat_masker

  • segalign

  • segalign_repeat_masker

  • twoBitToFa

Module

You can load the modules by:

module load biocontainers
module load segalign

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run segalign on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=segalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers segalign

Seidr

Introduction

Seidr is a community gene network inference and exploration toolkit.

For more information, please check its website: https://biocontainers.pro/tools/seidr and its home page on Github.

Versions

  • 0.14.2

Commands

  • correlation

  • seidr

  • mi

  • pcor

  • narromi

  • plsnet

  • llr-ensemble

  • svm-ensemble

  • genie3

  • tigress

  • el-ensemble

  • makeconv

  • genrb

  • gencfu

  • gencnval

  • gendict

  • tomsimilarity

Module

You can load the modules by:

module load biocontainers
module load seidr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Seidr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seidr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers seidr

Sepp

Introduction

Sepp stands for SATé-Enabled Phylogenetic Placement and addresses the problem of phylogenetic placement for meta-genomic short reads.

For more information, please check its website: https://biocontainers.pro/tools/sepp and its home page on Github.

Versions

  • 4.5.1

Commands

  • run_sepp.py

  • run_upp.py

  • split_sequences.py

  • sumlabels.py

  • sumtrees.py

Module

You can load the modules by:

module load biocontainers
module load sepp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Sepp on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sepp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sepp

run_sepp.py -t mock/rpsS/sate.tre \
    -r mock/rpsS/sate.tre.RAxML_info \
    -a mock/rpsS/sate.fasta \
    -f mock/rpsS/rpsS.even.fas \
    -o rpsS.out.default

Seqcode

Introduction

SeqCode is a family of applications designed to develop high-quality images and perform genome-wide calculations from high-throughput sequencing experiments. The software is offered in two distinct modes: web tools and command line. The SeqCode website offers most functions to users with no previous expertise in bioinformatics, including operations on a selection of published ChIP-seq samples and applications that generate multiple classes of graphics from the user's data files. In contrast, the standalone version of SeqCode allows bioinformaticians to run each command locally on any type of sequencing data. The source code is modular, and the input/output interface of the commands is suitable for integration into existing genome analysis pipelines. SeqCode is written in ANSI C, which favors compatibility with every UNIX platform and delivers high performance when analyzing sequencing data. Meta-plots, heatmaps, boxplots, and the other images produced by SeqCode are generated internally using R. SeqCode relies on the RefSeq reference annotations and can handle the genome and assembly release of every organism available from this consortium.

For more information, please check:

Versions

  • 1.0

Commands

  • buildChIPprofile

  • combineChIPprofiles

  • combineTSSmaps

  • combineTSSplots

  • computemaxsignal

  • findPeaks

  • genomeDistribution

  • matchpeaks

  • matchpeaksgenes

  • processmacs

  • produceGENEmaps

  • produceGENEplots

  • producePEAKmaps

  • producePEAKplots

  • produceTESmaps

  • produceTESplots

  • produceTSSmaps

  • produceTSSplots

  • recoverChIPlevels

  • scorePhastCons

Module

You can load the modules by:

module load biocontainers
module load seqcode

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run seqcode on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqcode
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers seqcode

buildChIPprofile -vd ChromInfo.txt \
     H3K4me3_sample.bam test_buildChIPprofile

Seqkit

Introduction

Seqkit is a rapid tool for manipulating fasta and fastq files.

For more information, please check its website: https://biocontainers.pro/tools/seqkit and its home page on Github.

Versions

  • 2.0.0

  • 2.1.0

  • 2.3.1

Commands

  • seqkit

Module

You can load the modules by:

module load biocontainers
module load seqkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Seqkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers seqkit

seqkit stats contigs.fasta > contigs_statistics.txt

Seqyclean

Introduction

Seqyclean is used to pre-process NGS data in order to prepare for downstream analysis.

For more information, please check its Docker hub: https://hub.docker.com/r/staphb/seqyclean and its home page: https://github.com/ibest/seqyclean.

Versions

  • 1.10.09

Commands

  • seqyclean

Module

You can load the modules by:

module load biocontainers
module load seqyclean

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run seqyclean on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqyclean
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers seqyclean

Shapeit4

Introduction

SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and high coverage sequencing data.

For more information, please check:

Versions

  • 4.2.2

Commands

  • shapeit4

Module

You can load the modules by:

module load biocontainers
module load shapeit4

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run shapeit4 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shapeit4
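
The job script above loads the module but does not yet call the tool. As a minimal sketch, a phasing command could follow the module lines. The file names, the region, and the output name are placeholders (assumptions), and the guard simply skips the run when shapeit4 or the inputs are not available.

```shell
# Placeholder inputs (assumptions): an indexed VCF/BCF and a genetic map.
input=unphased.vcf.gz
map=chr20.b37.gmap.gz
region=20

# Run only when shapeit4 is on PATH and the inputs exist; otherwise skip.
if command -v shapeit4 >/dev/null 2>&1 && [ -f "$input" ] && [ -f "$map" ]; then
    shapeit4 --input "$input" --map "$map" --region "$region" \
        --thread "${SLURM_NTASKS:-1}" --output phased.vcf.gz
else
    echo "shapeit4 or its input files were not found; nothing was run" >&2
fi
```

The thread count follows the number of tasks requested with -n, so raising -n in the header is enough to use more cores.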

Shapeit5

Introduction

SHAPEIT5 is a software package to estimate haplotypes in large genotype datasets (WGS and SNP array).

For more information, please check:

Versions

  • 5.1.1

Commands

  • phase_common

  • ligate

  • phase_rare

  • simulate

  • switch

  • xcftools

Module

You can load the modules by:

module load biocontainers
module load shapeit5

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run shapeit5 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit5
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shapeit5
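
The job script above stops after loading the module. A minimal sketch of a first phasing step with phase_common could follow it. The input, map, region, and output names are placeholders (assumptions), and the MAF threshold is only an example value.

```shell
# Placeholder inputs (assumptions): an indexed BCF and a genetic map.
input=target.unrelated.bcf
map=chr20.gmap.gz
region=chr20

# Run only when phase_common is on PATH and the inputs exist; otherwise skip.
if command -v phase_common >/dev/null 2>&1 && [ -f "$input" ] && [ -f "$map" ]; then
    phase_common --input "$input" --filter-maf 0.001 \
        --region "$region" --map "$map" \
        --thread "${SLURM_NTASKS:-1}" --output target.scaffold.bcf
else
    echo "phase_common or its input files were not found; nothing was run" >&2
fi
```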

Shasta

Introduction

Shasta is software for de novo assembly from Oxford Nanopore reads.

For more information, please check:

Versions

  • 0.10.0

Commands

  • shasta

Module

You can load the modules by:

module load biocontainers
module load shasta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run shasta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shasta

shasta --input r94_ec_rad2.181119.60x-10kb.fasta \
    --config Nanopore-May2022

Shigeifinder

Introduction

Shigeifinder is a tool that is used to identify and differentiate Shigella/EIEC using cluster-specific genes and to identify the serotype using O-antigen/H-antigen genes.

For more information, please check its Docker hub: https://hub.docker.com/r/staphb/shigeifinder and its home page: https://github.com/LanLab/ShigEiFinder.

Versions

  • 1.3.2

Commands

  • shigeifinder

Module

You can load the modules by:

module load biocontainers
module load shigeifinder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run shigeifinder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shigeifinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shigeifinder

Shorah

Introduction

Shorah is an open source project for the analysis of next generation sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/shorah and its home page on Github.

Versions

  • 1.99.2

Commands

  • shorah

  • b2w

  • diri_sampler

  • fil

Module

You can load the modules by:

module load biocontainers
module load shorah

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Shorah on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shorah
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shorah

shorah amplicon -b ampli_sorted.bam -f reference.fasta
shorah shotgun -b test_aln.cram -f test_ref.fasta
shorah shotgun -a 0.1 -w 42 -x 100000 -p 0.9 -c 0 -r REF:42-272 -R 42 -b test_aln.cram -f ref.fasta

Shortstack

Introduction

Shortstack is a tool for comprehensive annotation and quantification of small RNA genes.

For more information, please check its website: https://biocontainers.pro/tools/shortstack and its home page on Github.

Versions

  • 3.8.5

Commands

  • ShortStack

Module

You can load the modules by:

module load biocontainers
module load shortstack

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Shortstack on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shortstack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shortstack
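
The job script above does not include a ShortStack command. A minimal sketch of one that could follow the module lines is shown here. The read file, genome file, and output directory names are placeholders (assumptions).

```shell
# Placeholder inputs (assumptions): trimmed small RNA reads and a genome.
reads=small_rna_reads.fastq
genome=genome.fasta

# Run only when ShortStack is on PATH and the inputs exist; otherwise skip.
if command -v ShortStack >/dev/null 2>&1 && [ -f "$reads" ] && [ -f "$genome" ]; then
    ShortStack --readfile "$reads" --genomefile "$genome" \
        --bowtie_cores "${SLURM_NTASKS:-1}" --outdir shortstack_out
else
    echo "ShortStack or its input files were not found; nothing was run" >&2
fi
```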

Shovill

Introduction

Shovill is a tool to assemble bacterial isolate genomes from Illumina paired-end reads.

For more information, please check:

Versions

  • 1.1.0

Commands

  • shovill

Module

You can load the modules by:

module load biocontainers
module load shovill

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run shovill on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shovill
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers shovill

shovill --outdir out \
    --R1 test/R1.fq.gz \
    --R2 test/R2.fq.gz

Sicer

Introduction

Sicer is a clustering approach for identification of enriched domains from histone modification ChIP-Seq data.

For more information, please check its website: https://biocontainers.pro/tools/sicer and its home page: http://home.gwu.edu/~wpeng/Software.htm.

Versions

  • 1.1

Commands

  • SICER-df-rb.sh

  • SICER-df.sh

  • SICER-rb.sh

  • SICER.sh

Module

You can load the modules by:

module load biocontainers
module load sicer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Sicer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sicer

SICER.sh ./ test.bed control.bed . hg18 1 200 150 0.74 600 .01

SICER-rb.sh ./ test.bed . hg18 1 200 150 0.74 400 100

Sicer2

Introduction

Sicer2 is a redesigned and improved version of the ChIP-seq broad peak calling tool SICER.

For more information, please check its website: https://biocontainers.pro/tools/sicer2 and its home page on Github.

Versions

  • 1.0.3

  • 1.2.0

Commands

  • sicer

  • sicer_df

  • recognicer

  • recognicer_df

Module

You can load the modules by:

module load biocontainers
module load sicer2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Sicer2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sicer2

sicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
    -c ./test/control_1.bed ./test/control_2.bed \
    -s hg38 --significant_reads

recognicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
    -c ./test/control_1.bed ./test/control_2.bed \
    -s hg38 --significant_reads

SignalP

Introduction

SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.

For more information, please check its home page: https://services.healthtech.dtu.dk/service.php?SignalP-4.1.

Versions

  • 4.1

Commands

  • signalp

Module

You can load the modules by:

module load biocontainers
module load signalp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run SignalP on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=signalp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers signalp

signalp -t gram+ -f all proka.fasta > proka_out
signalp -t euk -f all euk.fasta > euk.out

Signalp6

Introduction

SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.

For more information, please check:

Versions

  • 6.0-fast

  • 6.0-slow

Commands

  • signalp6

Module

You can load the modules by:

module load biocontainers
module load signalp6

Example job for fast mode

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run signalp6 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-fast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers signalp6/6.0-fast

signalp6 --write_procs 24 --fastafile proteins_clean.fasta  \
    --organism euk --output_dir output_fast  \
    --format txt --mode fast

Example job for slow mode

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run signalp6 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-slow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers signalp6/6.0-slow

signalp6 --write_procs 24 --fastafile proteins_clean.fasta  \
    --organism euk --output_dir output_slow  \
    --format txt --mode slow

signalp6 --write_procs 24 --fastafile proteins_clean.fasta  \
    --organism euk --output_dir output_slow-sequential  \
    --format txt --mode slow-sequential

Simug

Introduction

Simug is a general-purpose genome simulator.

For more information, please check its website: https://biocontainers.pro/tools/simug and its home page on Github.

Versions

  • 1.0.0

Commands

  • simuG

  • vcf2model

Module

You can load the modules by:

module load biocontainers
module load simug

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Simug on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=simug
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers simug

Singlem

Introduction

SingleM is a tool for profiling shotgun metagenomes. It has a particular strength in detecting microbial lineages which are not in reference databases. The method it uses also makes it suitable for some related tasks, such as assessing eukaryotic contamination, finding bias in genome recovery, computing ecological diversity metrics, and lineage-targeted MAG recovery.

For more information, please check:

Versions

  • 0.13.2

Commands

  • singlem

Module

You can load the modules by:

module load biocontainers
module load singlem

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run singlem on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=singlem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers singlem

Ska

Introduction

SKA (Split Kmer Analysis) is a toolkit for prokaryotic (and any other small, haploid) DNA sequence analysis using split kmers. A split kmer is a pair of kmers in a DNA sequence that are separated by a single base. Split kmers allow rapid comparison and alignment of small genomes, and are particularly suited for surveillance or outbreak investigation. SKA can produce split kmer files from fasta format assemblies or directly from fastq format read sequences, cluster them, align them with or without a reference sequence and provide various comparison and summary statistics. Currently all testing has been carried out on high-quality Illumina read data, so results for other platforms may vary.
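
To make the split kmer definition concrete, the sketch below lists every split kmer in a short sequence together with the single base that separates the two flanking kmers. It is an illustration only and not how SKA itself is implemented. The helper name split_kmers and the output layout are assumptions, and reverse complements are ignored.

```shell
# split_kmers SEQ K: print "LEFT_RIGHT<TAB>MIDDLE" for each window of 2K+1 bases,
# where LEFT and RIGHT are the flanking kmers and MIDDLE is the separating base.
split_kmers() {
    printf '%s\n' "$1" | awk -v k="$2" '{
        n = length($0)
        for (i = 1; i + 2 * k <= n; i++) {
            left  = substr($0, i, k)
            mid   = substr($0, i + k, 1)
            right = substr($0, i + k + 1, k)
            print left "_" right "\t" mid
        }
    }'
}

# Two windows of length 7 fit into this 8-base sequence.
split_kmers ACGTACGT 3
```

Two sequences that share the same flanking pair but differ in the middle base point to a single-base difference at that position, which is what makes the representation convenient for comparing closely related genomes.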

For more information, please check:

Versions

  • 1.0

Commands

  • ska

Module

You can load the modules by:

module load biocontainers
module load ska

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run ska on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ska
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ska

Skewer

Introduction

Skewer is a fast and accurate adapter trimmer for paired-end reads.

For more information, please check its website: https://biocontainers.pro/tools/skewer and its home page on Github.

Versions

  • 0.2.2

Commands

  • skewer

Module

You can load the modules by:

module load biocontainers
module load skewer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Skewer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=skewer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers skewer

skewer -l 50 -m pe -o skewerQ30 --mean-quality 30 \
     --end-quality 30 -t 10 -x TruSeq3-PE.fa \
     input_1.fastq input_2.fastq

Slamdunk

Introduction

Slamdunk is a novel, fully automated software tool for robust, scalable and reproducible SLAMseq data analysis.

For more information, please check:

Versions

  • 0.4.3

Commands

  • slamdunk

  • alleyoop

Module

You can load the modules by:

module load biocontainers
module load slamdunk

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run slamdunk on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=slamdunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers slamdunk
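
The job script above ends after loading the module. A minimal sketch of a full pipeline run with slamdunk all could follow it. The reference, 3' UTR BED file, read file, and output directory names are placeholders (assumptions).

```shell
# Placeholder inputs (assumptions): reference genome, 3' UTR annotation, reads.
ref=reference.fa
utr_bed=utrs.bed
reads=sample.fastq.gz

# Run only when slamdunk is on PATH and the inputs exist; otherwise skip.
if command -v slamdunk >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$utr_bed" ] && [ -f "$reads" ]; then
    slamdunk all -r "$ref" -b "$utr_bed" -o slamdunk_out \
        -t "${SLURM_NTASKS:-1}" "$reads"
else
    echo "slamdunk or its input files were not found; nothing was run" >&2
fi
```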

Smoove

Introduction

Smoove simplifies and speeds calling and genotyping SVs for short reads.

For more information, please check its website: https://biocontainers.pro/tools/smoove and its home page on Github.

Versions

  • 0.2.7

Commands

  • smoove

Module

You can load the modules by:

module load biocontainers
module load smoove

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Smoove on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=smoove
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers smoove

smoove call \
    -x --name my-cohort \
    --exclude hg38_blacklist.bed \
    --fasta Homo_sapiens.GRCh38.dna.primary_assembly.fa \
    -p 24 \
    --genotype input_bams/*.bam

Snakemake

Introduction

Snakemake is a workflow engine that provides a readable Python-based workflow definition language and a powerful execution environment that scales from single-core workstations to compute clusters without modifying the workflow.

For more information, please check its website: https://biocontainers.pro/tools/snakemake and its home page: https://snakemake.readthedocs.io/en/stable/.

Versions

  • 6.8.0

Commands

  • snakemake

Module

You can load the modules by:

module load biocontainers
module load snakemake

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snakemake on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snakemake
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snakemake
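
The job script above loads Snakemake but does not run a workflow. As a minimal sketch, the lines below write a two-rule Snakefile into a scratch directory and run it. The directory name snakemake_demo, the rule names, and the output file are all assumptions chosen for illustration.

```shell
# Scratch directory for the demo workflow (assumed name).
demo="$PWD/snakemake_demo"
mkdir -p "$demo"

# A two-rule workflow: "all" requests hello.txt, "hello" creates it.
cat > "$demo/Snakefile" <<'EOF'
rule all:
    input:
        "hello.txt"

rule hello:
    output:
        "hello.txt"
    shell:
        "echo 'hello from snakemake' > {output}"
EOF

# Dry run first, then execute with the cores granted by Slurm.
if command -v snakemake >/dev/null 2>&1; then
    snakemake --snakefile "$demo/Snakefile" --directory "$demo" \
        --cores "${SLURM_NTASKS:-1}" --dry-run
    snakemake --snakefile "$demo/Snakefile" --directory "$demo" \
        --cores "${SLURM_NTASKS:-1}"
else
    echo "snakemake is not on PATH; load the module first" >&2
fi
```

The dry run prints the planned jobs without executing them, which is a cheap way to catch workflow errors before the real run consumes allocation time.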

Snap

Introduction

Snap is a semi-HMM-based Nucleic Acid Parser – gene prediction tool.

For more information, please check its website: https://biocontainers.pro/tools/snap and its home page: http://korflab.ucdavis.edu/software.html.

Versions

  • 2013_11_29

Commands

  • snap

Module

You can load the modules by:

module load biocontainers
module load snap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snap

Snap-aligner

Introduction

Snap-aligner (Scalable Nucleotide Alignment Program) is a fast and accurate read aligner for high-throughput sequencing data.

For more information, please check its website: https://biocontainers.pro/tools/snap-aligner and its home page: http://snap.cs.berkeley.edu/.

Versions

  • 2.0.0

Commands

  • snap-aligner

Module

You can load the modules by:

module load biocontainers
module load snap-aligner

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snap-aligner on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap-aligner
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snap-aligner
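
The job script above does not include an alignment command. A minimal sketch that builds an index and then aligns one paired-end read set could follow the module lines. The reference, read files, index directory, and output name are placeholders (assumptions).

```shell
# Placeholder inputs (assumptions): a reference FASTA and paired-end reads.
ref=reference.fa
r1=reads_1.fq
r2=reads_2.fq

# Run only when snap-aligner is on PATH and the inputs exist; otherwise skip.
if command -v snap-aligner >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$r1" ] && [ -f "$r2" ]; then
    # Step 1: build the index directory from the reference.
    snap-aligner index "$ref" snap_index
    # Step 2: align the read pair against the index.
    snap-aligner paired snap_index "$r1" "$r2" \
        -t "${SLURM_NTASKS:-1}" -o aligned.sam
else
    echo "snap-aligner or its input files were not found; nothing was run" >&2
fi
```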

Snaptools

Introduction

Snaptools is a Python module for pre-processing and working with snap files.

For more information, please check its website: https://biocontainers.pro/tools/snaptools and its home page on Github.

Versions

  • 1.4.8

Commands

  • snaptools

Module

You can load the modules by:

module load biocontainers
module load snaptools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snaptools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snaptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snaptools

Snippy

Introduction

Snippy is a tool for rapid haploid variant calling and core genome alignment.

For more information, please check its Docker hub: https://hub.docker.com/r/staphb/snippy and its home page on Github.

Versions

  • 4.6.0

Commands

  • snippy

  • snippy-clean_full_aln

  • snippy-core

  • snippy-multi

  • snippy-vcf_extract_subs

  • snippy-vcf_report

  • snippy-vcf_to_tab

Module

You can load the modules by:

module load biocontainers
module load snippy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snippy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snippy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snippy
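
The job script above loads the module but does not call Snippy. A minimal sketch of a variant-calling command for one isolate could follow it. The reference and read file names and the output directory are placeholders (assumptions).

```shell
# Placeholder inputs (assumptions): a reference and one paired-end read set.
ref=reference.gbk
r1=reads_R1.fastq.gz
r2=reads_R2.fastq.gz

# Run only when snippy is on PATH and the inputs exist; otherwise skip.
if command -v snippy >/dev/null 2>&1 && [ -f "$ref" ] && [ -f "$r1" ] && [ -f "$r2" ]; then
    snippy --cpus "${SLURM_NTASKS:-1}" --outdir snippy_out \
        --ref "$ref" --R1 "$r1" --R2 "$r2"
else
    echo "snippy or its input files were not found; nothing was run" >&2
fi
```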

Snp-dists

Introduction

Snp-dists is a tool to convert a FASTA alignment to SNP distance matrix.

For more information, please check:

Versions

  • 0.8.2

Commands

  • snp-dists

Module

You can load the modules by:

module load biocontainers
module load snp-dists

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run snp-dists on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-dists
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snp-dists

snp-dists test/good.aln > distances.tab

Snpeff

Introduction

Snpeff is an open source tool that annotates variants and predicts their effects on genes by using an interval forest approach.

For more information, please check its website: https://biocontainers.pro/tools/snpeff and its home page on Github.

Versions

  • 5.1d

  • 5.1

Commands

  • snpEff

Module

You can load the modules by:

module load biocontainers
module load snpeff

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Note

By default, snpEff uses only 1 GB of memory. To allocate more memory, add the -Xmx flag to your command:

snpEff -Xmx10g ## To allocate 10 GB of memory.
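
Rather than hard-coding the heap size, it can be derived from the memory the job requested. The sketch below is one way to do that. It assumes the job was submitted with --mem, in which case Slurm exports SLURM_MEM_PER_NODE in megabytes. The helper name heap_flag, the 90% ratio, and the 4096 MB fallback are assumptions, not site recommendations.

```shell
# heap_flag MEM_MB: print a Java -Xmx flag that uses 90% of MEM_MB megabytes,
# leaving the remainder for the JVM itself and the container.
heap_flag() {
    echo "-Xmx$(( $1 * 9 / 10 ))m"
}

# Fall back to 4096 MB when the variable is unset (for example outside a job).
xmx=$(heap_flag "${SLURM_MEM_PER_NODE:-4096}")
echo "$xmx"
```

Inside a job script the result would then be passed on the command line, for example snpEff "$xmx" GRCh37.75 examples/test.chr22.vcf > test.chr22.ann.vcf.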

To run Snpeff on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpeff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snpeff

snpEff GRCh37.75 examples/test.chr22.vcf > test.chr22.ann.vcf

Snpgenie

Introduction

Snpgenie is a collection of Perl scripts for estimating πN/πS, dN/dS, and gene diversity from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.

For more information, please check its website: https://biocontainers.pro/tools/snpgenie and its home page on Github.

Versions

  • 1.0

Commands

  • fasta2revcom.pl

  • gtf2revcom.pl

  • snpgenie.pl

  • snpgenie_between_group.pl

  • snpgenie_between_group_processor.pl

  • snpgenie_within_group.pl

  • snpgenie_within_group_processor.pl

  • vcf2revcom.pl

Module

You can load the modules by:

module load biocontainers
module load snpgenie

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snpgenie on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpgenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snpgenie

snpgenie.pl --minfreq=0.01 --snpreport=CLC_SNP_EXAMPLE.txt \
    --fastafile=REFERENCE_EXAMPLE.fasta --gtffile=CDS_EXAMPLE.gtf

Snphylo

Introduction

Snphylo is a pipeline to generate a phylogenetic tree from huge SNP data.

For more information, please check:

Versions

  • 20180901

Commands

  • Rscript

  • snphylo.sh

  • convert_fasta_to_phylip.py

  • convert_simple_to_hapmap.py

  • determine_bs_tree.R

  • draw_unrooted_tree.R

  • generate_snp_sequence.R

  • remove_low_depth_genotype_data.py

  • remove_no_genotype_data.py

Module

You can load the modules by:

module load biocontainers
module load snphylo

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run snphylo on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snphylo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snphylo
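
The job script above does not include a pipeline call. A minimal sketch that starts the pipeline from a VCF file could follow the module lines. The input file name is a placeholder (assumption), and all other pipeline settings are left at their defaults.

```shell
# Placeholder input (assumption): a VCF file with the SNP calls.
vcf=snps.vcf

# Run only when snphylo.sh is on PATH and the input exists; otherwise skip.
if command -v snphylo.sh >/dev/null 2>&1 && [ -f "$vcf" ]; then
    snphylo.sh -v "$vcf"
else
    echo "snphylo.sh or the input VCF was not found; nothing was run" >&2
fi
```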

Snpsift

Introduction

Snpsift is a tool that annotates genomic variants using databases, and filters and manipulates annotated genomic variants.

For more information, please check its website: https://biocontainers.pro/tools/snpsift and its home page on Github.

Versions

  • 4.3.1t

Commands

  • SnpSift

Module

You can load the modules by:

module load biocontainers
module load snpsift

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Snpsift on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpsift
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snpsift

SnpSift annotate -id dbSnp132.vcf \
    variants.vcf > variants_annotated.vcf

Snp-sites

Introduction

SNP-sites is a tool that rapidly extracts SNPs from a multi-FASTA alignment.

For more information, please check:

Versions

  • 2.5.1

Commands

  • snp-sites

Module

You can load the modules by:

module load biocontainers
module load snp-sites

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run snp-sites on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-sites
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers snp-sites

snp-sites salmonella_serovars_core_genes.aln

Soapdenovo2

Introduction

Soapdenovo2 is a short-read assembly method to build a de novo draft assembly.

For more information, please check its website: https://biocontainers.pro/tools/soapdenovo2 and its home page: http://soap.genomics.org.cn/soapdenovo.html.

Versions

  • 2.40

Commands

  • SOAPdenovo-127mer

  • SOAPdenovo-63mer

Module

You can load the modules by:

module load biocontainers
module load soapdenovo2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Soapdenovo2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=soapdenovo2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers soapdenovo2

SOAPdenovo-127mer all -s config_file -K 63 -R -o graph_prefix 1>ass.log 2>ass.err

SortMeRNA

Introduction

SortMeRNA is a local sequence alignment tool for filtering, mapping and clustering.

For more information, please check its website: https://biocontainers.pro/tools/sortmerna and its home page on Github.

Versions

  • 2.1b

  • 4.3.4

Commands

  • sortmerna

Module

You can load the modules by:

module load biocontainers
module load sortmerna

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run SortMeRNA on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sortmerna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sortmerna

sortmerna --ref silva-bac-16s-id90.fasta,silva-bac-16s-db \
    --reads set2_environmental_study_550_amplicon.fasta \
    --fastx --aligned Test

Souporcell

Introduction

souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.

For more information, please check:

Versions

  • 2.0

Commands

  • check_modules.py

  • compile_stan_model.py

  • consensus.py

  • renamer.py

  • retag.py

  • shared_samples.py

  • souporcell.py

  • souporcell_pipeline.py

Module

You can load the modules by:

module load biocontainers
module load souporcell

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run souporcell on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=souporcell
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers souporcell

souporcell_pipeline.py -i A.merged.bam \
    -b GSM2560245_barcodes.tsv \
    -f refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa \
    -t 8 -o demux_data_test -k 4

Sourmash

Introduction

Sourmash is a tool for quickly searching, comparing, and analyzing genomic and metagenomic data sets.

For more information, please check its website: https://biocontainers.pro/tools/sourmash and its home page on Github.

Versions

  • 4.3.0

  • 4.5.0

Commands

  • sourmash

Module

You can load the modules by:

module load biocontainers
module load sourmash

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Sourmash on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sourmash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sourmash

sourmash sketch dna -p k=31 *.fna.gz
sourmash compare *.sig -o cmp.dist
sourmash plot cmp.dist --labels
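The commands above rely on shell globs such as *.fna.gz. If no file matches, bash passes the literal pattern to the program. A minimal sketch of one way to guard against that, assuming the input files sit in the current directory; the array name genomes is illustrative, and the command is printed rather than executed:

```shell
#!/bin/bash
# With nullglob, a pattern that matches nothing expands to an empty list
# instead of the literal string "*.fna.gz".
shopt -s nullglob
genomes=(*.fna.gz)
shopt -u nullglob

if [ "${#genomes[@]}" -eq 0 ]; then
    echo "No .fna.gz files found in $PWD; nothing to sketch." >&2
else
    # Dry run: print the command. Remove "echo" to execute it inside the job script.
    echo sourmash sketch dna -p k=31 "${genomes[@]}"
fi
```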

Spaceranger

Introduction

Spaceranger is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images.

Versions

  • 1.3.0

  • 1.3.1

  • 2.0.0

Commands

  • spaceranger

Module

You can load the modules by:

module load biocontainers
module load spaceranger

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Spaceranger on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=spaceranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers spaceranger

# --id: output directory; --transcriptome: path to reference;
# --fastqs: path to FASTQs; --sample: sample name from FASTQ filename;
# --image: path to brightfield image; --slide: slide ID; --area: capture area;
# --localcores: allowed cores in local mode; --localmem: allowed memory (GB) in local mode.
spaceranger count --id=sample345 \
               --transcriptome=/opt/refdata/GRCh38-2020-A \
               --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
               --sample=mysample \
               --image=/home/jdoe/runs/images/sample345.tiff \
               --slide=V19J01-123 \
               --area=A1 \
               --localcores=8 \
               --localmem=64
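In bash, a backslash continues a line only when it is the final character on that line, so a comment cannot follow it. One way to keep a comment beside each option is to collect the options in a bash array. A minimal sketch using the same example values as the command above; the array name args is illustrative, and the command is printed instead of executed:

```shell
#!/bin/bash
# Collect the options in an array; comments are allowed between elements.
args=(
    --id=sample345                                      # Output directory
    --transcriptome=/opt/refdata/GRCh38-2020-A          # Path to reference
    --fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path  # Path to FASTQs
    --sample=mysample                                   # Sample name from FASTQ filename
    --image=/home/jdoe/runs/images/sample345.tiff       # Path to brightfield image
    --slide=V19J01-123                                  # Slide ID
    --area=A1                                           # Capture area
    --localcores=8                                      # Allowed cores in local mode
    --localmem=64                                       # Allowed memory (GB) in local mode
)

# Dry run: print the assembled command for inspection.
echo spaceranger count "${args[@]}"
# To execute it in a job script, drop the leading echo:
# spaceranger count "${args[@]}"
```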

SPAdes

Introduction

SPAdes (St. Petersburg genome assembler) is an assembly toolkit containing various assembly pipelines.

Detailed usage can be found here: https://github.com/ablab/spades

Versions

  • 3.15.3

  • 3.15.4

  • 3.15.5

Commands

  • coronaspades.py

  • metaplasmidspades.py

  • metaspades.py

  • metaviralspades.py

  • plasmidspades.py

  • rnaspades.py

  • rnaviralspades.py

  • spades.py

  • spades_init.py

  • truspades.py

  • spades-bwa

  • spades-convert-bin-to-fasta

  • spades-core

  • spades-corrector-core

  • spades-gbuilder

  • spades-gmapper

  • spades-gsimplifier

  • spades-hammer

  • spades-ionhammer

  • spades-kmer-estimating

  • spades-kmercount

  • spades-read-filter

  • spades-truseq-scfcorrection

Module

You can load the modules by:

module load biocontainers
module load spades

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run spades on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=spades
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers spades

spades.py --pe1-1 SRR11234553_1.fastq --pe1-2 SRR11234553_2.fastq -o spades_out -t 24
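The thread count passed to -t should match the number of cores requested with #SBATCH -n. One way to avoid writing the number twice is to read SLURM_NTASKS, which Slurm sets inside a job when -n is given. This is a sketch under assumptions: the variable name THREADS and the array name cmd are illustrative, the fallback of 1 applies only outside a job, and the command is printed rather than executed.

```shell
#!/bin/bash
# SLURM_NTASKS is set by Slurm inside a job submitted with -n; default to 1 otherwise.
THREADS="${SLURM_NTASKS:-1}"

# Build the same SPAdes command as above with the derived thread count.
cmd=(spades.py --pe1-1 SRR11234553_1.fastq --pe1-2 SRR11234553_2.fastq
     -o spades_out -t "$THREADS")

# Dry run: print the command. Replace this line with "${cmd[@]}" to execute it.
echo "${cmd[@]}"
```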

Sprod

Introduction

Sprod: De-noising Spatially Resolved Transcriptomics Data Based on Position and Image Information.

For more information, please check:

Versions

  • 1.0

Commands

  • python

  • python3

  • sprod.py

Module

You can load the modules by:

module load biocontainers
module load sprod

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run sprod on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sprod
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sprod

python3 test_examples.py

Squeezemeta

Introduction

SqueezeMeta is a fully automated metagenomics pipeline, from reads to bins.

For more information, please check:

Versions

  • 1.5.1

Commands

  • 01.merge_assemblies.pl

  • 01.merge_sequential.pl

  • 01.remap.pl

  • 01.run_assembly.pl

  • 01.run_assembly_merged.pl

  • 02.rnas.pl

  • 03.run_prodigal.pl

  • 04.rundiamond.pl

  • 05.run_hmmer.pl

  • 06.lca.pl

  • 07.fun3assign.pl

  • 08.blastx.pl

  • 09.summarycontigs3.pl

  • 10.mapsamples.pl

  • 11.mcount.pl

  • 12.funcover.pl

  • 13.mergeannot2.pl

  • 14.runbinning.pl

  • 15.dastool.pl

  • 16.addtax2.pl

  • 17.checkM_batch.pl

  • 18.getbins.pl

  • 19.getcontigs.pl

  • 20.minpath.pl

  • 21.stats.pl

  • SqueezeMeta.pl

  • SqueezeMeta_conf.pl

  • SqueezeMeta_conf_original.pl

  • parameters.pl

  • restart.pl

  • add_database.pl

  • cover.pl

  • sqm2ipath.pl

  • sqm2itol.pl

  • sqm2keggplots.pl

  • sqm2pavian.pl

  • sqm_annot.pl

  • sqm_hmm_reads.pl

  • sqm_longreads.pl

  • sqm_mapper.pl

  • sqm_reads.pl

  • versionchange.pl

  • find_missing_markers.pl

  • remove_duplicate_markers.pl

  • anvi-filter-sqm.py

  • anvi-load-sqm.py

  • sqm2anvio.pl

  • configure_nodb.pl

  • configure_nodb_alt.pl

  • download_databases.pl

  • make_databases.pl

  • make_databases_alt.pl

  • test_install.pl

Module

You can load the modules by:

module load biocontainers
module load squeezemeta

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run squeezemeta on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squeezemeta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers squeezemeta

SqueezeMeta.pl -m coassembly -p Hadza -s test.samples -f raw

Squid

Introduction

SQUID is designed to detect both fusion-gene and non-fusion-gene transcriptomic structural variations from RNA-seq alignment.

For more information, please check:

Versions

  • 1.5

Commands

  • squid

  • AnnotateSQUIDOutput.py

Module

You can load the modules by:

module load biocontainers
module load squid

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run squid on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squid
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers squid

SRA-Toolkit

Introduction

SRA-Toolkit is a collection of tools and libraries for using data in the INSDC Sequence Read Archives. Its detailed documentation can be found at https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc.

Versions

  • 2.11.0-pl5262

Commands

  • abi-dump

  • align-cache

  • align-info

  • bam-load

  • cache-mgr

  • cg-load

  • fasterq-dump

  • fasterq-dump-orig

  • fastq-dump

  • fastq-dump-orig

  • illumina-dump

  • kar

  • kdbmeta

  • kget

  • latf-load

  • md5cp

  • prefetch

  • prefetch-orig

  • rcexplain

  • read-filter-redact

  • sam-dump

  • sam-dump-orig

  • sff-dump

  • sra-pileup

  • sra-pileup-orig

  • sra-sort

  • sra-sort-cg

  • sra-stat

  • srapath

  • srapath-orig

  • sratools

  • test-sra

  • vdb-config

  • vdb-copy

  • vdb-diff

  • vdb-dump

  • vdb-encrypt

  • vdb-lock

  • vdb-passwd

  • vdb-unlock

  • vdb-validate

Module

You can load the modules by:

module load biocontainers
module load sra-tools/2.11.0-pl5262

Configuring SRA-Toolkit

Users can configure SRA-Toolkit with the vdb-config command. For example, the command below sets the current working directory as the download location:

vdb-config --prefetch-to-cwd

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run SRA-Toolkit on our cluster:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=SRA-Toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers sra-tools/2.11.0-pl5262

vdb-config --prefetch-to-cwd # The data will be downloaded to the current working directory.
prefetch SRR11941281
fastq-dump --split-3 SRR11941281/SRR11941281.sra
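To fetch several runs in one job, the two download commands above can be wrapped in a loop. This is a sketch under assumptions: the accession list holds only the example run, run is a hypothetical helper, and DRY_RUN is an illustrative variable that defaults to printing the commands rather than executing them.

```shell
#!/bin/bash
# Hypothetical helper: print the command when DRY_RUN=1 (the default), run it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Accessions to download; add more run IDs as needed.
accessions=(SRR11941281)

for acc in "${accessions[@]}"; do
    # As in the example above, the .sra file is expected at ./<accession>/<accession>.sra
    run prefetch "$acc" || { echo "prefetch failed for $acc" >&2; continue; }
    run fastq-dump --split-3 "$acc/$acc.sra"
done
```

Setting DRY_RUN=0 in the job script makes the helper execute the commands instead of printing them.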

Srst2

Introduction

Srst2 is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc.) and report the presence of STs and/or reference genes.

For more information, please check its Docker hub: https://hub.docker.com/r/staphb/srst2 and its home page: https://github.com/katholt/srst2.

Versions

  • 0.2.0

Commands

  • getmlst.py

  • srst2

  • slurm_srst2.py

Module

You can load the modules by:

module load biocontainers
module load srst2

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run srst2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=srst2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers srst2

Stacks

Introduction

Stacks is a software pipeline for building loci from RAD-seq.

For more information, please check its website: https://biocontainers.pro/tools/stacks and its home page: https://catchenlab.life.illinois.edu/stacks/.

Versions

  • 2.60

Commands

  • clone_filter

  • count_fixed_catalog_snps.py

  • cstacks

  • denovo_map.pl

  • gstacks

  • integrate_alignments.py

  • kmer_filter

  • phasedstacks

  • populations

  • process_radtags

  • process_shortreads

  • ref_map.pl

  • sstacks

  • stacks-dist-extract

  • stacks-gdb

  • stacks-integrate-alignments

  • tsv2bam

  • ustacks

Module

You can load the modules by:

module load biocontainers
module load stacks

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Stacks on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=stacks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers stacks

denovo_map.pl -T 8 -M 4 -o ./stacks/  \
    --samples ./samples --popmap ./popmaps/popmap

STAR

Introduction

STAR: ultrafast universal RNA-seq aligner.

Detailed usage can be found here: https://github.com/alexdobin/STAR

Versions

  • 2.7.10a

  • 2.7.10b

  • 2.7.9a

Commands

  • STAR

  • STARlong

Module

You can load the modules by:

module load biocontainers
module load star/2.7.10a

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run STAR on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=star
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers star/2.7.10a


# Step 1: build the genome index
STAR --runThreadN 24 --runMode genomeGenerate --genomeDir ref_genome --genomeFastaFiles ref_genome.fasta

# Step 2: align the paired-end reads against the index
STAR --runThreadN 24 --genomeDir ref_genome --readFilesIn seq_1.fastq seq_2.fastq --outSAMtype BAM SortedByCoordinate --outWigType wiggle read2

Staramr

Introduction

staramr scans bacterial genome contigs against the ResFinder, PointFinder, and PlasmidFinder databases (used by the ResFinder webservice and other webservices offered by the Center for Genomic Epidemiology) and compiles a summary report of detected antimicrobial resistance genes.

For more information, please check:

Versions

  • 0.7.1

Commands

  • staramr

Module

You can load the modules by:

module load biocontainers
module load staramr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run staramr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=staramr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers staramr

staramr db info
staramr search \
    --pointfinder-organism salmonella \
    -o out *.fasta

STAR-Fusion

Introduction

STAR-Fusion is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT).

For more information, please check its Docker hub: https://hub.docker.com/r/trinityctat/starfusion and its home page on GitHub.

Versions

  • 1.11b

Commands

  • STAR-Fusion

Module

You can load the modules by:

module load biocontainers
module load starfusion

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run STAR-Fusion on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=starfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers starfusion

STAR-Fusion --CPU 24 --left_fq ../star/SRR12095148_1.fastq --right_fq ../star/SRR12095148_2.fastq \
     --genome_lib_dir  GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir \
     --FusionInspector validate \
     --denovo_reconstruct \
     --examine_coding_effect \
     --output_dir STAR-Fusion-output

STREAM

Introduction

STREAM (Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data.

For more information, please check its Docker hub: https://hub.docker.com/r/pinellolab/stream and its home page on GitHub.

Versions

  • 1.0

Commands

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load stream

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run STREAM on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers stream

Stringdecomposer

Introduction

Stringdecomposer is a tool for decomposing centromeric assemblies and long reads into monomers.

For more information, please check:

Versions

  • 1.1.2

Commands

  • stringdecomposer

Module

You can load the modules by:

module load biocontainers
module load stringdecomposer

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run stringdecomposer on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stringdecomposer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers stringdecomposer

StringTie

Introduction

StringTie: efficient transcript assembly and quantitation of RNA-Seq data.

Stringtie employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome. It takes as input spliced alignments in coordinate-sorted SAM/BAM/CRAM format and produces a GTF output which consists of assembled transcript structures and their estimated expression levels (FPKM/TPM and base coverage values).

Detailed usage can be found here: https://github.com/gpertea/stringtie

Versions

  • 2.1.7

  • 2.2.1

Commands

  • stringtie

Module

You can load the modules by:

module load biocontainers
module load stringtie

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run stringtie on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=stringtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers stringtie

stringtie -o SRR11614710.gtf -G Homo_sapiens.GRCh38.105.gtf SRR11614710Aligned.sortedByCoord.out.bam

Strique

Introduction

STRique is a python package to analyze repeat expansion and methylation states of short tandem repeats (STR) in Oxford Nanopore Technology (ONT) long read sequencing data.

For more information, please check:

Versions

  • 0.4.2

Commands

  • STRique.py

  • STRique_test.py

  • fast5Masker.py

Module

You can load the modules by:

module load biocontainers
module load strique

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run strique on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=strique
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers strique

STRique_test.py
STRique.py index data/ > data/reads.fofn
cat data/c9orf72.sam |  STRique.py count ./data/reads.fofn ./models/r9_4_450bps.model ./configs/repeat_config.tsv --config ./configs/STRique.json

Structure

Introduction

Structure is a software package for using multi-locus genotype data to investigate population structure.

For more information, please check:

Versions

  • 2.3.4

Commands

  • structure

Module

You can load the modules by:

module load biocontainers
module load structure

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run structure on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=structure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers structure

Subread

Introduction

Subread carries out high-performance read alignment, quantification and mutation discovery. It is a general-purpose read aligner which can be used to map both genomic DNA-seq reads and RNA-seq reads. It uses a new mapping paradigm called seed-and-vote to achieve fast, accurate and scalable read mapping. Subread automatically determines if a read should be globally or locally aligned, which makes it particularly powerful in mapping RNA-seq reads. It supports INDEL detection and can map reads with both fixed and variable lengths.

For more information, please check its website: https://biocontainers.pro/tools/subread and its home page: http://subread.sourceforge.net.

Versions

  • 1.6.4

  • 2.0.1

Commands

  • detectionCall

  • exactSNP

  • featureCounts

  • flattenGTF

  • genRandomReads

  • propmapped

  • qualityScores

  • removeDup

  • repair

  • subindel

  • subjunc

  • sublong

  • subread-align

  • subread-buildindex

  • subread-fullscan

  • txUnique

Module

You can load the modules by:

module load biocontainers
module load subread

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Subread on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=subread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers subread

featureCounts -s 2 -p -Q 10 -T 4 -a genome.gtf -o featurecounts.txt mapped.bam

Survivor

Introduction

SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.

For more information, please check:

Versions

  • 1.0.7

Commands

  • SURVIVOR

Module

You can load the modules by:

module load biocontainers
module load survivor

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run survivor on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=survivor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers survivor

SURVIVOR simSV parameter_file
SURVIVOR simSV ref.fa parameter_file 0.1 0 simulated
SURVIVOR eval caller.vcf simulated.bed 10 eval_res

Svaba

Introduction

SvABA is a method for detecting structural variants in sequencing data using genome-wide local assembly.

For more information, please check:

Versions

  • 1.1.0

Commands

  • svaba

Module

You can load the modules by:

module load biocontainers
module load svaba

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run svaba on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=svaba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers svaba

DBSNP=dbsnp_indel.vcf
TUM_BAM=G15512.HCC1954.1.COST16011_region.bam
NORM_BAM=HCC1954.NORMAL.30x.compare.COST16011_region.bam
CORES=8 ## set any number of cores
REF=Homo_sapiens_assembly19.COST16011_region.fa
svaba run -t $TUM_BAM -n $NORM_BAM \
    -p $CORES -D $DBSNP \
    -a somatic_run -G $REF

Svtools

Introduction

Svtools is a suite of utilities designed to help bioinformaticians construct and explore cohort-level structural variation calls.

For more information, please check:

Versions

  • 0.5.1

Commands

  • svtools

Module

You can load the modules by:

module load biocontainers
module load svtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run svtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers svtools

Svtyper

Introduction

SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. svtyper is the original implementation of the genotyping algorithm, and works with multiple samples. svtyper-sso is an alternative implementation of svtyper that is optimized for genotyping a single sample. It is parallelized and takes advantage of multiple CPU cores via the multiprocessing module, which can offer a 2x or more speedup (depending on how many CPU cores are used) when genotyping a single sample.

NOTE: svtyper-sso is not yet stable. There are minor logging differences between the two and svtyper-sso may exit with an error prematurely when processing CRAM files.

For more information, please check:

Versions

  • 0.7.1

Commands

  • svtyper

  • svtyper-sso

  • python

  • python2

Module

You can load the modules by:

module load biocontainers
module load svtyper

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run svtyper on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers svtyper

svtyper \
    -i data/example.vcf \
    -B data/NA12878.target_loci.sorted.bam \
    -l data/NA12878.bam.json \
    > out.vcf

swat

Introduction

swat is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties.

For more information, please check its home page: http://www.phrap.org/phredphrapconsed.html#block_phrap.

Versions

  • 1.090518

Commands

  • swat

Module

You can load the modules by:

module load biocontainers
module load swat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run swat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=swat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers swat

Syri

Introduction

Syri compares alignments between two chromosome-level assemblies and identifies synteny and structural rearrangements.

For more information, please check:

Versions

  • 1.6

Commands

  • syri

Module

You can load the modules by:

module load biocontainers
module load syri

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run syri on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=syri
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers syri

syri -c out.sam -r refgenome -q qrygenome -k -F S

Talon

Introduction

Talon is a Python package for identifying and quantifying known and novel genes/isoforms in long-read transcriptome data sets.

For more information, please check its website: https://biocontainers.pro/tools/talon and its home page on Github.

Versions

  • 5.0

Commands

  • talon

  • talon_abundance

  • talon_create_GTF

  • talon_fetch_reads

  • talon_filter_transcripts

  • talon_generate_report

  • talon_get_sjs

  • talon_initialize_database

  • talon_label_reads

  • talon_reformat_gtf

  • talon_summarize

Module

You can load the modules by:

module load biocontainers
module load talon

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Talon on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=talon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers talon

Targetp

Introduction

TargetP-2.0 tool predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP) or thylakoid luminal transit peptide (luTP). For the sequences predicted to contain an N-terminal presequence a potential cleavage site is also predicted.

For more information, please check:

Versions

  • 2.0

Commands

  • targetp

Module

You can load the modules by:

module load biocontainers
module load targetp

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run targetp on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=targetp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers targetp

Tassel

Introduction

TASSEL is a software package used to evaluate trait associations, evolutionary patterns, and linkage disequilibrium.

For more information, please check:

Versions

  • 5.0

Commands

  • run_pipeline.pl

  • start_tassel.pl

  • Tassel5

Module

You can load the modules by:

module load biocontainers
module load tassel

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run tassel on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tassel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tassel

Taxonkit

Introduction

Taxonkit is a practical and efficient NCBI taxonomy toolkit.

For more information, please check its website: https://biocontainers.pro/tools/taxonkit and its home page on Github.

Versions

  • 0.9.0

Commands

  • taxonkit

Module

You can load the modules by:

module load biocontainers
module load taxonkit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Taxonkit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=taxonkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers taxonkit

taxonkit list --show-rank --show-name --indent "    " --ids 9605,239934

T-coffee

Introduction

T-coffee is a multiple sequence alignment software using a progressive approach.

For more information, please check its website: https://biocontainers.pro/tools/t-coffee and its home page on GitHub.

Versions

  • 13.45.0.4846264

Commands

  • t_coffee

Module

You can load the modules by:

module load biocontainers
module load t-coffee

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run T-coffee on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=t-coffee
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers t-coffee

t_coffee OG0002077.fa -mode expresso

Tetranscripts

Introduction

Tetranscripts is a package for including transposable elements in differential enrichment analysis of sequencing datasets.

For more information, please check its website: https://biocontainers.pro/tools/tetranscripts and its home page on GitHub.

Versions

  • 2.2.1

Commands

  • TEtranscripts

  • TEcount

Module

You can load the modules by:

module load biocontainers
module load tetranscripts

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Tetranscripts on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tetranscripts
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tetranscripts

TEtranscripts --format BAM --mode multi \
    -t treatment_sample1.bam treatment_sample2.bam treatment_sample3.bam \
    -c control_sample1.bam control_sample2.bam control_sample3.bam \
    --GTF genic-GTF-file \
    --TE TE-GTF-file \
    --project sample_nosort_test

Tiara

Introduction

Tiara is a deep-learning-based approach for identification of eukaryotic sequences in the metagenomic data powered by PyTorch.

For more information, please check its Docker hub: https://hub.docker.com/r/zhan4429/tiara and its home page on GitHub.

Versions

  • 1.0.2

Commands

  • tiara

Module

You can load the modules by:

module load biocontainers
module load tiara

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Tiara on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=tiara
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tiara

tiara -t 24 -i archaea_fr.fasta -o archaea_out.txt
tiara -t 24 -i bacteria_fr.fasta -o bacteria_out.txt
tiara -t 24 -i eukarya_fr.fasta -o eukarya_out.txt
tiara -t 24 -i mitochondria_fr.fasta -o mitochondria_out.txt
tiara -t 24 -i plast_fr.fasta -o plast_out.txt
tiara -t 24 -i total.fasta -o mix_out.txt --tf all -p 0.65 0.60 --probabilities
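The first five runs above differ only in the prefix of the input file, so they can be generated by a loop. This is a sketch under two assumptions: the inputs follow the name_fr.fasta pattern used in the example, and tiara_cmds is a hypothetical helper name. It prints the commands instead of running them, and takes the thread count from SLURM_NTASKS (which Slurm sets when -n is given), falling back to 24.

```shell
# tiara_cmds: hypothetical helper that prints one tiara command per input
# prefix (dry run). Nothing is executed; pipe the output to bash to run it.
tiara_cmds() {
    # Use the task count requested with "#SBATCH -n" when running under
    # Slurm; otherwise fall back to 24, the value used in the example.
    threads="${SLURM_NTASKS:-24}"
    for name in "$@"; do
        echo "tiara -t $threads -i ${name}_fr.fasta -o ${name}_out.txt"
    done
}

tiara_cmds archaea bacteria eukarya mitochondria plast
```

Deriving the thread count from SLURM_NTASKS keeps the -t value in step with the #SBATCH -n line, so only one number has to change when the job is resized.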

Tigmint

Introduction

Tigmint identifies and corrects misassemblies using linked (e.g. MGI’s stLFR, 10x Genomics Chromium) or long (e.g. Oxford Nanopore Technologies long reads) DNA sequencing reads. The reads are first aligned to the assembly, and the extents of the large DNA molecules are inferred from the alignments of the reads. The physical coverage of the large molecules is more consistent and less prone to coverage dropouts than that of the short read sequencing data. The sequences are cut at positions that have insufficient spanning molecules. Tigmint outputs a BED file of these cut points, and a FASTA file of the cut sequences.

For more information, please check its home page: https://github.com/bcgsc/tigmint

Versions

  • 1.2.6

Commands

  • tigmint

  • tigmint-arcs-tsv

  • tigmint-cut

  • tigmint-make

  • tigmint_estimate_dist.py

  • tigmint_molecule.py

  • tigmint_molecule_paf.py

Module

You can load the modules by:

module load biocontainers
module load tigmint

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run tigmint on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tigmint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tigmint

Tobias

Introduction

Tobias is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.

For more information, please check its website: https://biocontainers.pro/tools/tobias and its home page on GitHub.

Versions

  • 0.13.3

Commands

  • TOBIAS

Module

You can load the modules by:

module load biocontainers
module load tobias

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Tobias on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=tobias
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tobias

TOBIAS DownloadData --bucket data-tobias-2020
mv data-tobias-2020/ test_data/

TOBIAS PlotAggregate --TFBS test_data/BATF_all.bed \
     --signals test_data/Bcell_corrected.bw test_data/Tcell_corrected.bw \
     --output BATFJUN_footprint_comparison_all.pdf \
     --share_y both --plot_boundaries --signal-on-x

TOBIAS BINDetect --motifs test_data/motifs.jaspar \
     --signals test_data/Bcell_footprints.bw test_data/Tcell_footprints.bw \
     --genome test_data/genome.fa.gz \
     --peaks test_data/merged_peaks_annotated.bed \
     --peak_header test_data/merged_peaks_annotated_header.txt \
     --outdir BINDetect_output --cond_names Bcell Tcell --cores 8

TOBIAS ATACorrect --bam test_data/Bcell.bam \
    --genome test_data/genome.fa.gz \
    --peaks test_data/merged_peaks.bed \
    --blacklist test_data/blacklist.bed \
    --outdir ATACorrect_test --cores 8

TOBIAS FootprintScores --signal test_data/Bcell_corrected.bw \
    --regions test_data/merged_peaks.bed \
    --output Bcell_footprints.bw --cores 8

Tombo

Introduction

Tombo is a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data. Tombo also provides tools for the analysis and visualization of raw nanopore signal.

For more information, please check its website: https://biocontainers.pro/tools/ont-tombo and its home page on GitHub.

Versions

  • 1.5.1

Commands

  • tombo

Module

You can load the modules by:

module load biocontainers
module load tombo

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Tombo on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=tombo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tombo

tombo resquiggle path/to/fast5s/ genome.fasta --processes 4 --num-most-common-errors 5
tombo detect_modifications alternative_model --fast5-basedirs path/to/fast5s/ \
    --statistics-file-basename native.e_coli_sample \
    --alternate-bases dam dcm --processes 4

# plot raw signal at most significant dcm locations
tombo plot most_significant --fast5-basedirs path/to/fast5s/ \
    --statistics-filename native.e_coli_sample.dcm.tombo.stats \
    --plot-standard-model --plot-alternate-model dcm \
    --pdf-filename sample.most_significant_dcm_sites.pdf

# produces wig file with estimated fraction of modified reads at each valid reference site
tombo text_output browser_files --statistics-filename native.e_coli_sample.dam.tombo.stats \
     --file-types dampened_fraction --browser-file-basename native.e_coli_sample.dam
# also produce successfully processed reads coverage file for reference
tombo text_output browser_files --fast5-basedirs path/to/fast5s/ \
    --file-types coverage --browser-file-basename native.e_coli_sample

TopHat

Introduction

TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

For more information, please check its website: https://biocontainers.pro/tools/tophat and its home page: https://ccb.jhu.edu/software/tophat/index.shtml.

Versions

  • 2.1.1-py27

Commands

  • tophat

  • tophat2

Module

You can load the modules by:

module load biocontainers
module load tophat

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run TopHat on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tophat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tophat

tophat -r 20 test_ref reads_1.fq reads_2.fq

TPMCalculator

Introduction

TPMCalculator quantifies mRNA abundance directly from the alignments by parsing BAM files.

Detailed usage can be found here: https://github.com/ncbi/TPMCalculator

Versions

  • 0.0.3

  • 0.0.4

Commands

  • TPMCalculator

Module

You can load the modules by:

module load biocontainers
module load tpmcalculator

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run tpmcalculator on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=tpmcalculator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers tpmcalculator

TPMCalculator -g Homo_sapiens.GRCh38.105.chr.gtf -b SRR12095148Aligned.sortedByCoord.out.bam

Transabyss

Introduction

Transabyss is a tool for de novo assembly of RNA-Seq data using ABySS.

For more information, please check its website: https://bioconda.github.io/recipes/transabyss and its home page on GitHub.

Versions

  • 2.0.1

Commands

  • transabyss

  • transabyss-merge

Module

You can load the modules by:

module load biocontainers
module load transabyss

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Transabyss on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transabyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers transabyss

transabyss --name SRR12095148 \
    --pe SRR12095148_1.fastq SRR12095148_2.fastq \
    --outdir SRR12095148_assembly --threads 12

TransDecoder

Introduction

TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using TopHat and Cufflinks.

TransDecoder identifies likely coding sequences based on the following criteria:

  • a minimum length open reading frame (ORF) is found in a transcript sequence

  • a log-likelihood score similar to what is computed by the GeneID software is > 0.

  • the above coding score is greatest when the ORF is scored in the 1st reading frame as compared to scores in the other 2 forward reading frames.

  • if a candidate ORF is found fully encapsulated by the coordinates of another candidate ORF, the longer one is reported. However, a single transcript can report multiple ORFs (allowing for operons, chimeras, etc).

  • a PSSM is built/trained/used to refine the start codon prediction.

  • optionally, the putative peptide has a match to a Pfam domain above the noise cutoff score.

Detailed usage can be found here: https://github.com/TransDecoder/TransDecoder/wiki#running-transdecoder

Versions

  • 5.5.0

Commands

  • TransDecoder.LongOrfs

  • TransDecoder.Predict

  • cdna_alignment_orf_to_genome_orf.pl

  • compute_base_probs.pl

  • exclude_similar_proteins.pl

  • fasta_prot_checker.pl

  • ffindex_resume.pl

  • gene_list_to_gff.pl

  • get_FL_accs.pl

  • get_longest_ORF_per_transcript.pl

  • get_top_longest_fasta_entries.pl

  • gff3_file_to_bed.pl

  • gff3_file_to_proteins.pl

  • gff3_gene_to_gtf_format.pl

  • gtf_genome_to_cdna_fasta.pl

  • gtf_to_alignment_gff3.pl

  • gtf_to_bed.pl

  • nr_ORFs_gff3.pl

  • pfam_runner.pl

  • refine_gff3_group_iso_strip_utrs.pl

  • refine_hexamer_scores.pl

  • remove_eclipsed_ORFs.pl

  • score_CDS_likelihood_all_6_frames.pl

  • select_best_ORFs_per_transcript.pl

  • seq_n_baseprobs_to_loglikelihood_vals.pl

  • start_codon_refinement.pl

  • train_start_PWM.pl

  • uri_unescape.pl

Module

You can load the modules by:

module load biocontainers
module load transdecoder

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run transdecoder on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=transdecoder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers transdecoder

gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
gtf_to_alignment_gff3.pl transcripts.gtf > transcripts.gff3
TransDecoder.LongOrfs -t transcripts.fasta
TransDecoder.Predict -t transcripts.fasta
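Each step in the example above consumes a file written by an earlier step, and because transcripts.fasta is created by shell redirection it will exist even if the first command fails. A small guard can stop the job early in that case. This is a sketch only: require_nonempty is a hypothetical helper name, not a TransDecoder command.

```shell
# require_nonempty: hypothetical helper; succeeds only if the given file
# exists and has a size greater than zero.
require_nonempty() {
    if [ -s "$1" ]; then
        return 0
    fi
    echo "error: $1 is missing or empty" >&2
    return 1
}

# Intended placement inside the job script (commented out here because
# these commands need the transdecoder module):
#   gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
#   require_nonempty transcripts.fasta || exit 1
#   TransDecoder.LongOrfs -t transcripts.fasta
```

Checking for a non-empty file rather than mere existence matters here because the > redirection creates the output file before the Perl script has produced anything.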

Transrate

Introduction

Transrate is software for de-novo transcriptome assembly quality analysis.

Versions

  • 1.0.3

Commands

  • transrate

Module

You can load the modules by:

module load biocontainers
module load transrate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run transrate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transrate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers transrate

transrate --assembly mm10/Mus_musculus.GRCm38.cds.all.fa \
    --left seq_1.fq.gz \
    --right seq_2.fq.gz \
    --threads 12

Transvar

Introduction

Transvar is a multi-way annotator for genetic elements and genetic variations.

For more information, please check its Docker hub: https://hub.docker.com/r/zhouwanding/transvar and its home page: https://bioinformatics.mdanderson.org/public-software/transvar/.

Versions

  • 2.5.9

Commands

  • transvar

Module

You can load the modules by:

module load biocontainers
module load transvar

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Transvar on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=transvar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers transvar

# set up databases
transvar config --download_anno --refversion hg19

# in case you don't have a reference
transvar config --download_ref --refversion hg19

transvar panno -i 'PIK3CA:p.E545K' --ucsc --ccds

tRAX

Introduction

tRAX (tRNA Analysis of eXpression) is a software package built for in-depth analyses of tRNA-derived small RNAs (tDRs), mature tRNAs, and inference of RNA modifications from high-throughput small RNA sequencing data.

For more information, please check its Docker hub: https://hub.docker.com/r/ucsclowelab/trax and its home page on GitHub.

Versions

  • 1.0.0

Commands

  • TestRun.bash

  • quickdb.bash

  • maketrnadb.py

  • trimadapters.py

  • processamples.py

Module

You can load the modules by:

module load biocontainers
module load trax

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run tRAX on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trax
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trax

Treetime

Introduction

Treetime is a tool for maximum likelihood dating and ancestral sequence inference.

For more information, please check its website: https://biocontainers.pro/tools/treetime and its home page on GitHub.

Versions

  • 0.8.6

  • 0.9.4

Commands

  • treetime

Module

You can load the modules by:

module load biocontainers
module load treetime

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Treetime on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=treetime
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers treetime

treetime ancestral --aln input.fasta --tree input.nwk

Trimal

Introduction

Trimal is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment.

For more information, please check its website: https://biocontainers.pro/tools/trimal and its home page: http://trimal.cgenomics.org.

Versions

  • 1.4.1

Commands

  • trimal

  • readal

  • statal

Module

You can load the modules by:

module load biocontainers
module load trimal

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trimal on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trimal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trimal

trimal -in input.fasta -out output1 -htmlout output1.html -gt 1

Trim-galore

Introduction

Trim-galore is a wrapper tool that automates quality and adapter trimming of FastQ files.

Versions

  • 0.6.7

Commands

  • trim_galore

Module

You can load the modules by:

module load biocontainers
module load trim-galore

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trim-galore on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=trim-galore
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trim-galore

trim_galore --paired --fastqc --length 20 -o sample1_trimmed Sample1_1.fq Sample1_2.fq
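When several paired samples are trimmed in one job, the reverse-read file and the output directory can be derived from the forward-read file name. This is a sketch that assumes the _1.fq / _2.fq naming used in the example above; trim_galore_cmd is a hypothetical helper name, and it only prints the command (dry run). Unlike the example, it names the output directory after the sample prefix exactly as written in the file name.

```shell
# trim_galore_cmd: hypothetical helper; given the forward-read file, print
# the trim_galore command for the pair. Nothing is executed.
trim_galore_cmd() {
    r1="$1"
    sample="${r1%_1.fq}"     # strip the "_1.fq" suffix to get the sample name
    r2="${sample}_2.fq"      # reverse-read file for the same sample
    echo "trim_galore --paired --fastqc --length 20 -o ${sample}_trimmed $r1 $r2"
}

trim_galore_cmd Sample1_1.fq
```

Looping over the forward-read files (for f in *_1.fq; do trim_galore_cmd "$f"; done) would then print one command per sample for review before running them.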

Trimmomatic

Introduction

Trimmomatic is a flexible read trimming tool for Illumina NGS data.

For more information, please check its website: https://biocontainers.pro/tools/trimmomatic and its home page: http://www.usadellab.org/cms/index.php?page=trimmomatic.

Versions

  • 0.39

Commands

  • trimmomatic

Module

You can load the modules by:

module load biocontainers
module load trimmomatic

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trimmomatic on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=trimmomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trimmomatic

trimmomatic PE -threads 8 \
    input_forward.fq.gz input_reverse.fq.gz \
    output_forward_paired.fq.gz output_forward_unpaired.fq.gz \
    output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz \
    ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3 TRAILING:3 MINLEN:36

Trinity

Introduction

Trinity assembles transcript sequences from Illumina RNA-Seq data.

For more information, please check its website: https://biocontainers.pro/tools/trinity and its home page on GitHub.

Versions

  • 2.12.0

  • 2.13.2

  • 2.14.0

  • 2.15.0

Commands

  • Trinity

  • TrinityStats.pl

  • Trinity_gene_splice_modeler.py

  • ace2sam

  • align_and_estimate_abundance.pl

  • analyze_blastPlus_topHit_coverage.pl

  • analyze_diff_expr.pl

  • blast2sam.pl

  • bowtie

  • bowtie2

  • bowtie2-build

  • bowtie2-inspect

  • bowtie2sam.pl

  • contig_ExN50_statistic.pl

  • define_clusters_by_cutting_tree.pl

  • export2sam.pl

  • extract_supertranscript_from_reference.py

  • filter_low_expr_transcripts.pl

  • get_Trinity_gene_to_trans_map.pl

  • insilico_read_normalization.pl

  • interpolate_sam.pl

  • jellyfish

  • novo2sam.pl

  • retrieve_sequences_from_fasta.pl

  • run_DE_analysis.pl

  • sam2vcf.pl

  • samtools

  • samtools.pl

  • seq_cache_populate.pl

  • seqtk-trinity

  • sift_bam_max_cov.pl

  • soap2sam.pl

  • tabix

  • trimmomatic

  • wgsim

  • wgsim_eval.pl

  • zoom2sam.pl

Module

You can load the modules by:

module load biocontainers
module load trinity

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trinity on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 6
#SBATCH --job-name=trinity
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trinity

Trinity --seqType fq --left reads_1.fq --right reads_2.fq \
    --CPU 6 --max_memory 20G

Trinotate

Introduction

Trinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.

For more information, please check its website: https://biocontainers.pro/tools/trinotate and its home page on GitHub.

Versions

  • 3.2.2

Commands

  • Trinotate

  • Build_Trinotate_Boilerplate_SQLite_db.pl

  • EMBL_dat_to_Trinotate_sqlite_resourceDB.pl

  • EMBL_swissprot_parser.pl

  • PFAM_dat_parser.pl

  • PFAMtoGoParser.pl

  • RnammerTranscriptome.pl

  • TrinotateSeqLoader.pl

  • Trinotate_BLAST_loader.pl

  • Trinotate_GO_to_SLIM.pl

  • Trinotate_GTF_loader.pl

  • Trinotate_GTF_or_GFF3_annot_prep.pl

  • Trinotate_PFAM_loader.pl

  • Trinotate_RNAMMER_loader.pl

  • Trinotate_SIGNALP_loader.pl

  • Trinotate_TMHMM_loader.pl

  • Trinotate_get_feature_name_encoding_attributes.pl

  • Trinotate_report_writer.pl

  • assign_eggnog_funccats.pl

  • autoTrinotate.pl

  • build_DE_cache_tables.pl

  • cleanMe.pl

  • cleanme.pl

  • count_table_fields.pl

  • create_clusters_tables.pl

  • extract_GO_assignments_from_Trinotate_xls.pl

  • extract_GO_for_BiNGO.pl

  • extract_specific_genes_from_all_matrices.pl

  • import_DE_results.pl

  • import_Trinotate_xls_as_annot.pl

  • import_expression_and_DE_results.pl

  • import_expression_matrix.pl

  • import_samples_n_expression_matrix.pl

  • import_samples_only.pl

  • import_transcript_annotations.pl

  • import_transcript_clusters.pl

  • import_transcript_names.pl

  • init_Trinotate_sqlite_db.pl

  • legacy_blast.pl

  • make_cXp_html.pl

  • obo_tab_to_sqlite_db.pl

  • obo_to_tab.pl

  • prep_nuc_prot_set_for_trinotate_loading.pl

  • print.pl

  • rnammer_supperscaffold_gff_to_indiv_transcripts.pl

  • runMe.pl

  • run_TrinotateWebserver.pl

  • run_cluster_functional_enrichment_analysis.pl

  • shrink_db.pl

  • sqlite.pl

  • superScaffoldGenerator.pl

  • test_Barplot.pl

  • test_GO_DAG.pl

  • test_GenomeBrowser.pl

  • test_Heatmap.pl

  • test_Lineplot.pl

  • test_Piechart.pl

  • test_Scatter2D.pl

  • test_Sunburst.pl

  • trinotate_report_summary.pl

  • update_blastdb.pl

  • update_seq_n_annotation_fields.pl

Module

You can load the modules by:

module load biocontainers
module load trinotate

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trinotate on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trinotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trinotate

sqlite_db="myTrinotate.sqlite"

Trinotate ${sqlite_db} init \
    --gene_trans_map data/Trinity.fasta.gene_to_trans_map \
    --transcript_fasta data/Trinity.fasta \
    --transdecoder_pep data/Trinity.fasta.transdecoder.pep

Trinotate ${sqlite_db} LOAD_swissprot_blastp data/swissprot.blastp.outfmt6

Trinotate ${sqlite_db} LOAD_pfam data/TrinotatePFAM.out

Trnascan-se

Introduction

Trnascan-se is a convenient, ready-for-use means to identify tRNA genes in one or more query sequences.

For more information, please check its website: https://biocontainers.pro/tools/trnascan-se and its home page: http://lowelab.ucsc.edu/tRNAscan-SE/.

Versions

  • 2.0.9

Commands

  • tRNAscan-SE

Module

You can load the modules by:

module load biocontainers
module load trnascan-se

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Trnascan-se on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=trnascan-se
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trnascan-se

tRNAscan-SE --thread 12 -o tRNA.out \
    -f tRNA.ss -m tRNA.stats genome.fasta

Trtools

Introduction

TRTools includes a variety of utilities for filtering, quality control and analysis of tandem repeats downstream of genotyping them from next-generation sequencing.

Versions

  • 5.0.1

Commands

  • associaTR

  • compareSTR

  • dumpSTR

  • mergeSTR

  • qcSTR

  • statSTR

Module

You can load the modules by:

module load biocontainers
module load trtools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Warning

We noticed that the xalt module can cause certain commands, including statSTR, to fail. Please unload all loaded modules with module --force purge before loading the required modules.

To run trtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trtools htslib bcftools

mergeSTR --vcfs ceu_ex.vcf.gz,yri_ex.vcf.gz --out merged
bgzip merged.vcf
tabix -p vcf merged.vcf.gz

# Get the CEU and YRI sample lists
bcftools query -l yri_ex.vcf.gz > yri_samples.txt
bcftools query -l ceu_ex.vcf.gz > ceu_samples.txt

# Run statSTR on region chr21:34351482-34363028 (the interval passed to --region below)
statSTR \
    --vcf merged.vcf.gz \
    --samples yri_samples.txt,ceu_samples.txt \
    --sample-prefixes YRI,CEU \
    --out stdout \
    --mean --het --acount \
    --use-length \
    --region chr21:34351482-34363028

Trust4

Introduction

Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool to analyze TCR and BCR sequences using unselected RNA sequencing data, profiled from solid tissues, including tumors.

Versions

  • 1.0.7

Commands

  • run-trust4

  • BuildDatabaseFa.pl

  • BuildImgtAnnot.pl

  • trust-airr.pl

  • trust-barcoderep.pl

  • trust-simplerep.pl

  • trust-smartseq.pl

Module

You can load the modules by:

module load biocontainers
module load trust4

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run trust4 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trust4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trust4

run-trust4 -b mapped.bam -f hg38_bcrtcr.fa --ref human_IMGT+C.fa

Trycycler

Introduction

Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes. That is, if you have multiple long-read assemblies for the same isolate, Trycycler can combine them into a single assembly that is better than any of your inputs.

Versions

  • 0.5.0

  • 0.5.3

  • 0.5.4

Commands

  • trycycler

Module

You can load the modules by:

module load biocontainers
module load trycycler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run trycycler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trycycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers trycycler

trycycler cluster --assemblies \
    test/test_cluster/assembly_*.fasta \
    --reads test/test_cluster/reads.fastq \
    --out_dir trycycler_out

UCSC Executables

Introduction

UCSC Executables is a collection of executables that perform functions ranging from sequence analysis and format conversion, to basic number crunching and statistics, to complex database generation and manipulation.

These executables have been downloaded from http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v369/ and made available on RCAC clusters.

Versions

  • 369

Commands

  • addCols

  • ameme

  • autoDtd

  • autoSql

  • autoXml

  • ave

  • aveCols

  • axtChain

  • axtSort

  • axtSwap

  • axtToMaf

  • axtToPsl

  • bamToPsl

  • barChartMaxLimit

  • bedClip

  • bedCommonRegions

  • bedCoverage

  • bedExtendRanges

  • bedGeneParts

  • bedGraphPack

  • bedGraphToBigWig

  • bedIntersect

  • bedItemOverlapCount

  • bedJoinTabOffset

  • bedJoinTabOffset.py

  • bedMergeAdjacent

  • bedPartition

  • bedPileUps

  • bedRemoveOverlap

  • bedRestrictToPositions

  • bedSingleCover.pl

  • bedSort

  • bedToBigBed

  • bedToExons

  • bedToGenePred

  • bedToPsl

  • bedWeedOverlapping

  • bigBedInfo

  • bigBedNamedItems

  • bigBedSummary

  • bigBedToBed

  • bigGenePredToGenePred

  • bigHeat

  • bigMafToMaf

  • bigPslToPsl

  • bigWigAverageOverBed

  • bigWigCat

  • bigWigCluster

  • bigWigCorrelate

  • bigWigInfo

  • bigWigMerge

  • bigWigSummary

  • bigWigToBedGraph

  • bigWigToWig

  • binFromRange

  • blastToPsl

  • blastXmlToPsl

  • blat

  • calc

  • catDir

  • catUncomment

  • chainAntiRepeat

  • chainBridge

  • chainCleaner

  • chainFilter

  • chainMergeSort

  • chainNet

  • chainPreNet

  • chainScore

  • chainSort

  • chainSplit

  • chainStitchId

  • chainSwap

  • chainToAxt

  • chainToPsl

  • chainToPslBasic

  • checkAgpAndFa

  • checkCoverageGaps

  • checkHgFindSpec

  • checkTableCoords

  • chopFaLines

  • chromGraphFromBin

  • chromGraphToBin

  • chromToUcsc

  • clusterGenes

  • clusterMatrixToBarChartBed

  • colTransform

  • countChars

  • cpg_lh

  • crTreeIndexBed

  • crTreeSearchBed

  • dbSnoop

  • dbTrash

  • endsInLf

  • estOrient

  • expMatrixToBarchartBed

  • faAlign

  • faCmp

  • faCount

  • faFilter

  • faFilterN

  • faFrag

  • faNoise

  • faOneRecord

  • faPolyASizes

  • faRandomize

  • faRc

  • faSize

  • faSomeRecords

  • faSplit

  • faToFastq

  • faToTab

  • faToTwoBit

  • faToVcf

  • faTrans

  • fastqStatsAndSubsample

  • fastqToFa

  • featureBits

  • fetchChromSizes

  • findMotif

  • fixStepToBedGraph.pl

  • gapToLift

  • genePredCheck

  • genePredFilter

  • genePredHisto

  • genePredSingleCover

  • genePredToBed

  • genePredToBigGenePred

  • genePredToFakePsl

  • genePredToGtf

  • genePredToMafFrames

  • genePredToProt

  • gensub2

  • getRna

  • getRnaPred

  • gff3ToGenePred

  • gff3ToPsl

  • gmtime

  • gtfToGenePred

  • headRest

  • hgBbiDbLink

  • hgFakeAgp

  • hgFindSpec

  • hgGcPercent

  • hgGoldGapGl

  • hgLoadBed

  • hgLoadChain

  • hgLoadGap

  • hgLoadMaf

  • hgLoadMafSummary

  • hgLoadNet

  • hgLoadOut

  • hgLoadOutJoined

  • hgLoadSqlTab

  • hgLoadWiggle

  • hgSpeciesRna

  • hgTrackDb

  • hgWiggle

  • hgsql

  • hgsqldump

  • hgvsToVcf

  • hicInfo

  • htmlCheck

  • hubCheck

  • hubClone

  • hubPublicCheck

  • ixIxx

  • lastz-1.04.00

  • lastz_D-1.04.00

  • lavToAxt

  • lavToPsl

  • ldHgGene

  • liftOver

  • liftOverMerge

  • liftUp

  • linesToRa

  • localtime

  • mafAddIRows

  • mafAddQRows

  • mafCoverage

  • mafFetch

  • mafFilter

  • mafFrag

  • mafFrags

  • mafGene

  • mafMeFirst

  • mafNoAlign

  • mafOrder

  • mafRanges

  • mafSpeciesList

  • mafSpeciesSubset

  • mafSplit

  • mafSplitPos

  • mafToAxt

  • mafToBigMaf

  • mafToPsl

  • mafToSnpBed

  • mafsInRegion

  • makeTableList

  • maskOutFa

  • matrixClusterColumns

  • matrixMarketToTsv

  • matrixNormalize

  • mktime

  • mrnaToGene

  • netChainSubset

  • netClass

  • netFilter

  • netSplit

  • netSyntenic

  • netToAxt

  • netToBed

  • newProg

  • newPythonProg

  • nibFrag

  • nibSize

  • oligoMatch

  • overlapSelect

  • para

  • paraFetch

  • paraHub

  • paraHubStop

  • paraNode

  • paraNodeStart

  • paraNodeStatus

  • paraNodeStop

  • paraSync

  • paraTestJob

  • parasol

  • positionalTblCheck

  • pslCDnaFilter

  • pslCat

  • pslCheck

  • pslDropOverlap

  • pslFilter

  • pslHisto

  • pslLiftSubrangeBlat

  • pslMap

  • pslMapPostChain

  • pslMrnaCover

  • pslPairs

  • pslPartition

  • pslPosTarget

  • pslPretty

  • pslRc

  • pslRecalcMatch

  • pslRemoveFrameShifts

  • pslReps

  • pslScore

  • pslSelect

  • pslSomeRecords

  • pslSort

  • pslSortAcc

  • pslStats

  • pslSwap

  • pslToBed

  • pslToBigPsl

  • pslToChain

  • pslToPslx

  • pslxToFa

  • qaToQac

  • qacAgpLift

  • qacToQa

  • qacToWig

  • raSqlQuery

  • raToLines

  • raToTab

  • randomLines

  • rmFaDups

  • rowsToCols

  • sizeof

  • spacedToTab

  • splitFile

  • splitFileByColumn

  • sqlToXml

  • strexCalc

  • stringify

  • subChar

  • subColumn

  • tabQuery

  • tailLines

  • tdbQuery

  • tdbRename

  • tdbSort

  • textHistogram

  • tickToDate

  • toLower

  • toUpper

  • trackDbIndexBb

  • transMapPslToGenePred

  • trfBig

  • twoBitDup

  • twoBitInfo

  • twoBitMask

  • twoBitToFa

  • ucscApiClient

  • udr

  • vai.pl

  • validateFiles

  • validateManifest

  • varStepToBedGraph.pl

  • webSync

  • wigCorrelate

  • wigEncode

  • wigToBigWig

  • wordLine

  • xmlCat

  • xmlToSql

Module

You can load the modules by:

module load biocontainers
module load ucsc_genome_toolkit/369

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run UCSC executables on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=UCSC
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers ucsc_genome_toolkit/369

blat genome.fasta input.fasta blat.out
fastqToFa input.fastq  output.fasta

Umi_tools

Introduction

Umi_tools is a collection of tools for handling Unique Molecular Identifiers in NGS data sets.

For more information, please check:

Versions

  • 1.1.4

Commands

  • umi_tools

Module

You can load the modules by:

module load biocontainers
module load umi_tools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run umi_tools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=umi_tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers umi_tools
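
The job script above stops after loading the module. As a minimal sketch of a first step, the block below moves UMIs from the reads into the read names with umi_tools extract. The file names and the barcode pattern are placeholders taken from the style of the UMI-tools quick start, not from this cluster; confirm the options with umi_tools extract --help for the version you load.

```shell
#!/bin/bash
# Hypothetical input/output names; replace them with your own data.
IN_FASTQ="example.fastq.gz"
OUT_FASTQ="processed.fastq.gz"
# N marks a UMI base (moved to the read name), X marks a base kept in the read.
BC_PATTERN="NNNXXXXNN"

if command -v umi_tools >/dev/null 2>&1 && [ -f "$IN_FASTQ" ]; then
    umi_tools extract --stdin="$IN_FASTQ" --bc-pattern="$BC_PATTERN" \
        --log=processed.log --stdout "$OUT_FASTQ"
else
    echo "umi_tools or $IN_FASTQ not found; extract step skipped"
fi
```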

Unicycler

Introduction

Unicycler is an assembly pipeline for bacterial genomes.

For more information, please check its website: https://biocontainers.pro/tools/unicycler and its home page on GitHub.

Versions

  • 0.5.0

Commands

  • unicycler

Module

You can load the modules by:

module load biocontainers
module load unicycler

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Unicycler on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=unicycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers unicycler

unicycler -t 12 -1 SRR11234553_1.fastq  -2 SRR11234553_2.fastq -o shortout

unicycler -t 12  -l SRR3982487.fastq  -o longout
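
The thread count passed with -t has to be kept in step with the #SBATCH -n value by hand. A small sketch of one alternative, assuming the job runs under Slurm (which sets SLURM_NTASKS inside a job), is to read the count from the environment:

```shell
#!/bin/bash
# SLURM_NTASKS is set by Slurm inside a job; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"
echo "Using ${THREADS} threads"

# Run only when the tool and the example read file are actually present.
if command -v unicycler >/dev/null 2>&1 && [ -f SRR3982487.fastq ]; then
    unicycler -t "${THREADS}" -l SRR3982487.fastq -o longout
fi
```

With this pattern, changing -n in the header is enough to change the number of threads the tool uses.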

Usefulaf

Introduction

Usefulaf is an all-in-one Docker/Singularity image for single-cell processing with Alevin-fry. It includes all the tools you need to turn your FASTQ files into a count matrix and then load it into your favorite analysis environment.

For more information, please check:

Versions

  • 0.9.2

Commands

  • simpleaf

  • R

  • Rscript

  • python

  • python3

Module

You can load the modules by:

module load biocontainers
module load usefulaf

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run usefulaf on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=usefulaf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers usefulaf

Vadr

Introduction

VADR is a suite of tools for classifying and analyzing sequences homologous to a set of reference models of viral genomes or gene families. It has been mainly tested for analysis of Norovirus, Dengue, and SARS-CoV-2 virus sequences in preparation for submission to the GenBank database.

For more information, please check:

Versions

  • 1.4.1

  • 1.4.2

  • 1.5

Commands

  • parse_blast.pl

  • v-annotate.pl

  • v-build.pl

  • v-test.pl

Module

You can load the modules by:

module load biocontainers
module load vadr

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vadr on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vadr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vadr

v-annotate.pl noro.9.fa va-noro.9

Vardict-java

Introduction

VarDictJava is a variant discovery program written in Java and Perl. It is a Java port of the VarDict variant caller.

For more information, please check:

Versions

  • 1.8.3

Commands

  • vardict-java

  • var2vcf_paired.pl

  • var2vcf_valid.pl

  • testsomatic.R

  • teststrandbias.R

Module

You can load the modules by:

module load biocontainers
module load vardict-java

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vardict-java on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vardict-java
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vardict-java

AF_THR="0.01" # minimum allele frequency
vardict-java -G genome.fasta \
    -f $AF_THR -N genome \
    -b input.bam \
    -c 1 -S 2 -E 3 -g 4 output.bed \
     |  teststrandbias.R \
     |  var2vcf_valid.pl \
     -N genome -E -f $AF_THR \
     > vars.vcf
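
Because the example chains three programs with pipes, bash reports only the exit status of the last one by default, so a failure in vardict-java could go unnoticed and leave an empty or truncated vars.vcf. Placing set -o pipefail before the pipeline is one common way to surface such failures. The sketch below demonstrates the effect with the stand-in commands false and cat rather than the real tools:

```shell
#!/bin/bash
# Default behaviour: the pipeline status is that of the last command (cat).
set +o pipefail
false | cat && status_default=0 || status_default=$?

# With pipefail: the pipeline fails if any command in it fails.
set -o pipefail
false | cat && status_pipefail=0 || status_pipefail=$?
set +o pipefail

echo "default=${status_default} pipefail=${status_pipefail}"
```

In a real job script, a single set -o pipefail line near the top, after the module commands, would apply to every pipeline that follows.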

Varlociraptor

Introduction

Varlociraptor implements a novel, unified fully uncertainty-aware approach to genomic variant calling in arbitrary scenarios.

For more information, please check its website: https://biocontainers.pro/tools/varlociraptor and its home page on GitHub.

Versions

  • 4.11.4

Commands

  • varlociraptor

Module

You can load the modules by:

module load biocontainers
module load varlociraptor

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Varlociraptor on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varlociraptor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers varlociraptor

varlociraptor call variants tumor-normal --purity 0.75 --tumor

Varscan

Introduction

Varscan is a tool used for variant detection in massively parallel sequencing data.

For more information, please check its home page: http://varscan.sourceforge.net/index.html.

Versions

  • 2.4.2

  • 2.4.4

Commands

  • VarScan.v2.4.4.jar

Module

You can load the modules by:

module load biocontainers
module load varscan

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Varscan on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers varscan

Vartrix

Introduction

Vartrix is a software tool for extracting single cell variant information from 10x Genomics single cell data.

For more information, please check its website: https://biocontainers.pro/tools/vartrix and its home page on GitHub.

Versions

  • 1.1.22

Commands

  • vartrix

Module

You can load the modules by:

module load biocontainers
module load vartrix

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Vartrix on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vartrix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vartrix

vartrix -v test/test.vcf -b test/test.bam \
    -f test/test.fa -c test/barcodes.tsv \
    -o output.matrix

Vatools

Introduction

VAtools is a python package that includes several tools to annotate VCF files with data from other tools.

For more information, please check:

Versions

  • 5.0.1

Commands

  • ref-transcript-mismatch-reporter

  • transform-split-values

  • vcf-expression-annotator

  • vcf-genotype-annotator

  • vcf-info-annotator

  • vcf-readcount-annotator

  • vep-annotation-reporter

Module

You can load the modules by:

module load biocontainers
module load vatools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vatools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vatools

vcf-readcount-annotator <input_vcf> <snv_bam_readcount_file> <DNA|RNA> \
            -s <sample_name> -t snv -o <snv_annotated_vcf>

Vcf2maf

Introduction

To convert a VCF into a MAF, each variant must be mapped to only one of all the possible gene transcripts/isoforms that it might affect. This selection of a single effect per variant is often subjective, so this project is an attempt to make the selection criteria smarter, reproducible, and more configurable, with default criteria that lean towards best practices.

For more information, please check:

Versions

  • 1.6.21

Commands

  • maf2maf.pl

  • maf2vcf.pl

  • vcf2maf.pl

  • vcf2vcf.pl

Module

You can load the modules by:

module load biocontainers
module load vcf2maf

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

Note

If users need to use vep, please add --vep-path /opt/conda/bin.

To run vcf2maf on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2maf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vcf2maf

vcf2maf.pl --vep-path /opt/conda/bin \
    --ref-fasta Homo_sapiens.GRCh37.dna.toplevel.fa.gz \
    --input-vcf tests/test.vcf --output-maf test.vep.maf

Vcf2phylip

Introduction

vcf2phylip is a tool to convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis.

For more information, please check:

Versions

  • 2.8

Commands

  • vcf2phylip.py

Module

You can load the modules by:

module load biocontainers
module load vcf2phylip

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vcf2phylip on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2phylip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vcf2phylip

vcf2phylip.py --input myfile.vcf

Vcf2tsvpy

Introduction

Vcf2tsvpy is a small Python program that converts genomic variant data encoded in VCF format into a tab-separated values (TSV) file.

For more information, please check:

Versions

  • 0.6.0

Commands

  • vcf2tsvpy

Module

You can load the modules by:

module load biocontainers
module load vcf2tsvpy

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vcf2tsvpy on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2tsvpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vcf2tsvpy
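
The job script above stops after loading the module. A minimal sketch of a conversion step follows, assuming the --input_vcf and --out_tsv options described in the vcf2tsvpy README and hypothetical file names; check vcf2tsvpy --help for the version you load.

```shell
#!/bin/bash
# Hypothetical file names; replace them with your own data.
INPUT_VCF="input.vcf.gz"
OUT_TSV="output.tsv"

if command -v vcf2tsvpy >/dev/null 2>&1 && [ -f "$INPUT_VCF" ]; then
    # Convert the VCF records into one tab-separated row per variant.
    vcf2tsvpy --input_vcf "$INPUT_VCF" --out_tsv "$OUT_TSV"
else
    echo "vcf2tsvpy or $INPUT_VCF not found; conversion skipped"
fi
```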

Vcf-kit

Introduction

VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files.

For more information, please check:

Versions

  • 0.2.6

  • 0.2.9

Commands

  • vk

Module

You can load the modules by:

module load biocontainers
module load vcf-kit

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vcf-kit on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf-kit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vcf-kit
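
The job script above only loads the module. As a hedged illustration, the block below runs the vk tajima subcommand with a sliding window, following the usage shown in the VCF-kit documentation (window size, step size, VCF). The file name and window sizes are placeholders; run vk tajima --help to confirm the arguments for your version.

```shell
#!/bin/bash
# Hypothetical input and window settings; adjust to your data.
VCF="input.vcf.gz"
WINDOW=100000   # window size in base pairs
STEP=10000      # step size in base pairs

if command -v vk >/dev/null 2>&1 && [ -f "$VCF" ]; then
    # Tajima's D in sliding windows, written to a tab-separated file.
    vk tajima "$WINDOW" "$STEP" "$VCF" > tajima.tsv
else
    echo "vk or $VCF not found; Tajima's D step skipped"
fi
```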

VCFtools

Introduction

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.

For more information, please check its website: https://biocontainers.pro/tools/vcftools and its home page on GitHub.

Versions

  • 0.1.16

Commands

  • vcftools

Module

You can load the modules by:

module load biocontainers
module load vcftools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run VCFtools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vcftools

vcftools --vcf input_data.vcf --chr 1 \
    --from-bp 1000000 --to-bp 2000000

Velocyto.py

Introduction

Velocyto.py is a library for the analysis of RNA velocity.

Detailed information about velocyto.py can be found here: https://github.com/velocyto-team/velocyto.py.

Versions

  • 0.17.17

Commands

  • python

  • python3

  • velocyto

Module

You can load the modules by:

module load biocontainers
module load velocyto.py/0.17.17-py39

Interactive job

To run Velocyto.py interactively on our clusters:

(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers velocyto.py/0.17.17-py39
(base) UserID@bell-a008:~ $ python
Python 3.9.10 |  packaged by conda-forge |  (main, Feb  1 2022, 21:24:11)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import velocyto as vcy
>>> vlm = vcy.VelocytoLoom("YourData.loom")
>>> vlm.normalize("S", size=True, log=True)
>>> vlm.S_norm  # contains log normalized

Batch job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To submit a sbatch job on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=Velocyto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers velocyto.py/0.17.17-py39

velocyto run10x cellranger_count_1kpbmcs_out refdata-gex-GRCh38-2020-A/genes/genes.gtf

Velvet

Introduction

Velvet is a sequence assembler for very short reads.

For more information, please check its website: https://biocontainers.pro/tools/velvet and its home page: https://www.ebi.ac.uk/~zerbino/velvet/.

Versions

  • 1.2.10

Commands

  • velveth

  • velvetg

Module

You can load the modules by:

module load biocontainers
module load velvet

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Velvet on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=velvet
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers velvet

velveth output_directory 21 -fasta -short solexa1.fa solexa2.fa solexa3.fa -long capillary.fa
velvetg output_directory -cov_cutoff 4

Veryfasttree

Introduction

VeryFastTree is a highly-tuned implementation of the FastTree-2 tool that takes advantage of parallelization and vectorization strategies to speed up the inference of phylogenies for huge alignments. VeryFastTree keeps the phases, methods and heuristics that FastTree-2 uses to estimate the phylogenetic tree unchanged, so it produces trees with the same topological accuracy as FastTree-2. In addition, unlike the parallel version of FastTree-2, VeryFastTree is deterministic.

For more information, please check:

Versions

  • 3.2.1

Commands

  • VeryFastTree

Module

You can load the modules by:

module load biocontainers
module load veryfasttree

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run veryfasttree on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=veryfasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers veryfasttree
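
The job script above only loads the module. A minimal sketch of a tree-building step follows, assuming a nucleotide alignment in a hypothetical file alignment.fasta. The -nt and -gtr options are inherited from FastTree-2 and -threads is the VeryFastTree option for the thread count; confirm them with VeryFastTree -help. To use more than one thread, raise #SBATCH -n accordingly.

```shell
#!/bin/bash
# Hypothetical alignment file; replace it with your own data.
ALIGNMENT="alignment.fasta"
# SLURM_NTASKS is set by Slurm inside a job; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

if command -v VeryFastTree >/dev/null 2>&1 && [ -f "$ALIGNMENT" ]; then
    # Nucleotide alignment, GTR model, Newick tree written to tree.nwk.
    VeryFastTree -nt -gtr -threads "$THREADS" "$ALIGNMENT" > tree.nwk
else
    echo "VeryFastTree or $ALIGNMENT not found; tree inference skipped"
fi
```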

Vg

Introduction

Variation graphs (vg) provides tools for working with genome variation graphs.

For more information, please check:

Quay.io: https://quay.io/repository/vgteam/vg?tab=info | Home page: https://github.com/vgteam/vg

Versions

  • 1.40.0

Commands

  • vg

Module

You can load the modules by:

module load biocontainers
module load vg

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run vg on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vg

vg construct -r test/small/x.fa -v test/small/x.vcf.gz >x.vg

# GFA output
vg view x.vg >x.gfa

# dot output suitable for graphviz
vg view -d x.vg >x.dot

# And if you have a GAM file
cp test/small/x-s1337-n1.gam x.gam

# json version of binary alignments
vg view -a x.gam >x.json

vg align -s CTACTGACAGCAGAAGTTTGCTGTGAAGATTAAATTAGGTGATGCTTG x.vg

Viennarna

Introduction

Viennarna is a set of standalone programs and libraries used for prediction and analysis of RNA secondary structures.

For more information, please check its website: https://biocontainers.pro/tools/viennarna and its home page: https://www.tbi.univie.ac.at/RNA/.

Versions

  • 2.5.0

Commands

  • RNA2Dfold

  • RNALalifold

  • RNALfold

  • RNAPKplex

  • RNAaliduplex

  • RNAalifold

  • RNAcofold

  • RNAdistance

  • RNAdos

  • RNAduplex

  • RNAeval

  • RNAfold

  • RNAforester

  • RNAheat

  • RNAinverse

  • RNAlocmin

  • RNAmultifold

  • RNApaln

  • RNAparconv

  • RNApdist

  • RNAplex

  • RNAplfold

  • RNAplot

  • RNApvmin

  • RNAsnoop

  • RNAsubopt

  • RNAup

  • Kinfold

  • b2ct

  • popt

Module

You can load the modules by:

module load biocontainers
module load viennarna

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Viennarna on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=viennarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers viennarna

RNAfold < test.seq
RNAfold -p --MEA < test.seq

Vsearch

Introduction

Vsearch is a versatile open source tool for metagenomics.

For more information, please check its website: https://biocontainers.pro/tools/vsearch and its home page on GitHub.

Versions

  • 2.19.0

  • 2.21.1

  • 2.22.1

Commands

  • vsearch

Module

You can load the modules by:

module load biocontainers
module load vsearch

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Vsearch on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vsearch
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers vsearch

vsearch -sintax SRR8723605_merged.fasta -db rdp_16s_v16_sp.fa \
    -tabbedout SRR8723605_out.txt -strand both -sintax_cutoff 0.5

Whatshap

Introduction

Whatshap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.

For more information, please check:

Versions

  • 1.4

Commands

  • whatshap

Module

You can load the modules by:

module load biocontainers
module load whatshap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run whatshap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=whatshap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers whatshap

whatshap phase --indels \
    --reference=reference.fasta \
    -o phased.vcf \
    variants.vcf pacbio.bam

Wiggletools

Introduction

The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon’s rank sum test, etc).

For more information, please check:

Versions

  • 1.2.11

Commands

  • wiggletools

Module

You can load the modules by:

module load biocontainers
module load wiggletools

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run wiggletools on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=wiggletools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers wiggletools

wiggletools test/fixedStep.wig
wiggletools test/fixedStep.bw
wiggletools test/bedfile.bg
wiggletools test/overlapping.bed
wiggletools test/bam.bam
wiggletools test/cram.cram
wiggletools test/vcf.vcf
wiggletools test/bcf.bcf

Winnowmap

Introduction

Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences.

For more information, please check:

Versions

  • 2.03

Commands

  • winnowmap

Module

You can load the modules by:

module load biocontainers
module load winnowmap

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run winnowmap on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=winnowmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers winnowmap

winnowmap -W repetitive_k15.txt \
    -ax map-pb Cm.contigs.fasta \
    SRR3982487.fastq > output.sam

Wtdbg2

Introduction

Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).

For more information, please check its website: https://biocontainers.pro/tools/wtdbg and its home page on GitHub.

Versions

  • 2.5

Commands

  • wtdbg-cns

  • wtdbg2

  • wtpoa-cns

Module

You can load the modules by:

module load biocontainers
module load wtdbg

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run Wtdbg2 on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=wtdbg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml biocontainers wtdbg

wtpoa-cns -t 24 -i dbg.ctg.lay.gz -fo dbg.ctg.fa

NVIDIA NGC containers

autodock

Description

The AutoDock Suite is a growing collection of methods for computational docking and virtual screening, for use in structure-based drug discovery and exploration of the basic mechanisms of biomolecular structure and function.

Versions

  • 2020.06

Module

You can load the modules by:

module load ngc
module load autodock

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run autodock on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=autodock
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc autodock

gamess

Description

The General Atomic and Molecular Electronic Structure System (GAMESS) program simulates molecular quantum chemistry, allowing users to calculate various molecular properties and dynamics.

Versions

  • 17.09-r2-libcchem

Module

You can load the modules by:

module load ngc
module load gamess

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gamess on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=gamess
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc gamess

gromacs

Description

GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.

Versions

  • 2018.2

  • 2020.2

  • 2021

  • 2021.3

Module

You can load the modules by:

module load ngc
module load gromacs

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run gromacs on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=gromacs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc gromacs
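
The script above requests 8 cores per task (-c 8) and one GPU, but stops after loading the module. As a sketch of how the requested core count might be handed to the simulation, the helper below prints an mdrun command line. It assumes that the module exposes the gmx executable, that the build uses thread-MPI (so -ntmpi applies), and that a run input file topol.tpr already exists; mdrun_command and topol are hypothetical names.

```shell
# mdrun_command NAME: print a single-rank mdrun command line with GPU offload
# of nonbonded interactions. The OpenMP thread count follows Slurm's -c value
# (SLURM_CPUS_PER_TASK) and falls back to 1 outside a job.
mdrun_command() {
    threads="${SLURM_CPUS_PER_TASK:-1}"
    echo "gmx mdrun -ntmpi 1 -ntomp ${threads} -nb gpu -deffnm $1"
}
```

Inside the job script, the printed line can be reviewed first and then executed, for example with: $(mdrun_command topol)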

julia

Description

The Julia programming language is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.

Versions

  • v1.5.0

  • v2.4.2

Module

You can load the modules by:

module load ngc
module load julia

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run julia on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=julia
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc julia

lammps

Description

Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a software application designed for molecular dynamics simulations. It has potentials for solid-state materials (metals, semiconductors), soft matter (biomolecules, polymers), and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.

Versions

  • 10Feb2021

  • 15Jun2020

  • 24Oct2018

  • 29Oct2020

Module

You can load the modules by:

module load ngc
module load lammps

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run lammps on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=lammps
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc lammps
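
The LAMMPS versions listed above are date stamps, and the list is in alphabetical rather than chronological order. If the most recent release is wanted explicitly, the tags can be sorted by date. The sketch below assumes every tag has the form DDMonYYYY with a two-digit day, as all four tags above do; the function names are hypothetical.

```shell
# lammps_sort_key TAG: turn a DDMonYYYY tag (e.g. 29Oct2020) into YYYYMMDD.
# Assumes a two-digit day and an English three-letter month abbreviation.
lammps_sort_key() {
    day=$(printf '%s' "$1" | cut -c1-2)
    mon=$(printf '%s' "$1" | cut -c3-5)
    year=$(printf '%s' "$1" | cut -c6-9)
    case "$mon" in
        Jan) m=01 ;; Feb) m=02 ;; Mar) m=03 ;; Apr) m=04 ;;
        May) m=05 ;; Jun) m=06 ;; Jul) m=07 ;; Aug) m=08 ;;
        Sep) m=09 ;; Oct) m=10 ;; Nov) m=11 ;; Dec) m=12 ;;
        *) return 1 ;;
    esac
    echo "${year}${m}${day}"
}

# newest_lammps TAG...: print the most recent of the given tags.
# Tags that cannot be parsed are skipped.
newest_lammps() {
    for tag in "$@"; do
        key=$(lammps_sort_key "$tag") || continue
        echo "$key $tag"
    done | sort -n | tail -n 1 | cut -d' ' -f2
}
```

The result can be passed to the module command in the usual name/version form, for example: ml ngc lammps/"$(newest_lammps 10Feb2021 15Jun2020 24Oct2018 29Oct2020)"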

namd

Description

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.

Versions

  • 2.13-multinode

  • 2.13-singlenode

  • 3.0-alpha3-singlenode

Module

You can load the modules by:

module load ngc
module load namd

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run namd on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=namd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc namd
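
NAMD 2.13 is provided in singlenode and multinode variants. Assuming, from the names alone, that the multinode variant is the one intended for jobs spanning more than one node, the choice can be derived from the node count Slurm reports. This is a sketch under that assumption; namd_module is a hypothetical helper.

```shell
# namd_module: print the NAMD 2.13 module matching the job's node count.
# SLURM_JOB_NUM_NODES is set by Slurm inside a job; default to 1 elsewhere.
namd_module() {
    nodes="${SLURM_JOB_NUM_NODES:-1}"
    if [ "$nodes" -gt 1 ]; then
        echo "namd/2.13-multinode"
    else
        echo "namd/2.13-singlenode"
    fi
}
```

In the job script, the last line could then read: ml ngc "$(namd_module)"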

nvhpc

Description

The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC® directives, and CUDA®. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming.

Versions

  • 20.7

  • 20.9

  • 20.11

  • 21.5

  • 21.9

Module

You can load the modules by:

module load ngc
module load nvhpc

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run nvhpc on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=nvhpc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc nvhpc
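
The HPC SDK ships separate compiler drivers for C (nvc), C++ (nvc++), and Fortran (nvfortran). As an illustration of building an OpenACC program after the module is loaded, the sketch below picks the driver from the source file extension and prints the compile line. The helper name and the example file saxpy.c are hypothetical; -acc enables OpenACC and -Minfo=accel reports what the compiler offloaded.

```shell
# nvhpc_compile_command SRC: print an OpenACC compile line for SRC,
# choosing the NVIDIA HPC SDK compiler driver from the file extension.
nvhpc_compile_command() {
    src="$1"
    case "$src" in
        *.c)                 compiler=nvc ;;
        *.cpp|*.cc|*.cxx)    compiler=nvc++ ;;
        *.f90|*.F90|*.f|*.F) compiler=nvfortran ;;
        *) echo "unsupported source file: $src" >&2; return 1 ;;
    esac
    # The executable is named after the source file without its extension.
    echo "$compiler -acc -Minfo=accel -o ${src%.*} $src"
}
```

For example, nvhpc_compile_command saxpy.c prints a line that can be run directly in the job script with: $(nvhpc_compile_command saxpy.c)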

parabricks

Description

NVIDIA's Clara Parabricks brings next-generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google's DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic, and RNA workflows.

Versions

  • 4.0.0-1

Module

You can load the modules by:

module load ngc
module load parabricks

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run parabricks on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=parabricks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc parabricks
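
Parabricks tools are invoked through the pbrun command, with fq2bam as the GPU counterpart of BWA-MEM alignment. Since a GPU job that fails on a missing file wastes queue time, the sketch below checks the inputs before printing the command line. The helper name and all file names are hypothetical, and the reference is assumed to have been indexed for BWA beforehand.

```shell
# fq2bam_command REF FQ1 FQ2 OUT: verify that the reference and both FASTQ
# files exist, then print the corresponding pbrun fq2bam command line.
fq2bam_command() {
    for f in "$1" "$2" "$3"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            return 1
        fi
    done
    echo "pbrun fq2bam --ref $1 --in-fq $2 $3 --out-bam $4"
}
```

In a job script this might be used as: cmd=$(fq2bam_command ref.fa r1.fq.gz r2.fq.gz out.bam) && $cmd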

paraview

Description

This container does not include the ParaView client GUI, but the ParaView Web application is included.

Versions

  • 5.9.0

Module

You can load the modules by:

module load ngc
module load paraview

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run paraview on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=paraview
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc paraview

pytorch

Description

PyTorch is a GPU-accelerated tensor computational framework with a Python front end. Functionality can be easily extended with common Python libraries such as NumPy, SciPy, and Cython. Automatic differentiation is done with a tape-based system at both a functional and neural network layer level. This functionality brings a high level of flexibility and speed as a deep learning framework and provides accelerated NumPy-like functionality.

Versions

  • 20.02-py3

  • 20.03-py3

  • 20.06-py3

  • 20.11-py3

  • 20.12-py3

  • 21.06-py3

  • 21.09-py3

Module

You can load the modules by:

module load ngc
module load pytorch

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run pytorch on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=pytorch
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc pytorch
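
The script above requests one GPU per node. Before starting a long training run, it can be worth confirming that a GPU was actually allocated. The sketch below assumes that Slurm exports CUDA_VISIBLE_DEVICES as a comma-separated list of device indices for GPU jobs, which depends on site configuration; visible_gpu_count is a hypothetical helper.

```shell
# visible_gpu_count: print how many devices CUDA_VISIBLE_DEVICES lists.
# Prints 0 when the variable is unset or empty.
visible_gpu_count() {
    if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo 0
        return
    fi
    # One line per comma-separated entry; strip padding some wc versions add.
    echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l | tr -d ' '
}
```

A job script could stop early when nothing is listed, for example: [ "$(visible_gpu_count)" -ge 1 ] || exit 1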

qmcpack

Description

QMCPACK is an open-source, high-performance electronic structure code that implements numerous Quantum Monte Carlo (QMC) algorithms. Its main applications are electronic structure calculations of molecular, periodic 2D, and periodic 3D solid-state systems. Variational Monte Carlo (VMC), diffusion Monte Carlo (DMC), and a number of other advanced QMC algorithms are implemented. By directly solving the Schrödinger equation, QMC methods offer greater accuracy than methods such as density functional theory, but at a trade-off of much greater computational expense. Distinct from many other correlated many-body methods, QMC methods are readily applicable to both bulk periodic and isolated molecular systems.

Versions

  • v3.5.0

Module

You can load the modules by:

module load ngc
module load qmcpack

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run qmcpack on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=qmcpack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc qmcpack

quantum_espresso

Description

Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale based on density-functional theory, plane waves, and pseudopotentials.

Versions

  • v6.6a1

  • v6.7

Module

You can load the modules by:

module load ngc
module load quantum_espresso

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run quantum_espresso on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=quantum_espresso
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc quantum_espresso

rapidsai

Description

The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Versions

  • 0.12

  • 0.13

  • 0.14

  • 0.15

  • 0.16

  • 0.17

  • 21.06

  • 21.10

Module

You can load the modules by:

module load ngc
module load rapidsai

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run rapidsai on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=rapidsai
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc rapidsai

relion

Description

RELION (for REgularized LIkelihood OptimizatioN) implements an empirical Bayesian approach for analysis of electron cryo-microscopy (Cryo-EM). Specifically, it provides methods for refinement of single or multiple 3D reconstructions as well as 2D class averages. RELION is an important tool in the study of living cells.

Versions

  • 2.1.b1

  • 3.1.0

  • 3.1.2

  • 3.1.3

Module

You can load the modules by:

module load ngc
module load relion

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run relion on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=relion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc relion

tensorflow

Description

TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.

Versions

  • 20.02-tf1-py3

  • 20.02-tf2-py3

  • 20.03-tf1-py3

  • 20.03-tf2-py3

  • 20.06-tf1-py3

  • 20.06-tf2-py3

  • 20.11-tf1-py3

  • 20.11-tf2-py3

  • 20.12-tf1-py3

  • 20.12-tf2-py3

  • 21.06-tf1-py3

  • 21.06-tf2-py3

  • 21.09-tf1-py3

  • 21.09-tf2-py3

Module

You can load the modules by:

module load ngc
module load tensorflow

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run tensorflow on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=tensorflow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc tensorflow
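
Every TensorFlow release listed above comes in a tf1 and a tf2 flavor, so a plain module load tensorflow leaves the choice to the module system's default. To pin both the release and the TensorFlow major version, the full name/version form can be assembled from the tag pattern in the list. A small sketch, with tf_module as a hypothetical helper:

```shell
# tf_module RELEASE MAJOR: print the module name for a container release
# (e.g. 21.09) and a TensorFlow major version (1 or 2), following the
# RELEASE-tfMAJOR-py3 pattern of the tags in the version list.
tf_module() {
    case "$2" in
        1|2) echo "tensorflow/$1-tf$2-py3" ;;
        *) echo "TensorFlow major version must be 1 or 2" >&2; return 1 ;;
    esac
}
```

For example, ml ngc "$(tf_module 21.09 2)" would request the 21.09-tf2-py3 container. The helper does not check that the release itself is installed.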

torchani

Description

TorchANI is a PyTorch-based program for training/inference of ANI (ANAKIN-ME) deep learning models to obtain potential energy surfaces and other physical properties of molecular systems.

Versions

  • 2021.04

Module

You can load the modules by:

module load ngc
module load torchani

Example job

Warning

Using #!/bin/sh -l as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.

To run torchani on our clusters:

#!/bin/bash
#SBATCH -A myallocation     # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=torchani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out

module --force purge
ml ngc torchani