Scientific Applications on ACCESS Anvil¶
This is the list of applications, compilers, MPIs, NVIDIA NGC containers, and biocontainers deployed on ACCESS Anvil, which is managed by the Rosen Center for Advanced Computing (RCAC) at Purdue University.
Compilers¶
aocc¶
Description¶
The AOCC compiler system is a high performance, production quality code generation tool. The AOCC environment provides various options to developers when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux® platforms.
Versions¶
3.1.0
Module¶
You can load the modules by:
module load aocc
gcc¶
Description¶
The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Ada, and Go, as well as libraries for these languages.
Versions¶
8.4.1
10.2.0
11.2.0
11.2.0-openacc
Module¶
You can load the modules by:
module load gcc
intel¶
Description¶
Intel Parallel Studio.
Versions¶
19.0.5.281
Module¶
You can load the modules by:
module load intel
MPIs¶
impi¶
Description¶
Intel MPI
Versions¶
2019.5.281
Module¶
You can load the modules by:
module load impi
mvapich2¶
Description¶
MVAPICH2 is a high-performance MPI library for clusters with diverse networks (InfiniBand, Omni-Path, Ethernet/iWARP, and RoCE) and computing platforms (x86 Intel and AMD, ARM, and OpenPOWER).
Versions¶
2.3.6
Module¶
You can load the modules by:
module load mvapich2
openmpi¶
Description¶
An open source Message Passing Interface implementation.
Versions¶
3.1.6
4.0.6
4.0.6-cu11.0.3
Module¶
You can load the modules by:
module load openmpi
Applications¶
AMD¶
amdblis¶
Description¶
AMD Optimized BLIS. BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries.
Versions¶
3.0
Module¶
You can load the modules by:
module load amdblis
amdfftw¶
Description¶
FFTW (AMD Optimized version) is a comprehensive collection of fast C routines for computing the Discrete Fourier Transform (DFT) and various special cases thereof.
Versions¶
3.0
Module¶
You can load the modules by:
module load amdfftw
amdlibflame¶
Description¶
libFLAME (AMD Optimized version) is a portable library for dense matrix computations, providing much of the functionality present in the Linear Algebra Package (LAPACK). It includes a compatibility layer, FLAPACK, which includes a complete LAPACK implementation.
Versions¶
3.0
Module¶
You can load the modules by:
module load amdlibflame
amdlibm¶
Description¶
AMD LibM is a software library containing a collection of basic math functions optimized for x86-64 processor-based machines. It provides many routines from the list of standard C99 math functions. Applications can link against the AMD LibM library and invoke its math functions instead of the compiler's math functions for better accuracy and performance.
Versions¶
3.0
Module¶
You can load the modules by:
module load amdlibm
amdscalapack¶
Description¶
ScaLAPACK is a library of high-performance linear algebra routines for parallel distributed memory machines. It depends on external libraries including BLAS and LAPACK for Linear Algebra computations.
Versions¶
3.0
Module¶
You can load the modules by:
module load amdscalapack
Audio/Visualization¶
ffmpeg¶
Description¶
FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.
Versions¶
4.2.2
Module¶
You can load the modules by:
module load ffmpeg
gmt¶
Description¶
GMT (Generic Mapping Tools) is an open source collection of about 80 command-line tools for manipulating geographic and Cartesian data sets (including filtering, trend fitting, gridding, projecting, etc.) and producing PostScript illustrations ranging from simple x-y plots via contour maps to artificially illuminated surfaces and 3D perspective views.
Versions¶
6.1.0
Module¶
You can load the modules by:
module load gmt
gnuplot¶
Description¶
Gnuplot is a portable command-line driven graphing utility for Linux, OS/2, MS Windows, OSX, VMS, and many other platforms. The source code is copyrighted but freely distributed (i.e., you don't have to pay for it). It was originally created to allow scientists and students to visualize mathematical functions and data interactively, but has grown to support many non-interactive uses such as web scripting. It is also used as a plotting engine by third-party applications like Octave. Gnuplot has been supported and under active development since 1986.
Versions¶
5.4.2
Module¶
You can load the modules by:
module load gnuplot
paraview¶
Description¶
ParaView is an open-source, multi-platform data analysis and visualization application.
Versions¶
5.9.1
5.10.1
Module¶
You can load the modules by:
module load paraview
visit¶
Description¶
VisIt is an open source, interactive, scalable visualization, animation, and analysis tool.
Versions¶
3.1.4
Module¶
You can load the modules by:
module load visit
vlc¶
Description¶
VLC is a free and open source multimedia player for most multimedia formats.
Versions¶
3.0.9.2
Module¶
You can load the modules by:
module load vlc
vtk¶
Description¶
The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization.
Versions¶
9.0.0
Module¶
You can load the modules by:
module load vtk
Bioinformatics¶
bamtools¶
Description¶
C++ API & command-line toolkit for working with BAM data.
Versions¶
2.5.2
Module¶
You can load the modules by:
module load bamtools
Example job¶
To run bamtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=bamtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module load bamtools
bamtools convert -format fastq -in in.bam -out out.fastq
beagle¶
Description¶
Beagle is a software package for phasing genotypes and for imputing ungenotyped markers.
Versions¶
5.1
Module¶
You can load the modules by:
module load beagle
beast2¶
Description¶
BEAST is a cross-platform program for Bayesian inference using MCMC of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology.
Versions¶
2.6.4
Module¶
You can load the modules by:
module load beast2
bismark¶
Description¶
A tool to map bisulfite converted sequence reads and determine cytosine methylation states
Versions¶
0.23.0
Module¶
You can load the modules by:
module load bismark
blast-plus¶
Description¶
Basic Local Alignment Search Tool. BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.
Versions¶
2.12.0
Module¶
You can load the modules by:
module load blast-plus
BLAST Databases¶
Local copies of the BLAST databases can be found in the directory /anvil/datasets/ncbi/blast/latest. The environment variable BLASTDB is also set to /anvil/datasets/ncbi/blast/latest. To use the cdd_delta, env_nr, env_nt, nr, nt, pataa, patnt, pdbnt, refseq_protein, refseq_rna, swissprot, or tsa_nt databases, you do not need to provide the database path; just pass the database name, for example -db nr.
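As a minimal sketch of the rule above, the helper below decides what to pass to -db. The name blast_db_arg is hypothetical (it is not part of BLAST), the database list is copied from the paragraph above, and the example only prints a command line rather than running BLAST.

```shell
#!/bin/bash
# Databases listed above as available through $BLASTDB on Anvil.
PREINSTALLED_DBS="cdd_delta env_nr env_nt nr nt pataa patnt pdbnt refseq_protein refseq_rna swissprot tsa_nt"

# blast_db_arg NAME [CUSTOM_PATH]  (hypothetical helper)
# Prints the value to pass to -db: the bare name for a pre-installed
# database, otherwise the user-supplied path.
blast_db_arg() {
    local name="$1" custom_path="${2:-}"
    local db
    for db in $PREINSTALLED_DBS; do
        if [ "$db" = "$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    if [ -n "$custom_path" ]; then
        printf '%s\n' "$custom_path"
        return 0
    fi
    echo "error: '$name' is not pre-installed; supply a database path" >&2
    return 1
}

# Print (do not run) a blastp command line for the pre-installed nr database.
echo "blastp -query protein.fasta -db $(blast_db_arg nr) -out test_out"
```

For a database that is not in the list, a second argument with the path to your own database files would be needed.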
Example job¶
To run BLAST on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=blast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module load blast-plus
blastp -query protein.fasta -db nr -out test_out -num_threads 4
bowtie2¶
Description¶
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences
Versions¶
2.4.2
Module¶
You can load the modules by:
module load bowtie2
bwa¶
Description¶
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.
Versions¶
0.7.17
Module¶
You can load the modules by:
module load bwa
cufflinks¶
Description¶
Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples.
Versions¶
2.2.1
Module¶
You can load the modules by:
module load cufflinks
cutadapt¶
Description¶
Cutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
Versions¶
2.10
Module¶
You can load the modules by:
module load cutadapt
fastqc¶
Description¶
A quality control tool for high throughput sequence data.
Versions¶
0.11.9
Module¶
You can load the modules by:
module load fastqc
fasttree¶
Description¶
FastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
Versions¶
2.1.10
Module¶
You can load the modules by:
module load fasttree
fastx-toolkit¶
Description¶
The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
Versions¶
0.0.14
Module¶
You can load the modules by:
module load fastx-toolkit
gatk¶
Description¶
Genome Analysis Toolkit: variant discovery in high-throughput sequencing data.
Versions¶
4.1.8.1
Module¶
You can load the modules by:
module load gatk
Example job¶
To run gatk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -p PartitionName
#SBATCH --job-name=gatk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module load gatk
gatk --java-options "-Xmx12G -XX:ParallelGCThreads=24" HaplotypeCaller -R hg38.fa -I 19P0126636WES.sorted.bam -O 19P0126636WES.HC.vcf --sample-name 19P0126636
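The job above hard-codes ParallelGCThreads=24 to match "#SBATCH -n 24". As a hedged sketch, the Java options can instead be built from the task count that Slurm exports, so the two values cannot drift apart. The helper name gatk_java_opts is hypothetical (it is not part of GATK), and the 12 GB heap is simply the value used in the example.

```shell
#!/bin/bash
# gatk_java_opts HEAP_GB GC_THREADS  (hypothetical helper, not part of GATK)
# Builds the string passed to gatk --java-options.
gatk_java_opts() {
    printf '%s\n' "-Xmx${1}G -XX:ParallelGCThreads=${2}"
}

# SLURM_NTASKS is set by Slurm inside a job; 24 is a fallback that
# matches "#SBATCH -n 24" in the job script above.
JAVA_OPTS="$(gatk_java_opts 12 "${SLURM_NTASKS:-24}")"
echo "$JAVA_OPTS"
```

The resulting string can then be passed as gatk --java-options "$JAVA_OPTS" in place of the literal string in the job script.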
htseq¶
Description¶
HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays.
Versions¶
0.11.2
Module¶
You can load the modules by:
module load htseq
mrbayes¶
Description¶
MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.
Versions¶
3.2.7a
Module¶
You can load the modules by:
module load mrbayes
nf-core¶
Description¶
A community effort to collect a curated set of analysis pipelines built using Nextflow and tools to run the pipelines.
Versions¶
2.7.2
2.8
Module¶
You can load the modules by:
module load nf-core
perl-bioperl¶
Description¶
BioPerl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects. These objects not only do what they are advertised to do in the documentation, but they also interact - Alignment objects are made from the Sequence objects, Sequence objects have access to Annotation and SeqFeature objects and databases, Blast objects can be converted to Alignment objects, and so on. This means that the objects provide a coordinated and extensible framework to do computational biology.
Versions¶
1.7.6
Module¶
You can load the modules by:
module load perl-bioperl
picard¶
Description¶
Picard is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Versions¶
2.25.7
Module¶
You can load the modules by:
module load picard
samtools¶
Description¶
SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format
Versions¶
1.12
Module¶
You can load the modules by:
module load samtools
sratoolkit¶
Description¶
The NCBI SRA Toolkit enables reading (“dumping”) of sequencing files from the SRA database and writing (“loading”) files into the .sra format.
Versions¶
2.10.9
Module¶
You can load the modules by:
module load sratoolkit
tophat¶
Description¶
Spliced read mapper for RNA-Seq.
Versions¶
2.1.2
Module¶
You can load the modules by:
module load tophat
trimmomatic¶
Description¶
A flexible read trimming tool for Illumina NGS data.
Versions¶
0.39
Module¶
You can load the modules by:
module load trimmomatic
vcftools¶
Description¶
VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
Versions¶
0.1.14
Module¶
You can load the modules by:
module load vcftools
Climate¶
cdo¶
Description¶
CDO is a collection of command line Operators to manipulate and analyse Climate and NWP model Data.
Versions¶
1.9.9
Module¶
You can load the modules by:
module load cdo
ncl¶
Description¶
NCL is an interpreted language designed specifically for scientific data analysis and visualization. It supports NetCDF 3/4, GRIB 1/2, HDF 4/5, HDF-EOS 2/5, shapefile, ASCII, and binary formats. Numerous analysis functions are built in.
Versions¶
6.4.0
Module¶
You can load the modules by:
module load ncl
Computational chemistry¶
amber¶
Description¶
AMBER (Assisted Model Building with Energy Refinement) is a package of molecular simulation programs.
Versions¶
20
Module¶
You can load the modules by:
module load amber
cp2k¶
Description¶
CP2K is a quantum chemistry and solid state physics software package that can perform atomistic simulations of solid state, liquid, molecular, periodic, material, crystal, and biological systems
Versions¶
8.2
Module¶
You can load the modules by:
module load cp2k
gromacs¶
Description¶
GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of the University of Groningen, and is now maintained by contributors in universities and research centers across the world.
Versions¶
2021.2
Module¶
You can check the available GROMACS versions by:
module spider gromacs
You can check how to load a specific GROMACS module by using the module’s full name:
module spider gromacs/XXXX
- Note: RCAC has also installed some containerized GROMACS modules.
To use these containerized modules, please follow the instructions in the output of “module spider gromacs/XXXX”.
You can load the modules by:
module load gromacs # for default version
module load gromacs/XXXX # for specific version
Usage¶
The GROMACS executable is gmx_mpi, and you can use gmx help commands for help on a command.
For more details about how to run GROMACS, please check the GROMACS documentation.
Example job¶
#!/bin/bash
# FILENAME: myjobsubmissionfile
#SBATCH --nodes=2 # Total # of nodes
#SBATCH --ntasks=256 # Total # of MPI tasks
#SBATCH --time=1:30:00 # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname # Job name
#SBATCH -o myjob.o%j # Name of stdout output file
#SBATCH -e myjob.e%j # Name of stderr error file
# Manage processing environment, load compilers and applications.
module purge
module load gcc/XXXX openmpi/XXXX # or module load intel/XXXX impi/XXXX | depends on the output of "module spider gromacs/XXXX"
module load gromacs/XXXX
module list
# Launch MPI code
gmx_mpi pdb2gmx -f my.pdb -o my_processed.gro -water spce
gmx_mpi grompp -f my.mdp -c my_processed.gro -p topol.top -o topol.tpr
srun -n $SLURM_NTASKS gmx_mpi mdrun -s topol.tpr
Note¶
Using mpirun -np $SLURM_NTASKS gmx_mpi or mpiexec -np $SLURM_NTASKS gmx_mpi may not work for non-exclusive jobs on some clusters. Use srun -n $SLURM_NTASKS gmx_mpi or mpirun gmx_mpi instead. Without an explicit rank count, mpirun gmx_mpi automatically picks up the number of ranks from SLURM_NTASKS and works fine.
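The launcher advice in this note can be sketched as a small helper that prefers srun with an explicit task count and otherwise falls back to plain mpirun with no -np flag. The function name pick_launcher is hypothetical, and the example only prints the resulting command line rather than running it.

```shell
#!/bin/bash
# pick_launcher  (hypothetical helper)
# Prints the launcher prefix recommended in the note above:
#   - "srun -n $SLURM_NTASKS" when running inside a Slurm job with srun available
#   - "mpirun" (no -np) otherwise, letting mpirun pick up the rank count itself
pick_launcher() {
    if [ -n "${SLURM_NTASKS:-}" ] && command -v srun >/dev/null 2>&1; then
        echo "srun -n $SLURM_NTASKS"
    else
        echo "mpirun"
    fi
}

# Print (do not run) the mdrun command line from the example job.
echo "$(pick_launcher) gmx_mpi mdrun -s topol.tpr"
```

Neither branch ever produces mpirun -np or mpiexec -np, which are the forms the note warns about.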
lammps¶
Description¶
LAMMPS is a classical molecular dynamics code with a focus on materials modelling. It’s an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.
LAMMPS has potentials for solid-state materials (metals, semiconductors) and soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
Versions¶
20210310
20210310-kokkos
Module¶
You can check the available LAMMPS versions by:
module spider lammps
You can check how to load a specific LAMMPS module by using the module’s full name:
module spider lammps/XXXX
You can load the modules by:
module load lammps # for default version
module load lammps/XXXX # for specific version
Usage¶
LAMMPS reads commands from an input file such as “in.file”. The LAMMPS executable is lmp. To run a LAMMPS input file, use the -in option:
lmp -in in.file
For more details about how to run LAMMPS, please check the LAMMPS documentation.
Example job¶
#!/bin/bash
# FILENAME: myjobsubmissionfile
#SBATCH --nodes=2 # Total # of nodes
#SBATCH --ntasks=256 # Total # of MPI tasks
#SBATCH --time=1:30:00 # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname # Job name
#SBATCH -o myjob.o%j # Name of stdout output file
#SBATCH -e myjob.e%j # Name of stderr error file
# Manage processing environment, load compilers and applications.
module purge
module load gcc/XXXX openmpi/XXXX # or module load intel/XXXX impi/XXXX | depends on the output of "module spider lammps/XXXX"
module load lammps/XXXX
module list
# Launch MPI code
srun -n $SLURM_NTASKS lmp -in in.file
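The header of this job requests 256 MPI tasks across 2 nodes, which works out to 128 tasks per node. A minimal sketch of that arithmetic, usable as a sanity check before launching, is shown below. The helper name tasks_per_node is hypothetical, and the fallback values 256 and 2 are simply the numbers from this example; inside a job, Slurm provides SLURM_NTASKS and SLURM_NNODES.

```shell
#!/bin/bash
# tasks_per_node NTASKS NNODES  (hypothetical helper)
# Prints how many MPI ranks land on each node. Returns an error when the
# tasks do not divide evenly, which usually signals a typo in the header.
tasks_per_node() {
    local ntasks="$1" nnodes="$2"
    if [ "$nnodes" -le 0 ] || [ $((ntasks % nnodes)) -ne 0 ]; then
        echo "error: $ntasks tasks do not divide evenly over $nnodes nodes" >&2
        return 1
    fi
    echo $((ntasks / nnodes))
}

# Inside a job Slurm sets SLURM_NTASKS and SLURM_NNODES; outside a job
# the values from the example header are used instead.
tasks_per_node "${SLURM_NTASKS:-256}" "${SLURM_NNODES:-2}"
```

Treating an uneven split as an error is a design choice of this sketch, not a Slurm requirement.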
Note¶
Using mpirun -np $SLURM_NTASKS lmp or mpiexec -np $SLURM_NTASKS lmp may not work for non-exclusive jobs on some clusters. Use srun -n $SLURM_NTASKS lmp or mpirun lmp instead. Without an explicit rank count, mpirun lmp automatically picks up the number of ranks from SLURM_NTASKS and works fine.
namd¶
Description¶
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
Versions¶
2.14
Module¶
You can load the modules by:
module load namd
nwchem¶
Description¶
High-performance computational chemistry software
Versions¶
7.0.2
Module¶
You can load the modules by:
module load nwchem
quantum-espresso¶
Description¶
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials.
Versions¶
6.7
Module¶
You can load the modules by:
module load quantum-espresso
vasp¶
Description¶
The Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles.
Versions¶
5.4.4.pl2
6.3.0
Module¶
You can load the modules by:
module load vasp
vmd¶
Description¶
VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
Versions¶
1.9.3
Module¶
You can load the modules by:
module load vmd
wannier90¶
Description¶
Wannier90 is an open-source code released under GPLv2 for generating maximally-localized Wannier functions and using them to compute advanced electronic properties of materials with high efficiency and accuracy.
Versions¶
3.1.0
Module¶
You can load the modules by:
module load wannier90
Fluid dynamics¶
Geospatial tools¶
gdal¶
Description¶
GDAL (Geospatial Data Abstraction Library) is a translator library for raster and vector geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single raster abstract data model and vector abstract data model to the calling application for all supported formats. It also comes with a variety of useful command line utilities for data translation and processing.
Versions¶
2.4.4
3.2.0
Module¶
You can load the modules by:
module load gdal
geos¶
Description¶
GEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS). As such, it aims to contain the complete functionality of JTS in C++. This includes all the OpenGIS Simple Features for SQL spatial predicate functions and spatial operators, as well as specific JTS enhanced topology functions.
Versions¶
3.8.1
3.9.1
Module¶
You can load the modules by:
module load geos
grads¶
Description¶
The Grid Analysis and Display System (GrADS) is an interactive desktop tool that is used for easy access, manipulation, and visualization of earth science data. GrADS has two data models for handling gridded and station data. GrADS supports many data file formats, including binary (stream or sequential), GRIB (version 1 and 2), NetCDF, HDF (version 4 and 5), and BUFR (for station data).
Versions¶
2.2.1
Module¶
You can load the modules by:
module load grads
proj¶
Description¶
PROJ is generic coordinate transformation software that transforms geospatial coordinates from one coordinate reference system (CRS) to another. This includes cartographic projections as well as geodetic transformations.
Versions¶
5.2.0
6.2.0
Module¶
You can load the modules by:
module load proj
Libraries¶
arpack-ng¶
Description¶
ARPACK-NG is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.
Versions¶
3.8.0
Module¶
You can load the modules by:
module load arpack-ng
blis¶
Description¶
BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries.
Versions¶
0.8.1
Module¶
You can load the modules by:
module load blis
boost¶
Description¶
Boost provides free peer-reviewed portable C++ source libraries, emphasizing libraries that work well with the C++ Standard Library.
Versions¶
1.74.0
Module¶
You can load the modules by:
module load boost
eigen¶
Description¶
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Versions¶
3.3.9
Module¶
You can load the modules by:
module load eigen
fftw¶
Description¶
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). We believe that FFTW, which is free software, should become the FFT library of choice for most applications.
Versions¶
2.1.5
3.3.8
Module¶
You can load the modules by:
module load fftw
gmp¶
Description¶
GMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating-point numbers.
Versions¶
6.2.1
Module¶
You can load the modules by:
module load gmp
gsl¶
Description¶
The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.
Versions¶
2.4
Module¶
You can load the modules by:
module load gsl
hdf5¶
Description¶
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.
Versions¶
1.10.7
Module¶
You can load the modules by:
module load hdf5
hdf¶
Description¶
HDF4 (also known as HDF) is a library and multi-object file format for storing and managing data between machines.
Versions¶
4.2.15
Module¶
You can load the modules by:
module load hdf
intel-mkl¶
Description¶
Intel’s Math Kernel Library (MKL) provides highly optimized, threaded and vectorized functions to maximize performance on each processor family. It utilises de-facto standard C and Fortran APIs for compatibility with BLAS, LAPACK and FFTW functions from other math libraries.
Versions¶
2019.5.281
Module¶
You can load the modules by:
module load intel-mkl
libfabric¶
Description¶
The Open Fabrics Interfaces (OFI) is a framework focused on exporting fabric communication services to applications.
Versions¶
1.12.0
Module¶
You can load the modules by:
module load libfabric
libflame¶
Description¶
libflame is a portable library for dense matrix computations, providing much of the functionality present in LAPACK, developed by current and former members of the Science of High-Performance Computing (SHPC) group in the Institute for Computational Engineering and Sciences at The University of Texas at Austin. libflame includes a compatibility layer, lapack2flame, which includes a complete LAPACK implementation.
Versions¶
5.2.0
Module¶
You can load the modules by:
module load libflame
libiconv¶
Description¶
GNU libiconv provides an implementation of the iconv function and the iconv program for character set conversion.
Versions¶
1.16
Module¶
You can load the modules by:
module load libiconv
libmesh¶
Description¶
The libMesh library provides a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms.
Versions¶
1.6.2
Module¶
You can load the modules by:
module load libmesh
libszip¶
Description¶
Szip is an implementation of the extended-Rice lossless compression algorithm.
Versions¶
2.1.1
Module¶
You can load the modules by:
module load libszip
libtiff¶
Description¶
LibTIFF - Tag Image File Format (TIFF) Library and Utilities.
Versions¶
4.1.0
Module¶
You can load the modules by:
module load libtiff
libv8¶
Description¶
Distributes the V8 JavaScript engine in binary and source forms in order to support fast builds of The Ruby Racer
Versions¶
6.7.17
Module¶
You can load the modules by:
module load libv8
libx11¶
Description¶
Xlib − C Language X Interface is a reference guide to the low-level C language interface to the X Window System protocol. It is neither a tutorial nor a user’s guide to programming the X Window System. Rather, it provides a detailed description of each function in the library as well as a discussion of the related background information.
Versions¶
1.7.0
Module¶
You can load the modules by:
module load libx11
libxml2¶
Description¶
Libxml2 is the XML C parser and toolkit developed for the Gnome project (but usable outside of the Gnome platform). It is free software available under the MIT License.
Versions¶
2.9.10
Module¶
You can load the modules by:
module load libxml2
mpfr¶
Description¶
The MPFR library is a C library for multiple-precision floating-point computations with correct rounding.
Versions¶
4.0.2
Module¶
You can load the modules by:
module load mpfr
netcdf-c¶
Description¶
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C distribution.
Versions¶
4.7.4
Module¶
You can load the modules by:
module load netcdf-c
netcdf-cxx4¶
Description¶
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the C++ distribution.
Versions¶
4.3.1
Module¶
You can load the modules by:
module load netcdf-cxx4
netcdf-fortran¶
Description¶
NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. This is the Fortran distribution.
Versions¶
4.5.3
Module¶
You can load the modules by:
module load netcdf-fortran
netlib-lapack¶
Description¶
LAPACK version 3.X is a comprehensive FORTRAN library that does linear algebra operations including matrix inversions, least squares solutions to linear sets of equations, eigenvector analysis, singular value decomposition, etc. It is a very comprehensive and reputable package that has found extensive use in the scientific community.
Versions¶
3.8.0
Module¶
You can load the modules by:
module load netlib-lapack
openblas¶
Description¶
OpenBLAS is an open source implementation of the BLAS API with many hand-crafted optimizations for specific processor types
Versions¶
0.3.17
Module¶
You can load the modules by:
module load openblas
parallel-netcdf¶
Description¶
PnetCDF (Parallel netCDF) is a high-performance parallel I/O library for accessing files in formats compatible with Unidata's NetCDF, specifically the CDF-1, 2, and 5 formats.
Versions¶
1.11.2
Module¶
You can load the modules by:
module load parallel-netcdf
petsc¶
Description¶
PETSc is a suite of data structures and routines for the scalable parallel solution of scientific applications modeled by partial differential equations.
Versions¶
3.15.3
Module¶
You can load the modules by:
module load petsc
swig¶
Description¶
SWIG is an interface compiler that connects programs written in C and C++ with scripting languages such as Perl, Python, Ruby, and Tcl. It works by taking the declarations found in C/C++ header files and using them to generate the wrapper code that scripting languages need to access the underlying C/C++ code. In addition, SWIG provides a variety of customization features that let you tailor the wrapping process to suit your application.
Versions¶
4.0.2
Module¶
You can load the modules by:
module load swig
ucx¶
Description¶
UCX is a communication library implementing high-performance messaging for MPI/PGAS frameworks.
Versions¶
1.11.2
Module¶
You can load the modules by:
module load ucx
zlib¶
Description¶
A free, general-purpose, legally unencumbered lossless data-compression library.
Versions¶
1.2.11
Module¶
You can load the modules by:
module load zlib
Mathematical/Statistics¶
gurobi¶
Description¶
The Gurobi Optimizer was designed from the ground up to be the fastest, most powerful solver available for your LP, QP, QCP, and MIP (MILP, MIQP, and MIQCP) problems.
Versions¶
9.5.1
Module¶
You can load the modules by:
module load gurobi
jupyter¶
Description¶
Complete Jupyter Hub/Lab/Notebook environment.
Versions¶
2.0.0
Module¶
You can load the modules by:
module load jupyter
matlab¶
Description¶
MATLAB (MATrix LABoratory) is a multi-paradigm numerical computing environment and fourth-generation programming language. A proprietary programming language developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and Python.
Versions¶
R2020b
R2021b
R2022a
R2023a
Module¶
You can load the modules by:
module load matlab
meep¶
Description¶
Meep (or MEEP) is a free finite-difference time-domain (FDTD) simulation software package developed at MIT to model electromagnetic systems.
Versions¶
1.20.0
Module¶
You can load the modules by:
module load meep
octave¶
Description¶
GNU Octave is a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language.
Versions¶
6.3.0
Module¶
You can load the modules by:
module load octave
r¶
Description¶
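R is a language and environment for statistical computing and graphics. It provides a wide variety of statistical and graphical techniques: linear and nonlinear modelling, statistical tests, time series analysis, classification, clustering, etc. Please consult the R project homepage for further information.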
Versions¶
4.0.5
4.1.0
Module¶
You can load the modules by:
module load r
rstudio¶
Description¶
This package installs Rstudio desktop from pre-compiled binaries available in the Rstudio website. The installer assumes that you are running on CentOS7/Redhat7/Fedora19. Please fix the download URL for other systems.
Versions¶
2021.09.0
Module¶
You can load the modules by:
module load rstudio
ML toolkit¶
learning¶
Description¶
The learning module loads the prerequisites (such as anaconda and cudnn) and makes ML applications visible to the user.
Versions¶
conda-2021.05-py38-gpu
Module¶
You can load the modules by:
module load learning
Example job¶
Below is an example job script:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --gpus-per-node=1
#SBATCH -p PartitionName
#SBATCH --job-name=learning
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
module load learning/conda-2021.05-py38-gpu
module load ml-toolkit-gpu/pytorch/1.7.1
python train.py # placeholder script name; a script called torch.py would shadow the torch package on import
nco¶
Description¶
The NCO toolkit manipulates and analyzes data stored in netCDF-accessible formats
Versions¶
4.9.3
Module¶
You can load the modules by:
module load nco
py-mpi4py¶
Description¶
mpi4py provides a Python interface to MPI, the Message-Passing Interface. It is useful for parallelizing Python scripts.
Versions¶
3.0.3
Module¶
You can load the modules by:
module load py-mpi4py
python¶
Description¶
Native Python 3.9.5 including optimized libraries.
Versions¶
3.9.5
Module¶
You can load the modules by:
module load python
spark¶
Description¶
Apache Spark is a fast and general engine for large-scale data processing.
Versions¶
3.1.1
Module¶
You can load the modules by:
module load spark
NVIDIA¶
cuda¶
Description¶
CUDA is a parallel computing platform and programming model invented by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).
Versions¶
11.0.3
11.2.2
11.4.2
12.0.1
Module¶
You can load the modules by:
module load cuda
cudnn¶
Description¶
cuDNN is a deep neural network library from Nvidia that provides a highly tuned implementation of many functions commonly used in deep machine learning applications.
Versions¶
cuda-11.0_8.0
cuda-11.2_8.1
cuda-11.4_8.2
cuda-12.0_8.8
Module¶
You can load the modules by:
module load cudnn
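The cuDNN version strings above appear to encode the CUDA release each build targets, followed by the cuDNN release. As a minimal sketch under that assumption, the helper below picks the cudnn module matching a given CUDA version. The function name cudnn_for_cuda is hypothetical, the version list is copied from the list above, and the name/version module syntax is assumed to follow the usual convention:

```shell
#!/bin/bash
# cuDNN module versions copied from the list above (hard-coded for illustration).
CUDNN_VERSIONS="cuda-11.0_8.0 cuda-11.2_8.1 cuda-11.4_8.2 cuda-12.0_8.8"

# Print the cudnn module whose version string starts with the given CUDA version.
# Returns 1 if no listed build matches.
cudnn_for_cuda() {
    local cuda="$1" v
    for v in $CUDNN_VERSIONS; do
        case "$v" in
            "cuda-${cuda}_"*) printf 'cudnn/%s\n' "$v"; return 0 ;;
        esac
    done
    return 1
}

cudnn_for_cuda 11.4   # prints cudnn/cuda-11.4_8.2
```

In a job script the result could be passed to module load, for example module load cuda/11.4.2 "$(cudnn_for_cuda 11.4)".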
nccl¶
Description¶
Optimized primitives for collective multi-GPU communication.
Versions¶
cuda-11.0_2.11.4
cuda-11.2_2.8.4
cuda-11.4_2.11.4
Module¶
You can load the modules by:
module load modtree/gpu
module load nccl
nvhpc¶
Description¶
The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC® directives, and CUDA®. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming.
Versions¶
21.7
Module¶
You can load the modules by:
module load nvhpc
Programming languages¶
julia¶
Description¶
Julia is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages. One can write code in Julia that is nearly as fast as C. Julia features optional typing, multiple dispatch, and good performance, achieved using type inference and just-in-time (JIT) compilation, implemented using LLVM. It is multi-paradigm, combining features of imperative, functional, and object-oriented programming.
Versions¶
1.6.2
Module¶
You can load the modules by:
module load julia
tcl¶
Description¶
Tcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more. Open source and business-friendly, Tcl is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible.
Versions¶
8.6.11
Module¶
You can load the modules by:
module load tcl
System¶
cue-login-env¶
Description¶
XSEDE Common User Environment Variables for Anvil. Load this module to have XSEDE Common User Environment variables defined for your shell session or job on Anvil. See detailed description at https://www.ideals.illinois.edu/bitstream/handle/2142/75910/XSEDE-CUE-Variable-Definitions-v1.1.pdf
Versions¶
1.1
Module¶
You can load the modules by:
module load cue-login-env
modtree¶
Description¶
ModuleTree (or modtree) helps users navigate between different application stacks and sets up a default compiler and MPI environment.
Versions¶
cpu
gpu
Module¶
You can load the modules by:
module load modtree
xalt¶
Versions¶
2.10.45
Module¶
You can load the modules by:
module load xalt
Text Editors¶
Tools/Utilities¶
anaconda¶
Description¶
Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment.
Versions¶
2021.05-py38
Module¶
You can load the modules by:
module load anaconda
aws-cli¶
Description¶
The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services from the command line.
Versions¶
2.4.15
Module¶
You can load the modules by:
module load aws-cli
cmake¶
Description¶
A cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.
Versions¶
3.20.0
Module¶
You can load the modules by:
module load cmake
curl¶
Description¶
cURL is an open source command line tool and library for transferring data with URL syntax
Versions¶
7.76.1
Module¶
You can load the modules by:
module load curl
emacs¶
Description¶
The Emacs programmable text editor.
Versions¶
27.2
Module¶
You can load the modules by:
module load emacs
gdb¶
Description¶
GDB, the GNU Project debugger, allows you to see what is going on inside another program while it executes – or what another program was doing at the moment it crashed.
Versions¶
11.1
Module¶
You can load the modules by:
module load gdb
gpaw¶
Description¶
GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE).
Versions¶
21.1.0
Module¶
You can load the modules by:
module load gpaw
hadoop¶
Description¶
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
Versions¶
3.3.0
Module¶
You can load the modules by:
module load hadoop
hpctoolkit¶
Description¶
HPCToolkit is an integrated suite of tools for measurement and analysis of program performance on computers ranging from multicore desktop systems to the nation's largest supercomputers. By using statistical sampling of timers and hardware performance counters, HPCToolkit collects accurate measurements of a program's work, resource consumption, and inefficiency and attributes them to the full calling context in which they occur.
Versions¶
2021.03.01
Module¶
You can load the modules by:
module load hpctoolkit
hwloc¶
Description¶
The Hardware Locality (hwloc) software project.
Versions¶
1.11.13
Module¶
You can load the modules by:
module load hwloc
launcher¶
Description¶
Framework for running large collections of serial or multi-threaded applications
Versions¶
3.9
Module¶
You can load the modules by:
module load launcher
monitor¶
Description¶
System resource monitoring tool.
Versions¶
2.3.1
Module¶
You can load the modules by:
module load monitor
mpc¶
Description¶
GNU MPC is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result.
Versions¶
1.1.0
Module¶
You can load the modules by:
module load mpc
ncview¶
Description¶
Simple viewer for NetCDF files.
Versions¶
2.1.8
Module¶
You can load the modules by:
module load ncview
numactl¶
Description¶
Simple NUMA policy support. It consists of a numactl program to run other programs with a specific NUMA policy and a libnuma shared library (“NUMA API”) to set NUMA policy in applications.
Versions¶
2.0.14
Module¶
You can load the modules by:
module load numactl
openjdk¶
Description¶
The free and open-source Java implementation.
Versions¶
11.0.8_10
Module¶
You can load the modules by:
module load openjdk
papi¶
Description¶
PAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events. In addition Component PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack.
Versions¶
6.0.0.1
Module¶
You can load the modules by:
module load papi
parafly¶
Description¶
Run UNIX commands in parallel
Versions¶
r2013
Module¶
You can load the modules by:
module load parafly
protobuf¶
Description¶
Protocol Buffers (a.k.a., protobuf) are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data.
Versions¶
3.11.4
Module¶
You can load the modules by:
module load protobuf
qemu¶
Description¶
QEMU is a generic and open source machine emulator and virtualizer.
Versions¶
4.1.1
Module¶
You can load the modules by:
module load qemu
qt¶
Description¶
Qt is a comprehensive cross-platform C++ application framework.
Versions¶
5.15.2
Module¶
You can load the modules by:
module load qt
texlive¶
Description¶
TeX Live is a free software distribution for the TeX typesetting system. Note that it is not a reproducible installation. At any point only the most recent version can be installed. Older versions are included for backward compatibility, i.e., if you have that version already installed.
Versions¶
20200406
Module¶
You can load the modules by:
module load texlive
tk¶
Description¶
Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches. Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.
Versions¶
8.6.11
Module¶
You can load the modules by:
module load tk
totalview¶
Description¶
TotalView is a GUI-based source code defect analysis tool that gives you unprecedented control over processes and thread execution and visibility into program state and variables.
Versions¶
2020.2.6
Module¶
You can load the modules by:
module load totalview
valgrind¶
Description¶
An instrumentation framework for building dynamic analysis tools.
Versions¶
3.15.0
Module¶
You can load the modules by:
module load valgrind
Workflow automation¶
hyper-shell¶
Description¶
Process shell commands over a distributed, asynchronous queue.
Versions¶
2.0.2
Module¶
You can load the modules by:
module load hyper-shell
nextflow¶
Description¶
Data-driven computational pipelines.
Versions¶
22.10.1
Module¶
You can load the modules by:
module load nextflow
parallel¶
Description¶
GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input.
Versions¶
20200822
Module¶
You can load the modules by:
module load parallel
Biocontainers¶
Abacas¶
Introduction¶
Abacas is a tool for algorithm based automatic contiguation of assembled sequences.
Versions¶
1.3.1
Commands¶
abacas.pl
abacas.1.3.1.pl
Module¶
You can load the modules by:
module load biocontainers
module load abacas
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Abacas on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abacas
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abacas
abacas.pl -r cmm.fasta -q Cm.contigs.fasta -p nucmer -o out_prefix
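The shebang warning above can be checked mechanically before a job is submitted. The following is a minimal sketch, not part of Slurm or of the biocontainer modules; the function name check_shebang and the file name myjob.sh are hypothetical:

```shell
#!/bin/bash
# Return 0 if the first line of the job script is a bash shebang.
# Otherwise print a warning to stderr and return 1.
check_shebang() {
    local first=""
    # read fails on an empty file; keep going only if a line was actually read
    IFS= read -r first < "$1" || [ -n "$first" ] || return 1
    case "$first" in
        '#!/bin/bash'*) return 0 ;;
        *) echo "warning: $1 starts with '$first'; use #!/bin/bash" >&2
           return 1 ;;
    esac
}

# Usage: check_shebang myjob.sh && sbatch myjob.sh
```

Chaining the check with sbatch in this way means a script that starts with #!/bin/sh -l is never submitted.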
Abismal¶
Introduction¶
Another Bisulfite Mapping Algorithm (abismal) is a read mapping program for bisulfite sequencing in DNA methylation studies.
Versions¶
3.0.0
Commands¶
abismal
abismalidx
simreads
Module¶
You can load the modules by:
module load biocontainers
module load abismal
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run abismal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abismal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abismal
abismalidx ~/.local/share/genomes/hg38/hg38.fa hg38
Abpoa¶
Introduction¶
abPOA: adaptive banded Partial Order Alignment
Versions¶
1.4.1
Commands¶
abpoa
Module¶
You can load the modules by:
module load biocontainers
module load abpoa
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run abpoa on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=abpoa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abpoa
abpoa seq.fa > cons.fa
Abricate¶
Introduction¶
Abricate is a tool for mass screening of contigs for antimicrobial resistance or virulence genes.
Versions¶
1.0.1
Commands¶
abricate
Module¶
You can load the modules by:
module load biocontainers
module load abricate
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Abricate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=abricate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abricate
abricate --threads 8 *.fasta
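In the script above, the thread count passed to abricate repeats the value given to #SBATCH -n. One way to keep the two in sync, sketched here under the assumption that the job runs under Slurm (which sets SLURM_NTASKS when -n is given), is to read the value from the environment. The helper name job_threads is hypothetical:

```shell
#!/bin/bash
# Print the number of tasks Slurm granted to this job, or 1 outside a job.
job_threads() {
    printf '%s\n' "${SLURM_NTASKS:-1}"
}

# Inside the job script, after loading the modules.
# The guard skips the call on machines where abricate is not on PATH.
if command -v abricate >/dev/null 2>&1; then
    abricate --threads "$(job_threads)" *.fasta
fi
```

With this pattern, changing the -n value in the header is enough; the command line does not need to be edited separately.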
Abyss¶
Introduction¶
ABySS is a de novo sequence assembler intended for short paired-end reads and genomes of all sizes.
Versions¶
2.3.2
2.3.4
Commands¶
ABYSS
ABYSS-P
AdjList
Consensus
DAssembler
DistanceEst
DistanceEst-ssq
KAligner
MergeContigs
MergePaths
Overlap
ParseAligns
PathConsensus
PathOverlap
PopBubbles
SimpleGraph
abyss-align
abyss-bloom
abyss-bloom-dbg
abyss-bowtie
abyss-bowtie2
abyss-bwa
abyss-bwamem
abyss-bwasw
abyss-db-txt
abyss-dida
abyss-fac
abyss-fatoagp
abyss-filtergraph
abyss-fixmate
abyss-fixmate-ssq
abyss-gapfill
abyss-gc
abyss-index
abyss-junction
abyss-kaligner
abyss-layout
abyss-longseqdist
abyss-map
abyss-map-ssq
abyss-mergepairs
abyss-overlap
abyss-paired-dbg
abyss-paired-dbg-mpi
abyss-pe
abyss-rresolver-short
abyss-samtoafg
abyss-scaffold
abyss-sealer
abyss-stack-size
abyss-tabtomd
abyss-todot
abyss-tofastq
konnector
logcounter
Module¶
You can load the modules by:
module load biocontainers
module load abyss
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run abyss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=abyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers abyss
abyss-pe np=4 k=25 name=test B=1G \
in='test-data/reads1.fastq test-data/reads2.fastq'
Actc¶
Introduction¶
Actc is used to align subreads to ccs reads.
Versions¶
0.2.0
Commands¶
actc
Module¶
You can load the modules by:
module load biocontainers
module load actc
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run actc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=actc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers actc
actc subreads.bam ccs.bam subreads_to_ccs.bam
Adapterremoval¶
Introduction¶
AdapterRemoval searches for and removes adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3’ end of reads following adapter removal. AdapterRemoval can analyze both single-end and paired-end data, and can be used to merge overlapping paired-end reads into (longer) consensus sequences. Additionally, AdapterRemoval can construct a consensus adapter sequence for paired-end reads for which this information is not available.
Versions¶
2.3.3
Commands¶
AdapterRemoval
Module¶
You can load the modules by:
module load biocontainers
module load adapterremoval
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run adapterremoval on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=adapterremoval
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers adapterremoval
AdapterRemoval --file1 input_1.fastq --file2 input_2.fastq
Advntr¶
Introduction¶
Advntr is a tool for genotyping Variable Number Tandem Repeats (VNTR) from sequence data.
Versions¶
1.4.0
1.5.0
Commands¶
advntr
Module¶
You can load the modules by:
module load biocontainers
module load advntr
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Advntr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=advntr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers advntr
advntr addmodel -r chr21.fa -p CGCGGGGCGGGG -s 45196324 -e 45196360 -c chr21
advntr genotype --vntr_id 1 --alignment_file CSTB_2_5_testdata.bam --working_directory working_dir
Afplot¶
Introduction¶
Afplot is a tool to plot allele frequencies in VCF files.
Versions¶
0.2.1
Commands¶
afplot
Module¶
You can load the modules by:
module load biocontainers
module load afplot
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run afplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers afplot
afplot whole-genome histogram -v my_vcf.gz -l my_label -s my_sample -o mysample.histogram.png
Afterqc¶
Introduction¶
Afterqc is a tool for quality control of FASTQ data produced by HiSeq 2000/2500/3000/4000, Nextseq 500/550, MiniSeq, and Illumina 1.8 or newer.
Versions¶
0.9.7
Commands¶
after.py
Module¶
You can load the modules by:
module load biocontainers
module load afterqc
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Afterqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=afterqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers afterqc
after.py -1 SRR11941281_1.fastq.paired.fq -2 SRR11941281_2.fastq.paired.fq
Agat¶
Introduction¶
Agat is a suite of tools to handle gene annotations in any GTF/GFF format.
Versions¶
0.8.1
Commands¶
agat_convert_bed2gff.pl
agat_convert_embl2gff.pl
agat_convert_genscan2gff.pl
agat_convert_mfannot2gff.pl
agat_convert_minimap2_bam2gff.pl
agat_convert_sp_gff2bed.pl
agat_convert_sp_gff2gtf.pl
agat_convert_sp_gff2tsv.pl
agat_convert_sp_gff2zff.pl
agat_convert_sp_gxf2gxf.pl
agat_sp_Prokka_inferNameFromAttributes.pl
agat_sp_add_introns.pl
agat_sp_add_start_and_stop.pl
agat_sp_alignment_output_style.pl
agat_sp_clipN_seqExtremities_and_fixCoordinates.pl
agat_sp_compare_two_BUSCOs.pl
agat_sp_compare_two_annotations.pl
agat_sp_complement_annotations.pl
agat_sp_ensembl_output_style.pl
agat_sp_extract_attributes.pl
agat_sp_extract_sequences.pl
agat_sp_filter_by_ORF_size.pl
agat_sp_filter_by_locus_distance.pl
agat_sp_filter_by_mrnaBlastValue.pl
agat_sp_filter_feature_by_attribute_presence.pl
agat_sp_filter_feature_by_attribute_value.pl
agat_sp_filter_feature_from_keep_list.pl
agat_sp_filter_feature_from_kill_list.pl
agat_sp_filter_gene_by_intron_numbers.pl
agat_sp_filter_gene_by_length.pl
agat_sp_filter_incomplete_gene_coding_models.pl
agat_sp_filter_record_by_coordinates.pl
agat_sp_fix_cds_phases.pl
agat_sp_fix_features_locations_duplicated.pl
agat_sp_fix_fusion.pl
agat_sp_fix_longest_ORF.pl
agat_sp_fix_overlaping_genes.pl
agat_sp_fix_small_exon_from_extremities.pl
agat_sp_flag_premature_stop_codons.pl
agat_sp_flag_short_introns.pl
agat_sp_functional_statistics.pl
agat_sp_keep_longest_isoform.pl
agat_sp_kraken_assess_liftover.pl
agat_sp_list_short_introns.pl
agat_sp_load_function_from_protein_align.pl
agat_sp_manage_IDs.pl
agat_sp_manage_UTRs.pl
agat_sp_manage_attributes.pl
agat_sp_manage_functional_annotation.pl
agat_sp_manage_introns.pl
agat_sp_merge_annotations.pl
agat_sp_prokka_fix_fragmented_gene_annotations.pl
agat_sp_sensitivity_specificity.pl
agat_sp_separate_by_record_type.pl
agat_sp_statistics.pl
agat_sp_webApollo_compliant.pl
agat_sq_add_attributes_from_tsv.pl
agat_sq_add_hash_tag.pl
agat_sq_add_locus_tag.pl
agat_sq_count_attributes.pl
agat_sq_filter_feature_from_fasta.pl
agat_sq_list_attributes.pl
agat_sq_manage_IDs.pl
agat_sq_manage_attributes.pl
agat_sq_mask.pl
agat_sq_remove_redundant_entries.pl
agat_sq_repeats_analyzer.pl
agat_sq_rfam_analyzer.pl
agat_sq_split.pl
agat_sq_stat_basic.pl
Module¶
You can load the modules by:
module load biocontainers
module load agat
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Agat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers agat
agat_convert_sp_gff2bed.pl --gff genes.gff -o genes.bed
Agfusion¶
Introduction¶
AGFusion (pronounced ‘A G Fusion’) is a python package for annotating gene fusions from the human or mouse genomes.
Versions¶
1.3.11
Commands¶
agfusion
Module¶
You can load the modules by:
module load biocontainers
module load agfusion
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run agfusion on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=agfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers agfusion
Alfred¶
Introduction¶
Alfred is an efficient and versatile command-line application that computes multi-sample quality control metrics in a read-group aware manner.
Versions¶
0.2.5
0.2.6
Commands¶
alfred
Module¶
You can load the modules by:
module load biocontainers
module load alfred
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Alfred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alfred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alfred
alfred qc -r genome.fasta -o qc.tsv.gz sorted.bam
Alien-hunter¶
Introduction¶
Alien-hunter is an application for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs).
Versions¶
1.7.7
Commands¶
alien_hunter
Module¶
You can load the modules by:
module load biocontainers
module load alien_hunter
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Alien_hunter on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alien_hunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alien_hunter
alien_hunter genome.fasta output
Alignstats¶
Introduction¶
AlignStats produces various alignment, whole genome coverage, and capture coverage metrics for sequence alignment files in SAM, BAM, and CRAM format.
Versions¶
0.9.1
Commands¶
alignstats
Module¶
You can load the modules by:
module load biocontainers
module load alignstats
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run alignstats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=alignstats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alignstats
alignstats -C -i input.bam -o report.txt
Allpathslg¶
Introduction¶
Allpathslg is a whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads.
Versions¶
52488
Commands¶
PrepareAllPathsInputs.pl
RunAllPathsLG
CacheLibs.pl
Fasta2Fastb
Module¶
You can load the modules by:
module load biocontainers
module load allpathslg
Example job¶
Warning
Using #!/bin/sh -l as shebang in the Slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash instead.
To run Allpathslg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=allpathslg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers allpathslg
PrepareAllPathsInputs.pl \
    DATA_DIR=data \
    PLOIDY=1 \
    IN_GROUPS_CSV=in_groups.csv \
    IN_LIBS_CSV=in_libs.csv \
    OVERWRITE=True
RunAllPathsLG PRE=allpathlg REFERENCE_NAME=test.genome \
    DATA_SUBDIR=data RUN=myrun TARGETS=standard \
    SUBDIR=test OVERWRITE=True
Alphafold¶
Introduction¶
Alphafold is a protein structure prediction tool developed by DeepMind (Google). It uses a novel machine learning approach to predict 3D protein structures from primary sequences alone. The source code is available on GitHub. It has been deployed on all RCAC clusters, supporting both CPU and GPU.
It also relies on a huge database. The full database (~2.2TB) has been downloaded and set up for users.
Protein structure prediction by alphafold is performed in the following steps:
Search the amino acid sequence in uniref90 database by jackhmmer (using CPU)
Search the amino acid sequence in mgnify database by jackhmmer (using CPU)
Search the amino acid sequence in pdb70 database (for monomers) or pdb_seqres database (for multimers) by hhsearch (using CPU)
Search the amino acid sequence in bfd database and uniclust30 (updated to uniref30 since v2.3.0) database by hhblits (using CPU)
Search structure templates in pdb_mmcif database (using CPU)
Search the amino acid sequence in uniprot database (for multimers) by jackhmmer (using CPU)
Predict 3D structure by machine learning (using CPU or GPU)
Structure optimisation with OpenMM (using CPU or GPU)
Versions¶
2.1.1
2.2.0
2.2.3
2.3.0
2.3.1
Commands¶
run_alphafold.sh
Module¶
You can load the modules by:
module load biocontainers
module load alphafold
Usage¶
Using Alphafold on our cluster is straightforward. Users can create a flagfile containing the database path information and pass it to run_alphafold.sh:
run_alphafold.sh --flagfile=full_db.ff --fasta_paths=XX --output_dir=XX ...
Users can consult the detailed user guide on its GitHub page.
full_db.ff¶
Example contents of full_db.ff:
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db/
--uniref90_database_path=/depot/itap/datasets/alphafold/db/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb70_database_path=/depot/itap/datasets/alphafold/db/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db/pdb_mmcif/mmcif_files
--max_template_date=2022-01-29
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Note
Since Version v2.2.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20221014. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20221014. Users need to update the flagfile to use the updated database:
run_alphafold.sh --flagfile=full_db_20221014.ff --fasta_paths=XX --output_dir=XX ...
full_db_20221014.ff (for alphafold v2.2)¶
Example contents of full_db_20221014.ff (for ACCESS Anvil, please change /depot/itap to /anvil):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20221014/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20221014/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20221014/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20221014/mgnify/mgy_clusters_2018_12.fa
--uniclust30_database_path=/depot/itap/datasets/alphafold/db_20221014/uniclust30/uniclust30_2018_08/uniclust30_2018_08
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20221014/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20221014/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20221014/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Note
Since Version v2.3.0, the AlphaFold-Multimer model parameters have been updated. The updated full database is stored in /depot/itap/datasets/alphafold/db_20230311. For ACCESS Anvil, the database is stored in /anvil/datasets/alphafold/db_20230311. Users need to update the flagfile to use the updated database:
run_alphafold.sh --flagfile=full_db_20230311.ff --fasta_paths=XX --output_dir=XX ...
Note
Since Version v2.3.0, uniclust30_database_path has been changed to uniref30_database_path.
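The flag rename described in this note can be handled in a script that has to work with more than one Alphafold version. The following is a minimal sketch, assuming GNU sort (for the -V version ordering) is available; the function names version_ge and hhblits_db_flag are hypothetical and not part of Alphafold:

```shell
#!/bin/bash
# Succeed if version $1 is greater than or equal to version $2.
# Relies on the version ordering of GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Print the name of the hhblits database flag for a given Alphafold version.
hhblits_db_flag() {
    if version_ge "$1" 2.3.0; then
        printf '%s\n' "--uniref30_database_path"
    else
        printf '%s\n' "--uniclust30_database_path"
    fi
}

hhblits_db_flag 2.2.3   # prints --uniclust30_database_path
hhblits_db_flag 2.3.1   # prints --uniref30_database_path
```

Only the flag name is selected here; the database path that follows it also differs between releases, as the example flagfiles show.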
full_db_20230311.ff (for alphafold v2.3)¶
Example contents of full_db_20230311.ff for monomer (for ACCESS Anvil, please change /depot/itap to /anvil):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb70_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb70/pdb70
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
Example contents of full_db_20230311.ff for multimer (For ACCESS Anvil, please change depot/itap
to anvil
):
--db_preset=full_dbs
--bfd_database_path=/depot/itap/datasets/alphafold/db_20230311/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt
--data_dir=/depot/itap/datasets/alphafold/db_20230311/
--uniref90_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref90/uniref90.fasta
--mgnify_database_path=/depot/itap/datasets/alphafold/db_20230311/mgnify/mgy_clusters_2022_05.fa
--uniref30_database_path=/depot/itap/datasets/alphafold/db_20230311/uniref30/UniRef30_2021_03
--pdb_seqres_database_path=/depot/itap/datasets/alphafold/db_20230311/pdb_seqres/pdb_seqres.txt
--uniprot_database_path=/depot/itap/datasets/alphafold/db_20230311/uniprot/uniprot.fasta
--template_mmcif_dir=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/mmcif_files
--obsolete_pdbs_path=/depot/itap/datasets/alphafold/db_20230311/pdb_mmcif/obsolete.dat
--hhblits_binary_path=/usr/bin/hhblits
--hhsearch_binary_path=/usr/bin/hhsearch
--jackhmmer_binary_path=/usr/bin/jackhmmer
--kalign_binary_path=/usr/bin/kalign
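For ACCESS Anvil, the prefix change described above can be applied to an existing flagfile with sed rather than by hand. This is a minimal sketch, assuming every database path in the file starts with /depot/itap/datasets/alphafold as in the examples; the function name to_anvil_flagfile is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: copy a flagfile ($1) to a new file ($2), replacing
# the /depot/itap prefix with /anvil on every line.
# Assumption: all database paths share the /depot/itap/datasets/alphafold prefix.
to_anvil_flagfile() {
    sed 's#/depot/itap/datasets/alphafold#/anvil/datasets/alphafold#g' "$1" > "$2"
}

# Example use (not executed here):
#   to_anvil_flagfile full_db_20230311.ff full_db_20230311.anvil.ff
```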
Example job using CPU¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
Notice that since version 2.2.0, the parameter --use_gpu_relax=False
is required.
To run alphafold using CPU:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alphafold/2.3.1
run_alphafold.sh --flagfile=full_db_20230311.ff \
--fasta_paths=sample.fasta --max_template_date=2022-02-01 \
--output_dir=af2_full_out --model_preset=monomer \
--use_gpu_relax=False
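The shebang warning above can be checked mechanically before a script is submitted. The following is a small sketch; the function name has_bash_shebang is hypothetical, and it only tests for the exact line #!/bin/bash.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the first line of the job script
# given as $1 is exactly "#!/bin/bash".
has_bash_shebang() {
    [ "$(head -n 1 "$1")" = '#!/bin/bash' ]
}

# Example use (not executed here):
#   has_bash_shebang myjob.sh || echo "please use #!/bin/bash as the shebang"
```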
Example job using GPU¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
Notice that since version 2.2.0, the parameter --use_gpu_relax=True
is required.
To run alphafold using GPU:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 11
#SBATCH --gres=gpu:1
#SBATCH --job-name=alphafold
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers alphafold/2.3.1
run_alphafold.sh --flagfile=full_db_20230311.ff \
--fasta_paths=sample.fasta --max_template_date=2022-02-01 \
--output_dir=af2_full_out --model_preset=monomer \
--use_gpu_relax=True
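The CPU and GPU examples differ mainly in the value of --use_gpu_relax. A single script could choose the value from the allocation, as sketched below. This assumes that Slurm exports a non-empty CUDA_VISIBLE_DEVICES only when a GPU was granted through --gres, which may depend on site configuration; the function name relax_flag is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the relax flag that matches the allocation.
# Assumption: CUDA_VISIBLE_DEVICES is non-empty only when a GPU was granted.
relax_flag() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "--use_gpu_relax=True"
    else
        echo "--use_gpu_relax=False"
    fi
}

# Example use inside the job script (not executed here):
#   run_alphafold.sh --flagfile=full_db_20230311.ff \
#       --fasta_paths=sample.fasta --max_template_date=2022-02-01 \
#       --output_dir=af2_full_out --model_preset=monomer "$(relax_flag)"
```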
Amptk¶
Introduction¶
Amptk
is a series of scripts to process NGS amplicon data using USEARCH and VSEARCH. It can also be used to process any NGS amplicon data and includes databases set up for analysis of fungal ITS, fungal LSU, bacterial 16S, and insect COI amplicons.
Versions¶
1.5.4
Commands¶
amptk
Module¶
You can load the modules by:
module load biocontainers
module load amptk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Amptk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=amptk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers amptk
amptk illumina -i test_data/illumina_test_data -o miseq -f fITS7 -r ITS4 --cpus 4
Ananse¶
Introduction¶
ANANSE is a computational approach to infer enhancer-based gene regulatory networks (GRNs) and to identify key transcription factors between two GRNs.
Versions¶
0.4.0
Commands¶
ananse
Module¶
You can load the modules by:
module load biocontainers
module load ananse
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ananse on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=ananse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ananse
mkdir -p ANANSE.REMAP.model.v1.0
wget https://zenodo.org/record/4768075/files/ANANSE.REMAP.model.v1.0.tgz
tar xvzf ANANSE.REMAP.model.v1.0.tgz -C ANANSE.REMAP.model.v1.0
rm ANANSE.REMAP.model.v1.0.tgz
wget https://zenodo.org/record/4769814/files/ANANSE_example_data.tgz
tar xvzf ANANSE_example_data.tgz
rm ANANSE_example_data.tgz
ananse binding -H ANANSE_example_data/H3K27ac/fibroblast*bam -A ANANSE_example_data/ATAC/fibroblast*bam -R ANANSE.REMAP.model.v1.0/ -o fibroblast.binding
ananse binding -H ANANSE_example_data/H3K27ac/heart*bam -A ANANSE_example_data/ATAC/heart*bam -R ANANSE.REMAP.model.v1.0/ -o heart.binding
ananse network -b fibroblast.binding/binding.h5 -e ANANSE_example_data/RNAseq/fibroblast*TPM.txt -n 4 -o fibroblast.network.txt
ananse network -b heart.binding/binding.h5 -e ANANSE_example_data/RNAseq/heart*TPM.txt -n 4 -o heart.network.txt
ananse influence -s fibroblast.network.txt -t heart.network.txt -d ANANSE_example_data/RNAseq/fibroblast2heart_degenes.csv -p -o fibroblast2heart.influence.txt
Anchorwave¶
Introduction¶
Anchorwave
is used for sensitive alignment of genomes with high sequence diversity, extensive structural polymorphism and whole-genome duplication variation.
Versions¶
1.0.1
Commands¶
anchorwave
gmap_build
gmap
minimap2
Module¶
You can load the modules by:
module load biocontainers
module load anchorwave
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Anchorwave on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=anchorwave
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers anchorwave
anchorwave gff2seq -i Zea_mays.AGPv4.34.gff3 -r Zea_mays.AGPv4.dna.toplevel.fa -o cds.fa
ANGSD¶
Introduction¶
ANGSD
is software for analyzing next-generation sequencing data. Detailed usage can be found here: http://www.popgen.dk/angsd/index.php/ANGSD.
Versions¶
0.935
0.937
0.939
0.940
Commands¶
angsd
realSFS
msToGlf
thetaStat
supersim
Module¶
You can load the modules by:
module load biocontainers
module load angsd/0.937
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run angsd on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=angsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers angsd/0.937
angsd -b bam.filelist -GL 1 -doMajorMinor 1 -doMaf 2 -P 5 -minMapQ 30 -minQ 20 -minMaf 0.05
Annogesic¶
Introduction¶
ANNOgesic is the Swiss Army knife for RNA-Seq based annotation of bacterial/archaeal genomes.
Versions¶
1.1.0
Commands¶
annogesic
Module¶
You can load the modules by:
module load biocontainers
module load annogesic
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run annogesic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=annogesic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers annogesic
ANNOGESIC_FOLDER=ANNOgesic
annogesic \
update_genome_fasta \
-c $ANNOGESIC_FOLDER/input/references/fasta_files/NC_009839.1.fa \
-m $ANNOGESIC_FOLDER/input/mutation_tables/mutation.csv \
-u NC_test.1 \
-pj $ANNOGESIC_FOLDER
ANNOVAR¶
Introduction¶
ANNOVAR
is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).
Versions¶
2022-01-13
Commands¶
annotate_variation.pl
coding_change.pl
convert2annovar.pl
retrieve_seq_from_fasta.pl
table_annovar.pl
variants_reduction.pl
Module¶
You can load the modules by:
module load biocontainers
module load annovar
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ANNOVAR on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=annovar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers annovar
annotate_variation.pl --buildver hg19 --downdb seq humandb/hg19_seq
convert2annovar.pl -format region -seqdir humandb/hg19_seq/ chr1:2000001-2000003
Antismash¶
Introduction¶
Antismash
allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes.
Versions¶
5.1.2
6.0.1
Commands¶
antismash
Module¶
You can load the modules by:
module load biocontainers
module load antismash
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Antismash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=antismash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers antismash
antismash --cb-general --cb-knownclusters --cb-subclusters --asf --pfam2go --smcog-trees seq.gbk
Anvio¶
Introduction¶
Anvio
is an analysis and visualization platform for ‘omics data.
Versions¶
7.0
Commands¶
anvi-analyze-synteny
anvi-cluster-contigs
anvi-compute-ani
anvi-compute-completeness
anvi-compute-functional-enrichment
anvi-compute-gene-cluster-homogeneity
anvi-compute-genome-similarity
anvi-convert-trnaseq-database
anvi-db-info
anvi-delete-collection
anvi-delete-hmms
anvi-delete-misc-data
anvi-delete-state
anvi-dereplicate-genomes
anvi-display-contigs-stats
anvi-display-metabolism
anvi-display-pan
anvi-display-structure
anvi-estimate-genome-completeness
anvi-estimate-genome-taxonomy
anvi-estimate-metabolism
anvi-estimate-scg-taxonomy
anvi-estimate-trna-taxonomy
anvi-experimental-organization
anvi-export-collection
anvi-export-contigs
anvi-export-functions
anvi-export-gene-calls
anvi-export-gene-coverage-and-detection
anvi-export-items-order
anvi-export-locus
anvi-export-misc-data
anvi-export-splits-and-coverages
anvi-export-splits-taxonomy
anvi-export-state
anvi-export-structures
anvi-export-table
anvi-gen-contigs-database
anvi-gen-fixation-index-matrix
anvi-gen-gene-consensus-sequences
anvi-gen-gene-level-stats-databases
anvi-gen-genomes-storage
anvi-gen-network
anvi-gen-phylogenomic-tree
anvi-gen-structure-database
anvi-gen-variability-matrix
anvi-gen-variability-network
anvi-gen-variability-profile
anvi-get-aa-counts
anvi-get-codon-frequencies
anvi-get-enriched-functions-per-pan-group
anvi-get-sequences-for-gene-calls
anvi-get-sequences-for-gene-clusters
anvi-get-sequences-for-hmm-hits
anvi-get-short-reads-from-bam
anvi-get-short-reads-mapping-to-a-gene
anvi-get-split-coverages
anvi-help
anvi-import-collection
anvi-import-functions
anvi-import-items-order
anvi-import-misc-data
anvi-import-state
anvi-import-taxonomy-for-genes
anvi-import-taxonomy-for-layers
anvi-init-bam
anvi-inspect
anvi-interactive
anvi-matrix-to-newick
anvi-mcg-classifier
anvi-merge
anvi-merge-bins
anvi-meta-pan-genome
anvi-migrate
anvi-oligotype-linkmers
anvi-pan-genome
anvi-profile
anvi-push
anvi-refine
anvi-rename-bins
anvi-report-linkmers
anvi-run-hmms
anvi-run-interacdome
anvi-run-kegg-kofams
anvi-run-ncbi-cogs
anvi-run-pfams
anvi-run-scg-taxonomy
anvi-run-trna-taxonomy
anvi-run-workflow
anvi-scan-trnas
anvi-script-add-default-collection
anvi-script-augustus-output-to-external-gene-calls
anvi-script-calculate-pn-ps-ratio
anvi-script-checkm-tree-to-interactive
anvi-script-compute-ani-for-fasta
anvi-script-enrichment-stats
anvi-script-estimate-genome-size
anvi-script-filter-fasta-by-blast
anvi-script-fix-homopolymer-indels
anvi-script-gen-CPR-classifier
anvi-script-gen-distribution-of-genes-in-a-bin
anvi-script-gen-help-pages
anvi-script-gen-hmm-hits-matrix-across-genomes
anvi-script-gen-programs-network
anvi-script-gen-programs-vignette
anvi-script-gen-pseudo-paired-reads-from-fastq
anvi-script-gen-scg-domain-classifier
anvi-script-gen-short-reads
anvi-script-gen_stats_for_single_copy_genes.R
anvi-script-gen_stats_for_single_copy_genes.py
anvi-script-gen_stats_for_single_copy_genes.sh
anvi-script-get-collection-info
anvi-script-get-coverage-from-bam
anvi-script-get-hmm-hits-per-gene-call
anvi-script-get-primer-matches
anvi-script-merge-collections
anvi-script-pfam-accessions-to-hmms-directory
anvi-script-predict-CPR-genomes
anvi-script-process-genbank
anvi-script-process-genbank-metadata
anvi-script-reformat-fasta
anvi-script-run-eggnog-mapper
anvi-script-snvs-to-interactive
anvi-script-tabulate
anvi-script-transpose-matrix
anvi-script-variability-to-vcf
anvi-script-visualize-split-coverages
anvi-search-functions
anvi-self-test
anvi-setup-interacdome
anvi-setup-kegg-kofams
anvi-setup-ncbi-cogs
anvi-setup-pdb-database
anvi-setup-pfams
anvi-setup-scg-taxonomy
anvi-setup-trna-taxonomy
anvi-show-collections-and-bins
anvi-show-misc-data
anvi-split
anvi-summarize
anvi-trnaseq
anvi-update-db-description
anvi-update-structure-database
anvi-upgrade
Module¶
You can load the modules by:
module load biocontainers
module load anvio
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Anvio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=anvio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers anvio
anvi-script-reformat-fasta assembly.fa -o contigs.fa -l 1000 --simplify-names --seq-type NT
anvi-gen-contigs-database -f contigs.fa -o contigs.db -n 'An example contigs database' --num-threads 8
anvi-display-contigs-stats contigs.db
anvi-setup-ncbi-cogs --cog-data-dir $PWD --num-threads 8 --just-do-it --reset
anvi-run-ncbi-cogs -c contigs.db --cog-data-dir COG20 --num-threads 8
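The thread counts in the commands above are hard-coded to match #SBATCH -n 8. A minimal sketch of deriving the value from the allocation instead is shown here, assuming Slurm sets SLURM_NTASKS inside the job; the fallback of 1 and the function name job_threads are illustrative choices.

```shell
#!/bin/bash
# Hypothetical helper: print the number of tasks Slurm allocated,
# falling back to 1 when run outside a job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Example use inside the job script (not executed here):
#   anvi-run-ncbi-cogs -c contigs.db --cog-data-dir COG20 \
#       --num-threads "$(job_threads)"
```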
Any2fasta¶
Introduction¶
Any2fasta can convert various sequence formats to FASTA.
Versions¶
0.4.2
Commands¶
any2fasta
Module¶
You can load the modules by:
module load biocontainers
module load any2fasta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run any2fasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=any2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers any2fasta
any2fasta input.gff > out.fasta
Arcs¶
Introduction¶
ARCS is a tool for scaffolding genome sequence assemblies using linked or long read sequencing data.
Versions¶
1.2.4
Commands¶
arcs
arcs-make
Module¶
You can load the modules by:
module load biocontainers
module load arcs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run arcs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=arcs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers arcs
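The example above stops after loading the module. One step that could follow is a sanity check that the commands listed for the module are actually on PATH before any real work starts. This is only a sketch; the function name require_cmds is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report every command that is not on PATH and
# return non-zero if at least one is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use after "ml biocontainers arcs" (not executed here):
#   require_cmds arcs arcs-make || exit 1
```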
Ascatngs¶
Introduction¶
AscatNGS contains the Cancer Genome Projects workflow implementation of the ASCAT copy number algorithm for paired end sequencing.
Versions¶
4.5.0
Commands¶
alleleCounter.pl
ascatCnToVCF.pl
ascatCounts.pl
ascatFaiChunk.pl
ascatFailedCnCsv.pl
ascat.pl
ascatSnpPanelFromVcfs.pl
ascatSnpPanelGcCorrections.pl
ascatSnpPanelGenerator.pl
ascatSnpPanelMerge.pl
ascatToBigWig.pl
bamToBw.pl
blast2sam.pl
bowtie2sam.pl
bwa_aln.pl
bwa_mem.pl
cgpAppendIdsToVcf.pl
cgpVCFSplit.pl
export2sam.pl
interpolate_sam.pl
merge_or_mark.pl
novo2sam.pl
pkg-config.pl
psl2sam.pl
sam2vcf.pl
samtools.pl
seq_cache_populate.pl
soap2sam.pl
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
wgsim_eval.pl
xam_coverage_bins.pl
zoom2sam.pl
Module¶
You can load the modules by:
module load biocontainers
module load ascatngs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ascatngs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ascatngs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ascatngs
ASGAL¶
Introduction¶
ASGAL
(Alternative Splicing Graph ALigner) is a tool for detecting the alternative splicing events expressed in an RNA-Seq sample with respect to a gene annotation.
Versions¶
1.1.7
Commands¶
asgal
Module¶
You can load the modules by:
module load biocontainers
module load asgal
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ASGAL on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=asgal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers asgal
asgal -g input/genome.fa \
-a input/annotation.gtf \
-s input/sample_1.fa -o outputFolder
Aspera-connect¶
Introduction¶
Aspera Connect is software for downloading and uploading data. It includes a command line tool (ascp) that allows scripted data transfer.
Versions¶
4.2.6
Commands¶
ascp
ascp4
asperaconnect
asperaconnect.bin
asperaconnect-nmh
asperacrypt
asunprotect
Module¶
You can load the modules by:
module load biocontainers
module load aspera-connect
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run aspera-connect on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=aspera-connect
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers aspera-connect
Assembly-stats¶
Introduction¶
Assembly-stats
is a tool to get assembly statistics from FASTA and FASTQ files.
Versions¶
1.0.1
Commands¶
assembly-stats
Module¶
You can load the modules by:
module load biocontainers
module load assembly-stats
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Assembly-stats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 00:10:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=assembly-stats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers assembly-stats
assembly-stats seq.fasta
Atac-seq-pipeline¶
Introduction¶
The ENCODE ATAC-seq pipeline is used for quality control and statistical signal processing of short-read sequencing data, producing alignments and measures of enrichment. It was developed by Anshul Kundaje’s lab at Stanford University.
Versions¶
2.1.3
Commands¶
10x_bam2fastq
SAMstats
SAMstatsParallel
ace2sam
aggregate_scores_in_intervals.py
align_print_template.py
alignmentSieve
annotate.py
annotateBed
axt_extract_ranges.py
axt_to_fasta.py
axt_to_lav.py
axt_to_maf.py
bamCompare
bamCoverage
bamPEFragmentSize
bamToBed
bamToFastq
bed12ToBed6
bedToBam
bedToIgv
bed_bigwig_profile.py
bed_build_windows.py
bed_complement.py
bed_count_by_interval.py
bed_count_overlapping.py
bed_coverage.py
bed_coverage_by_interval.py
bed_diff_basewise_summary.py
bed_extend_to.py
bed_intersect.py
bed_intersect_basewise.py
bed_merge_overlapping.py
bed_rand_intersect.py
bed_subtract_basewise.py
bedpeToBam
bedtools
bigwigCompare
blast2sam.pl
bnMapper.py
bowtie2sam.pl
bwa
chardetect
closestBed
clusterBed
complementBed
compress
computeGCBias
computeMatrix
computeMatrixOperations
correctGCBias
coverageBed
createDiff
cutadapt
cygdb
cython
cythonize
deeptools
div_snp_table_chr.py
download_metaseq_example_data.py
estimateReadFiltering
estimateScaleFactor
expandCols
export2sam.pl
faidx
fastaFromBed
find_in_sorted_file.py
flankBed
gene_fourfold_sites.py
genomeCoverageBed
getOverlap
getSeq_genome_wN
getSeq_genome_woN
get_objgraph
get_scores_in_intervals.py
gffutils-cli
groupBy
gsl-config
gsl-histogram
gsl-randist
idr
int_seqs_to_char_strings.py
interpolate_sam.pl
intersectBed
intersection_matrix.py
interval_count_intersections.py
interval_join.py
intron_exon_reads.py
jsondiff
lav_to_axt.py
lav_to_maf.py
line_select.py
linksBed
lzop_build_offset_table.py
mMK_bitset.py
macs2
maf_build_index.py
maf_chop.py
maf_chunk.py
maf_col_counts.py
maf_col_counts_all.py
maf_count.py
maf_covered_ranges.py
maf_covered_regions.py
maf_div_sites.py
maf_drop_overlapping.py
maf_extract_chrom_ranges.py
maf_extract_ranges.py
maf_extract_ranges_indexed.py
maf_filter.py
maf_filter_max_wc.py
maf_gap_frequency.py
maf_gc_content.py
maf_interval_alignibility.py
maf_limit_to_species.py
maf_mapping_word_frequency.py
maf_mask_cpg.py
maf_mean_length_ungapped_piece.py
maf_percent_columns_matching.py
maf_percent_identity.py
maf_print_chroms.py
maf_print_scores.py
maf_randomize.py
maf_region_coverage_by_src.py
maf_select.py
maf_shuffle_columns.py
maf_species_in_all_files.py
maf_split_by_src.py
maf_thread_for_species.py
maf_tile.py
maf_tile_2.py
maf_tile_2bit.py
maf_to_axt.py
maf_to_concat_fasta.py
maf_to_fasta.py
maf_to_int_seqs.py
maf_translate_chars.py
maf_truncate.py
maf_word_frequency.py
makeBAM.sh
makeDiff.sh
makeFastq.sh
make_unique
makepBAM_genome.sh
makepBAM_transcriptome.sh
mapBed
maq2sam-long
maq2sam-short
maskFastaFromBed
mask_quality.py
mergeBed
metaseq-cli
multiBamCov
multiBamSummary
multiBigwigSummary
multiIntersectBed
nib_chrom_intervals_to_fasta.py
nib_intervals_to_fasta.py
nib_length.py
novo2sam.pl
nucBed
one_field_per_line.py
out_to_chain.py
pairToBed
pairToPair
pbam2bam
pbam_mapped_transcriptome
pbt_plotting_example.py
peak_pie.py
plot-bamstats
plotCorrelation
plotCoverage
plotEnrichment
plotFingerprint
plotHeatmap
plotPCA
plotProfile
prefix_lines.py
pretty_table.py
print_unique
psl2sam.pl
py.test
pybabel
pybedtools
pygmentize
pytest
python-argcomplete-check-easy-install-script
python-argcomplete-tcsh
qv_to_bqv.py
randomBed
random_lines.py
register-python-argcomplete
sam2vcf.pl
samtools
samtools.pl
seq_cache_populate.pl
shiftBed
shuffleBed
slopBed
soap2sam.pl
sortBed
speedtest.py
subtractBed
table_add_column.py
table_filter.py
tagBam
tfloc_summary.py
ucsc_gene_table_to_intervals.py
undill
unionBedGraphs
varfilter.py
venn_gchart.py
venn_mpl.py
wgsim
wgsim_eval.pl
wiggle_to_array_tree.py
wiggle_to_binned_array.py
wiggle_to_chr_binned_array.py
wiggle_to_simple.py
windowBed
windowMaker
zoom2sam.pl
Module¶
You can load the modules by:
module load biocontainers
module load atac-seq-pipeline
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run atac-seq-pipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=atac-seq-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atac-seq-pipeline
Ataqv¶
Introduction¶
Ataqv
is a toolkit for measuring and comparing ATAC-seq results, made in the Parker lab at the University of Michigan.
Versions¶
1.3.0
Commands¶
ataqv
Module¶
You can load the modules by:
module load biocontainers
module load ataqv
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ataqv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ataqv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ataqv
ataqv --peak-file sample_1_peaks.broadPeak \
--name sample_1 --metrics-file sample_1.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_1.md.bam \
> sample_1.ataqv.out
ataqv --peak-file sample_2_peaks.broadPeak \
--name sample_2 --metrics-file sample_2.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_2.md.bam \
> sample_2.ataqv.out
ataqv --peak-file sample_3_peaks.broadPeak \
--name sample_3 --metrics-file sample_3.ataqv.json.gz \
--excluded-region-file hg19.blacklist.bed.gz \
--tss-file hg19.tss.refseq.bed.gz \
--ignore-read-groups human sample_3.md.bam \
> sample_3.ataqv.out
mkarv my_fantastic_experiment sample_1.ataqv.json.gz sample_2.ataqv.json.gz sample_3.ataqv.json.gz
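The three ataqv calls above differ only in the sample name, so they can be generated from one template. The sketch below wraps the same options in a function; the name ataqv_one and the DRY_RUN switch (print the command instead of running it) are hypothetical additions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run (or, with DRY_RUN=1, only print) the ataqv
# command from the example above for the sample name given as $1.
ataqv_one() {
    sample="$1"
    set -- ataqv --peak-file "${sample}_peaks.broadPeak" \
        --name "$sample" --metrics-file "${sample}.ataqv.json.gz" \
        --excluded-region-file hg19.blacklist.bed.gz \
        --tss-file hg19.tss.refseq.bed.gz \
        --ignore-read-groups human "${sample}.md.bam"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@" > "${sample}.ataqv.out"
    fi
}

# Example use inside the job script (not executed here):
#   for s in sample_1 sample_2 sample_3; do ataqv_one "$s"; done
```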
aTRAM¶
Introduction¶
aTRAM
(automated target restricted assembly method) is an iterative assembler that performs reference-guided local de novo assemblies using a variety of available methods.
Detailed usage can be found here: https://bioinformaticshome.com/tools/wga/descriptions/aTRAM.html
Versions¶
2.4.3
Commands¶
atram.py
atram_preprocessor.py
atram_stitcher.py
Module¶
You can load the modules by:
module load biocontainers
module load atram/2.4.3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run aTRAM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=atram
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atram/2.4.3
atram_preprocessor.py --blast-db=atram_db \
--end-1=data/tutorial_end_1.fasta.gz \
--end-2=data/tutorial_end_2.fasta.gz \
--gzip
atram.py --query=tutorial-query.pep.fasta \
--blast-db=atram_db \
--output=output \
--assembler=velvet
Atropos¶
Introduction¶
Atropos
is a tool for specific, sensitive, and speedy trimming of NGS reads.
Versions¶
1.1.17
1.1.31
Commands¶
atropos
Module¶
You can load the modules by:
module load biocontainers
module load atropos
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Atropos on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=atropos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers atropos
atropos --threads 4 \
-a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTTA \
-o trimmed1.fq.gz -p trimmed2.fq.gz \
-pe1 SRR13176582_1.fastq -pe2 SRR13176582_2.fastq
Augur¶
Introduction¶
Augur
is the bioinformatics toolkit that Nextstrain uses to track evolution from sequence and serological data.
Versions¶
14.0.0
15.0.0
Commands¶
augur
Module¶
You can load the modules by:
module load biocontainers
module load augur
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Augur on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=augur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers augur
mkdir -p results
augur index --sequences zika-tutorial/data/sequences.fasta \
--output results/sequence_index.tsv
augur filter --sequences zika-tutorial/data/sequences.fasta \
--sequence-index results/sequence_index.tsv \
--metadata zika-tutorial/data/metadata.tsv \
--exclude zika-tutorial/config/dropped_strains.txt \
--output results/filtered.fasta \
--group-by country year month \
--sequences-per-group 20 \
--min-date 2012
augur align --sequences results/filtered.fasta \
--reference-sequence zika-tutorial/config/zika_outgroup.gb \
--output results/aligned.fasta \
--fill-gaps
augur tree --alignment results/aligned.fasta \
--output results/tree_raw.nwk
augur refine --tree results/tree_raw.nwk \
--alignment results/aligned.fasta \
--metadata zika-tutorial/data/metadata.tsv \
--output-tree results/tree.nwk \
--output-node-data results/branch_lengths.json \
--timetree \
--coalescent opt \
--date-confidence \
--date-inference marginal \
--clock-filter-iqd 4
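The augur steps above read several files from the zika-tutorial directory, and a missing file only surfaces when the step that needs it runs. A short pre-flight check, sketched below, could fail the job early instead; the function name require_files is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report every listed file that does not exist and
# return non-zero if at least one is missing.
require_files() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example use before the first augur command (not executed here):
#   require_files zika-tutorial/data/sequences.fasta \
#       zika-tutorial/data/metadata.tsv \
#       zika-tutorial/config/dropped_strains.txt \
#       zika-tutorial/config/zika_outgroup.gb || exit 1
```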
AUGUSTUS¶
Introduction¶
AUGUSTUS
is a program that predicts genes in eukaryotic genomic sequences.
Versions¶
3.4.0
3.5.0
Commands¶
aln2wig
augustus
bam2wig
bam2wig-dist
consensusFinder
curve2hints
etraining
fastBlockSearch
filterBam
getSeq
getSeq-dist
homGeneMapping
joingenes
prepareAlign
Module¶
You can load the modules by:
module load biocontainers
module load augustus/3.4.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run AUGUSTUS on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=AUGUSTUS
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers augustus/3.4.0
augustus --species=botrytis_cinerea genome.fasta > annotation.gff
Bactopia¶
Introduction¶
Bactopia is a flexible pipeline for complete analysis of bacterial genomes. The goal of Bactopia is to process your data with a broad set of tools, so that you can get to the fun part of analyses quicker!
Versions¶
2.0.3
2.1.1
2.2.0
3.0.0
Commands¶
bactopia
Module¶
You can load the modules by:
module load biocontainers
module load bactopia
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bactopia on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bactopia
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bactopia
bactopia datasets \
--ariba "vfdb_core,card" \
--species "Staphylococcus aureus" \
--include_genus \
--limit 100 \
--cpus 12
bactopia --accession SRX4563634 \
--datasets datasets/ \
--species "Staphylococcus aureus" \
--coverage 100 \
--genome_size median \
--outdir ena-single-sample \
--max_cpus 12
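In the job above, `--cpus 12` and `--max_cpus 12` repeat the value requested with `#SBATCH -n 12`. One way to keep them in sync is to read the task count from Slurm's `SLURM_NTASKS` environment variable. This is only a sketch: the helper name `slurm_cpus` is hypothetical, and the fallback of 1 outside a Slurm job is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: report the number of tasks Slurm granted (-n),
# defaulting to 1 when the variable is not set (e.g. outside a job).
slurm_cpus() {
    echo "${SLURM_NTASKS:-1}"
}

CPUS=$(slurm_cpus)
echo "Using ${CPUS} CPU(s)"

# Inside the job script, the value can replace the hard-coded 12, e.g.:
#   bactopia datasets --species "Staphylococcus aureus" --cpus "$CPUS"
#   bactopia --accession SRX4563634 --datasets datasets/ --max_cpus "$CPUS"
```

With this in place, changing `#SBATCH -n` is the only edit needed when the core count changes.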
Bali-phy¶
Introduction¶
Bali-phy is a tool for Bayesian co-estimation of phylogenies and multiple alignments via MCMC.
Versions¶
3.6.0
Commands¶
bali-phy
Module¶
You can load the modules by:
module load biocontainers
module load bali-phy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bali-phy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bali-phy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bali-phy
bali-phy examples/sequences/ITS/ITS1.fasta 5.8S.fasta ITS2.fasta --test
bali-phy examples/sequences/5S-rRNA/5d-clustalw.fasta -S 'gtr+Rates.gamma[4]+inv' -n 5d-free
Bamgineer¶
Introduction¶
Bamgineer
is a tool that can be used to introduce user-defined haplotype-phased allele-specific copy number variations (CNV) into an existing Binary Alignment Map (BAM) file with demonstrated applicability to simulate somatic cancer CNVs in phased whole-genome sequencing datasets.
Versions¶
1.1
Commands¶
simulate.py
Module¶
You can load the modules by:
module load biocontainers
module load bamgineer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamgineer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamgineer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamgineer
simulate.py -config inputs/config.cfg \
-splitbamdir splitbams \
-cnv_bed inputs/cnv.bed \
-vcf inputs/normal_het.vcf \
-exons inputs/exons.bed \
-outbam tumour.bam \
-results outputs \
-cancertype LUAC1
Bamliquidator¶
Introduction¶
Bamliquidator
is a set of tools for analyzing the density of short DNA sequence read alignments in the BAM file format.
Versions¶
1.5.2
Commands¶
bamliquidator
bamliquidator_bins
bamliquidator_regions
bamliquidatorbatch
Module¶
You can load the modules by:
module load biocontainers
module load bamliquidator
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamliquidator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamliquidator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamliquidator
Bam-readcount¶
Introduction¶
Bam-readcount
is a utility that runs on a BAM or CRAM file and generates low-level information about sequencing data at specific nucleotide positions.
Versions¶
1.0.0
Commands¶
bam-readcount
Module¶
You can load the modules by:
module load biocontainers
module load bam-readcount
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bam-readcount on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bam-readcount
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bam-readcount
bam-readcount -f Homo_sapiens.GRCh38.dna.primary_assembly.fa Aligned.sortedByCoord.out.bam
Bamsurgeon¶
Introduction¶
Bamsurgeon
is a set of tools for adding mutations to .bam files, used for testing mutation callers.
Versions¶
1.2
Commands¶
addindel.py
addsnv.py
addsv.py
Module¶
You can load the modules by:
module load biocontainers
module load bamsurgeon
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamsurgeon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamsurgeon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamsurgeon
addsv.py -p 1 -v test_sv.txt -f testregion_realign.bam \
-r reference.fasta -o testregion_sv_mut.bam \
--aligner mem --keepsecondary --seed 1234 \
--inslib test_inslib.fa
BamTools¶
Introduction¶
BamTools
is a programmer API and an end-user toolkit for handling BAM files. This container provides a toolkit-only version (no API to build against).
Versions¶
2.5.1
Commands¶
bamtools
Module¶
You can load the modules by:
module load biocontainers
module load bamtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BamTools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamtools
bamtools convert -format fastq -in in.bam -out out.fastq
Bamutil¶
Introduction¶
Bamutil
is a collection of programs for working on SAM/BAM files.
Versions¶
1.0.15
Commands¶
bam
Module¶
You can load the modules by:
module load biocontainers
module load bamutil
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bamutil on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bamutil
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bamutil
bam validate --params --in test/testFiles/testInvalid.sam --refFile test/testFilesLibBam/chr1_partial.fa --v --noph 2> results/validateInvalid.txt
bam convert --params --in test/testFiles/testFilter.bam --out results/convertBam.sam --noph 2> results/convertBam.log
bam splitChromosome --in test/testFiles/sortedBam1.bam --out results/splitSortedBam --noph 2> results/splitChromosome.txt
bam stats --basic --in test/testFiles/testFilter.sam --noph 2> results/basicStats.txt
bam gapInfo --in test/testFiles/testGapInfo.sam --out results/gapInfo.txt --noph 2> results/gapInfo.log
bam findCigars --in test/testFiles/testRevert.sam --out results/cigarNonM.sam --nonM --noph 2> results/cigarNonM.log
Barrnap¶
Introduction¶
Barrnap
(BAsic Rapid Ribosomal RNA Predictor) predicts the location of ribosomal RNA genes in genomes.
Versions¶
0.9.4
Commands¶
barrnap
Module¶
You can load the modules by:
module load biocontainers
module load barrnap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Barrnap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=barrnap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers barrnap
barrnap --kingdom bac -o bac_rrna.fasta < bac_genome.fasta > bac_rrna.gff3
barrnap --kingdom euk -o euk_rrna.fasta < euk_genome.fasta > euk_rrna.gff3
Basenji¶
Introduction¶
Basenji
is a tool for sequential regulatory activity predictions with deep convolutional neural networks.
Versions¶
0.5.1
Commands¶
akita_data.py
akita_data_read.py
akita_data_write.py
akita_predict.py
akita_sat_plot.py
akita_sat_vcf.py
akita_scd.py
akita_scd_multi.py
akita_test.py
akita_train.py
bam_cov.py
basenji_annot_chr.py
basenji_bench_classify.py
basenji_bench_gtex.py
basenji_bench_gtex_cmp.py
basenji_bench_phylop.py
basenji_bench_phylop_folds.py
basenji_cmp.py
basenji_data.py
basenji_data2.py
basenji_data_align.py
basenji_data_gene.py
basenji_data_hic_read.py
basenji_data_hic_write.py
basenji_data_read.py
basenji_data_write.py
basenji_fetch_app.py
basenji_fetch_app1.py
basenji_fetch_app2.py
basenji_fetch_norm.py
basenji_fetch_vcf.py
basenji_gtex_folds.py
basenji_hdf5_genes.py
basenji_hidden.py
basenji_map.py
basenji_map_genes.py
basenji_map_seqs.py
basenji_motifs.py
basenji_motifs_denovo.py
basenji_norm_h5.py
basenji_predict.py
basenji_predict_bed.py
basenji_predict_bed_multi.py
basenji_sad.py
basenji_sad_multi.py
basenji_sad_norm.py
basenji_sad_ref.py
basenji_sad_ref_multi.py
basenji_sad_table.py
basenji_sat_bed.py
basenji_sat_bed_multi.py
basenji_sat_folds.py
basenji_sat_plot.py
basenji_sat_plot2.py
basenji_sat_vcf.py
basenji_sed.py
basenji_sed_multi.py
basenji_sedg.py
basenji_test.py
basenji_test_folds.py
basenji_test_genes.py
basenji_test_reps.py
basenji_test_specificity.py
basenji_train.py
basenji_train1.py
basenji_train2.py
basenji_train_folds.py
basenji_train_hic.py
basenji_train_reps.py
save_model.py
sonnet_predict_bed.py
sonnet_sad.py
sonnet_sad_multi.py
sonnet_sat_bed.py
sonnet_sat_vcf.py
tfr_bw.py
tfr_hdf5.py
tfr_qc.py
upgrade_tf1.py
Module¶
You can load the modules by:
module load biocontainers
module load basenji
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Basenji on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=basenji
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers basenji
Bayescan¶
Introduction¶
BayeScan aims at identifying candidate loci under natural selection from genetic data, using differences in allele frequencies between populations.
Versions¶
2.1
Commands¶
bayescan
Module¶
You can load the modules by:
module load biocontainers
module load bayescan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bayescan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bayescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bayescan
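The example job above ends after loading the module. A possible continuation is sketched below. The option names `-od` (output directory) and `-threads` are taken from the BayeScan 2.1 usage message and should be verified against the program's built-in help in the container; the input file name `genotypes.txt` and the output directory name are placeholders. Passing `-threads` explicitly is suggested because BayeScan is understood to default to all CPUs it detects, which may exceed the single task requested with `-n 1`.

```shell
#!/bin/bash
INPUT="${INPUT:-genotypes.txt}"      # placeholder input file
OUTDIR="${OUTDIR:-bayescan_out}"     # placeholder output directory
THREADS="${SLURM_NTASKS:-1}"         # match the Slurm task count

# Assemble the command as an array so arguments with spaces stay intact.
bayescan_cmd=(bayescan "$INPUT" -od "$OUTDIR" -threads "$THREADS")

# Print the command for inspection; run it only where bayescan is available.
echo "${bayescan_cmd[*]}"
if command -v bayescan >/dev/null 2>&1; then
    mkdir -p "$OUTDIR"
    "${bayescan_cmd[@]}"
fi
```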
Bazam¶
Introduction¶
Bazam is a tool to extract paired reads in FASTQ format from coordinate-sorted BAM files. For more information, please check:
Docker hub: https://hub.docker.com/r/dockanomics/bazam
Home page: https://github.com/ssadedin/bazam
Versions¶
1.0.1
Commands¶
bazam
Module¶
You can load the modules by:
module load biocontainers
module load bazam
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bazam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bazam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bazam
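The example job above stops after loading the module. As one possible continuation, the sketch below extracts interleaved read pairs to a single FASTQ file. The `-bam` option follows the usage shown in the Bazam README; the wrapper name `bazam_extract` and the file names are hypothetical, and the `command -v` guard is an addition so that the snippet only prints the command when the module is not loaded.

```shell
#!/bin/bash
# Hypothetical wrapper: write read pairs from a coordinate-sorted BAM file
# to a FASTQ file.
bazam_extract() {
    local bam=$1 out=$2
    # Show the command first so it can be checked in the job log.
    echo "bazam -bam $bam > $out"
    if command -v bazam >/dev/null 2>&1; then
        bazam -bam "$bam" > "$out"
    fi
}

# Placeholder file names:
bazam_extract sample.sorted.bam sample.fastq
```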
Bbmap¶
Introduction¶
Bbmap
is a short read aligner; the package also includes various other bioinformatics tools.
Versions¶
38.93
38.96
Commands¶
addadapters.sh
a_sample_mt.sh
bbcountunique.sh
bbduk.sh
bbest.sh
bbfakereads.sh
bbmap.sh
bbmapskimmer.sh
bbmask.sh
bbmerge-auto.sh
bbmergegapped.sh
bbmerge.sh
bbnorm.sh
bbqc.sh
bbrealign.sh
bbrename.sh
bbsketch.sh
bbsplitpairs.sh
bbsplit.sh
bbstats.sh
bbversion.sh
bbwrap.sh
calcmem.sh
calctruequality.sh
callpeaks.sh
callvariants2.sh
callvariants.sh
clumpify.sh
commonkmers.sh
comparesketch.sh
comparevcf.sh
consect.sh
countbarcodes.sh
countgc.sh
countsharedlines.sh
crossblock.sh
crosscontaminate.sh
cutprimers.sh
decontaminate.sh
dedupe2.sh
dedupebymapping.sh
dedupe.sh
demuxbyname.sh
diskbench.sh
estherfilter.sh
explodetree.sh
filterassemblysummary.sh
filterbarcodes.sh
filterbycoverage.sh
filterbyname.sh
filterbysequence.sh
filterbytaxa.sh
filterbytile.sh
filterlines.sh
filtersam.sh
filtersubs.sh
filtervcf.sh
fungalrelease.sh
fuse.sh
getreads.sh
gi2ancestors.sh
gi2taxid.sh
gitable.sh
grademerge.sh
gradesam.sh
idmatrix.sh
idtree.sh
invertkey.sh
kcompress.sh
khist.sh
kmercountexact.sh
kmercountmulti.sh
kmercoverage.sh
loadreads.sh
loglog.sh
makechimeras.sh
makecontaminatedgenomes.sh
makepolymers.sh
mapPacBio.sh
matrixtocolumns.sh
mergebarcodes.sh
mergeOTUs.sh
mergesam.sh
msa.sh
mutate.sh
muxbyname.sh
normandcorrectwrapper.sh
partition.sh
phylip2fasta.sh
pileup.sh
plotgc.sh
postfilter.sh
printtime.sh
processfrag.sh
processspeed.sh
randomreads.sh
readlength.sh
reducesilva.sh
reformat.sh
removebadbarcodes.sh
removecatdogmousehuman.sh
removehuman2.sh
removehuman.sh
removemicrobes.sh
removesmartbell.sh
renameimg.sh
rename.sh
repair.sh
replaceheaders.sh
representative.sh
rqcfilter.sh
samtoroc.sh
seal.sh
sendsketch.sh
shred.sh
shrinkaccession.sh
shuffle.sh
sketchblacklist.sh
sketch.sh
sortbyname.sh
splitbytaxa.sh
splitnextera.sh
splitsam4way.sh
splitsam6way.sh
splitsam.sh
stats.sh
statswrapper.sh
streamsam.sh
summarizecrossblock.sh
summarizemerge.sh
summarizequast.sh
summarizescafstats.sh
summarizeseal.sh
summarizesketch.sh
synthmda.sh
tadpipe.sh
tadpole.sh
tadwrapper.sh
taxonomy.sh
taxserver.sh
taxsize.sh
taxtree.sh
testfilesystem.sh
testformat2.sh
testformat.sh
tetramerfreq.sh
textfile.sh
translate6frames.sh
unicode2ascii.sh
webcheck.sh
Module¶
You can load the modules by:
module load biocontainers
module load bbmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bbmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bbmap
stats.sh in=SRR11234553_1.fastq > stats_out.txt
statswrapper.sh *.fastq > statswrapper_out.txt
pileup.sh in=map1.sam out=pileup_out.txt
readlength.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq > readlength_out.txt
kmercountexact.sh in=SRR11234553_1.fastq in2=SRR11234553_2.fastq out=kmer_test.out khist=kmer.khist peaks=kmer.peak
bbmask.sh in=SRR11234553_1.fastq out=test.mark sam=map1.sam
Bbtools¶
Introduction¶
BBTools is a suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data.
Versions¶
39.00
Commands¶
Xcalcmem.sh
a_sample_mt.sh
addadapters.sh
addssu.sh
adjusthomopolymers.sh
alltoall.sh
analyzeaccession.sh
analyzegenes.sh
analyzesketchresults.sh
applyvariants.sh
bbcms.sh
bbcountunique.sh
bbduk.sh
bbest.sh
bbfakereads.sh
bbmap.sh
bbmapskimmer.sh
bbmask.sh
bbmerge-auto.sh
bbmerge.sh
bbnorm.sh
bbrealign.sh
bbrename.sh
bbsketch.sh
bbsplit.sh
bbsplitpairs.sh
bbstats.sh
bbversion.sh
bbwrap.sh
bloomfilter.sh
calcmem.sh
calctruequality.sh
callgenes.sh
callpeaks.sh
callvariants.sh
callvariants2.sh
clumpify.sh
commonkmers.sh
comparegff.sh
comparesketch.sh
comparessu.sh
comparevcf.sh
consect.sh
consensus.sh
countbarcodes.sh
countgc.sh
countsharedlines.sh
crossblock.sh
crosscontaminate.sh
cutgff.sh
cutprimers.sh
decontaminate.sh
dedupe.sh
dedupe2.sh
dedupebymapping.sh
demuxbyname.sh
diskbench.sh
estherfilter.sh
explodetree.sh
fetchproks.sh
filterassemblysummary.sh
filterbarcodes.sh
filterbycoverage.sh
filterbyname.sh
filterbysequence.sh
filterbytaxa.sh
filterbytile.sh
filterlines.sh
filterqc.sh
filtersam.sh
filtersilva.sh
filtersubs.sh
filtervcf.sh
fixgaps.sh
fungalrelease.sh
fuse.sh
gbff2gff.sh
getreads.sh
gi2ancestors.sh
gi2taxid.sh
gitable.sh
grademerge.sh
gradesam.sh
icecreamfinder.sh
icecreamgrader.sh
icecreammaker.sh
idmatrix.sh
idtree.sh
invertkey.sh
kapastats.sh
kcompress.sh
keepbestcopy.sh
khist.sh
kmercountexact.sh
kmercountmulti.sh
kmercoverage.sh
kmerfilterset.sh
kmerlimit.sh
kmerlimit2.sh
kmerposition.sh
kmutate.sh
lilypad.sh
loadreads.sh
loglog.sh
makechimeras.sh
makecontaminatedgenomes.sh
makepolymers.sh
mapPacBio.sh
matrixtocolumns.sh
mergeOTUs.sh
mergebarcodes.sh
mergepgm.sh
mergeribo.sh
mergesam.sh
mergesketch.sh
mergesorted.sh
msa.sh
mutate.sh
muxbyname.sh
partition.sh
phylip2fasta.sh
pileup.sh
plotflowcell.sh
plotgc.sh
postfilter.sh
printtime.sh
processfrag.sh
processhi-c.sh
processspeed.sh
randomgenome.sh
randomreads.sh
readlength.sh
readqc.sh
reducesilva.sh
reformat.sh
reformatpb.sh
removebadbarcodes.sh
removecatdogmousehuman.sh
removehuman.sh
removehuman2.sh
removemicrobes.sh
removesmartbell.sh
rename.sh
renameimg.sh
repair.sh
replaceheaders.sh
representative.sh
rqcfilter.sh
rqcfilter2.sh
runhmm.sh
samtoroc.sh
seal.sh
sendsketch.sh
shred.sh
shrinkaccession.sh
shuffle.sh
shuffle2.sh
sketch.sh
sketchblacklist.sh
sketchblacklist2.sh
sortbyname.sh
splitbytaxa.sh
splitnextera.sh
splitribo.sh
splitsam.sh
splitsam4way.sh
splitsam6way.sh
stats.sh
statswrapper.sh
streamsam.sh
subsketch.sh
summarizecontam.sh
summarizecoverage.sh
summarizecrossblock.sh
summarizemerge.sh
summarizequast.sh
summarizescafstats.sh
summarizeseal.sh
summarizesketch.sh
synthmda.sh
tadpipe.sh
tadpole.sh
tadwrapper.sh
taxonomy.sh
taxserver.sh
taxsize.sh
taxtree.sh
testfilesystem.sh
testformat.sh
testformat2.sh
tetramerfreq.sh
textfile.sh
translate6frames.sh
unicode2ascii.sh
unzip.sh
vcf2gff.sh
webcheck.sh
Module¶
You can load the modules by:
module load biocontainers
module load bbtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bbtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bbtools
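The job script above stops after loading the module. As one possible continuation, the sketch below trims adapters with `bbduk.sh`, using the parameter set shown in the BBDuk guide. The file names (`reads.fq`, `clean.fq`, `adapters.fa`) are placeholders, and the `command -v` guard is an addition so that the snippet prints the command instead of failing when the module is not loaded.

```shell
#!/bin/bash
READS="${READS:-reads.fq}"           # placeholder input reads
ADAPTERS="${ADAPTERS:-adapters.fa}"  # placeholder adapter reference
THREADS="${SLURM_NTASKS:-1}"         # match the Slurm task count

# BBTools scripts take key=value arguments; each array element is one argument.
bbduk_cmd=(
    bbduk.sh
    in="$READS" out=clean.fq ref="$ADAPTERS"
    ktrim=r k=23 mink=11 hdist=1 tpe tbo
    threads="$THREADS"
)

# Print the command for inspection; run it only where bbduk.sh is available.
echo "${bbduk_cmd[*]}"
if command -v bbduk.sh >/dev/null 2>&1; then
    "${bbduk_cmd[@]}"
fi
```

Setting `threads=` from `SLURM_NTASKS` keeps the tool within the cores requested by `#SBATCH -n`.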
Bcftools¶
Introduction¶
Bcftools
is a program for variant calling and manipulating files in the Variant Call Format (VCF) and its binary counterpart BCF.
Versions¶
1.13
1.14
1.17
Commands¶
bcftools
color-chrs.pl
guess-ploidy.py
plot-roh.py
plot-vcfstats
run-roh.pl
vcfutils.pl
Module¶
You can load the modules by:
module load biocontainers
module load bcftools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bcftools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bcftools
bcftools query -f '%CHROM %POS %REF %ALT\n' file.bcf
bcftools polysomy -v -o outdir/ file.vcf
# Variant calling
bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ob -o calls.bcf
Bcl2fastq¶
Introduction¶
bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis.
Versions¶
2.20.0
Commands¶
bcl2fastq
Module¶
You can load the modules by:
module load biocontainers
module load bcl2fastq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bcl2fastq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bcl2fastq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bcl2fastq
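The example job above ends after loading the module. A possible conversion command is sketched below, using the `--runfolder-dir`, `--output-dir`, `--sample-sheet`, and `--processing-threads` options of bcl2fastq 2.20. The directory names are placeholders, the location of `SampleSheet.csv` inside the run folder is an assumption, and the `command -v` guard is an addition so that the snippet only prints the command when the module is not loaded.

```shell
#!/bin/bash
RUN_DIR="${RUN_DIR:-run_folder}"     # placeholder Illumina run folder
OUT_DIR="${OUT_DIR:-fastq_out}"      # placeholder output directory
THREADS="${SLURM_NTASKS:-1}"         # match the Slurm task count

bcl2fastq_cmd=(
    bcl2fastq
    --runfolder-dir "$RUN_DIR"
    --output-dir "$OUT_DIR"
    --sample-sheet "$RUN_DIR/SampleSheet.csv"
    --processing-threads "$THREADS"
)

# Print the command for inspection; run it only where bcl2fastq is available.
echo "${bcl2fastq_cmd[*]}"
if command -v bcl2fastq >/dev/null 2>&1; then
    "${bcl2fastq_cmd[@]}"
fi
```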
Beagle¶
Introduction¶
Beagle
is a software package for phasing genotypes and for imputing ungenotyped markers. Start it with: beagle [java options] [arguments]
Note: Bref is not installed in this container.
Versions¶
5.1_24Aug19.3e8
Commands¶
beagle
Module¶
You can load the modules by:
module load biocontainers
module load beagle
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Beagle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=beagle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers beagle
beagle gt=test.vcf.gz out=test.out
BEAST 2¶
Introduction¶
BEAST 2
is a cross-platform program for Bayesian phylogenetic analysis of molecular sequences.
Versions¶
2.6.3
2.6.4
2.6.6
Commands¶
applauncher
beast
beauti
densitree
loganalyser
logcombiner
packagemanager
treeannotator
Module¶
You can load the modules by:
module load biocontainers
module load beast2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BEAST 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=beast2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers beast2
beast -threads 4 -prefix input input.xml
Bedops¶
Introduction¶
Bedops
is a software package for manipulating and analyzing genomic interval data.
Versions¶
2.4.39
Commands¶
bam2bed
bam2bed-float128
bam2bed_gnuParallel
bam2bed_gnuParallel-float128
bam2bed_gnuParallel-megarow
bam2bed_gnuParallel-typical
bam2bed-megarow
bam2bed_sge
bam2bed_sge-float128
bam2bed_sge-megarow
bam2bed_sge-typical
bam2bed_slurm
bam2bed_slurm-float128
bam2bed_slurm-megarow
bam2bed_slurm-typical
bam2bed-typical
bam2starch
bam2starch-float128
bam2starch_gnuParallel
bam2starch_gnuParallel-float128
bam2starch_gnuParallel-megarow
bam2starch_gnuParallel-typical
bam2starch-megarow
bam2starch_sge
bam2starch_sge-float128
bam2starch_sge-megarow
bam2starch_sge-typical
bam2starch_slurm
bam2starch_slurm-float128
bam2starch_slurm-megarow
bam2starch_slurm-typical
bam2starch-typical
bedextract
bedextract-float128
bedextract-megarow
bedextract-typical
bedmap
bedmap-float128
bedmap-megarow
bedmap-typical
bedops
bedops-float128
bedops-megarow
bedops-typical
closest-features
closest-features-float128
closest-features-megarow
closest-features-typical
convert2bed
convert2bed-float128
convert2bed-megarow
convert2bed-typical
gff2bed
gff2bed-float128
gff2bed-megarow
gff2bed-typical
gff2starch
gff2starch-float128
gff2starch-megarow
gff2starch-typical
gtf2bed
gtf2bed-float128
gtf2bed-megarow
gtf2bed-typical
gtf2starch
gtf2starch-float128
gtf2starch-megarow
gtf2starch-typical
gvf2bed
gvf2bed-float128
gvf2bed-megarow
gvf2bed-typical
gvf2starch
gvf2starch-float128
gvf2starch-megarow
gvf2starch-typical
psl2bed
psl2bed-float128
psl2bed-megarow
psl2bed-typical
psl2starch
psl2starch-float128
psl2starch-megarow
psl2starch-typical
rmsk2bed
rmsk2bed-float128
rmsk2bed-megarow
rmsk2bed-typical
rmsk2starch
rmsk2starch-float128
rmsk2starch-megarow
rmsk2starch-typical
sam2bed
sam2bed-float128
sam2bed-megarow
sam2bed-typical
sam2starch
sam2starch-float128
sam2starch-megarow
sam2starch-typical
sort-bed
sort-bed-float128
sort-bed-megarow
sort-bed-typical
starch
starchcat
starchcat-float128
starchcat-megarow
starchcat-typical
starchcluster_gnuParallel
starchcluster_gnuParallel-float128
starchcluster_gnuParallel-megarow
starchcluster_gnuParallel-typical
starchcluster_sge
starchcluster_sge-float128
starchcluster_sge-megarow
starchcluster_sge-typical
starchcluster_slurm
starchcluster_slurm-float128
starchcluster_slurm-megarow
starchcluster_slurm-typical
starch-diff
starch-diff-float128
starch-diff-megarow
starch-diff-typical
starch-float128
starch-megarow
starchstrip
starchstrip-float128
starchstrip-megarow
starchstrip-typical
starch-typical
switch-BEDOPS-binary-type
unstarch
unstarch-float128
unstarch-megarow
unstarch-typical
update-sort-bed-migrate-candidates
update-sort-bed-migrate-candidates-float128
update-sort-bed-migrate-candidates-megarow
update-sort-bed-migrate-candidates-typical
update-sort-bed-slurm
update-sort-bed-slurm-float128
update-sort-bed-slurm-megarow
update-sort-bed-slurm-typical
update-sort-bed-starch-slurm
update-sort-bed-starch-slurm-float128
update-sort-bed-starch-slurm-megarow
update-sort-bed-starch-slurm-typical
vcf2bed
vcf2bed-float128
vcf2bed-megarow
vcf2bed-typical
vcf2starch
vcf2starch-float128
vcf2starch-megarow
vcf2starch-typical
wig2bed
wig2bed-float128
wig2bed-megarow
wig2bed-typical
wig2starch
wig2starch-float128
wig2starch-megarow
wig2starch-typical
Module¶
You can load the modules by:
module load biocontainers
module load bedops
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bedops on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedops
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bedops
bedops -m 001.merge.001.test > 001.merge.001.observed
bedops -c 001.merge.001.test > 001.complement.001.observed
bedops -i 001.intersection.001a.test 001.intersection.001b.test > 001.intersection.001.observed
Bedtools¶
Introduction¶
Bedtools
is an extensive suite of utilities for genome arithmetic and comparing genomic features in BED format.
Versions¶
2.30.0
2.31.0
Commands¶
annotateBed
bamToBed
bamToFastq
bed12ToBed6
bedpeToBam
bedToBam
bedToIgv
bedtools
closestBed
clusterBed
complementBed
coverageBed
expandCols
fastaFromBed
flankBed
genomeCoverageBed
getOverlap
groupBy
intersectBed
linksBed
mapBed
maskFastaFromBed
mergeBed
multiBamCov
multiIntersectBed
nucBed
pairToBed
pairToPair
randomBed
shiftBed
shuffleBed
slopBed
sortBed
subtractBed
tagBam
unionBedGraphs
windowBed
windowMaker
Module¶
You can load the modules by:
module load biocontainers
module load bedtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bedtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bedtools
bedtools intersect -a a.bed -b b.bed
bedtools annotate -i variants.bed -files genes.bed conserve.bed known_var.bed
Bioawk¶
Introduction¶
Bioawk
is an extension to Brian Kernighan’s awk, adding the support of several common biological data formats, including optionally gzip’ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.
Versions¶
1.0
Commands¶
bioawk
Module¶
You can load the modules by:
module load biocontainers
module load bioawk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bioawk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioawk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bioawk
bioawk -c fastx '{print ">"$name;print revcomp($seq)}' seq.fa.gz
Biobambam¶
Introduction¶
Biobambam
is a collection of tools for early stage alignment file processing.
Versions¶
2.0.183
Commands¶
bam12auxmerge
bam12split
bam12strip
bamadapterclip
bamadapterfind
bamalignfrac
bamauxmerge
bamauxmerge2
bamauxsort
bamcat
bamchecksort
bamclipXT
bamclipreinsert
bamcollate2
bamdepth
bamdepthintersect
bamdifference
bamdownsamplerandom
bamexplode
bamexploderef
bamfastcat
bamfastexploderef
bamfastnumextract
bamfastsplit
bamfeaturecount
bamfillquery
bamfilteraux
bamfiltereofblocks
bamfilterflags
bamfilterheader
bamfilterheader2
bamfilterk
bamfilterlength
bamfiltermc
bamfilternames
bamfilterrefid
bamfilterrg
bamfixmateinformation
bamfixpairinfo
bamflagsplit
bamindex
bamintervalcomment
bamintervalcommenthist
bammapdist
bammarkduplicates
bammarkduplicates2
bammarkduplicatesopt
bammaskflags
bammdnm
bammerge
bamnumericalindex
bamnumericalindexstats
bamrank
bamranksort
bamrecalculatecigar
bamrecompress
bamrefextract
bamrefinterval
bamreheader
bamreplacechecksums
bamreset
bamscrapcount
bamseqchksum
bamsormadup
bamsort
bamsplit
bamsplitdiv
bamstreamingmarkduplicates
bamtofastq
bamvalidate
bamzztoname
Module¶
You can load the modules by:
module load biocontainers
module load biobambam
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Biobambam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=biobambam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers biobambam
bammarkduplicates I=Aligned.sortedByCoord.out.bam O=out.bam D=duplicate_out
bamsort I=Aligned.sortedByCoord.out.bam O=sorted.bam sortthreads=8
bamtofastq filename=Aligned.sortedByCoord.out.bam outputdir=fastq_out
Bioconvert¶
Introduction¶
Bioconvert
is a collaborative project to facilitate the interconversion of life science data from one format to another.
Versions¶
0.4.3
0.5.2
0.6.1
0.6.2
Commands¶
bioconvert
Module¶
You can load the modules by:
module load biocontainers
module load bioconvert
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bioconvert on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bioconvert
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bioconvert
bioconvert fastq2fasta input.fastq output.fa
Biopython¶
Introduction¶
Biopython
is a set of freely available tools for biological computation written in Python.
Versions¶
1.70-np112py27
1.70-np112py36
1.78
Commands¶
easy_install
f2py
f2py3
idle3
pip
pip3
pydoc
pydoc3
python
python3
python3-config
python3.9
python3.9-config
wheel
Module¶
You can load the modules by:
module load biocontainers
module load biopython
Interactive job¶
To run biopython interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers biopython
(base) UserID@bell-a008:~ $ python
Python 3.9.1 | packaged by conda-forge | (default, Jan 26 2021, 01:34:10)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> with open("input.gb") as input_handle:
...     for record in SeqIO.parse(input_handle, "genbank"):
...         print(record)
...
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Biopython on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=biopython
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers biopython
python script.py
Bismark¶
Introduction¶
Bismark
is a tool to map bisulfite treated sequencing reads to a genome of interest and perform methylation calls in a single step.
Versions¶
0.23.0
0.24.0
Commands¶
bismark
bam2nuc
bismark2bedGraph
bismark2report
bismark2summary
bismark_genome_preparation
bismark_methylation_extractor
copy_bismark_files_for_release.pl
coverage2cytosine
deduplicate_bismark
filter_non_conversion
methylation_consistency
Dependencies¶
Bowtie 2 v2.4.2
, Samtools v1.12
, and HISAT2 v2.2.1
are included in the container image, so users do not need to provide the dependency paths in the bismark parameters.
Module¶
You can load the modules by:
module load biocontainers
module load bismark
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bismark on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=bismark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bismark
bismark_genome_preparation --bowtie2 data/ref_genome
bismark --multicore 12 --genome data/ref_genome seq.fastq
Blasr¶
Introduction¶
Blasr
is a read mapping program that maps reads to positions in a genome by clustering short exact matches between the read and the genome, and scoring clusters using alignment.
Versions¶
5.3.5
Commands¶
blasr
Module¶
You can load the modules by:
module load biocontainers
module load blasr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Blasr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=blasr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blasr
blasr reads.bas.h5 ecoli_K12.fasta -sam
BLAST¶
Introduction¶
BLAST
(Basic Local Alignment Search Tool) finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.
Versions¶
2.11.0
2.13.0
Commands¶
blastn
blastp
blastx
blast_formatter
amino-acid-composition
between-two-genes
blastdbcheck
blastdbcmd
blastdb_aliastool
cleanup-blastdb-volumes.py
deltablast
dustmasker
eaddress
eblast
get_species_taxids.sh
legacy_blast.pl
makeblastdb
makembindex
makeprofiledb
psiblast
rpsblast
rpstblastn
run-ncbi-converter
segmasker
tblastn
tblastx
update_blastdb.pl
windowmasker
Module¶
You can load the modules by:
module load biocontainers
module load blast
BLAST Databases¶
Local copies of the BLAST databases can be found in the directory /depot/itap/datasets/blast/latest/. The environment variable BLASTDB is also set to /depot/itap/datasets/blast/latest/. To use the cdd_delta, env_nr, env_nt, nr, nt, pataa, patnt, pdbnt, refseq_protein, refseq_rna, swissprot, or tsa_nt databases, you do not need to provide the database path; pass the database name directly, for example -db nr.
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BLAST on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blast
blastp -query protein.fasta -db nr -out test_out -num_threads 4
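The thread count passed to a tool should not exceed the number of cores requested with -n. The following is a minimal sketch for keeping the two in step, assuming Slurm exports SLURM_NTASKS inside the job when -n is given; job_threads is a hypothetical helper name, not an existing command:

```shell
# job_threads is a hypothetical helper: it prints the number of tasks
# Slurm granted to the job, falling back to 1 when run outside a job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Inside a job script this keeps -num_threads in step with "#SBATCH -n":
# blastp -query protein.fasta -db nr -out test_out -num_threads "$(job_threads)"
```

With this approach only the #SBATCH -n line needs to change when you want more or fewer threads.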
BlobTools¶
Introduction¶
BlobTools
is a modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.
Detailed usage can be found here: https://github.com/DRL/blobtools
Versions¶
1.1.1
Commands¶
blobtools
Module¶
You can load the modules by:
module load biocontainers
module load blobtools/1.1.1
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run blobtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=blobtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers blobtools/1.1.1
blobtools create -i example/assembly.fna -b example/mapping_1.sorted.bam -t example/blast.out -o test && \
blobtools view -i test.blobDB.json && \
blobtools plot -i test.blobDB.json
Bmge¶
Introduction¶
Bmge
is a program that selects regions in a multiple sequence alignment that are suited for phylogenetic inference.
Versions¶
1.12
Commands¶
bmge
Module¶
You can load the modules by:
module load biocontainers
module load bmge
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bmge on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bmge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bmge
bmge -i seq.fa -t AA -o out.phy
Bowtie¶
Introduction¶
Bowtie
is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Versions¶
1.3.1
Commands¶
bowtie
bowtie-build
bowtie-inspect
Module¶
You can load the modules by:
module load biocontainers
module load bowtie
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bowtie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bowtie
bowtie-build ref.fasta ref
bowtie -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam
Bowtie 2¶
Introduction¶
Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
Versions¶
2.4.2
2.5.1
Commands¶
bowtie2
bowtie2-build
bowtie2-inspect
Module¶
You can load the modules by:
module load biocontainers
module load bowtie2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bowtie 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=bowtie2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bowtie2
bowtie2-build ref.fasta ref
bowtie2 -p 4 -x ref -1 input_1.fq -2 input_2.fq -S test.sam
Bracken¶
Introduction¶
Bracken
(Bayesian Reestimation of Abundance with KrakEN) is a highly accurate statistical method that computes the abundance of species in DNA sequences from a metagenomics sample.
Detailed usage can be found here: https://github.com/jenniferlu717/Bracken
Note
Inside the bracken
container image, kraken2
was also installed. As a result, when you load bracken/2.6.1-py37
, kraken2 version 2.1.1
is loaded automatically. Please do not load the kraken2
module together with the bracken
module, to avoid conflicts.
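A job script can guard against this conflict by checking which modules are already loaded. The following is a minimal sketch, assuming the module system exports LOADEDMODULES as a colon-separated list of name/version entries; module_loaded is a hypothetical helper name:

```shell
# module_loaded is a hypothetical helper. It assumes the module system
# exports LOADEDMODULES as a colon-separated list such as
# "biocontainers/default:bracken/2.6.1-py37".
# It succeeds when a module with exactly the given name is loaded.
module_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1/"*|*":$1:"*) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Example guard before loading bracken in a job script:
# if module_loaded kraken2; then module unload kraken2; fi
```

The surrounding colons ensure that a query for kraken2 does not match a different module whose name merely contains that string.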
Versions¶
2.6.1
2.7
Commands¶
bracken
bracken-build
combine_bracken_outputs.py
kraken2
kraken2-build
kraken2-inspect
est_abundance.py
generate_kmer_distribution.py
Module¶
You can load the modules by:
module load biocontainers
module load bracken/2.6.1-py37
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bracken on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=bracken
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bracken/2.6.1-py37
DATABASE=minikraken2_v2_8GB_201904_UPDATE
kraken2 --threads 24 --report kraken2.report --db $DATABASE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
bracken -d $DATABASE -i kraken2.report -o bracken_output -w bracken.report
BRAKER¶
Introduction¶
BRAKER
is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET and AUGUSTUS in novel eukaryotic genomes.
Versions¶
2.1.6
Commands¶
braker.pl
Helper command¶
Note
Since BRAKER
is a pipeline that trains AUGUSTUS
, i.e. writes species-specific parameter files, BRAKER needs write access to the AUGUSTUS configuration directory that contains such files. This installation comes with a stub of the AUGUSTUS configuration files, but you must
copy them out of the container into a location where you have write permissions.
A helper command copy_augustus_config
is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:
$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
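Because the copy is only needed once, a job script can skip it when the files are already in place. The following is a minimal sketch, assuming copy_augustus_config creates a config subdirectory under the target (as the export line above implies); ensure_augustus_config is a hypothetical wrapper name:

```shell
# ensure_augustus_config is a hypothetical wrapper around the
# copy_augustus_config helper described above. It copies the config
# files only when the target does not already contain a config
# directory, then points AUGUSTUS_CONFIG_PATH at it.
ensure_augustus_config() {
    target="$1"
    if [ ! -d "$target/config" ]; then
        mkdir -p "$target"
        copy_augustus_config "$target"
    fi
    export AUGUSTUS_CONFIG_PATH="$target/config"
}

# In a job script, after loading the module:
# ensure_augustus_config "$RCAC_SCRATCH/augustus"
```

Skipping the copy on later runs also avoids overwriting species parameters that earlier training runs wrote into the directory.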
Module¶
You can load the modules by:
module load biocontainers
module load braker2/2.1.6
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BRAKER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=BRAKER2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers braker2/2.1.6
# The AUGUSTUS config step is only required the first time you use BRAKER2
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
braker.pl --genome genome.fa --bam RNAseq.bam --softmasking --cores 24
Brass¶
Introduction¶
Brass
is used to analyze one or more related BAM files of paired-end sequencing to determine potential rearrangement breakpoints.
Versions¶
6.3.4
Commands¶
brass-assemble
brass_bedpe2vcf.pl
brass_foldback_reads.pl
brass-group
brassI_filter.pl
brassI_np_in.pl
brassI_pre_filter.pl
brassI_prep_bam.pl
brass.pl
Module¶
You can load the modules by:
module load biocontainers
module load brass
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Brass on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=brass
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers brass
brass.pl -c 4 -o myout -t tumour.bam -n normal.bam
Breseq¶
Introduction¶
Breseq
is a computational pipeline for the analysis of short-read re-sequencing data.
Versions¶
0.36.1
Commands¶
breseq
Module¶
You can load the modules by:
module load biocontainers
module load breseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Breseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=breseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers breseq
BUSCO¶
Introduction¶
BUSCO
(Benchmarking sets of Universal Single-Copy Orthologs) provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs.
Detailed information can be found here: https://gitlab.com/ezlab/busco/
Versions¶
5.2.2
5.3.0
5.4.1
5.4.3
5.4.4
5.4.5
Commands¶
busco
generate_plot.py
Helper command¶
Note
Augustus is a gene prediction program for eukaryotes which is required by BUSCO. Augustus requires a writable configuration directory. This installation comes with a stub of the AUGUSTUS configuration files, but you must
copy them out from the container into a location where you have write permissions.
A helper command copy_augustus_config
is provided to simplify the task. Follow the procedure below to put the config files in your scratch space:
$ mkdir -p $RCAC_SCRATCH/augustus
$ copy_augustus_config $RCAC_SCRATCH/augustus
$ export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
Module¶
You can load the modules by:
module load biocontainers
module load busco
Example job for prokaryotic genomes¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BUSCO on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers busco
## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets
## run the evaluation
busco -f -c 12 -l actinobacteria_class_odb10 -i bacteria_genome.fasta -o busco_out -m genome
## generate a simple summary plot
generate_plot.py -wd busco_out
Example job for eukaryotic genomes¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BUSCO on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=BUSCO
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers busco
## The AUGUSTUS config step is only required the first time you use BUSCO
mkdir -p $RCAC_SCRATCH/augustus
copy_augustus_config $RCAC_SCRATCH/augustus
## This is required for eukaryotic genomes
export AUGUSTUS_CONFIG_PATH=$RCAC_SCRATCH/augustus/config
## Print the full lineage datasets, and find the dataset fitting your organism.
busco --list-datasets
## run the evaluation
busco -f -c 12 -l fungi_odb10 -i fungi_protein.fasta -o busco_out_protein -m protein
busco -f -c 12 --augustus -l fungi_odb10 -i fungi_genome.fasta -o busco_out_genome -m genome
## generate a simple summary plot
generate_plot.py -wd busco_out_protein
generate_plot.py -wd busco_out_genome
Bustools¶
Introduction¶
Bustools
is a program for manipulating BUS files for single cell RNA-Seq datasets.
Versions¶
0.41.0
Commands¶
bustools
Module¶
You can load the modules by:
module load biocontainers
module load bustools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Bustools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bustools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bustools
bustools capture -s -o cDNA_capture.bus -c cDNA_transcripts.to_capture.txt -e matrix.ec -t transcripts.txt output.correct.sort.bus
bustools count -o u -g cDNA_introns_t2g.txt -e matrix.ec -t transcripts.txt --genecounts cDNA_capture.bus
BWA¶
Introduction¶
BWA
(Burrows-Wheeler Aligner) is a fast, accurate, memory-efficient aligner for short and long sequencing reads.
Versions¶
0.7.17
Commands¶
bwa
qualfa2fq.pl
xa2multi.pl
Module¶
You can load the modules by:
module load biocontainers
module load bwa
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BWA on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bwa
bwa index ref.fasta
bwa mem ref.fasta input.fq > test.sam
Bwameth¶
Introduction¶
Bwameth is a tool for fast and accurate alignment of BS-Seq reads.
Versions¶
0.2.5
Commands¶
bwameth.py
Module¶
You can load the modules by:
module load biocontainers
module load bwameth
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run bwameth on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=bwameth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers bwameth
Cactus¶
Introduction¶
Cactus
is a reference-free whole-genome multiple alignment program.
Versions¶
2.0.5
2.2.1
2.2.3-gpu
2.2.3
2.4.0-gpu
2.4.0
Commands¶
cactus
cactus-align
cactus-align-batch
cactus-blast
cactus-graphmap
cactus-graphmap-join
cactus-graphmap-split
cactus-minigraph
cactus-prepare
cactus-prepare-toil
cactus-preprocess
cactus-refmap
cactus2hal-stitch.sh
cactus2hal.py
cactusAPITests
cactus_analyseAssembly
cactus_barTests
cactus_batch_mergeChunks
cactus_chain
cactus_consolidated
cactus_covered_intervals
cactus_fasta_fragments.py
cactus_fasta_softmask_intervals.py
cactus_filterSmallFastaSequences.py
cactus_halGeneratorTests
cactus_local_alignment.py
cactus_makeAlphaNumericHeaders.py
cactus_softmask2hardmask
Module¶
You can load the modules by:
module load biocontainers
module load cactus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cactus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cactus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cactus
wget https://raw.githubusercontent.com/ComparativeGenomicsToolkit/cactus/master/examples/evolverMammals.txt
cactus jobStore evolverMammals.txt evolverMammals.hal
Cafe¶
Introduction¶
Cafe
is a computational tool for the study of gene family evolution.
Versions¶
4.2.1
5.0.0
Commands¶
cafe
Module¶
You can load the modules by:
module load biocontainers
module load cafe
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cafe on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cafe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cafe
#To get a list of commands just call CAFE with the -h or --help arguments
cafe5 -h
#To estimate lambda with no among family rate variation issue the command
cafe5 -i mammal_gene_families.txt -t mammal_tree.txt
Canu¶
Introduction¶
Canu
is a single molecule sequence assembler for genomes large and small.
Detailed usage can be found here: https://github.com/marbl/canu
Versions¶
2.1.1
2.2
Commands¶
canu
Module¶
You can load the modules by:
module load biocontainers
module load canu/2.2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run canu on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=canu
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers canu/2.2
canu -p Cm -d clavibacter_pacbio genomeSize=3.4m -pacbio *.fastq
Ccs¶
Introduction¶
Pbccs is a tool to generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads).
Versions¶
6.4.0
Commands¶
ccs
Module¶
You can load the modules by:
module load biocontainers
module load ccs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ccs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ccs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ccs
ccs --all subreads.bam ccs.bam
Cdbtools¶
Introduction¶
Cdbtools
is a collection of tools used for creating indices for quick retrieval of any particular sequences from large multi-FASTA files.
Versions¶
0.99
Commands¶
cdbfasta
cdbyank
Module¶
You can load the modules by:
module load biocontainers
module load cdbtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cdbtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cdbtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cdbtools
cdbfasta genome.fa
cdbyank -a 'seq_1' genome.fa.cidx
Cd-hit¶
Introduction¶
Cd-hit
is a very widely used program for clustering and comparing protein or nucleotide sequences.
Versions¶
4.8.1
Commands¶
FET.pl
cd-hit
cd-hit-2d
cd-hit-2d-para.pl
cd-hit-454
cd-hit-clstr_2_blm8.pl
cd-hit-div
cd-hit-div.pl
cd-hit-est
cd-hit-est-2d
cd-hit-para.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
make_multi_seq.pl
plot_2d.pl
plot_len1.pl
Module¶
You can load the modules by:
module load biocontainers
module load cd-hit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cd-hit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cd-hit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cd-hit
cd-hit -i Cm_pep.fasta -o Cmdb90 -c 0.9 -n 5 -M 16000 -T 8
cd-hit-est -i Cm_dna.fasta -o Cmdb90_nt -c 0.9 -n 5 -M 16000 -T 8
Cegma¶
Introduction¶
CEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a highly reliable set of gene annotations in virtually any eukaryotic genome.
Versions¶
2.5
Commands¶
cegma
Module¶
You can load the modules by:
module load biocontainers
module load cegma
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cegma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cegma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cegma
cegma --genome genome.fasta -o output
Cellbender¶
Introduction¶
Cellbender
is a software package for eliminating technical artifacts from high-throughput single-cell RNA sequencing (scRNA-seq) data.
Versions¶
0.2.0
0.2.2
Commands¶
cellbender
Module¶
You can load the modules by:
module load biocontainers
module load cellbender
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cellbender on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellbender
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellbender
cellbender remove-background \
--input cellranger/test_count/run_count_1kpbmcs/outs/raw_feature_bc_matrix.h5 \
--output output_cpu.h5 \
--expected-cells 1000 \
--total-droplets-included 20000 \
--fpr 0.01 \
--epochs 150
Cellphonedb¶
Introduction¶
CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions.
Versions¶
2.1.7
Commands¶
cellphonedb
Module¶
You can load the modules by:
module load biocontainers
module load cellphonedb
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellphonedb on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellphonedb
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellphonedb
Cellranger¶
Introduction¶
Cellranger
is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more.
Detailed usage can be found here: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger.
Versions¶
6.0.1
6.1.1
6.1.2
7.0.0
7.0.1
7.1.0
Commands¶
cellranger mkfastq
cellranger count
cellranger aggr
cellranger reanalyze
cellranger multi
Module¶
You can load the modules by:
module load biocontainers
module load cellranger
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 48
#SBATCH --job-name=cellranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger
cellranger count --id=run_count_1kpbmcs --fastqs=pbmc_1k_v3_fastqs --sample=pbmc_1k_v3 --transcriptome=refdata-gex-GRCh38-2020-A
Cellranger-arc¶
Introduction¶
Cell Ranger ARC is a set of analysis pipelines that process Chromium Single Cell Multiome ATAC + Gene Expression sequencing data to generate a variety of analyses pertaining to gene expression (GEX), chromatin accessibility, and their linkage. Furthermore, since the ATAC and GEX measurements are on the very same cell, we are able to perform analyses that link chromatin accessibility and GEX.
Versions¶
2.0.2
Commands¶
cellranger-arc
Module¶
You can load the modules by:
module load biocontainers
module load cellranger-arc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger-arc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-arc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-arc
Cellranger-atac¶
Introduction¶
Cellranger-atac
is a set of analysis pipelines that process Chromium Single Cell ATAC data.
Versions¶
2.0.0
2.1.0
Commands¶
cellranger-atac
Module¶
You can load the modules by:
module load biocontainers
module load cellranger-atac
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cellranger-atac on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --mem=64G
#SBATCH --job-name=cellranger-atac
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-atac
cellranger-atac count --id=sample345 \
--reference=refdata-cellranger-arc-GRCh38-2020-A-2.0.0 \
--fastqs=runs/HAWT7ADXX/outs/fastq_path \
--sample=mysample \
--localcores=8 \
--localmem=64
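The --localcores and --localmem values above repeat the -n and --mem requests by hand. The following is a minimal sketch for deriving the memory value instead, assuming Slurm exports SLURM_MEM_PER_NODE in megabytes when --mem is given; local_mem_gb is a hypothetical helper name, and the fallback of 64 simply mirrors the example above:

```shell
# local_mem_gb is a hypothetical helper. It assumes Slurm exports
# SLURM_MEM_PER_NODE in megabytes when --mem is given, and converts
# that value to whole gigabytes for --localmem. When the variable is
# unset it falls back to 64, the value used in the example above.
local_mem_gb() {
    if [ -n "${SLURM_MEM_PER_NODE:-}" ]; then
        echo $((SLURM_MEM_PER_NODE / 1024))
    else
        echo 64
    fi
}

# Inside the job script:
# cellranger-atac count ... --localcores="${SLURM_NTASKS:-8}" --localmem="$(local_mem_gb)"
```

It is worth confirming the variable with `echo $SLURM_MEM_PER_NODE` inside a test job before relying on it.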
Cellranger-dna¶
Introduction¶
Cell Ranger DNA is a set of analysis pipelines that process Chromium single cell DNA sequencing output to align reads, identify copy number variation (CNV), and compare heterogeneity among cells.
Versions¶
1.1.0
Commands¶
cellranger-dna
Module¶
You can load the modules by:
module load biocontainers
module load cellranger-dna
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellranger-dna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cellranger-dna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellranger-dna
CellRank¶

Introduction¶
CellRank
is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data.
Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.
Versions¶
1.5.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load cellrank/1.5.1
Note
The CellRank container also includes scVelo and scanpy, so do not load the separate scVelo or scanpy modules when you use CellRank.
Interactive job¶
To run CellRank interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellrank/1.5.1
python script.py
CellRank-krylov¶

Introduction¶
CellRank
is a toolkit to uncover cellular dynamics based on Markov state modeling of single-cell data. CellRank-krylov
is CellRank
installed with extra libraries, enabling it to have better performance for large datasets (>15k cells).
Detailed information about CellRank can be found here: https://cellrank.readthedocs.io/en/stable/.
Versions¶
1.5.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load cellrank-krylov/1.5.1
Note
The CellRank container also contains scVelo and scanpy. When you use CellRank, do not load the scVelo or scanpy modules.
Interactive job¶
To run CellRank-krylov interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cellrank-krylov/1.5.1
(base) UserID@bell-a008:~ $ python
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> import scvelo as scv
>>> import cellrank as cr
>>> import numpy as np
>>> scv.settings.verbosity = 3
>>> scv.settings.set_figure_params("scvelo")
>>> cr.settings.verbosity = 2
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=cellrank-krylov
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellrank-krylov/1.5.1
python script.py
cellSNP¶
Introduction¶
cellSNP
aims to pile up the expressed alleles in single-cell or bulk RNA-seq data. The output can be used directly for donor deconvolution in multiplexed single-cell RNA-seq data, particularly with vireo, which assigns cells to donors and detects doublets, even without a genotyping reference.
Versions¶
1.2.2
Commands¶
cellsnp-lite
Module¶
You can load the modules by:
module load biocontainers
module load cellsnp-lite
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cellSNP on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cellsnp-lite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cellsnp-lite
cellsnp-lite -s sample.bam -b barcode.tsv -O cellsnp_out -p 8 --minMAF 0.1 --minCOUNT 100
Celltypist¶
Introduction¶
Celltypist
is a tool for semi-automatic cell type annotation.
Versions¶
0.2.0
1.1.0
Commands¶
celltypist
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load celltypist
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Celltypist on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=celltypist
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers celltypist
celltypist --indata demo_2000_cells.h5ad --model Immune_All_Low.pkl --outdir output
Centrifuge¶
Introduction¶
Centrifuge
is a novel microbial classification engine that enables rapid, accurate, and sensitive labeling of reads and quantification of species on desktop computers.
Versions¶
1.0.4_beta
Commands¶
centrifuge
centrifuge-BuildSharedSequence.pl
centrifuge-RemoveEmptySequence.pl
centrifuge-RemoveN.pl
centrifuge-build
centrifuge-build-bin
centrifuge-class
centrifuge-compress.pl
centrifuge-download
centrifuge-inspect
centrifuge-inspect-bin
centrifuge-kreport
centrifuge-sort-nt.pl
centrifuge_evaluate.py
centrifuge_simulate_reads.py
Module¶
You can load the modules by:
module load biocontainers
module load centrifuge
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Centrifuge on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=centrifuge
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers centrifuge
centrifuge-download -o taxonomy taxonomy
centrifuge-download -o library -m -d "archaea,bacteria,viral" refseq > seqid2taxid.map
cat library/*/*.fna > input-sequences.fna
centrifuge-build -p 8 --conversion-table seqid2taxid.map \
--taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
input-sequences.fna abv
Cfsan-snp-pipeline¶
Introduction¶
The CFSAN SNP Pipeline is a Python-based system for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced from samples of interest to food safety.
Versions¶
2.2.1
Commands¶
cfsan_snp_pipeline
Module¶
You can load the modules by:
module load biocontainers
module load cfsan-snp-pipeline
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cfsan-snp-pipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cfsan-snp-pipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cfsan-snp-pipeline
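The job script above stops after loading the module. A minimal sketch of an all-in-one run follows. The directory layout (`cleanInputs/samples`, `cleanInputs/reference/lambda_virus.fasta`) and the output directory name are assumptions modeled on the pipeline's lambda virus example data, so adjust them to your own inputs.

```shell
# Assumed layout: one subdirectory of FASTQ files per sample, plus a reference FASTA.
samples_dir="cleanInputs/samples"
reference="cleanInputs/reference/lambda_virus.fasta"
out_dir="outputDirectory"

if command -v cfsan_snp_pipeline >/dev/null 2>&1; then
    # -m soft mirrors the inputs into the output directory using soft links.
    cfsan_snp_pipeline run -m soft -o "$out_dir" -s "$samples_dir" "$reference"
else
    echo "cfsan_snp_pipeline not found; load the biocontainers module first" >&2
fi
```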
Checkm-genome¶
Introduction¶
CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.
Versions¶
1.2.0
1.2.2
Commands¶
checkm
Module¶
You can load the modules by:
module load biocontainers
module load checkm-genome
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run checkm-genome on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=checkm-genome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers checkm-genome
checkm lineage_wf -t 8 -x fa bins checkm
Chewbbaca¶
Introduction¶
chewBBACA is a comprehensive pipeline including a set of functions for the creation and validation of whole genome and core genome MultiLocus Sequence Typing (wg/cgMLST) schemas, providing an allele calling algorithm based on Blast Score Ratio that can be run in multiprocessor settings and a set of functions to visualize and validate allele variation in the loci. chewBBACA performs the schema creation and allele calls on complete or draft genomes resulting from de novo assemblers.
Versions¶
2.8.5
Commands¶
chewBBACA.py
Module¶
You can load the modules by:
module load biocontainers
module load chewbbaca
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chewbbaca on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=chewbbaca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chewbbaca
chewBBACA.py CreateSchema -i complete_genomes/ -o tutorial_schema --ptf Streptococcus_agalactiae.trn --cpu 4
chewBBACA.py AlleleCall -i complete_genomes/ -g tutorial_schema/schema_seed -o results32_wgMLST --cpu 4
Chopper¶
Introduction¶
Chopper is a Rust implementation of NanoFilt+NanoLyse, both originally written in Python. This tool, intended for long-read sequencing such as PacBio or ONT, filters and trims a fastq file. It filters on average read quality and on minimum or maximum read length, can apply a headcrop (start of read) and a tailcrop (end of read), and prints the reads that pass the filter.
Versions¶
0.2.0
Commands¶
chopper
Module¶
You can load the modules by:
module load biocontainers
module load chopper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chopper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chopper
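The job script above only loads the module. Because chopper reads FASTQ from standard input and writes the passing reads to standard output, a run is usually a pipeline. The sketch below assumes a gzipped input named `reads.fastq.gz` (a placeholder) and uses illustrative thresholds: a minimum average quality of 10 and a minimum length of 500.

```shell
# Placeholder file names and illustrative thresholds (assumptions).
reads="reads.fastq.gz"
filtered="filtered_reads.fastq.gz"
min_quality=10
min_length=500

if command -v chopper >/dev/null 2>&1; then
    # Decompress, filter on quality and length, and recompress the passing reads.
    gunzip -c "$reads" | chopper -q "$min_quality" -l "$min_length" | gzip > "$filtered"
else
    echo "chopper not found; load the biocontainers module first" >&2
fi
```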
Chromap¶
Introduction¶
Chromap is an ultrafast method for aligning and preprocessing high throughput chromatin profiles.
Versions¶
0.2.2
Commands¶
chromap
Module¶
You can load the modules by:
module load biocontainers
module load chromap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run chromap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=chromap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers chromap
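The job script above only loads the module. A typical chromap run has two steps: build an index for the reference, then map the reads. The sketch below assumes paired-end ATAC-seq reads; all file names are placeholders.

```shell
# Placeholder file names (assumptions): replace with your reference and reads.
ref="ref.fa"
index="index"
read1="read1.fq"
read2="read2.fq"
out="aln.bed"

if command -v chromap >/dev/null 2>&1; then
    # Step 1: build the index for the reference genome.
    chromap -i -r "$ref" -o "$index"
    # Step 2: map paired-end reads with the ATAC-seq preset.
    chromap --preset atac -x "$index" -r "$ref" -1 "$read1" -2 "$read2" -o "$out"
else
    echo "chromap not found; load the biocontainers module first" >&2
fi
```

The index only needs to be built once per reference and can be reused across jobs.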
CICERO¶
Introduction¶
CICERO
(Clipped-reads Extended for RNA Optimization) is an assembly-based algorithm to detect diverse classes of driver gene fusions from RNA-seq.
Versions¶
1.8.1
Commands¶
Cicero.sh
Module¶
You can load the modules by:
module load biocontainers
module load cicero
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CICERO on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cicero
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cicero
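The job script above only loads the module. A minimal sketch of a fusion-detection run follows. The BAM file, reference directory, and output directory are placeholders, and the genome label `GRCh37-lite` is an assumption that must match the reference files you have downloaded separately.

```shell
# Placeholder inputs (assumptions): replace with your own data.
bam="sample.bam"
genome="GRCh37-lite"
ref_dir="reference"
out_dir="cicero_out"
# Use the core count requested with "#SBATCH -n"; fall back to 1 outside a job.
cores="${SLURM_NTASKS:-1}"

if command -v Cicero.sh >/dev/null 2>&1; then
    Cicero.sh -n "$cores" -b "$bam" -g "$genome" -r "$ref_dir" -o "$out_dir"
else
    echo "Cicero.sh not found; load the biocontainers module first" >&2
fi
```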
Circexplorer2¶
Introduction¶
CIRCexplorer2 is a comprehensive and integrative circular RNA analysis toolset. It is the successor of CIRCexplorer with plenty of new features to facilitate circular RNA identification and characterization.
Versions¶
2.3.8
Commands¶
CIRCexplorer2
fast_circ.py
fetch_ucsc.py
Module¶
You can load the modules by:
module load biocontainers
module load circexplorer2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run circexplorer2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circexplorer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circexplorer2
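The job script above only loads the module. A common workflow parses the chimeric junctions reported by an aligner and then annotates them. The sketch below assumes STAR output named `Chimeric.out.junction`, a gene annotation file `hg19_ref_all.txt`, and a genome `hg19.fa`; all three names are placeholders.

```shell
# Placeholder inputs (assumptions): replace with your own files.
junctions="Chimeric.out.junction"
annotation="hg19_ref_all.txt"
genome="hg19.fa"
bsj_bed="back_spliced_junction.bed"   # default output name of the parse step

if command -v CIRCexplorer2 >/dev/null 2>&1; then
    # Step 1: extract back-spliced junctions from the STAR chimeric output.
    CIRCexplorer2 parse -t STAR "$junctions" > CIRCexplorer2_parse.log
    # Step 2: annotate the junctions against known gene models.
    CIRCexplorer2 annotate -r "$annotation" -g "$genome" -b "$bsj_bed" \
        -o circularRNA_known.txt > CIRCexplorer2_annotate.log
else
    echo "CIRCexplorer2 not found; load the biocontainers module first" >&2
fi
```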
Circlator¶
Introduction¶
Circlator
is a tool to circularize genome assemblies.
Versions¶
1.5.5
Commands¶
circlator
python3
Module¶
You can load the modules by:
module load biocontainers
module load circlator
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Circlator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circlator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circlator
circlator minimus2 minimus2_test_run_minimus2.in.fa minimus2_test
Circompara2¶
Introduction¶
CirComPara2 is a computational pipeline to detect, quantify, and correlate expression of linear and circular RNAs from RNA-seq data that combines multiple circRNA-detection methods.
Versions¶
0.1.2.1
Commands¶
python
Rscript
circompara2
CIRCexplorer2
CIRCexplorer_compare.R
CIRI.pl
DCC
DCC_patch_CombineCounts.py
QRE_finder.py
STAR
bedtools
bowtie
bowtie-build
bowtie-inspect
bowtie2
bowtie2-build
bowtie2-inspect
bwa
ccp_circrna_expression.R
cfinder_compare.R
chimoutjunc_to_bed.py
ciri_compare.R
collect_read_stats.R
convert_circrna_collect_tables.py
cuffcompare
cuffdiff
cufflinks
cuffmerge
cuffnorm
cuffquant
dcc_compare.R
dcc_fix_strand.R
fasta_len.py
fastq_rev_comp.py
fastqc
filterCirc.awk
filterSpliceSiteCircles.pl
filter_and_cast_circexp.R
filter_fastq_reads.py
filter_findcirc_res.R
filter_segemehl.R
find_circ.py
findcirc_compare.R
gene_annotation.R
get_ce2_bwa_bks_reads.R
get_ce2_bwa_circ_reads.py
get_ce2_segemehl_bks_reads.R
get_ce2_star_bks_reads.R
get_ce2_th_bks_reads.R
get_circompara_counts.R
get_circrnaFinder_bks_reads.R
get_ciri_bks_reads.R
get_dcc_bks_reads.R
get_findcirc_bks_reads.R
get_gene_expression_files.R
get_stringtie_rawcounts.R
gffread
gtfToGenePred
gtf_collapse_features.py
gtf_to_sam
haarz.x
hisat2
hisat2-build
htseq-count
install_R_libs.R
nrForwardSplicedReads.pl
parallel
pip
postProcessStarAlignment.pl
samtools
samtools_v0
scons
segemehl.x
split_start_end_gtf.py
starCirclesToBed.pl
stringtie
testrealign_compare.R
tophat2
trim_read_header.py
trimmomatic-0.39.jar
unmapped2anchors.py
cf_filterChimout.awk
circompara
get_unmapped_reads_from_bam.sh
install_circompara
make_circrna_html
make_indexes
Module¶
You can load the modules by:
module load biocontainers
module load circompara2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run circompara2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circompara2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circompara2
Circos¶
Introduction¶
Circos
is a software package for visualizing data and information.
Versions¶
0.69.8
Commands¶
circos
Module¶
You can load the modules by:
module load biocontainers
module load circos
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Circos on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=circos
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers circos
circos -conf circos.conf
Ciri2¶
Introduction¶
CIRI2 identifies circular RNAs based on multiple seed matching.
Versions¶
2.0.6
Commands¶
CIRI2.pl
Module¶
You can load the modules by:
module load biocontainers
module load ciri2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ciri2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ciri2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ciri2
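The job script above only loads the module. A minimal sketch of a detection run follows. It assumes a SAM file produced by BWA-MEM and uses placeholder names for the reference FASTA and the GTF annotation.

```shell
# Placeholder inputs (assumptions): replace with your own files.
sam="in.sam"
ref="ref.fa"
gtf="anno.gtf"
out="output.ciri"

if command -v CIRI2.pl >/dev/null 2>&1; then
    # -I input SAM, -O output file, -F reference FASTA, -A annotation GTF
    CIRI2.pl -I "$sam" -O "$out" -F "$ref" -A "$gtf"
else
    echo "CIRI2.pl not found; load the biocontainers module first" >&2
fi
```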
CIRIquant¶
Introduction¶
CIRIquant
is a comprehensive analysis pipeline for circRNA detection and quantification in RNA-Seq data.
Versions¶
1.1.2
Commands¶
CIRIquant
Module¶
You can load the modules by:
module load biocontainers
module load ciriquant
config.yml¶
All required dependencies are installed within the CIRIquant container image, but users still need to provide the paths of these executables in config.yml. Please use the config.yml below as an example:
name: hg38
tools:
bwa: /bin/bwa
hisat2: /bin/hisat2
stringtie: /bin/stringtie
samtools: /usr/local/bin/samtools
reference:
fasta: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
gtf: reference/Homo_sapiens.GRCh38.105.gtf
bwa_index: reference/Homo_sapiens.GRCh38.dna.primary_assembly.fa
hisat_index: reference/hg38_hisat2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CIRIquant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 64
#SBATCH --job-name=ciriquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ciriquant
CIRIquant -t 64 -1 SRR12095148_1.fastq -2 SRR12095148_2.fastq --config config.yml -o Output -p test
Clair3¶
Introduction¶
Clair3 is a germline small variant caller for long-reads. Clair3 makes the best of two major method categories: pileup calling handles most variant candidates with speed, and full-alignment tackles complicated candidates to maximize precision and recall. Clair3 runs fast and has superior performance, especially at lower coverage. Clair3 is simple and modular for easy deployment and integration.
Versions¶
0.1-r11
0.1-r12
Commands¶
run_clair3.sh
Module¶
You can load the modules by:
module load biocontainers
module load clair3
Model_path¶
Note
The models are in /opt/models/. Set the parameter as --model_path="/opt/models/MODEL_NAME".
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clair3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=clair3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clair3
run_clair3.sh \
--bam_fn=input.bam \
--ref_fn=ref.fasta \
--threads=12 \
--platform=ont \
--model_path="/opt/models/ont" \
--output=output
Clairvoyante¶
Introduction¶
Clairvoyante is a deep neural network based variant caller.
Versions¶
1.02
Commands¶
clairvoyante.py
Module¶
You can load the modules by:
module load biocontainers
module load clairvoyante
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clairvoyante on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clairvoyante
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clairvoyante
cd training
clairvoyante.py callVarBam \
--chkpnt_fn ../trainedModels/fullv3-illumina-novoalign-hg001+hg002-hg38/learningRate1e-3.epoch500 \
--bam_fn ../testingData/chr21/chr21.bam \
--ref_fn ../testingData/chr21/chr21.fa \
--bed_fn ../testingData/chr21/chr21.bed \
--call_fn chr21_calls.vcf \
--ctgName chr21
Clearcnv¶
Introduction¶
ClearCNV: CNV calling from NGS panel data in the presence of ambiguity and noise.
Versions¶
0.306
Commands¶
clearCNV
Module¶
You can load the modules by:
module load biocontainers
module load clearcnv
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clearcnv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clearcnv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clearcnv
Clever-toolkit¶
Introduction¶
Clever-toolkit is a collection of tools to discover and genotype structural variations in genomes from paired-end sequencing reads. The main software is written in C++ with some auxiliary scripts in Python.
Versions¶
2.4
Commands¶
clever
laser
bam-to-alignment-priors
split-priors-by-chromosome
clever-core
postprocess-predictions
evaluate-sv-predictions
split-reads
laser-core
laser-recalibrate
genotyper
insert-length-histogram
add-score-tags-to-bam
bam2fastq
remove-redundant-variations
precompute-distributions
extract-bad-reads
filter-variations
merge-to-vcf
multiline-to-xa
filter-bam
read-group-stats
Module¶
You can load the modules by:
module load biocontainers
module load clever-toolkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clever-toolkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clever-toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clever-toolkit
cat mapped.bam | bam2fastq output_1.fq output_2.fq
Clonalframeml¶
Introduction¶
ClonalFrameML is a software package that performs efficient inference of recombination in bacterial genomes.
Versions¶
1.11
Commands¶
ClonalFrameML
Module¶
You can load the modules by:
module load biocontainers
module load clonalframeml
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clonalframeml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clonalframeml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clonalframeml
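The job script above only loads the module. ClonalFrameML takes three positional arguments: a starting tree in Newick format, a sequence alignment, and a prefix for the output files. The file names in the sketch below are placeholders.

```shell
# Placeholder inputs (assumptions): replace with your own tree and alignment.
tree="example.nwk"
alignment="example.fasta"
prefix="example_output"

if command -v ClonalFrameML >/dev/null 2>&1; then
    # Positional arguments: Newick tree, sequence alignment, output prefix.
    ClonalFrameML "$tree" "$alignment" "$prefix"
else
    echo "ClonalFrameML not found; load the biocontainers module first" >&2
fi
```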
Clust¶
Introduction¶
Clust is a fully automated method for identification of clusters (groups) of genes that are consistently co-expressed (well-correlated) in one or more heterogeneous datasets from one or multiple species.
Versions¶
1.17.0
Commands¶
clust
Module¶
You can load the modules by:
module load biocontainers
module load clust
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run clust on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clust
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clust
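The job script above only loads the module. In its simplest form, clust takes a directory of expression data files and an output directory. The sketch below assumes the datasets are in `Data/` (a placeholder) and leaves every other option at its default.

```shell
# Placeholder paths (assumptions): replace with your own directories.
data_dir="Data/"
out_dir="Results/"

if command -v clust >/dev/null 2>&1; then
    # Cluster the datasets found in data_dir and write the results to out_dir.
    clust "$data_dir" -o "$out_dir"
else
    echo "clust not found; load the biocontainers module first" >&2
fi
```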
Clustalw¶
Introduction¶
Clustalw
is a general purpose multiple alignment program for DNA or proteins.
Versions¶
2.1
Commands¶
clustalw
Module¶
You can load the modules by:
module load biocontainers
module load clustalw
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Clustalw on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=clustalw
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers clustalw
clustalw -tree -align -infile=seq.faa
CNVkit¶
Introduction¶
CNVkit
is a command-line toolkit and Python library for detecting copy number variants and alterations genome-wide from high-throughput sequencing.
Versions¶
0.9.9-py
Commands¶
cnvkit.py
cnv_annotate.py
cnv_expression_correlate.py
cnv_updater.py
Module¶
You can load the modules by:
module load biocontainers
module load cnvkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CNVkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cnvkit
cnvkit.py batch *Tumor.bam --normal *Normal.bam \
--targets my_baits.bed --fasta hg19.fasta \
--access data/access-5kb-mappable.hg19.bed \
--output-reference my_reference.cnn \
--output-dir example/
Cnvnator¶
Introduction¶
Cnvnator
is a tool for discovery and characterization of copy number variation (CNV) in population genome sequencing data.
Versions¶
0.4.1
Commands¶
cnvnator
cnvnator2VCF.pl
plotbaf.py
plotcircular.py
plotrdbaf.py
pytools.py
Module¶
You can load the modules by:
module load biocontainers
module load cnvnator
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cnvnator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cnvnator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cnvnator
cnvnator -root file.root -tree file.bam -chrom $(seq 1 22) X Y
plotcircular.py file.root
Coinfinder¶
Introduction¶
Coinfinder is an algorithm and software tool that detects genes which associate and dissociate with other genes more often than expected by chance in pangenomes.
Versions¶
1.2.0
Commands¶
coinfinder
Module¶
You can load the modules by:
module load biocontainers
module load coinfinder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run coinfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coinfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers coinfinder
coinfinder -i coinfinder-manuscript/gene_presence_absence.csv \
-I -p coinfinder-manuscript/core-gps_fasttree.newick \
-o output
CONCOCT¶
Introduction¶
CONCOCT: Clustering cONtigs with COverage and ComposiTion.
Detailed usage can be found here: https://github.com/BinPro/CONCOCT
Versions¶
1.1.0
Commands¶
concoct
concoct_refine
concoct_coverage_table.py
cut_up_fasta.py
extract_fasta_bins.py
merge_cutup_clustering.py
Module¶
You can load the modules by:
module load biocontainers
module load concoct/1.1.0-py38
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run concoct on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=concoct
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers concoct/1.1.0-py38
cut_up_fasta.py final.contigs.fa -c 10000 -o 0 --merge_last -b contigs_10K.bed > contigs_10K.fa
concoct_coverage_table.py contigs_10K.bed SRR1976948_sorted.bam > coverage_table.tsv
concoct --composition_file contigs_10K.fa --coverage_file coverage_table.tsv -b concoct_output/
Control-freec¶
Introduction¶
Control-freec
is a tool for detection of copy-number changes and allelic imbalances (including LOH) using deep-sequencing data.
Versions¶
11.6
Commands¶
freec
Module¶
You can load the modules by:
module load biocontainers
module load control-freec
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Control-freec on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=control-freec
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers control-freec
freec -conf config_chr19.txt

Cooler¶
Introduction¶
Cooler
is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.
Versions¶
0.8.11
Commands¶
cooler
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load cooler
Interactive job¶
To run Cooler interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cooler
(base) UserID@bell-a008:~ $ python
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cooler
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cooler batch jobs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cooler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cooler
cooler info data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -f bin-size data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler info -m data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler tree data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
cooler attrs data/Rao2014-GM12878-MboI-allreps-filtered.1000kb.cool
Coverm¶
Introduction¶
Coverm
is a configurable, easy to use and fast DNA read coverage and relative abundance calculator focused on metagenomics applications.
Versions¶
0.6.1
Commands¶
coverm
Module¶
You can load the modules by:
module load biocontainers
module load coverm
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Coverm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=coverm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers coverm
coverm genome --genome-fasta-files xcc.fasta --coupled SRR11234553_1.fastq SRR11234553_2.fastq
Covgen¶
Introduction¶
Covgen creates a target specific exome_full192.coverage.txt file required by MutSig.
Versions¶
1.0.2
Commands¶
CovGen
Module¶
You can load the modules by:
module load biocontainers
module load covgen
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run covgen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=covgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers covgen
Cramino¶
Introduction¶
Cramino is a tool for quick quality assessment of cram and bam files, intended for long read sequencing.
Versions¶
0.9.6
Commands¶
cramino
Module¶
You can load the modules by:
module load biocontainers
module load cramino
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cramino on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cramino
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cramino
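The job script above only loads the module. A minimal sketch of a command that could follow, assuming an alignment file named sample.bam (a hypothetical name). As far as we know, cramino takes the BAM or CRAM file as its positional argument and prints its summary to standard output; please check cramino --help for the options available in the installed version:

```shell
# Hypothetical input; replace with your own long-read BAM or CRAM file.
BAM="sample.bam"
REPORT="${BAM%.bam}.cramino.txt"

# Run only when the module is loaded and the input file exists.
if command -v cramino >/dev/null 2>&1 && [ -f "$BAM" ]; then
    cramino "$BAM" > "$REPORT"
else
    echo "cramino or $BAM not found; nothing to do" >&2
fi
```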
CRISPRCasFinder¶
Introduction¶
CRISPRCasFinder
enables the easy detection of CRISPRs and cas genes in user-submitted sequence data. It is an updated, improved, and integrated version of CRISPRFinder and CasFinder.
Detailed usage can be found here: https://github.com/dcouvin/CRISPRCasFinder
Versions¶
4.2.20
Commands¶
CRISPRCasFinder.pl
Module¶
You can load the modules by:
module load biocontainers
module load crisprcasfinder/4.2.20
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run CRISPRCasFinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=CRISPRCasFinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crisprcasfinder/4.2.20
CRISPRCasFinder.pl -in install_test/sequence.fasta -cas -cf CasFinder-2.0.3 -def G -keep
Crispresso2¶
Introduction¶
CRISPResso2 is a software pipeline designed to enable rapid and intuitive interpretation of genome editing experiments.
Versions¶
2.2.10
2.2.11a
2.2.8
2.2.9
Commands¶
CRISPResso
CRISPRessoAggregate
CRISPRessoBatch
CRISPRessoCompare
CRISPRessoPooled
CRISPRessoPooledWGSCompare
CRISPRessoWGS
Module¶
You can load the modules by:
module load biocontainers
module load crispresso2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run crispresso2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispresso2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crispresso2
CRISPResso --fastq_r1 nhej.r1.fastq.gz --fastq_r2 nhej.r2.fastq.gz -n nhej --amplicon_seq \
AATGTCCCCCAATGGGAAGTTCATCTGGCACTGCCCACAGGTGAGGAGGTCATGATCCCCTTCTGGAGCTCCCAACGGGCCGTGGTCTGGTTCATCATCTGTAAGAATGGCTTCAAGAGGCTCGGCTGTGGTT
Crispritz¶
Introduction¶
Crispritz
is a software package containing 5 different tools for predictive analysis and result assessment of CRISPR/Cas experiments.
Versions¶
2.6.5
Commands¶
crispritz.py
Module¶
You can load the modules by:
module load biocontainers
module load crispritz
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Crispritz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crispritz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crispritz
crispritz.py add-variants hg38_1000genomeproject_vcf/ hg38_ref/ &> output.redirect.out
crispritz.py index-genome hg38_ref hg38_ref/ 20bp-NGG-SpCas9.txt -bMax 2 &> output.redirect.out
crispritz.py search hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38 -mm 4 -t -scores hg38_ref/ &> output.redirect.out
crispritz.py search genome_library/NGG_2_hg38_ref/ 20bp-NGG-SpCas9.txt EMX1.sgRNA.txt emx1.hg38.bulges -index -mm 4 -bDNA 1 -bRNA 1 -t &> output.redirect.out
crispritz.py annotate-results emx1.hg38.targets.txt hg38Annotation.bed emx1.hg38 &> output.redirect.out
Crossmap¶
Introduction¶
Crossmap
is a program for genome coordinates conversion between different assemblies.
Versions¶
0.6.3
Commands¶
CrossMap.py
Module¶
You can load the modules by:
module load biocontainers
module load crossmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Crossmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=crossmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers crossmap
CrossMap.py bed GRCh37_to_GRCh38.chain.gz test.bed
cross_match¶
Introduction¶
cross_match
is a general purpose utility for comparing any two DNA sequence sets using a ‘banded’ version of swat.
Versions¶
1.090518
Commands¶
cross_match
Module¶
You can load the modules by:
module load biocontainers
module load cross_match
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cross_match on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cross_match
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cross_match
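The job script above only loads the module. A minimal sketch of a vector-screening run that could follow, assuming two FASTA files with the hypothetical names reads.fasta and vector.fasta. The option values mirror the screening example we recall from the phrap/cross_match documentation; please verify them against the documentation shipped with your version:

```shell
# Hypothetical FASTA file names; replace with your own.
QUERY="reads.fasta"
VECTOR="vector.fasta"

# Run only when the module is loaded and both inputs exist.
if command -v cross_match >/dev/null 2>&1 && [ -f "$QUERY" ] && [ -f "$VECTOR" ]; then
    # -screen additionally writes a masked copy of the query sequences
    # to "$QUERY.screen"; the match report goes to standard output.
    cross_match "$QUERY" "$VECTOR" -minmatch 10 -minscore 20 -screen > cross_match.out
fi
```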
Csvkit¶
Introduction¶
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
Versions¶
1.1.1
Commands¶
csvclean
csvcut
csvformat
csvgrep
csvjoin
csvjson
csvlook
csvpy
csvsort
csvsql
csvstack
csvstat
in2csv
sql2csv
Module¶
You can load the modules by:
module load biocontainers
module load csvkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run csvkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers csvkit
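The job script above only loads the module. A minimal sketch of a few csvkit commands that could follow; the table samples.csv and its values are made up for illustration:

```shell
# Write a tiny, made-up table so the commands have something to read.
printf 'sample,reads,mapped\nA,1000,950\nB,2000,1800\n' > samples.csv

# Run only when the csvkit module is loaded.
if command -v csvcut >/dev/null 2>&1; then
    csvcut -n samples.csv                          # list column names
    csvcut -c sample,mapped samples.csv | csvlook  # select columns, pretty-print
    csvstat samples.csv                            # per-column summary statistics
fi
```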
Csvtk¶
Introduction¶
Csvtk
is a cross-platform, efficient and practical CSV/TSV toolkit.
Versions¶
0.23.0
0.25.0
Commands¶
csvtk
Module¶
You can load the modules by:
module load biocontainers
module load csvtk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Csvtk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=csvtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers csvtk
cat data.csv \
| csvtk summary --ignore-non-digits --fields f4:sum,f5:sum --groups f1,f2 \
| csvtk pretty
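The pipeline above refers to columns named f1, f2, f4, and f5 in data.csv. A small input of that shape can be created as sketched here for a quick try; all values are made up, and the command overwrites any existing data.csv in the current directory:

```shell
# Made-up input with the column names used by the csvtk pipeline.
# Note: this overwrites data.csv in the current directory.
cat > data.csv <<'EOF'
f1,f2,f3,f4,f5
a,x,note1,1,10
a,x,note2,2,20
b,y,note3,3,30
EOF

head -n 1 data.csv   # prints the header line: f1,f2,f3,f4,f5
```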
Cufflinks¶
Introduction¶
Cufflinks
assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.
Versions¶
2.2.1
Commands¶
cuffcompare
cuffdiff
cufflinks
cuffmerge
cuffnorm
cuffquant
gffread
gtf_to_sam
Module¶
You can load the modules by:
module load biocontainers
module load cufflinks
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cufflinks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=cufflinks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cufflinks
cufflinks -p 8 -G transcript.gtf --library-type fr-unstranded -o cufflinks_output tophat_out/accepted_hits.bam
Cutadapt¶
Introduction¶
Cutadapt
finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
Versions¶
3.4
3.7
Commands¶
cutadapt
Module¶
You can load the modules by:
module load biocontainers
module load cutadapt
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cutadapt on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cutadapt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cutadapt
cutadapt -a AACCGGTT -o output.fastq input.fastq
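For paired-end data, cutadapt accepts a second adapter (-A) and a second output file (-p). A minimal sketch, assuming hypothetical adapter sequences and file names:

```shell
# Hypothetical adapters and file names; replace with your own.
ADAPTER_R1="AACCGGTT"
ADAPTER_R2="GGTTAACC"

# Run only when the module is loaded and both input files exist.
if command -v cutadapt >/dev/null 2>&1 && [ -f reads.1.fastq ] && [ -f reads.2.fastq ]; then
    # -a/-A: 3' adapters for read 1/read 2; -o/-p: outputs for read 1/read 2.
    cutadapt -a "$ADAPTER_R1" -A "$ADAPTER_R2" \
        -o trimmed.1.fastq -p trimmed.2.fastq \
        reads.1.fastq reads.2.fastq
fi
```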
Cuttlefish¶
Introduction¶
Cuttlefish is a fast, parallel, and very memory-light tool for constructing the compacted de Bruijn graph from sequencing reads or reference sequences. It is highly scalable in terms of the size of the input data.
Versions¶
2.1.1
Commands¶
cuttlefish
Module¶
You can load the modules by:
module load biocontainers
module load cuttlefish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run cuttlefish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cuttlefish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cuttlefish
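The job script above only loads the module. A minimal sketch of a graph construction from a reference FASTA that could follow; the file names are hypothetical, and the flags reflect our reading of the Cuttlefish 2 command line, so please confirm them with cuttlefish build --help:

```shell
# Hypothetical file names; replace with your own data.
REF="genome.fa"
OUT_PREFIX="cdbg"
WORK_DIR="cuttlefish_tmp"

# Create the working directory for temporary files up front.
mkdir -p "$WORK_DIR"

# Run only when the module is loaded and the input exists.
if command -v cuttlefish >/dev/null 2>&1 && [ -f "$REF" ]; then
    # -k: k-mer length (odd); -t: threads, matching "#SBATCH -n 1";
    # --ref: treat the input as reference sequences rather than reads.
    cuttlefish build -s "$REF" -k 31 -t 1 -o "$OUT_PREFIX" -w "$WORK_DIR" --ref
fi
```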
Cyvcf2¶
Introduction¶
Cyvcf2
is a Cython wrapper around htslib built for fast parsing of Variant Call Format (VCF) files.
Versions¶
0.30.14
Commands¶
cyvcf2
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load cyvcf2
Interactive job¶
To run Cyvcf2 interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n1 -t1:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers cyvcf2
(base) UserID@bell-a008:~ $ python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cyvcf2 import VCF
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Cyvcf2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=cyvcf2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers cyvcf2
cyvcf2 --help
# General usage: cyvcf2 [OPTIONS] <vcf_file>
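Cyvcf2 is mostly used as a Python library. A minimal sketch of a batch-friendly script that lists the variants in a VCF file; the script name list_variants.py and the input name input.vcf.gz are hypothetical:

```shell
# Write a small Python script next to the job script.
cat > list_variants.py <<'EOF'
import sys
from cyvcf2 import VCF

# Print chromosome, position, reference and alternate alleles per record.
for variant in VCF(sys.argv[1]):
    print(variant.CHROM, variant.POS, variant.REF, ",".join(variant.ALT), sep="\t")
EOF

# Run it only if cyvcf2 is importable and the (hypothetical) input exists.
if python -c "import cyvcf2" 2>/dev/null && [ -f input.vcf.gz ]; then
    python list_variants.py input.vcf.gz > variants.tsv
fi
```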
Das_tool¶
Introduction¶
DAS Tool is an automated method that integrates the results of a flexible number of binning algorithms to calculate an optimized, non-redundant set of bins from a single assembly.
Versions¶
1.1.6
Commands¶
DAS_Tool
Contigs2Bin_to_Fasta.sh
Fasta_to_Contig2Bin.sh
get_species_taxids.sh
Module¶
You can load the modules by:
module load biocontainers
module load das_tool
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run das_tool on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=das_tool
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers das_tool
DAS_Tool -i sample.human.gut_concoct_contigs2bin.tsv,\
sample.human.gut_maxbin2_contigs2bin.tsv,\
sample.human.gut_metabat_contigs2bin.tsv,\
sample.human.gut_tetraESOM_contigs2bin.tsv \
-l concoct,maxbin,metabat,tetraESOM \
-c sample.human.gut_contigs.fa \
-o DASToolRun2 \
--proteins DASToolRun1_proteins.faa \
--write_bin_evals \
--threads 4 \
--score_threshold 0.6
Dbg2olc¶
Introduction¶
Dbg2olc
is used for efficient assembly of large genomes using long, error-prone reads from third-generation sequencing technologies.
Versions¶
20180222
20200723
Commands¶
AssemblyStatistics
DBG2OLC
RunSparcConsensus.txt
SelectLongestReads
SeqIO.py
Sparc
SparseAssembler
split_and_run_sparc.sh
split_and_run_sparc.sh.bak
split_reads_by_backbone.py
Module¶
You can load the modules by:
module load biocontainers
module load dbg2olc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dbg2olc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dbg2olc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dbg2olc
SelectLongestReads sum 600000000 longest 0 o TEST.fq f SRR1976948.abundtrim.subset.pe.fq
Deconseq¶
Introduction¶
DeconSeq: DECONtamination of SEQuence data using a modified version of BWA-SW. The DeconSeq tool can be used to automatically detect and efficiently remove sequence contamination from genomic and metagenomic datasets. It is easily configurable and provides a user-friendly interface.
Versions¶
0.4.3
Commands¶
bwa64
deconseq.pl
splitFasta.pl
Module¶
You can load the modules by:
module load biocontainers
module load deconseq
Helper command¶
Note
Users need to use DeconSeqConfig.pm
to specify the database information. In addition, for the current deconseq
module in biocontainers, users need to copy the executables bwa64
, deconseq.pl
, and splitFasta.pl
to their current directory. This step only needs to be done once.
A helper command copy_DeconSeqConfig
is provided to copy the configuration file DeconSeqConfig.pm
and executables to your current directory. You just need to run the command copy_DeconSeqConfig
and modify DeconSeqConfig.pm
as needed:
copy_DeconSeqConfig
nano DeconSeqConfig.pm # modify database information as needed
For detailed information about how to configure DeconSeqConfig.pm
, please check its online manual (https://sourceforge.net/projects/deconseq/files/).
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deconseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deconseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deconseq
bwa64 index -p hg38_db -a bwtsw Homo_sapiens.GRCh38.dna.fa
bwa64 index -p m39_db -a bwtsw GRCm38.p4.genome.fa
deconseq.pl -f input.fastq -dbs hg38_db -dbs_retain m39_db
Deepbgc¶
Introduction¶
Deepbgc
is a tool for BGC detection and classification using deep learning.
Versions¶
0.1.26
0.1.30
Commands¶
deepbgc
Module¶
You can load the modules by:
module load biocontainers
module load deepbgc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Deepbgc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepbgc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepbgc
export DEEPBGC_DOWNLOADS_DIR=$PWD
deepbgc download
deepbgc pipeline genome.fa -o output
Deepconsensus¶
Introduction¶
DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
Versions¶
0.2.0
Commands¶
deepconsensus
ccs
actc
Module¶
You can load the modules by:
module load biocontainers
module load deepconsensus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deepconsensus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepconsensus
deepconsensus run \
--subreads_to_ccs=subreads_to_ccs.bam \
--ccs_fasta=ccs.fasta \
--checkpoint=checkpoint-50 \
--output=output.fastq \
--batch_zmws=100
Deepsignal2¶
Introduction¶
Deepsignal2
is a deep-learning method for detecting DNA methylation state from Oxford Nanopore sequencing reads.
Versions¶
0.1.2
Commands¶
deepsignal2
call_modification_frequency.py
combine_call_mods_freq_files.py
combine_two_strands_frequency.py
concat_two_files.py
evaluate_mods_call.py
filter_samples_by_label.py
filter_samples_by_positions.py
gff_reader.py
randsel_file_rows.py
shuffle_a_big_file.py
split_freq_file_by_5mC_motif.py
txt_formater.py
Module¶
You can load the modules by:
module load biocontainers
module load deepsignal2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Deepsignal2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=deepsignal2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepsignal2
DeepTools¶
Introduction¶
DeepTools
is a collection of user-friendly tools for normalization and visualization of deep-sequencing data.
Versions¶
3.5.1-py
Commands¶
alignmentSieve
bamCompare
bamCoverage
bamPEFragmentSize
bigwigCompare
computeGCBias
computeMatrix
computeMatrixOperations
correctGCBias
deeptools
estimateReadFiltering
estimateScaleFactor
multiBamSummary
multiBigwigSummary
plotCorrelation
plotCoverage
plotEnrichment
plotFingerprint
plotHeatmap
plotPCA
plotProfile
Module¶
You can load the modules by:
module load biocontainers
module load deeptools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run DeepTools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deeptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deeptools
bamCoverage --normalizeUsing CPM -p 4 \
--effectiveGenomeSize 11000000 \
-b WT_coord_sorted.bam \
-o WT_coord_sorted.bw
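The thread count passed to -p should not exceed the number of cores requested with #SBATCH -n. One way to keep the two in sync is to read the value from SLURM_NTASKS, which Slurm sets inside the job when -n is given. A minimal sketch, reusing the file names from the example above:

```shell
# SLURM_NTASKS is set by Slurm when -n is given; fall back to 1 elsewhere.
THREADS="${SLURM_NTASKS:-1}"

# Run only when the module is loaded and the input exists.
if command -v bamCoverage >/dev/null 2>&1 && [ -f WT_coord_sorted.bam ]; then
    bamCoverage --normalizeUsing CPM -p "$THREADS" \
        --effectiveGenomeSize 11000000 \
        -b WT_coord_sorted.bam \
        -o WT_coord_sorted.bw
fi
```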
Deepvariant¶
Introduction¶
DeepVariant is a deep learning-based variant caller that takes aligned reads (in BAM or CRAM format), produces pileup image tensors from them, classifies each tensor using a convolutional neural network, and finally reports the results in a standard VCF or gVCF file.
Versions¶
1.0.0
1.1.0
Commands¶
call_variants
get-pip.py
make_examples
model_eval
model_train
postprocess_variants
run-prereq.sh
run_deepvariant
run_deepvariant.py
settings.sh
show_examples
vcf_stats_report
Module¶
You can load the modules by:
module load biocontainers
module load deepvariant
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run deepvariant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers deepvariant
INPUT_DIR="${PWD}/quickstart-testdata"
DATA_HTTP_DIR="https://storage.googleapis.com/deepvariant/quickstart-testdata"
mkdir -p ${INPUT_DIR}
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/NA12878_S1.chr20.10_10p1mb.bam.bai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.bed
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/test_nist.b37_chr20_100kbp_at_10mb.vcf.gz.tbi
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.fai
wget -P ${INPUT_DIR} "${DATA_HTTP_DIR}"/ucsc.hg19.chr20.unittest.fasta.gz.gzi
run_deepvariant --model_type=WGS \
--ref="${INPUT_DIR}"/ucsc.hg19.chr20.unittest.fasta \
--reads="${INPUT_DIR}"/NA12878_S1.chr20.10_10p1mb.bam \
--regions "chr20:10,000,000-10,010,000" \
--output_vcf="output/output.vcf.gz" \
--output_gvcf="output/output.g.vcf.gz" \
--intermediate_results_dir "output/intermediate_results_dir" \
--num_shards=4
Delly¶
Introduction¶
Delly
is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data.
Versions¶
0.9.1
1.0.3
1.1.3
1.1.5
1.1.6
Commands¶
delly
Module¶
You can load the modules by:
module load biocontainers
module load delly
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Delly on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=delly
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers delly
delly call -x hg19.excl -o delly.bcf -g hg19.fa input.bam
delly filter -f somatic -o t1.pre.bcf -s samples.tsv t1.bcf
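The somatic filter reads a tab-delimited sample description file (samples.tsv above): the first column holds the sample ID as it appears in the BCF file, and the second column is either tumor or control. A minimal sketch for creating such a file; the two sample IDs are hypothetical and must be replaced with the names in your own BCF:

```shell
# Hypothetical sample IDs; they must match the sample names in the BCF.
printf 'tumor_sample\ttumor\nnormal_sample\tcontrol\n' > samples.tsv

# Show the result for a quick visual check.
cat samples.tsv
```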
Dendropy¶
Introduction¶
DendroPy is a Python library for phylogenetic computing. It provides classes and functions for the simulation, processing, and manipulation of phylogenetic trees and character matrices, and supports the reading and writing of phylogenetic data in a range of formats, such as NEXUS, NEWICK, NeXML, Phylip, FASTA, etc. Application scripts for performing some useful phylogenetic operations, such as data conversion and tree posterior distribution summarization, are also distributed and installed as part of the library. DendroPy can thus function as a stand-alone library for phylogenetics, a component of more complex multi-library phyloinformatic pipelines, or as a scripting “glue” that assembles and drives such pipelines.
Versions¶
4.5.2
Commands¶
python
python3
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load dendropy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dendropy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dendropy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dendropy
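The job script above only loads the module. A minimal sketch of a Python step that could follow: it parses a small Newick tree and prints its leaf labels. The script name tree_leaves.py and the tree itself are made up for illustration:

```shell
# Write a small Python script next to the job script.
cat > tree_leaves.py <<'EOF'
import dendropy

# Parse a small Newick string (made-up example tree).
tree = dendropy.Tree.get(data="((A,B),(C,D));", schema="newick")

# Print the label of every leaf.
for leaf in tree.leaf_node_iter():
    print(leaf.taxon.label)
EOF

# Run it only if DendroPy is importable (module loaded).
if python -c "import dendropy" 2>/dev/null; then
    python tree_leaves.py
fi
```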
Diamond¶
Introduction¶
Diamond
is a sequence aligner for protein and translated DNA searches, designed for high performance analysis of big sequence data. The key features are:
Pairwise alignment of proteins and translated DNA at 100x-10,000x speed of BLAST.
Frameshift alignments for long read analysis.
Low resource requirements and suitable for running on standard desktops or laptops.
Various output formats, including BLAST pairwise, tabular and XML, as well as taxonomic classification.
Details about its usage can be found here: https://github.com/bbuchfink/diamond
Versions¶
2.0.13
2.0.14
2.0.15
2.1.6
Commands¶
diamond makedb
diamond prepdb
diamond blastp
diamond blastx
diamond view
diamond version
diamond dbinfo
diamond help
diamond test
Module¶
You can load the modules by:
module load biocontainers
module load diamond/2.0.14
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run diamond on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=diamond
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers diamond/2.0.14
diamond makedb --in uniprot_sprot.fasta -d uniprot_sprot
diamond blastp -p 24 -q test.faa -d uniprot_sprot --very-sensitive -o blastp_output.txt
Dnaapler¶
Introduction¶
dnaapler is a simple python program that takes a single nucleotide input sequence (in FASTA format), finds the desired start gene using blastx against an amino acid sequence database, checks that the start codon of this gene is found, and if so, then reorients the chromosome to begin with this gene on the forward strand.
Versions¶
0.1.0
Commands¶
dnaapler
Module¶
You can load the modules by:
module load biocontainers
module load dnaapler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dnaapler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaapler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dnaapler
Dnaio¶
Introduction¶
Dnaio
is a Python 3.7+ library for very efficient parsing and writing of FASTQ and also FASTA files.
Versions¶
0.8.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load dnaio
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dnaio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dnaio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dnaio
python dnaio_test.py
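The script dnaio_test.py called above is not shown. A hypothetical script of that kind is sketched here; it is written to dnaio_example.py so that no existing file is overwritten, and it counts the reads and bases in a FASTQ file with the assumed name reads.fastq:

```shell
# Write a small example script; run it with: python dnaio_example.py
cat > dnaio_example.py <<'EOF'
import dnaio

# Count records and bases in a FASTQ file (hypothetical file name).
n_reads = 0
n_bases = 0
with dnaio.open("reads.fastq") as reader:
    for record in reader:
        n_reads += 1
        n_bases += len(record.sequence)
print(f"{n_reads} reads, {n_bases} bases")
EOF
```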
Dragonflye¶
Introduction¶
Dragonflye is a pipeline that aims to make assembling Oxford Nanopore reads quick and easy.
Versions¶
1.0.13
1.0.14
Commands¶
dragonflye
Module¶
You can load the modules by:
module load biocontainers
module load dragonflye
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run dragonflye on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=dragonflye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dragonflye
dragonflye --cpus 8 \
--outdir output \
--reads SRR18498195.fastq
Drep¶
Introduction¶
Drep
is a Python program for rapidly comparing large numbers of genomes.
Versions¶
3.2.2
Commands¶
dRep
Module¶
You can load the modules by:
module load biocontainers
module load drep
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Drep on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers drep
dRep compare compare_out -g tests/genomes/*
dRep dereplicate dereplicate_out -g tests/genomes/*
Dropest¶
Introduction¶
Dropest
is a pipeline for initial analysis of droplet-based single-cell RNA-seq data.
Versions¶
0.8.6
Commands¶
dropest
droptag
dropReport.Rsc
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load dropest
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dropest on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=dropest
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dropest
dropest -f -c 10x.xml -C 1200 neurons_900_possorted_genome_bam.bam
Drop-seq¶
Introduction¶
Drop-seq is a collection of Java tools for analyzing Drop-seq data.
Versions¶
2.5.2
Commands¶
AssignCellsToSamples
BamTagHistogram
BamTagOfTagCounts
BaseDistributionAtReadPosition
BipartiteRabiesVirusCollapse
CensusSeq
CollapseBarcodesInPlace
CollapseTagWithContext
CompareDropSeqAlignments
ComputeUMISharing
ConvertTagToReadGroup
ConvertToRefFlat
CountUnmatchedSampleIndices
CreateIntervalsFiles
CreateMetaCells
CreateSnpIntervalFromVcf
CsiAnalysis
DetectBeadSubstitutionErrors
DetectBeadSynthesisErrors
DetectDoublets
DigitalExpression
DownsampleBamByTag
DownsampleTranscriptsAndQuantiles
Drop-seq_Alignment_Cookbook.pdf
Drop-seq_alignment.sh
FilterBam
FilterBamByGeneFunction
FilterBamByTag
FilterDge
FilterGtf
FilterValidRabiesBarcodes
GatherGeneGCLength
GatherMolecularBarcodeDistributionByGene
GatherReadQualityMetrics
GenotypeSperm
MaskReferenceSequence
MergeDgeSparse
PolyATrimmer
ReduceGtf
RollCall
SelectCellsByNumTranscripts
SignTest
SingleCellRnaSeqMetricsCollector
SpermSeqMarkDuplicates
SplitBamByCell
TagBam
TagBamWithReadSequenceExtended
TagReadWithGeneExonFunction
TagReadWithGeneFunction
TagReadWithInterval
TagReadWithRabiesBarcodes
TrimStartingSequence
ValidateAlignedSam
ValidateReference
create_Drop-seq_reference_metadata.sh
Module¶
You can load the modules by:
module load biocontainers
module load drop-seq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run drop-seq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=drop-seq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers drop-seq
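The job script above only loads the module. A minimal sketch of a digital expression step that could follow, assuming a BAM file that has already been tagged with cell barcodes, molecular barcodes, and gene annotations. The file name is hypothetical, and the Picard-style arguments follow our recollection of the Drop-seq alignment cookbook, so please check DigitalExpression --help for your version:

```shell
# Hypothetical input: a BAM already tagged with barcodes and gene annotations.
TAGGED_BAM="out_gene_exon_tagged.bam"

# Run only when the module is loaded and the input exists.
if command -v DigitalExpression >/dev/null 2>&1 && [ -f "$TAGGED_BAM" ]; then
    DigitalExpression \
        I="$TAGGED_BAM" \
        O="${TAGGED_BAM%.bam}.dge.txt.gz" \
        SUMMARY="${TAGGED_BAM%.bam}.dge.summary.txt" \
        NUM_CORE_BARCODES=100
fi
```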
Dsuite¶
Introduction¶
Dsuite
is a fast C++ implementation, allowing genome scale calculations of the D and f4-ratio statistics across all combinations of tens or hundreds of populations or species directly from a variant call format (VCF) file.
Versions¶
0.4.r43
0.5.r44
Commands¶
Dsuite
dtools.py
DtriosParallel
Module¶
You can load the modules by:
module load biocontainers
module load dsuite
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Dsuite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=dsuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers dsuite
Dsuite Dtrios -c -n no_geneflow -t simulated_tree_no_geneflow.nwk chr1_no_geneflow.vcf.gz species_sets.txt
easySFS¶
Introduction¶
easySFS
is a tool for the effective selection of population size projection for construction of the site frequency spectrum.
Versions¶
1.0
Commands¶
easySFS.py
Module¶
You can load the modules by:
module load biocontainers
module load easysfs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run easySFS on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=easysfs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers easysfs
easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt --preview -a
easySFS.py -i example_files/wcs_1200.vcf -p example_files/wcs_pops.txt -a --proj=7,7
Edta¶
Introduction¶
Edta
is developed for automated whole-genome de novo TE annotation and for benchmarking the annotation performance of TE libraries.
Note: To run EDTA, invoke the script directly:
EDTA.pl [OPTIONS]
DO NOT call it as ‘perl EDTA.pl’.
Versions¶
1.9.6
2.0.0
Commands¶
EDTA.pl
EDTA_processI.pl
EDTA_raw.pl
FET.pl
bdf2gdfont.pl
buildRMLibFromEMBL.pl
buildSummary.pl
calcDivergenceFromAlign.pl
cd-hit-2d-para.pl
cd-hit-clstr_2_blm8.pl
cd-hit-div.pl
cd-hit-para.pl
check_result.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
convert_MGEScan3.0.pl
convert_ltr_struc.pl
convert_ltrdetector.pl
createRepeatLandscape.pl
down_tRNA.pl
dupliconToSVG.pl
filter_rt.pl
genome_plot.pl
genome_plot2.pl
genome_plot_svg.pl
getRepeatMaskerBatch.pl
legacy_blast.pl
lib-test.pl
make_multi_seq.pl
maskFile.pl
plot_2d.pl
plot_len1.pl
rmOut2Fasta.pl
rmOutToGFF3.pl
rmToUCSCTables.pl
update_blastdb.pl
viewMSA.pl
wublastToCrossmatch.pl
Module¶
You can load the modules by:
module load biocontainers
module load edta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Edta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=edta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers edta
EDTA.pl --genome genome.fa --cds genome.cds.fa --curatedlib EDTA/database/rice6.9.5.liban --exclude genome.exclude.bed --overwrite 1 --sensitive 1 --anno 1 --evaluate 1 --threads 10
Eggnog-mapper¶
Introduction¶
Eggnog-mapper
is a tool for fast functional annotation of novel sequences.
Versions¶
2.1.7
Commands¶
create_dbs.py
download_eggnog_data.py
emapper.py
hmm_mapper.py
hmm_server.py
hmm_worker.py
vba_extract.py
Module¶
You can load the modules by:
module load biocontainers
module load eggnog-mapper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Eggnog-mapper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=eggnog-mapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers eggnog-mapper
emapper.py -i proteins.faa --cpu 24 -o protein.out
emapper.py -m diamond --itype CDS -i cDNA.fasta -o cdna.out --cpu 24
Emboss¶
Introduction¶
Emboss
is “The European Molecular Biology Open Software Suite”.
Versions¶
6.6.0
Commands¶
aaindexextract
abiview
acdc
acdgalaxy
acdlog
acdpretty
acdtable
acdtrace
acdvalid
aligncopy
aligncopypair
antigenic
assemblyget
backtranambig
backtranseq
banana
biosed
btwisted
cachedas
cachedbfetch
cacheebeyesearch
cacheensembl
cai
chaos
charge
checktrans
chips
cirdna
codcmp
codcopy
coderet
compseq
cons
consambig
cpgplot
cpgreport
cusp
cutgextract
cutseq
dan
dbiblast
dbifasta
dbiflat
dbigcg
dbtell
dbxcompress
dbxedam
dbxfasta
dbxflat
dbxgcg
dbxobo
dbxreport
dbxresource
dbxstat
dbxtax
dbxuncompress
degapseq
density
descseq
diffseq
distmat
dotmatcher
dotpath
dottup
dreg
drfinddata
drfindformat
drfindid
drfindresource
drget
drtext
edamdef
edamhasinput
edamhasoutput
edamisformat
edamisid
edamname
edialign
einverted
embossdata
embossupdate
embossversion
emma
emowse
entret
epestfind
eprimer3
eprimer32
equicktandem
est2genome
etandem
extractalign
extractfeat
extractseq
featcopy
featmerge
featreport
feattext
findkm
freak
fuzznuc
fuzzpro
fuzztran
garnier
geecee
getorf
godef
goname
helixturnhelix
hmoment
iep
infoalign
infoassembly
infobase
inforesidue
infoseq
isochore
jaspextract
jaspscan
jembossctl
lindna
listor
makenucseq
makeprotseq
marscan
maskambignuc
maskambigprot
maskfeat
maskseq
matcher
megamerger
merger
msbar
mwcontam
mwfilter
needle
needleall
newcpgreport
newcpgseek
newseq
nohtml
noreturn
nospace
notab
notseq
nthseq
nthseqset
octanol
oddcomp
ontocount
ontoget
ontogetcommon
ontogetdown
ontogetobsolete
ontogetroot
ontogetsibs
ontogetup
ontoisobsolete
ontotext
palindrome
pasteseq
patmatdb
patmatmotifs
pepcoil
pepdigest
pepinfo
pepnet
pepstats
pepwheel
pepwindow
pepwindowall
plotcon
plotorf
polydot
preg
prettyplot
prettyseq
primersearch
printsextract
profit
prophecy
prophet
prosextract
pscan
psiphi
rebaseextract
recoder
redata
refseqget
remap
restover
restrict
revseq
runJemboss.sh
seealso
seqcount
seqmatchall
seqret
seqretsetall
seqretsplit
seqxref
seqxrefget
servertell
showalign
showdb
showfeat
showorf
showpep
showseq
showserver
shuffleseq
sigcleave
silent
sirna
sixpack
sizeseq
skipredundant
skipseq
splitsource
splitter
stretcher
stssearch
supermatcher
syco
taxget
taxgetdown
taxgetrank
taxgetspecies
taxgetup
tcode
textget
textsearch
tfextract
tfm
tfscan
tmap
tranalign
transeq
trimest
trimseq
trimspace
twofeat
union
urlget
variationget
vectorstrip
water
whichdb
wobble
wordcount
wordfinder
wordmatch
wossdata
wossinput
wossname
wossoperation
wossoutput
wossparam
wosstopic
xmlget
xmltext
yank
Module¶
You can load the modules by:
module load biocontainers
module load emboss
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Emboss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=emboss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers emboss
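The job script above loads the module but does not run a program. As a hedged illustration, the lines below could follow `ml biocontainers emboss`: they print the installed version with `embossversion` and translate a nucleotide FASTA file with `transeq`. The file names `cds.fasta` and `cds.pep` are placeholders, not files shipped with the module.

```shell
# Placeholder file names (assumptions); replace with your own data.
in_fasta="cds.fasta"
out_pep="cds.pep"

# Run only when the tool is on PATH and the input file exists.
if command -v transeq >/dev/null 2>&1 && [ -f "$in_fasta" ]; then
    embossversion
    # -auto turns off interactive prompting, which suits batch jobs.
    transeq -sequence "$in_fasta" -outseq "$out_pep" -auto
    emboss_status="ran"
else
    echo "transeq or $in_fasta not available; skipping" >&2
    emboss_status="skipped"
fi
echo "EMBOSS example: $emboss_status"
```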
Ensembl-vep¶
Introduction¶
Ensembl-vep (Ensembl Variant Effect Predictor) predicts the functional effects of genomic variants.
Versions¶
106.1
107.0
108.2
Commands¶
vep
haplo
variant_recoder
Module¶
You can load the modules by:
module load biocontainers
module load ensembl-vep
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ensembl-vep on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ensembl-vep
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ensembl-vep
haplo -i bos_taurus_UMD3.1.vcf -o out.txt
Epic2¶
Introduction¶
Epic2
is an ultraperformant ChIP-Seq broad domain finder based on SICER.
Versions¶
0.0.51
0.0.52
Commands¶
epic2
epic2-bw
epic2-df
Module¶
You can load the modules by:
module load biocontainers
module load epic2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Epic2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=epic2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers epic2
epic2 -t /examples/test.bed.gz \
-c /examples/control.bed.gz \
> deleteme.txt
Evidencemodeler¶
Introduction¶
Evidencemodeler
is software that combines ab initio gene predictions with protein and transcript alignments into weighted consensus gene structures.
Versions¶
1.1.1
Commands¶
evidence_modeler.pl
BPbtab.pl
EVMLite.pl
EVM_to_GFF3.pl
convert_EVM_outputs_to_GFF3.pl
create_weights_file.pl
execute_EVM_commands.pl
extract_complete_proteins.pl
gff3_file_to_proteins.pl
gff3_gene_prediction_file_validator.pl
gff_range_retriever.pl
partition_EVM_inputs.pl
recombine_EVM_partial_outputs.pl
summarize_btab_tophits.pl
write_EVM_commands.pl
Module¶
You can load the modules by:
module load biocontainers
module load evidencemodeler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Evidencemodeler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=evidencemodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers evidencemodeler
evidence_modeler.pl --genome genome.fasta \
--weights weights.txt \
--gene_predictions gene_predictions.gff3 \
--protein_alignments protein_alignments.gff3 \
--transcript_alignments transcript_alignments.gff3 \
> evm.out
Exonerate¶
Introduction¶
Exonerate
is a generic tool for pairwise sequence comparison/alignment.
Versions¶
2.4.0
Commands¶
exonerate
Module¶
You can load the modules by:
module load biocontainers
module load exonerate
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Exonerate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=exonerate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers exonerate
exonerate -m genome2genome cms.fasta cmm.fasta > cm_vs_cs.out
Expansionhunter¶
Introduction¶
Expansion Hunter: a tool for estimating repeat sizes.
Versions¶
4.0.2
Commands¶
ExpansionHunter
Module¶
You can load the modules by:
module load biocontainers
module load expansionhunter
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run expansionhunter on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=expansionhunter
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers expansionhunter
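The job script above stops after loading the module. A minimal sketch of an ExpansionHunter call that could follow `ml biocontainers expansionhunter` is shown below. The four file names are placeholders chosen for illustration; the variant catalog in particular has to be supplied by the user.

```shell
# Placeholder inputs (assumptions); replace with your own files.
reads="sample.bam"                # aligned reads, BAM or CRAM
reference="reference.fasta"       # reference genome used for the alignment
catalog="variant_catalog.json"    # repeat definitions to genotype
prefix="sample_repeats"           # prefix for the output files

# Run only when the tool and all inputs are present.
if command -v ExpansionHunter >/dev/null 2>&1 \
    && [ -f "$reads" ] && [ -f "$reference" ] && [ -f "$catalog" ]; then
    ExpansionHunter --reads "$reads" \
        --reference "$reference" \
        --variant-catalog "$catalog" \
        --output-prefix "$prefix"
    eh_status="ran"
else
    echo "ExpansionHunter or its inputs not available; skipping" >&2
    eh_status="skipped"
fi
echo "ExpansionHunter example: $eh_status"
```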
Fasta3¶
Introduction¶
Fasta3
is a suite of programs for searching nucleotide or protein databases with a query sequence.
Versions¶
36.3.8
Commands¶
fasta36
fastf36
fastm36
fasts36
fastx36
fasty36
ggsearch36
glsearch36
lalign36
ssearch36
tfastf36
tfastm36
tfasts36
tfastx36
tfasty36
Module¶
You can load the modules by:
module load biocontainers
module load fasta3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fasta3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasta3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasta3
fasta36 input.fasta genome.fasta
FastANI¶
Introduction¶
FastANI
is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI).
Versions¶
1.32
1.33
Commands¶
fastANI
Module¶
You can load the modules by:
module load biocontainers
module load fastani
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastANI on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastani
fastANI -q cmm.fasta -r cms.fasta -o cm_cs_out
fastANI -q cmm.fasta -r cms.fasta --visualize -o cm_cs_visualize_out
Fastp¶
Introduction¶
Fastp
is an ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging, etc).
Versions¶
0.20.1
0.23.2
Commands¶
fastp
Module¶
You can load the modules by:
module load biocontainers
module load fastp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastp
fastp -i input_1.fastq -I input_2.fastq -o out.R1.fq.gz -O out.R2.fq.gz
FastQC¶
Introduction¶
FastQC
aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.
Versions¶
0.11.9
Commands¶
fastqc
Module¶
You can load the modules by:
module load biocontainers
module load fastqc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=fastqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastqc
fastqc -o fastqc_out -t 4 FASTQ1 FASTQ2
Fastq_pair¶
Introduction¶
Fastq_pair
is used to match up paired end fastq files quickly and efficiently.
Versions¶
1.0
Commands¶
fastq_pair
Module¶
You can load the modules by:
module load biocontainers
module load fastq_pair
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastq_pair on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq_pair
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastq_pair
fastq_pair seq_1.fastq seq_2.fastq
Fastq-scan¶
Introduction¶
Fastq-scan reads a FASTQ from STDIN and outputs summary statistics (read lengths, per-read qualities, per-base qualities) in JSON format.
Versions¶
1.0.0
Commands¶
fastq-scan
Module¶
You can load the modules by:
module load biocontainers
module load fastq-scan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fastq-scan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastq-scan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastq-scan
cat example-q33.fq | fastq-scan -g 150000
Fastspar¶
Introduction¶
Fastspar
is a tool for rapid and scalable correlation estimation for compositional data.
Versions¶
1.0.0
Commands¶
fastspar
fastspar_bootstrap
fastspar_pvalues
fastspar_reduce
Module¶
You can load the modules by:
module load biocontainers
module load fastspar
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fastspar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastspar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastspar
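The job script above ends after the module load. As a sketch under stated assumptions, a basic correlation run could follow `ml biocontainers fastspar`. The input name `otu_table.tsv` is a placeholder for a tab-separated OTU count table; the output names are likewise illustrative.

```shell
# Placeholder file names (assumptions); replace with your own data.
otu_table="otu_table.tsv"
correlation="median_correlation.tsv"
covariance="median_covariance.tsv"

# Run only when fastspar is on PATH and the OTU table exists.
if command -v fastspar >/dev/null 2>&1 && [ -f "$otu_table" ]; then
    fastspar --otu_table "$otu_table" \
        --correlation "$correlation" \
        --covariance "$covariance"
    fastspar_status="ran"
else
    echo "fastspar or $otu_table not available; skipping" >&2
    fastspar_status="skipped"
fi
echo "fastspar example: $fastspar_status"
```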
fastStructure¶
Introduction¶
fastStructure
is an algorithm for inferring population structure from large SNP genotype data. It is based on a variational Bayesian framework for posterior inference and is written in Python 2.x.
Note: The programs “structure.py”, “chooseK.py”, and “distruct.py” are standalone executables and should be called directly by name (“structure.py”, etc.). DO NOT invoke them as “python structure.py” or as “python /usr/local/bin/structure.py”; this will not work!
Note: This container lacks X11 libraries, so GUI plots with ‘distruct.py’ do not work. Instead, tell the underlying Matplotlib to use a non-interactive plotting backend that writes to a file. The easiest and most flexible way is to set the MPLBACKEND environment variable: env MPLBACKEND="svg" distruct.py --output myplot.svg ...
Available backends in this container:
Backend  Filetypes  Description
agg      png        raster graphics, high quality PNG output
ps       ps eps     vector graphics, Postscript output
pdf      pdf        vector graphics, Portable Document Format
svg      svg        vector graphics, Scalable Vector Graphics
The default is MPLBACKEND="agg" (PNG output).
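The backend selection described in the note can be sketched as a small shell helper that picks `MPLBACKEND` from the output file extension, following the backend list given there. `backend_for` is a hypothetical helper name, and the fallback to `agg` for unknown extensions is an assumption based on the stated default.

```shell
# Hypothetical helper: map an output file name to a Matplotlib backend,
# using the backend/filetype pairs listed for this container.
backend_for() {
    case "${1##*.}" in
        png)    echo agg ;;
        ps|eps) echo ps ;;
        pdf)    echo pdf ;;
        svg)    echo svg ;;
        *)      echo agg ;;   # assumed fallback: the documented default
    esac
}

plot="myplot.svg"
MPLBACKEND=$(backend_for "$plot")
export MPLBACKEND
echo "MPLBACKEND=$MPLBACKEND for $plot"

# With the variable exported, distruct.py can then be called by name with
# --output "$plot" plus its remaining arguments.
```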
Versions¶
1.0-py27
Commands¶
structure.py
chooseK.py
distruct.py
Module¶
You can load the modules by:
module load biocontainers
module load faststructure
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fastStructure on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=faststructure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers faststructure
FastTree¶
Introduction¶
FastTree
infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
Detailed usage can be found here: http://www.microbesonline.org/fasttree/
Versions¶
2.1.10
2.1.11
Commands¶
fasttree
FastTree
FastTreeMP
Note
fasttree
and FastTree
are the same program, and they only support one CPU. If you want to use multiple CPUs, please use FastTreeMP
and also set the OMP_NUM_THREADS
to the number of cores you requested.
Module¶
You can load the modules by:
module load biocontainers
module load fasttree
Example job using single CPU¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastTree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasttree
FastTree alignmentfile > treefile
Example job using multiple CPUs¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FastTreeMP on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=FastTreeMP
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fasttree
export OMP_NUM_THREADS=24
FastTreeMP alignmentfile > treefile
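The multi-CPU script above hardcodes the thread count twice (`-n 24` and `OMP_NUM_THREADS=24`). One way to keep the two in sync, sketched here as an alternative rather than the documented method, is to read the value Slurm exports inside a job as `SLURM_NTASKS`. The fallback to 1 outside a job is an assumption made so the snippet also runs on a login node.

```shell
# SLURM_NTASKS is set by Slurm when -n is given; fall back to 1 elsewhere.
threads="${SLURM_NTASKS:-1}"
export OMP_NUM_THREADS="$threads"
echo "Using $OMP_NUM_THREADS thread(s)"

# Run only when FastTreeMP is on PATH and the alignment exists.
if command -v FastTreeMP >/dev/null 2>&1 && [ -f alignmentfile ]; then
    FastTreeMP alignmentfile > treefile
fi
```

With this form, changing `#SBATCH -n` is enough; the thread count follows automatically.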
FASTX-Toolkit¶
Introduction¶
FASTX-Toolkit
is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
Versions¶
0.0.14
Commands¶
fasta_clipping_histogram.pl
fasta_formatter
fasta_nucleotide_changer
fastq_masker
fastq_quality_boxplot_graph.sh
fastq_quality_converter
fastq_quality_filter
fastq_quality_trimmer
fastq_to_fasta
fastx_artifacts_filter
fastx_barcode_splitter.pl
fastx_clipper
fastx_collapser
fastx_nucleotide_distribution_graph.sh
fastx_nucleotide_distribution_line_graph.sh
fastx_quality_stats
fastx_renamer
fastx_reverse_complement
fastx_trimmer
fastx_uncollapser
Module¶
You can load the modules by:
module load biocontainers
module load fastx_toolkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run FASTX-Toolkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fastx_toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fastx_toolkit
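The job script above loads the toolkit without calling any of its commands. A minimal sketch that could follow `ml biocontainers fastx_toolkit` is given below: it filters reads by quality and converts the result to FASTA. All file names are placeholders, and the thresholds (quality 20, 80 percent of bases) are illustrative values, not recommendations.

```shell
# Placeholder file names (assumptions); replace with your own data.
in_fq="reads.fastq"
filtered="reads.filtered.fastq"
out_fa="reads.filtered.fasta"

# Run only when the tools are on PATH and the input exists.
if command -v fastq_quality_filter >/dev/null 2>&1 && [ -f "$in_fq" ]; then
    # Keep reads in which at least 80% of bases have quality >= 20.
    # If the tool rejects Phred+33 quality scores, the commonly used
    # -Q33 option may be needed (verify with fastq_quality_filter -h).
    fastq_quality_filter -q 20 -p 80 -i "$in_fq" -o "$filtered"
    # Convert the filtered reads to FASTA.
    fastq_to_fasta -i "$filtered" -o "$out_fa"
    fastx_status="ran"
else
    echo "FASTX-Toolkit or $in_fq not available; skipping" >&2
    fastx_status="skipped"
fi
echo "FASTX-Toolkit example: $fastx_status"
```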
Filtlong¶
Introduction¶
Filtlong
is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.
Versions¶
0.2.1
Commands¶
filtlong
Module¶
You can load the modules by:
module load biocontainers
module load filtlong
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Filtlong on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=filtlong
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers filtlong
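The job script above does not include a filtering command. As a hedged sketch, the lines below could follow `ml biocontainers filtlong`; they keep the best 90 percent of reads that are at least 1000 bases long. The file names and both thresholds are placeholders chosen for illustration.

```shell
# Placeholder file names and thresholds (assumptions).
reads="input.fastq.gz"
out="output.fastq.gz"
min_length=1000
keep_percent=90

# Run only when filtlong is on PATH and the input exists.
if command -v filtlong >/dev/null 2>&1 && [ -f "$reads" ]; then
    # filtlong writes the filtered reads to stdout, so compress them on the fly.
    filtlong --min_length "$min_length" --keep_percent "$keep_percent" "$reads" \
        | gzip > "$out"
    filtlong_status="ran"
else
    echo "filtlong or $reads not available; skipping" >&2
    filtlong_status="skipped"
fi
echo "Filtlong example: $filtlong_status"
```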
Flye¶
Introduction¶
Flye
is a fast and accurate de novo assembler for single-molecule sequencing reads.
Versions¶
2.9.1
2.9.2
2.9
Commands¶
flye
Module¶
You can load the modules by:
module load biocontainers
module load flye
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Flye on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=flye
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers flye
flye --pacbio-raw E.coli_PacBio_40x.fasta --out-dir out_pacbio --threads 12
flye --nano-raw Loman_E.coli_MAP006-1_2D_50x.fasta --out-dir out_nano --threads 12
Fq¶
Introduction¶
Fq is a command line utility for manipulating Illumina-generated FastQ files.
Versions¶
0.10.0
Commands¶
fq
Module¶
You can load the modules by:
module load biocontainers
module load fq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fq
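The job script above loads `fq` without running it. A minimal sketch that could follow `ml biocontainers fq` is shown below; it validates a pair of FASTQ files with the `lint` subcommand. The file names are placeholders.

```shell
# Placeholder file names (assumptions); replace with your own paired reads.
r1="sample_R1.fastq.gz"
r2="sample_R2.fastq.gz"

# Run only when fq is on PATH and both read files exist.
if command -v fq >/dev/null 2>&1 && [ -f "$r1" ] && [ -f "$r2" ]; then
    # Validate the pair; a non-zero exit status signals a problem.
    if fq lint "$r1" "$r2"; then
        fq_status="valid"
    else
        fq_status="invalid"
    fi
else
    echo "fq or input files not available; skipping" >&2
    fq_status="skipped"
fi
echo "fq lint result: $fq_status"
```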
Fraggenescan¶
Introduction¶
Fraggenescan
is an application for finding (fragmented) genes in short reads. It can also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.
Versions¶
1.31
Commands¶
FragGeneScan
run_FragGeneScan.pl
Module¶
You can load the modules by:
module load biocontainers
module load fraggenescan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fraggenescan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fraggenescan
run_FragGeneScan.pl -genome=example/NC_000913-454.fna -out=example/NC_000913-454 -complete=0 -train=454_10
Fraggenescanrs¶
Introduction¶
FragGeneScanRs is a better and faster Rust implementation of the FragGeneScan gene prediction model for short and error-prone reads. Its command line interface is backward compatible and adds extra features for more flexible usage. Compared to the original C implementation, shotgun metagenomic reads are processed up to 22 times faster using a single thread, with better scaling for multithreaded execution.
Versions¶
1.1.0
Commands¶
FragGeneScanRs
Module¶
You can load the modules by:
module load biocontainers
module load fraggenescanrs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fraggenescanrs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fraggenescanrs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fraggenescanrs
FragGeneScanRs -t 454_10 < example/NC_000913-454.fna > example/NC_000913-454.faa
Freebayes¶
Introduction¶
Freebayes
is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.
Versions¶
1.3.5
1.3.6
Commands¶
freebayes
freebayes-parallel
Module¶
You can load the modules by:
module load biocontainers
module load freebayes
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Freebayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freebayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers freebayes
freebayes -f ref.fa aln.cram >var.vcf
Freyja¶
Introduction¶
Freyja is a tool to recover relative lineage abundances from mixed SARS-CoV-2 samples from a sequencing dataset (BAM aligned to the Hu-1 reference). The method uses lineage-determining mutational “barcodes” derived from the UShER global phylogenetic tree as a basis set to solve the constrained (unit sum, non-negative) de-mixing problem.
Versions¶
1.3.11
1.4.2
Commands¶
freyja
Module¶
You can load the modules by:
module load biocontainers
module load freyja
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run freyja on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=freyja
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers freyja
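The job script above ends after the module load. As a sketch under stated assumptions, the de-mixing step described in the introduction could follow `ml biocontainers freyja`. The variants and depths files are placeholders for the output of an earlier `freyja variants` run on your BAM file; none of these file names come from the module itself.

```shell
# Placeholder file names (assumptions); these would come from a prior
# "freyja variants" run on a BAM aligned to the Hu-1 reference.
variants="sample_variants.tsv"
depths="sample_depths.tsv"
out="sample_demix.tsv"

# Run only when freyja is on PATH and both inputs exist.
if command -v freyja >/dev/null 2>&1 && [ -f "$variants" ] && [ -f "$depths" ]; then
    # Estimate relative lineage abundances from variants and depths.
    freyja demix "$variants" "$depths" --output "$out"
    freyja_status="ran"
else
    echo "freyja or its inputs not available; skipping" >&2
    freyja_status="skipped"
fi
echo "Freyja example: $freyja_status"
```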
Fseq¶
Introduction¶
Fseq
is a feature density estimator for high-throughput sequence tags.
Versions¶
2.0.3
Commands¶
fseq2
Module¶
You can load the modules by:
module load biocontainers
module load fseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Fseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fseq
Ftp¶
Introduction¶
A File Transfer Protocol client (FTP client) is a software utility that establishes a connection between a host computer and a remote server, typically an FTP server.
Versions¶
0.17
Commands¶
ftp
Module¶
You can load the modules by:
module load biocontainers
module load ftp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ftp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ftp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ftp
Funannotate¶
Introduction¶
Funannotate
is a genome prediction, annotation, and comparison software package.
Versions¶
1.8.10
1.8.13
Commands¶
funannotate
Module¶
You can load the modules by:
module load biocontainers
module load funannotate
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Funannotate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=funannotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers funannotate
funannotate clean -i genome.fa -o genome_cleaned.fa
funannotate sort -i genome_cleaned.fa -o genome_cleaned_sorted.fa
funannotate predict -i genome_cleaned_sorted.fa -o predict_out --species "arabidopsis" --rna_bam RNAseq.bam --cpus 12
Fwdpy11¶
Introduction¶
Fwdpy11 is a Python package for forward-time population genetic simulation.
Versions¶
0.18.1
Commands¶
python3
python
Module¶
You can load the modules by:
module load biocontainers
module load fwdpy11
Interactive job¶
To run fwdpy11 interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers fwdpy11
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Mar 15 2022, 12:22:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import fwdpy11
>>> pop = fwdpy11.DiploidPopulation(100, 1000.0)
>>> print(f"N = {pop.N}, L = {pop.tables.genome_length}")
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run fwdpy11 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=fwdpy11
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers fwdpy11
python script.py
Gadma¶
Introduction¶
GADMA is a command-line tool. Its basic pipeline runs a series of launches of the genetic algorithm, each followed by local search optimization, and infers demographic history from the Allele Frequency Spectrum of multiple populations (up to three).
Versions¶
2.0.0rc21
Commands¶
gadma
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load gadma
Interactive job¶
To run GADMA interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers gadma
(base) UserID@bell-a008:~ $ python
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gadma import *
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gadma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gadma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gadma
gadma -p params_file
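Every batch example in this section uses the same Slurm header and module lines, differing only in the job name and the module loaded. As a sketch, a small shell function can print that shared header; `job_header` is a hypothetical helper, and `myallocation` and the one-hour, one-task limits are simply copied from the examples here.

```shell
# Hypothetical helper: print the shared job-script header for the
# biocontainer module named in $1.
job_header() {
    module_name=$1
    cat <<EOF
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=$module_name
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers $module_name
EOF
}

# Print the header for the gadma example above.
job_header gadma
```

Redirecting the output to a file and appending the tool's own command line gives a complete job script.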
Gambit¶
Introduction¶
GAMBIT (Genomic Approximation Method for Bacterial Identification and Tracking) is a tool for rapid taxonomic identification of microbial pathogens.
Versions¶
0.5.0
Commands¶
gambit
Module¶
You can load the modules by:
module load biocontainers
module load gambit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gambit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gambit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gambit
gambit -d database query -o results.csv *.fasta
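The `*.fasta` glob in the command above is expanded by the shell; with default shell settings, if no file matches, the literal string `*.fasta` is passed on unchanged. A minimal sketch of a pre-flight check follows; `count_fasta` is a hypothetical helper and the demo files are empty placeholders.

```shell
# Hypothetical helper: print how many files in directory $1 end in .fasta.
count_fasta() {
    n=0
    for f in "$1"/*.fasta; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Demo directory with two empty placeholder files.
demo_dir=$(mktemp -d)
: > "$demo_dir/sample1.fasta"
: > "$demo_dir/sample2.fasta"

if [ "$(count_fasta "$demo_dir")" -gt 0 ]; then
    echo "found $(count_fasta "$demo_dir") FASTA file(s)"
else
    echo "no FASTA files found; skipping the run" >&2
fi
rm -rf "$demo_dir"
```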
Gamma¶
Introduction¶
GAMMA (Gene Allele Mutation Microbial Assessment) is a command line tool that finds gene matches in microbial genomic data using protein coding (rather than nucleotide) identity, and then translates and annotates the match by providing the type (i.e., mutant, truncation, etc.) and a translated description (i.e., Y190S mutant, truncation at residue 110, etc.). Because microbial gene families often have multiple alleles and existing databases are rarely exhaustive, GAMMA is helpful in both identifying and explaining how unique alleles differ from their closest known matches.
Versions¶
1.4
2.2
Commands¶
GAMMA-S.py
GAMMA.py
Module¶
You can load the modules by:
module load biocontainers
module load gamma
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gamma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gamma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gamma
GAMMA.py DHQP1701672_complete_genome.fasta ResFinderDB_Combined_05-06-20.fsa GAMMA_Test
Gangstr¶
Introduction¶
GangSTR is a tool for genome-wide profiling of tandem repeats from short reads. A key advantage of GangSTR over existing genome-wide TR tools (e.g. lobSTR or hipSTR) is that it can handle repeats that are longer than the read length. GangSTR takes aligned reads (BAM) and a set of repeats in the reference genome as input and outputs a VCF file containing genotypes for each locus.
Versions¶
2.5.0
Commands¶
GangSTR
Module¶
You can load the modules by:
module load biocontainers
module load gangstr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gangstr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gangstr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gangstr
Gapfiller¶
Introduction¶
GapFiller is a seed-and-extend local assembler to fill the gap within paired reads. It can be used for both DNA and RNA and it has been tested on Illumina data. GapFiller can be used whenever a sequence is to be assembled starting from reads lying on its ends, provided a loose estimate of sequence length.
Versions¶
2.1.2
Commands¶
GapFiller
Module¶
You can load the modules by:
module load biocontainers
module load gapfiller
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gapfiller on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapfiller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gapfiller
Gapit¶
Introduction¶
GAPIT is a Genome Association and Prediction Integrated Tool.
Versions¶
3.3
Commands¶
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load gapit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gapit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gapit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gapit
GATK¶
Introduction¶
GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery.
Versions¶
3.8
Commands¶
gatk3
Module¶
You can load the modules by:
module load biocontainers
module load gatk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run GATK on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gatk
gatk3 -T HaplotypeCaller \
-nct 24 -R hg38.fa \
-I 19P0126636WES.sorted.bam \
-o 19P0126636WES.HC.vcf
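In the script above, `-nct 24` repeats the task count requested with `#SBATCH -n 24`, so the two can drift apart when only one is edited. Slurm exports the requested task count as `SLURM_NTASKS` inside the job; the sketch below reads it, with a fallback of 1 that is an assumption for running outside a job. `job_threads` is a hypothetical helper, and the command is printed rather than executed so the sketch works without GATK.

```shell
# Hypothetical helper: task count from Slurm, or 1 outside a job (assumed default).
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

threads=$(job_threads)

# Print the command instead of running it.
echo "gatk3 -T HaplotypeCaller -nct $threads -R hg38.fa -I 19P0126636WES.sorted.bam -o 19P0126636WES.HC.vcf"
```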
GATK4¶
Introduction¶
GATK (Genome Analysis Toolkit) is a collection of command-line tools for analyzing high-throughput sequencing data with a primary focus on variant discovery. Detailed usage can be found here: https://www.broadinstitute.org/gatk/.
Versions¶
4.2.0
4.2.5.0
4.2.6.1
4.3.0.0
Commands¶
gatk
Module¶
You can load the modules by:
module load biocontainers
module load gatk4/4.2.5.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gatk4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gatk4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gatk4/4.2.5.0
gatk --java-options "-Xmx12G -XX:ParallelGCThreads=24" HaplotypeCaller -R hg38.fa -I 19P0126636WES.sorted.bam -O 19P0126636WES.HC.vcf --sample-name 19P0126636
Gemma¶
Introduction¶
Gemma is a software toolkit for fast application of linear mixed models (LMMs) and related models to genome-wide association studies (GWAS) and other large-scale data sets.
Versions¶
0.98.3
Commands¶
gemma
Module¶
You can load the modules by:
module load biocontainers
module load gemma
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gemma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gemma
gemma -g ./example/mouse_hs1940.geno.txt.gz -p ./example/mouse_hs1940.pheno.txt \
-gk -o mouse_hs1940
gemma -g ./example/mouse_hs1940.geno.txt.gz \
-p ./example/mouse_hs1940.pheno.txt -n 1 -a ./example/mouse_hs1940.anno.txt \
-k ./output/mouse_hs1940.cXX.txt -lmm -o mouse_hs1940_CD8_lmm
Gemoma¶
Introduction¶
Gene Model Mapper (GeMoMa) is a homology-based gene prediction program. GeMoMa uses the annotation of protein-coding genes in a reference genome to infer the annotation of protein-coding genes in a target genome. To do so, GeMoMa utilizes amino acid sequence and intron position conservation. In addition, GeMoMa can incorporate RNA-seq evidence for splice site prediction.
Versions¶
1.7.1
Commands¶
GeMoMa
Module¶
You can load the modules by:
module load biocontainers
module load gemoma
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gemoma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gemoma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gemoma
GeneMark-ES/ET/EP¶
Introduction¶
GeneMark-ES/ET/EP contains the GeneMark-ES, GeneMark-ET and GeneMark-EP+ algorithms.
Versions¶
4.68
4.69
Commands¶
bed_to_gff.pl
bp_seq_select.pl
build_mod.pl
calc_introns_from_gtf.pl
change_path_in_perl_scripts.pl
compare_intervals_exact.pl
gc_distr.pl
get_below_gc.pl
get_sequence_from_GTF.pl
gmes_petap.pl
hc_exons2hints.pl
histogram.pl
make_nt_freq_mat.pl
parse_ET.pl
parse_by_introns.pl
parse_gibbs.pl
parse_set.pl
predict_genes.pl
reformat_gff.pl
rescale_gff.pl
rnaseq_introns_to_gff.pl
run_es.pl
run_hmm_pbs.pl
scan_for_bp.pl
star_to_gff.pl
verify_evidence_gmhmm.pl
Academic license¶
To use GeneMark, you need to download the license file yourself.
Go to the GeneMark website: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic and LINUX 64 next to it, fill out the form, then click “I agree”. On the next page, right-click and copy the link address for the 64 bit license. Paste the link address into the commands below:
cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
Module¶
You can load the modules by:
module load biocontainers
module load genemark/4.68
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run GeneMark on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=genemark
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genemark/4.68
gmes_petap.pl --ES --cores 24 --sequence scaffolds.fasta
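The Academic license step above writes the key to `.gm_key` in the home directory, and a job submitted without it is likely to stop early. As a sketch, the check below tests for a non-empty key file before the run; `has_gm_key` is a hypothetical helper and the messages are illustrative.

```shell
# Hypothetical helper: succeed if a non-empty .gm_key exists in directory $1.
has_gm_key() {
    [ -s "$1/.gm_key" ]
}

if has_gm_key "$HOME"; then
    echo "GeneMark key found in $HOME"
else
    echo "GeneMark key missing: follow the Academic license steps first" >&2
fi
```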
Genemarks-2¶
Introduction¶
GeneMarkS-2 combines GeneMark.hmm (prokaryotic) and GeneMark (prokaryotic) with a self-training procedure that determines parameters of the models of both GeneMark.hmm and GeneMark.
Users need to download their own license key from the GeneMark website and copy the key file “gm_key” into their home directory with: cp gm_key ~/.gm_key
Home page: http://opal.biology.gatech.edu/GeneMark/
Versions¶
1.14_1.25
Commands¶
gms2.pl
biogem
compp
gmhmmp2
Module¶
You can load the modules by:
module load biocontainers
module load genemarks-2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run genemarks-2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genemarks-2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genemarks-2
Genmap¶
Introduction¶
GenMap: Ultra-fast Computation of Genome Mappability.
Versions¶
1.3.0
Commands¶
genmap
Module¶
You can load the modules by:
module load biocontainers
module load genmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run genmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genmap
export TMPDIR=$PWD/tmp
mkdir -p "$TMPDIR"
genmap index -F ~/.local/share/genomes/hg38/hg38.fa -I hg38_index
genmap map -K 64 -E 2 -I hg38_index -O map_output_hg38 -t -w -bg
Genomedata¶
Introduction¶
Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint.
Versions¶
1.5.0
Commands¶
python
python3
genomeCoverageBed
genomedata-close-data
genomedata-erase-data
genomedata-hardmask
genomedata-histogram
genomedata-info
genomedata-load
genomedata-load-assembly
genomedata-load-data
genomedata-load-seq
genomedata-open-data
genomedata-query
genomedata-report
Module¶
You can load the modules by:
module load biocontainers
module load genomedata
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run genomedata on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomedata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomedata
Genomepy¶
Introduction¶
Genomepy is designed to provide a simple and straightforward way to download and use genomic data.
Versions¶
0.12.0
0.14.0
Commands¶
genomepy
Module¶
You can load the modules by:
module load biocontainers
module load genomepy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Genomepy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomepy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomepy
Genomescope2¶
Introduction¶
Genomescope2: Reference-free profiling of polyploid genomes.
Versions¶
2.0
Commands¶
genomescope2
Module¶
You can load the modules by:
module load biocontainers
module load genomescope2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Genomescope2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomescope2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomescope2
wget https://raw.githubusercontent.com/schatzlab/genomescope/master/analysis/real_data/ara_F1_21.hist
genomescope2 -i ara_F1_21.hist -o output -k 21
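The input to `genomescope2` is a k-mer histogram, which is plain text with two whitespace-separated integer columns (k-mer coverage and the number of k-mers at that coverage). Because `wget` can save an error page instead of the data, a quick format check before the run may save a failed job. The sketch below assumes that two-column layout; `valid_hist` and the sample file are hypothetical.

```shell
# Hypothetical helper: succeed if the file is non-empty and every line
# holds exactly two non-negative integers.
valid_hist() {
    awk 'NF != 2 || $1 !~ /^[0-9]+$/ || $2 !~ /^[0-9]+$/ { bad = 1 }
         END { exit (bad || NR == 0) }' "$1"
}

# Made-up three-line histogram for demonstration.
demo=$(mktemp)
printf '1 1000\n2 500\n3 250\n' > "$demo"

if valid_hist "$demo"; then
    echo "histogram looks well-formed"
else
    echo "histogram is malformed" >&2
fi
rm -f "$demo"
```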
Genomicconsensus¶
Introduction¶
Genomicconsensus is the current PacBio consensus and variant calling suite.
Versions¶
2.3.3
Commands¶
quiver
arrow
variantCaller
Module¶
You can load the modules by:
module load biocontainers
module load genomicconsensus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Genomicconsensus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genomicconsensus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genomicconsensus
quiver -j12 out.aligned_subreads.bam \
-r All4mer.V2.01_Insert-changed.fa \
-o consensus.fasta -o consensus.fastq
Genrich¶
Introduction¶
Genrich is a peak-caller for genomic enrichment assays (e.g. ChIP-seq, ATAC-seq). It analyzes alignment files generated following the assay and produces a file detailing peaks of significant enrichment.
Versions¶
0.6.1
Commands¶
Genrich
Module¶
You can load the modules by:
module load biocontainers
module load genrich
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Genrich on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=genrich
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers genrich
Genrich -t sample.bam -o sample.narrowPeak -v
Getorganelle¶
Introduction¶
GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes.
Versions¶
1.7.7.0
Commands¶
get_organelle_config.py
get_organelle_from_assembly.py
get_organelle_from_reads.py
slim_graph.py
summary_get_organelle_output.py
Module¶
You can load the modules by:
module load biocontainers
module load getorganelle
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run getorganelle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=getorganelle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers getorganelle
Gfaffix¶
Introduction¶
GFAffix identifies walk-preserving shared affixes in variation graphs and collapses them into a non-redundant graph structure.
Versions¶
0.1.4
Commands¶
gfaffix
Module¶
You can load the modules by:
module load biocontainers
module load gfaffix
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gfaffix on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfaffix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfaffix
Gfastats¶
Introduction¶
gfastats is a single fast and exhaustive tool for summary statistics and simultaneous fa (fasta, fastq, gfa [.gz]) genome assembly file manipulation. gfastats also allows seamless fasta<>fastq<>gfa[.gz] conversion. It has been tested on genomes larger than 100 Gbp.
Versions¶
1.2.3
1.3.6
Commands¶
gfastats
Module¶
You can load the modules by:
module load biocontainers
module load gfastats
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gfastats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfastats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfastats
gfastats input.fasta -o gfa
Gfatools¶
Introduction¶
gfatools is a set of tools for manipulating sequence graphs in the GFA or the rGFA format. It implements parsing, subgraph extraction, and conversion to FASTA/BED.
Versions¶
0.5
Commands¶
gfatools
Module¶
You can load the modules by:
module load biocontainers
module load gfatools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gfatools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gfatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gfatools
# Extract a subgraph
gfatools view -l MTh4502 -r 1 test/MT.gfa > sub.gfa
# Convert GFA to segment FASTA
gfatools gfa2fa test/MT.gfa > MT-seg.fa
# Convert rGFA to stable FASTA or BED
gfatools gfa2fa -s test/MT.gfa > MT.fa
gfatools gfa2bed -m test/MT.gfa > MT.bed
Gffcompare¶
Introduction¶
Gffcompare is used to compare, merge, annotate and estimate accuracy of one or more GFF files.
Versions¶
0.11.2
Commands¶
gffcompare
Module¶
You can load the modules by:
module load biocontainers
module load gffcompare
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gffcompare on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffcompare
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffcompare
gffcompare -r annotation.gff transcripts.gtf
Gffread¶
Introduction¶
Gffread is used to validate, filter, convert and perform various other operations on GFF files.
Versions¶
0.12.7
Commands¶
gffread
Module¶
You can load the modules by:
module load biocontainers
module load gffread
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gffread on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffread
gffread -E annotation.gff -o ann_simple.gff
gffread annotation.gff -T -o annotation.gtf
gffread -w transcripts.fa -g genome.fa annotation.gff
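The three `gffread` calls above all take a GFF annotation as input. GFF is tab-delimited with nine columns, the third of which names the feature type, so a short `awk` pass can summarize a file before converting it. This is a sketch; `gff_feature_counts` and the small sample annotation are made up for illustration.

```shell
# Hypothetical helper: count feature types (column 3) in a GFF file,
# skipping comment lines and lines without nine tab-separated columns.
gff_feature_counts() {
    awk -F '\t' '!/^#/ && NF == 9 { n[$3]++ }
                 END { for (t in n) print t, n[t] }' "$1" | sort
}

# Made-up annotation: one gene, one transcript, two exons.
demo=$(mktemp)
{
    printf '##gff-version 3\n'
    printf 'chr1\tdemo\tgene\t1\t900\t.\t+\t.\tID=gene1\n'
    printf 'chr1\tdemo\tmRNA\t1\t900\t.\t+\t.\tID=tx1;Parent=gene1\n'
    printf 'chr1\tdemo\texon\t1\t400\t.\t+\t.\tParent=tx1\n'
    printf 'chr1\tdemo\texon\t600\t900\t.\t+\t.\tParent=tx1\n'
} > "$demo"

gff_feature_counts "$demo"
rm -f "$demo"
```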
Gffutils¶
Introduction¶
gffutils is a Python package for working with and manipulating the GFF and GTF format files typically used for genomic annotations.
Versions¶
0.11.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load gffutils
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gffutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gffutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gffutils
Gimmemotifs¶
Introduction¶
GimmeMotifs is a suite of motif tools, including a motif prediction pipeline for ChIP-seq experiments.
Versions¶
0.17.1
Commands¶
gimme
Module¶
You can load the modules by:
module load biocontainers
module load gimmemotifs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gimmemotifs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gimmemotifs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gimmemotifs
gimme motifs ENCFF407IVS.bed ENCFF407IVS_motifs \
-g ~/.local/share/genomes/hg38/hg38.fa --denovo
Glimmer¶
Introduction¶
Glimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.
Versions¶
3.02
Commands¶
anomaly
build-fixed
build-icm
entropy-profile
entropy-score
extract
g3-from-scratch.csh
g3-from-training.csh
g3-iterated.csh
get-motif-counts.awk
glim-diff.awk
glimmer3
long-orfs
match-list-col.awk
multi-extract
not-acgt.awk
score-fixed
start-codon-distrib
test
uncovered
upstream-coords.awk
window-acgt
Module¶
You can load the modules by:
module load biocontainers
module load glimmer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Glimmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glimmer
long-orfs -n -t 1.15 scaffolds.fasta run1.longorfs
extract -t scaffolds.fasta run1.longorfs > run1.train
build-icm -r run1.icm < run1.train
glimmer3 scaffolds.fasta run1.icm cm
Glimmerhmm¶
Introduction¶
Glimmerhmm is a new gene finder based on a Generalized Hidden Markov Model (GHMM).
Versions¶
3.0.4
Commands¶
glimmerhmm
glimmhmm.pl
trainGlimmerHMM
Module¶
You can load the modules by:
module load biocontainers
module load glimmerhmm
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Glimmerhmm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glimmerhmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glimmerhmm
trainGlimmerHMM Asperg.fasta Asperg.cds -d Asperg
glimmerhmm Asperg.fasta -d Asperg -o Asperg_glimmerhmm_out
Glnexus¶
Introduction¶
Glnexus: Scalable gVCF merging and joint variant calling for population sequencing projects.
Versions¶
1.4.1
Commands¶
glnexus_cli
Module¶
You can load the modules by:
module load biocontainers
module load glnexus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run glnexus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=glnexus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers glnexus
glnexus_cli --config DeepVariant \
--bed ALDH2.bed \
dv_1000G_ALDH2_gvcf/*.g.vcf.gz \
> dv_1000G_ALDH2.bcf
Gmap¶
Introduction¶
Gmap is a genomic mapping and alignment program for mRNA and EST sequences.
Versions¶
2021.05.27
2021.08.25
Commands¶
atoiindex
cmetindex
cpuid
dbsnp_iit
ensembl_genes
fa_coords
get-genome
gff3_genes
gff3_introns
gff3_splicesites
gmap
gmap.avx2
gmap_build
gmap_cat
gmapindex
gmapl
gmapl.avx2
gmapl.nosimd
gmap.nosimd
gmap_process
gsnap
gsnap.avx2
gsnapl
gsnapl.avx2
gsnapl.nosimd
gsnap.nosimd
gtf_genes
gtf_introns
gtf_splicesites
gtf_transcript_splicesites
gvf_iit
iit_dump
iit_get
iit_store
indexdb_cat
md_coords
psl_genes
psl_introns
psl_splicesites
sam_sort
snpindex
trindex
vcf_iit
Module¶
You can load the modules by:
module load biocontainers
module load gmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=gmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gmap
gmap_build -d Cmm -D Cmm genome.fasta
gmap -d Cmm -t 4 -D ./Cmm cdna.fasta > gmap_out.txt
gmap_build -d GRCh38 -D GRCh38 Homo_sapiens.GRCh38.dna.primary_assembly.fa
gsnap -d GRCh38 -D ./GRCh38 --nthreads=4 SRR16956239_1.fastq SRR16956239_2.fastq > gsnap_out.txt
goatools¶
Introduction¶
Goatools is a Python library for gene ontology analyses. Detailed information about its usage can be found here: https://github.com/tanghaibao/goatools
Versions¶
1.1.12
1.2.3
Commands¶
python
python3
compare_gos.py
fetch_associations.py
find_enrichment.py
go_plot.py
map_to_slim.py
ncbi_gene_results_to_python.py
plot_go_term.py
prt_terms.py
runxlrd.py
vba_extract.py
wr_hier.py
wr_sections.py
Module¶
You can load the modules by:
module load biocontainers
module load goatools/1.1.12
Interactive job¶
To run goatools interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers goatools/1.1.12
(base) UserID@bell-a008:~ $ python
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from goatools.base import download_go_basic_obo
>>> obo_fname = download_go_basic_obo()
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=goatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers goatools/1.1.12
python script.py
find_enrichment.py --pval=0.05 --indent data/study data/population data/association
go_plot.py --go_file=tests/data/go_plot/go_heartjogging6.txt -r -o heartjogging6_r1.png
Graphlan¶
Introduction¶
Graphlan is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees.
Versions¶
1.1.3
Commands¶
graphlan.py
graphlan_annotate.py
Module¶
You can load the modules by:
module load biocontainers
module load graphlan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Graphlan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers graphlan
graphlan_annotate.py hmptree.xml hmptree.annot.xml --annot annot.txt
graphlan.py hmptree.annot.xml hmptree.png --dpi 150 --size 14
Graphmap¶
Introduction¶
Graphmap is a novel mapper targeted at aligning long, error-prone third-generation sequencing data.
Versions¶
0.6.3
Commands¶
graphmap2
Module¶
You can load the modules by:
module load biocontainers
module load graphmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Graphmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=graphmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers graphmap
Gridss¶
Introduction¶
Gridss is a modular software suite containing tools useful for the detection of genomic rearrangements.
Versions¶
2.13.2
Commands¶
R
Rscript
gridss
gridss_annotate_vcf_kraken2
gridss_annotate_vcf_repeatmasker
gridss_extract_overlapping_fragments
gridss_somatic_filter
gridsstools
virusbreakend
virusbreakend-build
Module¶
You can load the modules by:
module load biocontainers
module load gridss
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gridss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gridss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gridss
Gseapy¶
Introduction¶
Gseapy
is a Python wrapper for GSEA and Enrichr.
Versions¶
0.10.8
Commands¶
gseapy
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load gseapy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gseapy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gseapy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gseapy
gseapy ssgsea -d ./data/testSet_rand1200.gct \
-g data/temp.gmt \
-o test/ssgsea_report2 \
-p 4 --no-plot --no-scale
gseapy replot -i data -o test/replot_test
GTDB-Tk¶
Introduction¶
GTDB-Tk
is a software toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes based on the Genome Taxonomy Database (GTDB). It is designed to work with recent advances that allow hundreds or thousands of metagenome-assembled genomes (MAGs) to be obtained directly from environmental samples. It can also be applied to isolate and single-cell genomes.
GTDB-Tk reference data (R202) has been downloaded for users.
Versions¶
1.7.0
2.1.0
Commands¶
gtdbtk
Module¶
You can load the modules by:
module load biocontainers
module load gtdbtk/1.7.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run GTDB-Tk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=gtdbtk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gtdbtk/1.7.0
gtdbtk identify --genome_dir genomes --out_dir identify --extension gz --cpus 8
gtdbtk align --identify_dir identify --out_dir align --cpus 8
gtdbtk classify --genome_dir genomes --align_dir align --out_dir classify --extension gz --cpus 8
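The script above requests 24 tasks with `#SBATCH -n 24` but passes `--cpus 8` to each step. The lower value may be deliberate, for example to limit memory use. If the intent is to use every requested task, one option is to derive the count from the `SLURM_NTASKS` variable that Slurm sets inside a job. This is a sketch under that assumption, not a site recommendation; `job_cpus` and `run_gtdbtk_steps` are hypothetical helper names.

```shell
# Report the CPU count to use: the Slurm task count inside a job, else 1.
job_cpus() {
    echo "${SLURM_NTASKS:-1}"
}

cpus=$(job_cpus)

# Same three steps as in the job script, stopping at the first failure.
run_gtdbtk_steps() {
    gtdbtk identify --genome_dir genomes --out_dir identify --extension gz --cpus "$cpus" &&
        gtdbtk align --identify_dir identify --out_dir align --cpus "$cpus" &&
        gtdbtk classify --genome_dir genomes --align_dir align --out_dir classify --extension gz --cpus "$cpus"
}

if command -v gtdbtk >/dev/null 2>&1 && [ -d genomes ]; then
    run_gtdbtk_steps
else
    echo "Skipping: gtdbtk or the genomes directory not found." >&2
fi
```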
Gubbins¶
Introduction¶
Gubbins
is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions.
Versions¶
3.2.0
3.3
Commands¶
extract_gubbins_clade.py
generate_ska_alignment.py
gubbins_alignment_checker.py
mask_gubbins_aln.py
run_gubbins.py
sumlabels.py
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load gubbins
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Gubbins on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=gubbins
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers gubbins
run_gubbins.py --prefix ST239 ST239.aln
Guppy¶
Introduction¶
Guppy
is a data processing toolkit that contains the Oxford Nanopore Technologies’ basecalling algorithms, and several bioinformatic post-processing features.
Versions¶
6.0.1
6.5.7
Commands¶
guppy_aligner
guppy_barcoder
guppy_basecall_server
guppy_basecaller
guppy_basecaller_duplex
guppy_basecaller_supervisor
guppy_basecall_client
Module¶
You can load the modules by:
module load biocontainers
module load guppy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Guppy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=guppy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers guppy
guppy_basecaller --compress_fastq -i data/fast5_tiny/ \
-s basecall_tiny/ --cpu_threads_per_caller 12 \
--num_callers 1 -c dna_r9.4.1_450bps_hac.cfg
Hail¶
Introduction¶
Hail is an open-source, general-purpose, Python-based data analysis tool with additional data types and methods for working with genomic data.
Versions¶
0.2.94
0.2.98
Commands¶
python3
Module¶
You can load the modules by:
module load biocontainers
module load hail
Interactive job¶
To run Hail interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers hail
(base) UserID@bell-a008:~ $ python3
Python 3.7.13 (default, Apr 24 2022, 01:05:22)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import hail as hl
>>> print(hl.citation())
Hail Team. Hail 0.2.94-f0b38d6c436f. https://github.com/hail-is/hail/commit/f0b38d6c436f.
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hail on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hail
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hail
python3 script.py
Hap.py¶
Introduction¶
Hap.py is a tool to compare diploid genotypes at haplotype level.
Versions¶
0.3.9
Commands¶
bamstats.py
cnx.py
ftx.py
guess-ploidy.py
hap.py
ovc.py
plot-roh.py
pre.py
qfy.py
som.py
varfilter.py
Module¶
You can load the modules by:
module load biocontainers
module load hap.py
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hap.py on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hap.py
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hap.py
hap.py \
example/happy/PG_NA12878_chr21.vcf.gz \
example/happy/NA12878_chr21.vcf.gz \
-f example/happy/PG_Conf_chr21.bed.gz \
-r example/chr21.fa \
-o test
Helen¶
Introduction¶
HELEN is a multi-task RNN polisher which operates on images produced by MarginPolish.
Versions¶
1.0
Commands¶
helen
Module¶
You can load the modules by:
module load biocontainers
module load helen
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run helen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=helen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers helen
helen polish \
--image_dir mp_output \
--model_path "helen_modles/HELEN_r941_guppy344_microbial.pkl" \
--threads 32 \
--output_dir "helen_output/" \
--output_prefix Staph_Aur_draft_helen
Hicexplorer¶
Introduction¶
Hicexplorer
is a set of tools to process, normalize and visualize Hi-C data.
Versions¶
3.7.2
Commands¶
chicAggregateStatistic
chicDifferentialTest
chicExportData
chicPlotViewpoint
chicQualityControl
chicSignificantInteractions
chicViewpoint
chicViewpointBackgroundModel
hicAdjustMatrix
hicAggregateContacts
hicAverageRegions
hicBuildMatrix
hicCompareMatrices
hicCompartmentalization
hicConvertFormat
hicCorrectMatrix
hicCorrelate
hicCreateThresholdFile
hicDetectLoops
hicDifferentialTAD
hicexplorer
hicFindEnrichedContacts
hicFindRestSite
hicFindTADs
hicHyperoptDetectLoops
hicHyperoptDetectLoopsHiCCUPS
hicInfo
hicInterIntraTAD
hicMergeDomains
hicMergeLoops
hicMergeMatrixBins
hicMergeTADbins
hicNormalize
hicPCA
hicPlotAverageRegions
hicPlotDistVsCounts
hicPlotMatrix
hicPlotSVL
hicPlotTADs
hicPlotViewpoint
hicQC
hicQuickQC
hicSumMatrices
hicTADClassifier
hicTrainTADClassifier
hicTransform
hicValidateLocations
Module¶
You can load the modules by:
module load biocontainers
module load hicexplorer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hicexplorer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hicexplorer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hicexplorer
Hic-pro¶
Introduction¶
HiC-Pro is an optimized and flexible pipeline for Hi-C data processing.
Versions¶
3.0.0
3.1.0
Commands¶
HiC-Pro
digest_genome.py
extract_snps.py
hicpro2fithic.py
hicpro2higlass.sh
hicpro2juicebox.sh
make_viewpoints.py
sparseToDense.py
split_reads.py
split_sparse.py
Module¶
You can load the modules by:
module load biocontainers
module load hic-pro
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hic-pro on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hic-pro
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hic-pro
Hifiasm¶
Introduction¶
Hifiasm
is a fast haplotype-resolved de novo assembler for PacBio HiFi reads.
Versions¶
0.16.0
0.18.5
Commands¶
hifiasm
Module¶
You can load the modules by:
module load biocontainers
module load hifiasm
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hifiasm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hifiasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hifiasm
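The job script above loads the module but does not run an assembly. The sketch below shows one possible continuation, based on the basic usage in the hifiasm README (`-o` for the output prefix, `-t` for threads). The file names are placeholders, `gfa_to_fasta` is a hypothetical helper, and the `.bp.p_ctg.gfa` suffix is assumed to be the primary-contig output of a default (non-trio, non-Hi-C) run.

```shell
reads="reads.fq.gz"           # placeholder HiFi reads file
prefix="sample.asm"           # placeholder output prefix
threads="${SLURM_NTASKS:-1}"  # follows "#SBATCH -n" inside a Slurm job, else 1

# Convert GFA segment (S) lines to FASTA records.
gfa_to_fasta() {
    awk '/^S/ {print ">" $2; print $3}' "$@"
}

if command -v hifiasm >/dev/null 2>&1 && [ -f "$reads" ]; then
    hifiasm -o "$prefix" -t "$threads" "$reads" &&
        gfa_to_fasta "$prefix.bp.p_ctg.gfa" > "$prefix.p_ctg.fa"
else
    echo "Skipping: hifiasm or $reads not found." >&2
fi
```

Assembly is usually multithreaded, so the `#SBATCH -n` and `-t` values would normally be raised well above 1 for real data.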
HISAT2¶
Introduction¶
HISAT2
is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome.
Versions¶
2.2.1
Commands¶
extract_exons.py
extract_splice_sites.py
hisat2
hisat2-align-l
hisat2-align-s
hisat2-build
hisat2-build-l
hisat2-build-s
hisat2-inspect
hisat2-inspect-l
hisat2-inspect-s
hisat2_extract_exons.py
hisat2_extract_snps_haplotypes_UCSC.py
hisat2_extract_snps_haplotypes_VCF.py
hisat2_extract_splice_sites.py
hisat2_read_statistics.py
hisat2_simulate_reads.py
Module¶
You can load the modules by:
module load biocontainers
module load hisat2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HISAT2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hisat2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hisat2
hisat2-build genome.fa genome
# for single-end FASTA reads DNA alignment
hisat2 -f -x genome -U reads.fa -S output.sam --no-spliced-alignment
# for paired-end FASTQ reads alignment
hisat2 -x genome -1 reads_1.fq -2 reads_2.fq -S output.sam
Hmmer¶
Introduction¶
Hmmer
is used for searching sequence databases for sequence homologs, and for making sequence alignments.
Versions¶
3.3.2
Commands¶
alimask
easel
esl-afetch
esl-alimanip
esl-alimap
esl-alimask
esl-alimerge
esl-alipid
esl-alirev
esl-alistat
esl-compalign
esl-compstruct
esl-construct
esl-histplot
esl-mask
esl-mixdchlet
esl-reformat
esl-selectn
esl-seqrange
esl-seqstat
esl-sfetch
esl-shuffle
esl-ssdraw
esl-translate
esl-weight
hmmalign
hmmbuild
hmmconvert
hmmemit
hmmfetch
hmmlogo
hmmpgmd
hmmpgmd_shard
hmmpress
hmmscan
hmmsearch
hmmsim
hmmstat
jackhmmer
makehmmerdb
nhmmer
nhmmscan
phmmer
Module¶
You can load the modules by:
module load biocontainers
module load hmmer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hmmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hmmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hmmer
hmmsearch Nramp.hmm protein.fa > out
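Redirecting standard output to `out`, as above, stores the human-readable report. When the hits need further processing, `hmmsearch` can also write a tabular file with `--tblout`. The sketch below reuses the same `Nramp.hmm` and `protein.fa` inputs; the `1e-5` cutoff, the output file names, and the `filter_tblout` helper are illustrative assumptions.

```shell
threads="${SLURM_NTASKS:-1}"  # follows "#SBATCH -n" inside a Slurm job, else 1

# Print target name and full-sequence E-value for hits at or below a cutoff.
# In --tblout files, comment lines start with "#", column 1 is the target
# name, and column 5 is the full-sequence E-value.
filter_tblout() {
    cutoff="$1"
    shift
    awk -v max="$cutoff" '!/^#/ && ($5 + 0) <= (max + 0) {print $1 "\t" $5}' "$@"
}

if command -v hmmsearch >/dev/null 2>&1 && [ -f Nramp.hmm ] && [ -f protein.fa ]; then
    hmmsearch --cpu "$threads" --tblout hits.tbl Nramp.hmm protein.fa > out &&
        filter_tblout 1e-5 hits.tbl > hits.filtered.tsv
else
    echo "Skipping: hmmsearch or input files not found." >&2
fi
```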
HOMER¶
Introduction¶
HOMER
(Hypergeometric Optimization of Motif EnRichment) is a suite of tools for motif discovery and next-generation sequencing analysis. Details about its usage can be found on the HOMER website. On this system the module is named hommer.
Versions¶
4.11
Commands¶
addDataHeader.pl
addData.pl
addGeneAnnotation.pl
addInternalData.pl
addOligos.pl
adjustPeakFile.pl
adjustRedunGroupFile.pl
analyzeChIP-Seq.pl
analyzeRepeats.pl
analyzeRNA.pl
annotateInteractions.pl
annotatePeaks.pl
annotateRelativePosition.pl
annotateTranscripts.pl
assignGeneWeights.pl
assignTSStoGene.pl
batchAnnotatePeaksHistogram.pl
batchFindMotifsGenome.pl
batchFindMotifs.pl
batchMakeHiCMatrix.pl
batchMakeMultiWigHub.pl
batchMakeTagDirectory.pl
batchParallel.pl
bed2DtoUCSCbed.pl
bed2pos.pl
bed2tag.pl
blat2gtf.pl
bridgeResult2Cytoscape.pl
changeNewLine.pl
checkPeakFile.pl
checkTagBias.pl
chopify.pl
chopUpBackground.pl
chopUpPeakFile.pl
cleanUpPeakFile.pl
cleanUpSequences.pl
cluster2bedgraph.pl
cluster2bed.pl
combineGO.pl
combineHubs.pl
compareMotifs.pl
condenseBedGraph.pl
cons2fasta.pl
conservationAverage.pl
conservationPerLocus.pl
convertCoordinates.pl
convertIDs.pl
convertOrganismID.pl
duplicateCol.pl
eland2tags.pl
fasta2tab.pl
fastq2fasta.pl
filterListBy.pl
filterTADsAndCPs.pl
filterTADsAndLoops.pl
findcsRNATSS.pl
findGO.pl
findGOtxt.pl
findHiCCompartments.pl
findHiCDomains.pl
findHiCInteractionsByChr.pl
findKnownMotifs.pl
findMotifsGenome.pl
findMotifs.pl
findRedundantBLAT.pl
findTADsAndLoops.pl
findTopMotifs.pl
flipPC1toMatch.pl
freq2group.pl
genericConvertIDs.pl
GenomeOntology.pl
getChrLengths.pl
getConservedRegions.pl
getDifferentialBedGraph.pl
getDifferentialPeaksReplicates.pl
getDiffExpression.pl
getDistalPeaks.pl
getFocalPeaks.pl
getGenesInCategory.pl
getGWASoverlap.pl
getHiCcorrDiff.pl
getHomerQCstats.pl
getLikelyAdapters.pl
getMappingStats.pl
getPartOfPromoter.pl
getPos.pl
getRandomReads.pl
getSiteConservation.pl
getTopPeaks.pl
gff2pos.pl
go2cytoscape.pl
groupSequences.pl
joinFiles.pl
loadGenome.pl
loadPromoters.pl
makeBigBedMotifTrack.pl
makeBigWig.pl
makeBinaryFile.pl
makeHiCWashUfile.pl
makeMetaGeneProfile.pl
makeMultiWigHub.pl
map-fastq.pl
merge2Dbed.pl
mergeData.pl
motif2Jaspar.pl
motif2Logo.pl
parseGTF.pl
pos2bed.pl
preparseGenome.pl
prepForR.pl
profile2seq.pl
qseq2fastq.pl
randomizeGroupFile.pl
randomizeMotifs.pl
randRemoveBackground.pl
removeAccVersion.pl
removeBadSeq.pl
removeOutOfBoundsReads.pl
removePoorSeq.pl
removeRedundantPeaks.pl
renamePeaks.pl
resizePosFile.pl
revoppMotif.pl
rotateHiCmatrix.pl
runHiCpca.pl
sam2spliceJunc.pl
scanMotifGenomeWide.pl
scrambleFasta.pl
selectRepeatBg.pl
seq2profile.pl
SIMA.pl
subtractBedGraphsDirectory.pl
subtractBedGraphs.pl
tab2fasta.pl
tag2bed.pl
tag2pos.pl
tagDir2bed.pl
tagDir2hicFile.pl
tagDir2HiCsummary.pl
zipHomerResults.pl
Database¶
Selected databases have been downloaded for users.
ORGANISMS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human, and fly
PROMOTERS: yeast, worm, mouse, arabidopsis, zebrafish, rat, human, and fly
GENOMES: hg19, hg38, mm10, ce11, dm6, rn6, danRer11, tair10, and sacCer3
Module¶
You can load the modules by:
module load biocontainers
module load hommer/4.11
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HOMER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=hommer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hommer/4.11
configureHomer.pl -list ## Check the installed database.
findMotifs.pl mouse_geneid.txt mouse motif_out_mouse
findMotifs.pl geneid.txt human motif_out
Homopolish¶
Introduction¶
Homopolish is a genome polisher originally developed for Nanopore and subsequently extended for PacBio CLR. It generates a high-quality genome (>Q50) for virus, bacteria, and fungus. Nanopore/PacBio systematic errors are corrected by retrieving homologs from closely related genomes and polishing with an SVM. When paired with Racon and Medaka, the genome quality can reach Q50-90 (>99.999%) on Nanopore R9.4/10.3 flowcells (Guppy >3.4). For PacBio CLR, Homopolish also improves the majority of Flye-assembled genomes to Q90.
Versions¶
0.4.1
Commands¶
homopolish
Module¶
You can load the modules by:
module load biocontainers
module load homopolish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run homopolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=homopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers homopolish
How_are_we_stranded_here¶
Introduction¶
How_are_we_stranded_here
is a Python package for testing the strandedness of RNA-Seq fastq files.
Versions¶
1.0.1
Commands¶
check_strandedness
Module¶
You can load the modules by:
module load biocontainers
module load how_are_we_stranded_here
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run How_are_we_stranded_here on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=how_are_we_stranded_here
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers how_are_we_stranded_here
check_strandedness --gtf Homo_sapiens.GRCh38.105.gtf \
--transcripts Homo_sapiens.GRCh38.cds.all.fa \
--reads_1 seq_1.fastq --reads_2 seq_2.fastq
HTSeq¶
Introduction¶
HTSeq
is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.
Versions¶
0.13.5
1.99.2
2.0.1
2.0.2
2.0.2-py310
Commands¶
htseq-count
htseq-count-barcodes
htseq-qa
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load htseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HTSeq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htseq
python -m HTSeq.scripts.count \
-f bam input.bam ref.gtf \
> test.out
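The `python -m HTSeq.scripts.count` call above is the module form of the `htseq-count` command listed under Commands. Its output ends with a few summary rows whose names begin with two underscores (for example `__no_feature` and `__ambiguous`), which usually have to be separated from the per-gene counts before downstream analysis. The sketch below shows one way to do that; the helper names and `gene_counts.tsv` are hypothetical, and the `-r pos` and `-s no` settings are illustrative choices that must match your data.

```shell
# Split htseq-count output into per-gene counts and the summary counters.
# Summary rows (for example __no_feature) start with two underscores.
gene_counts()    { grep -v '^__' "$@"; }
summary_counts() { grep '^__' "$@"; }

if command -v htseq-count >/dev/null 2>&1 && [ -f input.bam ] && [ -f ref.gtf ]; then
    # -r pos: BAM sorted by position; -s no: unstranded library.
    htseq-count -f bam -r pos -s no input.bam ref.gtf > test.out &&
        gene_counts test.out > gene_counts.tsv
else
    echo "Skipping: htseq-count or input files not found." >&2
fi
```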
Htslib¶
Introduction¶
Htslib
is a C library for high-throughput sequencing data formats.
Versions¶
1.14
1.15
1.16
1.17
Commands¶
bgzip
htsfile
tabix
Module¶
You can load the modules by:
module load biocontainers
module load htslib
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Htslib on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htslib
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htslib
tabix sorted.gff.gz chr1:10,000,000-20,000,000
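The `tabix` query above assumes that `sorted.gff.gz` is already position-sorted, bgzip-compressed, and indexed. The sketch below shows one way those steps could be done, following the pattern in the tabix manual page; `annotation.gff` is a placeholder input and `sort_gff` is a hypothetical helper.

```shell
# Emit header lines first, then records sorted by sequence name and start
# coordinate, which is the order tabix needs for GFF input.
sort_gff() {
    tab=$(printf '\t')
    grep '^#' "$1" || true   # a file without header lines is fine
    grep -v '^#' "$1" | sort -t "$tab" -k1,1 -k4,4n
}

in_gff="annotation.gff"   # placeholder input file

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1 && [ -f "$in_gff" ]; then
    sort_gff "$in_gff" | bgzip > sorted.gff.gz &&
        tabix -p gff sorted.gff.gz &&
        tabix sorted.gff.gz chr1:10,000,000-20,000,000
else
    echo "Skipping: bgzip, tabix, or $in_gff not found." >&2
fi
```

The `-p gff` preset tells `tabix` which columns hold the sequence name, start, and end, and the index is written next to the compressed file as `sorted.gff.gz.tbi`.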
Htstream¶
Introduction¶
Htstream
is a quality control and processing pipeline for High Throughput Sequencing data.
Versions¶
1.3.3
Commands¶
hts_AdapterTrimmer
hts_CutTrim
hts_LengthFilter
hts_NTrimmer
hts_Overlapper
hts_PolyATTrim
hts_Primers
hts_QWindowTrim
hts_SeqScreener
hts_Stats
hts_SuperDeduper
Module¶
You can load the modules by:
module load biocontainers
module load htstream
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Htstream on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=htstream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers htstream
HUMAnN 3¶
Introduction¶
HUMAnN 3.0
is the next iteration of HUMAnN, the HMP Unified Metabolic Analysis Network. HUMAnN is a method for efficiently and accurately profiling the abundance of microbial metabolic pathways and other molecular functions from metagenomic or metatranscriptomic sequencing data.
Versions¶
3.0.0
3.6
Commands¶
humann
humann3
humann3_databases
humann_barplot
humann_benchmark
humann_build_custom_database
humann_config
humann_databases
humann_genefamilies_genus_level
humann_infer_taxonomy
humann_join_tables
humann_reduce_table
humann_regroup_table
humann_rename_table
humann_renorm_table
humann_split_stratified_table
humann_split_table
humann_test
humann_unpack_pathways
Database¶
Full ChocoPhlAn, UniRef90, EC-filtered UniRef90, UniRef50, EC-filtered UniRef50, and utility_mapping databases have been downloaded for users.
Module¶
You can load the modules by:
module load biocontainers
module load humann/3.0.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run HUMAnN3 on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=humann
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers humann/3.0.0
# Check the database and config by:
humann_config --print
humann --threads 24 --input examples/demo.fastq --output demo_output --metaphlan-options "--bowtie2db /depot/itap/datasets/metaphlan"
Hyphy¶
Introduction¶
Hyphy
is an open-source software package for the analysis of genetic sequences using techniques in phylogenetics, molecular evolution, and machine learning.
Versions¶
2.5.36
Commands¶
hyphy
Module¶
You can load the modules by:
module load biocontainers
module load hyphy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Hyphy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hyphy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hyphy
Hypo¶
Introduction¶
HyPo, a Hybrid Polisher, utilises short as well as long reads within a single run to polish a long-read assembly of small and large genomes. It exploits unique genomic k-mers to selectively polish segments of contigs using partial order alignment of selected read segments. As demonstrated on human genome assemblies, HyPo generates a significantly more accurate polished assembly in about one-third of the time and with about half the memory of widely used contemporary polishers such as Racon.
Versions¶
1.0.3
Commands¶
hypo
Module¶
You can load the modules by:
module load biocontainers
module load hypo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run hypo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=hypo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers hypo
Idba¶
Introduction¶
Idba
is a practical iterative De Bruijn Graph De Novo Assembler for sequence assembly in bioinformatics.
Versions¶
1.1.3
Commands¶
fa2fq
filter_blat
filter_contigs
filterfa
fq2fa
idba
idba_hybrid
idba_tran
idba_tran_test
idba_ud
parallel_blat
parallel_rna_blat
print_graph
raw_n50
run-unittest.py
sample_reads
scaffold
scan.py
shuffle_reads
sim_reads
sim_reads_tran
sort_psl
sort_reads
split_fa
split_fq
split_scaffold
test
validate_blat
validate_blat_parallel
validate_component
validate_contigs_blat
validate_contigs_mummer
validate_reads_blat
validate_rna
Module¶
You can load the modules by:
module load biocontainers
module load idba
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Idba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=idba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers idba
fq2fa --paired --filter SRR1977249.abundtrim.subset.pe.fq SRR1977249.abundtrim.subset.pe.fa
idba_ud -r SRR1977249.abundtrim.subset.pe.fa -o output
IGV¶
Introduction¶
IGV
(Integrative Genomics Viewer) is a high-performance, easy-to-use, interactive tool for the visual exploration of genomic data.
Versions¶
2.11.9
2.12.3
Commands¶
igv_hidpi.sh
igv.sh
Module¶
You can load the modules by:
module load biocontainers
module load igv
Interactive job¶
Since IGV requires a GUI, it is recommended to run it within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module --force purge
(base) UserID@bell-a008:~ $ ml biocontainers igv
(base) UserID@bell-a008:~ $ igv.sh
Impute2¶
Introduction¶
Impute2
is a genotype imputation and haplotype phasing program.
Versions¶
2.3.2
Commands¶
impute2
Module¶
You can load the modules by:
module load biocontainers
module load impute2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Impute2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=impute2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers impute2
impute2 \
-m Example/example.chr22.map \
-h Example/example.chr22.1kG.haps \
-l Example/example.chr22.1kG.legend \
-g Example/example.chr22.study.gens \
-strand_g Example/example.chr22.study.strand \
-int 20.4e6 20.5e6 \
-Ne 20000 \
-o example.chr22.one.phased.impute2
Infernal¶
Introduction¶
Infernal (“INFERence of RNA ALignment”) is for searching DNA sequence databases for RNA structure and sequence similarities. It is an implementation of a special case of profile stochastic context-free grammars called covariance models (CMs). A CM is like a sequence profile, but it scores a combination of sequence consensus and RNA secondary structure consensus, so in many cases, it is more capable of identifying RNA homologs that conserve their secondary structure more than their primary sequence.
For more information, please check:
BioContainers: https://biocontainers.pro/tools/infernal
Home page: http://eddylab.org/infernal/
Versions¶
1.1.4
Commands¶
cmalign
cmbuild
cmcalibrate
cmconvert
cmemit
cmfetch
cmpress
cmscan
cmsearch
cmstat
Module¶
You can load the modules by:
module load biocontainers
module load infernal
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run infernal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=infernal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers infernal
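The job script above ends after loading the module. A typical Infernal workflow builds a covariance model from a structure-annotated Stockholm alignment, calibrates it, and then searches a sequence database. The sketch below chains those three commands from the Commands list; the input file names and the `cm_name_for` helper are placeholders, and calibration can take much longer than the one-hour limit requested above.

```shell
threads="${SLURM_NTASKS:-1}"  # follows "#SBATCH -n" inside a Slurm job, else 1
aln="family.sto"              # placeholder Stockholm alignment with structure annotation
db="genome.fa"                # placeholder sequence database

# Derive the covariance model file name from the alignment (family.sto -> family.cm).
cm_name_for() {
    base=$(basename "$1")
    echo "${base%.*}.cm"
}

cm=$(cm_name_for "$aln")

if command -v cmbuild >/dev/null 2>&1 && [ -f "$aln" ] && [ -f "$db" ]; then
    cmbuild "$cm" "$aln" &&
        cmcalibrate --cpu "$threads" "$cm" &&
        cmsearch --cpu "$threads" --tblout hits.tbl "$cm" "$db" > cmsearch.out
else
    echo "Skipping: Infernal tools or input files not found." >&2
fi
```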
Instrain¶
Introduction¶
Instrain
is a Python program for analysis of co-occurring genome populations from metagenomes that allows highly accurate genome comparisons, analysis of coverage, microdiversity, and linkage, and sensitive SNP detection with gene localization and synonymous/non-synonymous identification.
Versions¶
1.5.7
1.6.3
Commands¶
inStrain
Module¶
You can load the modules by:
module load biocontainers
module load instrain
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Instrain on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=instrain
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers instrain
Intarna¶
Introduction¶
Intarna
is a general and fast approach to the prediction of RNA-RNA interactions incorporating both the accessibility of interacting sites as well as the existence of a user-definable seed interaction.
Versions¶
3.3.1
Commands¶
IntaRNA
Module¶
You can load the modules by:
module load biocontainers
module load intarna
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Intarna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=intarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers intarna
IntaRNA -t CCCCCCCCGGGGGGGGGGGGGG -q AAAACCCCCCCUUUU
InterProScan¶
Introduction¶
InterPro
is a database that integrates predictive information about proteins’ function from a number of partner resources, giving an overview of the families that a protein belongs to and the domains and sites it contains.
Users who have novel nucleotide or protein sequences that they wish to functionally characterise can use the software package InterProScan
to run the scanning algorithms from the InterPro database in an integrated way. Sequences are submitted in FASTA format. Matches are then calculated against all of the required member databases’ signatures, and the results are output in a variety of formats.
Versions¶
5.54_87.0
5.61-93.0
Commands¶
interproscan.sh
Database¶
The latest version of the database has been downloaded and set up in /depot/itap/datasets/interproscan-5.54-87.0/data.
Module¶
You can load the modules by:
module load biocontainers
module load interproscan/5.54_87.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run InterProScan on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=interproscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers interproscan/5.54_87.0
interproscan.sh -cpu 24 -i test_proteins.fasta
interproscan.sh -cpu 24 -t n -i test_nt_seqs.fasta
IQ-TREE¶
Introduction¶
IQ-TREE
is efficient software for phylogenomic inference by maximum likelihood.
Versions¶
1.6.12
2.1.2
2.2.0_beta
2.2.2.2
Commands¶
iqtree
Module¶
You can load the modules by:
module load biocontainers
module load iqtree
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run IQ-TREE on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers iqtree
iqtree -s input.phy -m GTR+I+G > test.out
Iqtree2¶
Introduction¶
IQ-TREE is efficient software for phylogenomic inference by maximum likelihood.
Versions¶
2.2.2.6
Commands¶
iqtree2
Module¶
You can load the modules by:
module load biocontainers
module load iqtree2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run iqtree2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=iqtree2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers iqtree2
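The job script above stops after loading the module. A minimal sketch of a command that could follow is shown here, assuming a hypothetical alignment file `input.phy` and output prefix `iqtree2_run`. In this sketch `-m MFP` asks ModelFinder to pick the substitution model, `-B 1000` requests 1000 ultrafast bootstrap replicates, and `-T` sets the thread count.

```shell
#!/bin/bash
aln="input.phy"                      # hypothetical alignment file
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request

if command -v iqtree2 >/dev/null 2>&1 && [ -f "$aln" ]; then
    # Model selection, tree search and ultrafast bootstrap in one run.
    iqtree2 -s "$aln" -m MFP -B 1000 -T "$threads" --prefix iqtree2_run
else
    echo "iqtree2 or $aln not available; nothing run" >&2
fi
```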
Ismapper¶
Introduction¶
ISMapper searches for IS positions in sequence data using paired end Illumina short reads, an IS query/queries of interest and a reference genome. ISMapper reports the IS positions it has found in each isolate, relative to the provided reference genome.
Versions¶
2.0.2
Commands¶
ismap
Module¶
You can load the modules by:
module load biocontainers
module load ismapper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ismapper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ismapper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ismapper
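The job script above loads the module but does not call `ismap`. The sketch below shows one possible invocation with the three inputs named in the introduction: paired-end reads, an IS query and a reference genome. All file names are placeholders, and the options should be checked against `ismap --help` for the installed version.

```shell
#!/bin/bash
# Placeholder file names; replace with your own data.
reads=(isolateA_1.fastq.gz isolateA_2.fastq.gz)
cmd=(ismap --reads "${reads[@]}" --queries IS_query.fasta --reference reference_genome.gbk)

if command -v ismap >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```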
Isoquant¶
Introduction¶
IsoQuant is a tool for the genome-based analysis of long RNA reads, such as PacBio or Oxford Nanopore reads. IsoQuant reconstructs and quantifies transcript models with high precision and decent recall. If a reference annotation is given, IsoQuant also assigns reads to the annotated isoforms based on their intron and exon structure. IsoQuant further performs annotated gene, isoform, exon and intron quantification. If reads are grouped (e.g. according to cell type), counts are reported according to the provided grouping.
Versions¶
3.1.2
Commands¶
isoquant.py
Module¶
You can load the modules by:
module load biocontainers
module load isoquant
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run isoquant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers isoquant
isoquant.py --reference chr9.4M.fa.gz \
--genedb chr9.4M.gtf.gz \
--fastq chr9.4M.ont.sim.fq.gz \
--data_type nanopore -o test_ont
Isoseq3¶
Introduction¶
Isoseq3
- Scalable De Novo Isoform Discovery.
Versions¶
3.4.0
3.7.0
3.8.2
Commands¶
isoseq3
Module¶
You can load the modules by:
module load biocontainers
module load isoseq3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Isoseq3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=isoseq3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers isoseq3
isoseq3 --version
isoseq3 refine --require-polya \
alz.demult.5p--3p.bam \
primers.fasta alz.flnc.bam
isoseq3 cluster alz.flnc.bam \
alz.polished.bam --verbose --use-qvs
Ivar¶
Introduction¶
Ivar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
Versions¶
1.3.1
1.4.2
Commands¶
ivar
Module¶
You can load the modules by:
module load biocontainers
module load ivar
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ivar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ivar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ivar
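The job script above ends after the module load. As a hedged illustration, a primer-trimming step for amplicon data could follow. The BAM file, primer BED file and output prefix are hypothetical names; `-i`, `-b` and `-p` are the input, primer and prefix options of `ivar trim`.

```shell
#!/bin/bash
# Placeholder inputs: an aligned, sorted BAM and the primer scheme in BED format.
cmd=(ivar trim -i aligned.sorted.bam -b primers.bed -p trimmed)

if command -v ivar >/dev/null 2>&1; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```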
Jcvi¶
Introduction¶
Jcvi is a collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.
Versions¶
1.2.7
1.3.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load jcvi
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run jcvi on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=jcvi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers jcvi
python -m jcvi.formats.fasta format Vvinifera_145_Genoscope.12X.cds.fa.gz grape.cds
python -m jcvi.formats.fasta format Ppersica_298_v2.1.cds.fa.gz peach.cds
python -m jcvi.formats.gff bed --type=mRNA --key=Name --primary_only Vvinifera_145_Genoscope.12X.gene.gff3.gz -o grape.bed
python -m jcvi.compara.catalog ortholog grape peach --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
rm grape.peach.last.filtered
python -m jcvi.compara.catalog ortholog grape peach --cscore=.99 --no_strip_names
python -m jcvi.graphics.dotplot grape.peach.anchors
python -m jcvi.compara.synteny depth --histogram grape.peach.anchors
python -m jcvi.graphics.grabseeds seeds test-data/test.JPG
Kaiju¶
Introduction¶
Kaiju
is a tool for fast taxonomic classification of metagenomic sequencing reads using a protein reference database.
Versions¶
1.8.2
Commands¶
kaiju
kaiju-addTaxonNames
kaiju-convertMAR.py
kaiju-convertNR
kaiju-excluded-accessions.txt
kaiju-gbk2faa.pl
kaiju-makedb
kaiju-mergeOutputs
kaiju-mkbwt
kaiju-mkfmi
kaiju-multi
kaiju-taxonlistEuk.tsv
kaiju2krona
kaiju2table
kaijup
kaijux
Module¶
You can load the modules by:
module load biocontainers
module load kaiju
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Kaiju on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kaiju
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kaiju
kaiju -t kaijudb/nodes.dmp \
-f kaijudb/refseq/kaiju_db_refseq.fmi \
-i input_1.fastq -j input_2.fastq \
-z 24
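Kaiju writes one classification line per read, so a summary step is often useful afterwards. The sketch below assumes the classification was saved to a file named `kaiju.out` (for example by adding `-o kaiju.out` to the `kaiju` call) and that `names.dmp` sits next to `nodes.dmp` in `kaijudb/`. Both are assumptions, not part of the example above.

```shell
#!/bin/bash
# Summarise classified reads at genus level with kaiju2table.
cmd=(kaiju2table -t kaijudb/nodes.dmp -n kaijudb/names.dmp
     -r genus -o kaiju_summary.tsv kaiju.out)

if command -v kaiju2table >/dev/null 2>&1 && [ -f kaiju.out ]; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```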
Kakscalculator2¶
Introduction¶
kakscalculator2 is a toolkit for calculating Ka and Ks that incorporates gamma-series methods and sliding window strategies.
Versions¶
2.0.1
Commands¶
KaKs_Calculator
Module¶
You can load the modules by:
module load biocontainers
module load kakscalculator2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kakscalculator2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kakscalculator2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kakscalculator2
KaKs_Calculator -i example.axt -o example.axt.kaks -m YN
Kallisto¶
Introduction¶
Kallisto
is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment.
Detailed usage can be found here: https://github.com/pachterlab/kallisto
Versions¶
0.46.2
0.48.0
Commands¶
kallisto
Module¶
You can load the modules by:
module load biocontainers
module load kallisto/0.48.0
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kallisto on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kallisto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kallisto/0.48.0
kallisto index -i transcripts.idx Homo_sapiens.GRCh38.cds.all.fa.gz
kallisto quant -t 24 -i transcripts.idx -o output -b 100 SRR11614709_1.fastq SRR11614709_2.fastq
Kentutils¶
Introduction¶
Kentutils: UCSC command line bioinformatic utilities.
Versions¶
302.1.0
Commands¶
addCols
ameme
autoDtd
autoSql
autoXml
ave
aveCols
axtChain
axtSort
axtSwap
axtToMaf
axtToPsl
bedClip
bedCommonRegions
bedCoverage
bedExtendRanges
bedGeneParts
bedGraphPack
bedGraphToBigWig
bedIntersect
bedItemOverlapCount
bedPileUps
bedRemoveOverlap
bedRestrictToPositions
bedSort
bedToBigBed
bedToExons
bedToGenePred
bedToPsl
bedWeedOverlapping
bigBedInfo
bigBedNamedItems
bigBedSummary
bigBedToBed
bigWigAverageOverBed
bigWigCat
bigWigCorrelate
bigWigInfo
bigWigMerge
bigWigSummary
bigWigToBedGraph
bigWigToWig
blastToPsl
blastXmlToPsl
calc
catDir
catUncomment
chainAntiRepeat
chainFilter
chainMergeSort
chainNet
chainPreNet
chainSort
chainSplit
chainStitchId
chainSwap
chainToAxt
chainToPsl
checkAgpAndFa
checkCoverageGaps
checkHgFindSpec
checkTableCoords
chopFaLines
chromGraphFromBin
chromGraphToBin
colTransform
countChars
crTreeIndexBed
crTreeSearchBed
dbSnoop
dbTrash
estOrient
faCmp
faCount
faFilter
faFilterN
faFrag
faNoise
faOneRecord
faPolyASizes
faRandomize
faRc
faSize
faSomeRecords
faSplit
faToFastq
faToTab
faToTwoBit
faTrans
fastqToFa
featureBits
fetchChromSizes
findMotif
gapToLift
genePredCheck
genePredHisto
genePredSingleCover
genePredToBed
genePredToFakePsl
genePredToGtf
genePredToMafFrames
gfClient
gfServer
gff3ToGenePred
gff3ToPsl
gmtime
gtfToGenePred
headRest
hgFindSpec
hgGcPercent
hgLoadBed
hgLoadOut
hgLoadWiggle
hgTrackDb
hgWiggle
hgsql
hgsqldump
htmlCheck
hubCheck
ixIxx
lavToAxt
lavToPsl
ldHgGene
liftOver
liftOverMerge
liftUp
linesToRa
linux.x86_64
localtime
mafAddIRows
mafAddQRows
mafCoverage
mafFetch
mafFilter
mafFrag
mafFrags
mafGene
mafMeFirst
mafOrder
mafRanges
mafSpeciesList
mafSpeciesSubset
mafSplit
mafSplitPos
mafToAxt
mafToPsl
mafsInRegion
makeTableList
maskOutFa
mktime
mrnaToGene
netChainSubset
netClass
netFilter
netSplit
netSyntenic
netToAxt
netToBed
newProg
nibFrag
nibSize
oligoMatch
overlapSelect
paraFetch
paraSync
positionalTblCheck
pslCDnaFilter
pslCat
pslCheck
pslDropOverlap
pslFilter
pslHisto
pslLiftSubrangeBlat
pslMap
pslMrnaCover
pslPairs
pslPartition
pslPretty
pslRecalcMatch
pslReps
pslSelect
pslSort
pslStats
pslSwap
pslToBed
pslToChain
pslToPslx
pslxToFa
qaToQac
qacAgpLift
qacToQa
qacToWig
raSqlQuery
raToLines
raToTab
randomLines
rmFaDups
rowsToCols
sizeof
spacedToTab
splitFile
splitFileByColumn
sqlToXml
stringify
subChar
subColumn
tailLines
tdbQuery
textHistogram
tickToDate
toLower
toUpper
trfBig
twoBitDup
twoBitInfo
twoBitMask
twoBitToFa
validateFiles
validateManifest
wigCorrelate
wigEncode
wigToBigWig
wordLine
xmlCat
xmlToSql
Module¶
You can load the modules by:
module load biocontainers
module load kentutils
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kentutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kentutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kentutils
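The job script above does not call any of the utilities. As a small sketch, three of the listed commands can be chained to summarise a genome, convert it to 2bit, and write a chromosome sizes file. `genome.fa` is a hypothetical input name.

```shell
#!/bin/bash
genome="genome.fa"                   # hypothetical FASTA file
twobit="${genome%.fa}.2bit"          # -> genome.2bit

if command -v faToTwoBit >/dev/null 2>&1 && [ -f "$genome" ]; then
    faSize "$genome"                     # print summary statistics
    faToTwoBit "$genome" "$twobit"       # convert FASTA to .2bit
    twoBitInfo "$twobit" chrom.sizes     # one "name<TAB>length" line per sequence
else
    echo "kentutils or $genome not available; nothing run" >&2
fi
```

The resulting `chrom.sizes` file is the kind of input that tools such as `bedGraphToBigWig` and `bedToBigBed` expect.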
Khmer¶
Introduction¶
Khmer
is a tool for k-mer counting, filtering, and graph traversal FTW!
Versions¶
3.0.0a3
Commands¶
abundance-dist.py
abundance-dist-single.py
annotate-partitions.py
count-median.py
cygdb
cython
cythonize
do-partition.py
extract-long-sequences.py
extract-paired-reads.py
extract-partitions.py
fastq-to-fasta.py
filter-abund.py
filter-abund-single.py
filter-stoptags.py
find-knots.py
interleave-reads.py
load-graph.py
load-into-counting.py
make-initial-stoptags.py
merge-partitions.py
normalize-by-median.py
partition-graph.py
readstats.py
sample-reads-randomly.py
screed
split-paired-reads.py
trim-low-abund.py
unique-kmers.py
Module¶
You can load the modules by:
module load biocontainers
module load khmer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Khmer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=khmer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers khmer
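The job script above loads khmer without running a script. One common use is digital normalization with `normalize-by-median.py`, sketched below. The read file names and the values for k-mer size (`-k`) and coverage cutoff (`-C`) are illustrative assumptions rather than recommended settings.

```shell
#!/bin/bash
# Keep reads until the median k-mer coverage reaches the cutoff.
cmd=(normalize-by-median.py -k 20 -C 20 -o reads.keep.fq reads.fq)

if command -v normalize-by-median.py >/dev/null 2>&1 && [ -f reads.fq ]; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```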
Kissde¶
Introduction¶
kissDE is an R package, similar to DESeq, that works on pairs of variants and tests whether a variant is enriched in one condition. It was developed to work easily with KisSplice output, but it can also work with a simple table of counts obtained by any other means. It requires at least two replicates per condition and at least two conditions.
Versions¶
1.15.3
Commands¶
R
Rscript
kissDE.R
Module¶
You can load the modules by:
module load biocontainers
module load kissde
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissde on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissde
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissde
Kissplice¶
Introduction¶
KisSplice is software for analysing RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that identifies SNPs, indels and alternative splicing events. It can deal with an arbitrary number of biological conditions, and will quantify each variant in each condition. It has been tested on Illumina datasets of up to 1G reads. Its memory consumption is around 5Gb for 100M reads.
Versions¶
2.6.2
Commands¶
kissplice
Module¶
You can load the modules by:
module load biocontainers
module load kissplice
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissplice on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissplice
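The job script above stops at the module load. A hedged sketch of a run on two conditions follows. The read files and output directory are hypothetical, `-k 41` is used here as an illustrative k-mer size, and the options should be confirmed with `kissplice --help`.

```shell
#!/bin/bash
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request

# One -r per read file; each file is treated as a separate condition.
cmd=(kissplice -r condition1.fastq -r condition2.fastq
     -k 41 -t "$threads" -o kissplice_results)

if command -v kissplice >/dev/null 2>&1 && [ -f condition1.fastq ]; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```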
Kissplice2refgenome¶
Introduction¶
KisSplice can also be used when a reference (annotated) genome is available, in order to annotate the variants found and help prioritize cases to validate experimentally. In this case, the results of KisSplice are mapped to the reference genome, using for instance STAR, and the mapping results are analysed using KisSplice2RefGenome.
Versions¶
2.0.8
Commands¶
kissplice2refgenome
Module¶
You can load the modules by:
module load biocontainers
module load kissplice2refgenome
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kissplice2refgenome on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kissplice2refgenome
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kissplice2refgenome
Kma¶
Introduction¶
KMA is a mapping method designed to map raw reads directly against redundant databases, in an ultra-fast manner using seed and extend.
Versions¶
1.4.3
Commands¶
kma
kma_index
kma_shm
kma_update
Module¶
You can load the modules by:
module load biocontainers
module load kma
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kma on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kma
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kma
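The job script above does not yet call `kma`. The usual workflow has two steps: index the template database, then map reads against it. The sketch below assumes a hypothetical template FASTA `templates.fsa` and paired-end reads `sample_1.fastq.gz` and `sample_2.fastq.gz`.

```shell
#!/bin/bash
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request
db="templates_db"                    # hypothetical database prefix

if command -v kma >/dev/null 2>&1 && [ -f templates.fsa ]; then
    # Step 1: build the index from the template sequences.
    kma index -i templates.fsa -o "$db"
    # Step 2: map paired-end reads against the indexed database.
    kma -ipe sample_1.fastq.gz sample_2.fastq.gz -o sample_out -t_db "$db" -t "$threads"
else
    echo "kma or templates.fsa not available; nothing run" >&2
fi
```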
Kmc¶
Introduction¶
Kmc
is a tool for efficient k-mer counting and filtering of reads based on k-mer content.
Versions¶
3.2.1
Commands¶
kmc
kmc_dump
kmc_tools
Module¶
You can load the modules by:
module load biocontainers
module load kmc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Kmc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmc
kmc -k27 seq.fastq 27mers .
Kmergenie¶
Introduction¶
KmerGenie estimates the best k-mer length for genome de novo assembly.
Versions¶
1.7051
Commands¶
kmergenie
Module¶
You can load the modules by:
module load biocontainers
module load kmergenie
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kmergenie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kmergenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmergenie
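The job script above ends after loading the module. A minimal sketch of a call that could follow, assuming a hypothetical read file `reads.fastq`; `-o` sets the prefix of the output files and `-t` the number of threads.

```shell
#!/bin/bash
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request
cmd=(kmergenie reads.fastq -o kmergenie_out -t "$threads")

if command -v kmergenie >/dev/null 2>&1 && [ -f reads.fastq ]; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```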
Jellyfish¶
Introduction¶
Jellyfish
is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence.
Versions¶
2.3.0
Commands¶
jellyfish
Module¶
You can load the modules by:
module load biocontainers
module load kmer-jellyfish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Jellyfish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=kmer-jellyfish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kmer-jellyfish
jellyfish count -m 16 -s 100M -t 12 \
-o mer_counts -c 7 input.fastq
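The count step writes a binary database (`mer_counts` in the example above) that is usually inspected with further subcommands. As a sketch, the lines below produce a k-mer frequency histogram and summary statistics; the histogram file name is an assumption.

```shell
#!/bin/bash
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request
histo_file="mer_counts.histo"        # hypothetical output name

if command -v jellyfish >/dev/null 2>&1 && [ -f mer_counts ]; then
    jellyfish histo -t "$threads" mer_counts > "$histo_file"   # count frequency table
    jellyfish stats mer_counts                                  # unique/distinct/total k-mers
else
    echo "jellyfish or mer_counts not available; nothing run" >&2
fi
```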
KneadData¶
Introduction¶
KneadData
is a tool designed to perform quality control on metagenomic and metatranscriptomic sequencing data, especially data from microbiome experiments. In these experiments, samples are typically taken from a host in hopes of learning something about the microbial community on the host.
Detailed usage can be found here: https://huttenhower.sph.harvard.edu/kneaddata/
Versions¶
0.10.0
Commands¶
kneaddata
kneaddata_bowtie2_discordant_pairs
kneaddata_build_database
kneaddata_database
kneaddata_read_count_table
kneaddata_test
kneaddata_trf_parallel
Module¶
You can load the modules by:
module load biocontainers
module load kneaddata
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kneaddata on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kneaddata
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kneaddata
kneaddata --input examples/demo.fastq --reference-db examples/demo_db --output kneaddata_demo_output --threads 24 --processes 24
Kover¶
Introduction¶
Kover is an out-of-core implementation of rule-based machine learning algorithms that has been tailored for genomic biomarker discovery.
Versions¶
2.0.6
Commands¶
kover
Module¶
You can load the modules by:
module load biocontainers
module load kover
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kover on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=kover
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kover
Kraken2¶
Introduction¶
Kraken2
is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer.
Detailed usage can be found here: https://ccb.jhu.edu/software/kraken2/
Versions¶
2.1.2_fixftp
2.1.2
2.1.3
Commands¶
kraken2
kraken2-build
kraken2-inspect
Module¶
You can load the modules by:
module load biocontainers
module load kraken2/2.1.2
Download database¶
Note
There is a known bug in rsync_from_ncbi.pl (https://github.com/DerrickWood/kraken2/issues/292). When users download and build databases with kraken2-build --download-library, there will be an error: rsync_from_ncbi.pl: unexpected FTP path(new server?). We modified rsync_from_ncbi.pl to fix the bug and created a new module ending with the suffix _fixftp. Please use this corrected module to download the library.
To download databases, please use the below command:
module load biocontainers
module load kraken2/2.1.2_fixftp
kraken2-build --download-library archaea --db archaea
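Downloading a library is only one of the steps needed for a usable custom database. As a sketch, a complete build typically also fetches the taxonomy and then runs the build step. The database name `archaea` follows the command above, and the thread count is taken from Slurm as an assumption about how the job was requested.

```shell
#!/bin/bash
db="archaea"                         # database directory, as in the command above
threads="${SLURM_NTASKS:-1}"         # follow the #SBATCH -n request

if command -v kraken2-build >/dev/null 2>&1; then
    kraken2-build --download-taxonomy --db "$db"          # NCBI taxonomy
    kraken2-build --download-library archaea --db "$db"   # reference sequences
    kraken2-build --build --threads "$threads" --db "$db" # build the hash table
else
    echo "kraken2-build not found; load kraken2/2.1.2_fixftp first" >&2
fi
```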
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run kraken2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=kraken2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers kraken2/2.1.2
kraken2 --threads 24 --report kraken2.report --db minikraken2_v2_8GB_201904_UPDATE --paired --classified-out cseqs#.fq SRR5043021_1.fastq SRR5043021_2.fastq
KrakenTools¶
Introduction¶
KrakenTools
provides individual scripts to analyze Kraken/Kraken2/Bracken/KrakenUniq output files.
Detailed usage can be found here: https://github.com/jenniferlu717/KrakenTools
Versions¶
1.2
Commands¶
alpha_diversity.py
beta_diversity.py
combine_kreports.py
combine_mpa.py
extract_kraken_reads.py
filter_bracken.out.py
fix_unmapped.py
kreport2krona.py
kreport2mpa.py
make_kreport.py
make_ktaxonomy.py
Module¶
You can load the modules by:
module load biocontainers
module load krakentools/1.2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run krakentools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=krakentools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers krakentools/1.2
extract_kraken_reads.py -k myfile.kraken -t 2 -s1 SRR5043021_1.fastq -s2 SRR5043021_2.fastq -o extracted1.fq -o2 extracted2.fq
Lambda¶
Introduction¶
Lambda
is a local aligner optimized for many query sequences and searches in protein space.
Versions¶
2.0.0
Commands¶
lambda2
Module¶
You can load the modules by:
module load biocontainers
module load lambda
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lambda on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lambda
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lambda
lambda2 mkindexp -d uniprot_sprot.fasta
lambda2 searchp \
-q proteins.fasta \
-i uniprot_sprot.fasta.lambda
Last¶
Introduction¶
Last
is used to find & align related regions of sequences.
Versions¶
1268
1356
1411
1418
Commands¶
last-dotplot
last-map-probs
last-merge-batches
last-pair-probs
last-postmask
last-split
last-split5
last-train
lastal
lastal5
lastdb
lastdb5
Module¶
You can load the modules by:
module load biocontainers
module load last
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Last on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=last
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers last
lastdb humdb humanMito.fa
lastal humdb fuguMito.fa > myalns.maf
Lastz¶
Introduction¶
LASTZ - pairwise DNA sequence aligner
Versions¶
1.04.15
Commands¶
lastz
lastz_32
lastz_D
Module¶
You can load the modules by:
module load biocontainers
module load lastz
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run lastz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lastz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lastz
lastz cmc_CFBP8216.fasta cmp_LPPA982.fasta \
--notransition --step=20 --nogapped \
--format=maf > cmc_vs_cmp.maf
Ldhat¶
Introduction¶
LDhat is a package written in the C and C++ languages for the analysis of recombination rates from population genetic data.
Versions¶
2.2a
Commands¶
convert
pairwise
interval
rhomap
fin
Module¶
You can load the modules by:
module load biocontainers
module load ldhat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldhat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldhat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldhat
Ldjump¶
Introduction¶
LDJump is an R package to estimate variable recombination rates from population genetic data.
Versions¶
0.3.1
Commands¶
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load ldjump
Note
A full path to the Phi file of PhiPack needs to be provided as follows: pathPhi = "/opt/PhiPack/Phi". To use LDhat to quickly calculate some of the summary statistics, please set pathLDhat = "/opt/LDhat/".
Interactive job¶
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers ldjump
(base) UserID@bell-a008:~ $ R
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(LDJump)
> LDJump(seqFullPath, alpha = 0.05, segLength = 1000, pathLDhat = "/opt/LDhat/", pathPhi = "/opt/PhiPack/Phi", format = "fasta", refName = NULL,
start = NULL, constant = F, status = T, cores = 1, accept = F, demography = F, out = "")
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldjump on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldjump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldjump
Rscript script.R
Ldsc¶
Introduction¶
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics.
Versions¶
1.0.1
Commands¶
ldsc.py
munge_sumstats.py
Module¶
You can load the modules by:
module load biocontainers
module load ldsc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ldsc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ldsc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ldsc
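The job script above does not call `ldsc.py`. A hedged sketch of a heritability estimate follows. The summary statistics file and the LD score directory `eur_w_ld_chr/` are placeholder names for files you would have prepared or downloaded yourself (for example with `munge_sumstats.py`).

```shell
#!/bin/bash
# Estimate SNP heritability from munged summary statistics.
cmd=(ldsc.py --h2 trait.sumstats.gz
     --ref-ld-chr eur_w_ld_chr/ --w-ld-chr eur_w_ld_chr/
     --out trait_h2)

if command -v ldsc.py >/dev/null 2>&1 && [ -f trait.sumstats.gz ]; then
    "${cmd[@]}"
else
    echo "Would run: ${cmd[*]}"
fi
```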
Liftoff¶
Introduction¶
Liftoff
is an accurate GFF3/GTF lift over pipeline.
Versions¶
1.6.3
Commands¶
liftoff
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load liftoff
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Liftoff on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftoff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers liftoff
liftoff -g reference.gff3 -o target.gff3 \
-chroms chr_pairs.txt target.fasta reference.fa
Liftofftools¶
Introduction¶
LiftoffTools is a toolkit to compare genes lifted between genome assemblies. Specifically it is designed to compare genes lifted over using Liftoff although it is also compatible with other lift-over tools such as UCSC liftOver as long as the feature IDs are the same. LiftoffTools provides 3 different modules. The first identifies variants in protein-coding genes and their effects on the gene. The second compares the gene synteny, and the third clusters genes into groups of paralogs to evaluate gene copy number gain and loss. The input for all modules is the reference genome assembly (FASTA), target genome assembly (FASTA), reference annotation (GFF/GTF), and target annotation (GFF/GTF).
Versions¶
0.4.4
Commands¶
liftofftools
Module¶
You can load the modules by:
module load biocontainers
module load liftofftools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run liftofftools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=liftofftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers liftofftools
Lima¶
Introduction¶
Lima
is the standard tool to identify barcode and primer sequences in PacBio single-molecule sequencing data.
Versions¶
2.2.0
Commands¶
lima
Module¶
You can load the modules by:
module load biocontainers
module load lima
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lima on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lima
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lima
lima --version
lima --isoseq --dump-clips \
--peek-guess -j 12 \
alz.ccs.bam primers.fasta \
alz.demult.bam
Links¶
Introduction¶
LINKS is a genomics application for scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It can be used to scaffold high-quality draft genome assemblies with any long sequences (e.g., ONT reads, PacBio reads, or other draft genomes). It is also used to scaffold contig pairs linked by ARCS/ARKS.
Versions¶
2.0.1
Commands¶
LINKS
Module¶
You can load the modules by:
module load biocontainers
module load links
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run links on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=links
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers links
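The job script above only loads the module. A minimal sketch of a scaffolding step that could follow is shown below. The names draft.fa, reads.fof, and links_out are placeholders; reads.fof is assumed to be a text file listing one long-read file path per line.

```shell
# Placeholder inputs (assumptions): a draft assembly and a file of long-read file names.
draft=draft.fa
reads_fof=reads.fof
base=links_out          # prefix for the output files

# Run only when LINKS and both inputs are available.
if command -v LINKS >/dev/null 2>&1 && [ -s "$draft" ] && [ -s "$reads_fof" ]; then
    LINKS -f "$draft" -s "$reads_fof" -b "$base"
else
    echo "LINKS or its inputs are not available; skipping" >&2
fi
```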
Lofreq¶
Introduction¶
Lofreq
is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data.
Versions¶
2.1.5
Commands¶
lofreq
Module¶
You can load the modules by:
module load biocontainers
module load lofreq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lofreq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lofreq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lofreq
lofreq call -f ref.fa -o vars.vcf out_sorted.bam
lofreq call-parallel --pp-threads 8 \
-f ref.fa -o vars_parallel.vcf out_sorted.bam
Longphase¶
Introduction¶
LongPhase is an ultra-fast program for simultaneously co-phasing SNPs and SVs by using Nanopore and PacBio long reads. It is capable of producing nearly chromosome-scale haplotype blocks by using Nanopore ultra-long reads without the need for additional trios, chromosome conformation, and strand-seq data. On an 8-core machine, LongPhase can finish phasing a human genome in 10-20 minutes.
Versions¶
1.4
Commands¶
longphase
Module¶
You can load the modules by:
module load biocontainers
module load longphase
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run longphase on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=longphase
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers longphase
longphase phase \
-s SNP.vcf \
--sv-file SV.vcf \
-b alignment.bam \
-r reference.fasta \
-t 8 \
-o phased_prefix \
--ont # or --pb for PacBio HiFi
Longqc¶
Introduction¶
LongQC is a tool for the data quality control of the PacBio and ONT long reads.
Versions¶
1.2.0c
Commands¶
longQC.py
Module¶
You can load the modules by:
module load biocontainers
module load longqc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run longqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=longqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers longqc
longQC.py sampleqc -x pb-rs2 -o out_dir seq.fastq
Lra¶
Introduction¶
Lra
is a sequence alignment program that aligns long reads from single-molecule sequencing (SMS) instruments, or megabase-scale contigs from SMS assemblies.
Versions¶
1.3.2
Commands¶
lra
Module¶
You can load the modules by:
module load biocontainers
module load lra
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lra on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=lra
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lra
lra index genome.fasta
lra align genome.fasta input.fastq -t 12 -p s > output.sam
Ltr_finder¶
Introduction¶
LTR_Finder is an efficient program for finding full-length LTR retrotransposons in genome sequences.
Versions¶
1.07
Commands¶
ltr_finder
check_result.pl
down_tRNA.pl
filter_rt.pl
genome_plot.pl
genome_plot2.pl
genome_plot_svg.pl
Module¶
You can load the modules by:
module load biocontainers
module load ltr_finder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ltr_finder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltr_finder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ltr_finder
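# -f writes the figure data to stderr, which 2>&1 sends into the pipe; the table output goes to the result file
ltr_finder 3ds_72.fa -P 3ds_72 -w2 -f /dev/stderr 2>&1 > test/3ds_72_result.txt \
| genome_plot.pl test/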
Ltrpred¶
Introduction¶
LTRpred(ict): de novo annotation of young and intact retrotransposons.
Versions¶
1.1.0
Commands¶
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load ltrpred
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ltrpred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ltrpred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ltrpred
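The job script above ends after loading the module, which provides R and Rscript. As a hedged sketch, the annotation itself could be started from Rscript as shown below. The file name genome.fasta is a placeholder, and the call assumes the LTRpred R package inside the container exposes LTRpred() with the genome.file and cores arguments.

```shell
# Placeholder input (assumption): the genome assembly to annotate.
genome=genome.fasta
threads=${SLURM_NTASKS:-1}   # follow the task count requested from Slurm

# Run only when Rscript and the genome file are available.
if command -v Rscript >/dev/null 2>&1 && [ -s "$genome" ]; then
    Rscript -e "LTRpred::LTRpred(genome.file = '$genome', cores = $threads)"
else
    echo "Rscript or $genome not available; skipping" >&2
fi
```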
Lumpy-sv¶
Introduction¶
Lumpy-sv
is a general probabilistic framework for structural variant discovery.
Versions¶
0.3.1
Commands¶
lumpy
lumpyexpress
Module¶
You can load the modules by:
module load biocontainers
module load lumpy-sv
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Lumpy-sv on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=lumpy-sv
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lumpy-sv
lumpy -mw 4 -tt 0.0 -pe \
bam_file:AL87.discordant.sort.bam,histo_file:AL87.histo,mean:429,stdev:84,read_length:83,min_non_overlap:83,discordant_z:4,back_distance:1,weight:1,id:1,min_mapping_threshold:20 \
-sr bam_file:AL87.sr.sort.bam,back_distance:1,weight:1,id:2,min_mapping_threshold:20
Lyveset¶
Introduction¶
Lyveset is a method of using hqSNPs to create a phylogeny, especially for outbreak investigations.
Versions¶
2.0.1
Commands¶
applyFstToTree.pl
cladeDistancesFromTree.pl
clusterPairwise.pl
convertAlignment.pl
downloadDataset.pl
errorProneRegions.pl
filterMatrix.pl
filterVcf.pl
genomeDist.pl
launch_bwa.pl
launch_set.pl
launch_smalt.pl
launch_snap.pl
launch_snpeff.pl
launch_varscan.pl
makeRegions.pl
matrixToAlignment.pl
pairwiseDistances.pl
pairwiseTo2d.pl
removeUninformativeSites.pl
removeUninformativeSitesFromMatrix.pl
run_assembly_isFastqPE.pl
run_assembly_metrics.pl
run_assembly_readMetrics.pl
run_assembly_removeDuplicateReads.pl
run_assembly_shuffleReads.pl
run_assembly_trimClean.pl
set_bayesHammer.pl
set_diagnose.pl
set_diagnose_msa.pl
set_downloadTestData.pl
set_findCliffs.pl
set_findPhages.pl
set_indexCase.pl
set_manage.pl
set_processPooledVcf.pl
set_samtools_depth.pl
set_test.pl
shuffleSplitReads.pl
snpDistribution.pl
vcfToAlignment.pl
vcfutils.pl
Module¶
You can load the modules by:
module load biocontainers
module load lyveset
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run lyveset on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=lyveset
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers lyveset
set_test.pl lambda
set_manage.pl --create setTest
Macrel¶
Introduction¶
Macrel is a pipeline to mine antimicrobial peptides (AMPs) from (meta)genomes.
Versions¶
1.2.0
Commands¶
macrel
Module¶
You can load the modules by:
module load biocontainers
module load macrel
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run macrel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macrel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macrel
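The job script above only loads the module. A minimal sketch of a step that could follow, mining AMPs from assembled contigs, is given below. The names contigs.fna.gz and macrel_contigs are placeholders.

```shell
# Placeholder names (assumptions): input contigs and an output directory.
contigs=contigs.fna.gz
outdir=macrel_contigs

# Run only when macrel and the contig file are available.
if command -v macrel >/dev/null 2>&1 && [ -s "$contigs" ]; then
    macrel contigs --fasta "$contigs" --output "$outdir"
else
    echo "macrel or $contigs not available; skipping" >&2
fi
```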
MACS2¶
Introduction¶
MACS2
is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.
Versions¶
2.2.7.1
Commands¶
macs2
Module¶
You can load the modules by:
module load biocontainers
module load macs2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MACS2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macs2
macs2 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01
Macs3¶
Introduction¶
MACS3
is Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites.
Versions¶
3.0.0a6
Commands¶
macs3
Module¶
You can load the modules by:
module load biocontainers
module load macs3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Macs3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=macs3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers macs3
macs3 callpeak -t ChIP.bam -c Control.bam -f BAM -g hs -n test -B -q 0.01
MAFFT¶
Introduction¶
MAFFT
is a multiple alignment program for amino acid or nucleotide sequences.
Versions¶
7.475
7.490
Commands¶
einsi
fftns
fftnsi
ginsi
linsi
mafft
mafft-distance
mafft-einsi
mafft-fftns
mafft-fftnsi
mafft-ginsi
mafft-homologs.rb
mafft-linsi
mafft-nwns
mafft-nwnsi
mafft-profile
mafft-qinsi
mafft-sparsecore.rb
mafft-xinsi
nwns
nwnsi
Module¶
You can load the modules by:
module load biocontainers
module load mafft
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MAFFT on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mafft
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mafft
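The job script above stops after loading the module. A minimal sketch of an alignment step that could follow is shown below. The file names input.fasta and aligned.fasta are placeholders; --auto lets MAFFT choose a strategy based on the data size.

```shell
# Placeholder file names (assumptions); replace with your own data.
input=input.fasta
output=aligned.fasta
threads=${SLURM_NTASKS:-1}   # follow the task count requested from Slurm

# Run only when mafft and the input file are available.
if command -v mafft >/dev/null 2>&1 && [ -s "$input" ]; then
    mafft --auto --thread "$threads" "$input" > "$output"
else
    echo "mafft or $input not available; skipping alignment" >&2
fi
```

To benefit from more threads, raise the value of #SBATCH -n in the job header accordingly.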
Mageck¶
Introduction¶
Model-based Analysis of Genome-wide CRISPR-Cas9 Knockout (MAGeCK) is a computational tool to identify important genes from the recent genome-scale CRISPR-Cas9 knockout screens (or GeCKO) technology.
Versions¶
0.5.9.5
Commands¶
mageck
mageckGSEA
RRA
Module¶
You can load the modules by:
module load biocontainers
module load mageck
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mageck on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mageck
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mageck
mageck count -l library.txt -n demo \
--sample-label L1,CTRL \
--fastq test1.fastq test2.fastq
mageck test -k demo.count.txt \
-t L1 -c CTRL -n demo
Magicblast¶
Introduction¶
Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.
Versions¶
1.5.0
Commands¶
magicblast
Module¶
You can load the modules by:
module load biocontainers
module load magicblast
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run magicblast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=magicblast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers magicblast
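The job script above only loads the module. As a minimal sketch, a mapping step could look like the lines below. The file names are placeholders, and the reference is passed directly as a FASTA file with -subject, so no BLAST database needs to be built first.

```shell
# Placeholder file names (assumptions); replace with your own data.
reads=reads.fastq
reference=reference.fa
output=aligned.sam

# Run only when magicblast and both inputs are available.
if command -v magicblast >/dev/null 2>&1 && [ -s "$reads" ] && [ -s "$reference" ]; then
    magicblast -query "$reads" -infmt fastq \
        -subject "$reference" \
        -out "$output"
else
    echo "magicblast or its inputs are not available; skipping" >&2
fi
```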
MAKER¶
Introduction¶
MAKER
is a popular genome annotation pipeline for both prokaryotic and eukaryotic genomes. This guide describes best practices for running MAKER on RCAC clusters. For detailed information about MAKER, see its official website (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018).
Versions¶
2.31.11
3.01.03
Commands¶
cegma2zff
chado2gff3
compare
cufflinks2gff3
evaluator
fasta_merge
fasta_tool
genemark_gtf2gff3
gff3_merge
iprscan2gff3
iprscan_wrap
ipr_update_gff
maker
maker2chado
maker2eval_gtf
maker2jbrowse
maker2wap
maker2zff
maker_functional
maker_functional_fasta
maker_functional_gff
maker_map_ids
map2assembly
map_data_ids
map_fasta_ids
map_gff_ids
tophat2gff3
Module¶
You can load the modules by:
module load biocontainers
module load maker/2.31.11 # OR maker/3.01.03
Note
Dfam release 3.5 (October 2021), downloaded from the Dfam website (https://www.dfam.org/home) and required by RepeatMasker, has been set up for users.
The RepeatMasker library is stored in /depot/itap/datasets/Maker/RepeatMasker/Libraries.
Prerequisites¶
After loading the MAKER module, users can create the MAKER control files with the following command:
maker -CTL
This will generate three files:
maker_opts.ctl (required to be modified)
maker_exe.ctl (do not need to modify this file)
maker_bopts.ctl (optionally modify this file)
maker_opts.ctl:
- If not using RepeatMasker, change model_org=all to model_org= (left empty).
- If using RepeatMasker, change model_org=all to an appropriate family/genus/species.
Example job non-mpi¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MAKER on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MAKER
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maker/2.31.11 # or maker/3.01.03
maker -c 24
Example job mpi¶
To use MAKER in MPI mode, we cannot use the maker modules. Instead we have to use the singularity image files stored in /apps/biocontainers/images
:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 5:00:00
#SBATCH -N 2
#SBATCH -n 24
#SBATCH -c 8
#SBATCH --job-name=MAKER_mpi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --mail-user=UserID@purdue.edu
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
## MAKER2
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_2.31.11.sif maker -c 8
## MAKER3
mpirun -n 24 singularity exec /apps/biocontainers/images/maker_3.01.03.sif maker -c 8
Manta¶
Introduction¶
Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads.
Versions¶
1.6.0
Commands¶
configManta.py
python
Module¶
You can load the modules by:
module load biocontainers
module load manta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run manta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=manta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers manta
configManta.py --normalBam=HCC1954.NORMAL.30x.compare.COST16011_region.bam \
--tumorBam=G15512.HCC1954.1.COST16011_region.bam \
--referenceFasta=Homo_sapiens_assembly19.COST16011_region.fa \
--region=8:107652000-107655000 \
--region=11:94974000-94989000 \
--exome --runDir="MantaDemoAnalysis"
python MantaDemoAnalysis/runWorkflow.py
Mapcaller¶
Introduction¶
Mapcaller
is an efficient and versatile approach for short-read mapping and variant identification using high-throughput sequenced data.
Versions¶
0.9.9.41
Commands¶
MapCaller
Module¶
You can load the modules by:
module load biocontainers
module load mapcaller
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mapcaller on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mapcaller
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mapcaller
MapCaller index ref.fasta ref
MapCaller -t 12 -i ref -f input_1.fastq -f2 input_2.fastq -vcf out.vcf
Mapdamage2¶
Introduction¶
mapDamage2 is a computational framework written in Python and R, which tracks and quantifies DNA damage patterns among ancient DNA sequencing reads generated by Next-Generation Sequencing platforms.
Versions¶
2.2.1
Commands¶
mapDamage
Module¶
You can load the modules by:
module load biocontainers
module load mapdamage2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mapdamage2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mapdamage2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mapdamage2
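The job script above ends after loading the module. A minimal sketch of a damage-profiling step that could follow is shown below. The names mapped.bam and reference.fasta are placeholders for the aligned ancient DNA reads and the reference they were mapped to.

```shell
# Placeholder file names (assumptions); replace with your own data.
bam=mapped.bam
reference=reference.fasta

# Run only when mapDamage and both inputs are available.
if command -v mapDamage >/dev/null 2>&1 && [ -s "$bam" ] && [ -s "$reference" ]; then
    mapDamage -i "$bam" -r "$reference"
else
    echo "mapDamage or its inputs are not available; skipping" >&2
fi
```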
Marginpolish¶
Introduction¶
MarginPolish is a graph-based assembly polisher. It iteratively finds multiple probable alignment paths for run-length-encoded reads and uses these to generate a refined sequence. It takes as input a FASTA assembly and an indexed BAM (ONT reads aligned to the assembly), and it produces a polished FASTA assembly.
Versions¶
0.1.3
Commands¶
marginpolish
Module¶
You can load the modules by:
module load biocontainers
module load marginpolish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run marginpolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=marginpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers marginpolish
marginpolish \
Reads_to_assembly_StaphAur.bam \
Draft_assembly_StaphAur.fasta \
helen_models/MP_r941_guppy344_microbial.json \
-t 32 \
-o mp_output/mp_images \
-f
Mash¶
Introduction¶
Mash
is a fast sequence distance estimator that uses MinHash.
Versions¶
2.3
Commands¶
mash
Module¶
You can load the modules by:
module load biocontainers
module load mash
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mash
mash dist genome1.fasta genome2.fasta
Mashmap¶
Introduction¶
Mashmap
is a fast approximate aligner for long DNA sequences.
Versions¶
2.0-pl5321
Commands¶
mashmap
Module¶
You can load the modules by:
module load biocontainers
module load mashmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mashmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=mashmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mashmap
mashmap -r ref.fasta -t 12 -q input.fasta
Mashtree¶
Introduction¶
Mashtree
is a tool to create a tree using Mash distances.
Versions¶
1.2.0
Commands¶
mashtree
mashtree_bootstrap.pl
mashtree_cluster.pl
mashtree_init.pl
mashtree_jackknife.pl
mashtree_wrapper_deprecated.pl
Module¶
You can load the modules by:
module load biocontainers
module load mashtree
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mashtree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mashtree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mashtree
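The job script above only loads the module. A minimal sketch of a tree-building step that could follow is given below. The directory genomes/ and the output name mashtree.dnd are placeholders; the tree is written to standard output in Newick format.

```shell
# Placeholder names (assumptions): a directory of FASTA genomes and an output tree file.
genomes_dir=genomes
tree=mashtree.dnd
threads=${SLURM_NTASKS:-1}   # follow the task count requested from Slurm

# Run only when mashtree and the input directory are available.
if command -v mashtree >/dev/null 2>&1 && [ -d "$genomes_dir" ]; then
    mashtree --numcpus "$threads" "$genomes_dir"/*.fasta > "$tree"
else
    echo "mashtree or $genomes_dir not available; skipping" >&2
fi
```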
Masurca¶
Introduction¶
The MaSuRCA (Maryland Super Read Cabog Assembler) genome assembly and analysis toolkit consists of the MaSuRCA genome assembler, the QuORUM error corrector for Illumina data, the POLCA genome polishing software, a chromosome scaffolder, the jellyfish mer counter, and the MUMmer aligner.
Versions¶
4.0.9
4.1.0
Commands¶
masurca
build_human_reference.sh
chromosome_scaffolder.sh
close_gaps.sh
close_scaffold_gaps.sh
correct_with_k_unitigs.sh
deduplicate_contigs.sh
deduplicate_unitigs.sh
eugene.sh
extract_chrM.sh
filter_library.sh
final_polish.sh
fix_unitigs.sh
fragScaff.sh
mega_reads_assemble_cluster.sh
mega_reads_assemble_cluster2.sh
mega_reads_assemble_polish.sh
mega_reads_assemble_ref.sh
parallel_delta-filter.sh
polca.sh
polish_with_illumina_assembly.sh
recompute_astat_superreads.sh
recompute_astat_superreads_CA8.sh
reconcile_alignments.sh
refine.sh
resolve_trio.sh
run_ECR.sh
samba.sh
splitScaffoldsAtNs.sh
Module¶
You can load the modules by:
module load biocontainers
module load masurca
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run masurca on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=masurca
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers masurca
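The job script above stops after loading the module. As a hedged sketch, a paired-end Illumina assembly could be started from the command line as shown below. The read file names are placeholders, and a real assembly will normally need far more cores, memory, and wall time than this example header requests.

```shell
# Placeholder read files (assumptions); replace with your own paired-end data.
reads_1=pe_R1.fastq
reads_2=pe_R2.fastq
threads=${SLURM_NTASKS:-1}   # follow the task count requested from Slurm

# Run only when masurca and both read files are available.
if command -v masurca >/dev/null 2>&1 && [ -s "$reads_1" ] && [ -s "$reads_2" ]; then
    masurca -t "$threads" -i "$reads_1","$reads_2"
else
    echo "masurca or its inputs are not available; skipping" >&2
fi
```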
Mauve¶
Introduction¶
Mauve
is a system for constructing multiple genome alignments in the presence of large-scale evolutionary events such as rearrangement and inversion.
Versions¶
2.4.0
Commands¶
mauveAligner
progressiveMauve
Module¶
You can load the modules by:
module load biocontainers
module load mauve
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mauve on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mauve
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mauve
mauveAligner seqs.fasta --output=mauveAligner_output
progressiveMauve --output=threeway.xmfa \
--output-guide-tree=threeway.tree \
--backbone-output=threeway.backbone genome1.gbk genome2.gbk genome3.gbk
Maxbin2¶
Introduction¶
MaxBin2 is software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.
Versions¶
2.2.7
Commands¶
run_MaxBin.pl
run_FragGeneScan.pl
Module¶
You can load the modules by:
module load biocontainers
module load maxbin2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run maxbin2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxbin2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maxbin2
run_MaxBin.pl -contig subset_assembly.fa \
-abund_list abundance.list -max_iteration 5 -out mbin
Maxquant¶
Introduction¶
Maxquant
is a quantitative proteomics software package designed for analyzing large mass-spectrometric data sets. It is specifically aimed at high-resolution MS data.
Versions¶
2.1.0.0
2.1.3.0
2.1.4.0
2.3.1.0
Commands¶
MaxQuantGui.exe
MaxQuantCmd.exe
Module¶
You can load the modules by:
module load biocontainers
module load maxquant
GUI¶
To run Maxquant with GUI, it is recommended to run within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers maxquant
(base) UserID@bell-a008:~ $ MaxQuantGui.exe

CMD job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Maxquant without GUI on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=maxquant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers maxquant
MaxQuantCmd.exe mqpar.xml
Mcl¶
Introduction¶
Mcl
is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs.
Versions¶
14.137-pl5262
Commands¶
clm
clmformat
clxdo
mcl
mclblastline
mclcm
mclpipeline
mcx
mcxarray
mcxassemble
mcxdeblast
mcxdump
mcxi
mcxload
mcxmap
mcxrand
mcxsubs
Module¶
You can load the modules by:
module load biocontainers
module load mcl
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mcl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mcl
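The job script above only loads the module. A minimal sketch of a clustering step that could follow is shown below. The names graph.abc and clusters.txt are placeholders; graph.abc is assumed to hold one edge per line as "nodeA nodeB weight", which is what --abc tells mcl to expect. The inflation value 2.0 is only an illustrative choice.

```shell
# Placeholder file names (assumptions); replace with your own data.
graph=graph.abc
clusters=clusters.txt

# Run only when mcl and the graph file are available.
if command -v mcl >/dev/null 2>&1 && [ -s "$graph" ]; then
    mcl "$graph" --abc -I 2.0 -o "$clusters"
else
    echo "mcl or $graph not available; skipping" >&2
fi
```

Higher -I values generally yield finer-grained clusters, so it is worth comparing a few settings.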
Mcscanx¶

Introduction¶
The MCScanX package has two major components: a modified version of MCscan algorithm allowing users to handle MCScan more conveniently and to view multiple alignment of syntenic blocks more clearly, and a variety of downstream analysis tools to conduct different biological analyses based on the synteny data generated by the modified MCScan algorithm.
Versions¶
default
Commands¶
MCScanX
MCScanX_h
duplicate_gene_classifier
add_ka_and_ks_to_collinearity
add_kaks_to_synteny
detect_collinearity_within_gene_families
detect_synteny_within_gene_families
group_collinear_genes
group_syntenic_genes
origin_enrichment_analysis
Module¶
You can load the modules by:
module load biocontainers
module load mcscanx
Helper command¶
Note
To conduct downstream analyses, users need to copy the folder downstream_analyses
from the container into the host system.
A helper command copy_downstream_analyses
is provided to simplify the task. Follow the procedure below to copy downstream_analyses into the target directory:
$ copy_downstream_analyses $PWD # this will copy the downstream_analyses into the current directory.
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mcscanx on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mcscanx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mcscanx
## Run MCScanX
MCScanX Result/merge
## Copy downstream_analyses
copy_downstream_analyses $PWD
## Downstream analyses (the ../Result paths below are relative to the copied folder)
cd downstream_analyses
java circle_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_circ.ctl -o ../Result/merge_circle.png
java dot_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dot.png
java dual_synteny_plotter -g ../Result/merge.gff -s ../Result/merge.collinearity -c ../Result/merge_dot.ctl -o ../Result/merge_dual_synteny.png
Medaka¶
Introduction¶
Medaka
is a tool to create consensus sequences and variant calls from nanopore sequencing data.
Versions¶
1.6.0
Commands¶
medaka
medaka_consensus
medaka_counts
medaka_data_path
medaka_haploid_variant
medaka_version_report
Module¶
You can load the modules by:
module load biocontainers
module load medaka
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Medaka on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=medaka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers medaka
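The job script above ends after loading the module. A minimal sketch of a consensus step that could follow is shown below. The names reads.fastq, draft.fasta, and medaka_out are placeholders. No model is selected here, so medaka falls back to its default; in practice the model should match the basecaller that produced the reads.

```shell
# Placeholder names (assumptions); replace with your own data.
reads=reads.fastq
draft=draft.fasta
outdir=medaka_out
threads=${SLURM_NTASKS:-1}   # follow the task count requested from Slurm

# Run only when medaka_consensus and both inputs are available.
if command -v medaka_consensus >/dev/null 2>&1 && [ -s "$reads" ] && [ -s "$draft" ]; then
    medaka_consensus -i "$reads" -d "$draft" -o "$outdir" -t "$threads"
else
    echo "medaka_consensus or its inputs are not available; skipping" >&2
fi
```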
Megadepth¶
Introduction¶
Megadepth
is an efficient tool for extracting coverage related information from RNA and DNA-seq BAM and BigWig files.
Versions¶
1.2.0
Commands¶
megadepth
Module¶
You can load the modules by:
module load biocontainers
module load megadepth
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megadepth on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megadepth
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megadepth
megadepth sorted.bam
Megahit¶
Introduction¶
Megahit
is an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph.
Versions¶
1.2.9
Commands¶
megahit
Module¶
You can load the modules by:
module load biocontainers
module load megahit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megahit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=megahit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megahit
megahit --12 SRR1976948.abundtrim.subset.pe.fq.gz,SRR1977249.abundtrim.subset.pe.fq.gz -o combined
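The job above requests 12 tasks but does not pass a thread count to megahit. A possible variant, sketched here with the same input files, forwards the Slurm task count through megahit's `-t` option. The variable names are illustrative only.

```shell
#!/bin/bash
# Use the task count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"
# Interleaved paired-end inputs, as in the example above.
READS=SRR1976948.abundtrim.subset.pe.fq.gz,SRR1977249.abundtrim.subset.pe.fq.gz
OUTDIR=combined

if command -v megahit >/dev/null 2>&1; then
    megahit --12 "$READS" -t "$THREADS" -o "$OUTDIR"
else
    echo "megahit not found; load it first with: ml biocontainers megahit" >&2
fi
```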
Megan¶
Introduction¶
Megan
is a computer program that allows optimized analysis of large metagenomic datasets. Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample.
Versions¶
6.21.7
Commands¶
MEGAN
blast2lca
blast2rma
daa2info
daa2rma
daa-meganizer
gc-assembler
rma2info
sam2rma
references-annotator
Module¶
You can load the modules by:
module load biocontainers
module load megan
GUI¶
To run MEGAN with its GUI, we recommend running it within ThinLinc:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers megan
(base) UserID@bell-a008:~ $ MEGAN

Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Megan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=megan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers megan
Meme¶
Introduction¶
Meme
is a collection of tools for the discovery and analysis of sequence motifs.
Versions¶
5.3.3
5.4.1
5.5.0
Commands¶
ame
centrimo
dreme
dust
fimo
glam2
glam2scan
gomo
mast
mcast
meme
meme-chip
momo
purge
spamo
tomtom
Module¶
You can load the modules by:
module load biocontainers
module load meme
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Meme on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meme
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meme
meme seq.fasta -dna -mod oops -pal
meme-chip Klf1.fna -o memechip_klf1_out
Memes¶
Introduction¶
memes is an R interface to the MEME Suite family of tools, which provides several utilities for performing motif analysis on DNA, RNA, and protein sequences. memes works by detecting a local install of the MEME suite, running the commands, then importing the results directly into R.
Versions¶
1.1.2
Commands¶
R
Module¶
You can load the modules by:
module load biocontainers
module load memes
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run memes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=memes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers memes
Meraculous¶
Introduction¶
Meraculous is a whole genome assembler for Next Generation Sequencing data, geared for large genomes. Its hybrid k-mer/read-based approach capitalizes on the high accuracy of Illumina sequence by eschewing an explicit error correction step, which the authors argue is redundant with the assembly process. Meraculous achieves high performance with large datasets by utilizing lightweight data structures and multi-threaded parallelization, allowing human-sized genomes to be assembled on a high-CPU cluster in under a day. The process pipeline implements a highly transparent and portable model of job control and monitoring where different assembly stages can be executed and re-executed separately or in unison on a wide variety of architectures.
Versions¶
2.2.6
Commands¶
run_meraculous.sh
blastMapAnalyzer2.pl
bmaToLinks.pl
_bubbleFinder2.pl
bubblePopper.pl
bubbleScout.pl
contigBias.pl
divide_it.pl
fasta_splitter.pl
findDMin2.pl
gapDivider.pl
gapPlacer.pl
haplotyper.Naive.pl
haplotyper.pl
histogram2.pl
kmerHistAnalyzer.pl
loadBalanceMers.pl
meraculous4h.pl
meraculous.pl
N50.pl
_oNo4.pl
oNo7.pl
optimize2.pl
randomList2.pl
scaffold2contig.pl
scaffReportToFasta.pl
screen_list2.pl
spanner.pl
splinter.pl
splinter_scaffolds.pl
split_and_validate_reads.pl
test_dependencies.pl
unique.pl
Module¶
You can load the modules by:
module load biocontainers
module load meraculous
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run meraculous on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meraculous
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meraculous
Merqury¶
Introduction¶
Merqury is a tool to evaluate genome assemblies with k-mers and more.
Versions¶
1.3
Commands¶
merqury.sh
Module¶
You can load the modules by:
module load biocontainers
module load merqury
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run merqury on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=merqury
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers merqury
merqury.sh F1.k18.meryl col0.hapmer.meryl cvi0.hapmer.meryl \
athal_COL.fasta athal_CVI.fasta test
Meryl¶
Introduction¶
Meryl
is a genomic k-mer counter (and sequence utility) with nice features.
Versions¶
1.3
Commands¶
meryl
meryl-analyze
meryl-import
meryl-lookup
meryl-simple
Module¶
You can load the modules by:
module load biocontainers
module load meryl
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Meryl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=meryl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers meryl
meryl count k=42 data/ec.fna.gz output ec.meryl
Metabat¶
Introduction¶
Metabat
is a robust statistical framework for reconstructing genomes from metagenomic data.
Versions¶
2.15-5
Commands¶
aggregateBinDepths.pl
aggregateContigOverlapsByBin.pl
contigOverlaps
jgi_summarize_bam_contig_depths
merge_depths.pl
metabat
metabat1
metabat2
runMetaBat.sh
Module¶
You can load the modules by:
module load biocontainers
module load metabat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Metabat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=metabat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metabat
metabat2 -m 10000 \
-t 24 \
-i contig.fasta \
-o metabat2_output \
-a depth.txt
Metachip¶
Introduction¶
Metachip is a pipeline for Horizontal gene transfer (HGT) identification.
Versions¶
1.10.12
Commands¶
MetaCHIP
Module¶
You can load the modules by:
module load biocontainers
module load metachip
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metachip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metachip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metachip
MetaPhlAn 3¶
Introduction¶
MetaPhlAn
(Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
up to 25,000 reads-per-second (on one CPU) analysis speed (orders of magnitude faster compared to existing methods);
unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;
accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
species-level resolution for bacteria, archaea, eukaryotes and viruses;
extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
Versions¶
3.0.14
3.0.9
4.0.2
Commands¶
metaphlan
Database¶
The latest version of the database (mpa_v30) has been downloaded and built in /depot/itap/datasets/metaphlan/.
Module¶
You can load the modules by:
module load biocontainers
module load metaphlan/3.0.14
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MetaPhlAn on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=MetaPhlAn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metaphlan/3.0.14
DATABASE=/depot/itap/datasets/metaphlan/
metaphlan SRR11234553_1.fastq,SRR11234553_2.fastq --input_type fastq --nproc 24 -o profiled_metagenome.txt --bowtie2db $DATABASE --bowtie2out metagenome.bowtie2.bz2
Metaseq¶
Introduction¶
Metaseq is a Python package for integrative genome-wide analysis. It was originally used to reveal relationships between chromatin insulators and associated nuclear mRNA.
Versions¶
0.5.6
Commands¶
python
python2
Module¶
You can load the modules by:
module load biocontainers
module load metaseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metaseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metaseq
Methyldackel¶
Introduction¶
MethylDackel (formerly named PileOMeth, which was a temporary name derived due to it using a PILEup to extract METHylation metrics) will process a coordinate-sorted and indexed BAM or CRAM file containing some form of BS-seq alignments and extract per-base methylation metrics from them. MethylDackel requires an indexed fasta file containing the reference genome as well.
Versions¶
0.6.1
Commands¶
MethylDackel
Module¶
You can load the modules by:
module load biocontainers
module load methyldackel
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run methyldackel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=methyldackel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers methyldackel
MethylDackel extract chgchh.fa chgchh_aln.bam
Metilene¶
Introduction¶
Metilene is a versatile tool to study the effect of epigenetic modifications in differentiation/development, tumorigenesis, and systems biology on a global, genome-wide level.
Versions¶
0.2.8
Commands¶
metilene
metilene_input.pl
metilene_output.pl
metilene_output.R
Module¶
You can load the modules by:
module load biocontainers
module load metilene
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run metilene on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=metilene
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers metilene
metilene -a g1 -b g2 methylation-file
Mhm2¶
Introduction¶
MetaHipMer is a de novo metagenome short-read assembler. Version 2 (MHM2) is written entirely in UPC++ and runs efficiently on both single servers and on multinode supercomputers, where it can scale up to coassemble terabase-sized metagenomes.
Versions¶
2.0.0
Commands¶
mhm2.py
Module¶
You can load the modules by:
module load biocontainers
module load mhm2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mhm2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mhm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mhm2
mhm2.py -r input_1.fastq,input_2.fastq
MicrobeDMM¶
Introduction¶
MicrobeDMM
is a suite of programs used for empirical Bayes fitting of DMM models.
Versions¶
1.0
Commands¶
DirichletMixtureGHPFit
Module¶
You can load the modules by:
module load biocontainers
module load microbedmm
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MicrobeDMM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=microbedmm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers microbedmm
Minialign¶
Introduction¶
Minialign
is a little bit fast and moderately accurate nucleotide sequence alignment tool designed for PacBio and Nanopore long reads.
Versions¶
0.5.3
Commands¶
minialign
Module¶
You can load the modules by:
module load biocontainers
module load minialign
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Minialign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minialign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minialign
minialign -d index.mai genome.fasta
minialign -l index.mai input.fastq > out.sam
Miniasm¶
Introduction¶
Miniasm
is a very fast OLC-based de novo assembler for noisy long reads.
Versions¶
0.3_r179
Commands¶
miniasm
minidot
Module¶
You can load the modules by:
module load biocontainers
module load miniasm
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Miniasm on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=miniasm
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers miniasm
miniasm -f Elysia_ont_test.fq Elysia_reads.paf.gz \
> Elysia_reads.gfa
Minimap2¶
Introduction¶
Minimap2
is a versatile pairwise aligner for genomic and spliced nucleotide sequences.
Versions¶
2.22
2.24
2.26
Commands¶
minimap2
paftools.js
k8
Module¶
You can load the modules by:
module load biocontainers
module load minimap2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Minimap2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=minimap2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minimap2
minimap2 -ax sr Wuhan-Hu-1.fasta \
seq_1.fastq seq_2.fastq \
> aln.sam
Minipolish¶
Introduction¶
Minipolish is a tool for Racon polishing of miniasm assemblies.
Versions¶
0.1.3
Commands¶
minipolish
Module¶
You can load the modules by:
module load biocontainers
module load minipolish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run minipolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=minipolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers minipolish
minipolish -t 8 long_reads.fastq.gz assembly.gfa > polished.gfa
Miniprot¶
Introduction¶
Miniprot aligns a protein sequence against a genome with affine gap penalty, splicing and frameshift. It is primarily intended for annotating protein-coding genes in a new species using known genes from other species. Miniprot is similar to GeneWise and Exonerate in functionality but it can map proteins to whole genomes and is much faster at the residue alignment step.
Versions¶
0.3
0.7
Commands¶
miniprot
Module¶
You can load the modules by:
module load biocontainers
module load miniprot
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run miniprot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=miniprot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers miniprot
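The job script above stops after loading the module. A minimal sketch of an alignment step, assuming a genome FASTA and a protein FASTA with the hypothetical names below, could look like this. It uses only the basic `miniprot [options] <genome> <proteins>` form with `-t` for threads, and the output is written in miniprot's default PAF format.

```shell
#!/bin/bash
# Hypothetical inputs: replace with your own genome and protein sequences.
GENOME=genome.fna
PROTEINS=proteins.faa
OUT=proteins_vs_genome.paf
# Use the task count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"

# Run only when the tool and both inputs are available.
if command -v miniprot >/dev/null 2>&1 && [ -s "$GENOME" ] && [ -s "$PROTEINS" ]; then
    miniprot -t "$THREADS" "$GENOME" "$PROTEINS" > "$OUT"
else
    echo "miniprot or its inputs not found; nothing was run" >&2
fi
```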
miRDeep2¶
Introduction¶
miRDeep2
discovers active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, …).
Versions¶
2.0.1.3
Commands¶
bwa_sam_converter.pl
clip_adapters.pl
collapse_reads_md.pl
convert_bowtie_output.pl
excise_precursors_iterative_final.pl
excise_precursors.pl
extract_miRNAs.pl
fastaparse.pl
fastaselect.pl
fastq2fasta.pl
find_read_count.pl
geo2fasta.pl
get_mirdeep2_precursors.pl
illumina_to_fasta.pl
make_html2.pl
make_html.pl
mapper.pl
mirdeep2bed.pl
miRDeep2_core_algorithm.pl
miRDeep2.pl
parse_mappings.pl
perform_controls.pl
permute_structure.pl
prepare_signature.pl
quantifier.pl
remove_white_space_in_id.pl
rna2dna.pl
samFLAGinfo.pl
sam_reads_collapse.pl
sanity_check_genome.pl
sanity_check_mapping_file.pl
sanity_check_mature_ref.pl
sanity_check_reads_ready_file.pl
select_for_randfold.pl
survey.pl
Module¶
You can load the modules by:
module load biocontainers
module load mirdeep2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run miRDeep2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirdeep2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mirdeep2
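The job script above does not include an analysis step. As a rough sketch of the usual two-stage workflow, the snippet below first maps and collapses reads with `mapper.pl` and then runs `miRDeep2.pl` on the result. All file names are hypothetical, the reads are assumed to be adapter-trimmed FASTA, and a Bowtie index with the prefix `genome_index` is assumed to have been built beforehand.

```shell
#!/bin/bash
# Hypothetical inputs: replace with your own files.
READS=reads.fa                 # adapter-trimmed reads in FASTA format
GENOME=genome.fa               # reference genome
INDEX=genome_index             # prefix of a prebuilt Bowtie index (assumption)
MATURE=mature_ref.fa           # known mature miRNAs of this species
MATURE_OTHER=mature_other.fa   # mature miRNAs of related species
HAIRPIN=hairpin_ref.fa         # known precursors of this species
COLLAPSED=reads_collapsed.fa
ARF=reads_collapsed_vs_genome.arf

if command -v mapper.pl >/dev/null 2>&1 && [ -s "$READS" ] && [ -s "$GENOME" ]; then
    # Step 1: parse FASTA reads (-c), drop entries with non-canonical letters (-j),
    # discard reads shorter than 18 nt (-l), collapse duplicates (-m), map to the genome (-p).
    mapper.pl "$READS" -c -j -l 18 -m -p "$INDEX" -s "$COLLAPSED" -t "$ARF" -v
    # Step 2: predict known and novel miRNAs; progress messages go to report.log.
    miRDeep2.pl "$COLLAPSED" "$GENOME" "$ARF" "$MATURE" "$MATURE_OTHER" "$HAIRPIN" 2> report.log
else
    echo "mapper.pl or its inputs not found; nothing was run" >&2
fi
```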
Mirtop¶
Introduction¶
Mirtop is a command line tool to annotate miRNAs and isomiRs using a standard naming.
Versions¶
0.4.25
Commands¶
mirtop
Module¶
You can load the modules by:
module load biocontainers
module load mirtop
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mirtop on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mirtop
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mirtop
mirtop gff --format prost --sps hsa \
--hairpin examples/annotate/hairpin.fa \
--gtf examples/annotate/hsa.gff3 \
-o test_out \
examples/prost/prost.example.txt
Mitofinder¶
Introduction¶
Mitofinder
is a pipeline to assemble mitochondrial genomes and annotate mitochondrial genes from trimmed read sequencing data.
Versions¶
1.4.1
Commands¶
mitofinder
Module¶
You can load the modules by:
module load biocontainers
module load mitofinder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mitofinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mitofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mitofinder
mitofinder -j Aphaenogaster_megommata_SRR1303315 \
-1 Aphaenogaster_megommata_SRR1303315_R1_cleaned.fastq.gz \
-2 Aphaenogaster_megommata_SRR1303315_R2_cleaned.fastq.gz \
-r reference.gb -o 5 -p 5 -m 10
Mlst¶
Introduction¶
Mlst is used to scan contig files against traditional PubMLST typing schemes.
Versions¶
2.22.0
2.23.0
Commands¶
mlst
Module¶
You can load the modules by:
module load biocontainers
module load mlst
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mlst on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mlst
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mlst
mlst contigs.fa
mlst genome.gbk.gz
Mmseqs2¶
Introduction¶
Mmseqs2
is a software suite to search and cluster huge protein and nucleotide sequence sets.
Versions¶
13.45111
14.7e284
Commands¶
mmseqs
Module¶
You can load the modules by:
module load biocontainers
module load mmseqs2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mmseqs2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mmseqs2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mmseqs2
mmseqs createdb examples/DB.fasta targetDB
mmseqs createtaxdb targetDB tmp
mmseqs createindex targetDB tmp
mmseqs easy-taxonomy examples/QUERY.fasta targetDB alnRes tmp
Mob_suite¶
Introduction¶
MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies.
Versions¶
3.0.3
Commands¶
mob_cluster
mob_init
mob_recon
mob_typer
Module¶
You can load the modules by:
module load biocontainers
module load mob_suite
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mob_suite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mob_suite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mob_suite
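The job script above ends after loading the module. A minimal sketch of a plasmid reconstruction step with `mob_recon` follows. The assembly file and output directory names are hypothetical, and the MOB-suite reference databases are assumed to be already available inside the container.

```shell
#!/bin/bash
# Hypothetical input: a draft assembly in FASTA format.
ASSEMBLY=assembly.fasta
OUTDIR=mob_recon_out

# Run only when the tool and the input are available.
if command -v mob_recon >/dev/null 2>&1 && [ -s "$ASSEMBLY" ]; then
    mob_recon --infile "$ASSEMBLY" --outdir "$OUTDIR"
else
    echo "mob_recon or $ASSEMBLY not found; nothing was run" >&2
fi
```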
Modbam2bed¶
Introduction¶
Modbam2bed is a program to aggregate modified base counts stored in a modified-base BAM file to a bedMethyl file.
Versions¶
0.9.1
Commands¶
modbam2bed
Module¶
You can load the modules by:
module load biocontainers
module load modbam2bed
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run modbam2bed on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modbam2bed
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers modbam2bed
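No command is shown in the job script above. The following is a hedged sketch of one way to summarise 5mC calls at CpG sites into a bedMethyl file. The reference and BAM file names are hypothetical, and the option names (`-m`, `--cpg`, `-t`) are taken from the modbam2bed usage text as best recalled; check them against `modbam2bed --help` for the installed version.

```shell
#!/bin/bash
# Hypothetical inputs: a reference FASTA and a sorted, indexed modified-base BAM.
REF=reference.fasta
BAM=reads.modbam.bam
OUT=reads.cpg.bed
# Use the task count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"

# Run only when the tool and both inputs are available.
if command -v modbam2bed >/dev/null 2>&1 && [ -s "$REF" ] && [ -s "$BAM" ]; then
    modbam2bed -m 5mC --cpg -t "$THREADS" "$REF" "$BAM" > "$OUT"
else
    echo "modbam2bed or its inputs not found; nothing was run" >&2
fi
```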
Modeltest-ng¶
Introduction¶
ModelTest-NG is a tool for selecting the best-fit model of evolution for DNA and protein alignments. ModelTest-NG supersedes jModelTest and ProtTest in one single tool, with graphical and command console interfaces.
Versions¶
0.1.7
Commands¶
modeltest-ng
modeltest-ng-mpi
Module¶
You can load the modules by:
module load biocontainers
module load modeltest-ng
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run modeltest-ng on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=modeltest-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers modeltest-ng
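The job script above does not run a model search. A minimal sketch for a nucleotide alignment is given below, using the `-i` (input alignment), `-d` (data type, `nt` or `aa`), `-p` (number of processes) and `-o` (output prefix) options. The alignment name and output prefix are hypothetical.

```shell
#!/bin/bash
# Hypothetical input: a multiple sequence alignment of DNA sequences.
ALIGNMENT=alignment.fasta
PREFIX=modeltest_out
# Use the task count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"

# Run only when the tool and the input are available.
if command -v modeltest-ng >/dev/null 2>&1 && [ -s "$ALIGNMENT" ]; then
    modeltest-ng -i "$ALIGNMENT" -d nt -p "$THREADS" -o "$PREFIX"
else
    echo "modeltest-ng or $ALIGNMENT not found; nothing was run" >&2
fi
```

For protein alignments, `-d aa` would be used instead of `-d nt`.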
Momi¶
Introduction¶
momi (MOran Models for Inference) is a Python package that computes the expected sample frequency spectrum (SFS), a statistic commonly used in population genetics, and uses it to fit demographic history.
Versions¶
2.1.19
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load momi
Interactive job¶
To run momi interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers momi
(base) UserID@bell-a008:~ $ python
Python 3.9.7 (default, Sep 16 2021, 13:09:58)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import momi
>>> import logging
>>> logging.basicConfig(level=logging.INFO,
...                     filename="tutorial.log")
>>> model = momi.DemographicModel(N_e=1.2e4, gen_time=29,
...                               muts_per_gen=1.25e-8)
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run momi on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=momi
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers momi
python python.py
Mothur¶
Introduction¶
Mothur
is an open source software package for bioinformatics data processing. The package is frequently used in the analysis of DNA from uncultured microbes.
Detailed information about Mothur can be found here: https://mothur.org
Versions¶
1.46.0
1.47.0
1.48.0
Commands¶
mothur
Module¶
You can load the modules by:
module load biocontainers
module load mothur/1.47.0
Interactive job¶
To run mothur
interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers mothur/1.47.0
(base) UserID@bell-a008:~ $ mothur
Linux version
Using ReadLine,Boost,HDF5,GSL
mothur v.1.47.0
Last updated: 1/21/22
by
Patrick D. Schloss
Department of Microbiology & Immunology
University of Michigan
http://www.mothur.org
When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.
Distributed under the GNU General Public License
Type 'help()' for information on the commands that are available
For questions and analysis support, please visit our forum at https://forum.mothur.org
Type 'quit()' to exit program
[NOTE]: Setting random seed to 19760620.
Interactive Mode
mothur > align.seqs(help)
mothur > quit()
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a batch job with sbatch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=mothur
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mothur/1.47.0
mothur batch_file
Motus¶
Introduction¶
The mOTU profiler is a computational tool that estimates relative taxonomic abundance of known and currently unknown microbial community members using metagenomic shotgun sequencing data.
Versions¶
3.0.3
Commands¶
motus
Module¶
You can load the modules by:
module load biocontainers
module load motus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run motus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=motus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers motus
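The job script above only loads the module. As a minimal sketch, a paired-end sample could be profiled with `motus profile`, using `-f` and `-r` for the forward and reverse reads, `-t` for threads and `-o` for the output file. The read file names are hypothetical, and the mOTUs marker database is assumed to be bundled with the container.

```shell
#!/bin/bash
# Hypothetical inputs: paired-end metagenomic reads.
FWD=sample_1.fastq.gz
REV=sample_2.fastq.gz
OUT=sample.motus
# Use the task count granted by Slurm; fall back to 1 outside a job.
THREADS="${SLURM_NTASKS:-1}"

# Run only when the tool and both read files are available.
if command -v motus >/dev/null 2>&1 && [ -s "$FWD" ] && [ -s "$REV" ]; then
    motus profile -f "$FWD" -r "$REV" -t "$THREADS" -o "$OUT"
else
    echo "motus or its inputs not found; nothing was run" >&2
fi
```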
MrBayes¶
Introduction¶
MrBayes
is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. MrBayes uses Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of model parameters.
MrBayes is available both in a serial version (‘mb’) and in a parallel version (‘mb-mpi’) that uses MPI instructions to distribute computations across several processors or processor cores. The serial version does not support multi-threading, which means that you will not be able to utilize more than one core on a multi-core machine for a single MrBayes analysis. If you want to utilize all cores, you need to run the MPI version of MrBayes.
Note: ‘mb-mpi’ in this version of the container does not run across multiple nodes (only within a node). This is a bug in the container (upstream).
Versions¶
3.2.7
Commands¶
mb
mb-mpi
mpirun
mpiexec
Module¶
You can load the modules by:
module load biocontainers
module load mrbayes
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run MrBayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mrbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mrbayes
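The job script above loads the module but does not start an analysis. As a sketch of the serial versus MPI choice described in the introduction, the helper below prints the command line that matches the number of tasks requested with `#SBATCH -n`. The function name mrbayes_cmd and the input file analysis.nex are hypothetical; `mpirun -np` is the usual way to set the process count, and the process count should stay within a single node because of the container limitation noted above.

```shell
# Hypothetical helper: print the MrBayes command line for a given
# NEXUS file and task count (defaults to 1 task).
mrbayes_cmd() {
    nexus="$1"
    ntasks="${2:-1}"
    if [ "$ntasks" -gt 1 ]; then
        # Parallel build, launched through MPI on one node
        echo "mpirun -np $ntasks mb-mpi $nexus"
    else
        # Serial build
        echo "mb $nexus"
    fi
}

mrbayes_cmd analysis.nex 1   # prints: mb analysis.nex
mrbayes_cmd analysis.nex 4   # prints: mpirun -np 4 mb-mpi analysis.nex
```

In a job script, the second argument could be taken from `$SLURM_NTASKS`. MrBayes distributes chains across MPI processes, so requesting more tasks than the analysis has chains is unlikely to help.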
Multiqc¶
Introduction¶
Multiqc
is a reporting tool that parses summary statistics from results and log files generated by other bioinformatics tools.
Versions¶
1.11
Commands¶
multiqc
Module¶
You can load the modules by:
module load biocontainers
module load multiqc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Multiqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=multiqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers multiqc
multiqc fastqc_out -o multiqc_out
Mummer4¶
Introduction¶
Mummer4
is a versatile alignment tool for DNA and protein sequences.
Versions¶
4.0.0rc1-pl5262
Commands¶
annotate
combineMUMs
delta-filter
delta2vcf
dnadiff
exact-tandems
mummer
mummerplot
nucmer
promer
repeat-match
show-aligns
show-coords
show-diff
show-snps
show-tiling
Module¶
You can load the modules by:
module load biocontainers
module load mummer4
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Mummer4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mummer4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mummer4
mummer -mum -b -c H_pylori26695_Eslice.fasta H_pyloriJ99_Eslice.fasta > mummer.mums
Muscle¶
Introduction¶
Muscle
is a modified progressive alignment algorithm which has comparable accuracy to MAFFT, but faster performance.
Versions¶
3.8.1551
5.1
Commands¶
muscle
Module¶
You can load the modules by:
module load biocontainers
module load muscle
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Muscle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=muscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers muscle
muscle -align seqs2.fasta -output seqs.afa
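Both MUSCLE 3.8.1551 and 5.1 are listed above, and their command-line syntax differs: the `-align`/`-output` form in the job script is MUSCLE 5 syntax, while MUSCLE 3.8 uses `-in`/`-out`. The sketch below prints the matching command for a given version. The function name muscle_cmd is hypothetical, and loading a specific version as `muscle/5.1` is an assumption based on the `name/version` pattern used for other modules in this guide.

```shell
# Hypothetical helper: print the muscle command line for a given
# major version, input FASTA, and output alignment file.
muscle_cmd() {
    version="$1"
    input="$2"
    output="$3"
    case "$version" in
        3.*) echo "muscle -in $input -out $output" ;;
        5.*) echo "muscle -align $input -output $output" ;;
        *)
            echo "unsupported muscle version: $version" >&2
            return 1
            ;;
    esac
}

muscle_cmd 3.8.1551 seqs.fasta seqs.afa   # prints: muscle -in seqs.fasta -out seqs.afa
muscle_cmd 5.1 seqs.fasta seqs.afa        # prints: muscle -align seqs.fasta -output seqs.afa
```

Pinning the version in the job script (for example `ml biocontainers muscle/5.1`, assuming that module name) avoids surprises if the default version changes.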
Mutmap¶
Introduction¶
MutMap is a powerful and efficient method to identify agronomically important loci in crop plants.
Versions¶
2.3.3
Commands¶
mutmap
mutplot
Module¶
You can load the modules by:
module load biocontainers
module load mutmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mutmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mutmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mutmap
Mykrobe¶
Introduction¶
Mykrobe analyses the whole genome of a bacterial sample, all within a couple of minutes, and predicts which drugs the infection is resistant to.
Versions¶
0.11.0
Commands¶
mykrobe
Module¶
You can load the modules by:
module load biocontainers
module load mykrobe
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run mykrobe on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=mykrobe
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers mykrobe
N50¶
Introduction¶
N50 is a command-line tool to calculate assembly metrics.
Versions¶
1.5.6
Commands¶
n50
Module¶
You can load the modules by:
module load biocontainers
module load n50
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run n50 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=n50
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers n50
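For reference, the N50 statistic is the length of the shortest contig in the set of longest contigs that together cover at least half of the total assembly length. The sketch below computes it from a list of contig lengths using standard tools. It illustrates the metric only and is not how the n50 tool is implemented; the function name n50_from_lengths is hypothetical.

```shell
# Hypothetical helper: read contig lengths (one per line) on stdin
# and print the N50 value.
n50_from_lengths() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            running = 0
            for (i = 1; i <= NR; i++) {
                running += len[i]
                # Stop at the first contig that brings the running sum
                # to at least half of the total length.
                if (running * 2 >= total) { print len[i]; exit }
            }
        }'
}

# Nine contigs of length 2..10: total 54, and 10 + 9 + 8 = 27 reaches half.
printf '%s\n' 2 3 4 5 6 7 8 9 10 | n50_from_lengths   # prints: 8
```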
Nanofilt¶
Introduction¶
Nanofilt
is a tool for filtering and trimming of Oxford Nanopore Sequencing data.
Versions¶
2.8.0
Commands¶
NanoFilt
Module¶
You can load the modules by:
module load biocontainers
module load nanofilt
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanofilt on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanofilt
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanofilt
NanoFilt -q 12 --headcrop 75 reads.fastq | gzip > trimmed-reads.fastq.gz
Nanolyse¶
Introduction¶
Nanolyse
is a tool to remove reads mapping to the lambda phage genome from a fastq file.
Versions¶
1.2.0
Commands¶
NanoLyse
Module¶
You can load the modules by:
module load biocontainers
module load nanolyse
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanolyse on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanolyse
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanolyse
gunzip -c reads.fastq.gz | NanoLyse | gzip > reads_without_lambda.fastq.gz
Nanoplot¶
Introduction¶
Nanoplot
is a plotting tool for long read sequencing data and alignments.
Versions¶
1.39.0
Commands¶
NanoPlot
Module¶
You can load the modules by:
module load biocontainers
module load nanoplot
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanoplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=nanoplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanoplot
NanoPlot --summary sequencing_summary.txt --loglength -o summary-plots-log-transformed
NanoPlot -t 2 --fastq reads1.fastq.gz reads2.fastq.gz --maxlength 40000 --plots dot --legacy hex
NanoPlot -t 12 --color yellow --bam alignment1.bam alignment2.bam alignment3.bam --downsample 10000 -o bamplots_downsampled
Nanopolish¶
Introduction¶
Nanopolish
is a software package for signal-level analysis of Oxford Nanopore sequencing data.
Versions¶
0.13.2
0.14.0
Commands¶
nanopolish
Module¶
You can load the modules by:
module load biocontainers
module load nanopolish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nanopolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nanopolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nanopolish
nanopolish index -d fast5_files/ reads.fasta
nanopolish variants --consensus \
-o polished.vcf -w "tig00000001:200000-202000" \
-r reads.fasta -b reads.sorted.bam -g draft.fa
Ncbi-amrfinderplus¶
Introduction¶
Ncbi-amrfinderplus
and the accompanying database identify acquired antimicrobial resistance genes in bacterial protein and/or assembled nucleotide sequences as well as known resistance-associated point mutations for several taxa.
Versions¶
3.10.30
3.10.42
3.11.2
Commands¶
amrfinder
Module¶
You can load the modules by:
module load biocontainers
module load ncbi-amrfinderplus
Note
The AMRFinderPlus database has been set up for users. To check the database version, run amrfinder -V. RCAC updates the database periodically; if you notice that it is out of date, contact us to request an update.
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-amrfinderplus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-amrfinderplus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-amrfinderplus
# Protein AMRFinder with no genomic coordinates
amrfinder -p test_prot.fa
# Translated nucleotide AMRFinder (will not use HMMs)
amrfinder -n test_dna.fa
# Protein AMRFinder using GFF to get genomic coordinates and 'plus' genes
amrfinder -p test_prot.fa -g test_prot.gff --plus
# Protein AMRFinder with Escherichia protein point mutations
amrfinder -p test_prot.fa -O Escherichia
# Full AMRFinderPlus search combining results
amrfinder -p test_prot.fa -g test_prot.gff -n test_dna.fa -O Escherichia --plus
Ncbi-datasets¶
Introduction¶
NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can use it to find and download sequence, annotation, and metadata for genes and genomes using our command-line interface (CLI) tools or NCBI Datasets web interface.
Versions¶
14.3.0
Commands¶
datasets
dataformat
Module¶
You can load the modules by:
module load biocontainers
module load ncbi-datasets
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-datasets on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-datasets
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-datasets
Ncbi-genome-download¶
Introduction¶
Ncbi-genome-download
is a script to download genomes from the NCBI FTP servers.
Versions¶
0.3.1
Commands¶
ncbi-genome-download
Module¶
You can load the modules by:
module load biocontainers
module load ncbi-genome-download
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ncbi-genome-download on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=ncbi-genome-download
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-genome-download
ncbi-genome-download bacteria,viral --parallel 4
ncbi-genome-download --genera "Streptomyces coelicolor,Escherichia coli" bacteria
ncbi-genome-download --species-taxids 562 bacteria
Ncbi-table2asn¶
Introduction¶
table2asn is a command-line program that creates sequence records for submission to GenBank. It uses many of the same functions as Genome Workbench but is driven generally by data files, and the records it produces do not necessarily require additional manual editing before submission to GenBank.
Versions¶
1.26.678
Commands¶
table2asn
Module¶
You can load the modules by:
module load biocontainers
module load ncbi-table2asn
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ncbi-table2asn on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ncbi-table2asn
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ncbi-table2asn
Neusomatic¶
Introduction¶
NeuSomatic is based on deep convolutional neural networks for accurate somatic mutation detection. With properly trained models, it can robustly perform across sequencing platforms, strategies, and conditions. NeuSomatic summarizes and augments sequence alignments in a novel way and incorporates multi-dimensional features to capture variant signals effectively. It is both a universal and an accurate somatic mutation detection method.
Versions¶
0.2.1
Commands¶
call.py
dataloader.py
extract_postprocess_targets.py
filter_candidates.py
generate_dataset.py
long_read_indelrealign.py
merge_post_vcfs.py
merge_tsvs.py
network.py
postprocess.py
preprocess.py
resolve_scores.py
resolve_variants.py
scan_alignments.py
split_bed.py
train.py
utils.py
Module¶
You can load the modules by:
module load biocontainers
module load neusomatic
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run neusomatic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=neusomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers neusomatic
Nextalign¶
Introduction¶
Nextalign
is a viral genome sequence alignment tool for the command line.
Versions¶
1.10.3
Commands¶
nextalign
Module¶
You can load the modules by:
module load biocontainers
module load nextalign
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextalign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextalign
nextalign \
--sequences data/sars-cov-2/sequences.fasta \
--reference data/sars-cov-2/reference.fasta \
--genemap data/sars-cov-2/genemap.gff \
--genes E,M,N,ORF1a,ORF1b,ORF3a,ORF6,ORF7a,ORF7b,ORF8,ORF9b,S \
--output-dir output/ \
--output-basename nextalign
Nextclade¶
Introduction¶
Nextclade
is a tool that identifies differences between your sequences and a reference sequence, uses these differences to assign your sequences to clades, and reports potential sequence quality issues in your data.
Versions¶
1.10.3
Commands¶
nextclade
Module¶
You can load the modules by:
module load biocontainers
module load nextclade
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextclade on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextclade
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextclade
mkdir -p data
nextclade dataset get --name 'sars-cov-2' --output-dir 'data/sars-cov-2'
nextclade \
--in-order \
--input-fasta data/sars-cov-2/sequences.fasta \
--input-dataset data/sars-cov-2 \
--output-tsv output/nextclade.tsv \
--output-tree output/nextclade.auspice.json \
--output-dir output/ \
--output-basename nextclade
Nextdenovo¶
Introduction¶
NextDenovo is a string graph-based de novo assembler for long reads (CLR, HiFi and ONT). It uses a “correct-then-assemble” strategy similar to canu (with no correction step for PacBio HiFi reads), but requires significantly less computing resources and storage. After assembly, the per-base accuracy is about 98-99.8%. To further improve single-base accuracy, try NextPolish.
Versions¶
2.5.2
Commands¶
nextDenovo
Module¶
You can load the modules by:
module load biocontainers
module load nextdenovo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run nextdenovo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextdenovo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextdenovo
Nextflow¶
Introduction¶
Nextflow
is a bioinformatics workflow manager that enables the development of portable and reproducible workflows.
Versions¶
21.10.0
Commands¶
nextflow
Module¶
You can load the modules by:
module load biocontainers
module load nextflow
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Nextflow on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextflow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextflow
Nextpolish¶
Introduction¶
NextPolish is used to fix base errors (SNV/Indel) in genomes generated from noisy long reads. It can be used with short-read data only, long-read data only, or a combination of both. It contains two core modules and uses a stepwise approach to correct erroneous bases in the reference genome. To correct or assemble raw third-generation sequencing (TGS) long reads with approximately 10-15% sequencing errors, please use NextDenovo.
Versions¶
1.4.1
Commands¶
nextPolish
Module¶
You can load the modules by:
module load biocontainers
module load nextpolish
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run nextpolish on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=nextpolish
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers nextpolish
Ngs-bits¶
Introduction¶
Ngs-bits
is a collection of short-read sequencing tools.
Versions¶
2022_04
Commands¶
SampleAncestry
SampleDiff
SampleGender
SampleOverview
SampleSimilarity
SeqPurge
CnvHunter
RohHunter
UpdHunter
CfDnaQC
MappingQC
NGSDImportQC
ReadQC
SomaticQC
VariantQC
TrioMaternalContamination
BamCleanHaloplex
BamClipOverlap
BamDownsample
BamFilter
BamToFastq
BedAdd
BedAnnotateFreq
BedAnnotateFromBed
BedAnnotateGC
BedAnnotateGenes
BedChunk
BedCoverage
BedExtend
BedGeneOverlap
BedHighCoverage
BedInfo
BedIntersect
BedLiftOver
BedLowCoverage
BedMerge
BedReadCount
BedShrink
BedSort
BedSubtract
BedToFasta
BedpeAnnotateBreakpointDensity
BedpeAnnotateCnvOverlap
BedpeAnnotateCounts
BedpeAnnotateFromBed
BedpeFilter
BedpeGeneAnnotation
BedpeSort
BedpeToBed
FastqAddBarcode
FastqConcat
FastqConvert
FastqDownsample
FastqExtract
FastqExtractBarcode
FastqExtractUMI
FastqFormat
FastqList
FastqMidParser
FastqToFasta
FastqTrim
VcfAnnotateFromBed
VcfAnnotateFromBigWig
VcfAnnotateFromVcf
VcfBreakMulti
VcfCalculatePRS
VcfCheck
VcfExtractSamples
VcfFilter
VcfLeftNormalize
VcfSort
VcfStreamSort
VcfToBedpe
VcfToTsv
SvFilterAnnotations
NGSDExportGenes
GenePrioritization
GenesToApproved
GenesToBed
GraphStringDb
PhenotypeSubtree
PhenotypesToGenes
PERsim
FastaInfo
Module¶
You can load the modules by:
module load biocontainers
module load ngs-bits
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ngs-bits on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngs-bits
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngs-bits
SeqPurge -in1 input1_1.fastq input2_1.fastq \
-in2 input1_2.fastq input2_2.fastq \
-out1 R1.fastq.gz -out2 R2.fastq.gz
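When several files are passed to SeqPurge, the forward (`-in1`) and reverse (`-in2`) lists need to name mates in the same order. As a sketch, the reverse-read list can be derived from the forward-read list rather than typed by hand. The helper name mate_of is hypothetical, and the code assumes forward-read file names end in `_1.fastq` with mates ending in `_2.fastq`.

```shell
# Hypothetical helper: derive the reverse-read file name from a
# forward-read name. Assumes the name ends in _1.fastq.
mate_of() {
    echo "${1%_1.fastq}_2.fastq"
}

in1="input1_1.fastq input2_1.fastq"
in2=""
for f in $in1; do
    # Append each mate, separated by a single space
    in2="${in2:+$in2 }$(mate_of "$f")"
done

echo "-in1 $in1 -in2 $in2"
# prints: -in1 input1_1.fastq input2_1.fastq -in2 input1_2.fastq input2_2.fastq
```

The two variables can then be passed unquoted to SeqPurge (`-in1 $in1 -in2 $in2`) so that each file name becomes a separate argument; this relies on file names without spaces.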
Ngsld¶
Introduction¶
ngsLD is a program to estimate pairwise linkage disequilibrium (LD) taking the uncertainty of genotype’s assignation into account. It does so by avoiding genotype calling and using genotype likelihoods or posterior probabilities.
Versions¶
1.1.1
Commands¶
ngsLD
Module¶
You can load the modules by:
module load biocontainers
module load ngsld
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ngsld on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsld
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngsld
Ngsutils¶
Introduction¶
Ngsutils
is a suite of software tools for working with next-generation sequencing datasets.
Versions¶
0.5.9
Commands¶
ngsutils
bamutils
bedutils
fastqutils
gtfutils
Module¶
You can load the modules by:
module load biocontainers
module load ngsutils
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ngsutils on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ngsutils
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ngsutils
bamutils filter \
input.bam \
MQ10filtered.bam \
-mapped \
-noqcfail \
-gte MAPQ 10
bamutils stats \
-gtf genome.gtf MQ10filtered.bam \
> MQ10filtered_bamstats
OrthoFinder¶
Introduction¶
OrthoFinder
is a tool for phylogenetic orthology inference for comparative genomics.
Detailed usage can be found here: https://github.com/davidemms/OrthoFinder
Versions¶
2.5.2
2.5.4
2.5.5
Commands¶
orthofinder
Module¶
You can load the modules by:
module load biocontainers
module load orthofinder/2.5.4
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run orthofinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=orthofinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers orthofinder/2.5.4
orthofinder -t 24 -f InputData -o output
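The thread count passed to `-t` above repeats the value given to `#SBATCH -n`. A small sketch of keeping the two in sync is to read the task count from the `SLURM_NTASKS` environment variable, which Slurm sets inside a job when `-n` is specified. The function name slurm_threads is hypothetical.

```shell
# Hypothetical helper: print the number of tasks Slurm granted to
# this job, or 1 when run outside a job.
slurm_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Example use inside the job script:
#   orthofinder -t "$(slurm_threads)" -f InputData -o output
```

With this approach, changing `#SBATCH -n` is enough to change the number of threads the tool uses.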
Paml¶
Introduction¶
Paml
is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.
Versions¶
4.9
Commands¶
baseml
basemlg
chi2
codeml
evolver
infinitesites
mcmctree
pamp
yn00
Module¶
You can load the modules by:
module load biocontainers
module load paml
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Paml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=paml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers paml
Panacota¶
Introduction¶
Panacota
is a software package providing tools for large-scale bacterial comparative genomics.
Versions¶
1.3.1
Commands¶
PanACoTA
Module¶
You can load the modules by:
module load biocontainers
module load panacota
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Panacota on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panacota
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panacota
PanACoTA annotate \
-d Examples/genomes_init \
-l Examples/input_files/list_genomes.lst \
-r Examples/2-res-QC -Q
Panaroo¶
Introduction¶
Panaroo is an updated pipeline for pangenome investigation.
Versions¶
1.2.10
Commands¶
panaroo
panaroo-extract-gene
panaroo-filter-pa
panaroo-fmg
panaroo-gene-neighbourhood
panaroo-img
panaroo-integrate
panaroo-merge
panaroo-msa
panaroo-plot-abundance
panaroo-qc
panaroo-spydrpick
Module¶
You can load the modules by:
module load biocontainers
module load panaroo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run panaroo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panaroo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panaroo
panaroo -i gff/*.gff -o results --clean-mode strict
Pandaseq¶
Introduction¶
Pandaseq
is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.
Versions¶
2.11
Commands¶
pandaseq
Module¶
You can load the modules by:
module load biocontainers
module load pandaseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pandaseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pandaseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pandaseq
pandaseq -f SRR069027_1.fastq -r SRR069027_2.fastq
Pandora¶
Introduction¶
Pandora is a tool for bacterial genome analysis using a pangenome reference graph (PanRG). It allows gene presence/absence detection and genotyping of SNPs, indels and longer variants in one or a number of samples.
Versions¶
0.9.1
Commands¶
pandora
Module¶
You can load the modules by:
module load biocontainers
module load pandora
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pandora on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pandora
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pandora
pandora index -t 4 GC00006032.fa
Pangolin¶
Introduction¶
Pangolin
is a software package for assigning SARS-CoV-2 genome sequences to global lineages.
Versions¶
3.1.20
4.0.6
4.1.2
4.1.3
4.2
Commands¶
pangolin
Module¶
You can load the modules by:
module load biocontainers
module load pangolin
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pangolin on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pangolin
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pangolin
PanPhlAn¶
Introduction¶
PanPhlAn
(Pangenome-based Phylogenomic Analysis) is a strain-level metagenomic profiling tool for identifying the gene composition and in-vivo transcriptional activity of individual strains in metagenomic samples.
Versions¶
3.1
Commands¶
panphlan_download_pangenome.py
panphlan_map.py
panphlan_profiling.py
Module¶
You can load the modules by:
module load biocontainers
module load panphlan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run PanPhlAn on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=panphlan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers panphlan
Clara Parabricks¶
Introduction¶
NVIDIA’s Clara Parabricks brings next generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google’s DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic and RNA workflows.
Versions¶
4.0.0-1
Commands¶
pbrun
Module¶
You can load the modules by:
module load biocontainers
module load parabricks
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
Because Clara Parabricks requires NVIDIA GPUs, it is deployed only on Scholar, Gilbreth, and ACCESS Anvil.
To run Clara Parabricks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --gpus=1
#SBATCH --job-name=parabricks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parabricks
pbrun haplotypecaller \
--ref FVZG01.1.fsa_nt \
--in-bam output.bam \
--out-variants variants.vcf
Parallel-fastq-dump¶
Introduction¶
Parallel-fastq-dump
is a parallel wrapper for fastq-dump.
Versions¶
0.6.7
Commands¶
parallel-fastq-dump
Module¶
You can load the modules by:
module load biocontainers
module load parallel-fastq-dump
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Parallel-fastq-dump on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=parallel-fastq-dump
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parallel-fastq-dump
parallel-fastq-dump -s SRR11941281/SRR11941281.sra \
--split-files --threads 4 --gzip
Parliament2¶
Introduction¶
Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants are large events that are called as Deletions of a region, Insertions of a sequence into a region, Duplications of a region, Inversions of a region, or Translocations between two regions in the genome.
Versions¶
0.1.11
Commands¶
parliament2.py
Module¶
You can load the modules by:
module load biocontainers
module load parliament2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run parliament2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=parliament2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parliament2
Parsnp¶
Introduction¶
Parsnp
is used to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours.
Versions¶
1.6.2
Commands¶
parsnp
Module¶
You can load the modules by:
module load biocontainers
module load parsnp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Parsnp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=parsnp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers parsnp
parsnp -g examples/mers_virus/ref/England1.gbk \
-d examples/mers_virus/genomes/*.fna -c -p 8
Pasapipeline¶
Introduction¶
PASA, acronym for Program to Assemble Spliced Alignments (and pronounced ‘pass-uh’), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. PASA also identifies and classifies all splicing variations supported by the transcript alignments.
Versions¶
2.5.2-devb
Commands¶
pasa
Launch_PASA_pipeline.pl
GMAP_multifasta_processor.pl
blat_to_btab.pl
blat_to_cdna_clusters.pl
blat_top_hit_extractor.pl
ensure_single_valid_alignment_per_cdna_per_cluster.pl
errors_to_newalign_btabs.pl
extract_FL_transdecoder_entries.pl
get_failed_transcripts.pl
gmap_to_btab.pl
import_GMAP_gff3.pl
pasa_alignment_assembler_textprocessor.pl
pasa_asmbls_to_training_set.extract_reference_orfs.pl
polyCistronAnalyzer.pl
process_BLAT_alignments.pl
process_GMAP_alignments_gff3_chimeras_ok.pl
process_PBLAT_alignments.pl
process_minimap2_alignments.pl
pslx_to_gff3.pl
run_spliced_aligners.pl
sim4_to_btab.pl
Annotation_store_preloader.dbi
Load_Current_Gene_Annotations.dbi
PASA_transcripts_and_assemblies_to_GFF3.dbi
UTR_category_analysis.dbi
__drop_many_mysql_dbs.dbi
alignment_assembly_to_gene_models.dbi
alt_splice_AAT_alignment_generator.dbi
assemble_clusters.dbi
assembly_db_loader.dbi
assign_clusters_by_gene_intergene_overlap.dbi
assign_clusters_by_stringent_alignment_overlap.dbi
build_comprehensive_transcriptome.dbi
build_comprehensive_transcriptome.tabix.dbi
cDNA_annotation_comparer.dbi
cDNA_annotation_updater.dbi
classify_alt_splice_as_UTR_or_protein.dbi
classify_alt_splice_isoforms.dbi
classify_alt_splice_isoforms_per_subcluster.dbi
comprehensive_alt_splice_report.dbi
compute_gene_coverage_by_incorporated_PASA_assemblies.dbi
create_mysql_cdnaassembly_db.dbi
create_sqlite_cdnaassembly_db.dbi
describe_alignment_assemblies.dbi
describe_alignment_assemblies_cgi_convert.dbi
drop_mysql_db_if_exists.dbi
dump_annot_store.dbi
dump_valid_annot_updates.dbi
extract_regions_for_probe_design.dbi
extract_skipped_exons.dbi
extract_transcript_alignment_clusters.dbi
find_FL_equivalent_support.dbi
find_alternate_internal_exons.dbi
get_antisense_transcripts.dbi
import_custom_alignments.dbi
import_spliced_alignments.dbi
invalidate_RNA-Seq_assembly_artifacts.dbi
invalidate_single_exon_ESTs.dbi
mapPolyAsites_to_genes.dbi
pasa_asmbl_genes_to_GFF3.dbi
pasa_asmbls_to_training_set.dbi
polyA_site_summarizer.dbi
polyA_site_transcript_mapper.dbi
populate_alignments_via_btab.dbi
populate_ath1_cdnas.dbi
populate_cdna_clusters.dbi
populate_mysql_assembly_alignment_field.dbi
populate_mysql_assembly_sequence_field.dbi
purge_PASA_database.dbi
purge_annot_comparisons.dbi
reassign_clusters_via_valid_align_coords.dbi
reconstruct_FL_isoforms_from_parts.dbi
report_alt_splicing_findings.dbi
reset_to_prior_to_assembly_build.dbi
retrieve_assembly_sequences.dbi
set_spliced_orient_transcribed_orient.dbi
splicing_events_in_subcluster_context.dbi
splicing_variation_to_splicing_event.dbi
subcluster_builder.dbi
subcluster_loader.dbi
test_assemble_clusters.dbi
test_mysql_connection.dbi
update_alignment_status.dbi
update_clusters_coordinates.dbi
update_fli_status.dbi
update_spliced_orient.dbi
upload_cdna_headers.dbi
upload_transcript_data.dbi
validate_alignments_in_db.dbi
Module¶
You can load the modules by:
module load biocontainers
module load pasapipeline
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pasapipeline on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasapipeline
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pasapipeline
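The job script above only loads the module. As a minimal sketch, lines like the following could be appended to it to launch an alignment-assembly run. The file names genome.fasta, transcripts.fasta, and alignAssembly.config are placeholders (assumptions, not files shipped with the module), the config file is assumed to already name the PASA database, and the blat and gmap aligners are assumed to be available inside the container.

```shell
# Placeholder inputs (assumptions): replace with your own files.
GENOME=genome.fasta
TRANSCRIPTS=transcripts.fasta
CONFIG=alignAssembly.config      # must define the PASA database
CPUS="${SLURM_NTASKS:-1}"        # follow the task count requested with -n

# Only launch when the module has put the launcher on PATH.
if command -v Launch_PASA_pipeline.pl >/dev/null 2>&1; then
    # -C creates the database, -R runs the alignment assembly.
    Launch_PASA_pipeline.pl -c "$CONFIG" -C -R \
        -g "$GENOME" -t "$TRANSCRIPTS" \
        --ALIGNERS blat,gmap --CPU "$CPUS"
else
    echo "Launch_PASA_pipeline.pl not found; load biocontainers and pasapipeline first" >&2
fi
```

Because the CPU count is read from SLURM_NTASKS, raising -n in the #SBATCH header is enough to give the aligners more cores.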
Pasta¶
Introduction¶
PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.
Versions¶
1.8.7
Commands¶
run_pasta.py
run_seqtools.py
sumlabels.py
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load pasta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pasta
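The job script above stops after loading the module. A minimal sketch of an alignment run that could follow it is shown here; sequences.fasta, the job name, and the output directory are hypothetical names chosen for illustration, and the input is assumed to be unaligned DNA.

```shell
# Placeholder input (assumption): unaligned DNA sequences in FASTA format.
INPUT=sequences.fasta
CPUS="${SLURM_NTASKS:-1}"        # follow the task count requested with -n

if command -v run_pasta.py >/dev/null 2>&1; then
    # -d sets the datatype (dna, rna, or protein); -j names the job;
    # -o is the directory that receives the alignment and tree.
    run_pasta.py -i "$INPUT" -d dna \
        --num-cpus "$CPUS" -j pasta_job -o pasta_out
else
    echo "run_pasta.py not found; load biocontainers and pasta first" >&2
fi
```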
Pblat¶
Introduction¶
pblat is a parallelized version of blat with multithreading support.
Versions¶
2.5.1
Commands¶
pblat
Module¶
You can load the modules by:
module load biocontainers
module load pblat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pblat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pblat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pblat
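The job script above does not call pblat itself. As a rough sketch, a threaded alignment could be appended as follows; reference.fa, transcripts.fa, and aligned.psl are placeholder names, not files provided by the module.

```shell
# Placeholder file names (assumptions): replace with your own data.
DATABASE=reference.fa
QUERY=transcripts.fa
OUTPUT=aligned.psl
THREADS="${SLURM_NTASKS:-1}"     # follow the task count requested with -n

if command -v pblat >/dev/null 2>&1; then
    # pblat takes the same positional arguments as blat
    # (database, query, output) plus -threads=N.
    pblat -threads="$THREADS" "$DATABASE" "$QUERY" "$OUTPUT"
else
    echo "pblat not found; load biocontainers and pblat first" >&2
fi
```

With -n 1 in the header the run is single-threaded, so a larger -n value is needed to benefit from the parallel version.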
Pbmm2¶
Introduction¶
Pbmm2
is a minimap2 frontend for PacBio native data formats.
Versions¶
1.7.0
Commands¶
pbmm2
Module¶
You can load the modules by:
module load biocontainers
module load pbmm2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pbmm2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pbmm2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pbmm2
pbmm2 --version
pbmm2 align hg38.fa \
alz.polished.hq.bam alz.aligned.bam \
-j 12 --preset ISOSEQ --sort \
--log-level INFO
Pbptyper¶
Introduction¶
pbptyper is a tool to identify the Penicillin Binding Protein (PBP) type of Streptococcus pneumoniae assemblies.
Versions¶
1.0.4
Commands¶
pbptyper
Module¶
You can load the modules by:
module load biocontainers
module load pbptyper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pbptyper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pbptyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pbptyper
pbptyper --assembly test/SRR2912551.fna.gz --outdir output
PCAngsd¶
Introduction¶
PCAngsd
is a program that estimates the covariance matrix and individual allele frequencies for low-depth next-generation sequencing (NGS) data in structured/heterogeneous populations. It applies principal component analysis (PCA) to genotype likelihoods to perform multiple population genetic analyses.
Versions¶
1.10
Commands¶
pcangsd
Module¶
You can load the modules by:
module load biocontainers
module load pcangsd
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run PCAngsd on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pcangsd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pcangsd
pcangsd -b pupfish.beagle.gz --inbreedSites \
--selection -o pup_pca2 --threads 12
Peakranger¶
Introduction¶
Peakranger
is a multi-purpose software suite for analyzing next-generation sequencing (NGS) data.
Versions¶
1.18
Commands¶
peakranger
Module¶
You can load the modules by:
module load biocontainers
module load peakranger
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Peakranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=peakranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers peakranger
peakranger ccat --format bam 27-1_sorted_MDRD_MQ30filtered.bam 27-4_sorted_MDRD_MQ30filtered.bam \
ccat_result_with_HTML_report_5kb_region --report \
--gene_annot_file refGene.txt --plot_region 10000
Pepper_deepvariant¶
Introduction¶
PEPPER is a genome inference module based on recurrent neural networks that enables long-read variant calling and nanopore assembly polishing in the PEPPER-Margin-DeepVariant pipeline. This pipeline enables nanopore-based variant calling with DeepVariant.
Versions¶
r0.4.1
Commands¶
run_pepper_margin_deepvariant
Module¶
You can load the modules by:
module load biocontainers
module load pepper_deepvariant
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pepper_deepvariant on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 32
#SBATCH --job-name=pepper_deepvariant
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pepper_deepvariant
BASE=$PWD
# Set up input data
INPUT_DIR="${BASE}/input/data"
REF="GRCh38_no_alt.chr20.fa"
BAM="HG002_ONT_2_GRCh38.chr20.quickstart.bam"
# Set the number of CPUs to use
THREADS=32
# Set up output directory
OUTPUT_DIR="${BASE}/output"
OUTPUT_PREFIX="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20"
OUTPUT_VCF="HG002_ONT_2_GRCh38_PEPPER_Margin_DeepVariant.chr20.vcf.gz"
TRUTH_VCF="HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz"
TRUTH_BED="HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed"
# Create local directory structure
mkdir -p "${OUTPUT_DIR}"
mkdir -p "${INPUT_DIR}"
# Download the data to input directory
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_ONT_2_GRCh38.chr20.quickstart.bam.bai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/GRCh38_no_alt.chr20.fa.fai
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark.quickstart.vcf.gz
wget -P ${INPUT_DIR} https://storage.googleapis.com/pepper-deepvariant-public/quickstart_data/HG002_GRCh38_1_22_v4.2.1_benchmark_noinconsistent.quickstart.bed
run_pepper_margin_deepvariant call_variant \
-b "${INPUT_DIR}/${BAM}" \
-f "${INPUT_DIR}/${REF}" -o "${OUTPUT_DIR}" \
-p "${OUTPUT_PREFIX}" \
-t "${THREADS}" -r chr20:1000000-1020000 \
--ont_r9_guppy5_sup --ont
BioPerl¶
Introduction¶
BioPerl
is a collection of Perl modules that facilitate the development of Perl scripts for bioinformatics applications. It provides software modules for many of the typical tasks of bioinformatics programming.
Versions¶
1.7.2-pl526
Commands¶
SOAPsh.pl
ace.pl
bam2bedgraph
bamToGBrowse.pl
bdf2gdfont.pl
bdftogd
binhex.pl
bp_aacomp.pl
bp_biofetch_genbank_proxy.pl
bp_bioflat_index.pl
bp_biogetseq.pl
bp_blast2tree.pl
bp_bulk_load_gff.pl
bp_chaos_plot.pl
bp_classify_hits_kingdom.pl
bp_composite_LD.pl
bp_das_server.pl
bp_dbsplit.pl
bp_download_query_genbank.pl
bp_extract_feature_seq.pl
bp_fast_load_gff.pl
bp_fastam9_to_table.pl
bp_fetch.pl
bp_filter_search.pl
bp_find-blast-matches.pl
bp_flanks.pl
bp_gccalc.pl
bp_genbank2gff.pl
bp_genbank2gff3.pl
bp_generate_histogram.pl
bp_heterogeneity_test.pl
bp_hivq.pl
bp_hmmer_to_table.pl
bp_index.pl
bp_load_gff.pl
bp_local_taxonomydb_query.pl
bp_make_mrna_protein.pl
bp_mask_by_search.pl
bp_meta_gff.pl
bp_mrtrans.pl
bp_mutate.pl
bp_netinstall.pl
bp_nexus2nh.pl
bp_nrdb.pl
bp_oligo_count.pl
bp_pairwise_kaks
bp_parse_hmmsearch.pl
bp_process_gadfly.pl
bp_process_sgd.pl
bp_process_wormbase.pl
bp_query_entrez_taxa.pl
bp_remote_blast.pl
bp_revtrans-motif.pl
bp_search2alnblocks.pl
bp_search2gff.pl
bp_search2table.pl
bp_search2tribe.pl
bp_seq_length.pl
bp_seqconvert.pl
bp_seqcut.pl
bp_seqfeature_delete.pl
bp_seqfeature_gff3.pl
bp_seqfeature_load.pl
bp_seqpart.pl
bp_seqret.pl
bp_seqretsplit.pl
bp_split_seq.pl
bp_sreformat.pl
bp_taxid4species.pl
bp_taxonomy2tree.pl
bp_translate_seq.pl
bp_tree2pag.pl
bp_unflatten_seq.pl
ccconfig
chartex
chi2
chrom_sizes.pl
circo
clustalw
clustalw2
corelist
cpan
cpanm
dbilogstrip
dbiprof
dbiproxy
debinhex.pl
enc2xs
encguess
genomeCoverageBed.pl
h2ph
h2xs
htmltree
instmodsh
json_pp
json_xs
lwp-download
lwp-dump
lwp-mirror
lwp-request
perl
perl5.26.2
perlbug
perldoc
perlivp
perlthanks
piconv
pl2pm
pod2html
pod2man
pod2text
pod2usage
podchecker
podselect
prove
ptar
ptardiff
ptargrep
shasum
splain
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
stubmaker.pl
t_coffee
tpage
ttree
unflatten
webtidy
xml_grep
xml_merge
xml_pp
xml_spellcheck
xml_split
xpath
xsubpp
zipdetails
Module¶
You can load the modules by:
module load biocontainers
module load perl-bioperl
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run BioPerl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=perl-bioperl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers perl-bioperl
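The job script above loads the module without running any of the listed scripts. As one small illustration, the sketch below converts a GenBank file to FASTA with bp_seqconvert.pl, which reads from standard input and writes to standard output. The file names are placeholders.

```shell
# Placeholder file names (assumptions): replace with your own data.
INPUT=sequence.gb
OUTPUT=sequence.fasta

if command -v bp_seqconvert.pl >/dev/null 2>&1; then
    # --from and --to take Bio::SeqIO format names.
    bp_seqconvert.pl --from genbank --to fasta < "$INPUT" > "$OUTPUT"
else
    echo "bp_seqconvert.pl not found; load biocontainers and perl-bioperl first" >&2
fi
```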
Phast¶
Introduction¶
PHAST is a freely available software package for comparative and evolutionary genomics.
For more information, please check:
BioContainers: https://biocontainers.pro/tools/phast
Home page: http://compgen.cshl.edu/phast/
Versions¶
1.5
Commands¶
all_dists
base_evolve
chooseLines
clean_genes
consEntropy
convert_coords
display_rate_matrix
dless
dlessP
draw_tree
eval_predictions
exoniphy
hmm_train
hmm_tweak
hmm_view
indelFit
indelHistory
maf_parse
makeHKY
modFreqs
msa_diff
msa_split
msa_view
pbsDecode
pbsEncode
pbsScoreMatrix
pbsTrain
phast
phastBias
phastCons
phastMotif
phastOdds
phyloBoot
phyloFit
phyloP
prequel
refeature
stringiphy
treeGen
tree_doctor
Module¶
You can load the modules by:
module load biocontainers
module load phast
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phast
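The job script above loads the module only. A minimal sketch of a typical two-step use follows: fit a neutral model with phyloFit, then score conservation with phastCons. The alignment file, the species names in the tree string, and the output names are all hypothetical; the names in the tree must match the sequence names in your alignment.

```shell
# Placeholder inputs (assumptions): a FASTA multiple alignment and a
# tree whose leaf names match the sequence names in that alignment.
ALIGNMENT=alignment.fa
TREE="((human,chimp),mouse)"
MODEL_ROOT=neutral               # phyloFit writes neutral.mod

if command -v phyloFit >/dev/null 2>&1 && command -v phastCons >/dev/null 2>&1; then
    # Step 1: estimate a phylogenetic model under the REV substitution model.
    phyloFit --tree "$TREE" --subst-mod REV --msa-format FASTA \
        --out-root "$MODEL_ROOT" "$ALIGNMENT"
    # Step 2: compute per-base conservation scores and conserved elements.
    phastCons --msa-format FASTA --most-conserved conserved.bed \
        "$ALIGNMENT" "${MODEL_ROOT}.mod" > scores.wig
else
    echo "phyloFit/phastCons not found; load biocontainers and phast first" >&2
fi
```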
Phd2fasta¶
Introduction¶
Phd2fasta
is a tool to convert Phred ‘phd’ format files to ‘fasta’ format.
Versions¶
0.990622
Commands¶
phd2fasta
Module¶
You can load the modules by:
module load biocontainers
module load phd2fasta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Phd2fasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phd2fasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phd2fasta
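The job script above does not include a conversion command. A short sketch that could follow it is given here; phd_dir and the output file names are placeholders, and phd_dir is assumed to hold the .phd.1 files written by phred.

```shell
# Placeholder names (assumptions): replace with your own paths.
PHD_DIR=phd_dir                  # directory of phred .phd.1 files
SEQ_OUT=reads.fasta
QUAL_OUT=reads.fasta.qual

if command -v phd2fasta >/dev/null 2>&1; then
    # -id: input directory, -os: sequence output, -oq: quality output.
    phd2fasta -id "$PHD_DIR" -os "$SEQ_OUT" -oq "$QUAL_OUT"
else
    echo "phd2fasta not found; load biocontainers and phd2fasta first" >&2
fi
```

Naming the quality file after the sequence file with a .qual suffix lets phrap pick it up automatically.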
Phg¶
Introduction¶
Practical Haplotype Graph (PHG) is a general, graph-based, computational framework that can be used with a variety of skim sequencing methods to infer high-density genotypes directly from low-coverage sequence.
Versions¶
1.0
Commands¶
CreateConsensi.sh
CreateHaplotypes.sh
CreateReferenceIntervals.sh
CreateSmallDataSet.sh
CreateValidIntervalsFile.sh
IndexPangenome.sh
LoadAssemblyAnchors.sh
LoadGenomeIntervals.sh
ParallelAssemblyAnchorsLoad.sh
RunLiquibaseUpdates.sh
CreateHaplotypesFromBAM.groovy
CreateHaplotypesFromFastq.groovy
CreateHaplotypesFromGVCF.groovy
Module¶
You can load the modules by:
module load biocontainers
module load phg
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phg
Phipack¶
Introduction¶
PhiPack: PHI test and other tests of recombination
Versions¶
1.1
Commands¶
Phi
Profile
Module¶
You can load the modules by:
module load biocontainers
module load phipack
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phipack on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phipack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phipack
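The job script above loads the module without running a test. As an illustration, the sketch below runs the PHI test with a permutation test on a FASTA alignment; alignment.fasta is a placeholder name.

```shell
# Placeholder input (assumption): an aligned FASTA file.
ALIGNMENT=alignment.fasta
PERMUTATIONS=1000
WINDOW=100

if command -v Phi >/dev/null 2>&1; then
    # -f: FASTA input, -p: number of permutations,
    # -w: window size, -o: also report the NSS and Max Chi^2 statistics.
    Phi -f "$ALIGNMENT" -p "$PERMUTATIONS" -w "$WINDOW" -o
else
    echo "Phi not found; load biocontainers and phipack first" >&2
fi
```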
phrap¶
Introduction¶
phrap
is a program for assembling shotgun DNA sequence data.
Versions¶
1.090518
Commands¶
phrap
Module¶
You can load the modules by:
module load biocontainers
module load phrap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phrap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phrap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phrap
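The job script above loads the module but does not start an assembly. A minimal sketch follows; reads.fasta is a placeholder, and phrap is expected to look for a matching reads.fasta.qual quality file alongside it.

```shell
# Placeholder input (assumption): reads in FASTA format, optionally
# with a quality file named reads.fasta.qual in the same directory.
READS=reads.fasta

if command -v phrap >/dev/null 2>&1; then
    # -new_ace also writes an .ace file for assembly viewers;
    # contigs are written to reads.fasta.contigs.
    phrap "$READS" -new_ace > "${READS}.phrap.out"
else
    echo "phrap not found; load biocontainers and phrap first" >&2
fi
```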
phred¶
Introduction¶
phred
software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.
Versions¶
0.071220.c
Commands¶
phred
Module¶
You can load the modules by:
module load biocontainers
module load phred
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phred on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phred
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phred
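The job script above does not call phred. The sketch below shows one plausible invocation that base-calls a directory of trace files; chromat_dir and phd_dir are placeholder names. phred reads its parameter file from the PHRED_PARAMETER_FILE environment variable, and whether the container sets that variable is an assumption worth checking before relying on this.

```shell
# Placeholder directories (assumptions): replace with your own paths.
TRACE_DIR=chromat_dir            # input chromatogram (trace) files
PHD_DIR=phd_dir                  # output .phd.1 files

if command -v phred >/dev/null 2>&1; then
    mkdir -p "$PHD_DIR"
    # -id: input directory of traces, -pd: output directory for phd files.
    phred -id "$TRACE_DIR" -pd "$PHD_DIR"
else
    echo "phred not found; load biocontainers and phred first" >&2
fi
```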
Phylofisher¶
Introduction¶
PhyloFisher is a software package written in Python3 that can be used for the creation, analysis, and visualization of phylogenomic datasets that consist of eukaryotic protein sequences.
Versions¶
1.2.7
1.2.9
Commands¶
aa_comp_calculator.py
aa_recoder.py
apply_to_db.py
astral_runner.py
backup_restoration.py
bipartition_examiner.py
build_database.py
config.py
edirect.py
explore_database.py
fast_site_remover.py
fast_taxa_remover.py
fisher.py
forest.py
genetic_code_examiner.py
gfmix_runner.py
heterotachy.py
informant.py
install_deps.py
jp.py
mammal_modeler.py
matrix_constructor.py
prep_final_dataset.py
purge.py
random_resampler.py
rst2html.py
rst2html4.py
rst2html5.py
rst2latex.py
rst2man.py
rst2odt.py
rst2odt_prepstyles.py
rst2pseudoxml.py
rst2s5.py
rst2xetex.py
rst2xml.py
rstpep2html.py
rtc_binner.py
runxlrd.py
select_orthologs.py
select_taxa.py
sgt_constructor.py
taxon_collapser.py
vba_extract.py
windowmasker_2.2.22_adapter.py
working_dataset_constructor.py
Module¶
You can load the modules by:
module load biocontainers
module load phylofisher
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phylofisher on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylofisher
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phylofisher
Phylosuite¶
Introduction¶
PhyloSuite is an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies.
Versions¶
1.2.3
Commands¶
PhyloSuite.sh
Module¶
You can load the modules by:
module load biocontainers
module load phylosuite
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run phylosuite on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=phylosuite
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers phylosuite
Picard Tools¶
Introduction¶
Picard
is a set of command line tools for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.
Detailed usage can be found here: https://broadinstitute.github.io/picard/
Versions¶
2.25.1
2.26.10
Commands¶
picard
Module¶
You can load the modules by:
module load biocontainers
module load picard/2.26.10
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run picard on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=picard
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers picard/2.26.10
picard MarkDuplicates -Xmx64g I=19P0126636WES_sorted.bam O=19P0126636WES_sorted_md.bam M=19P0126636WES.sorted.markdup.txt REMOVE_DUPLICATES=true
picard BuildBamIndex -Xmx64g I=19P0126636WES_sorted_md.bam
picard CreateSequenceDictionary -R hg38.fa -O hg38.dict
Picrust2¶
Introduction¶
Picrust2
is software for predicting functional abundances based only on marker gene sequences.
Versions¶
2.4.2
2.5.0
Commands¶
add_descriptions.py
convert_table.py
hsp.py
metagenome_pipeline.py
pathway_pipeline.py
picrust2_pipeline.py
place_seqs.py
print_picrust2_config.py
run_abundance.py
run_sepp.py
run_tipp.py
run_tipp_tool.py
run_upp.py
shuffle_predictions.py
split_sequences.py
sumlabels.py
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load picrust2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Picrust2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 10
#SBATCH --job-name=picrust2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers picrust2
place_seqs.py -s ../seqs.fna -o out.tre -p 10 \
--intermediate intermediate/place_seqs
hsp.py -i 16S -t out.tre -o marker_predicted_and_nsti.tsv.gz -p 10 -n
hsp.py -i EC -t out.tre -o EC_predicted.tsv.gz -p 10
metagenome_pipeline.py -i ../table.biom -m marker_predicted_and_nsti.tsv.gz -f EC_predicted.tsv.gz -o EC_metagenome_out --strat_out
convert_table.py EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
-c contrib_to_legacy \
-o EC_metagenome_out/pred_metagenome_contrib.legacy.tsv.gz
pathway_pipeline.py -i EC_metagenome_out/pred_metagenome_contrib.tsv.gz \
-o pathways_out -p 10
add_descriptions.py -i EC_metagenome_out/pred_metagenome_unstrat.tsv.gz -m EC \
-o EC_metagenome_out/pred_metagenome_unstrat_descrip.tsv.gz
add_descriptions.py -i pathways_out/path_abun_unstrat.tsv.gz -m METACYC \
-o pathways_out/path_abun_unstrat_descrip.tsv.gz
picrust2_pipeline.py -s chemerin_16S/seqs.fna -i chemerin_16S/table.biom \
-o picrust2_out_pipeline -p 10
Pilon¶
Introduction¶
Pilon
is an automated genome assembly improvement and variant detection tool.
Versions¶
1.24
Commands¶
pilon.jar
Module¶
You can load the modules by:
module load biocontainers
module load pilon
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pilon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=pilon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pilon
pilon.jar --nostrays \
--genome scaffolds.fasta \
--frags out_sorted.bam \
--vcf --verbose --threads 12 \
--output pilon_corrected \
--outdir pilon_outdir
Pindel¶
Introduction¶
Pindel
is used to detect breakpoints of large deletions, medium-sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data.
Versions¶
0.2.5b9
Commands¶
pindel
pindel2vcf
Module¶
You can load the modules by:
module load biocontainers
module load pindel
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pindel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pindel
pindel -i simulated_config.txt -f simulated_reference.fa -o bamtest -c ALL
pindel -p COLO-829_20-p_ok.txt -f hs_ref_chr20.fa -o colontumor -c 20
pindel2vcf -r hs_ref_chr20.fa -R HUMAN_G1K_V2 -d 20100101 -p colontumor_D -e 5
Pirate¶
Introduction¶
Pirate
is a pangenome analysis and threshold evaluation toolbox.
Versions¶
1.0.4
Commands¶
PIRATE
FET.pl
PIRATE_to_Rtab.pl
PIRATE_to_roary.pl
SOAPsh.pl
ace.pl
analyse_blast_outputs.pl
analyse_loci_list.pl
annotate_treeWAS_output.pl
bamToGBrowse.pl
bdf2gdfont.pl
binhex.pl
bp_aacomp.pl
bp_biofetch_genbank_proxy.pl
bp_bioflat_index.pl
bp_biogetseq.pl
bp_blast2tree.pl
bp_bulk_load_gff.pl
bp_chaos_plot.pl
bp_classify_hits_kingdom.pl
bp_composite_LD.pl
bp_das_server.pl
bp_dbsplit.pl
bp_download_query_genbank.pl
bp_extract_feature_seq.pl
bp_fast_load_gff.pl
bp_fastam9_to_table.pl
bp_fetch.pl
bp_filter_search.pl
bp_find-blast-matches.pl
bp_flanks.pl
bp_gccalc.pl
bp_genbank2gff.pl
bp_genbank2gff3.pl
bp_generate_histogram.pl
bp_heterogeneity_test.pl
bp_hivq.pl
bp_hmmer_to_table.pl
bp_index.pl
bp_load_gff.pl
bp_local_taxonomydb_query.pl
bp_make_mrna_protein.pl
bp_mask_by_search.pl
bp_meta_gff.pl
bp_mrtrans.pl
bp_mutate.pl
bp_netinstall.pl
bp_nexus2nh.pl
bp_nrdb.pl
bp_oligo_count.pl
bp_parse_hmmsearch.pl
bp_process_gadfly.pl
bp_process_sgd.pl
bp_process_wormbase.pl
bp_query_entrez_taxa.pl
bp_remote_blast.pl
bp_revtrans-motif.pl
bp_search2alnblocks.pl
bp_search2gff.pl
bp_search2table.pl
bp_search2tribe.pl
bp_seq_length.pl
bp_seqconvert.pl
bp_seqcut.pl
bp_seqfeature_delete.pl
bp_seqfeature_gff3.pl
bp_seqfeature_load.pl
bp_seqpart.pl
bp_seqret.pl
bp_seqretsplit.pl
bp_split_seq.pl
bp_sreformat.pl
bp_taxid4species.pl
bp_taxonomy2tree.pl
bp_translate_seq.pl
bp_tree2pag.pl
bp_unflatten_seq.pl
cd-hit-2d-para.pl
cd-hit-clstr_2_blm8.pl
cd-hit-div.pl
cd-hit-para.pl
chrom_sizes.pl
clstr2tree.pl
clstr2txt.pl
clstr2xml.pl
clstr_cut.pl
clstr_list.pl
clstr_list_sort.pl
clstr_merge.pl
clstr_merge_noorder.pl
clstr_quality_eval.pl
clstr_quality_eval_by_link.pl
clstr_reduce.pl
clstr_renumber.pl
clstr_rep.pl
clstr_reps_faa_rev.pl
clstr_rev.pl
clstr_select.pl
clstr_select_rep.pl
clstr_size_histogram.pl
clstr_size_stat.pl
clstr_sort_by.pl
clstr_sort_prot_by.pl
clstr_sql_tbl.pl
clstr_sql_tbl_sort.pl
convert_to_distmat.pl
convert_to_treeWAS.pl
debinhex.pl
genomeCoverageBed.pl
legacy_blast.pl
make_multi_seq.pl
pangenome_variants_to_treeWAS.pl
paralogs_to_Rtab.pl
plot_2d.pl
plot_len1.pl
stag-autoschema.pl
stag-db.pl
stag-diff.pl
stag-drawtree.pl
stag-filter.pl
stag-findsubtree.pl
stag-flatten.pl
stag-grep.pl
stag-handle.pl
stag-itext2simple.pl
stag-itext2sxpr.pl
stag-itext2xml.pl
stag-join.pl
stag-merge.pl
stag-mogrify.pl
stag-parse.pl
stag-query.pl
stag-splitter.pl
stag-view.pl
stag-xml2itext.pl
stubmaker.pl
subsample_outputs.pl
subset_alignments.pl
unique_sequences.pl
update_blastdb.pl
Module¶
You can load the modules by:
module load biocontainers
module load pirate
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pirate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pirate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pirate
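The job script above loads the module without starting an analysis. As a minimal sketch, a pangenome run over a directory of annotated genomes could look like this; gff_dir and pirate_out are placeholder names, and the input directory is assumed to contain GFF3 files with embedded sequences (for example, Prokka output).

```shell
# Placeholder directories (assumptions): replace with your own paths.
GFF_DIR=gff_dir                  # one GFF3 file per genome
OUT_DIR=pirate_out
THREADS="${SLURM_NTASKS:-1}"     # follow the task count requested with -n

if command -v PIRATE >/dev/null 2>&1; then
    # -i: input GFF directory, -o: output directory,
    # -t: threads, -a: also build core/pangenome alignments.
    PIRATE -i "$GFF_DIR" -o "$OUT_DIR" -t "$THREADS" -a
else
    echo "PIRATE not found; load biocontainers and pirate first" >&2
fi
```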
Piscem¶
Introduction¶
piscem is a Rust wrapper for a next-generation index + mapper tool (still currently written in C++17).
Versions¶
0.4.3
Commands¶
piscem
Module¶
You can load the modules by:
module load biocontainers
module load piscem
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run piscem on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=piscem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers piscem
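The job script above does not include a piscem command. The sketch below builds an index from a reference FASTA file; the file names are placeholders, and the k-mer and minimizer lengths (31 and 19) are illustrative values rather than recommendations. The option names are assumed to match the deployed version, so confirm them with piscem build --help.

```shell
# Placeholder names (assumptions): replace with your own data.
REFERENCE=transcriptome.fa
INDEX_PREFIX=piscem_index
THREADS="${SLURM_NTASKS:-1}"     # follow the task count requested with -n

if command -v piscem >/dev/null 2>&1; then
    # --klen: k-mer length, --mlen: minimizer length (must be < klen).
    piscem build --ref-seqs "$REFERENCE" \
        --klen 31 --mlen 19 \
        --threads "$THREADS" --output "$INDEX_PREFIX"
else
    echo "piscem not found; load biocontainers and piscem first" >&2
fi
```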
Pixy¶
Introduction¶
pixy is a command-line tool for painlessly estimating average nucleotide diversity within (π) and between (dxy) populations from a VCF.
Versions¶
1.2.7
Commands¶
pixy
Module¶
You can load the modules by:
module load biocontainers
module load pixy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pixy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pixy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pixy
Plasmidfinder¶
Introduction¶
PlasmidFinder identifies plasmids in totally or partially sequenced isolates of bacteria.
Versions¶
2.1.6
Commands¶
plasmidfinder.py
Module¶
You can load the modules by:
module load biocontainers
module load plasmidfinder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run plasmidfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plasmidfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plasmidfinder
plasmidfinder.py -p test/database \
-i test/test.fsa -o output -mp blastn -x -q
Platon¶
Introduction¶
Platon: identification and characterization of bacterial plasmid contigs from short-read draft assemblies.
Versions¶
1.6
Commands¶
platon
Module¶
You can load the modules by:
module load biocontainers
module load platon
Note
The environment variable PLATON_DB
is set as /depot/itap/datasets/platon/db
. This directory contains the required database.
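Because platon relies on the database that PLATON_DB points to, it can help to confirm the variable before starting a long job. The snippet below is a sketch under that assumption: check_platon_db is a hypothetical helper, and it only tests that the variable is set and names an existing directory, not that the database content is valid.

```shell
# Hypothetical pre-flight check: verify that PLATON_DB is set and is a directory.
check_platon_db() {
    if [ -z "${PLATON_DB:-}" ]; then
        echo "PLATON_DB is not set; load the platon module first" >&2
        return 1
    fi
    if [ ! -d "$PLATON_DB" ]; then
        echo "PLATON_DB=$PLATON_DB is not a directory" >&2
        return 1
    fi
    echo "Platon database: $PLATON_DB"
}
```

Calling check_platon_db || exit 1 right after the module is loaded would stop the job early if the database is not visible on the node.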
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run platon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=platon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers platon
platon --verbose --threads 4 contigs.fasta
Platypus¶
Introduction¶
Platypus
is a tool designed for efficient and accurate variant detection in high-throughput sequencing data.
Versions¶
0.8.1
Commands¶
platypus
Module¶
You can load the modules by:
module load biocontainers
module load platypus
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Platypus on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=platypus
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers platypus
Plink¶
Introduction¶
Plink
is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
Versions¶
1.90b6.21
Commands¶
plink
prettify
Module¶
You can load the modules by:
module load biocontainers
module load plink
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Plink on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plink
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plink
plink --file toy --freq --out toy_analysis
Plink2¶
Introduction¶
Plink2
is a whole genome association analysis toolset.
Versions¶
2.00a2.3
Commands¶
plink2
Module¶
You can load the modules by:
module load biocontainers
module load plink2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Plink2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plink2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plink2
plink2 --bfile HapMap_3_r3_1 --freq --out HapMap_3_r3_1_out
Plotsr¶
Introduction¶
Plotsr generates high-quality visualisation of synteny and structural rearrangements between multiple genomes. For this, it uses the genomic structural annotations between multiple chromosome-level assemblies.
Versions¶
0.5.4
Commands¶
plotsr
Module¶
You can load the modules by:
module load biocontainers
module load plotsr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run plotsr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=plotsr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers plotsr
plotsr syri.out refgenome qrygenome -H 8 -W 5
Pomoxis¶
Introduction¶
Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing. Notably, tools are included for generating and analysing draft assemblies. Many of these tools are used by the research data analysis group at Oxford Nanopore Technologies.
Versions¶
0.3.9
Commands¶
assess_assembly
catalogue_errors
common_errors_from_bam
coverage_from_bam
coverage_from_fastx
fast_convert
find_indels
intersect_assembly_errors
long_fastx
mini_align
mini_assemble
pomoxis_path
qscores_from_summary
ref_seqs_from_bam
reverse_bed
split_fastx
stats_from_bam
subsample_bam
summary_from_stats
tag_bam
trim_alignments
Module¶
You can load the modules by:
module load biocontainers
module load pomoxis
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pomoxis on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=pomoxis
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pomoxis
assess_assembly \
-i helen_output/Staph_Aur_draft_helen.fa \
-r truth_assembly_staph_aur.fasta \
-p polished_assembly_quality \
-l 50 \
-t 4 \
-e \
-T
Poppunk¶
Introduction¶
PopPUNK is a tool for clustering genomes. We refer to the clusters as variable-length-k-mer clusters, or VLKCs. Biologically, these clusters typically represent distinct strains. We refer to subclusters of strains as lineages.
Versions¶
2.5.0
2.6.0
Commands¶
poppunk
poppunk_add_weights.py
poppunk_assign
poppunk_batch_mst.py
poppunk_calculate_rand_indices.py
poppunk_calculate_silhouette.py
poppunk_easy_run.py
poppunk_extract_components.py
poppunk_extract_distances.py
poppunk_info
poppunk_iterate.py
poppunk_mandrake
poppunk_mst
poppunk_references
poppunk_visualise
Module¶
You can load the modules by:
module load biocontainers
module load poppunk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run poppunk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=poppunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers poppunk
Popscle¶
Introduction¶
Popscle
is a suite of population scale analysis tools for single-cell genomics data.
Versions¶
0.1b
Commands¶
popscle
Module¶
You can load the modules by:
module load biocontainers
module load popscle
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Popscle on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=popscle
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers popscle
popscle dsc-pileup --sam data/$bam --vcf data/$ref_vcf --out data/$pileup
Pplacer¶
Introduction¶
Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. The companion tool guppy does all of the downstream analysis of placements, and rppr does useful things having to do with reference packages. For more information, please check the BioContainers page (https://biocontainers.pro/tools/pplacer) and the home page (https://matsen.fhcrc.org/pplacer/).
Versions¶
1.1.alpha19
Commands¶
pplacer
guppy
rppr
Module¶
You can load the modules by:
module load biocontainers
module load pplacer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pplacer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pplacer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pplacer
Prinseq¶
Introduction¶
Prinseq
is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data.
Versions¶
0.20.4
Commands¶
prinseq-graphs-noPCA.pl
prinseq-graphs.pl
prinseq-lite.pl
Module¶
You can load the modules by:
module load biocontainers
module load prinseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Prinseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prinseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prinseq
prinseq-lite.pl -verbose -fastq SRR5043021_1.fastq -fastq2 SRR5043021_2.fastq -graph_data test.gd -out_good null -out_bad null
prinseq-graphs.pl -i test.gd -png_all -o test
prinseq-graphs-noPCA.pl -i test.gd -png_all -o test_noPCA
Prodigal¶
Introduction¶
Prodigal
is a tool for fast, reliable protein-coding gene prediction for prokaryotic genomes.
Versions¶
2.6.3
Commands¶
prodigal
Module¶
You can load the modules by:
module load biocontainers
module load prodigal
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Prodigal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=prodigal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prodigal
prodigal -i genome.fasta -o output.genes -a proteins.faa
Prokka¶
Introduction¶
Prokka
is a pipeline for rapidly annotating prokaryotic genomes. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and ultimately for submission to GenBank/DDBJ/ENA.
Detailed usage can be found here: https://github.com/tseemann/prokka
Versions¶
1.14.6
Commands¶
prokka
prokka-abricate_to_fasta_db
prokka-biocyc_to_fasta_db
prokka-build_kingdom_dbs
prokka-cdd_to_hmm
prokka-clusters_to_hmm
prokka-genbank_to_fasta_db
prokka-genpept_to_fasta_db
prokka-hamap_to_hmm
prokka-tigrfams_to_hmm
prokka-uniprot_to_fasta_db
Module¶
You can load the modules by:
module load biocontainers
module load prokka
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run prokka on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=prokka
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prokka
prokka --compliant --centre UoN --outdir PRJEB12345 --locustag EHEC --prefix EHEC-Chr1 contigs.fa --cpus 24
prokka-genbank_to_fasta_db Coccus1.gbk Coccus2.gbk Coccus3.gbk Coccus4.gbk > Coccus.faa
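The example above requests 24 tasks with #SBATCH -n 24 and repeats the same number in --cpus 24. One way to keep the two in sync is to read the value Slurm exports inside the job. This is a sketch, assuming the job was submitted with -n so that SLURM_NTASKS is set; job_cpus is a hypothetical helper name, and it falls back to 1 when the variable is absent (for example, when testing outside a job).

```shell
# Hypothetical helper: print the number of tasks Slurm granted to this job,
# or 1 if SLURM_NTASKS is not set (e.g. when run outside of a job).
job_cpus() {
    echo "${SLURM_NTASKS:-1}"
}
```

Inside the job script, the prokka call could then end with --cpus "$(job_cpus)" instead of a hard-coded 24.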
Proteinortho¶
Introduction¶
Proteinortho
is a tool to detect orthologous genes within different species.
Versions¶
6.0.33
Commands¶
proteinortho
proteinortho2html.pl
proteinortho2tree.pl
proteinortho2xml.pl
proteinortho6.pl
proteinortho_cleanupblastgraph
proteinortho_clustering
proteinortho_compareProteinorthoGraphs.pl
proteinortho_do_mcl.pl
proteinortho_extract_from_graph.pl
proteinortho_ffadj_mcs.py
proteinortho_formatUsearch.pl
proteinortho_grab_proteins.pl
proteinortho_graphMinusRemovegraph
proteinortho_history.pl
proteinortho_singletons.pl
proteinortho_summary.pl
proteinortho_treeBuilderCore
Module¶
You can load the modules by:
module load biocontainers
module load proteinortho
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Proteinortho on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=proteinortho
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers proteinortho
proteinortho6.pl test/C.faa test/E.faa test/L.faa test/M.faa
ProtHint¶
Introduction¶
ProtHint
is a pipeline for predicting and scoring hints (in the form of introns, start and stop codons) in the genome of interest by mapping and spliced aligning predicted genes to a database of reference protein sequences.
Versions¶
2.6.0
Commands¶
cds_with_upstream_support.py
combine_gff_records.pl
count_cds_overlaps.py
flag_top_proteins.py
gff_from_region_to_contig.pl
make_chains.py
nucseq_for_selected_genes.pl
print_high_confidence.py
print_longest_isoform.py
proteins_from_gtf.pl
prothint.py
prothint2augustus.py
run_spliced_alignment.pl
run_spliced_alignment_pbs.pl
select_best_proteins.py
select_for_next_iteration.py
spalnBatch.sh
spaln_to_gff.py
Academic license¶
ProtHint depends on GeneMark. To use GeneMark, users need to download the license file themselves.
Go to the GeneMark web site: http://exon.gatech.edu/GeneMark/license_download.cgi. Check the boxes for GeneMark-ES/ET/EP ver 4.69_lic
and LINUX 64
next to it, fill out the form, then click “I agree”. On the next page, right-click and copy the link address for the 64 bit
license. Paste the link address into the wget command below:
cd $HOME
wget "replace with license URL"
zcat gm_key_64.gz > .gm_key
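After the commands above, the key should exist as .gm_key in your home directory. The snippet below is a small sketch for confirming that before a job is submitted; check_gm_key is a hypothetical helper, and it only tests that the file exists and is non-empty, not that the license is still valid.

```shell
# Hypothetical check: the GeneMark key file must exist and be non-empty.
# Takes an optional path; defaults to $HOME/.gm_key as created above.
check_gm_key() {
    key_file="${1:-$HOME/.gm_key}"
    if [ -s "$key_file" ]; then
        echo "GeneMark key found: $key_file"
    else
        echo "GeneMark key missing or empty: $key_file" >&2
        return 1
    fi
}
```

Running check_gm_key with no argument tests the default location.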
Module¶
You can load the modules by:
module load biocontainers
module load prothint
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ProtHint on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=prothint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers prothint
prothint.py --threads 4 input/genome.fasta input/proteins.fasta --geneSeeds input/genemark.gtf --workdir test
Pullseq¶
Introduction¶
Pullseq is a utility program for extracting sequences from a fasta/fastq file.
Versions¶
1.0.2
Commands¶
pcre-config
pcregrep
pcretest
pullseq
seqdiff
Module¶
You can load the modules by:
module load biocontainers
module load pullseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pullseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pullseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pullseq
Purge_dups¶
Introduction¶
purge_dups is designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.
Versions¶
1.2.6
Commands¶
augustify.py
bamToWig.py
cleanup-blastdb-volumes.py
edirect.py
executeTestCGP.py
extractAnno.py
findRepetitiveProtSeqs.py
fix_in_frame_stop_codon_genes.py
generate_plot.py
getAnnoFastaFromJoingenes.py
hist_plot.py
pd_config.py
run_abundance.py
run_purge_dups.py
run_sepp.py
run_tipp.py
run_tipp_tool.py
run_upp.py
split_sequences.py
stringtie2fa.py
sumlabels.py
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load purge_dups
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run purge_dups on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=purge_dups
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers purge_dups
Pvactools¶
Introduction¶
pVACtools is a cancer immunotherapy tools suite consisting of pVACseq, pVACbind, pVACfuse, pVACvector, and pVACview.
Versions¶
3.0.1
Commands¶
pvacbind
pvacfuse
pvacseq
pvactools
pvacvector
pvacview
Module¶
You can load the modules by:
module load biocontainers
module load pvactools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pvactools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pvactools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pvactools
pvacseq download_example_data .
pvacseq run \
pvacseq_example_data/input.vcf \
Test \
HLA-A*02:01,HLA-B*35:01,DRB1*11:01 \
MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
pvacseq_output_data \
-e1 8,9,10 \
-e2 15 \
--iedb-install-directory /opt/iedb
Pyani¶
Introduction¶
Pyani
is an application and Python module for whole-genome classification of microbes using Average Nucleotide Identity.
Versions¶
0.2.11
0.2.12
Commands¶
average_nucleotide_identity.py
genbank_get_genomes_by_taxon.py
delta_filter_wrapper.py
Module¶
You can load the modules by:
module load biocontainers
module load pyani
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyani on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyani
average_nucleotide_identity.py -i tests/ -o tests/test_ANIm_output -m ANIm -g
average_nucleotide_identity.py -i tests/ -o tests/test_ANIb_output -m ANIb -g
average_nucleotide_identity.py -i tests/ -o tests/test_ANIblastall_output -m ANIblastall -g
average_nucleotide_identity.py -i tests/ -o tests/test_TETRA_output -m TETRA -g
Pybedtools¶
Introduction¶
Pybedtools
wraps and extends BEDTools and offers feature-level manipulations from within Python.
Versions¶
0.9.0
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pybedtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pybedtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybedtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pybedtools
Pybigwig¶
Introduction¶
Pybigwig
is a Python extension, written in C, for quick access to bigBed files and access to and creation of bigWig files.
Versions¶
0.3.18
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pybigwig
Interactive job¶
To run pybigwig interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers pybigwig
(base) UserID@bell-a008:~ $ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 18:49:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyBigWig
>>> bw = pyBigWig.open("test/test.bw")
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run batch jobs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pybigwig
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pybigwig
python script.py
Pychopper¶
Introduction¶
Pychopper is a tool to identify, orient and trim full-length Nanopore cDNA reads. The tool is also able to rescue fused reads.
Versions¶
2.5.0
Commands¶
cdna_classifier.py
Module¶
You can load the modules by:
module load biocontainers
module load pychopper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pychopper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pychopper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pychopper
Pycoqc¶
Introduction¶
Pycoqc
is a tool that computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data.
Versions¶
2.5.2
Commands¶
pycoQC
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pycoqc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pycoqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pycoqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pycoqc
pycoQC \
-f Albacore-1.2.1_basecall-1D-DNA_sequencing_summary.txt \
-o Albacore-1.2.1_basecall-1D-DNA.html \
--quiet
Pyensembl¶
Introduction¶
Pyensembl
is a Python interface to Ensembl reference genome metadata such as exons and transcripts.
Versions¶
1.9.4
Commands¶
pyensembl
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pyensembl
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyensembl on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyensembl
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyensembl
Pyfaidx¶
Introduction¶
Pyfaidx
is a Python package for random access and indexing of fasta files.
Versions¶
0.6.4
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pyfaidx
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyfaidx on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyfaidx
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyfaidx
Pygenometracks¶
Introduction¶
pyGenomeTracks aims to produce high-quality genome browser tracks that are highly customizable.
Versions¶
3.7
Commands¶
make_tracks_file
pyGenomeTracks
Module¶
You can load the modules by:
module load biocontainers
module load pygenometracks
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pygenometracks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenometracks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pygenometracks
make_tracks_file --trackFiles domains.bed bigwig.bw -o tracks.ini
pyGenomeTracks --tracks tracks.ini \
--region chr2:10,000,000-11,000,000 --outFileName nice_image.pdf
Pygenomeviz¶
Introduction¶
pyGenomeViz is a genome visualization Python package for comparative genomics, implemented on top of matplotlib.
Versions¶
0.2.2
0.3.2
Commands¶
pgv-download-dataset
pgv-mmseqs
pgv-mummer
pgv-pmauve
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pygenomeviz
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pygenomeviz on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pygenomeviz
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pygenomeviz
Pyranges¶
Introduction¶
Pyranges
are collections of intervals that support comparison operations (like overlap and intersect) and other methods that are useful for genomic analyses.
Versions¶
0.0.115
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pyranges
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pyranges on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyranges
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyranges
Pysam¶
Introduction¶
Pysam
is a Python module that makes it easy to read and manipulate mapped short read sequence data stored in SAM/BAM files.
Versions¶
0.18.0
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pysam
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Pysam on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pysam
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pysam
Pyvcf3¶
Introduction¶
PyVCF3 is a Variant Call Format (VCF) parser for Python. It was created because the official PyVCF repository is no longer maintained and does not accept pull requests.
Versions¶
1.0.3
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load pyvcf3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pyvcf3 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=pyvcf3
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers pyvcf3
QIIME 2¶
Introduction¶
QIIME 2
is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results.
Versions¶
2021.2
2022.2
2022.8
2022.11
2023.2
2023.5
Commands¶
qiime
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load qiime2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run QIIME 2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qiime2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qiime2
qiime metadata tabulate \
--m-input-file rep-seqs.qza \
--m-input-file taxonomy.qza \
--o-visualization tabulated-feature-metadata.qzv
Qtlseq¶
Introduction¶
Bulked segregant analysis, as implemented in QTL-seq (Takagi et al., 2013), is a powerful and efficient method to identify agronomically important loci in crop plants. QTL-seq was adapted from MutMap to identify quantitative trait loci. It utilizes sequences pooled from two segregating progeny populations with extreme opposite traits (e.g. resistant vs susceptible) and a single whole-genome resequencing of either of the parental cultivars.
Versions¶
2.2.3
Commands¶
qtlseq
Module¶
You can load the modules by:
module load biocontainers
module load qtlseq
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run qtlseq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qtlseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qtlseq
Qualimap¶
Introduction¶
Qualimap
is a platform-independent application written in Java and R that provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.
Versions¶
2.2.1
Commands¶
qualimap
Module¶
You can load the modules by:
module load biocontainers
module load qualimap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Qualimap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=qualimap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers qualimap
Quast¶
Introduction¶
Quast
is a quality assessment tool for genome assemblies.
Note: To run QUAST, call quast.py or metaquast.py directly, for example quast.py fastafile [OTHER OPTIONS]. Do NOT call it as ‘python quast.py’ or ‘python metaquast.py’.
Versions¶
5.0.2
5.2.0
Commands¶
quast.py
metaquast.py
Module¶
You can load the modules by:
module load biocontainers
module load quast
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Quast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=quast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers quast
metaquast.py --gene-finding --threads 8 \
meta_contigs_1.fasta meta_contigs_2.fasta \
-r meta_ref_1.fasta,meta_ref_2.fasta,meta_ref_3.fasta \
-o quast_out_genefinding
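The example above writes the number 8 twice, once in #SBATCH -n 8 and once in --threads 8, and the two can drift apart when the script is edited. A minimal sketch of deriving the thread count from the allocation instead, assuming Slurm exports SLURM_NTASKS inside the job when -n is given; the function name job_threads and the fallback of 1 are illustrative choices.

```shell
# Hypothetical helper: report the task count Slurm granted, or 1 outside a job.
job_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Inside the job script, the hard-coded value could then become:
#   metaquast.py --gene-finding --threads "$(job_threads)" ...
echo "Using $(job_threads) thread(s)"
```

The same pattern applies to the other multi-threaded examples in this guide.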
QuickMIRSeq¶
Introduction¶
QuickMIRSeq
is an integrated pipeline for quick and accurate quantification of known miRNAs and isomiRs by jointly processing multiple samples.
Versions¶
1.0
Commands¶
perl
QuickMIRSeq-report.sh
Module¶
You can load the modules by:
module load biocontainers
module load quickmirseq
Note
This module defines program installation directory (note: inside the container!) as environment variable $QuickMIRSeq
. Once again, this is not a host path, this path is only available from inside the container.
With the way this module is organized, you should be able to use the variable freely for both the perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config
and the $QuickMIRSeq/QuickMIRSeq-report.sh
steps as directed by the user guide.
A simple QuickMIRSeq.pl
and QuickMIRSeq-report.sh
will also work (and can be a backup if the variable expansion somehow does not work for you).
You will also need a run configuration file. You can copy from an existing one, or take from the user guide, or as a last resort, use Singularity to copy the template (in $QuickMIRSeq/run.config.template
) from inside the container image. singularity shell
may be the easiest way to do the latter.
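As an alternative to an interactive singularity shell, the template can probably be copied in one step with singularity exec. This is only a sketch: the image path is site-specific and not given in this guide, so it is passed in as an argument, and the function name is hypothetical.

```shell
# Hypothetical helper: copy run.config.template out of the container image.
#   $1 = path to the container image (site-specific; not given in this guide)
#   $2 = destination file on the host (defaults to run.config)
copy_run_config_template() {
    local image="$1" dest="${2:-run.config}"
    # $QuickMIRSeq is set by the module and names a path inside the container,
    # so the file is read with `cat` running inside the image.
    singularity exec "$image" cat "${QuickMIRSeq}/run.config.template" > "$dest"
}

# Example usage (the image path is a placeholder):
# copy_run_config_template /path/to/quickmirseq.sif run.config
```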
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run QuickMIRSeq on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=quickmirseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers quickmirseq
perl $QuickMIRSeq/QuickMIRSeq.pl allIDs.txt run.config
$QuickMIRSeq/QuickMIRSeq-report.sh
R¶
Introduction¶
R
is a system for statistical computation and graphics.
This is a plain R-base installation (see https://github.com/rocker-org/rocker/) repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2 and libXt) and their header files.
Versions¶
4.1.1
Commands¶
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load r
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run R on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r
Racon¶
Introduction¶
Racon
is a consensus module for raw de novo DNA assembly of long uncorrected reads.
Versions¶
1.4.20
1.5.0
Commands¶
racon
Module¶
You can load the modules by:
module load biocontainers
module load racon
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Racon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=racon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers racon
Ragout¶
Introduction¶
Ragout
is a tool for chromosome-level scaffolding using multiple references.
Versions¶
2.3
Commands¶
ragout
Module¶
You can load the modules by:
module load biocontainers
module load ragout
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ragout on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ragout
Ragtag¶
Introduction¶
Ragtag
is a tool for fast reference-guided genome assembly scaffolding.
Versions¶
2.1.0
Commands¶
ragtag.py
Module¶
You can load the modules by:
module load biocontainers
module load ragtag
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Ragtag on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ragtag
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ragtag
ragtag.py correct ref.fasta query.fasta
ragtag.py patch target.fa query.fa
Rapmap¶
Introduction¶
RapMap is a testing ground for ideas in quasi-mapping and selective alignment.
Versions¶
0.6.0
Commands¶
rapmap
Module¶
You can load the modules by:
module load biocontainers
module load rapmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rapmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rapmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rapmap
Rasusa¶
Introduction¶
Rasusa: Randomly subsample sequencing reads to a specified coverage.
Versions¶
0.6.0
0.7.0
Commands¶
rasusa
Module¶
You can load the modules by:
module load biocontainers
module load rasusa
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rasusa on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rasusa
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rasusa
rasusa -i seq_1.fq -i seq_2.fq \
--coverage 100 --genome-size 35mb \
-o out.r1.fq -o out.r2.fq
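As a rough check on what the example asks for: subsampling to a coverage of 100 for a 35 Mb genome corresponds to a target of about coverage × genome size bases. A quick calculation, assuming 1 Mb is taken as 1,000,000 bases:

```shell
# Target number of bases = coverage x genome size (values from the example above).
coverage=100
genome_size=35000000   # 35mb, taking 1 Mb as 1,000,000 bases
target_bases=$((coverage * genome_size))
echo "$target_bases"   # 3500000000
```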
Raven-assembler¶
Introduction¶
Raven-assembler
is a de novo genome assembler for long uncorrected reads.
Versions¶
1.8.1
Commands¶
raven
Module¶
You can load the modules by:
module load biocontainers
module load raven-assembler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raven-assembler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raven-assembler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raven-assembler
raven -t 12 input.fastq
Raxml¶
Introduction¶
Raxml
(Randomized Axelerated Maximum Likelihood) is a program for the Maximum Likelihood-based inference of large phylogenetic trees.
Versions¶
8.2.12
Commands¶
raxmlHPC
raxmlHPC-AVX2
raxmlHPC-PTHREADS
raxmlHPC-PTHREADS-AVX2
raxmlHPC-PTHREADS-SSE3
raxmlHPC-SSE3
Module¶
You can load the modules by:
module load biocontainers
module load raxml
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raxml on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 36
#SBATCH --job-name=raxml
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raxml
raxmlHPC-SSE3 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-SSE3_out -# 20 -T 36
raxmlHPC -m GTRGAMMA -p 12345 -s input.fasta -n HPC_out -# 20 -T 36
raxmlHPC-AVX2 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS_out -# 20 -T 36
raxmlHPC-PTHREADS-AVX2 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS-AVX2_out -# 20 -T 36
raxmlHPC-PTHREADS-SSE3 -m GTRGAMMA -p 12345 -s input.fasta -n HPC-PTHREADS-SSE3_out -# 20 -T 36
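The job above runs every binary for illustration; in practice one of them is usually enough. The SSE3 and AVX2 builds need the corresponding CPU instruction sets, and the PTHREADS builds are the multi-threaded ones. The sketch below picks a threaded binary from the CPU flags, assuming a Linux node where /proc/cpuinfo lists SSE3 as "pni"; the function name is hypothetical.

```shell
# Hypothetical helper: pick a threaded RAxML binary from a list of CPU flags.
# Assumption: flags use Linux /proc/cpuinfo naming, where SSE3 is listed as "pni".
pick_raxml_binary() {
    local flags=" $1 "
    case "$flags" in
        *" avx2 "*) echo "raxmlHPC-PTHREADS-AVX2" ;;
        *" pni "*)  echo "raxmlHPC-PTHREADS-SSE3" ;;
        *)          echo "raxmlHPC-PTHREADS" ;;
    esac
}

# Example usage on a Linux compute node:
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "Suggested binary: $(pick_raxml_binary "$cpu_flags")"
fi
```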
Raxml-ng¶
Introduction¶
Raxml-ng
is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion.
Versions¶
1.1.0
Commands¶
raxml-ng
raxml-ng-mpi
mpirun
mpiexec
Module¶
You can load the modules by:
module load biocontainers
module load raxml-ng
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Raxml-ng on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=raxml-ng
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers raxml-ng
raxml-ng --bootstrap --msa alignment.phy \
--model GTR+G --threads 12 --bs-trees 1000
R-cellchat¶
Introduction¶
CellChat: Inference and analysis of cell-cell communication.
Versions¶
1.5.0
Commands¶
R
Rscript
rstudio
Module¶
You can load the modules by:
module load biocontainers
module load r-cellchat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run r-cellchat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-cellchat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-cellchat
Reapr¶
Introduction¶
Reapr is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads.
Notes provided by Neelam Jha¶
Reapr is a tool that tries to find explicit errors in the assembly based on incongruently mapped reads. It relies heavily on detecting span coverage that is too low, or reads that map too far from or too close to each other. The program will also break up contigs/scaffolds at spurious sites to form smaller (but hopefully correct) contigs. Sadly, Reapr runs pretty slowly.
Reapr is a bit fussy about contig names, but luckily it gives us a tool to check if things are ok before we proceed! The command reapr facheck <assembly.fasta>
will tell you if everything’s ok. In this case, no output is good output, since the only output from the command is the potential problems with the contig names. If you run into any problems, run reapr facheck <assembly.fasta> <renamed_assembly.fasta>
, and you will get an assembly file with renamed contigs.
Once the names are ok, we continue:
The first thing reapr needs is a list of all “perfect” reads. These are reads that map perfectly to the reference. Reapr is finicky though, and can’t use libraries with different read lengths, so you’ll have to use assemblies based on the raw data for this. Run the command reapr perfectmap
to get information on how to create a perfect mapping file, and create a perfect mapping called <assembler>_perfect
.
The next tool we need is reapr smaltmap
which creates a bam file of read-pair mappings. Do the same thing you did with perfectmap and create an output file called <assembler>_smalt.bam
.
Finally we can use the smalt mapping, and the perfect mapping to run the reapr pipeline
. Run reapr pipeline
to get help on how to run, and then run the pipeline. Store the results in reapr_<assembler>
.
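The "no output is good output" rule for reapr facheck described in these notes can be turned into a small check at the start of a job script. This is a sketch that relies only on that rule; the function name contig_names_ok is hypothetical.

```shell
# Hypothetical helper: succeed when `reapr facheck` prints nothing,
# which the notes above describe as the "everything is ok" case.
contig_names_ok() {
    local report
    report=$(reapr facheck "$1" 2>&1) || true
    [ -z "$report" ]
}

# Example usage with the file names from the example job:
# if ! contig_names_ok Assembly.fasta; then
#     reapr facheck Assembly.fasta renamedAssembly.fasta
# fi
```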
Versions¶
1.0.18
Commands¶
reapr
Module¶
You can load the modules by:
module load biocontainers
module load reapr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run reapr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reapr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers reapr
reapr facheck Assembly.fasta renamedAssembly.fasta
reapr perfectmap renamedAssembly.fasta reads_1.fastq reads_2.fastq 100 outputPrefix
reapr smaltmap renamedAssembly.fasta reads_1.fastq reads_2.fastq mapped.bam
reapr pipeline renamedAssembly.fasta mapped.bam pipeoutdir outputPrefix
Rebaler¶
Introduction¶
Rebaler
is a program for conducting reference-based assemblies using long reads.
Versions¶
0.2.0
Commands¶
rebaler
Module¶
You can load the modules by:
module load biocontainers
module load rebaler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rebaler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rebaler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rebaler
Reciprocal Smallest Distance¶
Introduction¶
The reciprocal smallest distance
(RSD) algorithm accurately infers orthologs between pairs of genomes by considering global sequence alignment and maximum likelihood evolutionary distance between sequences.
Versions¶
1.1.7
Commands¶
rsd_search
rsd_blast
rsd_format
Module¶
You can load the modules by:
module load biocontainers
module load reciprocal_smallest_distance
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Reciprocal Smallest Distance on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=reciprocal_smallest_distance
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers reciprocal_smallest_distance
rsd_search -q Mycoplasma_genitalium.aa \
--subject-genome=Mycobacterium_leprae.aa \
-o Mycoplasma_genitalium.aa_Mycobacterium_leprae.aa_0.8_1e-5.orthologs.txt
rsd_format -g Mycoplasma_genitalium.aa
rsd_blast -v -q Mycoplasma_genitalium.aa \
--subject-genome=Mycobacterium_leprae.aa \
--forward-hits q_s.hits --reverse-hits s_q.hits \
--no-format --evalue 0.1
Recycler¶
Introduction¶
Recycler
is a tool designed for extracting circular sequences from de novo assembly graphs.
Versions¶
0.7
Commands¶
make_fasta_from_fastg.py
get_simple_cycs.py
recycle.py
Module¶
You can load the modules by:
module load biocontainers
module load recycler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Recycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=recycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers recycler
recycle.py -g test/assembly_graph.fastg \
-k 55 -b test/test.sort.bam -i True
Regtools¶
Introduction¶
Regtools are tools that integrate DNA-seq and RNA-seq data to help interpret mutations in a regulatory and splicing context.
Versions¶
1.0.0
Commands¶
regtools
Module¶
You can load the modules by:
module load biocontainers
module load regtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run regtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=regtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers regtools
RepeatMasker¶
Introduction¶
RepeatMasker
is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.
Detailed usage can be found here: http://www.repeatmasker.org.
Versions¶
4.1.2
Commands¶
RepeatMasker
Database¶
Note
As of May 20, 2019 GIRI has rescinded the working agreement allowing the www.repeatmasker.org website to offer a repeatmasking service utilizing the RepBase RepeatMasker Edition library. As a result, repeatmasker can only offer masking using the open database Dfam, which starting in 3.0 includes consensus sequences in addition to profile hidden Markov models for many transposable element families. Users requiring RepBase will need to purchase a commercial or academic license from GIRI and run RepeatMasker locally.
On our cluster, we set up Dfam release 3.5 (October 2021), which includes 285,580 repetitive DNA families.
Species name¶
Note
Since v4.1.1, RepeatMasker has switched to the FamDB format for the Dfam database. Due to this change, RepeatMasker has become stricter about what is acceptable for the -species
flag. The commonly used names such as “mammal” and “mouse” will not be accepted. To check for valid names, you can query the database using the python script famdb.py
(https://github.com/Dfam-consortium/FamDB).
See famdb.py --help
for usage information and below for an example that checks the valid names for “mammal” using our copy of the Dfam database:
/depot/itap/datasets/Maker/RepeatMasker/Libraries/famdb.py -i /depot/itap/datasets/Maker/RepeatMasker/Libraries/Dfam.h5 names mammal
Module¶
You can load the modules by:
module load biocontainers
module load repeatmasker/4.1.2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatMasker on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=repeatmasker
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatmasker/4.1.2
RepeatMasker -pa 24 -species mammals genome.fasta
RepeatModeler¶
Introduction¶
RepeatModeler
is a de novo transposable element (TE) family identification and modeling package.
Versions¶
2.0.2
2.0.3
Commands¶
RepeatModeler
BuildDatabase
RepeatClassifier
Module¶
You can load the modules by:
module load biocontainers
module load repeatmodeler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatModeler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatmodeler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatmodeler
RepeatScout¶
Introduction¶
RepeatScout
is a tool to discover repetitive substrings in DNA.
Versions¶
1.0.6
Commands¶
RepeatScout
build_lmer_table
compare-out-to-gff.prl
filter-stage-1.prl
filter-stage-2.prl
merge-lmer-tables.prl
Module¶
You can load the modules by:
module load biocontainers
module load repeatscout
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RepeatScout on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=repeatscout
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers repeatscout
build_lmer_table -l 14 -sequence genome.fasta -freq Final_assembly.freq
RepeatScout -sequence genome.fasta -output Final_assembly_repeats.fasta -freq Final_assembly.freq -l 14
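Both commands above use the same l-mer length (-l 14), and the two values should stay in step. As a sanity check on that number, the sketch below computes l = ceil(log4(L) + 1) for a sequence of length L, which is assumed here to be the default described in the RepeatScout README; the 50 Mb genome length is purely illustrative.

```shell
# Assumed formula (from the RepeatScout README): l = ceil(log_4(L) + 1).
default_lmer() {
    awk -v L="$1" 'BEGIN {
        x = log(L) / log(4) + 1
        l = int(x)
        if (l < x) l++      # round up
        print l
    }'
}

default_lmer 50000000   # illustrative 50 Mb genome; prints 14
```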
Resfinder¶
Introduction¶
ResFinder identifies acquired antimicrobial resistance genes in total or partial sequenced isolates of bacteria.
Versions¶
4.1.5
Commands¶
run_resfinder.py
run_batch_resfinder.py
Module¶
You can load the modules by:
module load biocontainers
module load resfinder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run resfinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=resfinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers resfinder
run_resfinder.py -o output -db_res db_resfinder/ \
-db_res_kma db_resfinder/kma_indexing -db_point db_pointfinder/ \
-s "Escherichia coli" --acquired --point -ifq data/test_isolate_01_*
Revbayes¶
Introduction¶
RevBayes – Bayesian phylogenetic inference using probabilistic graphical models and an interactive language.
Versions¶
1.1.1
Commands¶
rb
rb-mpi
Module¶
You can load the modules by:
module load biocontainers
module load revbayes
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run revbayes on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=revbayes
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers revbayes
rMATS¶
Introduction¶
MATS
is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns. MATS handles replicate RNA-Seq data from both paired and unpaired study design.
Detailed usage can be found here: http://rnaseq-mats.sourceforge.net
Versions¶
4.1.1
Commands¶
rmats.py
Module¶
You can load the modules by:
module load biocontainers
module load rmats
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rmats on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rmats
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rmats
rmats.py --b1 SR_b1.txt --b2 SR_b2.txt --gtf Homo_sapiens.GRCh38.105.gtf --od rmats_out_homo --tmp rmats_tmp -t paired --nthread 10 --readLength 150
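The --b1 and --b2 arguments each take a text file that lists the BAM files of one sample group as a single comma-separated line. A minimal sketch for writing such files follows; the BAM names are placeholders and the function name write_bam_list is hypothetical.

```shell
# Hypothetical helper: write the given paths to a file as one comma-separated line.
write_bam_list() {
    local out="$1"
    shift
    # "$*" joins the remaining arguments with the first character of IFS.
    ( IFS=,; printf '%s\n' "$*" ) > "$out"
}

# Placeholder BAM names; replace with your own alignments.
write_bam_list SR_b1.txt ctrl_rep1.bam ctrl_rep2.bam
write_bam_list SR_b2.txt treat_rep1.bam treat_rep2.bam
cat SR_b1.txt   # ctrl_rep1.bam,ctrl_rep2.bam
```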
rmats2sashimiplot¶
Introduction¶
rmats2sashimiplot
produces a sashimiplot visualization of rMATS output. rmats2sashimiplot can also produce plots using an annotation file and genomic coordinates. The plotting backend is MISO.
Detailed usage can be found here: https://github.com/Xinglab/rmats2sashimiplot
Versions¶
2.0.4
Commands¶
rmats2sashimiplot
Module¶
You can load the modules by:
module load biocontainers
module load rmats2sashimiplot
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rmats2sashimiplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=rmats2sashimiplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rmats2sashimiplot
rmats2sashimiplot --s1 sample_1_replicate_1.sam,sample_1_replicate_2.sam,sample_1_replicate_3.sam \
--s2 sample_2_replicate_1.sam,sample_2_replicate_2.sam,sample_2_replicate_3.sam \
-t SE -e SE.MATS.JC.txt --l1 SampleOne --l2 SampleTwo --exon_s 1 --intron_s 5 \
-o test_events_output
RNAIndel¶
Introduction¶
RNAIndel
calls coding indels from tumor RNA-Seq data and classifies them as somatic, germline, and artifactual. RNAIndel supports GRCh38 and 37.
Versions¶
3.0.9
Commands¶
rnaindel
Module¶
You can load the modules by:
module load biocontainers
module load rnaindel
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RNAIndel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnaindel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnaindel
RNApeg¶
Introduction¶
RNApeg
is an RNA junction calling, correction, and quality-control package. RNAIndel supports GRCh38 and 37.
Versions¶
2.7.1
Commands¶
RNApeg.sh
Module¶
You can load the modules by:
module load biocontainers
module load rnapeg
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RNApeg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rnapeg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnapeg
Rnaquast¶
Introduction¶
Rnaquast
is a quality assessment tool for de novo transcriptome assemblies.
Versions¶
2.2.1
Commands¶
rnaQUAST.py
Dependencies for de novo quality assessment and read alignment¶
Note
When a reference genome and gene database are unavailable, users can also use BUSCO
and GeneMarkS-T
in the rnaQUAST pipeline. Since GeneMarkS-T
requires a license key, users may need to download their own key and put it in their $HOME.
rnaQUAST is also capable of calculating various statistics using raw reads (e.g. database coverage by reads). To use this, you will need to use STAR
in the pipeline.
BUSCO
, GeneMarkS-T
, and STAR
have been installed, and the directories of their executables have been added to $PATH. Users do not need to load these modules. The only module required is rnaquast
itself.
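A missing GeneMarkS-T key only shows up once the pipeline reaches gene prediction, so it can be worth checking for it up front. This sketch assumes the key is stored as ~/.gm_key, which is the usual GeneMark convention; the note above only says the key goes in $HOME, so adjust the file name if yours differs.

```shell
# Assumption: the GeneMarkS-T key lives at $HOME/.gm_key (not stated in this guide).
# An alternative path can be passed as the first argument.
have_genemark_key() {
    [ -s "${1:-$HOME/.gm_key}" ]
}

if have_genemark_key; then
    echo "GeneMarkS-T key found"
else
    echo "GeneMarkS-T key not found in \$HOME" >&2
fi
```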
Module¶
You can load the modules by:
module load biocontainers
module load rnaquast
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rnaquast on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=rnaquast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rnaquast
rnaQUAST.py -t 12 -o output \
--transcripts Trinity.fasta idba.fasta \
--reference Saccharomyces_cerevisiae.R64-1-1.75.dna.toplevel.fa \
--gtf Saccharomyces_cerevisiae.R64-1-1.75.gtf
rnaQUAST.py -t 12 -o output2 \
--reference reference.fasta \
--transcripts transcripts.fasta \
--left_reads left.fastq \
--right_reads right.fastq \
--busco fungi_odb10
Roary¶
Introduction¶
Roary is a high speed stand alone pan genome pipeline, which takes annotated assemblies in GFF3 format (produced by Prokka) and calculates the pan genome.
Versions¶
3.13.0
Commands¶
roary
Module¶
You can load the modules by:
module load biocontainers
module load roary
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run roary on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=roary
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers roary
roary -f demo -e -n -v gff/*.gff
r-rnaseq¶
Introduction¶
r-rnaseq
is a customized R module based on R/4.1.1
used for RNAseq analysis.
In the module, we have some packages installed:
BiocManager 1.30.16
ComplexHeatmap 2.9.4
DESeq2 1.34.0
edgeR 3.36.0
pheatmap 1.0.12
limma 3.48.3
tibble 3.1.5
tidyr 1.1.4
readr 2.0.2
readxl 1.3.1
purrr 0.3.4
dplyr 1.0.7
stringr 1.4.0
forcats 0.5.1
ggplot2 3.3.5
openxlsx 4.2.5
Versions¶
4.1.1-1
4.1.1-1-rstudio
Commands¶
R
Rscript
rstudio (only for the rstudio version)
Module¶
You can load the modules by:
module load biocontainers
module load r-rnaseq/4.1.1-1
# If you want to use Rstudio, load the rstudio version
module load r-rnaseq/4.1.1-1-rstudio
Install packages¶
Note
Users can also install the packages they need. The installation location depends on the settings in your ~/.Rprofile
.
Detailed guide about installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.
Interactive job¶
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-rnaseq/4.1.1-1 # or r-rnaseq/4.1.1-1-rstudio
(base) UserID@bell-a008:~ $ R
R version 4.1.1 (2021-08-10) -- "Kick Things"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(edgeR)
> library(pheatmap)
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit an sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_RNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-rnaseq
Rscript RNAseq.R
RStudio¶
Introduction¶
RStudio
is an integrated development environment (IDE) for the R statistical computation and graphics system.
This is an RStudio IDE together with a plain R-base installation (see https://github.com/rocker-org/rocker/), repackaged by RCAC with the addition of a handful of prerequisite libraries (libcurl, libopenssl, libxml2, libcairo2 and libXt) and their header files. It is intentionally separate from the biocontainers’ ‘r’ module for reasons of image size (700MB vs 360MB).
Versions¶
4.1.1
Commands¶
R
Rscript
rstudio
Module¶
You can load the modules by:
module load biocontainers
module load r-studio
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RStudio on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=r-studio
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-studio
r-scrnaseq¶
Introduction¶
r-scrnaseq
is a customized R module based on R/4.1.1
or R/4.2.0
used for scRNAseq analysis.
In the module, we have some packages installed:
BiocManager 1.30.16
CellChat 1.6.1
ProjecTILs 3.0
Seurat 4.1.0
SeuratObject 4.0.4
SeuratWrappers 0.3.0
monocle3 1.0.0
SnapATAC 1.0.0
SingleCellExperiment 1.14.1, 1.16.0
scDblFinder 1.8.0
SingleR 1.8.1
scCATCH 3.0
scMappR 1.0.7
rliger 1.0.0
schex 1.8.0
CoGAPS 3.14.0
celldex 1.4.0
dittoSeq 1.6.0
DropletUtils 1.14.2
miQC 1.2.0
Nebulosa 1.4.0
tricycle 1.2.0
pheatmap 1.0.12
limma 3.48.3, 3.50.0
tibble 3.1.5
tidyr 1.1.4
readr 2.0.2
readxl 1.3.1
purrr 0.3.4
dplyr 1.0.7
stringr 1.4.0
forcats 0.5.1
ggplot2 3.3.5
openxlsx 4.2.5
Versions¶
4.1.1-1
4.1.1-1-rstudio
4.2.0
4.2.0-rstudio
4.2.3-rstudio
Commands¶
R
Rscript
rstudio (only for the rstudio version)
Module¶
You can load the modules by:
module load biocontainers
module load r-scrnaseq
# or module load r-scrnaseq/4.2.0
# If you want to use Rstudio, load the rstudio version
module load r-scrnaseq/4.1.1-1-rstudio
# or module load r-scrnaseq/4.2.0-rstudio
Install packages¶
Note
Users can also install the packages they need. The installation location depends on the settings in your ~/.Rprofile
.
A detailed guide to installing R packages can be found here: https://www.rcac.purdue.edu/knowledge/bell/run/examples/apps/r/package.
Interactive job¶
To run interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers r-scrnaseq/4.2.0 # or r-scrnaseq/4.2.0-rstudio
(base) UserID@bell-a008:~ $ R
R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(Seurat)
> library(monocle3)
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=r_scRNAseq
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers r-scrnaseq
Rscript scRNAseq.R
RSEM¶
Introduction¶
RSEM
is a software package for estimating gene and isoform expression levels from RNA-Seq data. Further information can be found here: https://deweylab.github.io/RSEM/.
Versions¶
1.3.3
Commands¶
rsem-bam2readdepth
rsem-bam2wig
rsem-build-read-index
rsem-calculate-credibility-intervals
rsem-calculate-expression
rsem-control-fdr
rsem-extract-reference-transcripts
rsem-generate-data-matrix
rsem-generate-ngvector
rsem-gen-transcript-plots
rsem-get-unique
rsem-gff3-to-gtf
rsem-parse-alignments
rsem-plot-model
rsem-plot-transcript-wiggles
rsem-prepare-reference
rsem-preref
rsem-refseq-extract-primary-assembly
rsem-run-ebseq
rsem-run-em
rsem-run-gibbs
rsem-run-prsem-testing-procedure
rsem-sam-validator
rsem-scan-for-paired-end-reads
rsem-simulate-reads
rsem-synthesis-reference-transcripts
rsem-tbam2gbam
Dependencies¶
STAR v2.7.9a, Bowtie v1.2.3, Bowtie2 v2.3.5.1, and HISAT2 v2.2.1 are included in the container image, so users do not need to provide the dependency paths in the RSEM parameters.
Module¶
You can load the modules by:
module load biocontainers
module load rsem/1.3.3
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run RSEM on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=rsem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rsem/1.3.3
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --bowtie2 Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_bowtie2 -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --hisat2-hca Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_hisat2 -p 24
rsem-prepare-reference --gtf Homo_sapiens.GRCh38.105.gtf --star Homo_sapiens.GRCh38.dna.primary_assembly.fa Gh38_star -p 24
rsem-calculate-expression --paired-end --star -p 24 SRR12095148_1.fastq SRR12095148_2.fastq Gh38_star SRR12095148_rsem_expression
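The example above hard-codes `-p 24` to match `#SBATCH -n 24`. As a sketch, the thread count can instead be read from the `SLURM_NTASKS` environment variable, which Slurm sets inside a job submitted with `-n`. `rsem_threads` is a hypothetical helper, and the fallback value of 1 for runs outside Slurm is an assumption.

```shell
# Hypothetical helper: number of threads to pass to -p.
# Uses SLURM_NTASKS when it is set (inside a job submitted with -n), otherwise 1.
rsem_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Print the command rather than running it, so the thread count can be inspected first.
echo "rsem-calculate-expression --paired-end --star -p $(rsem_threads) SRR12095148_1.fastq SRR12095148_2.fastq Gh38_star SRR12095148_rsem_expression"
```

With this approach, changing `#SBATCH -n` is enough to change the thread count. The `-p` arguments do not need to be edited separately.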
Rseqc¶
Introduction¶
Rseqc
is a package that provides a number of useful modules that can comprehensively evaluate high-throughput sequence data, especially RNA-seq data.
Versions¶
4.0.0
Commands¶
FPKM-UQ.py
FPKM_count.py
RNA_fragment_size.py
RPKM_saturation.py
aggregate_scores_in_intervals.py
align_print_template.py
axt_extract_ranges.py
axt_to_fasta.py
axt_to_lav.py
axt_to_maf.py
bam2fq.py
bam2wig.py
bam_stat.py
bed_bigwig_profile.py
bed_build_windows.py
bed_complement.py
bed_count_by_interval.py
bed_count_overlapping.py
bed_coverage.py
bed_coverage_by_interval.py
bed_diff_basewise_summary.py
bed_extend_to.py
bed_intersect.py
bed_intersect_basewise.py
bed_merge_overlapping.py
bed_rand_intersect.py
bed_subtract_basewise.py
bnMapper.py
clipping_profile.py
deletion_profile.py
div_snp_table_chr.py
divide_bam.py
find_in_sorted_file.py
geneBody_coverage.py
geneBody_coverage2.py
gene_fourfold_sites.py
get_scores_in_intervals.py
infer_experiment.py
inner_distance.py
insertion_profile.py
int_seqs_to_char_strings.py
interval_count_intersections.py
interval_join.py
junction_annotation.py
junction_saturation.py
lav_to_axt.py
lav_to_maf.py
line_select.py
lzop_build_offset_table.py
mMK_bitset.py
maf_build_index.py
maf_chop.py
maf_chunk.py
maf_col_counts.py
maf_col_counts_all.py
maf_count.py
maf_covered_ranges.py
maf_covered_regions.py
maf_div_sites.py
maf_drop_overlapping.py
maf_extract_chrom_ranges.py
maf_extract_ranges.py
maf_extract_ranges_indexed.py
maf_filter.py
maf_filter_max_wc.py
maf_gap_frequency.py
maf_gc_content.py
maf_interval_alignibility.py
maf_limit_to_species.py
maf_mapping_word_frequency.py
maf_mask_cpg.py
maf_mean_length_ungapped_piece.py
maf_percent_columns_matching.py
maf_percent_identity.py
maf_print_chroms.py
maf_print_scores.py
maf_randomize.py
maf_region_coverage_by_src.py
maf_select.py
maf_shuffle_columns.py
maf_species_in_all_files.py
maf_split_by_src.py
maf_thread_for_species.py
maf_tile.py
maf_tile_2.py
maf_tile_2bit.py
maf_to_axt.py
maf_to_concat_fasta.py
maf_to_fasta.py
maf_to_int_seqs.py
maf_translate_chars.py
maf_truncate.py
maf_word_frequency.py
mask_quality.py
mismatch_profile.py
nib_chrom_intervals_to_fasta.py
nib_intervals_to_fasta.py
nib_length.py
normalize_bigwig.py
one_field_per_line.py
out_to_chain.py
overlay_bigwig.py
prefix_lines.py
pretty_table.py
qv_to_bqv.py
random_lines.py
read_GC.py
read_NVC.py
read_distribution.py
read_duplication.py
read_hexamer.py
read_quality.py
split_bam.py
split_paired_bam.py
table_add_column.py
table_filter.py
tfloc_summary.py
tin.py
ucsc_gene_table_to_intervals.py
wiggle_to_array_tree.py
wiggle_to_binned_array.py
wiggle_to_chr_binned_array.py
wiggle_to_simple.py
Module¶
You can load the modules by:
module load biocontainers
module load rseqc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Rseqc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rseqc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rseqc
for bam in *.bam; do bam_stat.py -i "$bam" -q 30; done
run-dbCAN¶
Introduction¶
run_dbCAN
uses genomes/metagenomes/proteomes of any assembled organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes. This is the standalone version of http://bcb.unl.edu/dbCAN2/. Details about its usage can be found in its GitHub repository.
Versions¶
3.0.2
3.0.6
Commands¶
run_dbcan
Database¶
The latest version of the database has been downloaded and set up, including CAZyDB.09242021.fa, dbCAN-HMMdb-V10.txt, tcdb.fa, tf-1.hmm, tf-2.hmm, and stp.hmm.
Module¶
You can load the modules by:
module load biocontainers
module load run_dbcan/3.0.2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run run_dbcan on our cluster:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=run_dbcan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers run_dbcan/3.0.2
run_dbcan protein.faa protein --out_dir test1_dbcan
run_dbcan genome.fasta prok --out_dir test2_dbcan
rush¶
Introduction¶
rush
is a tool similar to GNU parallel and gargs. rush borrows some ideas from them and has some unique features, e.g., support for custom-defined variables, resuming multi-line commands, and more advanced embedded replacement strings.
Versions¶
0.4.2
Commands¶
rush
Module¶
You can load the modules by:
module load biocontainers
module load rush
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rush on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=rush
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers rush
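The job script above loads the module but does not run a command. As a hedged illustration: rush reads one input per line from standard input and substitutes it for `{}` in the command, and `-j` limits the number of concurrent jobs. `run_per_sample` is a hypothetical wrapper, and the `echo` command is a placeholder for real work.

```shell
# Hypothetical wrapper: run one placeholder command per input line, two at a time.
# Requires the rush module to be loaded; input lines arrive on standard input.
run_per_sample() {
    rush -j 2 'echo processing {}'
}

# Usage inside the job script (sample names are placeholders):
#   printf '%s\n' sampleA sampleB sampleC | run_per_sample
```

Because jobs run concurrently, the order of the output lines is not guaranteed to match the input order.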
Sage¶
Introduction¶
Sage is a proteomics search engine - a tool that transforms raw mass spectra from proteomics experiments into peptide identifications via database searching & spectral matching. But, it’s also more than just a search engine - Sage includes a variety of advanced features that make it a one-stop shop: retention time prediction, quantification (both isobaric & LFQ), peptide-spectrum match rescoring, and FDR control.
Versions¶
0.8.1
Commands¶
sage
Module¶
You can load the modules by:
module load biocontainers
module load sage
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run sage on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sage
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sage
Salmon¶
Introduction¶
Salmon
is a wicked-fast program that produces highly accurate, transcript-level quantification estimates from RNA-seq data.
Detailed usage can be found here: https://github.com/COMBINE-lab/salmon
Versions¶
1.10.1
1.5.2
1.6.0
1.7.0
1.8.0
1.9.0
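The versions above appear in lexical order, so `1.10.1` is listed first even though it is the newest release. The following small sketch uses the version sort of GNU `sort` (`-V`) to pick the newest entry from such a list. `newest_version` is a hypothetical helper, and the version numbers are copied from the list above.

```shell
# Hypothetical helper: print the highest version among the arguments.
newest_version() {
    printf '%s\n' "$@" | sort -V | tail -n 1
}

newest_version 1.10.1 1.5.2 1.6.0 1.7.0 1.8.0 1.9.0   # prints 1.10.1
```

The result can then be used to pin an explicit version, for example `module load salmon/1.10.1`.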
Commands¶
salmon index
salmon quant
salmon alevin
salmon swim
salmon quantmerge
Module¶
You can load the modules by:
module load biocontainers
module load salmon
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Salmon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=salmon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers salmon
salmon index -t Homo_sapiens.GRCh38.cds.all.fa -i salmon_index
salmon quant -i salmon_index -l A -p 24 -1 SRR16956239_1.fastq -2 SRR16956239_2.fastq --validateMappings -o transcripts_quan
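To quantify several paired-end samples against the same index, the second mate file can be derived from the first by name. This sketch assumes the `_1.fastq`/`_2.fastq` naming used above. `mate2_of` and the `<sample>_quant` output directory names are hypothetical, and the commands are only printed so they can be reviewed before running.

```shell
# Hypothetical helper: map SAMPLE_1.fastq to SAMPLE_2.fastq.
mate2_of() {
    echo "${1%_1.fastq}_2.fastq"
}

# Print one salmon quant command per first-mate file in the current directory.
for r1 in *_1.fastq; do
    [ -e "$r1" ] || continue    # skip the literal pattern when nothing matches
    sample="${r1%_1.fastq}"
    echo salmon quant -i salmon_index -l A -p 24 -1 "$r1" -2 "$(mate2_of "$r1")" --validateMappings -o "${sample}_quant"
done
```

Removing the `echo` in front of `salmon quant` turns the printed commands into real runs.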
Sambamba¶
Introduction¶
Sambamba
is a high-performance, highly parallel, robust, and fast tool (and library), written in the D programming language, for working with SAM and BAM files.
Versions¶
0.8.2
Commands¶
sambamba
Module¶
You can load the modules by:
module load biocontainers
module load sambamba
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sambamba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sambamba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sambamba
sambamba view --reference-info input.bam
sambamba view -c -F "mapping_quality >= 40" input.bam
Samblaster¶
Introduction¶
Samblaster
is a tool to mark duplicates and extract discordant and split reads from sam files.
Versions¶
0.1.26
Commands¶
samblaster
Module¶
You can load the modules by:
module load biocontainers
module load samblaster
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samblaster on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samblaster
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samblaster
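The job script above stops after loading the module. As a hedged example of a typical invocation: samblaster expects SAM input with a header in which all alignments of the same read are grouped together, and writes SAM output with duplicates marked. The `-i` and `-o` options name the input and output files. Without them, samblaster uses standard input and standard output. `mark_duplicates` and the file names are hypothetical.

```shell
# Hypothetical wrapper: mark duplicates in a read-grouped SAM file ($1) and write the result to $2.
# Requires the samblaster module to be loaded.
mark_duplicates() {
    samblaster -i "$1" -o "$2"
}

# Usage inside the job script (file names are placeholders):
#   mark_duplicates aligned.sam aligned.dupmarked.sam
```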
Samclip¶
Introduction¶
Samclip is a tool to filter SAM files for soft- and hard-clipped alignments.
Versions¶
0.4.0
Commands¶
samclip
Module¶
You can load the modules by:
module load biocontainers
module load samclip
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run samclip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samclip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samclip
samclip --ref test.fna < test.sam > out.sam
Samplot¶
Introduction¶
Samplot
is a command line tool for rapid, multi-sample structural variant visualization.
Versions¶
1.3.0
Commands¶
samplot
Module¶
You can load the modules by:
module load biocontainers
module load samplot
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samplot on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samplot
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samplot
samplot plot \
-n NA12878 NA12889 NA12890 \
-b samplot/test/data/NA12878_restricted.bam \
samplot/test/data/NA12889_restricted.bam \
samplot/test/data/NA12890_restricted.bam \
-o 4_115928726_115931880.png \
-c chr4 \
-s 115928726 \
-e 115931880 \
-t DEL
Samtools¶
Introduction¶
Samtools
is a set of utilities for the Sequence Alignment/Map (SAM) format.
Versions¶
1.15
1.16
1.17
1.9
Commands¶
samtools
ace2sam
htsfile
maq2sam-long
maq2sam-short
tabix
wgsim
Module¶
You can load the modules by:
module load biocontainers
module load samtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Samtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=samtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers samtools
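The job script above loads the module without running a command. A common sequence, shown here as a sketch, is to coordinate-sort a BAM file, index it, and print summary statistics with the `sort`, `index`, and `flagstat` subcommands. `sort_index_stats` and the file names are hypothetical.

```shell
# Hypothetical wrapper: sort the BAM file $1 into $2, index the result, and print alignment statistics.
# Requires the samtools module to be loaded. Each step runs only if the previous one succeeded.
sort_index_stats() {
    samtools sort -o "$2" "$1" &&
    samtools index "$2" &&
    samtools flagstat "$2"
}

# Usage inside the job script (file names are placeholders):
#   sort_index_stats input.bam input.sorted.bam
```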
Scanpy¶
Introduction¶
Scanpy
is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference, and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells. Details about its usage can be found here: https://scanpy.readthedocs.io/en/stable/.
Versions¶
1.8.2
1.9.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load scanpy/1.8.2
Interactive job¶
To run scanpy interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scanpy/1.8.2
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun 4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scanpy as sc
>>> sc.tl.umap(adata, **tool_params)
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scanpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scanpy/1.8.2
python script.py
Scarches¶
Introduction¶
scArches is a package to integrate newly produced single-cell datasets into integrated reference atlases.
Versions¶
0.5.3
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load scarches
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scarches on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scarches
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scarches
Scgen¶
Introduction¶
scGen is a generative model to predict single-cell perturbation response across cell types, studies and species.
Versions¶
2.1.0
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load scgen
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scgen on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scgen
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scgen
Scirpy¶
Introduction¶
Scirpy is a scalable python-toolkit to analyse T cell receptor (TCR) or B cell receptor (BCR) repertoires from single-cell RNA sequencing (scRNA-seq) data. It seamlessly integrates with the popular scanpy library and provides various modules for data import, analysis and visualization.
Versions¶
0.10.1
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load scirpy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scirpy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scirpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scirpy
scVelo¶
Introduction¶
scVelo
is a scalable toolkit for RNA velocity analysis in single cells, based on https://doi.org/10.1038/s41587-020-0591-3. Its detailed usage can be found here: https://scvelo.readthedocs.io.
Versions¶
0.2.4
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load scvelo/0.2.4
Interactive job¶
To run scVelo interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers scvelo/0.2.4
(base) UserID@bell-a008:~ $ python
Python 3.9.5 (default, Jun 4 2021, 12:28:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import scvelo as scv
>>> scv.set_figure_params()
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=scvelo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scvelo/0.2.4
python script.py
Scvi-tools¶
Introduction¶
scvi-tools (single-cell variational inference tools) is a package for end-to-end analysis of single-cell omics data primarily developed and maintained by the Yosef Lab at UC Berkeley.
Versions¶
0.16.2
Commands¶
python
python3
R
Rscript
Module¶
You can load the modules by:
module load biocontainers
module load scvi-tools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run scvi-tools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=scvi-tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers scvi-tools
Segalign¶
Introduction¶
Segalign is a scalable GPU system for pairwise whole genome alignments based on LASTZ’s seed-filter-extend paradigm.
Versions¶
0.1.2
Commands¶
faToTwoBit
run_segalign
run_segalign_repeat_masker
segalign
segalign_repeat_masker
twoBitToFa
Module¶
You can load the modules by:
module load biocontainers
module load segalign
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run segalign on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=segalign
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers segalign
Seidr¶
Introduction¶
Seidr
is a community gene network inference and exploration toolkit.
Versions¶
0.14.2
Commands¶
correlation
seidr
mi
pcor
narromi
plsnet
llr-ensemble
svm-ensemble
genie3
tigress
el-ensemble
makeconv
genrb
gencfu
gencnval
gendict
tomsimilarity
Module¶
You can load the modules by:
module load biocontainers
module load seidr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Seidr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seidr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seidr
Sepp¶
Introduction¶
Sepp
stands for SATé-Enabled Phylogenetic Placement and addresses the problem of phylogenetic placement for meta-genomic short reads.
Versions¶
4.5.1
Commands¶
run_sepp.py
run_upp.py
split_sequences.py
sumlabels.py
sumtrees.py
Module¶
You can load the modules by:
module load biocontainers
module load sepp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sepp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sepp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sepp
run_sepp.py -t mock/rpsS/sate.tre \
-r mock/rpsS/sate.tre.RAxML_info \
-a mock/rpsS/sate.fasta \
-f mock/rpsS/rpsS.even.fas \
-o rpsS.out.default
Seqcode¶
Introduction¶
SeqCode is a family of applications designed to produce high-quality images and perform genome-wide calculations from high-throughput sequencing experiments. The software is offered in two distinct modes: web tools and command line. The SeqCode website offers most functions to users with no previous expertise in bioinformatics, including operations on a selection of published ChIP-seq samples and applications that generate multiple classes of graphics from the user's data files. In contrast, the standalone version of SeqCode allows bioinformaticians to run each command on any type of sequencing data locally on their own computer. The source code is modular, and the input/output interface of the commands is suitable for integration into existing genome analysis pipelines. SeqCode is written in ANSI C, which favors compatibility with every UNIX platform and gives high performance when analyzing sequencing data. Meta-plots, heatmaps, boxplots and the other images produced by SeqCode are generated internally using R. SeqCode relies on the RefSeq reference annotations and can handle the genome and assembly release of every organism available from this consortium.
Versions¶
1.0
Commands¶
buildChIPprofile
combineChIPprofiles
combineTSSmaps
combineTSSplots
computemaxsignal
findPeaks
genomeDistribution
matchpeaks
matchpeaksgenes
processmacs
produceGENEmaps
produceGENEplots
producePEAKmaps
producePEAKplots
produceTESmaps
produceTESplots
produceTSSmaps
produceTSSplots
recoverChIPlevels
scorePhastCons
Module¶
You can load the modules by:
module load biocontainers
module load seqcode
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run seqcode on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqcode
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqcode
buildChIPprofile -vd ChromInfo.txt \
H3K4me3_sample.bam test_buildChIPprofile
Seqkit¶
Introduction¶
Seqkit
is a rapid tool for manipulating fasta and fastq files.
Versions¶
2.0.0
2.1.0
2.3.1
Commands¶
seqkit
Module¶
You can load the modules by:
module load biocontainers
module load seqkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Seqkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqkit
seqkit stats contigs.fasta > contigs_statistics.txt
Seqyclean¶
Introduction¶
Seqyclean is used to pre-process NGS data in order to prepare for downstream analysis. For more information, see Docker Hub (https://hub.docker.com/r/staphb/seqyclean) and the home page (https://github.com/ibest/seqyclean).
Versions¶
1.10.09
Commands¶
seqyclean
Module¶
You can load the modules by:
module load biocontainers
module load seqyclean
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run seqyclean on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=seqyclean
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers seqyclean
Shapeit4¶
Introduction¶
SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and high coverage sequencing data.
Versions¶
4.2.2
Commands¶
shapeit4
Module¶
You can load the modules by:
module load biocontainers
module load shapeit4
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shapeit4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shapeit4
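The job script above does not include a phasing command. As a hedged sketch, a basic SHAPEIT4 run takes an indexed VCF/BCF file (`--input`), a genetic map (`--map`), the region to phase (`--region`), and an output file (`--output`). `--thread` sets the number of threads. `phase_region`, the file names, and the use of `SLURM_NTASKS` with a fallback of 1 are assumptions made for illustration.

```shell
# Hypothetical wrapper: phase one region.
# Arguments: input VCF/BCF, genetic map, region, output file.
# Requires the shapeit4 module to be loaded.
phase_region() {
    shapeit4 --input "$1" --map "$2" --region "$3" --output "$4" --thread "${SLURM_NTASKS:-1}"
}

# Usage inside the job script (file names are placeholders):
#   phase_region unphased.vcf.gz chr20.b37.gmap.gz 20 phased.vcf.gz
```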
Shapeit5¶
Introduction¶
SHAPEIT5 is a software package to estimate haplotypes in large genotype datasets (WGS and SNP array).
Versions¶
5.1.1
Commands¶
phase_common
ligate
phase_rare
simulate
switch
xcftools
Module¶
You can load the modules by:
module load biocontainers
module load shapeit5
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shapeit5 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shapeit5
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shapeit5
Shasta¶
Introduction¶
Shasta is software for de novo assembly from Oxford Nanopore reads.
Versions¶
0.10.0
Commands¶
shasta
Module¶
You can load the modules by:
module load biocontainers
module load shasta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shasta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shasta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shasta
shasta --input r94_ec_rad2.181119.60x-10kb.fasta \
--config Nanopore-May2022
Shigeifinder¶
Introduction¶
Shigeifinder is a tool used to identify and differentiate Shigella/EIEC using cluster-specific genes and to identify the serotype using O-antigen/H-antigen genes. For more information, see Docker Hub (https://hub.docker.com/r/staphb/shigeifinder) and the home page (https://github.com/LanLab/ShigEiFinder).
Versions¶
1.3.2
Commands¶
shigeifinder
Module¶
You can load the modules by:
module load biocontainers
module load shigeifinder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shigeifinder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shigeifinder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shigeifinder
Shorah¶
Introduction¶
Shorah
is an open source project for the analysis of next generation sequencing data.
Versions¶
1.99.2
Commands¶
shorah
b2w
diri_sampler
fil
Module¶
You can load the modules by:
module load biocontainers
module load shorah
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Shorah on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shorah
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shorah
shorah amplicon -b ampli_sorted.bam -f reference.fasta
shorah shotgun -b test_aln.cram -f test_ref.fasta
shorah shotgun -a 0.1 -w 42 -x 100000 -p 0.9 -c 0 -r REF:42-272 -R 42 -b test_aln.cram -f ref.fasta
Shortstack¶
Introduction¶
Shortstack
is a tool for comprehensive annotation and quantification of small RNA genes.
Versions¶
3.8.5
Commands¶
ShortStack
Module¶
You can load the modules by:
module load biocontainers
module load shortstack
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Shortstack on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shortstack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shortstack
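The job script above stops after loading the module. As a hedged sketch of a typical ShortStack 3 call, the wrapper below passes a reads file, a genome FASTA, and an output directory; the function name `run_shortstack` and all file names are placeholders, and the options should be checked against `ShortStack --help` for the installed version.

```shell
# run_shortstack (hypothetical wrapper)
#   $1 = small RNA reads, $2 = genome FASTA, $3 = output directory
# The bowtie thread count follows the Slurm task count (1 outside a job).
run_shortstack() {
    ShortStack --readfile "$1" --genomefile "$2" \
        --outdir "$3" --bowtie_cores "${SLURM_NTASKS:-1}"
}

# In the job script, after the module lines:
#   run_shortstack reads.fastq genome.fasta shortstack_out
```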
Shovill¶
Introduction¶
Shovill is a tool to assemble bacterial isolate genomes from Illumina paired-end reads.
Versions¶
1.1.0
Commands¶
shovill
Module¶
You can load the modules by:
module load biocontainers
module load shovill
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run shovill on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=shovill
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers shovill
shovill --outdir out \
--R1 test/R1.fq.gz \
--R2 test/R2.fq.gz
Sicer¶
Introduction¶
Sicer
is a clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
Versions¶
1.1
Commands¶
SICER-df-rb.sh
SICER-df.sh
SICER-rb.sh
SICER.sh
Module¶
You can load the modules by:
module load biocontainers
module load sicer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sicer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sicer
SICER.sh ./ test.bed control.bed . hg18 1 200 150 0.74 600 .01
SICER-rb.sh ./ test.bed . hg18 1 200 150 0.74 400 100
Sicer2¶
Introduction¶
Sicer2
is the redesigned and improved ChIP-seq broad peak calling tool SICER.
Versions¶
1.0.3
1.2.0
Commands¶
sicer
sicer_df
recognicer
recognicer_df
Module¶
You can load the modules by:
module load biocontainers
module load sicer2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sicer2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sicer2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sicer2
sicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
-c ./test/control_1.bed ./test/control_2.bed \
-s hg38 --significant_reads
recognicer_df -t ./test/treatment_1.bed ./test/treatment_2.bed \
-c ./test/control_1.bed ./test/control_2.bed \
-s hg38 --significant_reads
SignalP¶
Introduction¶
SignalP
predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.
Versions¶
4.1
Commands¶
signalp
Module¶
You can load the modules by:
module load biocontainers
module load signalp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SignalP on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=signalp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp
signalp -t gram+ -f all proka.fasta > proka_out
signalp -t euk -f all euk.fasta > euk.out
Signalp6¶
Introduction¶
SignalP predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes.
Versions¶
6.0-fast
6.0-slow
Commands¶
signalp6
Module¶
You can load the modules by:
module load biocontainers
module load signalp6
Example job for fast mode¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run signalp6 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 2:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-fast
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp6/6.0-fast
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_fast \
--format txt --mode fast
Example job for slow mode¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run signalp6 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 12:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=signalp6-slow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers signalp6/6.0-slow
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_slow \
--format txt --mode slow
signalp6 --write_procs 24 --fastafile proteins_clean.fasta \
--organism euk --output_dir output_slow-sequential \
--format txt --mode slow-sequential
Simug¶
Introduction¶
Simug
is a general-purpose genome simulator.
Versions¶
1.0.0
Commands¶
simuG
vcf2model
Module¶
You can load the modules by:
module load biocontainers
module load simug
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Simug on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=simug
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers simug
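No command follows the module lines above. A minimal sketch of a SNP simulation is given below, assuming the `simuG` command accepts the `-refseq`, `-snp_count`, and `-prefix` options described in the upstream README; the wrapper name `simulate_snps` and the file names are placeholders.

```shell
# simulate_snps (hypothetical wrapper)
#   $1 = reference FASTA, $2 = number of SNPs to introduce, $3 = output prefix
simulate_snps() {
    simuG -refseq "$1" -snp_count "$2" -prefix "$3"
}

# In the job script, after the module lines:
#   simulate_snps reference.fasta 1000 simulated
```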
Singlem¶
Introduction¶
SingleM is a tool for profiling shotgun metagenomes. It has a particular strength in detecting microbial lineages which are not in reference databases. The method it uses also makes it suitable for some related tasks, such as assessing eukaryotic contamination, finding bias in genome recovery, computing ecological diversity metrics, and lineage-targeted MAG recovery.
Versions¶
0.13.2
Commands¶
singlem
Module¶
You can load the modules by:
module load biocontainers
module load singlem
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run singlem on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=singlem
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers singlem
Ska¶
Introduction¶
SKA (Split Kmer Analysis) is a toolkit for prokaryotic (and any other small, haploid) DNA sequence analysis using split kmers. A split kmer is a pair of kmers in a DNA sequence that are separated by a single base. Split kmers allow rapid comparison and alignment of small genomes, and are particularly suited for surveillance or outbreak investigation. SKA can produce split kmer files from fasta format assemblies or directly from fastq format read sequences, cluster them, align them with or without a reference sequence and provide various comparison and summary statistics. Currently all testing has been carried out on high-quality Illumina read data, so results for other platforms may vary.
Versions¶
1.0
Commands¶
ska
Module¶
You can load the modules by:
module load biocontainers
module load ska
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run ska on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=ska
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ska
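The script above loads the module but runs nothing. One possible first step, producing a split kmer file per assembly with `ska fasta`, is sketched below. The helper name `sketch_assemblies` and the `.fasta` suffix are assumptions made for illustration.

```shell
# sketch_assemblies (hypothetical helper): run "ska fasta" once per
# assembly, using the file name without ".fasta" as the output prefix.
sketch_assemblies() {
    for asm in "$@"; do
        name=$(basename "$asm" .fasta)
        ska fasta -o "$name" "$asm"
    done
}

# In the job script, after the module lines:
#   sketch_assemblies assemblies/*.fasta
```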
Skewer¶
Introduction¶
Skewer
is a fast and accurate adapter trimmer for paired-end reads.
Versions¶
0.2.2
Commands¶
skewer
Module¶
You can load the modules by:
module load biocontainers
module load skewer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Skewer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=skewer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers skewer
skewer -l 50 -m pe -o skewerQ30 --mean-quality 30 \
--end-quality 30 -t 10 -x TruSeq3-PE.fa \
input_1.fastq input_2.fastq
Slamdunk¶
Introduction¶
Slamdunk is a novel, fully automated software tool for robust, scalable and reproducible SLAMseq data analysis.
Versions¶
0.4.3
Commands¶
slamdunk
alleyoop
Module¶
You can load the modules by:
module load biocontainers
module load slamdunk
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run slamdunk on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=slamdunk
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers slamdunk
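The job above ends at the module lines. A hedged sketch of the full pipeline via `slamdunk all` follows; the wrapper name `run_slamdunk` and the file names are placeholders, and only the reference, BED, output directory, and thread options are set here. Other options (read trimming, read length, and so on) are left at their defaults and may need adjusting for real data.

```shell
# run_slamdunk (hypothetical wrapper)
#   $1 = reference FASTA, $2 = 3' UTR BED file, $3 = output directory,
#   remaining arguments = FASTQ files
run_slamdunk() {
    ref=$1; bed=$2; outdir=$3
    shift 3
    slamdunk all -r "$ref" -b "$bed" -o "$outdir" \
        -t "${SLURM_NTASKS:-1}" "$@"
}

# In the job script, after the module lines:
#   run_slamdunk genome.fa utrs.bed slamdunk_out sample1.fastq.gz sample2.fastq.gz
```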
Smoove¶
Introduction¶
Smoove
simplifies and speeds calling and genotyping SVs for short reads.
Versions¶
0.2.7
Commands¶
smoove
Module¶
You can load the modules by:
module load biocontainers
module load smoove
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Smoove on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=smoove
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers smoove
smoove call \
-x --name my-cohort \
--exclude hg38_blacklist.bed \
--fasta Homo_sapiens.GRCh38.dna.primary_assembly.fa \
-p 24 \
--genotype input_bams/*.bam
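The thread count given to `-p` above is typed by hand to match `#SBATCH -n 24`. To keep the two from drifting apart, the value can be read from the environment that Slurm sets for the job. This is a sketch; `slurm_threads` is a hypothetical helper name.

```shell
# slurm_threads (hypothetical helper): print the task count granted by
# Slurm (set by "#SBATCH -n"), or 1 when run outside a job.
slurm_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# In the job script, replace the literal 24 with the helper:
#   smoove call ... -p "$(slurm_threads)" ...
```

The same helper works for any tool in this catalog that takes a thread or core count.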
Snakemake¶
Introduction¶
Snakemake
is a workflow engine that provides a readable Python-based workflow definition language and a powerful execution environment that scales from single-core workstations to compute clusters without modifying the workflow.
Versions¶
6.8.0
Commands¶
snakemake
Module¶
You can load the modules by:
module load biocontainers
module load snakemake
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snakemake on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snakemake
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snakemake
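The script above does not yet start a workflow. A minimal sketch is shown below: it performs a dry run first, so that missing inputs or rule errors surface before any real work starts, and then runs the workflow with the cores granted by Slurm. The function name `run_workflow` and the Snakefile path are placeholders.

```shell
# run_workflow (hypothetical wrapper)
#   $1 = path to the Snakefile
# Step 1: dry run (-n) to validate the workflow.
# Step 2: real run, only if the dry run succeeded.
run_workflow() {
    snakemake --snakefile "$1" --cores "${SLURM_NTASKS:-1}" -n &&
    snakemake --snakefile "$1" --cores "${SLURM_NTASKS:-1}"
}

# In the job script, after the module lines:
#   run_workflow Snakefile
```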
Snap¶
Introduction¶
Snap
is a semi-HMM-based Nucleic Acid Parser – gene prediction tool.
Versions¶
2013_11_29
Commands¶
snap
Module¶
You can load the modules by:
module load biocontainers
module load snap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snap
Snap-aligner¶
Introduction¶
Snap-aligner
(Scalable Nucleotide Alignment Program) is a fast and accurate read aligner for high-throughput sequencing data.
Versions¶
2.0.0
Commands¶
snap-aligner
Module¶
You can load the modules by:
module load biocontainers
module load snap-aligner
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snap-aligner on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snap-aligner
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snap-aligner
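No alignment command is included above. The sketch below shows the usual two steps, building an index and then aligning paired-end reads. It assumes the `index` and `paired` subcommands and the `-o`/`-t` options of snap-aligner; the function name `align_pairs` and the file names are placeholders.

```shell
# align_pairs (hypothetical wrapper)
#   $1 = reference FASTA, $2 = index directory,
#   $3 = R1 FASTQ, $4 = R2 FASTQ, $5 = output file
# The index is built only when the index directory does not exist yet.
align_pairs() {
    [ -d "$2" ] || snap-aligner index "$1" "$2" || return 1
    snap-aligner paired "$2" "$3" "$4" -o "$5" -t "${SLURM_NTASKS:-1}"
}

# In the job script, after the module lines:
#   align_pairs genome.fa snap_index reads_1.fastq reads_2.fastq aligned.sam
```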
Snaptools¶
Introduction¶
Snaptools
is a Python module for pre-processing and working with snap files.
Versions¶
1.4.8
Commands¶
snaptools
Module¶
You can load the modules by:
module load biocontainers
module load snaptools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snaptools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snaptools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snaptools
Snippy¶
Introduction¶
Snippy
is a tool for rapid haploid variant calling and core genome alignment.
Versions¶
4.6.0
Commands¶
snippy
snippy-clean_full_aln
snippy-core
snippy-multi
snippy-vcf_extract_subs
snippy-vcf_report
snippy-vcf_to_tab
Module¶
You can load the modules by:
module load biocontainers
module load snippy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snippy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snippy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snippy
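The script above loads Snippy without calling it. A sketch of a single-sample run on paired-end reads follows; the wrapper name `run_snippy` and the file names are placeholders.

```shell
# run_snippy (hypothetical wrapper)
#   $1 = reference (FASTA or GenBank), $2 = R1 reads, $3 = R2 reads,
#   $4 = output directory
run_snippy() {
    snippy --cpus "${SLURM_NTASKS:-1}" --outdir "$4" \
        --ref "$1" --R1 "$2" --R2 "$3"
}

# In the job script, after the module lines:
#   run_snippy reference.gbk sample_R1.fastq.gz sample_R2.fastq.gz snippy_out
```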
Snp-dists¶
Introduction¶
Snp-dists is a tool to convert a FASTA alignment to a SNP distance matrix.
Versions¶
0.8.2
Commands¶
snp-dists
Module¶
You can load the modules by:
module load biocontainers
module load snp-dists
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snp-dists on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-dists
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snp-dists
snp-dists test/good.aln > distances.tab
Snpeff¶
Introduction¶
Snpeff
is an open source tool that annotates variants and predicts their effects on genes by using an interval forest approach.
Versions¶
5.1d
5.1
Commands¶
snpEff
Module¶
You can load the modules by:
module load biocontainers
module load snpeff
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
By default, snpEff only uses 1gb
of memory. To allocate more memory, add the -Xmx
flag to your command:
snpEff -Xmx10g ## To allocate 10gb of memory.
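Rather than hard-coding the heap size, it can be derived from the memory requested for the job. The sketch below assumes the job was submitted with `--mem`, in which case Slurm exports `SLURM_MEM_PER_NODE` in megabytes; the 80% headroom factor and the helper name `java_heap_flag` are illustrative choices, not a site recommendation.

```shell
# java_heap_flag (hypothetical helper): print an -Xmx flag sized to
# about 80% of the job's memory request, with a floor of 1 GB.
# Falls back to -Xmx1g when SLURM_MEM_PER_NODE is not set.
java_heap_flag() {
    mem_mb="${SLURM_MEM_PER_NODE:-0}"
    heap_gb=$(( $mem_mb * 8 / 10 / 1024 ))
    [ "$heap_gb" -ge 1 ] || heap_gb=1
    echo "-Xmx${heap_gb}g"
}

# In the job script:
#   snpEff "$(java_heap_flag)" GRCh37.75 examples/test.chr22.vcf > test.chr22.ann.vcf
```

Leaving some memory outside the Java heap helps avoid the job being killed for exceeding its request.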
To run Snpeff on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpeff
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpeff
snpEff GRCh37.75 examples/test.chr22.vcf > test.chr22.ann.vcf
Snpgenie¶
Introduction¶
Snpgenie
is a collection of Perl scripts for estimating πN/πS, dN/dS, and gene diversity from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.
Versions¶
1.0
Commands¶
fasta2revcom.pl
gtf2revcom.pl
snpgenie.pl
snpgenie_between_group.pl
snpgenie_between_group_processor.pl
snpgenie_within_group.pl
snpgenie_within_group_processor.pl
vcf2revcom.pl
Module¶
You can load the modules by:
module load biocontainers
module load snpgenie
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snpgenie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpgenie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpgenie
snpgenie.pl --minfreq=0.01 --snpreport=CLC_SNP_EXAMPLE.txt \
--fastafile=REFERENCE_EXAMPLE.fasta --gtffile=CDS_EXAMPLE.gtf
Snphylo¶
Introduction¶
Snphylo is a pipeline to generate a phylogenetic tree from huge SNP data.
Versions¶
20180901
Commands¶
Rscript
snphylo.sh
convert_fasta_to_phylip.py
convert_simple_to_hapmap.py
determine_bs_tree.R
draw_unrooted_tree.R
generate_snp_sequence.R
remove_low_depth_genotype_data.py
remove_no_genotype_data.py
Module¶
You can load the modules by:
module load biocontainers
module load snphylo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snphylo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snphylo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snphylo
Snpsift¶
Introduction¶
Snpsift
is a tool that annotates genomic variants using databases, and filters and manipulates annotated genomic variants.
Versions¶
4.3.1t
Commands¶
SnpSift
Module¶
You can load the modules by:
module load biocontainers
module load snpsift
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Snpsift on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snpsift
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snpsift
SnpSift annotate -id dbSnp132.vcf \
variants.vcf > variants_annotated.vcf
Snp-sites¶
Introduction¶
SNP-sites is a tool that rapidly extracts SNPs from a multi-FASTA alignment.
Versions¶
2.5.1
Commands¶
snp-sites
Module¶
You can load the modules by:
module load biocontainers
module load snp-sites
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run snp-sites on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=snp-sites
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers snp-sites
snp-sites salmonella_serovars_core_genes.aln
Soapdenovo2¶
Introduction¶
Soapdenovo2
is a short-read assembly method to build a de novo draft assembly.
Versions¶
2.40
Commands¶
SOAPdenovo-127mer
SOAPdenovo-63mer
Module¶
You can load the modules by:
module load biocontainers
module load soapdenovo2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Soapdenovo2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=soapdenovo2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers soapdenovo2
SOAPdenovo-127mer all -s config_file -K 63 -R -o graph_prefix 1>ass.log 2>ass.err
SortMeRNA¶
Introduction¶
SortMeRNA
is a local sequence alignment tool for filtering, mapping and clustering.
Versions¶
2.1b
4.3.4
Commands¶
sortmerna
Module¶
You can load the modules by:
module load biocontainers
module load sortmerna
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SortMeRNA on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sortmerna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sortmerna
sortmerna --ref silva-bac-16s-id90.fasta,silva-bac-16s-db \
--reads set2_environmental_study_550_amplicon.fasta \
--fastx --aligned Test
Souporcell¶
Introduction¶
souporcell is a method for clustering mixed-genotype scRNAseq experiments by individual.
Versions¶
2.0
Commands¶
check_modules.py
compile_stan_model.py
consensus.py
renamer.py
retag.py
shared_samples.py
souporcell.py
souporcell_pipeline.py
Module¶
You can load the modules by:
module load biocontainers
module load souporcell
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run souporcell on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=souporcell
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers souporcell
souporcell_pipeline.py -i A.merged.bam \
-b GSM2560245_barcodes.tsv \
-f refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa \
-t 8 -o demux_data_test -k 4
Sourmash¶
Introduction¶
Sourmash
is a tool for quickly searching, comparing, and analyzing genomic and metagenomic data sets.
Versions¶
4.3.0
4.5.0
Commands¶
sourmash
Module¶
You can load the modules by:
module load biocontainers
module load sourmash
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Sourmash on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sourmash
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sourmash
sourmash sketch dna -p k=31 *.fna.gz
sourmash compare *.sig -o cmp.dist
sourmash plot cmp.dist --labels
Spaceranger¶
Introduction¶
Spaceranger
is a set of analysis pipelines that process Visium Spatial Gene Expression data with brightfield and fluorescence microscope images.
Versions¶
1.3.0
1.3.1
2.0.0
Commands¶
spaceranger
Module¶
You can load the modules by:
module load biocontainers
module load spaceranger
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Spaceranger on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=spaceranger
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers spaceranger
# --id: output directory; --transcriptome: path to reference;
# --fastqs: path to FASTQs; --sample: sample name from FASTQ filename;
# --image: path to brightfield image; --slide: slide ID; --area: capture area;
# --localcores / --localmem: allowed cores and memory (GB) in local mode.
spaceranger count --id=sample345 \
--transcriptome=/opt/refdata/GRCh38-2020-A \
--fastqs=/home/jdoe/runs/HAWT7ADXX/outs/fastq_path \
--sample=mysample \
--image=/home/jdoe/runs/images/sample345.tiff \
--slide=V19J01-123 \
--area=A1 \
--localcores=8 \
--localmem=64
SPAdes¶
Introduction¶
SPAdes
- St. Petersburg genome assembler - is an assembly toolkit containing various assembly pipelines.
Detailed usage can be found here: https://github.com/ablab/spades
Versions¶
3.15.3
3.15.4
3.15.5
Commands¶
coronaspades.py
metaplasmidspades.py
metaspades.py
metaviralspades.py
plasmidspades.py
rnaspades.py
rnaviralspades.py
spades.py
spades_init.py
truspades.py
spades-bwa
spades-convert-bin-to-fasta
spades-core
spades-corrector-core
spades-gbuilder
spades-gmapper
spades-gsimplifier
spades-hammer
spades-ionhammer
spades-kmer-estimating
spades-kmercount
spades-read-filter
spades-truseq-scfcorrection
Module¶
You can load the modules by:
module load biocontainers
module load spades
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run spades on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=spades
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers spades
spades.py --pe1-1 SRR11234553_1.fastq --pe1-2 SRR11234553_2.fastq -o spades_out -t 24
Sprod¶
Introduction¶
Sprod: De-noising Spatially Resolved Transcriptomics Data Based on Position and Image Information.
Versions¶
1.0
Commands¶
python
python3
sprod.py
Module¶
You can load the modules by:
module load biocontainers
module load sprod
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run sprod on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=sprod
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sprod
python3 test_examples.py
Squeezemeta¶
Introduction¶
SqueezeMeta is a fully automated metagenomics pipeline, from reads to bins.
Versions¶
1.5.1
Commands¶
01.merge_assemblies.pl
01.merge_sequential.pl
01.remap.pl
01.run_assembly.pl
01.run_assembly_merged.pl
02.rnas.pl
03.run_prodigal.pl
04.rundiamond.pl
05.run_hmmer.pl
06.lca.pl
07.fun3assign.pl
08.blastx.pl
09.summarycontigs3.pl
10.mapsamples.pl
11.mcount.pl
12.funcover.pl
13.mergeannot2.pl
14.runbinning.pl
15.dastool.pl
16.addtax2.pl
17.checkM_batch.pl
18.getbins.pl
19.getcontigs.pl
20.minpath.pl
21.stats.pl
SqueezeMeta.pl
SqueezeMeta_conf.pl
SqueezeMeta_conf_original.pl
parameters.pl
restart.pl
add_database.pl
cover.pl
sqm2ipath.pl
sqm2itol.pl
sqm2keggplots.pl
sqm2pavian.pl
sqm_annot.pl
sqm_hmm_reads.pl
sqm_longreads.pl
sqm_mapper.pl
sqm_reads.pl
versionchange.pl
find_missing_markers.pl
remove_duplicate_markers.pl
anvi-filter-sqm.py
anvi-load-sqm.py
sqm2anvio.pl
configure_nodb.pl
configure_nodb_alt.pl
download_databases.pl
make_databases.pl
make_databases_alt.pl
test_install.pl
Module¶
You can load the modules by:
module load biocontainers
module load squeezemeta
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run squeezemeta on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squeezemeta
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers squeezemeta
SqueezeMeta.pl -m coassembly -p Hadza -s test.samples -f raw
Squid¶
Introduction¶
SQUID is designed to detect both fusion-gene and non-fusion-gene transcriptomic structural variations from RNA-seq alignment.
Versions¶
1.5
Commands¶
squid
AnnotateSQUIDOutput.py
Module¶
You can load the modules by:
module load biocontainers
module load squid
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run squid on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=squid
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers squid
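The script above does not call SQUID. A basic invocation on a single alignment file, using the `-b` (input BAM) and `-o` (output prefix) options, is sketched below. The wrapper name `run_squid` and the file names are placeholders, and the input check is an added convenience rather than something SQUID requires.

```shell
# run_squid (hypothetical wrapper)
#   $1 = RNA-seq alignment in BAM format, $2 = output prefix
# Refuses to start when the BAM file is missing or empty.
run_squid() {
    if [ ! -s "$1" ]; then
        echo "BAM file missing or empty: $1" >&2
        return 1
    fi
    squid -b "$1" -o "$2"
}

# In the job script, after the module lines:
#   run_squid alignment.bam squid_out
```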
SRA-Toolkit¶
Introduction¶
SRA-Toolkit
is a collection of tools and libraries for using data in the INSDC Sequence Read Archives. Its detailed documentation can be found at https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc.
Versions¶
2.11.0-pl5262
Commands¶
abi-dump
align-cache
align-info
bam-load
cache-mgr
cg-load
fasterq-dump
fasterq-dump-orig
fastq-dump
fastq-dump-orig
illumina-dump
kar
kdbmeta
kget
latf-load
md5cp
prefetch
prefetch-orig
rcexplain
read-filter-redact
sam-dump
sam-dump-orig
sff-dump
sra-pileup
sra-pileup-orig
sra-sort
sra-sort-cg
sra-stat
srapath
srapath-orig
sratools
test-sra
vdb-config
vdb-copy
vdb-diff
vdb-dump
vdb-encrypt
vdb-lock
vdb-passwd
vdb-unlock
vdb-validate
Module¶
You can load the modules by:
module load biocontainers
module load sra-tools/2.11.0-pl5262
Configuring SRA-Toolkit¶
Users can configure SRA-Toolkit with the vdb-config command. For example, the command below sets the current working directory as the download location:
vdb-config --prefetch-to-cwd
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run SRA-Toolkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=SRA-Toolkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers sra-tools/2.11.0-pl5262
vdb-config --prefetch-to-cwd # The data will be downloaded to the current working directory.
prefetch SRR11941281
fastq-dump --split-3 SRR11941281/SRR11941281.sra
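The job above handles one accession. For several runs, the same two commands can be wrapped in a loop over a list file. This is a sketch: the function name `fetch_accessions` and the list file name are hypothetical, and it assumes `vdb-config --prefetch-to-cwd` has already been run so that each download lands in `./<accession>/<accession>.sra`, as in the example above.

```shell
# fetch_accessions (hypothetical helper)
#   $1 = text file with one run accession per line
# Blank lines and lines starting with "#" are skipped. A failed
# download is reported and the loop moves on to the next accession.
fetch_accessions() {
    while IFS= read -r acc || [ -n "$acc" ]; do
        case "$acc" in
            ''|'#'*) continue ;;
        esac
        prefetch "$acc" </dev/null &&
            fastq-dump --split-3 "$acc/$acc.sra" </dev/null ||
            echo "failed: $acc" >&2
    done < "$1"
}

# In the job script, after vdb-config:
#   fetch_accessions accessions.txt
```

Standard input of both tools is redirected from `/dev/null` so they cannot consume lines from the list being read by the loop.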
Srst2¶
Introduction¶
Srst2 is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc.) and report the presence of STs and/or reference genes. For more information, please check: Docker hub: https://hub.docker.com/r/staphb/srst2 Home page: https://github.com/katholt/srst2
Versions¶
0.2.0
Commands¶
getmlst.py
srst2
slurm_srst2.py
Module¶
You can load the modules by:
module load biocontainers
module load srst2
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run srst2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=srst2
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers srst2
Stacks¶
Introduction¶
Stacks
is a software pipeline for building loci from RAD-seq.
Versions¶
2.60
Commands¶
clone_filter
count_fixed_catalog_snps.py
cstacks
denovo_map.pl
gstacks
integrate_alignments.py
kmer_filter
phasedstacks
populations
process_radtags
process_shortreads
ref_map.pl
sstacks
stacks-dist-extract
stacks-gdb
stacks-integrate-alignments
tsv2bam
ustacks
Module¶
You can load the modules by:
module load biocontainers
module load stacks
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Stacks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=stacks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stacks
denovo_map.pl -T 8 -M 4 -o ./stacks/ \
--samples ./samples --popmap ./popmaps/popmap
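The --popmap argument above points at a population map. Assuming the usual Stacks layout (one sample per line, with the sample name and the population separated by a tab), a malformed map can be caught before the job starts. The function name below is hypothetical.

```shell
#!/bin/bash
# Hypothetical check: count non-blank popmap lines that do not have a
# non-empty sample name and population separated by a tab.
popmap_bad_lines() {
    awk -F'\t' '
        NF > 0 && (NF < 2 || $1 == "" || $2 == "") { bad++ }
        END { print bad + 0 }
    ' "$1"
}
```

A result of 0 for popmap_bad_lines ./popmaps/popmap suggests the file is at least structurally consistent. It does not confirm that the sample files exist.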
STAR¶
Introduction¶
STAR
: ultrafast universal RNA-seq aligner.
Detailed usage can be found here: https://github.com/alexdobin/STAR
Versions¶
2.7.10a
2.7.10b
2.7.9a
Commands¶
STAR
STARlong
Module¶
You can load the modules by:
module load biocontainers
module load star/2.7.10a
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STAR on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=star
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers star/2.7.10a
STAR --runThreadN 24 --runMode genomeGenerate --genomeDir ref_genome --genomeFastaFiles ref_genome.fasta
STAR --runThreadN 24 --genomeDir ref_genome --readFilesIn seq_1.fastq seq_2.fastq --outSAMtype BAM SortedByCoordinate --outWigType wiggle read2
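The job above writes the core count twice, once in #SBATCH -n 24 and once in --runThreadN 24, and the two can drift apart when the script is edited. A minimal sketch of one way to keep them in sync reads SLURM_NTASKS, which Slurm sets when -n is given. The function name and the fallback value are assumptions.

```shell
#!/bin/bash
# Sketch: take the thread count from Slurm instead of hard-coding it.
# SLURM_NTASKS is set by Slurm when -n/--ntasks is requested; the optional
# first argument (default 1) is a fallback for runs outside a job.
threads_from_slurm() {
    echo "${SLURM_NTASKS:-${1:-1}}"
}
```

In the job script, --runThreadN 24 could then be written as --runThreadN "$(threads_from_slurm)".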
Staramr¶
Introduction¶
staramr scans bacterial genome contigs against the ResFinder, PointFinder, and PlasmidFinder databases (used by the ResFinder webservice and other webservices offered by the Center for Genomic Epidemiology) and compiles a summary report of detected antimicrobial resistance genes.
Versions¶
0.7.1
Commands¶
staramr
Module¶
You can load the modules by:
module load biocontainers
module load staramr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run staramr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=staramr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers staramr
staramr db info
staramr search \
--pointfinder-organism salmonella \
-o out *.fasta
STAR-Fusion¶
Introduction¶
STAR-Fusion
is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT).
Versions¶
1.11b
Commands¶
STAR-Fusion
Module¶
You can load the modules by:
module load biocontainers
module load starfusion
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STAR-Fusion on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=starfusion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers starfusion
STAR-Fusion --CPU 24 --left_fq ../star/SRR12095148_1.fastq --right_fq ../star/SRR12095148_2.fastq \
--genome_lib_dir GRCh38_gencode_v33_CTAT_lib_Apr062020.plug-n-play/ctat_genome_lib_build_dir \
--FusionInspector validate \
--denovo_reconstruct \
--examine_coding_effect \
--output_dir STAR-Fusion-output
STREAM¶
Introduction¶
STREAM
(Single-cell Trajectories Reconstruction, Exploration And Mapping) is an interactive pipeline capable of disentangling and visualizing complex branching trajectories from both single-cell transcriptomic and epigenomic data.
Versions¶
1.0
Commands¶
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load stream
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run STREAM on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stream
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stream
Stringdecomposer¶
Introduction¶
Stringdecomposer is a tool for decomposing centromeric assemblies and long reads into monomers.
Versions¶
1.1.2
Commands¶
stringdecomposer
Module¶
You can load the modules by:
module load biocontainers
module load stringdecomposer
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run stringdecomposer on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=stringdecomposer
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stringdecomposer
StringTie¶
Introduction¶
StringTie
: efficient transcript assembly and quantitation of RNA-Seq data.
Stringtie employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome. It takes as input spliced alignments in coordinate-sorted SAM/BAM/CRAM format and produces a GTF output which consists of assembled transcript structures and their estimated expression levels (FPKM/TPM and base coverage values).
Detailed usage can be found here: https://github.com/gpertea/stringtie
Versions¶
2.1.7
2.2.1
Commands¶
stringtie
Module¶
You can load the modules by:
module load biocontainers
module load stringtie
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run stringtie on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=stringtie
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers stringtie
stringtie -o SRR11614710.gtf -G Homo_sapiens.GRCh38.105.gtf SRR11614710Aligned.sortedByCoord.out.bam
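The example pairs a STAR-style BAM name (SRR11614710Aligned.sortedByCoord.out.bam) with an output named SRR11614710.gtf. When many samples are processed, that naming can be derived rather than typed. The helper below is a sketch with a hypothetical name, and it assumes the BAM files keep the suffix shown above.

```shell
#!/bin/bash
# Sketch: derive the GTF output name from a STAR-style BAM file name.
# If the suffix is absent, the whole file name is kept and ".gtf" is appended.
gtf_name_for() {
    local bam base
    bam=$(basename "$1")
    base="${bam%Aligned.sortedByCoord.out.bam}"
    echo "${base}.gtf"
}

# Possible use inside the job script (commented out so this block has no
# side effects when sourced):
# for bam in *Aligned.sortedByCoord.out.bam; do
#     stringtie -o "$(gtf_name_for "$bam")" -G Homo_sapiens.GRCh38.105.gtf "$bam"
# done
```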
Strique¶
Introduction¶
STRique is a python package to analyze repeat expansion and methylation states of short tandem repeats (STR) in Oxford Nanopore Technology (ONT) long read sequencing data.
Versions¶
0.4.2
Commands¶
STRique.py
STRique_test.py
fast5Masker.py
Module¶
You can load the modules by:
module load biocontainers
module load strique
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run strique on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=strique
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers strique
STRique_test.py
STRique.py index data/ > data/reads.fofn
cat data/c9orf72.sam | STRique.py count ./data/reads.fofn ./models/r9_4_450bps.model ./configs/repeat_config.tsv --config ./configs/STRique.json
Structure¶
Introduction¶
Structure is a software package for using multi-locus genotype data to investigate population structure.
Versions¶
2.3.4
Commands¶
structure
Module¶
You can load the modules by:
module load biocontainers
module load structure
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run structure on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=structure
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers structure
Subread¶
Introduction¶
Subread
carries out high-performance read alignment, quantification and mutation discovery. It is a general-purpose read aligner which can be used to map both genomic DNA-seq reads and RNA-seq reads. It uses a new mapping paradigm called seed-and-vote to achieve fast, accurate and scalable read mapping. Subread automatically determines whether a read should be globally or locally aligned, and is therefore particularly powerful in mapping RNA-seq reads. It supports INDEL detection and can map reads with both fixed and variable lengths.
Versions¶
1.6.4
2.0.1
Commands¶
detectionCall
exactSNP
featureCounts
flattenGTF
genRandomReads
propmapped
qualityScores
removeDup
repair
subindel
subjunc
sublong
subread-align
subread-buildindex
subread-fullscan
txUnique
Module¶
You can load the modules by:
module load biocontainers
module load subread
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Subread on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=subread
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers subread
featureCounts -s 2 -p -Q 10 -T 4 -a genome.gtf -o featurecounts.txt mapped.bam
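The file written by -o (featurecounts.txt above) carries annotation columns in addition to the counts. Assuming the usual featureCounts layout (a leading comment line that starts with #, then a header whose first column is Geneid and whose seventh and later columns hold the per-BAM counts), the counts can be reduced to a plain matrix. The function name is hypothetical.

```shell
#!/bin/bash
# Sketch: keep only the gene identifier and the count columns.
# Assumes column 1 is Geneid, columns 2-6 are Chr/Start/End/Strand/Length,
# and the counts start at column 7.
counts_matrix() {
    grep -v '^#' "$1" | cut -f1,7-
}
```

For the job above, counts_matrix featurecounts.txt > counts.tsv would give a two-column table (counts.tsv is a placeholder name).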
Survivor¶
Introduction¶
SURVIVOR is a tool set for simulating/evaluating SVs, merging and comparing SVs within and among samples, and includes various methods to reformat or summarize SVs.
Versions¶
1.0.7
Commands¶
SURVIVOR
Module¶
You can load the modules by:
module load biocontainers
module load survivor
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run survivor on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=survivor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers survivor
SURVIVOR simSV parameter_file
SURVIVOR simSV ref.fa parameter_file 0.1 0 simulated
SURVIVOR eval caller.vcf simulated.bed 10 eval_res
Svaba¶
Introduction¶
SvABA is a method for detecting structural variants in sequencing data using genome-wide local assembly.
Versions¶
1.1.0
Commands¶
svaba
Module¶
You can load the modules by:
module load biocontainers
module load svaba
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svaba on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=svaba
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svaba
DBSNP=dbsnp_indel.vcf
TUM_BAM=G15512.HCC1954.1.COST16011_region.bam
NORM_BAM=HCC1954.NORMAL.30x.compare.COST16011_region.bam
CORES=8 ## set any number of cores
REF=Homo_sapiens_assembly19.COST16011_region.fa
svaba run -t $TUM_BAM -n $NORM_BAM \
-p $CORES -D $DBSNP \
-a somatic_run -G $REF
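Because the job above refers to several input files through variables, a mistyped path only shows up once the job is already running. A short guard placed before the svaba call can fail early instead. This is a sketch; the function name require_files is an assumption.

```shell
#!/bin/bash
# Sketch: return non-zero if any listed file is missing or empty,
# reporting each problem on stderr.
require_files() {
    local f missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In the job script this could be called as require_files "$DBSNP" "$TUM_BAM" "$NORM_BAM" "$REF" || exit 1 just before svaba run.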
Svtools¶
Introduction¶
Svtools is a suite of utilities designed to help bioinformaticians construct and explore cohort-level structural variation calls.
Versions¶
0.5.1
Commands¶
svtools
Module¶
You can load the modules by:
module load biocontainers
module load svtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svtools
Svtyper¶
Introduction¶
SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. svtyper is the original implementation of the genotyping algorithm, and works with multiple samples. svtyper-sso is an alternative implementation of svtyper that is optimized for genotyping a single sample. It is parallelized and takes advantage of multiple CPU cores via the multiprocessing module, which can offer a 2x or greater speedup (depending on how many CPU cores are used) when genotyping a single sample. NOTE: svtyper-sso is not yet stable. There are minor logging differences between the two, and svtyper-sso may exit prematurely with an error when processing CRAM files.
Versions¶
0.7.1
Commands¶
svtyper
svtyper-sso
python
python2
Module¶
You can load the modules by:
module load biocontainers
module load svtyper
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run svtyper on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=svtyper
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers svtyper
svtyper \
-i data/example.vcf \
-B data/NA12878.target_loci.sorted.bam \
-l data/NA12878.bam.json \
> out.vcf
swat¶
Introduction¶
swat
is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties.
Versions¶
1.090518
Commands¶
swat
Module¶
You can load the modules by:
module load biocontainers
module load swat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run swat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=swat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers swat
Syri¶
Introduction¶
Syri compares alignments between two chromosome-level assemblies and identifies synteny and structural rearrangements.
Versions¶
1.6
Commands¶
syri
Module¶
You can load the modules by:
module load biocontainers
module load syri
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run syri on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=syri
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers syri
syri -c out.sam -r refgenome -q qrygenome -k -F S
Talon¶
Introduction¶
Talon
is a Python package for identifying and quantifying known and novel genes/isoforms in long-read transcriptome data sets.
Versions¶
5.0
Commands¶
talon
talon_abundance
talon_create_GTF
talon_fetch_reads
talon_filter_transcripts
talon_generate_report
talon_get_sjs
talon_initialize_database
talon_label_reads
talon_reformat_gtf
talon_summarize
Module¶
You can load the modules by:
module load biocontainers
module load talon
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Talon on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=talon
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers talon
Targetp¶
Introduction¶
TargetP-2.0 tool predicts the presence of N-terminal presequences: signal peptide (SP), mitochondrial transit peptide (mTP), chloroplast transit peptide (cTP) or thylakoid luminal transit peptide (luTP). For the sequences predicted to contain an N-terminal presequence a potential cleavage site is also predicted.
Versions¶
2.0
Commands¶
targetp
Module¶
You can load the modules by:
module load biocontainers
module load targetp
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run targetp on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=targetp
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers targetp
Tassel¶
Introduction¶
TASSEL is a software package used to evaluate trait associations, evolutionary patterns, and linkage disequilibrium.
Versions¶
5.0
Commands¶
run_pipeline.pl
start_tassel.pl
Tassel5
Module¶
You can load the modules by:
module load biocontainers
module load tassel
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tassel on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tassel
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tassel
Taxonkit¶
Introduction¶
Taxonkit
is a practical and efficient NCBI taxonomy toolkit.
Versions¶
0.9.0
Commands¶
taxonkit
Module¶
You can load the modules by:
module load biocontainers
module load taxonkit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Taxonkit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=taxonkit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers taxonkit
taxonkit list --show-rank --show-name --indent " " --ids 9605,239934
T-coffee¶
Introduction¶
T-coffee
is a multiple sequence alignment software using a progressive approach.
Versions¶
13.45.0.4846264
Commands¶
t_coffee
Module¶
You can load the modules by:
module load biocontainers
module load t-coffee
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run T-coffee on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=t-coffee
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers t-coffee
t_coffee OG0002077.fa -mode expresso
Tetranscripts¶
Introduction¶
Tetranscripts
is a package for including transposable elements in differential enrichment analysis of sequencing datasets.
Versions¶
2.2.1
Commands¶
TEtranscripts
TEcount
Module¶
You can load the modules by:
module load biocontainers
module load tetranscripts
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tetranscripts on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tetranscripts
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tetranscripts
TEtranscripts --format BAM --mode multi \
-t treatment_sample1.bam treatment_sample2.bam treatment_sample3.bam \
-c control_sample1.bam control_sample2.bam control_sample3.bam \
--GTF genic-GTF-file \
--TE TE-GTF-file \
--project sample_nosort_test
Tiara¶
Introduction¶
Tiara
is a deep-learning-based approach, powered by PyTorch, for identifying eukaryotic sequences in metagenomic data.
Versions¶
1.0.2
Commands¶
tiara
Module¶
You can load the modules by:
module load biocontainers
module load tiara
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tiara on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=tiara
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tiara
tiara -t 24 -i archaea_fr.fasta -o archaea_out.txt
tiara -t 24 -i bacteria_fr.fasta -o bacteria_out.txt
tiara -t 24 -i eukarya_fr.fasta -o eukarya_out.txt
tiara -t 24 -i mitochondria_fr.fasta -o mitochondria_out.txt
tiara -t 24 -i plast_fr.fasta -o plast_out.txt
tiara -t 24 -i total.fasta -o mix_out.txt --tf all -p 0.65 0.60 --probabilities
Tigmint¶
Introduction¶
Tigmint identifies and corrects misassemblies using linked (e.g. MGI’s stLFR, 10x Genomics Chromium) or long (e.g. Oxford Nanopore Technologies long reads) DNA sequencing reads. The reads are first aligned to the assembly, and the extents of the large DNA molecules are inferred from the alignments of the reads. The physical coverage of the large molecules is more consistent and less prone to coverage dropouts than that of the short read sequencing data. The sequences are cut at positions that have insufficient spanning molecules. Tigmint outputs a BED file of these cut points, and a FASTA file of the cut sequences. For more information, please check: Home page: https://github.com/bcgsc/tigmint
Versions¶
1.2.6
Commands¶
tigmint
tigmint-arcs-tsv
tigmint-cut
tigmint-make
tigmint_estimate_dist.py
tigmint_molecule.py
tigmint_molecule_paf.py
Module¶
You can load the modules by:
module load biocontainers
module load tigmint
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tigmint on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tigmint
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tigmint
Tobias¶
Introduction¶
Tobias
is a collection of command-line bioinformatics tools for performing footprinting analysis on ATAC-seq data.
Versions¶
0.13.3
Commands¶
TOBIAS
Module¶
You can load the modules by:
module load biocontainers
module load tobias
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tobias on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=tobias
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tobias
TOBIAS DownloadData --bucket data-tobias-2020
mv data-tobias-2020/ test_data/
TOBIAS PlotAggregate --TFBS test_data/BATF_all.bed \
--signals test_data/Bcell_corrected.bw test_data/Tcell_corrected.bw \
--output BATFJUN_footprint_comparison_all.pdf \
--share_y both --plot_boundaries --signal-on-x
TOBIAS BINDetect --motifs test_data/motifs.jaspar \
--signals test_data/Bcell_footprints.bw test_data/Tcell_footprints.bw \
--genome test_data/genome.fa.gz \
--peaks test_data/merged_peaks_annotated.bed \
--peak_header test_data/merged_peaks_annotated_header.txt \
--outdir BINDetect_output --cond_names Bcell Tcell --cores 8
TOBIAS ATACorrect --bam test_data/Bcell.bam \
--genome test_data/genome.fa.gz \
--peaks test_data/merged_peaks.bed \
--blacklist test_data/blacklist.bed \
--outdir ATACorrect_test --cores 8
TOBIAS FootprintScores --signal test_data/Bcell_corrected.bw \
--regions test_data/merged_peaks.bed \
--output Bcell_footprints.bw --cores 8
Tombo¶
Introduction¶
Tombo
is a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data. Tombo also provides tools for the analysis and visualization of raw nanopore signal.
Versions¶
1.5.1
Commands¶
tombo
Module¶
You can load the modules by:
module load biocontainers
module load tombo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Tombo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=tombo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tombo
tombo resquiggle path/to/fast5s/ genome.fasta --processes 4 --num-most-common-errors 5
tombo detect_modifications alternative_model --fast5-basedirs path/to/fast5s/ \
--statistics-file-basename native.e_coli_sample \
--alternate-bases dam dcm --processes 4
# plot raw signal at most significant dcm locations
tombo plot most_significant --fast5-basedirs path/to/fast5s/ \
--statistics-filename native.e_coli_sample.dcm.tombo.stats \
--plot-standard-model --plot-alternate-model dcm \
--pdf-filename sample.most_significant_dcm_sites.pdf
# produces wig file with estimated fraction of modified reads at each valid reference site
tombo text_output browser_files --statistics-filename native.e_coli_sample.dam.tombo.stats \
--file-types dampened_fraction --browser-file-basename native.e_coli_sample.dam
# also produce successfully processed reads coverage file for reference
tombo text_output browser_files --fast5-basedirs path/to/fast5s/ \
--file-types coverage --browser-file-basename native.e_coli_sample
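In the job above, the basename given to --statistics-file-basename and the names given to --alternate-bases reappear in the later file names (native.e_coli_sample.dcm.tombo.stats and native.e_coli_sample.dam.tombo.stats). The sketch below builds those names from the same two inputs so later steps do not need to repeat them by hand. The function name is hypothetical, and the BASENAME.BASE.tombo.stats pattern is inferred only from the example.

```shell
#!/bin/bash
# Sketch: list the statistics files expected for a basename and a set of
# alternate bases, following the pattern seen in the example job.
tombo_stats_files() {
    local basename="$1" base
    shift
    for base in "$@"; do
        echo "${basename}.${base}.tombo.stats"
    done
}
```

For instance, tombo_stats_files native.e_coli_sample dam dcm prints the two statistics file names used by the plotting and text_output steps.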
TopHat¶
Introduction¶
TopHat
is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
Versions¶
2.1.1-py27
Commands¶
tophat
tophat2
Module¶
You can load the modules by:
module load biocontainers
module load tophat
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run TopHat on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=tophat
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tophat
tophat -r 20 test_ref reads_1.fq reads_2.fq
TPMCalculator¶
Introduction¶
TPMCalculator
quantifies mRNA abundance directly from the alignments by parsing BAM files.
Detailed usage can be found here: https://github.com/ncbi/TPMCalculator
Versions¶
0.0.3
0.0.4
Commands¶
TPMCalculator
Module¶
You can load the modules by:
module load biocontainers
module load tpmcalculator
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tpmcalculator on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=tpmcalculator
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers tpmcalculator
TPMCalculator -g Homo_sapiens.GRCh38.105.chr.gtf -b SRR12095148Aligned.sortedByCoord.out.bam
Transabyss¶
Introduction¶
Transabyss
is a tool for de novo assembly of RNA-seq data using ABySS.
Versions¶
2.0.1
Commands¶
transabyss
transabyss-merge
Module¶
You can load the modules by:
module load biocontainers
module load transabyss
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Transabyss on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transabyss
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transabyss
transabyss --name SRR12095148 \
--pe SRR12095148_1.fastq SRR12095148_2.fastq \
--outdir SRR12095148_assembly --threads 12
TransDecoder¶
Introduction¶
TransDecoder
identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks.
TransDecoder identifies likely coding sequences based on the following criteria:
a minimum length open reading frame (ORF) is found in a transcript sequence
a log-likelihood score similar to what is computed by the GeneID software is > 0.
the above coding score is greatest when the ORF is scored in the 1st reading frame as compared to scores in the other 2 forward reading frames.
if a candidate ORF is found fully encapsulated by the coordinates of another candidate ORF, the longer one is reported. However, a single transcript can report multiple ORFs (allowing for operons, chimeras, etc).
a PSSM is built/trained/used to refine the start codon prediction.
(optional) the putative peptide has a match to a Pfam domain above the noise cutoff score.
Detailed usage can be found here: https://github.com/TransDecoder/TransDecoder/wiki#running-transdecoder
Versions¶
5.5.0
Commands¶
TransDecoder.LongOrfs
TransDecoder.Predict
cdna_alignment_orf_to_genome_orf.pl
compute_base_probs.pl
exclude_similar_proteins.pl
fasta_prot_checker.pl
ffindex_resume.pl
gene_list_to_gff.pl
get_FL_accs.pl
get_longest_ORF_per_transcript.pl
get_top_longest_fasta_entries.pl
gff3_file_to_bed.pl
gff3_file_to_proteins.pl
gff3_gene_to_gtf_format.pl
gtf_genome_to_cdna_fasta.pl
gtf_to_alignment_gff3.pl
gtf_to_bed.pl
nr_ORFs_gff3.pl
pfam_runner.pl
refine_gff3_group_iso_strip_utrs.pl
refine_hexamer_scores.pl
remove_eclipsed_ORFs.pl
score_CDS_likelihood_all_6_frames.pl
select_best_ORFs_per_transcript.pl
seq_n_baseprobs_to_loglikelihood_vals.pl
start_codon_refinement.pl
train_start_PWM.pl
uri_unescape.pl
Module¶
You can load the modules by:
module load biocontainers
module load transdecoder
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run transdecoder on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 20:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=transdecoder
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transdecoder
gtf_genome_to_cdna_fasta.pl transcripts.gtf test.genome.fasta > transcripts.fasta
gtf_to_alignment_gff3.pl transcripts.gtf > transcripts.gff3
TransDecoder.LongOrfs -t transcripts.fasta
TransDecoder.Predict -t transcripts.fasta
Transrate¶
Introduction¶
Transrate is software for de-novo transcriptome assembly quality analysis.
Versions¶
1.0.3
Commands¶
transrate
Module¶
You can load the modules by:
module load biocontainers
module load transrate
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run transrate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=transrate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transrate
transrate --assembly mm10/Mus_musculus.GRCm38.cds.all.fa \
--left seq_1.fq.gz \
--right seq_2.fq.gz \
--threads 12
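The example above hard-codes 12 threads to match the -n 12 request. A possible variation, sketched here under the assumption that the job runs under Slurm (which sets SLURM_NTASKS when -n is given), reads the thread count from the allocation so the two values cannot drift apart. The fallback of 1 for runs outside Slurm is also an assumption.

```shell
#!/bin/bash
# Take the thread count from the Slurm allocation instead of hard-coding it.
# SLURM_NTASKS is set by Slurm inside a job when -n is given; the fallback
# of 1 is an assumption for runs outside Slurm.
THREADS="${SLURM_NTASKS:-1}"

# Build the command as an array so that it can be inspected before it runs.
cmd=(transrate
     --assembly mm10/Mus_musculus.GRCm38.cds.all.fa
     --left seq_1.fq.gz
     --right seq_2.fq.gz
     --threads "$THREADS")

# Print the command; run it only when transrate is actually on the PATH.
echo "${cmd[@]}"
if command -v transrate >/dev/null 2>&1; then
    "${cmd[@]}"
fi
```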
Transvar¶
Introduction¶
Transvar
is a multi-way annotator for genetic elements and genetic variations.
Versions¶
2.5.9
Commands¶
transvar
Module¶
You can load the modules by:
module load biocontainers
module load transvar
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Transvar on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=transvar
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers transvar
# set up databases
transvar config --download_anno --refversion hg19
# in case you don't have a reference
transvar config --download_ref --refversion hg19
transvar panno -i 'PIK3CA:p.E545K' --ucsc --ccds
tRAX¶
Introduction¶
tRAX
(tRNA Analysis of eXpression) is a software package built for in-depth analyses of tRNA-derived small RNAs (tDRs), mature tRNAs, and inference of RNA modifications from high-throughput small RNA sequencing data.
Versions¶
1.0.0
Commands¶
TestRun.bash
quickdb.bash
maketrnadb.py
trimadapters.py
processamples.py
Module¶
You can load the modules by:
module load biocontainers
module load trax
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tRAX on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trax
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trax
Treetime¶
Introduction¶
Treetime
is a tool for maximum likelihood dating and ancestral sequence inference.
Versions¶
0.8.6
0.9.4
Commands¶
treetime
Module¶
You can load the modules by:
module load biocontainers
module load treetime
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Treetime on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=treetime
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers treetime
treetime ancestral --aln input.fasta --tree input.nwk
Trimal¶
Introduction¶
Trimal
is a tool for the automated removal of spurious sequences or poorly aligned regions from a multiple sequence alignment.
Versions¶
1.4.1
Commands¶
trimal
readal
statal
Module¶
You can load the modules by:
module load biocontainers
module load trimal
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trimal on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trimal
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trimal
trimal -in input.fasta -out output1 -htmlout output1.html -gt 1
Trim-galore¶
Introduction¶
Trim-galore
is a wrapper tool that automates quality and adapter trimming of FastQ files.
Versions¶
0.6.7
Commands¶
trim_galore
Module¶
You can load the modules by:
module load biocontainers
module load trim-galore
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trim-galore on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 4
#SBATCH --job-name=trim-galore
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trim-galore
trim_galore --paired --fastqc --length 20 -o sample1_trimmed Sample1_1.fq Sample1_2.fq
Trimmomatic¶
Introduction¶
Trimmomatic
is a flexible read trimming tool for Illumina NGS data.
Versions¶
0.39
Commands¶
trimmomatic
Module¶
You can load the modules by:
module load biocontainers
module load trimmomatic
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trimmomatic on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 8
#SBATCH --job-name=trimmomatic
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trimmomatic
trimmomatic PE -threads 8 \
input_forward.fq.gz input_reverse.fq.gz \
output_forward_paired.fq.gz output_forward_unpaired.fq.gz \
output_reverse_paired.fq.gz output_reverse_unpaired.fq.gz \
ILLUMINACLIP:TruSeq3-PE.fa:2:30:10:2:True LEADING:3 TRAILING:3 MINLEN:36
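The colon-separated ILLUMINACLIP step in the command above is easier to read when each field is named. The sketch below rebuilds the same string from variables. The variable names are illustrative, and the field meanings are paraphrased from the Trimmomatic manual, which should be consulted for the authoritative definitions.

```shell
#!/bin/bash
# Values are those used in the example above; variable names are illustrative.
adapters="TruSeq3-PE.fa"        # FASTA file with adapter sequences
seed_mismatches=2               # mismatches allowed in the seed match
palindrome_clip_threshold=30    # score needed for a paired-end palindrome match
simple_clip_threshold=10        # score needed for a simple adapter match
min_adapter_length=2            # shortest adapter detected in palindrome mode
keep_both_reads=True            # keep the reverse read after palindrome clipping

illuminaclip="ILLUMINACLIP:${adapters}:${seed_mismatches}:${palindrome_clip_threshold}:${simple_clip_threshold}:${min_adapter_length}:${keep_both_reads}"
echo "$illuminaclip"
```

The resulting value can be passed to trimmomatic in place of the literal ILLUMINACLIP argument.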
Trinity¶
Introduction¶
Trinity
assembles transcript sequences from Illumina RNA-Seq data.
Versions¶
2.12.0
2.13.2
2.14.0
2.15.0
Commands¶
Trinity
TrinityStats.pl
Trinity_gene_splice_modeler.py
ace2sam
align_and_estimate_abundance.pl
analyze_blastPlus_topHit_coverage.pl
analyze_diff_expr.pl
blast2sam.pl
bowtie
bowtie2
bowtie2-build
bowtie2-inspect
bowtie2sam.pl
contig_ExN50_statistic.pl
define_clusters_by_cutting_tree.pl
export2sam.pl
extract_supertranscript_from_reference.py
filter_low_expr_transcripts.pl
get_Trinity_gene_to_trans_map.pl
insilico_read_normalization.pl
interpolate_sam.pl
jellyfish
novo2sam.pl
retrieve_sequences_from_fasta.pl
run_DE_analysis.pl
sam2vcf.pl
samtools
samtools.pl
seq_cache_populate.pl
seqtk-trinity
sift_bam_max_cov.pl
soap2sam.pl
tabix
trimmomatic
wgsim
wgsim_eval.pl
zoom2sam.pl
Module¶
You can load the modules by:
module load biocontainers
module load trinity
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trinity on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 6
#SBATCH --job-name=trinity
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trinity
Trinity --seqType fq --left reads_1.fq --right reads_2.fq \
--CPU 6 --max_memory 20G
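The --max_memory value above is fixed at 20G. As a sketch only, the value could instead be derived from the job's memory request. This assumes that Slurm exports SLURM_MEM_PER_NODE in megabytes when the job specifies --mem, which the example script above does not do, so the 20G fallback is kept. mb_to_trinity_mem is a hypothetical helper name.

```shell
#!/bin/bash
# mb_to_trinity_mem is a hypothetical helper: whole gigabytes, rounded down.
mb_to_trinity_mem() {
    local mb="$1"
    echo "$(( mb / 1024 ))G"
}

# SLURM_MEM_PER_NODE (megabytes) is only set when the job requests --mem;
# otherwise fall back to the 20G used in the example above.
if [ -n "${SLURM_MEM_PER_NODE:-}" ]; then
    MAX_MEM="$(mb_to_trinity_mem "$SLURM_MEM_PER_NODE")"
else
    MAX_MEM="20G"
fi

# Dry run: print the Trinity command that would be used.
echo Trinity --seqType fq --left reads_1.fq --right reads_2.fq \
    --CPU "${SLURM_NTASKS:-6}" --max_memory "$MAX_MEM"
```

Leaving some headroom below the full allocation may be prudent, since other processes in the job also need memory.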
Trinotate¶
Introduction¶
Trinotate
is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms.
Versions¶
3.2.2
Commands¶
Trinotate
Build_Trinotate_Boilerplate_SQLite_db.pl
EMBL_dat_to_Trinotate_sqlite_resourceDB.pl
EMBL_swissprot_parser.pl
PFAM_dat_parser.pl
PFAMtoGoParser.pl
RnammerTranscriptome.pl
TrinotateSeqLoader.pl
Trinotate_BLAST_loader.pl
Trinotate_GO_to_SLIM.pl
Trinotate_GTF_loader.pl
Trinotate_GTF_or_GFF3_annot_prep.pl
Trinotate_PFAM_loader.pl
Trinotate_RNAMMER_loader.pl
Trinotate_SIGNALP_loader.pl
Trinotate_TMHMM_loader.pl
Trinotate_get_feature_name_encoding_attributes.pl
Trinotate_report_writer.pl
assign_eggnog_funccats.pl
autoTrinotate.pl
build_DE_cache_tables.pl
cleanMe.pl
cleanme.pl
count_table_fields.pl
create_clusters_tables.pl
extract_GO_assignments_from_Trinotate_xls.pl
extract_GO_for_BiNGO.pl
extract_specific_genes_from_all_matrices.pl
import_DE_results.pl
import_Trinotate_xls_as_annot.pl
import_expression_and_DE_results.pl
import_expression_matrix.pl
import_samples_n_expression_matrix.pl
import_samples_only.pl
import_transcript_annotations.pl
import_transcript_clusters.pl
import_transcript_names.pl
init_Trinotate_sqlite_db.pl
legacy_blast.pl
make_cXp_html.pl
obo_tab_to_sqlite_db.pl
obo_to_tab.pl
prep_nuc_prot_set_for_trinotate_loading.pl
print.pl
rnammer_supperscaffold_gff_to_indiv_transcripts.pl
runMe.pl
run_TrinotateWebserver.pl
run_cluster_functional_enrichment_analysis.pl
shrink_db.pl
sqlite.pl
superScaffoldGenerator.pl
test_Barplot.pl
test_GO_DAG.pl
test_GenomeBrowser.pl
test_Heatmap.pl
test_Lineplot.pl
test_Piechart.pl
test_Scatter2D.pl
test_Sunburst.pl
trinotate_report_summary.pl
update_blastdb.pl
update_seq_n_annotation_fields.pl
Module¶
You can load the modules by:
module load biocontainers
module load trinotate
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trinotate on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trinotate
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trinotate
sqlite_db="myTrinotate.sqlite"
Trinotate ${sqlite_db} init \
--gene_trans_map data/Trinity.fasta.gene_to_trans_map \
--transcript_fasta data/Trinity.fasta \
--transdecoder_pep \
data/Trinity.fasta.transdecoder.pep
Trinotate ${sqlite_db} LOAD_swissprot_blastp data/swissprot.blastp.outfmt6
Trinotate ${sqlite_db} LOAD_pfam data/TrinotatePFAM.out
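The Trinotate commands above read five input files, and a missing one is only discovered when the corresponding step runs. A minimal sketch of a pre-flight check follows. require_files is a hypothetical helper name, and the file paths are the ones used in the example.

```shell
#!/bin/bash
# require_files is a hypothetical helper: succeed only if every argument
# names an existing, non-empty file.
require_files() {
    local status=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}

if require_files \
    data/Trinity.fasta.gene_to_trans_map \
    data/Trinity.fasta \
    data/Trinity.fasta.transdecoder.pep \
    data/swissprot.blastp.outfmt6 \
    data/TrinotatePFAM.out
then
    echo "inputs present, safe to run Trinotate"
else
    echo "fix the inputs listed above before running Trinotate"
fi
```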
Trnascan-se¶
Introduction¶
Trnascan-se
is a convenient, ready-for-use means to identify tRNA genes in one or more query sequences.
Versions¶
2.0.9
Commands¶
tRNAscan-SE
Module¶
You can load the modules by:
module load biocontainers
module load trnascan-se
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Trnascan-se on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=trnascan-se
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trnascan-se
tRNAscan-SE --thread 12 -o tRNA.out \
-f tRNA.ss -m tRNA.stats genome.fasta
Trtools¶
Introduction¶
TRTools includes a variety of utilities for filtering, quality control and analysis of tandem repeats downstream of genotyping them from next-generation sequencing.
Versions¶
5.0.1
Commands¶
associaTR
compareSTR
dumpSTR
mergeSTR
qcSTR
statSTR
Module¶
You can load the modules by:
module load biocontainers
module load trtools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Warning
We noticed that xalt
module can cause the failure of certain commands including statSTR
. Please unload all loaded modules by module --force purge
before loading required modules.
To run trtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trtools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trtools htslib bcftools
mergeSTR --vcfs ceu_ex.vcf.gz,yri_ex.vcf.gz --out merged
bgzip merged.vcf
tabix -p vcf merged.vcf.gz
# Get the CEU and YRI sample lists
bcftools query -l yri_ex.vcf.gz > yri_samples.txt
bcftools query -l ceu_ex.vcf.gz > ceu_samples.txt
# Run statSTR on the region passed to --region below
statSTR \
--vcf merged.vcf.gz \
--samples yri_samples.txt,ceu_samples.txt \
--sample-prefixes YRI,CEU \
--out stdout \
--mean --het --acount \
--use-length \
--region chr21:34351482-34363028
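The --region value above has the form chrom:start-end. A small sketch of a sanity check on that text is shown below. valid_region is a hypothetical helper, and it only checks the format and the ordering of the coordinates, not whether the region exists in the VCF.

```shell
#!/bin/bash
# valid_region is a hypothetical helper; it only inspects the region string.
valid_region() {
    local region="$1" start end
    # Expect <chrom>:<start>-<end> with numeric coordinates.
    local re='^[^:]+:([0-9]+)-([0-9]+)$'
    if [[ "$region" =~ $re ]]; then
        start="${BASH_REMATCH[1]}"
        end="${BASH_REMATCH[2]}"
        # The start coordinate must not exceed the end coordinate.
        [ "$start" -le "$end" ]
    else
        return 1
    fi
}

REGION="chr21:34351482-34363028"
if valid_region "$REGION"; then
    echo "region ok: $REGION"
else
    echo "malformed region: $REGION" >&2
fi
```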
Trust4¶
Introduction¶
Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool to analyze TCR and BCR sequences using unselected RNA sequencing data, profiled from solid tissues, including tumors.
Versions¶
1.0.7
Commands¶
run-trust4
BuildDatabaseFa.pl
BuildImgtAnnot.pl
trust-airr.pl
trust-barcoderep.pl
trust-simplerep.pl
trust-smartseq.pl
Module¶
You can load the modules by:
module load biocontainers
module load trust4
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run trust4 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trust4
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trust4
run-trust4 -b mapped.bam -f hg38_bcrtcr.fa --ref human_IMGT+C.fa
Trycycler¶
Introduction¶
Trycycler is a tool for generating consensus long-read assemblies for bacterial genomes. That is, if you have multiple long-read assemblies for the same isolate, Trycycler can combine them into a single assembly that is better than any of your inputs.
Versions¶
0.5.0
0.5.3
0.5.4
Commands¶
trycycler
Module¶
You can load the modules by:
module load biocontainers
module load trycycler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run trycycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=trycycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers trycycler
trycycler cluster --assemblies \
test/test_cluster/assembly_*.fasta \
--reads test/test_cluster/reads.fastq \
--out_dir trycycler_out
UCSC Executables¶
Introduction¶
UCSC Executables
is a variety of executables that perform functions ranging from sequence analysis and format conversion, to basic number crunching and statistics, to complex database generation and manipulation.
These executables have been downloaded from http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64.v369/ and made available on RCAC clusters.
Versions¶
369
Commands¶
addCols
ameme
autoDtd
autoSql
autoXml
ave
aveCols
axtChain
axtSort
axtSwap
axtToMaf
axtToPsl
bamToPsl
barChartMaxLimit
bedClip
bedCommonRegions
bedCoverage
bedExtendRanges
bedGeneParts
bedGraphPack
bedGraphToBigWig
bedIntersect
bedItemOverlapCount
bedJoinTabOffset
bedJoinTabOffset.py
bedMergeAdjacent
bedPartition
bedPileUps
bedRemoveOverlap
bedRestrictToPositions
bedSingleCover.pl
bedSort
bedToBigBed
bedToExons
bedToGenePred
bedToPsl
bedWeedOverlapping
bigBedInfo
bigBedNamedItems
bigBedSummary
bigBedToBed
bigGenePredToGenePred
bigHeat
bigMafToMaf
bigPslToPsl
bigWigAverageOverBed
bigWigCat
bigWigCluster
bigWigCorrelate
bigWigInfo
bigWigMerge
bigWigSummary
bigWigToBedGraph
bigWigToWig
binFromRange
blastToPsl
blastXmlToPsl
blat
calc
catDir
catUncomment
chainAntiRepeat
chainBridge
chainCleaner
chainFilter
chainMergeSort
chainNet
chainPreNet
chainScore
chainSort
chainSplit
chainStitchId
chainSwap
chainToAxt
chainToPsl
chainToPslBasic
checkAgpAndFa
checkCoverageGaps
checkHgFindSpec
checkTableCoords
chopFaLines
chromGraphFromBin
chromGraphToBin
chromToUcsc
clusterGenes
clusterMatrixToBarChartBed
colTransform
countChars
cpg_lh
crTreeIndexBed
crTreeSearchBed
dbSnoop
dbTrash
endsInLf
estOrient
expMatrixToBarchartBed
faAlign
faCmp
faCount
faFilter
faFilterN
faFrag
faNoise
faOneRecord
faPolyASizes
faRandomize
faRc
faSize
faSomeRecords
faSplit
faToFastq
faToTab
faToTwoBit
faToVcf
faTrans
fastqStatsAndSubsample
fastqToFa
featureBits
fetchChromSizes
findMotif
fixStepToBedGraph.pl
gapToLift
genePredCheck
genePredFilter
genePredHisto
genePredSingleCover
genePredToBed
genePredToBigGenePred
genePredToFakePsl
genePredToGtf
genePredToMafFrames
genePredToProt
gensub2
getRna
getRnaPred
gff3ToGenePred
gff3ToPsl
gmtime
gtfToGenePred
headRest
hgBbiDbLink
hgFakeAgp
hgFindSpec
hgGcPercent
hgGoldGapGl
hgLoadBed
hgLoadChain
hgLoadGap
hgLoadMaf
hgLoadMafSummary
hgLoadNet
hgLoadOut
hgLoadOutJoined
hgLoadSqlTab
hgLoadWiggle
hgSpeciesRna
hgTrackDb
hgWiggle
hgsql
hgsqldump
hgvsToVcf
hicInfo
htmlCheck
hubCheck
hubClone
hubPublicCheck
ixIxx
lastz-1.04.00
lastz_D-1.04.00
lavToAxt
lavToPsl
ldHgGene
liftOver
liftOverMerge
liftUp
linesToRa
localtime
mafAddIRows
mafAddQRows
mafCoverage
mafFetch
mafFilter
mafFrag
mafFrags
mafGene
mafMeFirst
mafNoAlign
mafOrder
mafRanges
mafSpeciesList
mafSpeciesSubset
mafSplit
mafSplitPos
mafToAxt
mafToBigMaf
mafToPsl
mafToSnpBed
mafsInRegion
makeTableList
maskOutFa
matrixClusterColumns
matrixMarketToTsv
matrixNormalize
mktime
mrnaToGene
netChainSubset
netClass
netFilter
netSplit
netSyntenic
netToAxt
netToBed
newProg
newPythonProg
nibFrag
nibSize
oligoMatch
overlapSelect
para
paraFetch
paraHub
paraHubStop
paraNode
paraNodeStart
paraNodeStatus
paraNodeStop
paraSync
paraTestJob
parasol
positionalTblCheck
pslCDnaFilter
pslCat
pslCheck
pslDropOverlap
pslFilter
pslHisto
pslLiftSubrangeBlat
pslMap
pslMapPostChain
pslMrnaCover
pslPairs
pslPartition
pslPosTarget
pslPretty
pslRc
pslRecalcMatch
pslRemoveFrameShifts
pslReps
pslScore
pslSelect
pslSomeRecords
pslSort
pslSortAcc
pslStats
pslSwap
pslToBed
pslToBigPsl
pslToChain
pslToPslx
pslxToFa
qaToQac
qacAgpLift
qacToQa
qacToWig
raSqlQuery
raToLines
raToTab
randomLines
rmFaDups
rowsToCols
sizeof
spacedToTab
splitFile
splitFileByColumn
sqlToXml
strexCalc
stringify
subChar
subColumn
tabQuery
tailLines
tdbQuery
tdbRename
tdbSort
textHistogram
tickToDate
toLower
toUpper
trackDbIndexBb
transMapPslToGenePred
trfBig
twoBitDup
twoBitInfo
twoBitMask
twoBitToFa
ucscApiClient
udr
vai.pl
validateFiles
validateManifest
varStepToBedGraph.pl
webSync
wigCorrelate
wigEncode
wigToBigWig
wordLine
xmlCat
xmlToSql
Module¶
You can load the modules by:
module load biocontainers
module load ucsc_genome_toolkit/369
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run UCSC executables on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=UCSC
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers ucsc_genome_toolkit/369
blat genome.fasta input.fasta blat.out
fastqToFa input.fastq output.fasta
Umi_tools¶
Introduction¶
Umi_tools is a collection of tools for handling Unique Molecular Identifiers in NGS data sets.
Versions¶
1.1.4
Commands¶
umi_tools
Module¶
You can load the modules by:
module load biocontainers
module load umi_tools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run umi_tools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=umi_tools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers umi_tools
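The job script above stops after loading the module. As an illustration of what could follow, the sketch below builds a umi_tools extract command using options described in the UMI-tools documentation (--stdin, --bc-pattern, --log, --stdout). The input file name and the 9-nucleotide UMI pattern are placeholders, not values taken from this page.

```shell
#!/bin/bash
# Placeholders (assumptions): input file name and a 9-nt UMI at the read start.
INPUT="example.fastq.gz"
BC_PATTERN="NNNNNNNNN"

# Build the command as an array so that it can be inspected before it runs.
cmd=(umi_tools extract
     --stdin="$INPUT"
     --bc-pattern="$BC_PATTERN"
     --log=processed.log
     --stdout processed.fastq.gz)

echo "${cmd[@]}"
# Run only when the tool and the input file are both present.
if command -v umi_tools >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    "${cmd[@]}"
fi
```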
Unicycler¶
Introduction¶
Unicycler
is an assembly pipeline for bacterial genomes.
Versions¶
0.5.0
Commands¶
unicycler
Module¶
You can load the modules by:
module load biocontainers
module load unicycler
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Unicycler on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 12
#SBATCH --job-name=unicycler
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers unicycler
unicycler -t 12 -1 SRR11234553_1.fastq -2 SRR11234553_2.fastq -o shortout
unicycler -t 12 -l SRR3982487.fastq -o longout
Usefulaf¶
Introduction¶
Usefulaf is an all-in-one Docker/Singularity image for single-cell processing with Alevin-fry. It includes all the tools you need to turn your FASTQ files into a count matrix and then load it into your favorite analysis environment.
Versions¶
0.9.2
Commands¶
simpleaf
R
Rscript
python
python3
Module¶
You can load the modules by:
module load biocontainers
module load usefulaf
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run usefulaf on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=usefulaf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers usefulaf
Vadr¶
Introduction¶
VADR is a suite of tools for classifying and analyzing sequences homologous to a set of reference models of viral genomes or gene families. It has been mainly tested for analysis of Norovirus, Dengue, and SARS-CoV-2 virus sequences in preparation for submission to the GenBank database.
Versions¶
1.4.1
1.4.2
1.5
Commands¶
parse_blast.pl
v-annotate.pl
v-build.pl
v-test.pl
Module¶
You can load the modules by:
module load biocontainers
module load vadr
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vadr on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vadr
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vadr
v-annotate.pl noro.9.fa va-noro.9
Vardict-java¶
Introduction¶
VarDictJava is a variant discovery program written in Java and Perl. It is a Java port of VarDict variant caller.
Versions¶
1.8.3
Commands¶
vardict-java
var2vcf_paired.pl
var2vcf_valid.pl
testsomatic.R
teststrandbias.R
Module¶
You can load the modules by:
module load biocontainers
module load vardict-java
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vardict-java on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vardict-java
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vardict-java
AF_THR="0.01" # minimum allele frequency
vardict-java -G genome.fasta \
-f $AF_THR -N genome \
-b input.bam \
-c 1 -S 2 -E 3 -g 4 output.bed \
| teststrandbias.R \
| var2vcf_valid.pl \
-N genome -E -f $AF_THR \
> vars.vcf
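The example above chains three programs with pipes. By default bash reports only the exit status of the last stage, so a failure in vardict-java could go unnoticed and still leave a vars.vcf behind. The sketch below demonstrates the effect of set -o pipefail with a stand-in pipeline. run_pipeline is a hypothetical function that uses false and cat in place of the real tools.

```shell
#!/bin/bash
run_pipeline() {
    # Stand-in for "vardict-java ... | teststrandbias.R | var2vcf_valid.pl":
    # the first stage fails, the last one succeeds.
    false | cat
}

# Default behaviour: only the status of the last stage is reported.
if run_pipeline; then
    echo "default: failure hidden"
fi

# With pipefail, any failing stage makes the whole pipeline fail.
set -o pipefail
if ! run_pipeline; then
    echo "pipefail: failure reported"
fi
```

Placing set -o pipefail near the top of the job script, after the module commands, would apply the same behaviour to the real pipeline.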
Varlociraptor¶
Introduction¶
Varlociraptor
implements a novel, unified fully uncertainty-aware approach to genomic variant calling in arbitrary scenarios.
Versions¶
4.11.4
Commands¶
varlociraptor
Module¶
You can load the modules by:
module load biocontainers
module load varlociraptor
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Varlociraptor on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varlociraptor
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers varlociraptor
varlociraptor call variants tumor-normal --purity 0.75 --tumor
Varscan¶
Introduction¶
Varscan
is a tool used for variant detection in massively parallel sequencing data.
Versions¶
2.4.2
2.4.4
Commands¶
VarScan.v2.4.4.jar
Module¶
You can load the modules by:
module load biocontainers
module load varscan
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Varscan on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=varscan
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers varscan
Vartrix¶
Introduction¶
Vartrix
is a software tool for extracting single cell variant information from 10x Genomics single cell data.
Versions¶
1.1.22
Commands¶
vartrix
Module¶
You can load the modules by:
module load biocontainers
module load vartrix
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Vartrix on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vartrix
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vartrix
vartrix -v test/test.vcf -b test/test.bam \
-f test/test.fa -c test/barcodes.tsv \
-o output.matrix
Vatools¶
Introduction¶
VAtools is a python package that includes several tools to annotate VCF files with data from other tools.
Versions¶
5.0.1
Commands¶
ref-transcript-mismatch-reporter
transform-split-values
vcf-expression-annotator
vcf-genotype-annotator
vcf-info-annotator
vcf-readcount-annotator
vep-annotation-reporter
Module¶
You can load the modules by:
module load biocontainers
module load vatools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vatools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vatools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vatools
vcf-readcount-annotator <input_vcf> <snv_bam_readcount_file> <DNA|RNA> \
-s <sample_name> -t snv -o <snv_annotated_vcf>
Vcf2maf¶
Introduction¶
To convert a VCF into a MAF, each variant must be mapped to only one of all possible gene transcripts/isoforms that it might affect. This selection of a single effect per variant is often subjective, so this project is an attempt to make the selection criteria smarter, reproducible, and more configurable. The default criteria must lean towards best practices.
Versions¶
1.6.21
Commands¶
maf2maf.pl
maf2vcf.pl
vcf2maf.pl
vcf2vcf.pl
Module¶
You can load the modules by:
module load biocontainers
module load vcf2maf
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
Note
If users need to use vep
, please add --vep-path /opt/conda/bin
.
To run vcf2maf on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2maf
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2maf
vcf2maf.pl --vep-path /opt/conda/bin \
--ref-fasta Homo_sapiens.GRCh37.dna.toplevel.fa.gz \
--input-vcf tests/test.vcf --output-maf test.vep.maf
Vcf2phylip¶
Introduction¶
vcf2phylip is a tool to convert SNPs in VCF format to PHYLIP, NEXUS, binary NEXUS, or FASTA alignments for phylogenetic analysis.
Versions¶
2.8
Commands¶
vcf2phylip.py
Module¶
You can load the modules by:
module load biocontainers
module load vcf2phylip
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf2phylip on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2phylip
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2phylip
vcf2phylip.py --input myfile.vcf
Vcf2tsvpy¶
Introduction¶
Vcf2tsvpy is a small Python program that converts genomic variant data encoded in VCF format into a tab-separated values (TSV) file.
Versions¶
0.6.0
Commands¶
vcf2tsvpy
Module¶
You can load the modules by:
module load biocontainers
module load vcf2tsvpy
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf2tsvpy on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf2tsvpy
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf2tsvpy
Vcf-kit¶
Introduction¶
VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files.
Versions¶
0.2.6
0.2.9
Commands¶
vk
Module¶
You can load the modules by:
module load biocontainers
module load vcf-kit
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vcf-kit on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcf-kit
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcf-kit
VCFtools¶
Introduction¶
VCFtools
is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
Versions¶
0.1.16
Commands¶
vcftools
Module¶
You can load the modules by:
module load biocontainers
module load vcftools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run VCFtools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vcftools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vcftools
vcftools --vcf input_data.vcf --chr 1 \
--from-bp 1000000 --to-bp 2000000
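As written, the command above only reports how many sites pass the filters. To also write the retained sites to a new VCF, VCFtools provides --recode, --recode-INFO-all and --out. The sketch below adds them to the same filters. The output prefix is a hypothetical name chosen for illustration.

```shell
#!/bin/bash
# OUT_PREFIX is a hypothetical name chosen for this illustration.
OUT_PREFIX="chr1_1Mb_2Mb"

cmd=(vcftools --vcf input_data.vcf --chr 1
     --from-bp 1000000 --to-bp 2000000
     --recode --recode-INFO-all
     --out "$OUT_PREFIX")

echo "${cmd[@]}"
# With --recode, VCFtools names the filtered file <prefix>.recode.vcf.
echo "filtered VCF: ${OUT_PREFIX}.recode.vcf"

# Run only when the tool and the input file are both present.
if command -v vcftools >/dev/null 2>&1 && [ -f input_data.vcf ]; then
    "${cmd[@]}"
fi
```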
Velocyto.py¶
Introduction¶
Velocyto.py
is a library for the analysis of RNA velocity.
Detailed information about velocyto.py can be found here: https://github.com/velocyto-team/velocyto.py.
Versions¶
0.17.17
Commands¶
python
python3
velocyto
Module¶
You can load the modules by:
module load biocontainers
module load velocyto.py/0.17.17-py39
Interactive job¶
To run Velocyto.py
interactively on our clusters:
(base) UserID@bell-fe00:~ $ sinteractive -N1 -n12 -t4:00:00 -A myallocation
salloc: Granted job allocation 12345869
salloc: Waiting for resource configuration
salloc: Nodes bell-a008 are ready for job
(base) UserID@bell-a008:~ $ module load biocontainers velocyto.py/0.17.17-py39
(base) UserID@bell-a008:~ $ python
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:24:11)
[GCC 9.4.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import velocyto as vcy
>>> vlm = vcy.VelocytoLoom("YourData.loom")
>>> vlm.normalize("S", size=True, log=True)
>>> vlm.S_norm # contains the log-normalized values
Batch job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To submit a sbatch job on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 10:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=Velocyto
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers velocyto.py/0.17.17-py39
velocyto run10x cellranger_count_1kpbmcs_out refdata-gex-GRCh38-2020-A/genes/genes.gtf
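The job above requests 24 tasks, but `velocyto run10x` only uses extra threads for its samtools sorting step when told to with `-@` (`--samtools-threads`). A minimal sketch, assuming the same input paths as above and that the Cell Ranger folder contains an `outs` subdirectory:

```shell
# Paths taken from the example above; replace them with your own data.
sample_dir="cellranger_count_1kpbmcs_out"
gtf="refdata-gex-GRCh38-2020-A/genes/genes.gtf"

# Use the task count granted by Slurm; fall back to 1 outside a job.
threads="${SLURM_NTASKS:-1}"

if command -v velocyto >/dev/null 2>&1 && [ -d "$sample_dir/outs" ] && [ -f "$gtf" ]; then
    # -@ sets the number of threads used by samtools when sorting the BAM file.
    velocyto run10x -@ "$threads" "$sample_dir" "$gtf"
else
    echo "velocyto or its inputs were not found; nothing was run" >&2
fi
```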
Velvet¶
Introduction¶
Velvet
is a sequence assembler for very short reads.
Versions¶
1.2.10
Commands¶
velveth
velvetg
Module¶
You can load the modules by:
module load biocontainers
module load velvet
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Velvet on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=velvet
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers velvet
velveth output_directory 21 -fasta -short solexa1.fa solexa2.fa solexa3.fa -long capillary.fa
velvetg output_directory -cov_cutoff 4
Veryfasttree¶
Introduction¶
VeryFastTree is a highly-tuned implementation of the FastTree-2 tool that takes advantage of parallelization and vectorization strategies to speed up the inference of phylogenies for huge alignments. VeryFastTree keeps the phases, methods and heuristics used by FastTree-2 to estimate the phylogenetic tree unchanged, so it produces trees with the same topological accuracy as FastTree-2. In addition, unlike the parallel version of FastTree-2, VeryFastTree is deterministic.
Versions¶
3.2.1
Commands¶
VeryFastTree
Module¶
You can load the modules by:
module load biocontainers
module load veryfasttree
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run veryfasttree on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=veryfasttree
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers veryfasttree
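The job script above stops after loading the module. A tree-building command could follow it, as in this minimal sketch. The alignment name `alignment.fasta` is hypothetical, and the options shown (`-nt` for nucleotide data, `-gtr` for the GTR model, `-threads` for the thread count) are one possible choice, not a recommended setting.

```shell
# Hypothetical nucleotide alignment; replace with your own file.
alignment="alignment.fasta"

# Use the task count granted by Slurm; fall back to 1 outside a job.
threads="${SLURM_NTASKS:-1}"

# Name the output tree after the alignment (alignment.fasta -> alignment.tree).
tree="${alignment%.fasta}.tree"

if command -v VeryFastTree >/dev/null 2>&1 && [ -f "$alignment" ]; then
    # The tree is written to standard output in Newick format.
    VeryFastTree -nt -gtr -threads "$threads" "$alignment" > "$tree"
else
    echo "VeryFastTree or $alignment not found; nothing was run" >&2
fi
```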
Vg¶
Introduction¶
Variation graphs (vg) provides tools for working with genome variation graphs.
Quay.io: https://quay.io/repository/vgteam/vg?tab=info | Home page: https://github.com/vgteam/vg
Versions¶
1.40.0
Commands¶
vg
Module¶
You can load the modules by:
module load biocontainers
module load vg
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run vg on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vg
vg construct -r test/small/x.fa -v test/small/x.vcf.gz >x.vg
# GFA output
vg view x.vg >x.gfa
# dot output suitable for graphviz
vg view -d x.vg >x.dot
# And if you have a GAM file
cp test/small/x-s1337-n1.gam x.gam
# json version of binary alignments
vg view -a x.gam >x.json
vg align -s CTACTGACAGCAGAAGTTTGCTGTGAAGATTAAATTAGGTGATGCTTG x.vg
Viennarna¶
Introduction¶
Viennarna
is a set of standalone programs and libraries used for prediction and analysis of RNA secondary structures.
Versions¶
2.5.0
Commands¶
RNA2Dfold
RNALalifold
RNALfold
RNAPKplex
RNAaliduplex
RNAalifold
RNAcofold
RNAdistance
RNAdos
RNAduplex
RNAeval
RNAfold
RNAforester
RNAheat
RNAinverse
RNAlocmin
RNAmultifold
RNApaln
RNAparconv
RNApdist
RNAplex
RNAplfold
RNAplot
RNApvmin
RNAsnoop
RNAsubopt
RNAup
Kinfold
b2ct
popt
Module¶
You can load the modules by:
module load biocontainers
module load viennarna
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Viennarna on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=viennarna
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers viennarna
RNAfold < test.seq
RNAfold -p --MEA < test.seq
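The commands above read sequences from `test.seq`. A small input file can be created as follows for a quick trial. The sequence is made up for illustration, and `--noPS` is used only to suppress the PostScript structure drawing.

```shell
# Write a small FASTA-style input file; the sequence is an invented example.
printf '>demo_hairpin\nGGGGAAAACCCC\n' > test.seq

if command -v RNAfold >/dev/null 2>&1; then
    # --noPS skips writing the PostScript drawing of the structure.
    RNAfold --noPS < test.seq
else
    echo "RNAfold not found; load the viennarna module first" >&2
fi
```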
Vsearch¶
Introduction¶
Vsearch
is a versatile open source tool for metagenomics.
Versions¶
2.19.0
2.21.1
2.22.1
Commands¶
vsearch
Module¶
You can load the modules by:
module load biocontainers
module load vsearch
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Vsearch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=vsearch
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers vsearch
vsearch -sintax SRR8723605_merged.fasta -db rdp_16s_v16_sp.fa \
-tabbedout SRR8723605_out.txt -strand both -sintax_cutoff 0.5
Weblogo¶
Introduction¶
Weblogo
is a web-based application designed to make the generation of sequence logos as easy and painless as possible.
Versions¶
3.7.8
Commands¶
weblogo
Module¶
You can load the modules by:
module load biocontainers
module load weblogo
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Weblogo on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=weblogo
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers weblogo
weblogo --resolution 600 --format PNG \
<seq.fasta >logo.png
Whatshap¶
Introduction¶
Whatshap is software for phasing genomic variants using DNA sequencing reads, also called read-based phasing or haplotype assembly. It is especially suitable for long reads, but also works well with short reads.
Versions¶
1.4
Commands¶
whatshap
Module¶
You can load the modules by:
module load biocontainers
module load whatshap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run whatshap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=whatshap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers whatshap
whatshap phase --indels \
--reference=reference.fasta \
variants.vcf pacbio.bam
Wiggletools¶
Introduction¶
The WiggleTools package allows genomewide data files to be manipulated as numerical functions, equipped with all the standard functional analysis operators (sum, product, product by a scalar, comparators), and derived statistics (mean, median, variance, stddev, t-test, Wilcoxon’s rank sum test, etc).
Versions¶
1.2.11
Commands¶
wiggletools
Module¶
You can load the modules by:
module load biocontainers
module load wiggletools
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run wiggletools on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=wiggletools
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers wiggletools
wiggletools test/fixedStep.wig
wiggletools test/fixedStep.bw
wiggletools test/bedfile.bg
wiggletools test/overlapping.bed
wiggletools test/bam.bam
wiggletools test/cram.cram
wiggletools test/vcf.vcf
wiggletools test/bcf.bcf
Winnowmap¶
Introduction¶
Winnowmap is a long-read mapping algorithm optimized for mapping ONT and PacBio reads to repetitive reference sequences.
Versions¶
2.03
Commands¶
winnowmap
Module¶
You can load the modules by:
module load biocontainers
module load winnowmap
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run winnowmap on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --job-name=winnowmap
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers winnowmap
winnowmap -W repetitive_k15.txt \
-ax map-pb Cm.contigs.fasta \
SRR3982487.fastq > output.sam
Wtdbg2¶
Introduction¶
Wtdbg2
is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT).
Versions¶
2.5
Commands¶
wtdbg-cns
wtdbg2
wtpoa-cns
Module¶
You can load the modules by:
module load biocontainers
module load wtdbg
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run Wtdbg2 on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 24
#SBATCH --job-name=wtdbg
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml biocontainers wtdbg
wtpoa-cns -t 24 -i dbg.ctg.lay.gz -fo dbg.ctg.fa
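The consensus step above reads `dbg.ctg.lay.gz`, which is the contig layout produced by a preceding `wtdbg2` assembly run. A minimal sketch of both steps follows. The read file `reads.fa.gz`, the PacBio RSII preset (`-x rs`) and the genome size (`-g 4.6m`) are assumptions for illustration and must be adapted to your data.

```shell
# Hypothetical input reads and output prefix.
reads="reads.fa.gz"
prefix="dbg"

# Use the task count granted by Slurm; fall back to 1 outside a job.
threads="${SLURM_NTASKS:-1}"

# wtdbg2 writes the contig layout to <prefix>.ctg.lay.gz.
layout="${prefix}.ctg.lay.gz"

if command -v wtdbg2 >/dev/null 2>&1 && [ -f "$reads" ]; then
    # Step 1: assemble the long reads (preset and genome size are assumptions).
    wtdbg2 -x rs -g 4.6m -i "$reads" -t "$threads" -fo "$prefix"
    # Step 2: derive the consensus sequence from the layout.
    wtpoa-cns -t "$threads" -i "$layout" -fo "${prefix}.ctg.fa"
else
    echo "wtdbg2 or $reads not found; nothing was run" >&2
fi
```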
NVIDIA NGC containers¶
autodock¶
Description¶
The AutoDock Suite is a growing collection of methods for computational docking and virtual screening, for use in structure-based drug discovery and exploration of the basic mechanisms of biomolecular structure and function.
Versions¶
2020.06
Module¶
You can load the modules by:
module load ngc
module load autodock
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run autodock on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=autodock
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc autodock
gamess¶
Description¶
The General Atomic and Molecular Electronic Structure System (GAMESS) program simulates molecular quantum chemistry, allowing users to calculate various molecular properties and dynamics.
Versions¶
17.09-r2-libcchem
Module¶
You can load the modules by:
module load ngc
module load gamess
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gamess on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=gamess
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc gamess
gromacs¶
Description¶
GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids. It was originally developed in the Biophysical Chemistry department of University of Groningen, and is now maintained by contributors in universities and research centers across the world.
Versions¶
2018.2
2020.2
2021
2021.3
Module¶
You can load the modules by:
module load ngc
module load gromacs
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run gromacs on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=gromacs
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc gromacs
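The job script above stops after loading the module. A GPU run could follow it, as in this minimal sketch. It assumes the module exposes the `gmx` command and that a prepared run input file named `topol.tpr` (a hypothetical name) is in the working directory.

```shell
# Hypothetical run input file prepared beforehand with gmx grompp.
tpr="topol.tpr"

# Match the OpenMP thread count to the cores requested with -c; default to 1.
omp_threads="${SLURM_CPUS_PER_TASK:-1}"

if command -v gmx >/dev/null 2>&1 && [ -f "$tpr" ]; then
    # One thread-MPI rank with several OpenMP threads; nonbonded work on the GPU.
    gmx mdrun -s "$tpr" -ntmpi 1 -ntomp "$omp_threads" -nb gpu -pin on
else
    echo "gmx or $tpr not found; nothing was run" >&2
fi
```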
julia¶
Description¶
The Julia programming language is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.
Versions¶
v1.5.0
v2.4.2
Module¶
You can load the modules by:
module load ngc
module load julia
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run julia on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=julia
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc julia
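The job script above stops after loading the module. As a quick check that Julia starts and sees the requested cores, a one-line program can be run. This sketch assumes the module exposes the `julia` command.

```shell
# Let Julia use the cores requested with -c; default to 1 outside a job.
export JULIA_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

if command -v julia >/dev/null 2>&1; then
    # Print the Julia version and the number of threads it started with.
    julia -e 'println("Julia ", VERSION, " with ", Threads.nthreads(), " thread(s)")'
else
    echo "julia not found; load the ngc and julia modules first" >&2
fi
```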
lammps¶
Description¶
Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a software application designed for molecular dynamics simulations. It has potentials for solid-state materials (metals, semiconductors), soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
Versions¶
10Feb2021
15Jun2020
24Oct2018
29Oct2020
Module¶
You can load the modules by:
module load ngc
module load lammps
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run lammps on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=lammps
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc lammps
namd¶
Description¶
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR.
Versions¶
2.13-multinode
2.13-singlenode
3.0-alpha3-singlenode
Module¶
You can load the modules by:
module load ngc
module load namd
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run namd on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=namd
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc namd
nvhpc¶
Description¶
The NVIDIA HPC SDK C, C++, and Fortran compilers support GPU acceleration of HPC modeling and simulation applications with standard C++ and Fortran, OpenACC® directives, and CUDA®. GPU-accelerated math libraries maximize performance on common HPC algorithms, and optimized communications libraries enable standards-based multi-GPU and scalable systems programming.
Versions¶
20.7
20.9
20.11
21.5
21.9
Module¶
You can load the modules by:
module load ngc
module load nvhpc
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run nvhpc on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=nvhpc
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc nvhpc
parabricks¶
Description¶
NVIDIA's Clara Parabricks brings next generation sequencing to GPUs, accelerating an array of gold-standard tooling such as BWA-MEM, GATK4, Google's DeepVariant, and many more. Users can achieve a 30-60x acceleration and 99.99% accuracy for variant calling when comparing against CPU-only BWA-GATK4 pipelines, meaning a single server can process up to 60 whole genomes per day. These tools can be easily integrated into current pipelines with drop-in replacement commands to quickly bring speed and data-center scale to a range of applications including germline, somatic and RNA workflows.
Versions¶
4.0.0-1
Module¶
You can load the modules by:
module load ngc
module load parabricks
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run parabricks on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=parabricks
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc parabricks
paraview¶
Description¶
This container does not include the ParaView client GUI, but the ParaView Web application is included.
Versions¶
5.9.0
Module¶
You can load the modules by:
module load ngc
module load paraview
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run paraview on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=paraview
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc paraview
pytorch¶
Description¶
PyTorch is a GPU accelerated tensor computational framework with a Python front end. Functionality can be easily extended with common Python libraries such as NumPy, SciPy, and Cython. Automatic differentiation is done with a tape-based system at both a functional and neural network layer level. This functionality brings a high level of flexibility and speed as a deep learning framework and provides accelerated NumPy-like functionality.
Versions¶
20.02-py3
20.03-py3
20.06-py3
20.11-py3
20.12-py3
21.06-py3
21.09-py3
Module¶
You can load the modules by:
module load ngc
module load pytorch
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run pytorch on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=pytorch
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc pytorch
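The job script above stops after loading the module. Before starting a long training run, it can help to confirm that PyTorch sees the GPU. This sketch assumes the module exposes a `python` command from the container.

```shell
# One-line Python program: prints True when PyTorch can use a CUDA GPU.
gpu_check='import torch; print(torch.cuda.is_available())'

if command -v python >/dev/null 2>&1 && python -c 'import torch' >/dev/null 2>&1; then
    python -c "$gpu_check"
else
    echo "PyTorch is not available in this shell; load the ngc and pytorch modules first" >&2
fi
```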
qmcpack¶
Description¶
QMCPACK is an open-source, high-performance electronic structure code that implements numerous Quantum Monte Carlo algorithms. Its main applications are electronic structure calculations of molecular, periodic 2D and periodic 3D solid-state systems. Variational Monte Carlo (VMC), diffusion Monte Carlo (DMC) and a number of other advanced QMC algorithms are implemented. By directly solving the Schrödinger equation, QMC methods offer greater accuracy than methods such as density functional theory, but at a trade-off of much greater computational expense. Distinct from many other correlated many-body methods, QMC methods are readily applicable to both bulk periodic and isolated molecular systems.
Versions¶
v3.5.0
Module¶
You can load the modules by:
module load ngc
module load qmcpack
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run qmcpack on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=qmcpack
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc qmcpack
quantum_espresso¶
Description¶
Quantum ESPRESSO is an integrated suite of Open-Source computer codes for electronic-structure calculations and materials modeling at the nanoscale based on density-functional theory, plane waves, and pseudopotentials.
Versions¶
v6.6a1
v6.7
Module¶
You can load the modules by:
module load ngc
module load quantum_espresso
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run quantum_espresso on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=quantum_espresso
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc quantum_espresso
rapidsai¶
Description¶
The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
Versions¶
0.12
0.13
0.14
0.15
0.16
0.17
21.06
21.10
Module¶
You can load the modules by:
module load ngc
module load rapidsai
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run rapidsai on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=rapidsai
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc rapidsai
relion¶
Description¶
RELION (for REgularized LIkelihood OptimizatioN) implements an empirical Bayesian approach for analysis of electron cryo-microscopy (cryo-EM). Specifically, it provides methods of refinement of singular or multiple 3D reconstructions as well as 2D class averages. RELION is an important tool in the study of living cells.
Versions¶
2.1.b1
3.1.0
3.1.2
3.1.3
Module¶
You can load the modules by:
module load ngc
module load relion
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run relion on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=relion
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc relion
tensorflow¶
Description¶
TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.
Versions¶
20.02-tf1-py3
20.02-tf2-py3
20.03-tf1-py3
20.03-tf2-py3
20.06-tf1-py3
20.06-tf2-py3
20.11-tf1-py3
20.11-tf2-py3
20.12-tf1-py3
20.12-tf2-py3
21.06-tf1-py3
21.06-tf2-py3
21.09-tf1-py3
21.09-tf2-py3
Module¶
You can load the modules by:
module load ngc
module load tensorflow
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run tensorflow on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=tensorflow
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc tensorflow
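The job script above stops after loading the module. To confirm that TensorFlow sees the GPU, the visible devices can be listed. This sketch assumes the module exposes a `python` command and that a TensorFlow 2 (`tf2`) version is loaded; the `tf1` containers may not provide `tf.config.list_physical_devices`.

```shell
# One-line Python program: lists the GPUs visible to TensorFlow 2.
gpu_check='import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'

if command -v python >/dev/null 2>&1 && python -c 'import tensorflow' >/dev/null 2>&1; then
    python -c "$gpu_check"
else
    echo "TensorFlow is not available in this shell; load the ngc and tensorflow modules first" >&2
fi
```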
torchani¶
Description¶
TorchANI is a PyTorch-based program for training/inference of ANI (ANAKIN-ME) deep learning models to obtain potential energy surfaces and other physical properties of molecular systems.
Versions¶
2021.04
Module¶
You can load the modules by:
module load ngc
module load torchani
Example job¶
Warning
Using #!/bin/sh -l
as shebang in the slurm job script will cause the failure of some biocontainer modules. Please use #!/bin/bash
instead.
To run torchani on our clusters:
#!/bin/bash
#SBATCH -A myallocation # Allocation name
#SBATCH -t 1:00:00
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8
#SBATCH --gpus-per-node=1
#SBATCH --job-name=torchani
#SBATCH --mail-type=FAIL,BEGIN,END
#SBATCH --error=%x-%J-%u.err
#SBATCH --output=%x-%J-%u.out
module --force purge
ml ngc torchani