Point-and-click tools for assembly, mapping and amplicon variant analysis
The GS Data Analysis Software package includes tools to investigate complex genomic variation in samples, including de novo assembly, reference-guided alignment and variant calling, and low-abundance variant identification and quantification. The software suite is provided with the GS Junior and GS FLX Systems at no additional cost and allows researchers to begin interpreting sequence data immediately, without investing in complex and expensive third-party solutions. Each of the software tools incorporates flow and signal information into its sequence analysis algorithms, leading to higher-confidence variant calling. Additionally, researchers can interrogate sequence data down to the flow-by-flow signal intensities used in base calling.
Point-and-click genome and transcriptome assembly
- A powerful tool for de novo assembly of genomes up to 3 Gb in size
- Microbial genome assembly on commodity workstation hardware
- Perform whole-genome assembly with shotgun reads alone or in combination with 3, 8, or 20 kb span paired-end reads, which order and join contigs into scaffolds to accurately reconstruct the structure of a genome
- Produce high-quality assemblies for microbial genomes in as little as 15 minutes, and in less than 24 hours for larger genomes
- Perform de novo assembly of EST reads from cDNA library sequencing runs to accurately reconstruct the transcriptome and identify novel genes, isoform variants and transcript fusions
- Graphical software environment for quick project set-up and assembly viewing down to the flowgram level
- Command line operation available for power users and scripting
- Perform hybrid assemblies using GS FLX and GS Junior shotgun and paired-end reads together with additional capillary-sequencing or short-read sequencing reads (FASTA or FASTQ)
- Data output: contig sequence and quality files (sequence of contigs and corresponding Phred-equivalent quality scores, FASTA format), .ace file (alignment of the reads to the contig sequences), optional Consed output and more
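As an illustration of working with the contig sequence and quality outputs listed above, the following minimal Python sketch pairs each contig with its Phred-equivalent scores and computes an N50 value. It assumes the quality file uses FASTA-style headers followed by space-separated integer scores; the function names (`parse_records`, `load_contigs`, `n50`) and the contig names in the usage note are hypothetical and not part of the GS software.

```python
from typing import Dict, Iterable, List, Tuple


def parse_records(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group the body lines of FASTA-style text under each record ID."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after '>'.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return records


def load_contigs(
    fna_lines: Iterable[str], qual_lines: Iterable[str]
) -> Dict[str, Tuple[str, List[int]]]:
    """Pair every contig sequence with its per-base quality scores."""
    seqs = {k: "".join(v) for k, v in parse_records(fna_lines).items()}
    # Assumption: quality bodies are space-separated integers.
    quals = {
        k: [int(q) for q in " ".join(v).split()]
        for k, v in parse_records(qual_lines).items()
    }
    contigs: Dict[str, Tuple[str, List[int]]] = {}
    for name, seq in seqs.items():
        scores = quals.get(name, [])
        if len(scores) != len(seq):
            raise ValueError(f"{name}: {len(seq)} bases but {len(scores)} scores")
        contigs[name] = (seq, scores)
    return contigs


def n50(lengths: List[int]) -> int:
    """Length of the contig at which half of the total assembly is covered."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0
```

With two toy records, `load_contigs` returns a dictionary keyed by contig name, and `n50([len(seq) for seq, _ in contigs.values()])` summarizes contiguity. Real output files may carry extra header fields, which this sketch ignores.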
Screenshot of the GS De Novo Assembler illustrating an E. coli assembly project. Individual reads making up consensus contigs are listed in the multiple alignment view. The user interface allows viewing of flowgram data backing sequence reads.
Reference-guided alignment and variation detection for resequencing projects
- Rapidly and accurately align reads to any reference genome
- Identify differences compared to the reference
- Annotate reference features and variations
- Explore the full spectrum of genomic variation:
  - Local variation detection: SNPs, insertions and deletions (blocks up to 50 bases)
  - Structural variation detection: large insertions and deletions, inversions, duplications, translocations and fusions
- Data outputs: .fna file (sequence of contigs, FASTA format), .qual file (corresponding Phred-equivalent quality scores), .ace file (consensus alignment of the reads against a given reference sequence), and SAM/BAM (industry-standard alignment format)
Screenshot of the GS Reference Mapper showing two linked SNPs separated by 36 bp in the multiple alignment view. Each row is a unique sequencing read aligned to the reference genome and the highlighted regions show a single base substitution.
Screenshot of the GS Reference Mapper showing the flowgram for a single read compared with the idealized reference flowgram. The bottom panel is the subtraction of the two flowgrams, highlighting the two linked SNPs occurring in a single read.
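The flowgram comparison described in the caption above can be sketched in a few lines of Python. This is a simplified illustration, not the GS Reference Mapper's algorithm: it assumes a cyclic nucleotide flow order of T, A, C, G, and it treats the idealized flowgram as the integer homopolymer length incorporated at each flow. The names `FLOW_ORDER`, `ideal_flowgram` and `flowgram_difference` are hypothetical.

```python
from typing import List

# Assumption: nucleotides are flowed cyclically in this order.
FLOW_ORDER = "TACG"


def ideal_flowgram(
    sequence: str, n_flows: int, flow_order: str = FLOW_ORDER
) -> List[int]:
    """Expected noise-free signal per flow: the homopolymer length
    incorporated when each nucleotide is flowed over the template."""
    flows: List[int] = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        # Consume every consecutive base matching the flowed nucleotide.
        while pos < len(sequence) and sequence[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
    return flows


def flowgram_difference(
    read_signals: List[float], reference: str, flow_order: str = FLOW_ORDER
) -> List[float]:
    """Subtract the idealized reference flowgram from the read's signals."""
    ideal = ideal_flowgram(reference, len(read_signals), flow_order)
    return [round(obs - exp, 2) for obs, exp in zip(read_signals, ideal)]
```

For a read with signals `[2.04, 0.97, 0.02, 1.01]` and the reference `TTACG`, the third flow shows a difference close to -1, meaning the read lacks the C that the reference predicts there. A substitution can also shift later incorporations into different flows, so a practical comparison would align the two flowgrams rather than subtract them position by position.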
High-sensitivity variation detection from amplicon-based samples
- Aligns PCR amplicon reads against a reference sequence
- Accurately detects and quantifies known variants in complex pools
- Defines and discovers novel variants
- Performs haplotyping: identifies multiple linked variants over the full amplicon length
- Detects low-frequency (<1%) variants in complex mixtures, such as somatic mutations and viral quasispecies
- Collapses high-depth sequences into consensus sequences to explore the unique members of a mixture
- Flexible project set-up: separate samples / results based on MID tags (“barcodes”), associate amplicons, references and samples for simple and complex experimental designs
- Data outputs: .ace file (alignment of the reads against a reference sequence), .png file (graphical file format), tab-delimited text file, and SAM/BAM (industry-standard alignment format)
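To make the idea of quantifying a low-frequency variant concrete, here is a minimal Python sketch that counts non-reference bases at one position across a set of aligned amplicon reads. It assumes the reads are already aligned and padded to reference coordinates (with `-` for gaps), and it does not model sequencing error or use flow signal information as the software does. The function name `variant_frequencies` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def variant_frequencies(
    aligned_reads: List[str], reference: str, position: int
) -> Dict[str, float]:
    """Fraction of reads carrying each non-reference base at a
    0-based reference position."""
    # Only reads that cover the position contribute to the depth.
    observed = Counter(
        read[position] for read in aligned_reads if position < len(read)
    )
    depth = sum(observed.values())
    if depth == 0:
        return {}
    ref_base = reference[position]
    return {
        base: count / depth
        for base, count in observed.items()
        if base != ref_base
    }
```

With 995 reads matching the reference `ACGT` and 5 reads reading `ATGT`, calling `variant_frequencies(reads, "ACGT", 1)` reports the `T` variant at a frequency of 0.005, that is, 0.5% of reads. At such depths a raw count alone cannot separate true variants from sequencing errors, which is why this is only a starting point.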
Screenshot of the GS Amplicon Variant Analyzer illustrating a single base change compared to the reference for an amplicon sample. The top pane shows the variant frequency in the sample and the bottom pane shows the multiple alignment.
Screenshot of the GS Amplicon Variant Analyzer illustrating a multi-nucleotide deletion in an amplicon sample. The flowgram of a single read is compared against the idealized flowgram of the reference.