Welcome to CNVision

This program was designed and written by Stephan Sanders and Chris Mason from the State Lab at Yale University. It is designed to simplify all stages of predicting and analyzing CNVs, from running prediction algorithms and combining their results to visualizing the raw data and designing qPCR primers for confirmation.

If this is your first time using CNVision, see the CNVision basics page for a quick overview.

Downloading CNVision

CNVision can be downloaded from SourceForge.net at this address:

https://sourceforge.net/projects/cnvision/files/

Installing CNVision

For installation directions, see the CNVision install page.

CNV Data

  • BeadStudio Imports raw data and creates a new BeadStudio project
  • FinalReports Converts a BeadStudio project into data that CNVision can analyze

CNV Pipeline

  • See Pipeline Shows an image of the entire pipeline: CNVision Pipeline Map
  • Start Initiates a Logfile and Sequence tracker file and asks how many processors to use
  • Gender Determines the gender of FinalReports by looking at chrX homozygosity
  • Quality Checks FinalReports quality by counting the number of probes with extreme and wide logR values
  • Convert Converts FinalReports into suitable input files for PennCNV and QuantiSNP
  • Input Makes a list of all files made during the Convert function and prepares a list of files for PennCNV and a batch file for QuantiSNP
  • GNOSIS Runs the GNOSIS CNV detection tool on all files with ‘FinalReport’ in their name
  • GNOSIS old Runs a version of the GNOSIS CNV detection tool that modifies LogR values on chrX and chrY for both sexes; this is not necessary for newer Illumina chips (1M Duo, 370 Quad, Omni)
  • RF GNOSIS Reformats GNOSIS output to allow merging
  • PennCNV Runs the PennCNV CNV detection tool on all files in the list created by Input
  • RF PennCNV Reformats PennCNV output to allow merging
  • QuantiSNP Runs the QuantiSNP CNV detection tool on all files in the batch file created by Input
  • RF QuantiSNP Reformats QuantiSNP output to allow merging
  • Merge Merges files in the Merge input format
  • Rare Compares merged output to a list of common loci in the Annotation Folder
  • Bad Combines lists of samples that failed quality control into a single list
  • Pedfile Creates a modified pedfile without samples in the list generated by Bad
  • Denovo Compares children to parents using a pedfile (with ‘pedfile’ in the filename)
  • Summary Converts the Denovo output into a readable format and makes a UCSC genome browser file
  • Annotate Annotates the Summary output for genes, common variants and regions of interest
  • Homodel Finds homozygous deletions in FinalReports
  • RF Homodel Reformats Homodel output to the Merge input format
  • Annotate Annotates the Homodel output for genes, common variants and regions of interest
  • Clean Removes any unnecessary files from the folder

Quality Control

  • Quality Checks FinalReports quality by counting the number of probes with extreme and wide logR values
  • Family Checks the family relationships of all FinalReports in a pedfile using PLINK
  • Order Checks that all FinalReports are in the same order
  • Strand Checks that all FinalReports are from the same strand
  • Overlap Looks for overlap between regions or CNVs in a file
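
The Gender function in the CNV Pipeline determines gender from chrX homozygosity. As a rough illustration only, the idea can be sketched in Python as below. This is not CNVision's own code: the function names, the no-call markers ("NC" and "--") and the 0.95 cutoff are all assumptions made for the example.

```python
def chrx_homozygosity(genotypes):
    """Return the fraction of called chrX genotypes that are homozygous.

    `genotypes` is a list of two-letter calls such as "AA", "AB" or "BB".
    No-calls (assumed here to be written "NC" or "--") are ignored.
    """
    called = [g for g in genotypes if len(g) == 2 and g not in ("NC", "--")]
    if not called:
        raise ValueError("no called chrX genotypes")
    homozygous = sum(1 for g in called if g[0] == g[1])
    return homozygous / len(called)


def infer_gender(genotypes, male_cutoff=0.95):
    """Call 'male' when chrX is almost entirely homozygous.

    The cutoff is a hypothetical value chosen for illustration.
    """
    if chrx_homozygosity(genotypes) >= male_cutoff:
        return "male"
    return "female"
```

A sample with a single X chromosome should show almost no heterozygous chrX calls, so a high homozygous fraction suggests a male sample; a real cutoff would need to allow for genotyping error and pseudoautosomal probes.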

CNV Analysis

  • De Novo 2 A different method of de novo CNV prediction that uses the FinalReports and a pedfile to work out whether a list of CNVs are de novo or inherited
  • Large Joins together large CNVs that were split into smaller predictions separated by gaps
  • Recurrent Works out the relative frequency of CNVs from two files of CNVs
  • Recurrent thres Same as Recurrent, but filters the input by CNV type, number of algorithms and number of probes
  • Frequency Determines the frequency of CNVs/regions across all samples
  • eTaqman Examines all FinalReports for evidence of CNVs at specified co-ordinates
  • Loci Reduces a list of regions/CNVs to non-overlapping loci
  • Compare Compares the output of two algorithms/chips run on the same samples
  • Proportion Calculates the number of CNVs and individuals meeting specified criteria to calculate CNV burden
  • Template Creates a template for running Multiproportion
  • Multiproportion Similar to Proportion, but can do numerous analyses. Use the Template function above to create a template specifying the analyses to perform, add the analyses, then run this function.
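
The Loci function above reduces a list of regions/CNVs to non-overlapping loci. A minimal Python sketch of that kind of reduction follows. It is an illustration, not CNVision's implementation: the function name and the (chr, start, stop) tuple layout are assumptions, and regions that merely touch (start equal to the previous stop) are treated as overlapping.

```python
def to_loci(regions):
    """Collapse (chr, start, stop) regions into non-overlapping loci."""
    loci = []
    # Sorting groups regions by chromosome, then orders them by start.
    for chrom, start, stop in sorted(regions):
        if loci and loci[-1][0] == chrom and start <= loci[-1][2]:
            # Overlaps the previous locus on the same chromosome: extend it.
            loci[-1][2] = max(loci[-1][2], stop)
        else:
            loci.append([chrom, start, stop])
    return [tuple(locus) for locus in loci]
```

For example, the regions chr1:100-200 and chr1:150-300 collapse into the single locus chr1:100-300, while chr1:400-500 stays separate.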

Annotation

  • Annotate Annotates a list of regions (chr, start, stop) against a series of annotation files (e.g. genes); this function allows numerous analyses and can be customised using the Template function below
  • Template Makes a custom annotation template of all the files in the Annotation Folder
  • Pedfile Annotates a list of regions (chr, start, stop) against a pedfile to show family, gender and phenotype
  • Samples Annotates a list of regions against a list of samples
  • Genes Finds the chr, position, number of exons and gene name of each RefSeq Gene Symbol (e.g. NRXN1)
  • Genes2 Similar to Genes, but performs the analysis over the internet and is slower
  • SNPcount Works out the number of probes in a FinalReports file within each region/CNV
  • dbSNP Works out the number of common SNPs present in dbSNP within each region/CNV
  • SNP Uses the internet to look up SNP ‘rs’ IDs in dbSNP and return chr and position

Visualization

  • CNVs Makes a PDF of the LogR and BAF values for CNVs per family (SOR plot)
  • BEDgraph Makes a UCSC genome browser custom track of Frequency files
  • Wiggle Makes a UCSC genome browser custom track of large Frequency files
  • BEDfile Makes a UCSC genome browser custom track of CNV files
  • FinalReport Makes a UCSC genome browser custom track of a FinalReport
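
The BEDfile function above makes a UCSC genome browser custom track from CNV files. To show roughly what such a track looks like, here is a small Python sketch. It is not CNVision's code: the function name, the input tuple layout and the assumption that input coordinates are 1-based and inclusive are all hypothetical.

```python
def cnvs_to_bed(cnvs, track_name="CNVs"):
    """Build the text of a UCSC custom track from (chr, start, stop, label) tuples."""
    lines = [f'track name="{track_name}" description="{track_name}" visibility=2']
    for chrom, start, stop, label in cnvs:
        # BED starts are 0-based and ends are exclusive, so a 1-based
        # inclusive start (assumed input convention) is shifted down by one.
        lines.append(f"{chrom}\t{start - 1}\t{stop}\t{label}")
    return "\n".join(lines) + "\n"
```

The returned text can be saved to a file and uploaded on the genome browser's custom track page.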

FinalReport Analysis

  • SampleID Quickly finds the sample ID and chip type from FinalReports
  • Gender Looks for ‘FinalReport’ and determines sample ID, chip type and sex
  • Ancestry Uses 400 ancestry informative markers (AIMs) to determine continent of origin
  • Mean LogR Calculates the mean, median and standard deviation of LogR for each chromosome
  • Consensus Makes a list of the probes common to different chip types in FinalReports
  • Prune Removes probes that are not present in the consensus list from Consensus out of FinalReports
  • PLINK Converts FinalReports into PLINK input format (.ped and .map)
  • Homodel Finds homozygous deletions in FinalReports
  • UPD Calculates the percentage homozygosity per chromosome to detect uniparental disomy
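
The Mean LogR function above summarizes LogR per chromosome. A minimal Python sketch of that kind of summary is shown below. It is an illustration under stated assumptions, not CNVision's code: the function name and the (chromosome, LogR) input pairs are hypothetical, and the sample standard deviation is used here although the source does not say which form CNVision reports.

```python
import statistics
from collections import defaultdict


def logr_stats(probes):
    """Return {chromosome: (mean, median, stdev)} for (chromosome, LogR) pairs."""
    by_chrom = defaultdict(list)
    for chrom, logr in probes:
        by_chrom[chrom].append(logr)

    stats = {}
    for chrom, values in by_chrom.items():
        # The sample standard deviation needs at least two values;
        # a single probe is reported as 0.0 (an arbitrary choice).
        spread = statistics.stdev(values) if len(values) > 1 else 0.0
        stats[chrom] = (statistics.mean(values), statistics.median(values), spread)
    return stats
```

A chromosome whose mean LogR sits well away from zero may point to a whole-chromosome gain or loss, or to a noisy sample.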

File Manipulation

  • List Makes a list of every file and folder
  • Copy Copies files in a text file to a new folder
  • Move Moves files in a text file to a new folder
  • vLookup Looks up terms in a reference file to add a new column
  • Split Looks for differences in a specified column and splits the file accordingly
  • Combine Combines multiple files that have columns in the same order
  • Sort Sorts the file by up to three columns
  • Columns Changes the order of columns in a file
  • Trim top Removes lines from the top of a file
  • Trim bottom Removes lines from the bottom of a file (useful for making a testfile)
  • Duplicates Removes duplicate lines in a file
  • Differences Finds differences between two files that should be identical
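
The vLookup function above looks up terms in a reference file to add a new column. The general idea can be sketched in Python as follows. This is only an illustration: the function name, the use of a dictionary as the reference and the "NA" placeholder for missing terms are assumptions, not CNVision's behavior.

```python
def vlookup(rows, reference, key_col=0, default="NA"):
    """Append to each row the value that `reference` holds for row[key_col].

    `rows` is a list of lists (one per line of the file) and `reference`
    maps a lookup term to the value for the new column.
    """
    return [row + [reference.get(row[key_col], default)] for row in rows]
```

For example, looking up sample IDs in a table of genders adds a gender column to every row, with "NA" wherever the sample ID is missing from the reference.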

Other Tools

  • GC content Calculates the GC content for a list of regions
  • Get DNA Looks up the DNA sequence for a list of regions
  • Primers Designs qPCR or PCR primers for a list of regions
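
The GC content function above calculates the GC content for a list of regions. Once the sequence of a region is in hand, the calculation itself is simple; a minimal Python sketch is given below. The function name is hypothetical, and ignoring ambiguous bases such as N is an assumption rather than documented CNVision behavior.

```python
def gc_content(sequence):
    """Return the fraction of A/C/G/T bases in `sequence` that are G or C."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        # No unambiguous bases: report 0.0 rather than dividing by zero.
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)
```

GC content matters when designing primers, because it affects melting temperature.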

Advanced Users

The graphical user interface (CNVision.jar) is not necessary for running the functions. The Perl script Combined_CNVv1.73.pl can be used on its own to perform all of these functions from the command prompt (Windows) or a terminal (Mac and Linux). The page describing each function also gives its command-line usage.