Something about Valis
Genomic insights at scale.
VALIS empowers organizations to analyze NGS data more efficiently
by using the latest technologies in big data and machine learning.

GENOMIC DATA HUB

A purpose-built genomic storage engine enables fast read-write access to FASTA, BAM, VCF, and BigWig data.
Sparse, columnar compression lowers costs by up to 10x while accelerating performance-intensive use cases.
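
To illustrate why sparse encodings suit genomic signal data, here is a minimal sketch. It is not the VALIS storage format, which this page does not describe. It run-length encodes a per-base coverage track (the kind of signal a BigWig file holds) and drops zero-valued runs. The names `encode_runs` and `decode_runs` and the coverage values are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int, float]  # (start, end_exclusive, value)


def encode_runs(values: List[float]) -> List[Run]:
    """Collapse consecutive equal values into runs, skipping zero-valued runs."""
    runs: List[Run] = []
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run at the end of the list or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if values[start] != 0:
                runs.append((start, i, values[start]))
            start = i
    return runs


def decode_runs(runs: List[Run], length: int) -> List[float]:
    """Rebuild the dense track; positions not covered by any run are zero."""
    out = [0.0] * length
    for start, end, value in runs:
        for pos in range(start, end):
            out[pos] = value
    return out


# Hypothetical coverage track: mostly zeros with two covered regions.
coverage = [0.0] * 1000 + [12.0] * 50 + [0.0] * 2000 + [7.0] * 25 + [0.0] * 500
runs = encode_runs(coverage)
print(len(coverage), "values stored as", len(runs), "runs")
```

Because most positions in a typical track are empty or repeat their neighbor, storing only the non-zero runs shrinks the data while keeping it losslessly recoverable.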
NGS Data
Manage all source and processed NGS data in a single repository. Track data lineage to ensure reproducibility. Re-process source data with the click of a button.
Clinical Data
Integrate sample and variant level annotations with existing NGS data.
Generate standardized reports such as GWAS, Kaplan-Meier curves, and many more.
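
As a rough illustration of what a Kaplan-Meier report computes, the sketch below implements the standard product-limit estimator in plain Python. It is not VALIS code; the function name `kaplan_meier` and the follow-up data are hypothetical.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one event.

    events[i] is True if the event was observed at durations[i],
    False if the sample was censored at that time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")
    at_risk = len(durations)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored samples leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in months; False marks a censored sample.
months = [5, 8, 8, 12, 15, 20]
observed = [True, True, False, True, False, True]
for t, s in kaplan_meier(months, observed):
    print(f"t={t:>2} months  S(t)={s:.3f}")
```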
Public Data
Curated, re-processed data from TCGA, ExAC, GTEx, ENCODE, dbSNP, gnomAD, and 100+ other datasets can be integrated with private data at the click of a button.
Analytics
A powerful library of customizable pipelines & algorithms. Fast data aggregation powered by Apache Spark. Elastically scale existing pipelines and analyses with Kubernetes.
Machine Learning
TensorFlow API access to BAM, VCF, and BigWig data makes training custom models quick and easy. Deploy models on large datasets with Apache Spark.
Genome Browser
Browse data using a high-performance genome browser that enables collaboration, multi-locus viewing, embedding, and customization.

Elastically Scalable Analysis

Process thousands of samples without breaking a sweat – VALIS is built on top of technologies proven to scale to petabytes of data.
Empower researchers with best-in-class tools and infrastructure. Run analyses interactively in R or Python with Jupyter.
[Figure: Trait Associated Loci for Lung Cancer (3 causal variants). Axes: chromosome (1 to 22, X, Y) and significance.]
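
The significance axis in a trait-association plot like the one above is conventionally the negative base-10 logarithm of each variant's p-value. The sketch below shows that transformation and a filter at the commonly used genome-wide threshold of 5e-8. The function name `significant_loci`, the positions, and the p-values are invented for illustration and are not results from VALIS.

```python
import math
from typing import Dict, List, Tuple

# Conventional genome-wide significance cutoff used in GWAS.
GENOME_WIDE_THRESHOLD = 5e-8


def significant_loci(
    p_values: Dict[Tuple[str, int], float],
    threshold: float = GENOME_WIDE_THRESHOLD,
) -> List[Tuple[str, int, float]]:
    """Return (chromosome, position, -log10(p)) for variants below the threshold,
    sorted from most to least significant."""
    hits = [
        (chrom, pos, -math.log10(p))
        for (chrom, pos), p in p_values.items()
        if p < threshold
    ]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Invented example data: (chromosome, position) -> association p-value.
example = {
    ("5", 1_000_000): 1e-12,
    ("15", 2_000_000): 3e-9,
    ("6", 300_000): 1e-7,
    ("X", 500_000): 0.02,
}
for chrom, pos, score in significant_loci(example):
    print(f"chr{chrom}:{pos}  -log10(p)={score:.2f}")
```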

Machine Learning

Load functional, variant, and alignment data directly into TensorFlow.
Train models using cloud GPUs and run them in parallel using Apache Spark.
[Figure: Deep Neural Network. T to C mutation increases/decreases methylation at locus.]
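
To make the mutation-effect example concrete, the sketch below shows one common way to prepare sequence input for a neural network: one-hot encoding a reference sequence and its mutated counterpart. It uses plain Python rather than TensorFlow and does not reflect the VALIS API; `one_hot`, `apply_snv`, and the sequence are hypothetical.

```python
from typing import List

BASES = "ACGT"


def one_hot(sequence: str) -> List[List[int]]:
    """Encode a DNA string as rows of [A, C, G, T] indicators.

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def apply_snv(sequence: str, position: int, ref: str, alt: str) -> str:
    """Return the sequence with a single-nucleotide change at a 0-based position."""
    if sequence[position].upper() != ref.upper():
        raise ValueError(f"expected {ref} at position {position}, found {sequence[position]}")
    return sequence[:position] + alt + sequence[position + 1:]


# Hypothetical 8-base window around a locus, with a T to C change at index 3.
reference = "ACGTTGCA"
mutated = apply_snv(reference, 3, "T", "C")

ref_input = one_hot(reference)
alt_input = one_hot(mutated)
print(reference, "->", mutated)
print("row 3 before:", ref_input[3], "after:", alt_input[3])
```

A model trained on such encodings can be scored on both inputs, and the difference in its outputs taken as the predicted effect of the mutation.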

EXPLORE THE PRODUCT

Genomic data analysis and visualization, designed for petabyte scale