Genetics Bioinformatics Service Center

Introduction

The Genetics Bioinformatics Service Center (GBSC) is a School of Medicine service center. It was started in Fiscal Year 2014 (09/01/2013) by the Department of Genetics. Services are available to Stanford University researchers. In this era of Big Data genomics, our goal is to provide best-in-class high-performance computational infrastructure and cutting-edge bioinformatics services.

 

Services offered:

  • On-premises HPC cluster
  • Managed Google Cloud Platform access
  • Bioinformatics-as-a-Service consulting
  • Wearable data collection platform

 

On-premises HPC cluster

  • State-of-the-art computational cluster
    • 3000+ modern cores
    • 7.0+ Petabytes of high performance storage
  • Architected specifically to suit genomics data analysis
    • We have several generations of Intel x86 microarchitecture, including Westmere, Sandy Bridge, and Ivy Bridge.
    • Most of the servers have at least 384 GB RAM. RAM/core ratio in a server is 8 GB/core or better.
    • Our "fat nodes" include one server with 1 TB RAM and two others with 1.5 TB RAM. These servers are especially useful for memory-intensive tasks such as assembly.
    • Storage is the Oak storage service from SRCC.
    • 10 Gb/s or better Ethernet connectivity to the Internet.
  • Complies with NIH dbGaP data security requirements, allowing PIs to bring data that requires this compliance to our cluster
  • Housed in Stanford’s state-of-the-art Tier 2 data centers at Forsythe Hall and at SLAC that deliver >99% uptime
  • Managed by Stanford Research Computing Center HPC IT sys admins
  • Unfortunately, no PHI data can be stored on our computing cluster.
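As a rough illustration of the memory figures quoted above, the following Python sketch works out how many cores a server can have at the stated 8 GB/core ratio, and how many jobs of a given memory footprint fit on one node. The function names and the example job sizes are hypothetical, and 1.5 TB is taken as 1536 GB; actual scheduling depends on the cluster's job scheduler and its configuration.

```python
def cores_at_ratio(ram_gb: float, gb_per_core: float = 8.0) -> int:
    """Largest core count that still leaves at least gb_per_core of RAM per core."""
    if ram_gb <= 0 or gb_per_core <= 0:
        raise ValueError("ram_gb and gb_per_core must be positive")
    return int(ram_gb // gb_per_core)


def jobs_per_node(ram_gb: float, job_mem_gb: float) -> int:
    """Number of jobs needing job_mem_gb each that fit in a node's RAM at once."""
    if ram_gb <= 0 or job_mem_gb <= 0:
        raise ValueError("ram_gb and job_mem_gb must be positive")
    return int(ram_gb // job_mem_gb)


if __name__ == "__main__":
    # Standard node from the list above: 384 GB RAM at 8 GB/core.
    print("cores on a 384 GB node at 8 GB/core:", cores_at_ratio(384))
    # Hypothetical 30 GB jobs on a standard node.
    print("30 GB jobs per 384 GB node:", jobs_per_node(384, 30))
    # Hypothetical 500 GB assembly jobs on a 1.5 TB fat node (taken as 1536 GB).
    print("500 GB jobs per 1.5 TB node:", jobs_per_node(1536, 500))
```

A job whose memory need exceeds 384 GB would not fit on a standard node at all, which is the case the fat nodes are meant for.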

Managed Google Cloud Platform (GCP) access

  • Single Sign On: GCP servers authenticate against Stanford's password servers.
  • Stanford-Google Master Level Agreement covers Stanford's data use and privacy clauses, so you do not need an NDA to store data on GCP.
  • GCP has a dedicated link with the Stanford network, allowing faster transfer speeds. Data can be uploaded directly to GCP from the cluster. Observed upload bandwidth is ~100 MB/s.
  • Allows for data backups and archival. 
  • Google storage is on disk (unlike AWS Glacier) so if users wish, they can compute on GCP against this data.
  • Fine-grained control on data sharing.
    • Box supports files only up to 10 GB, so typical FASTQ or BAM files need to be chunked.
    • On GCP, you can keep the files intact and share them with fine-grained control, as on Box.
  • Developed to dbGaP compliance. Please talk to the administrative team before putting dbGaP data on GCP. Some additional training is necessary to understand the dos and don'ts.
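To put the figures above in perspective, here is a minimal Python sketch that estimates how many pieces a file would need to be split into under Box's 10 GB limit, and how long an upload to GCP would take at the observed ~100 MB/s. The function names and the 150 GB example file are hypothetical; the estimate assumes decimal units (1 GB = 1000 MB) and a sustained transfer rate, which real transfers may not achieve.

```python
import math

BOX_FILE_LIMIT_GB = 10.0   # per-file limit quoted in the list above
GCP_UPLOAD_MB_PER_S = 100.0  # observed upload bandwidth quoted in the list above


def box_chunk_count(file_gb: float, limit_gb: float = BOX_FILE_LIMIT_GB) -> int:
    """Number of pieces needed so that each piece is at most limit_gb."""
    if file_gb <= 0 or limit_gb <= 0:
        raise ValueError("file_gb and limit_gb must be positive")
    return math.ceil(file_gb / limit_gb)


def upload_minutes(file_gb: float, mb_per_s: float = GCP_UPLOAD_MB_PER_S) -> float:
    """Estimated upload time in minutes, assuming 1 GB = 1000 MB and a steady rate."""
    if file_gb <= 0 or mb_per_s <= 0:
        raise ValueError("file_gb and mb_per_s must be positive")
    seconds = file_gb * 1000.0 / mb_per_s
    return seconds / 60.0


if __name__ == "__main__":
    bam_gb = 150.0  # hypothetical BAM file size
    print("Box pieces needed:", box_chunk_count(bam_gb))
    print("Estimated GCP upload time (min):", round(upload_minutes(bam_gb), 1))
```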

Bioinformatics-as-a-Service consulting

  • Payable by the hour
    • The first hour is free for Stanford affiliates
  • Answers questions about techniques and approaches and troubleshoots existing processes
  • Evaluates your data needs
  • Runs best-practices pipelines on your data
  • Develops new tools and custom workflows
  • Provides help with grant writing, experiment design, and publications
  • Trains lab members
  • Interprets your results

Wearable data collection platform access (MyPHD)

  • An open-source software framework built by Stanford researchers to support big biomedical data acquisition, storage, and real-time analysis with minimum configuration effort.
  • Highly secure
  • Recruit through our mobile application
  • Get wearable data delivered to the destination of your choice

Other bioinformatics services provided include:

  • >500 public genomics software applications that are centrally installed and managed. Users can request installation of new tools. 
  • Wiki with training and other relevant information 
  • Commercial solutions: Sentieon software, KEGG Database access
  • Regular bioinformatics workshops and user group meetings (See GBSC website for past events)

 

GBSC has negotiated deep discounts with multiple solution providers. Because this web page is public, business-confidential information is not provided here. Please see the "Request Core Access" tab for service costs and other Stanford Confidential information.

 

Getting Started

  • Stanford users: Log in or register for an iLab account using valid Stanford credentials (SUNet ID).

 

Faculty Advisory Committee

  • Thomas Quertermous (Chair), MD, Professor of Cardiovascular Medicine
  • Euan Ashley, MRCP DPhil, Associate Professor of Cardiovascular Medicine
  • Anshul Kundaje, PhD, Assistant Professor of Genetics and of Computer Science
  • Stephen Montgomery, PhD, Associate Professor of Pathology
  • Dmitri Petrov, PhD, Professor of Biology
  • Rajat Rohatgi, PhD, MD, Biochemistry and Medicine
  • Michael Snyder, PhD, Professor and Chairman of Genetics, Director of the Stanford Center for Genomics and Personalized Medicine 

Locations   

GBSC Team

SoM Technology and Innovation Park
3165 Porter Drive
Palo Alto, CA 94304

Computational Cluster

Forsythe Hall Data Center
275 Panama Street
Stanford, CA 94305

 

Links and Resources

  • GBSC website
  • Cluster wiki (available only to the Stanford community)