NCBI Reference Genome Download Resources

The reference files come from the GDC. MD5 checksums are provided so that file integrity can be verified after download.

GRCh38.d1.vd1 Reference Sequence

GRCh38.d1.vd1.fa.tar.gz

  • md5: 3ffbcfe2d05d43206f57f81ebb251dc9
  • file size: 875.3 MB
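
Checking the download against the listed MD5 can be sketched in Python as follows. This is a minimal example, not an official GDC tool. The helper names `md5_of_file` and `verify_md5` are made up for illustration, and the local file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# Expected MD5 as listed above for GRCh38.d1.vd1.fa.tar.gz.
EXPECTED_MD5 = "3ffbcfe2d05d43206f57f81ebb251dc9"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a large archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Return True when the file's MD5 matches the expected value
    (comparison ignores case and surrounding whitespace)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

Assuming the archive was saved in the current directory, `verify_md5("GRCh38.d1.vd1.fa.tar.gz", EXPECTED_MD5)` should return True for an intact download. A False result usually means the transfer was truncated or corrupted and the file should be fetched again.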

This reference genome is used by the GDC for all sequencing and array-based analyses. This file is composed of the following sequences:


Index Files

Index files are built from the GDC reference genome and are used with the software listed below.

GDC.h38.d1.vd1 BWA Index Files

GDC.h38.d1.vd1 GATK Index Files

GDC.h38.d1.vd1 STAR2 Index Files (v36)

GDC.h38.d1.vd1 STAR2 Index Files (v22)


Annotation Files

Annotation files contain information about the position and identity of regions in the reference genome. They allow software to calculate expression values.

GDC.h38 miRNA database files

GDC.h38 GENCODE v36 GTF

GDC.h38 GENCODE v22 GTF

GDC.h38 GENCODE TSV (v22)
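
To illustrate how software reads position and identity from a GTF annotation, here is a minimal Python sketch that parses one GTF record. It relies only on the general GTF layout (nine tab-separated columns, with `key "value";` pairs in the last one). The function name `parse_gtf_line` and the sample record are hypothetical and are not taken from the GDC files. Repeated attribute keys (for example several `tag` entries) are collapsed to the last value, which is a simplification.

```python
def parse_gtf_line(line):
    """Split one GTF record into its fixed fields and an attribute dict."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GTF record has 9 tab-separated columns")
    seqname, source, feature, start, end, score, strand, frame, attrs = fields

    attributes = {}
    for item in attrs.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        # Each attribute looks like: key "value"  (or key value for numbers)
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),  # GTF coordinates are 1-based, inclusive
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


# Made-up record, for illustration only.
SAMPLE = 'chr1\tEXAMPLE\tgene\t100\t200\t.\t+\t.\tgene_id "GENE0001.1"; gene_name "EXAMPLE";'
record = parse_gtf_line(SAMPLE)
print(record["attributes"]["gene_id"], record["start"], record["end"])
```

Comment lines in a real GTF start with `#` and would need to be skipped before calling the parser.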


Miscellaneous Files

Methylation Array Gene Annotation File (v36)

Antibody Description Files for TCGA RPPA Data (v36)

Antibody Description Files for TCGA RPPA Data (v22)

Genome Annotation Files for Legacy TCGA Data

SNP6 GRCh38 Remapped Probeset File for Copy Number Variation Analysis

If you are using Masked Copy Number Segment files for GISTIC analysis, keep only probesets with freqcnv = FALSE.
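
The freqcnv filter can be sketched in Python as below. The sketch assumes that the probeset file has been decompressed, is tab-delimited, has a header row containing a `freqcnv` column, and stores the values as TRUE/FALSE. Check the real file before relying on any of this. The function names and the file paths passed to them are hypothetical.

```python
import csv


def keep_non_freqcnv(rows):
    """Yield only the rows whose freqcnv field is FALSE."""
    for row in rows:
        if row["freqcnv"].strip().upper() == "FALSE":
            yield row


def filter_probeset_file(in_path, out_path):
    """Copy in_path to out_path, keeping the header and the rows
    with freqcnv = FALSE. Returns the number of rows kept."""
    kept = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        for row in keep_non_freqcnv(reader):
            writer.writerow(row)
            kept += 1
    return kept
```

The filter is split from the file handling so that `keep_non_freqcnv` can be reused on rows that come from another source, such as a table already loaded in memory.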

SNP6 GRCh38 Liftover Probeset File for Copy Number Variation Analysis

GDC VEP Cache File

GDC Panel of Normal (PON) Files used for Variant Calling

These files are controlled and require dbGaP access to download. You will need to use the gdc-client to download them.

For Tumor-Only Variant Calling Pipeline

gatk4_mutect2_4136_pon.vcf.tar

  • uuid: 6c4c4a48-3589-4fc0-b1fd-ce56e88c06e4
  • md5: 725d891e02ca93edaabac8b09322439e
  • file size: 92 MB

For Tumor / Normal Variant Calling Pipeline

MuTect2.PON.4136.vcf.tar

  • uuid: 6b45b9f7-893e-4947-83b6-db0402471e23
  • md5: d13a138dcf4e9f1ec8a69ac3a4f64ca9
  • file size: 121 MB

MuTect2.PON.5210.vcf.tar

  • uuid: 726e24c0-d2f2-41a8-9435-f85f22e1c832
  • md5: 5b5c1c3e208aa9a403cc4a8ff39e7f1f
  • file size: 146 MB
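
Fetching these controlled files with the gdc-client can be sketched in Python as follows. The UUIDs are the ones listed above. The token file path is an assumption: it stands for an authentication token downloaded from the GDC portal after dbGaP access has been granted. The sketch uses the `download` subcommand with the `-t` (token file) and `-d` (output directory) options, so confirm them with `gdc-client download --help` for your installed version. The helper names are hypothetical.

```python
import subprocess

# File names and UUIDs as listed above.
PON_FILES = {
    "gatk4_mutect2_4136_pon.vcf.tar": "6c4c4a48-3589-4fc0-b1fd-ce56e88c06e4",
    "MuTect2.PON.4136.vcf.tar": "6b45b9f7-893e-4947-83b6-db0402471e23",
    "MuTect2.PON.5210.vcf.tar": "726e24c0-d2f2-41a8-9435-f85f22e1c832",
}


def build_download_command(uuid, token_path, out_dir="."):
    """Build the gdc-client argument list for one controlled file."""
    return ["gdc-client", "download", "-t", token_path, "-d", out_dir, uuid]


def download(uuid, token_path, out_dir="."):
    """Run gdc-client; raises CalledProcessError if the download fails."""
    subprocess.run(build_download_command(uuid, token_path, out_dir), check=True)
```

For example, `download(PON_FILES["MuTect2.PON.4136.vcf.tar"], "gdc-user-token.txt")` would fetch one archive, provided gdc-client is on the PATH and the token is valid. After the download, compare the file's MD5 with the value listed above.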

If you repost this article, please credit the source: https://www.ouq.net/2588.html
