Single-cell epigenomics reveals mechanisms of human cortical development

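De-identified tissue samples were collected with previous patient consent in strict observance of legal and institutional ethical regulations. Protocols were approved by the Human Gamete, Embryo, and Stem Cell Research Committee (institutional review board) at the University of California, San Francisco.

Cortical areas were microdissected from three specimens of mid-gestation human cortex, in addition to three specimens of non-area-specific mid-gestation human cortex. Tissue was dissociated in papain containing DNase I (DNase) for 30 min at 37 °C, and samples were triturated to form a single-cell suspension. Cells (10^6) were lysed in 100 µl chilled lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% IGEPAL CA-630, 0.01% digitonin, 1% bovine serum albumin (BSA)). Lysed cells were then washed with 1 ml chilled wash buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 1% BSA) and nuclei were pelleted at 500g at 4 °C.

Tissue sections were frozen and stored at −80 °C. Nuclei were isolated from frozen tissue as described (ref. 57). In brief, frozen tissue samples were thawed in 2 ml chilled homogenization buffer (10 mM Tris pH 7.8, 5 mM CaCl2, 320 mM magnesium acetate, 320 mM sucrose, 0.1 mM EDTA, 0.1% NP40, 167 µM β-mercaptoethanol, 16.7 µM PMSF) and lysed. Cell lysates were then centrifuged in an iodixanol gradient for 20 min at 3,000g at 4 °C in a swinging-bucket centrifuge with the brake turned off. The nuclei band was then carefully removed and the nuclei were diluted in wash buffer.

Cortical organoids were cultured using a forebrain-directed differentiation protocol (refs. 47, 58). In brief, two genetically normal human iPS cell lines (H28126 (Gilad Laboratory, University of Chicago) and 1323-4 (Conklin Laboratory, Gladstone Institutes)), previously authenticated (ref. 47), and the embryonic stem cell line H1 (WiCell, authenticated at source) were expanded and dissociated to single cells. Cells tested negative for mycoplasma. After dissociation, cells were reconstituted in neural induction medium at a density of 10,000 cells per well in 96-well V-bottom low-adhesion plates. The Glasgow's modified Eagle's medium (GMEM)-based neural induction medium included 20% knockout serum replacer (KSR), 1× non-essential amino acids, 0.11 mg/ml sodium pyruvate, 1× penicillin-streptomycin, 0.1 mM β-mercaptoethanol, 5 µM SB431542 and 3 µM IWR1-endo. The medium was supplemented with 20 µM ROCK inhibitor Y-27632 for the first 6 days. After 18 days, organoids were transferred from 96- to 6-well low-adhesion plates, moved to an orbital shaker rotating at 90 rpm, and changed to Dulbecco's modified Eagle's medium (DMEM)/F12-based medium containing 1× glutamax, 1× N2, 1× B27 without vitamin A and 1× antibiotic-antimycotic (anti-anti). After 35 days, organoids were moved into DMEM/F12-based medium containing 1× N2, 1× B27 with vitamin A and 1× anti-anti. Throughout the culture period, organoids were fed every other day.

Cerebral organoids were dissociated in papain containing DNase I (DNase) for 30 min at 37 °C, and samples were triturated to form a single-cell suspension. Cells (10^6) were lysed for 3 min in 100 µl chilled lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 0.1% IGEPAL CA-630, 0.01% digitonin, 1% BSA). Lysed cells were then washed with 1 ml chilled wash buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM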
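MgCl2, 0.1% Tween-20, 1% BSA) and nuclei were pelleted at 500g at 4 °C.

Two genetically normal iPS cell lines (1323-4 and H28126) were differentiated into cortical organoids following the protocol above until day 35. At day 35, organoids from each line were divided into three conditions: (1) normal medium conditions for day 35 and beyond, as described above (with vitamin A); (2) normal medium conditions for day 35 and beyond (with vitamin A) plus 100 µM DEAB, an inhibitor of retinoic acid (RA) synthesis; or (3) normal medium conditions for day 35 and beyond, except using B27 without vitamin A. DEAB treatment was ended after one week, and culture conditions were otherwise kept the same until day 70, at which point organoids were processed for scRNA-seq and fixed for immunohistochemistry. We used 1323-4 organoids for scRNA-seq (in each of the three conditions) and used both 1323-4 and H28126 organoids for immunostaining. Organoids processed for scRNA-seq were multiplexed using MULTI-seq oligonucleotide barcodes (ref. 59) and pooled for library preparation and sequencing to reduce potential batch effects.

Single-cell RNA-seq libraries were generated using the 10x Genomics Chromium 3′ Gene Expression Kit. In brief, single cells were loaded onto Chromium chips with a capture target of 10,000 cells per sample. Libraries were prepared following the provided protocol and sequenced on an Illumina NovaSeq with a targeted sequencing depth of 50,000 reads per cell. BCL files from sequencing were then used as inputs to the 10x Genomics Cell Ranger pipeline.

Bulk ATAC-seq libraries were generated as described. In brief, 50,000 nuclei were permeabilized and tagmented. Tagmented chromatin libraries were generated and sequenced on an Illumina NovaSeq with a target sequencing depth of 50 million reads per library. Sequencing data were used as input to the ENCODE ATAC-seq analysis pipeline (https://github.com/encode-dcc/atac-seq-pipeline).

H3K27ac CUT&Tag libraries were prepared as previously described (ref. 60), with modifications. In brief, cells were dissociated from developing human cortical tissue as described above. Aliquots of 50,000 cells were pelleted at 600g in a swinging-bucket rotor centrifuge and washed twice in 200 µl CUT&Tag wash buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 0.5 mM spermidine, 1× protease inhibitor cocktail (Roche)). Nuclei were isolated by resuspending cell pellets in 200 µl Dig-wash buffer (CUT&Tag wash buffer supplemented with 0.05% digitonin and 0.05% IGEPAL CA-630). Nuclei pellets were washed twice in 200 µl Dig-wash buffer, then resuspended in 100 µl Dig-wash buffer supplemented with 2 mM EDTA and a 1:50 dilution of H3K27ac primary antibody (Cell Signaling 8173) and incubated overnight at 4 °C. Excess primary antibody was removed by pelleting the nuclei at 600g and washing twice in 200 µl Dig-wash buffer. Secondary antibody (Novex A16031) was added at a 1:50 dilution and incubated at room temperature for 30 min with rotation. Excess secondary antibody was removed by pelleting the nuclei at 600g and washing in 200 µl Dig-wash buffer. pA-Tn5 was added at a 1:100 dilution in 100 µl Dig-med buffer (0.05% digitonin, 20 mM HEPES pH 7.5, 300 mM NaCl, 0.5 mM spermidine, 1× protease inhibitor cocktail) and nuclei were incubated at room temperature for 1 h. Unbound pA-Tn5 was removed by pelleting the nuclei at 300g and washing twice in 200 µl Dig-med buffer. Nuclei were resuspended in 100 µl tagmentation buffer (10 mM MgCl2 in Dig-med buffer) and incubated at 37 °C for 1 h. After tagmentation, nuclei were lysed by adding 100 µl DNA binding buffer (Zymo Research), and tagmented DNA was purified with a 1.5:1 ratio of AMPure XP beads (Beckman) following the manufacturer's instructions. Purified DNA was eluted in 21 µl buffer EB (10 mM Tris-Cl, pH 8.5) and mixed with 2 µl of 10 µM indexed i5 and i7 primers and 25 µl NEBNext HiFi 2× PCR master mix. Libraries were amplified with the following cycling conditions: 72 °C for 5 min; 98 °C for 30 s; 12 cycles of 98 °C for 10 s and 63 °C for 30 s; final extension at 72 °C for 1 min; and hold at 4 °C. Libraries were purified with a 1:1 ratio of AMPure XP beads and eluted in 15 µl EB. CUT&Tag libraries were quantified on an Agilent Bioanalyzer and sequenced on an Illumina NovaSeq 6000 system to 15 million paired-end reads with read lengths of 50 × 8 × 8 × 50.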
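Nuclei were prepared as outlined in the 10x Genomics Chromium Single Cell ATAC Solution protocol (the v1.0 kit was used). The capture target was 10,000 nuclei per sample. scATAC-seq libraries were prepared for sequencing following the 10x Genomics Single Cell ATAC Solution protocol. scATAC-seq libraries were sequenced using PE150 sequencing on an Illumina NovaSeq with a target depth of 25,000 reads per nucleus (Supplementary Table 1).

BCL files generated from sequencing were used as inputs to the 10x Genomics Cell Ranger ATAC pipeline. In brief, FASTQ files were generated and aligned to GRCh38 using BWA. Fragment files were generated containing all unique, properly paired and aligned fragments with mapping quality (MAPQ) > 30. Each unique fragment is associated with a single cell barcode.

Fragment files generated by the Cell Ranger ATAC pipeline were loaded into the SnapATAC pipeline (ref. 61; https://github.com/r3fang/snapatac) and Snap files were generated. A cell-by-bin matrix was then generated for each sample by segmenting the genome into 5-kb windows and scoring each cell for reads in each window. Cells were filtered on the basis of log(reads passed filters) between 3 and 5 and a fraction of reads in promoters between 10% and 60%, to obtain cells with high-quality libraries. Bins were then filtered, removing bins that overlapped ENCODE blacklist regions (http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/). This matrix was then binarized, and the coverage of each bin was calculated and normalized by log10(count + 1). Z-scores were calculated from the normalized bin coverages, and bins whose Z-score exceeded the cutoff were filtered from further analysis. Cells with coverage of fewer than 500 bins were removed from the downstream analysis. A cell-by-cell similarity matrix was generated by calculating the latent semantic index (LSI) of the binarized bin matrix. Singular value decomposition (SVD) was performed on the log(term frequency-inverse document frequency) (TF-IDF) matrix. The top 50 reduced dimensions were used for batch correction through scAlign.

Multiple batches were integrated using the scAlign package (ref. 16; https://github.com/quon-titative-biology/scAlign). The ATAC batches were first merged together to calculate the LSI, with the term-frequency (TF) matrix log-scaled for input into SVD. The 50 reduced dimensions of LSI were used as inputs to the encoder. The latent dimension was set to 32 and scAlign was run with all-pairs alignment of all batches. The input dimension to the encoder was set to 50 to match the input dimensions, and the model was trained for 15,000 iterations using the small architecture setting with batch normalization. The 32 latent dimensions were used in downstream analysis for finding neighbours. The scRNA-seq data were processed using Seurat, and the top 15 components from CCA were computed for input into scAlign; the latent dimension was set to 20, using the small architecture with batch normalization and 15,000 iterations.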
All alignments were unsupervised.

To visualize the high-dimensionality dataset in 2D space, the latent dimensions for the ATAC and RNA data from scAlign were used to construct UMAP (ref. 62) graphs from Seurat. A k-nearest neighbour graph was constructed from the scAlign latent dimensions using k = 15. The Leiden algorithm was then used, at resolution 0.8, to identify 'communities', or clusters, in the sample, representing groups of cells likely to be of the same cell type.

To create a proxy for gene expression, ATAC-seq fragments in the gene body plus promoter (2 kb upstream from transcription start sites) of all protein-coding genes were summed for each cell to generate 'gene activity scores'. A matrix was constructed of all gene activity scores by all cells. Owing to the sparsity of scATAC-seq data, the MAGIC (ref. 63) imputation method was used, as implemented in the SnapATAC package, to impute gene activity scores based on the k-nearest neighbour graph.

Broad cell-type classes were assigned to cells on the basis of the gene activity scores of previously described cell-type marker genes (ref. 2; Extended Data Fig. 2a). To identify cell types at a higher resolution, we assigned cell-type labels using the CellWalker (ref. 17) method, as implemented in CellWalkR (v0.1.7). In brief, we used CellWalker to integrate scRNA-seq-derived labels (ref. 2) with scATAC-seq data by building a network of label-to-cell and cell-to-cell edges and diffusing label information over this combined network to compensate for data sparsity in single-cell data. We calculated cell-to-cell edge weight using the Jaccard similarity between cells. Label-to-cell weight was calculated as the sum of the products of the gene activity scores for that cell and the log fold-change in expression of each marker for that cell label. We tuned label edge weight using cell homogeneity as described (ref. 17). Diffusion resulted in a vector of influence scores of each label for each cell.
We then smoothed these vectors for each cell by taking a weighted average of its scores with those of each of its ten closest neighbours in UMAP space (weighted such that each neighbour contributed one-fifth as much as the cell in question). Finally, we assigned a cell-type label to each cell using the label with the highest influence.

Fragments from cells were grouped together by broad cell class (RG, IPC, ulEN, dlEN, endo/mural, astro/oligo, nEN, IN-MGE, IN-CGE, MGE progenitor, insular, microglia) and peaks were called on all cluster fragments using MACS2 (https://github.com/taoliu/MACS) with the parameters ‘--nomodel --shift -37 --ext 73 --qval 5e-2 -B --SPMR --call-summits’. Peaks from each cell type were then combined, merging overlapping peaks, to form a master peak set, and a cell-by-peak matrix was constructed. This matrix was binarized for all downstream applications.

Differentially accessible (DA) peaks for each cell type were determined by performing a two-sided Fisher's exact test and selecting peaks that had log-transformed fold-change > 0 and FDR-corrected P < 0.05, using the built-in SnapATAC function ‘findDAR’.

The deeptools suite (ref. 64; https://deeptools.readthedocs.io/en/develop/) was used to visualize pileups of cluster-specific ATAC-seq signal (output from MACS2) in DA peak sets.

To comprehensively categorize our peaks in genomic features genome-wide, we intersected our peak set with the 25-state model from the Roadmap Epigenomics Project (ref. 18), specifically using the data generated from sample E081, which was a sample of developing human brain. Enrichment of peaks within annotated regions of the genome was calculated as the ratio between the observed overlap, (number of bases in state AND overlapping feature)/(number of bases in genome), and the expected overlap, [(number of bases overlapping feature)/(number of bases in genome)] × [(number of bases in state)/(number of bases in genome)], as previously described (ref. 18).
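The enrichment ratio described above is an observed-over-expected overlap. A minimal sketch in Python, assuming the base-pair counts have already been obtained with an interval tool; the function name and the example numbers are hypothetical and are not taken from the study:

```python
def state_enrichment(bases_state_and_feature: int,
                     bases_feature: int,
                     bases_state: int,
                     bases_genome: int) -> float:
    """Observed/expected overlap between a chromatin state and a peak set.

    All arguments are base-pair counts. Names are illustrative only.
    """
    if min(bases_feature, bases_state, bases_genome) <= 0:
        raise ValueError("feature, state and genome sizes must be positive")
    # Fraction of the genome that is both in the state and covered by peaks.
    observed = bases_state_and_feature / bases_genome
    # Fraction expected if peaks and state were placed independently.
    expected = (bases_feature / bases_genome) * (bases_state / bases_genome)
    return observed / expected


# Toy example: a 1,000-bp "genome", 100 bp in the state, 50 bp in peaks,
# 20 bp shared. Expected overlap is 0.05 * 0.1, observed is 0.02.
ratio = state_enrichment(20, 50, 100, 1000)
print(f"enrichment = {ratio:.2f}")
```

A ratio above 1 indicates that peaks fall in the state more often than expected by chance, and a ratio below 1 indicates depletion.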
We intersected our peak sets with several epigenomic datasets, including: ATAC-seq peaks from de la Torre-Ubieta et al. (ref. 9; GEO: GSE95023); ATAC-seq peaks from Markenscoff-Papadimitriou et al. (ref. 23; GEO: GSE149268); H3K4me3 PLAC-seq promoter-interacting regions generated from ENs, INs, IPCs and RGs sorted from samples of developing human cortex (ref. 21), which were graciously provided by the author; H3K27ac peaks from Amiri et al. (ref. 11; taken from supplementary tables of the publication); ATAC-seq peaks from Trevino et al. (ref. 49; GEO: GSE132403); H3K27ac peaks from Li et al. (ref. 22; obtained from http://development.psychencode.org); and high-confidence enhancer predictions from Wang et al. (ref. 24; obtained from http://resource.psychencode.org/). Any peaks not already mapped to hg38 were lifted over using the UCSC LiftOver tool. Overlaps between peak sets were determined using the ‘findOverlaps’ function in R.

The findMotifsGenome.pl tool from the HOMER suite (ref. 65; http://homer.ucsd.edu/homer/) was used to identify transcription factor motif enrichments in peak sets. The ChromVAR (ref. 26) R package was used to identify transcription factor motif enrichments at the single-cell level in scATAC-seq data. In brief, the peak-by-cell matrix from the snap object was used as an input, filtering for peaks open in at least 10 cells. Bias-corrected transcription factor motif deviations were calculated for the set of 1,764 human transcription factor motifs for each cell.

The activity-by-contact (ABC) model (ref. 20; https://github.com/broadinstitute/ABC-Enhancer-Gene-Prediction) was used for prediction of enhancer–gene interactions from scATAC-seq data. Cell-type-specific ATAC-seq signal and peak outputs from MACS2 were used as inputs. Bulk H3K27ac CUT&Tag libraries generated from similar samples (see ‘Bulk H3K27ac CUT&Tag library preparation and sequencing’ above) were used as a mark for active enhancers.
Publicly available Hi-C data generated from similar samples (ref. 19) were used to demarcate regulatory neighbourhoods, using the highest resolution available for each chromosome. Cell-type-specific gene expression profiles were generated from publicly available scRNA-seq data generated from similar samples (ref. 2) by averaging expression across each cell type. The default threshold of 0.02 was used for calling enhancer–gene interactions.

VISTA enhancers were taken from the VISTA Enhancer Browser (ref. 25; https://enhancer.lbl.gov/) and filtered for human sequences found to be active in the forebrain. Enhancers were lifted over to hg38 using the UCSC LiftOver tool (https://genome.ucsc.edu/cgi-bin/hgLiftOver) and overlapping regions were merged, resulting in 319 unique regions. These regions were intersected with the peak set from all primary scATAC-seq cells, and 304 peaks that overlapped with VISTA forebrain enhancer regions were identified.

The ChIPseeker R package (ref. 66; https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html) was used to annotate all peak sets in genomic features.

Identification of enriched biological processes in genes near sets of cell-type-specific enhancer predictions was performed using the GREAT algorithm (ref. 67). For each cell type, peaks that were both predicted enhancers and cell-type-specific were identified, and enrichment of biological processes in the flanking genes of these regions was identified relative to a background set consisting of the full primary peak set.

Correlation between samples was determined using the ‘multiBamSummary’ function from the deeptools Python suite (ref. 64) on sample BAM files. BAM file comparison was limited to the genomic space of the merged primary peak set (n = 459,953 peaks), ignoring duplicates and unmapped reads. Heat maps were then generated using the ‘plotCorrelation’ function.
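As an illustration of what the sample-correlation step computes, the sketch below builds a pairwise correlation matrix from per-peak read counts. It is not the deeptools implementation, which performs the counting and correlation itself. The text does not state which correlation method was used, so Pearson correlation is an assumption here, and the sample names and counts are invented:

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length count vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(counts: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise correlations between samples; keys are sample names."""
    names = list(counts)
    return {a: {b: pearson(counts[a], counts[b]) for b in names} for a in names}


# Hypothetical read counts over the same four peaks in three samples.
toy_counts = {
    "sample_a": [10, 20, 30, 40],
    "sample_b": [12, 18, 33, 41],
    "sample_c": [40, 30, 20, 10],
}
for name, row in correlation_matrix(toy_counts).items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In this toy input, `sample_a` and `sample_b` correlate strongly while `sample_c` is anti-correlated with both, which is the kind of structure a correlation heat map makes visible.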
To anchor mRNA expression and chromatin state profiles in the same map of cell diversity, we applied scAlign on datasets where we profiled scRNA-seq and scATAC-seq in parallel in the same sample. This was achieved by linking gene expression data to gene activity scores derived from chromatin accessibility data. The gene activity scores were logRPM values derived from gene activity scores generated by the SnapATAC pipeline. The gene expression and gene activity scores were then processed using Seurat and split into batches for input into scAlign. The encoder space was computed using multi CCA of the 10 dimensions with latent dimensions at 20 using the ‘small’ architecture.

The Monocle 3 R package (ref. 68; https://cole-trapnell-lab.github.io/monocle3/) was used for pseudotime calculation of the co-embedded RNA and ATAC dataset. The RG cells were set as the root cells. The minimum branch length was 9 in the graph building. Monocle 3 was also used for the pseudotime calculation of the scRNA-seq PFC/V1 dataset. The Cicero package (ref. 69; https://cole-trapnell-lab.github.io/cicero-release/) was used for the pseudotime calculation of the scATAC-seq PFC/V1 dataset.

scATAC-seq cells from V1 samples used in the co-embedding analysis were divided into ten equal bins by pseudotime. Average accessibility for each peak in each bin was determined. Peaks were considered temporally dynamic if they met the following criteria: accessible in a minimum of 10% of cells in the bin with the highest accessibility; accessible in a maximum of 20% of cells in the bin with the lowest accessibility; a difference of at least 10% in the proportion of cells in which the peak was accessible between the lowest and highest accessibility bins; and an increase of at least 3× in the proportion of accessible cells between the lowest and highest accessibility bins. In total, 25,415 out of 459,953 peaks met these criteria and were deemed to be temporally dynamic in the cortical excitatory neuronal lineage.
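The four criteria for a temporally dynamic peak can be written as a single filter. The sketch below is an illustration rather than the authors' code: the function and parameter names are hypothetical, the input is assumed to be the fraction of cells in each pseudotime bin in which the peak is accessible, and treating the thresholds as inclusive is an assumption.

```python
from typing import Sequence


def is_temporally_dynamic(bin_fractions: Sequence[float],
                          min_peak_fraction: float = 0.10,
                          max_trough_fraction: float = 0.20,
                          min_difference: float = 0.10,
                          min_fold_change: float = 3.0) -> bool:
    """Apply the four accessibility criteria to one peak.

    bin_fractions holds, for each pseudotime bin, the fraction of cells
    (0 to 1) in which the peak is accessible.
    """
    highest = max(bin_fractions)
    lowest = min(bin_fractions)
    if highest < min_peak_fraction:        # open in >= 10% of cells at its maximum
        return False
    if lowest > max_trough_fraction:       # open in <= 20% of cells at its minimum
        return False
    if highest - lowest < min_difference:  # absolute change of >= 10 percentage points
        return False
    # Relative change of >= 3x; written as a product so that a peak that is
    # fully closed in its lowest bin does not cause a division by zero.
    return highest >= min_fold_change * lowest


# Hypothetical peak that opens progressively across ten pseudotime bins.
opening_peak = [0.02, 0.03, 0.05, 0.08, 0.12, 0.18, 0.25, 0.30, 0.33, 0.35]
# Hypothetical peak that is broadly open throughout.
constitutive_peak = [0.30, 0.31, 0.33, 0.35, 0.36, 0.38, 0.40, 0.40, 0.41, 0.42]
print(is_temporally_dynamic(opening_peak), is_temporally_dynamic(constitutive_peak))
```

Keeping the thresholds as keyword arguments makes it easy to check how sensitive the number of dynamic peaks is to each cutoff.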
As pseudotime was calculated on the co-embedded space of ATAC and RNA cells, we can directly compare temporal changes in gene expression, gene activity scores calculated from open chromatin, and transcription factor motif enrichment. For each of the genes, we calculated gene activity scores using Cicero (ref. 69) and calculated a 1,000-cell moving average across pseudotime from the ATAC cells. This value was normalized to represent a proportion of the maximum value. For gene expression, we calculated a 1,000-cell moving average across pseudotime from the RNA cells. This value was normalized to represent a proportion of the maximum value. For transcription factor motif enrichment, using Z-scores from ChromVAR, we calculated a 1,000-cell moving average of the motif enrichment across pseudotime from the ATAC cells. LOESS regression lines were fitted to the moving average data. For the generation of heat maps, a similar approach was used, except that values were averaged within 20 equally sized bins of pseudotime and normalized to the maximum value.

URD (ref. 70; https://github.com/farrellja/URD/) was used to compare the branchpoints of ATAC and RNA independently. Deep-layer neurons were not considered during this analysis owing to obfuscating identities, and the batch-corrected values were used as input to the diffusion map calculations to combat batch effects. Diffusion parameters were set to 150 nearest neighbours, and sigma was autocalculated from the data. The tree was constructed using 200 cells per pseudotime bin, 6 bins per pseudotime window, and a branch point P value threshold of 0.001.

To identify homologous cell types between primary and organoid scATAC-seq datasets, reads from organoid cells were counted in peaks defined in the primary dataset, providing matching peak-by-cell matrices for primary and organoid datasets.
DA peaks were then identified in each dataset for each cluster as described above, and the intersection of this DA peak set was used to calculate correlations between primary and organoid clusters after averaging peak accessibility across all cells in each cluster. Homologous cell types were then determined on the basis of the highest correlation values for each cluster.

For primary samples used in Figs. 2, 3, scRNA-seq data were preprocessed using a minimum of 500 genes and a 5% mitochondrial cutoff, with Scrublet (ref. 71) for doublet removal. The SCTransform (ref. 72) workflow in Seurat (ref. 73) was run separately on each batch. Canonical correlation analysis (CCA) on the Pearson residuals from SCTransform was used as input into scAlign for batch correction. Dimensionality reduction and clustering were performed using PCA and Leiden, respectively, using the default parameters of the Seurat pipeline. For organoid samples used in the arealization experiment in Fig. 3, libraries from different conditions were demultiplexed using the MULTI-seq pipeline (https://github.com/chris-mcginnis-ucsf/MULTI-seq). The normal SCTransform workflow was then applied, as described above. Genes that were differentially expressed between conditions were identified using the ‘FindMarkers’ function with ‘MAST’ selected as the method. For organoid samples used for validation (Extended Data Fig. 10), scRNA-seq data were integrated following the Seurat SCTransform integration workflow using default parameters.

To systematically determine whether organoid cells had a transcriptomic identity more closely aligned with human PFC or V1 cells, we implemented a previously described classifier method (ref. 42). In brief, area gene modules defined on the basis of area-associated gene expression patterns (refs. 2, 42) were generated, and module eigengene values were determined for each organoid excitatory neuron using the ‘moduleEigengenes’ function from the WGCNA R package (ref. 74).
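Organoid cells were then assigned an identity of ‘PFC’ or ‘V1’ using the higher module eigengene value for each module. The significance of differences in proportions of identity labels between treatments was determined using a two-sided χ2 test (P < 0.05).

Peak sets were intersected with DNMs from 2,708 probands and 1,876 siblings using bedtools v2.24.0. DNMs were identified using an in-house pipeline. In brief, variants from whole-genome sequencing data were called using four independent callers: GATK v3.8, FreeBayes, Strelka, and Platypus. Variant calls from each caller were intersected and filtered for read depth (>9), allele balance (>0.25), absence of reads supporting the mutation in the parents, and identification by at least three of the four callers.

Cell-type-specific peaks and peaks overlapping PLAC-seq promoter-interacting regions were tested for enrichment of DNMs compared with a background peak set containing all primary peaks. We used Fisher's exact test to compare the number of peaks with one or more DNMs between the cell-type-specific peak set and the background peak set. We also performed a Wilcoxon rank sum test comparing the number of DNMs per peak in the cell-type-specific set with that in the background peak set. We applied Bonferroni multiple-testing correction to all P values.

We used bedtools v2.24.0 to create gene plus upstream regulatory regions, where we defined the upstream regulatory region as the 100-kb region upstream of the transcription start site of the gene. Genic regions were defined using GENCODE v27. The total number of peaks in each gene plus regulatory region was quantified for each cell type and compared, using Fisher's exact test, with the number of peaks in the merged peak set for each gene set. Peaks in the remaining gene and promoter regions were used as background. Gene sets from Coe et al. (ref. 30; COE253), Kaplanis et al. (ref. 56; DDD299) and SFARI Gene (https://gene.sfari.org/database/human-gene/) were used for enrichment testing. P values were Bonferroni-corrected for multiple tests (number of peak sets).

CNVs enriched in NDD cases from Coe et al. (ref. 30; n = 70) were intersected with peak sets using bedtools v2.24.0; peaks were required to overlap a CNV region by 50%. For each cell type, the number of peaks overlapping CNVs was compared with the total number of peaks overlapping CNVs. The full primary peak set was used as background, and comparisons were made using Fisher's exact test. P values were Bonferroni-corrected for multiple tests (number of peak sets).

We retrieved GWAS summary statistics for schizophrenia (Ripke et al., ref. 27), bipolar disorder (Stahl et al., ref. 35) and ASD (Grove et al., ref. 33) from the Psychiatric Genomics Consortium data portal (https://www.med.unc.edu/pgc). We also obtained GWAS summary statistics for schizophrenia (Pardiñas et al., ref. 32) from http://walters.psycm.cf.ac.uk/. GWAS summary statistics for major depression (Howard et al., ref. 34) were obtained under the auspices of a data use agreement between 23andMe and the University of Maryland, Baltimore. We applied stratified LD score regression (LDSC version 1.0.1; refs. 75, 76) to these summary statistics to evaluate the enrichment of trait heritability in each of the ten predicted enhancer sets. These associations were adjusted for the combined peak sets as well as 52 annotations from version 1.2 of the LDSC baseline model (including genic regions, enhancer regions and conserved regions; ref. 76). Associations that met a cutoff of FDR < 0.05 were considered significant.

Odds ratios were calculated for the odds of a TAD containing an ATAC peak if it also contains a gene from the set indicated by the subplot title, and significance was determined using Fisher's exact test. The magenta dashed line indicates the significance threshold of P < 0.05. Gene sets were obtained from http://resource.psychencode.org/ (refs. 24, 31). TAD sets are from human brain, germinal zone (GZ) and cortical plate (CP) (ref. 19).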
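The TAD analysis above, like the DNM, gene-set and CNV comparisons, reduces to a 2×2 contingency table tested with Fisher's exact test, followed where stated by Bonferroni correction. The sketch below shows that calculation using only the Python standard library. It is an illustration under stated assumptions: the table layout, function names and counts are hypothetical, and the two-sided P value sums all tables that are no more probable than the observed one, which is one common definition and not necessarily the one used by the software in the study.

```python
from math import comb
from typing import Tuple


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Odds ratio and two-sided Fisher's exact P value for [[a, b], [c, d]].

    Illustrative layout for the TAD analysis (an assumption):
      a = TADs with a set gene and an ATAC peak,  b = set gene, no peak,
      c = no set gene but an ATAC peak,           d = neither.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that ties are not lost to rounding.
    p_value = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                  if p <= p_observed * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts, not data from the study.
odds, p = fisher_exact_two_sided(30, 10, 40, 60)
print(f"odds ratio = {odds:.2f}, P = {p:.4g}, adjusted for 10 tests = {bonferroni(p, 10):.4g}")
```

An odds ratio above 1 here would mean that TADs containing a gene from the set are more likely to also contain an ATAC peak than TADs without one.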
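Samples used for immunostaining were fixed in 4% PFA for 45 min, washed with PBS, and incubated overnight in a 30% sucrose solution at 4 °C. Samples were then embedded in a 1:1 solution of OCT and 30% sucrose and frozen at −80 °C until ready for sectioning. Cryosections were prepared at a thickness of 16 µm. Heat-induced antigen retrieval was performed in 10 mM sodium citrate (pH 6.0) for 15 min. Permeabilization was performed in PBS (pH 7.4) supplemented with 2% Triton X-100. Primary and secondary antibodies were diluted and incubated in PBS (pH 7.4) supplemented with 10% donkey serum, 2% Triton X-100 and 0.2% gelatin. Primary antibodies used in this study included: mouse anti-AUTS2 (1:200, Abcam ab243036), rabbit anti-NR2F1 (1:100, Novus Biologicals NBP1-31259), mouse anti-SATB2 (1:250, […] ab18465), rabbit anti-FOXG1 (1:500, Abcam ab196868) and rabbit anti-PAX6 (1:200, BioLegend 901301). The secondary antibodies used were Alexa Fluor secondary antibodies. Images were collected using a Leica SP5 confocal system and processed using ImageJ/Fiji.

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.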

This article was submitted by the author [yjmlxc] and does not represent the position of 颐居号. If you repost it, please cite the source: https://yjmlxc.cn/zsfx/202506-6481.html