A spatial human thymus cell atlas mapped to a continuous tissue axis

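Metadata about scRNA-seq and CITE-seq samples, including information on source study, cell enrichment and donor age, are provided in Supplementary Table 1. Information about spatial data, including Visium, IBEX, RareCyte and RNAscope, is provided in Supplementary Table 2. In brief, all CITE-seq data were generated at Ghent University and mapped to the human genome (GRCh38) at the Wellcome Sanger Institute (WSI). All other original scRNA-seq data and 10x Visium data were generated and mapped to the human genome (GRCh38) at WSI. IBEX imaging was performed at the National Institute of Allergy and Infectious Diseases (NIAID), NIH. All non-IBEX imaging datasets (RareCyte, RNAscope, Visium H&E) were generated at WSI. No fetal work was carried out at the NIH or Ghent University.

Human fetal and paediatric data included from raw FASTQ files comprise data from refs. 3,7,15,16. Details on sample processing, ethics and funding are available in the respective publications; details on the origin of each sample are provided in Fig. 1c and Supplementary Tables 1 and 2.

For samples processed at WSI, paediatric samples were obtained from cardiac corrective surgeries by Newcastle University, collected under REC-approved study 18/EM/0314, and by Great Ormond Street Hospital under REC-approved study 07/Q0508/43. Human embryonic and fetal material was provided by the joint MRC & Wellcome Trust (grant MR/006237/1) Human Developmental Biology Resource (http://www.hdbr.org), with written consent and approval from the Newcastle and North Tyneside NHS Health Authority Joint Ethics Committee (08/H0906/21+5). Samples from Ghent University were obtained according to and used with the approval of the Medical Ethical Commission of Ghent University Hospital, Belgium (EC/2019-0826) through the haematopoietic cell biobank (EC-Bio/1-2018). Samples were obtained under NIAID MTA 2016-250 from the Pathology Department of the Children's National Medical Center in Washington, DC, following cardiothoracic surgery from children with congenital heart disease. Use of these thymus samples for this study was determined to be exempt from review by the NIH Institutional Review Board in accordance with the guidelines issued by the Office of Human Research Protections. Informed consent was obtained from all donors or their legal guardians.

Surgically removed paediatric thymi were directly moved to HypoThermosol (Sigma-Aldrich, H4416-100ML), shipped by courier with ice packs and processed within 24 h after surgery. For scRNA-seq experiments, we performed single-cell dissociation as described in the protocol available online (https://doi.org/10.17504/protocols.io.bx8sprwe). In brief, thymus tissue was finely minced and dissociated using a mixture of Liberase TH (Roche, 05401135001) and DNase I (Roche, 4716728001) in two rounds of about 30–60 min. Digested tissue was filtered through a 70 μm strainer and digestion was stopped with 2% FBS in RPMI medium. Red blood cell lysis was then performed on the cell pellet using RBC lysis buffer (eBioscience, 00-4333-57), after which cells were washed and counted. Magnetic and/or FACS sorting was performed to enrich for stromal populations. Magnetic sorting to enrich for EPCAM+ cells or to deplete CD45+ or CD3+ cells was performed on the U09, U48 and Z11 samples (Supplementary Table 1) using the magnetic sorting kit from Miltenyi Biotec, including LS columns (130-042-401), and the following microbead-conjugated antibodies: human CD45 MicroBeads (130-…), human CD326 (EPCAM) MicroBeads (130-061-101) and human CD3 MicroBeads (130-050-101). Z11 and U40 samples were FACS sorted to enrich for CD45− stromal cells (Z11) or to obtain CD45− cells and total TECs (U40) (Supplementary Table 1). To perform FACS sorting, cells were resuspended in FACS buffer (0.5% FBS and 2 mM EDTA in PBS), blocked with TruStain FcX (BioLegend, 422302) for 10 min and stained for 30 min with antibodies against EPCAM (anti-CD326 PE, 9C4, BioLegend, 324206), CD45 (anti-CD45, HI30, BioLegend, 304048), CD205 (anti-CD205 (DEC-205) APC, BioLegend, 342207) and CD3 (anti-CD3 FITC, OKT3, BioLegend, 317306), together with DAPI. All antibodies were diluted 1:50 for staining. After staining, cells were washed and sorted using a Sony SH800 or Sony MA900 sorter with a 130 μm nozzle. Samples were first gated to remove debris and dead cells, but no singlet gate was applied, to ensure that large TECs were not excluded. For samples stained with anti-CD45, total CD45− cells were sorted to enrich for stroma. For EPCAM/CD205-stained cells, EPCAM+CD205− and EPCAM−CD205+ cells were sorted to obtain a total TEC fraction that includes cortical and medullary epithelial cells but excludes autofluorescent cells (Supplementary Fig. 19a). FCS Express v.7.18.0025 was used for data analysis. After sorting or enrichment, cells were resuspended in collection buffer at the recommended concentration (10^6 cells per ml), and about 8,000–10,000 cells were loaded onto the 10x Genomics Chromium Controller to generate an emulsion of cells in droplets using the Chromium Next GEM Single Cell 5′ Kit v2 (1000263). GEX and TCR-seq libraries were further prepared using the Library Construction Kit (1000190) and the Chromium Single Cell Human TCR Amplification Kit (1000252) according to the manufacturer's instructions. Sequencing was performed on the NovaSeq 6000 sequencer (Illumina). Additional details are provided in Supplementary Table 1.

Data from several previous studies were included in this work3,7,15,16. For public datasets deposited at ArrayExpress, paired-end FASTQ files were downloaded from ENA and .sdrf files were used to determine the experiment type (3′/5′ and the version of the 10x Genomics kit). For public datasets deposited at GEO, if the data were deposited as Cell Ranger .bam files, URLs for the BAM files were obtained using srapath v.2.11.0. The .bam files were downloaded and converted to .fastq files using 10x bamtofastq v.1.3.2. If the GEO data were deposited as paired-end FASTQ files, SRA files were located using the search utility of NCBI entrez-direct v.15.6, then downloaded and converted to FASTQ files using fastq-dump v.2.11.0. Sample metadata were curated from the abstracts deposited at GEO. Finally, published datasets were downloaded from iRODS v.4.2.7 as CRAM files and converted to FASTQ files using samtools v.1.12 with the command ‘samtools collate -O -u -@16 $CRAM $TAG.tmp | samtools fastq -N -F 0x900 -@16 -1 $TAG.R1.fastq.gz -2 $TAG.R2.fastq.gz -’. Sample metadata were obtained using the imeta command from iRODS.

After the FASTQ files were obtained, 10x Genomics scRNA-seq experiments were processed using the STARsolo pipeline detailed on GitHub (https://github.com/cellgeni/starsolo). A STAR human genome reference matching Cell Ranger GRCh38-2020-A was prepared according to the instructions from 10x Genomics. Using STAR v.2.7.9a and the previously collected data about sample type (3′/5′, 10x Genomics kit version), we applied the STARsolo command with options specifying the UMI collapsing, barcode collapsing and read clipping algorithms, so as to generate results as similar as possible to the default parameters of the ‘cellranger count’ command in Cell Ranger v.6: ‘--soloUMIdedup 1MM_CR --soloCBmatchWLtype 1MM_multi_Nbase_pseudocounts --soloUMIfiltering MultiGeneUMI_CR --clipAdapterType CellRanger4 --outFilterScoreMin 30’. For cell filtering, the EmptyDrops algorithm used in Cell Ranger v.4 and above was invoked using ‘--soloCellFilter EmptyDrops_CR’. The option ‘--soloFeatures Gene GeneFull Velocyto’ was used to generate both exon-only and full-length (pre-mRNA) gene counts, as well as RNA velocity output matrices. TCR-seq samples were processed using Cell Ranger v.6.1.1 with the VDJ reference vdj-GRCh38-alts-5.0.0. The default settings of the reference-based ‘cellranger vdj’ command were used. FASTQ files were converted to the _S1_L001_R1_001.fastq.gz format to be compatible with Cell Ranger.

Jupyter notebooks used for data quality control, preprocessing, integration and annotation are available in the GitHub repository for this manuscript (see Code availability). Scanpy v.1.9.1 with anndata v.0.10.7 and the statistics and plotting libraries pandas v.2.2.2, NumPy v.1.26.4, SciPy v.1.13.0, seaborn v.0.13.2 and Matplotlib v.3.8.4 were used for data analysis and visualization. Ambient RNA was computationally removed from the mapped libraries using CellBender42 v.0.1.0. Next, all datasets were filtered for cell quality, and cells with <400 or >6,500 genes, >6% mitochondrial reads or <5% ribosomal counts were removed. Doublets were annotated using Scrublet43 v.0.2.3. Next, datasets were integrated with scVI from scvi-tools44 v.0.19.0, for which mitochondrial, TCR and cell cycle genes were removed, and cells were annotated into major lineages (cell_type_level_0: T_DN, T_DP, T_SP, Epithelial, Stroma, Myeloid, RBC, B and Schwann) by Leiden clustering. Individual cell lineages were then separated and integrated with scVI to perform fine-grained annotation and remove remaining doublets picked up by manual annotation. Cell annotations were assigned based on four sequential steps: (1) high-resolution Leiden clustering was performed to find all potential cell clusters. (2) Annotations of new cells generated in this study were predicted (i) based on KNN graph majority voting of neighbouring cells with annotations from previous studies by adaptation of the weighted KNN transfer solution from scArches45, or (ii) automatic label transfer using CellTypist46 v.1.6.2 with the Developing_Human_Thymus or Pan_Fetal_Human models. (3) Calling the cell type annotation for a given cluster was then informed by a combination of the predicted labels and by marker genes reported in the literature. (4) Additional QC was performed and newly detected doublet clusters were removed where applicable. Steps 1–4 were repeated until final, fine-grained annotations were reached for the highest resolution (cell_type_level_4).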
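The cell quality thresholds quoted above (fewer than 400 or more than 6,500 genes, more than 6% mitochondrial reads, less than 5% ribosomal counts) can be illustrated with a minimal sketch. The `CellQC` record, its field names and the choice to keep cells that sit exactly on a threshold are assumptions made for illustration; the notebooks in the authors' repository remain the reference for the actual filtering.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Hypothetical per-cell QC summary (field names are illustrative)."""
    n_genes: int      # number of genes detected in the cell
    pct_mito: float   # percentage of counts from mitochondrial genes
    pct_ribo: float   # percentage of counts from ribosomal genes


def passes_qc(cell, min_genes=400, max_genes=6500,
              max_pct_mito=6.0, min_pct_ribo=5.0):
    """Return True if the cell is kept under the thresholds in the text.

    Assumption: values exactly on a threshold are kept, because the text
    only states strict inequalities for removal.
    """
    if cell.n_genes < min_genes or cell.n_genes > max_genes:
        return False
    if cell.pct_mito > max_pct_mito:
        return False
    if cell.pct_ribo < min_pct_ribo:
        return False
    return True


# Toy example: one good cell and one failure per criterion.
cells = {
    "good": CellQC(n_genes=2500, pct_mito=2.0, pct_ribo=20.0),
    "too_few_genes": CellQC(n_genes=350, pct_mito=2.0, pct_ribo=20.0),
    "too_many_genes": CellQC(n_genes=7000, pct_mito=2.0, pct_ribo=20.0),
    "high_mito": CellQC(n_genes=2500, pct_mito=9.5, pct_ribo=20.0),
    "low_ribo": CellQC(n_genes=2500, pct_mito=2.0, pct_ribo=1.0),
}
kept = [name for name, qc in cells.items() if passes_qc(qc)]
```

In this toy input only the cell named `good` survives; each of the other four fails exactly one criterion.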
Importantly, cell clusters that passed quality control but that we could not confidently assign to a defined cell type either in the literature or by cell markers by the strategy described above, were kept in the integrated object as ‘cell_type_level_4_explore’, which we recommend for future exploration and validation of these cell states. Annotations in ‘cell_type_level_4’ were grouped into five hierarchical levels from the finest (cell_type_level_4) to the broadest (cell_type_level_0) (Supplementary Table 7). 　　Single-cell TCR-sequencing data were processed using Dandelion47 v.0.3.1, and a detailed notebook can be found in the GitHub repository for this Article (see Code availability). In brief, a Dandelion class object with n_obs was constructed from all combined TCR libraries. Cells were then further subset to only include DP and SP subtypes that contained V and J rearrangements for both TRB and TRA loci. Next, milo neighbourhoods were constructed based on scVI neighbourhood graphs (n_neighbors = 100) and VDJ genes and frequency feature space was calculated. 　　The scFates package48 v.1.0.7 was used for trajectory analysis on the combined mTECII and mTECIII cells. mTECII and mTECIII cells from the paediatric scRNA-seq dataset were reintegrated and batch-corrected using scVI. Next, the object was preprocessed according to recommendations from Palantir49 and the scVI latent embedding was used as an input to ‘palantir.utils.run_diffusion_maps’ (Palantir v.1.3.3). Tree learning was performed using scf.tl.tree by using multiscale diffusion space from Palantir as recommended by scFates. Next, the node that was characterized by expression of some mTECI markers (ASCL1, CCL21) together with mTECII markers (AIRE, FEZF2) was used as a root to compute pseudotime using scf.tl.pseudotime. Finally, milestones obtained after running the pseudotime were used to adjust the annotations. 
To derive differentially expressed genes across pseudotime branches, we applied the scf.tl.test_fork function, followed by scf.tl.branch_specific as described in the scFates article48. In brief, this approach fits a generalized additive model for each gene using pseudotime, branch and the interaction between pseudotime and branch as covariates; two-sided P values were extracted for the interaction term (pseudotime:branch) and corrected using FDR to obtain significant differentially expressed genes. Next, these genes were tested for upregulation in each branch and assigned to different branches based on a cut-off of 1.3-fold upregulation.

The STEMNET package50 (v.0.1) was used to determine the differentiation potential of mcTEC progenitor cells. For this purpose, only mTEC, cTEC and mcTEC(-Prolif) cells were retained in the dataset. mTECI/II/III and cTECI/II/III were set as maturation endpoints and the probability of each cell adopting any of these six possible fates was calculated. To identify priming of mcTEC(-Prolif) towards mTECI versus cTECI fate, we derived a priming score by calculating the difference between the posterior probabilities for these two fates for each cell.
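As a minimal sketch of the priming score, the snippet below takes the two posterior probabilities of a cell and returns their difference, then applies the ±0.5 cut-offs given in the sentence that follows. The direction of the subtraction (cTECI minus mTECI) is an assumption inferred from those cut-offs, since scores below −0.5 are assigned to the mTECI side; the function names and label strings are illustrative and are not part of STEMNET.

```python
def priming_score(p_mtec1, p_ctec1):
    """Difference of posterior fate probabilities for one cell.

    Assumption: score = P(cTECI) - P(mTECI), so negative scores point
    towards mTECI and positive scores towards cTECI.
    """
    return p_ctec1 - p_mtec1


def priming_label(score, cutoff=0.5):
    """Label a cell from its priming score using a symmetric cut-off."""
    if score < -cutoff:
        return "mTECI-primed"
    if score > cutoff:
        return "cTECI-primed"
    return "unprimed"


# Toy posterior probabilities (P(mTECI), P(cTECI)) for three cells.
examples = [(0.8, 0.1), (0.1, 0.9), (0.4, 0.5)]
labels = [priming_label(priming_score(pm, pc)) for pm, pc in examples]
```

Because the six fate probabilities of a cell sum to at most one, the score is bounded between −1 and 1, and the ±0.5 cut-off only labels cells in which one of the two fates clearly dominates the other.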
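Cells with a priming score of <−0.5 or >0.5 were labelled as mTECI-primed and cTECI-primed, respectively, whereas all other cells, which had comparable mTECI and cTECI potential, were considered unprimed.

OrganAxis is a mathematical model designed to derive the relative, signed position of a point in space with respect to two morphological landmarks. The OrganAxis base function H is highly flexible and can be tuned to the research question, spatial resolution and sampling frequency. In Supplementary Notes 1 and 2, we provide a detailed guide to tissue annotation with TissueTag v.0.1.1 and the details of the OrganAxis derivation. The CMA is one form of OrganAxis and is defined by a weighted linear combination of two H functions (CMA = 0.2 × H(edge-to-cortex) + 0.8 × H(cortex-to-medulla)). All IBEX and Visium images were annotated with TissueTag v.0.1.1 at a resolution of 2 µm per pixel (ppm = 0.5). Annotations were then transferred to a quasi-hexagonal grid, generated by placing points at r-micron spacing in the x and y directions and offsetting alternate rows by r/2. Throughout the study we used r = 15. L2 distances to the broad annotations (annotation_level_0) and the corresponding CMA values were calculated with k = 10 for all spots in the hexagonal grid. L2 distances to the fine annotations (annotation_level_1) were calculated with k = 1. CMA values and L2 distances were then transferred by nearest-neighbour mapping to the Visium spots or to the IBEX nuclear segmentations, so that they are spatially homogeneous across the two spatial technologies.

To provide a common reference, we also binned the axis into ten levels in sequential discrete space across the whole thymus (capsular, subcapsular, cortical level 1, cortical level 2, cortical level 3, cortical CMJ, medullary CMJ, medullary level 1, medullary level 2 and medullary level 3; details are provided in Supplementary Table 2 and Supplementary Note 2).

Resected fetal and paediatric thymi were directly moved to HypoThermosol (Sigma-Aldrich, H4416-100ML) (paediatric samples) or cold PBS (fetal samples), shipped by courier with ice packs and processed within 24 h after surgery. For embedding, fetal or paediatric thymus tissue was first transferred to PBS and kept on ice for a few minutes to wash off any excess medium and preservation liquid (such as HypoThermosol). Next, as much liquid as possible was removed from the sample and, where necessary, the tissue was trimmed to fit the cryomould. Samples were placed into cryomoulds (Tissue-Tek, AGG4581) filled with OCT (Leica Biosystems, 14020108926) and positioned according to the desired orientation. The cryomould was then placed into isopentane equilibrated to −60 °C and left to freeze completely for 2 min. Samples were then placed on dry ice to allow the isopentane to drain. Finally, the cryomoulds were wrapped in foil and stored at −80 °C. On the day of sectioning, samples were taken out 1 h before sectioning and placed in the cryostat (Leica Biosystems) at −18 °C to equilibrate. Tissue was sectioned (section thickness is provided in Supplementary Table 2) and sections were placed onto Visium slides according to the manufacturer's protocol. Sections were stained with H&E and imaged at ×20 magnification (Hamamatsu NanoZoomer 2.0 HT). Libraries were further processed according to the manufacturer's protocol (Visium Gene Expression Slide & Reagent Kit, 10x Genomics, PN-1000184) (permeabilization times are shown in Supplementary Table 2). Samples were sequenced on the NovaSeq 6000 sequencer (Illumina) and the resulting FASTQ files were mapped with Space Ranger (10x Genomics; version numbers are provided in Supplementary Table 2).

To process Visium histology image data at a higher resolution than the Space Ranger default, we built a custom pipeline to extract an additional image resolution layer of up to 5,000 pixels (hires5K), which we found to be better suited to morphological analysis. We also developed our own fiducial image registration pipeline to improve accuracy, in which fiducials are detected with cellpose v.2.1.1 and RANSAC from scikit-image v.0.22.0 is used for affine registration to the reference frame (information provided by 10x Genomics). Finally, for flexible tissue detection, we used Otsu thresholding with an adjustable threshold.

Subsequently, we used TissueTag v.0.1.1 for semi-automated image annotation (Supplementary Note 1). Cortex and medulla pixels were predicted with a pixel random forest classifier, for which training annotations were generated from the spots with the highest expression of AIRE (for medulla) and ARPP21 (for cortex). The automated cortex/medulla annotations were then manually adjusted where necessary. In addition, we manually annotated individual thymic lobes and specific structures such as capsule/edge, freezing/sectioning artefacts, HCs, PVS and fetal thymus-associated lymphoid aggregates (as defined previously16). Morphological annotation and evaluation were carried out in consultation with an expert human thymic pathologist. A full example of the Visium processing pipeline and annotation is available in the GitHub repository for this Article (see Code availability).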
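Returning to the axis construction described earlier in this section, the quasi-hexagonal sampling grid and the weighted CMA combination can be sketched in a few lines. This is an illustration under stated assumptions, not the TissueTag implementation: the function names are hypothetical, the grid is built in microns on a rectangular area, and the base function H itself (derived in the Supplementary Notes) is not reproduced, so the two H values are simply passed in as numbers.

```python
def quasi_hex_grid(width_um, height_um, r=15.0):
    """Points spaced r microns apart in x and y, with every other row
    shifted by r/2 (assumed reading of the grid described in the text)."""
    points = []
    row = 0
    y = 0.0
    while y <= height_um:
        # Odd rows start half a spacing to the right.
        x = r / 2.0 if row % 2 else 0.0
        while x <= width_um:
            points.append((x, y))
            x += r
        y += r
        row += 1
    return points


def cma(h_edge_to_cortex, h_cortex_to_medulla, w_edge=0.2, w_medulla=0.8):
    """CMA = 0.2 * H(edge-to-cortex) + 0.8 * H(cortex-to-medulla)."""
    return w_edge * h_edge_to_cortex + w_medulla * h_cortex_to_medulla


def microns_to_pixels(point_um, ppm=0.5):
    """Convert a grid point to image pixels at ppm pixels per micron."""
    return (point_um[0] * ppm, point_um[1] * ppm)


grid = quasi_hex_grid(width_um=30.0, height_um=15.0, r=15.0)
grid_px = [microns_to_pixels(p) for p in grid]
example_cma = cma(h_edge_to_cortex=0.5, h_cortex_to_medulla=-0.5)
```

With r = 15 and ppm = 0.5, neighbouring grid points within a row are 7.5 pixels apart in the annotated image. The 0.8 weight means that the cortex-to-medulla term dominates the axis, with the edge-to-cortex term mainly adding resolution near the capsule.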
To ensure the best possible match between Visium and single-cell profiles, we performed spatial mapping with cell2location51 v.0.1.3 separately for the fetal and paediatric datasets. We therefore split the single-cell reference dataset by fetal or paediatric stage and removed rare cell types that were predominantly found in only one of the stages. We then further removed cell types that showed a stress signature (which we assumed to originate from technical factors) and cell types with < 40 cells. A list of cell types that were excluded and the exclusion criteria are provided in Supplementary Table 9. Before cell2location deconvolution, we removed cell cycle genes (Supplementary Table 10) and mitochondrial genes, as well as TCR genes using the regular expression ‘^TR[AB][VDJ]|^IG[HKL][VDJC]’. We then further calculated highly variable genes and used relevant metadata cofactors (sample, chemistry, study, donor, age) to correct for batch effects in the cell2location model.

After deconvolution, all Visium data were subjected to filtering based on read coverage and predicted cell abundance. Spots with fewer than 1,000 genes per spot or fewer than 25 predicted cells were omitted. Furthermore, annotated tissue artefacts and areas not assigned to a specific structure were removed. Next, to generate a common embedding we performed scVI integration after removing cell cycle, mitochondrial and TCR genes from the highly variable gene selection. scVI training was performed with ‘SampleID’ as the batch key, ‘SlideID, Spaceranger, section_thickness(um)’ as categorical covariates, and ‘Age(numeric), n_genes_by_counts’ as continuous covariates.

Before performing any association analysis with the CMA, we further removed lobules (based on ‘annotations_lobules_0’) that had no or only small medullary or cortical regions, as we expected our CMA model to be less accurate in these cases.

To estimate the dependency of the axis on spot gene variance across samples, we first normalized to a target sum of 2,500 counts and performed log transformation followed by combat regression (https://scanpy.readthedocs.io/en/stable/api/generated/scanpy.pp.combat.html) by sample to adjust for the batch effect of individual samples.
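Regarding the gene filtering applied before cell2location deconvolution, the regular expression quoted above can be tried out directly on a list of gene symbols. The sketch below only uses Python's standard `re` module; the helper name and the example gene list are illustrative, and the authors' notebooks may apply the pattern differently (for example through pandas string methods).

```python
import re

# Pattern quoted in the text: the first alternative targets TRA/TRB V, D and J
# segments, the second targets IGH/IGK/IGL V, D, J and C genes.
RECEPTOR_GENE_PATTERN = re.compile(r"^TR[AB][VDJ]|^IG[HKL][VDJC]")


def drop_receptor_genes(gene_names):
    """Return the gene names that do NOT match the exclusion pattern."""
    return [g for g in gene_names if not RECEPTOR_GENE_PATTERN.search(g)]


genes = ["TRAV1-2", "TRBJ2-7", "TRAC", "TRDV1",
         "IGHV3-23", "IGKC", "IGHM", "AIRE", "CD3E"]
kept = drop_receptor_genes(genes)
removed = [g for g in genes if g not in kept]
```

On this toy list the pattern removes TRAV1-2, TRBJ2-7, IGHV3-23 and IGKC, and keeps TRAC, TRDV1 and IGHM, because the TCR alternative is restricted to the TRA and TRB loci and to V, D and J segments. A prefix pattern of this kind can also catch unrelated symbols that happen to share the prefix (TRADD, for example, matches), so it is worth inspecting the removed list when reusing it.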
We then computed the PCA for batch-corrected gene expression and calculated the Spearman correlation between the first ten PCs and CMA or the number of genes detected per spot (n_gene_by_counts). Note that number of genes per spot, in our hands, was mostly influenced by inconsistent permeabilization during Visium library preparation and constituted the largest technical source of within-sample variance that we found in both fetal and paediatric Visium samples. To estimate the cumulative contribution of either CMA or the number of genes per spot, we multiplied the Spearman’s R with the percentage cumulative explained variance of the first 10 PCs. 　　To analyse cytokine gradients based on the spatial distribution across CMA bins, we first selected a group of 65 cytokines that were broadly expressed from the CellphoneDB database (v.4.1.0, genes annotated as ‘Cytokine’, ‘growthfactor | cytokine’, ‘cytokine’, ‘cytokine | hormone’). We excluded cytokines that were expressed in less than 5% of the spots in all CMA bins. We then performed hierarchical clustering on gene expression batch-corrected (combat) fetal Visium samples of the standardized mean expression of genes across bins using the Ward linkage method with the linkage function from the scipy.cluster.hierarchy module. A heat map was generated using the matrixplot function from the scanpy package52. 　　To compare the distribution of cytokines across developmental groups (fetal versus paediatric) and identify differentially distributed genes, we implemented a two-way ANOVA approach. We initially log-normalized Visium gene expression, then removed lobules for which not a single cortex or medulla Visium spot was detected to increase CMA confidence in both datasets. Data were grouped by mean expression per sample and CMA bin, such that each sample had a single datapoint per CMA bin; n = 16 (paediatric) and n = 12 (fetal) samples. 
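The grouping step described in the last sentence, in which each sample contributes a single mean value per CMA bin, can be sketched as follows. The tuple layout of the input and the function name are assumptions made for illustration; in practice this would typically be a pandas `groupby` on the spot-level table.

```python
from collections import defaultdict
from statistics import mean


def mean_per_sample_and_bin(spots):
    """Collapse spot-level expression to one mean per (sample, CMA bin).

    `spots` is an iterable of (sample_id, cma_bin, expression) tuples
    (hypothetical layout) holding log-normalized expression of one gene.
    """
    groups = defaultdict(list)
    for sample_id, cma_bin, expression in spots:
        groups[(sample_id, cma_bin)].append(expression)
    return {key: mean(values) for key, values in groups.items()}


spots = [
    ("S1", "cortical level 1", 1.0),
    ("S1", "cortical level 1", 3.0),
    ("S1", "medullary level 1", 2.0),
    ("S2", "cortical level 1", 4.0),
]
per_sample_bin = mean_per_sample_and_bin(spots)
```

Collapsing to one value per sample and bin keeps samples, rather than individual spots, as the unit of replication in the two-way ANOVA, so samples with many spots do not dominate the test.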
Cosine similarity was calculated based on the median values of the pooled sample bins between fetal and paediatric gene profiles with sklearn.metrics.pairwise.cosine_similarity from scikit-learn v.0.22.0. Two-way ANOVA for age group (fetal versus paediatric) and CMA bin was calculated with statsmodels.api.stats.anova_lm(model, typ=2). P values for main and interaction effects were Bonferroni corrected with statsmodels.stats.multitest.multipletests(pvals, alpha=0.05, method=‘bonferroni’). For the full report of the results, refer to Supplementary Table 5.

To estimate the average position of a cell or gene distribution along the CMA and HC axes (L2 distance to the nearest HC), spots with low gene expression were filtered out using appropriate thresholds (0.2 for scVI-corrected gene expression and 0.5 for predicted cell abundances). The position of a gene or cell was then calculated as follows: for every gene or cell type and each axis, the weighted mean was calculated as the dot product of the spot cell abundance values and the spot CMA positions, divided by the sum of the cell abundance values across spots.

To identify genes exclusively expressed in a specific cell type or subset thereof (‘specialization genes’, SGs), we developed custom Python functions. Starting from raw read counts, gene expression was scaled with scipy.stats.zscore(). Cells that showed expression below a cut-off of 0.05 and genes that had expression below 1.5 mean counts were excluded from further steps. Next, a quantile threshold (>95%) was used to select the cells with the highest expression level of a given gene. A χ2 test (SciPy chi2_contingency) was performed for each gene to identify whether the selected cells were overrepresented in a specific cell type (cell_type_level_4_explore), indicating that the gene is a marker gene. Genes predicted to be expressed in only a single cell type (χ2 α = 1 × 10^−50) were considered SGs and used for further analysis.

Human thymus samples were obtained from the Pathology Department of the Children's National Medical Center in Washington, DC, following cardiothoracic surgery from children with congenital heart disease, as thymic tissue is routinely removed and discarded to gain adequate exposure of the retrosternal operative field. Use of these thymus samples for this study was determined to be exempt from review by the NIH Institutional Review Board in accordance with the guidelines issued by the Office of Human Research Protections. There were no genetic concerns for the patients in this cohort. Details on this cohort can be found in Supplementary Table 2. Human thymi were placed in PBS on receipt and processed within 24 h after surgery. Excess fat and connective tissue were trimmed and the thymi were cut into <5 mm cubes.
For IBEX imaging, human thymi were fixed with BD CytoFix/CytoPerm (BD Biosciences) diluted in PBS (1:4) for 2 days. After fixation, all tissues were washed briefly (5 min per wash) in PBS and incubated in 30% sucrose for 2 days before embedding in OCT compound (Tissue-Tek) as described previously17,53. 　　IBEX imaging was performed on fixed frozen sections as described previously17,53. In brief, 20 μm sections were cut on a CM1950 cryostat (Leica) and adhered to two-well chambered cover glasses (Lab-tek) coated with 15 μl of chrome alum gelatin (Newcomer Supply) per well. Frozen sections were permeabilized, blocked and stained in PBS containing 0.3% Triton X-100 (Sigma-Aldrich), 1% bovine serum albumin (Sigma-Aldrich) and 1% human Fc block (BD Biosciences). Immunolabelling was performed with the PELCO BioWave Pro 36500-230 microwave equipped with a PELCO SteadyTemp Pro 50062 Thermoelectric Recirculating Chiller (Ted Pella) using a 2-1-2-1-2-1-2-1-2 program. The IBEX thymus antibody panel can be found in Supplementary Table 3 and has been formatted as an Organ Mapping Antibody Panel54 (OMAP-17) accessible online (https://humanatlas.io/omap). Custom antibodies were purchased from BioLegend or conjugated in house using labelling kits for Lumican (AAT Bioquest, 1230) and LYVE-1 (Thermo Fisher Scientific, A20182). A biotin avidin kit (Abcam, ab64212) was used to block endogenous avidin, biotin and biotin-binding proteins before streptavidin application. Cell nuclei were visualized with Hoechst 33342 (Biotium) and the sections were mounted using Fluoromount G (Southern Biotech). Mounting medium was thoroughly removed by washing with PBS after image acquisition and before chemical bleaching of fluorophores. After each staining and imaging cycle, the samples were treated with two 15 min treatments of 1 mg ml−1 of LiBH4 (STREM Chemicals) prepared in deionized H2O to bleach all fluorophores except Hoechst. 　　
Representative sections from different tissues were acquired using the inverted Leica TCS SP8 X confocal microscope with a ×40 objective (NA 1.3), 4 HyD and 1 PMT detectors, a white-light laser that produces a continuous spectral output between 470 and 670 nm as well as 405, 685 and 730 nm lasers. Panels consisted of antibodies conjugated to the following fluorophores and dyes: Hoechst, Alexa Fluor (AF)488, FITC, AF532, phycoerythrin (PE), eF570, AF555, iFluor 594, AF647, eF660, AF680 and AF700. All images were captured at an 8-bit depth, with a line average of 3 and 1024 × 1024 format with the following pixel dimensions: x (0.284 μm), y (0.284 μm) and z (1 μm). Images were tiled and merged using the LAS X Navigator software (LAS X v.3.5.7.23225). For IBEX tissue imaging, multiple tissue sections were examined before selecting a representative tissue section that contained several distinct lobules with multiple functional units, often resulting in an unusually shaped region of interest. Fluorophore emission was collected on separate detectors with sequential laser excitation of compatible fluorophores (3–4 per sequential) used to minimize spectral spillover. The Channel Dye Separation module within LAS X v.3.5.7.23225 (Leica Microsystems) was then used to correct for any residual spillover. For publication-quality images, Gaussian filters, brightness/contrast adjustments and channel masks were applied uniformly to all images. Image alignment of all IBEX panels was performed as described previously using SimpleITK55,56. Additional details on antibodies, protocols and software can be found on the IBEX Knowledge-Base (https://doi.org/10.5281/zenodo.7693279). 　　IBEX images were converted from .ims format (Imaris, Oxford Instruments, v.9.5.0 and v.10.0.0) to 3D stacks (TIFF) by individual channels with FIJI v.1.54j. 
We then applied a custom pipeline for 3D single nuclei segmentation with cellpose v.2.1.1: 　　We used a KNN algorithm to compare the annotated cell types in our scRNA-seq reference atlas with the IBEX single-cell segmentations. For this purpose, the protein expression in the IBEX query cells was matched with the RNA expression of the corresponding genes in the scRNA-seq reference. Protein and gene names were matched according to Supplementary Table 11. Batch effects were removed from IBEX with scanpy.pp.combat(ibex_gene, key=‘sample’, inplace=True). We next subsetted each IBEX sample and ran the KNN prediction algorithm per sample with k = 30, including the following steps: 　　In some of our IBEX samples AIRE staining in particular produced a relatively high level of non-specific signal and low signal-to-noise ratio. As a consequence, we identify some predicted AIRE+ mTECII cells that are not in the medulla and have capsular/cortical localization (Fig. 4g). We had cases of individual samples for which a specific antibody did not perform well or was missing (for example, for IBEX_01 the KRT10 antibody was out of stock). However, as ISS patcher was run on each sample separately, this did not affect proper scaling of that marker and its use of it for mapping in other samples. Moreover, by selecting cells with a higher proportion of matched KNN cells from the single-cell data (KKNf) these effects are reconciled through the removal of cells with low-confidence mapping. We would like to flag this point for future researchers who want to reuse our datasets. 　　The general code for the ISS patcher KNN mapping algorithm can be found in the dedicated GitHub repository (https://github.com/Teichlab/iss_patcher/tree/main) and the full example for KNN mapping using IBEX is reported in the GitHub repository for this Article (see Code availability). 　　
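To make the idea of KNN-based label transfer with a confidence filter concrete, here is a minimal, dependency-free sketch. It is not the ISS patcher implementation: the function name, the use of Euclidean distance on already scaled marker values and the `knn_fraction` output (the share of the k neighbours that carry the winning label) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_annotate(query, reference, labels, k=30):
    """Assign each query cell the majority label of its k nearest
    reference cells, with the fraction of neighbours that agree.

    query, reference: lists of equal-length marker vectors, assumed to be
    scaled to comparable ranges beforehand.
    labels: one cell-type label per reference cell.
    """
    k = min(k, len(reference))
    results = []
    for q in query:
        # Sort reference cells by Euclidean distance to the query cell.
        neighbours = sorted(
            (math.dist(q, r), lab) for r, lab in zip(reference, labels)
        )[:k]
        votes = Counter(lab for _, lab in neighbours)
        label, count = votes.most_common(1)[0]
        results.append((label, count / k))
    return results


reference = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["T_DP", "T_DP", "T_DP", "mTECII", "mTECII", "mTECII"]
query = [(0.2, 0.1), (5.5, 5.5)]

calls_k3 = knn_annotate(query, reference, labels, k=3)
calls_k5 = knn_annotate(query, reference, labels, k=5)

# Keep only confident calls, in the spirit of the filtering described above.
confident = [label for label, frac in calls_k5 if frac >= 0.6]
```

Running the mapping separately per sample, as described in the text, means that any scaling of marker intensities is also done per sample, so a poorly performing antibody in one sample does not distort the scaling in the others.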
To annotate T cell types in the IBEX data with high accuracy, we applied the ISS patcher with the CITE-seq T lineage data as a reference as follows: we first applied the KNN algorithm to IBEX data using the scRNA-seq reference atlas to identify and subset IBEX T lineage cells. We then used the CITE-seq data as a reference to repeat the KNN-based annotation on these selected cells. This KNN-based reannotation was performed on a hybrid RNA/protein reference, which included protein measurements for the 19 markers assayed in both CITE-seq and IBEX in addition to RNA measurements for the remaining 23 genes as described in Supplementary Table 11. We used the same KNN implementation as described above but with k = 7, while also imputing CD4 and CD8 pseudotime. 　　Human paediatric samples were obtained from cardiac corrective surgeries. Removed thymi were directly moved to HypoThermosol (Sigma-Aldrich, H4416-100ML), shipped with a courier with ice packs and processed in under 24 h after surgery. After arrival, for FFPE processing, the samples were cut to approximately 1 cm2 pieces with sharp scissors in 1× DPBS. Tissue pieces were then rinsed in clean DPBS to remove any excess HypoThermosol, patted dry with wipes (Kimtech) and placed into 10% formalin (cellpath, BAF-6000-08A) for 16–24 h at room temperature. The next day, tissues were dehydrated and embedded in wax, then kept at 4 °C. 　　Multiplex immunofluorescence and single round imaging was performed as described previously57. All steps were performed at room temperature unless stated otherwise. In brief, FFPE blocks were sectioned using a microtome (Leica, RM2235) at 3.5–5 µm thickness and placed onto a superfrost slide (Thermo Fisher Scientific, 12312148). Slides were dried at 60 °C for 60 min to ensure that tissue sections had adhered to the slides. Tissue sections were deparaffinized and subjected to antigen retrieval using the BioGenex EZ-Retriever system (95 °C for 5 min followed by 107 °C 5 min). 
For OCT sections, 7 µm sections were taken using a cryostat (Leica CM3050S), placed onto SuperFrost Plus slides (VWR) and immediately submerged in 10% buffered saline formalin (Cellpath, BAF-6000-08A) for 1 h at room temperature. The samples were then subjected to the following steps similarly to the FFPE samples. To remove autofluorescence, slides were bleached with AF quench buffer (4.5% H2O2, 24 mM NaOH in PBS). The slides were quenched for 60 min using the ‘high’ setting with a strong white-light exposure followed by further quenching for 30 min using the 365 nm ‘high’ setting using a UV transilluminator. The slides were rinsed with 1× PBS and incubated in 300 µl of Image-iT FX Signal Enhancer (Thermo Fisher Scientific, I36933) for 15 min. The slides were rinsed again and 300 µl of labelled primary antibody staining cocktail was added to the tissue, which was subsequently incubated for 120 min in the dark within a humidity tray. All antibodies were prediluted according to company recommendations and were not adjusted further (Supplementary Table 4). The slides were washed with a surfactant wash buffer and 300 µl of nuclear staining in goat diluent was added to the slide. The slides were then incubated in the dark for 30 min in a humidity tray. The slides were then washed and placed in 1× PBS. Finally the slides were coverslipped using ArgoFluor mount medium and left in the dark at room temperature overnight to dry. The slides were imaged the next day using the RareCyte Orion microscope with a ×20 objective and relevant acquisition settings were applied using the software Artemis v.4. 　　For RNAscope analysis, thymus tissue was processed as described above for Visium sectioning. Sections were cut from the fresh frozen OCT-embedded (OCT, Leica) samples at a thickness of 10 μm using a cryostat (Leica, CM3050S) and placed onto SuperFrost Plus slides (VWR). Sections were stored at −80 °C until staining. 
The sections were removed from the −80 °C storage and submerged in chilled (4 °C) 4% PFA for 15 min, then acclimatized to room temperature 4% PFA over 120 min. The sections were then briefly washed in 1× PBS to remove any remaining OCT. Then, the sections were dehydrated in a series of 50%, 70%, 100% and 100% ethanol (5 min each) and air-dried before performing automated 4-plex RNAscope. 　　Using the automated Leica BOND RX, RNAscope staining was performed on the fresh frozen sections using the RNAscope LS multiplex fluorescent Reagent Kit v2 Assay and RNAscope LS 4-Plex Ancillary Kit for LS Multiplex Fluorescent (Advanced Cell Diagnostics (ACD), Bio-Techne) according to the manufacturer’s instructions. All of the sections were subjected to 15 min of protease III treatment before staining protocols were performed. Before running RNAscope probe panels, the RNA quality of fresh frozen samples was assessed using multiplex positive (RNAscope LS 2.5 4-plex Positive Control Probe, ACD Bio-Techne, 321808) and negative (RNAscope 4-plex LS Multiplex Negative Control Probe, ACD Bio-Techne, 321838) controls. 　　The probes were labelled using Opal 520, 570 and 650 fluorophores (Akoya Biosciences, 1:1,000) and one probe channel was labelled using Atto 425-streptavidin fluorophore (Sigma-Aldrich, 1:500), which was first incubated with TSA–biotin (Akoya Biosciences, 1:400). The following RNAscope 2.5 LS probes were used for this study: Hs-AIRE (ACD Bio-Techne, 551248), Hs-LY75-C2 (ACD Bio-Techne, 481438-C2), Hs-CAMP-C3 (ACD Bio-Techne, 446248-C3), Hs-EPCAM-C4 (ACD Bio-Techne, 310288-C4), Hs-IGFBP6-C1 (ACD Bio-Techne, 496068) and Hs-DLK2-C3 (ACD Bio-Techne, 425088-C3). All nuclei were DAPI stained (Life Technologies, D1306). Details are provided in Supplementary Table 12. 　　Confocal imaging was performed on the Perkin Elmer Operetta CLS High Content Analysis System using a ×20 (NA 0.16, 0.299 μm px−1) water-immersion objective with 9-11 z-stacks with 2 µm step. 
Channels: DAPI (excitation, 355–385 nm; emission, 430–500 nm), Atto 425 (excitation, 435–460 nm; emission, 470–515 nm), Opal 520 (excitation, 460–490 nm; emission, 500–550 nm), Opal 570 (excitation, 530–560 nm; emission, 570–620 nm), Opal 650 (excitation, 615–645 nm; emission, 655–760 nm). Confocal image stacks were stitched as individual z stacks using proprietary Acapella scripts provided by Perkin Elmer, and visualized using OMERO Plus (Glencoe Software). 　　The contrast used for Extended Data Fig. 6e was as follows: DLK2 (magenta 150–500), IGFBP6 (yellow 200–1500), LY75 (green 200–4000), EPCAM (red 300–2500). 　　Paediatric thymus samples from children undergoing cardiac surgery were obtained according to and used with the approval of the Medical Ethical Commission of Ghent University Hospital, Belgium (EC/2019-0826) through the haematopoietic cell biobank (EC-Bio/1-2018). Thymus tissue was cut into small pieces using scalpels and digested with 1.6 mg ml−1 collagenase (Gibco, 17104-019) in IMDM medium for 30 min at 37 °C with regular agitation to generate a single-cell suspension. The reaction was quenched with 10% FBS and the thymocyte suspension was passed through a 70 μm strainer to remove undigested tissue. Cells were frozen in FBS containing 10% DMSO and stored in liquid nitrogen until needed. 　　The TotalSeq-C Human Universal Cocktail 1.0 (BioLegend, 399905) was resuspended according to the manufacturer’s instructions. In brief, the lyophilized cocktail was equilibrated to room temperature for 5 min and then centrifuged at 10,000g for 30 s. Then, 27.5 µl cell staining buffer (BioLegend, 420201) was added and the tube was vortexed for 10 s, incubated for 5 min at room temperature and vortexed again. The resuspended cocktail was centrifuged for 30 s at room temperature at 10,000g and the entire volume was transferred to a low-bind tube. 
Finally, the tube was centrifuged again at 14,000g for 10 min at 4 °C and 25 µl of the supernatant was used per sample (2 × 10^6 cells in 200 µl).

In total, 13 TotalSeq-C antibodies (BioLegend) were titrated individually by flow cytometry using PE-conjugated versions of the same antibody clone as recommended by BioLegend. After choosing a suitable concentration for each antibody, a master mix was prepared for cell staining. For this, all antibodies were initially diluted in the cell staining buffer to obtain a concentration 100-fold higher than the desired final staining concentration. Then, 2 µl of each diluted antibody substock was combined in a master mix, which was added to the cells for labelling in a total volume of 200 µl. Details on the TotalSeq-C antibodies are provided in Supplementary Table 6.

Cells were thawed slowly by gradually adding 15 volumes of pre-warmed IMDM medium and pelleted at 1,700 rpm for 6 min at 4 °C. After resuspending in PBS, cells were passed through a 70 μm strainer to remove clumps. Enrichment for viable cells was achieved using a magnetic bead-based dead cell removal kit (Miltenyi, 130-090-101). For this, cells were pelleted as before, washed with 1× binding buffer (part of the kit, prepared with sterile distilled water) and resuspended in dead cell removal microbeads (part of the kit) at a concentration of 10^7 cells per 100 µl beads. After incubation at room temperature for 15 min, cells were applied to an LS column (Miltenyi, 130-122-729), which was prerinsed with 3 ml 1× binding buffer. The column was washed four times with 3 ml binding buffer and the flow-through containing viable cells was collected. Cells in the flow-through were pelleted and viability was confirmed using trypan blue. A total of 2 × 10^6 viable cells was used for TotalSeq-C and anti-CD3-PE antibody staining.
For this purpose, cells were washed with cell staining buffer (BioLegend, 420201), pelleted at 600g for 10 min at 4 °C and resuspended in 90 µl cell staining buffer. Then, 10 µl Human TruStain FcX blocking solution (BioLegend, 422301) was added and cells were incubated for 10 min at 4 °C. The TotalSeq-C Human Universal Cocktail 1.0 (BioLegend, 399905) was resuspended as described above, centrifuged at 14,000g for 10 min at 4 °C and 25 µl of the supernatant was added to the blocked cells. Individual TotalSeq-C antibodies were prepared as described above and 26 µl of the master mix was added to each sample. To facilitate enrichment of immature and mature thymocytes by FACS, 10 µl anti-CD3-PE (SK7, BioLegend, 344805) was added and the samples were topped up with 40 µl cell staining buffer resulting in a total staining volume of 200 µl. The samples were incubated for 30 min at 4 °C in the dark. To wash off unbound antibodies, cell staining buffer was added to the samples, and cells were pelleted for 10 min at 600g at 4 °C. All supernatant was removed, cells were resuspended in cell staining buffer, transferred to a new tube and pelleted as before. Cells were again resuspended in cell staining buffer and pelleted and this wash step was repeated once more before cells were resuspended in 200 µl MACS buffer (2% FCS, 2 mM EDTA in PBS) in preparation for sorting. Then, 1 µl propidium iodide (Invitrogen, 230111) was added for detection of dead cells and samples were sorted on the BD FACSAria III or BD FACSAria Fusion cell sorter using a 100 μm nozzle and a maximum flow rate of 4 (FACSDiva v.8.0.2, reanalysis with FlowJo v.10.8.2). Cells were gated using forward/side scatter to remove doublets and debris, then dead cells were excluded based on PI staining. CD3− and CD3+ cells were collected separately in cooled IMDM + 50% FCS (Supplementary Fig. 19b). After completion of the sort, collection tubes were topped up with DPBS and cells were pelleted at 400g for 10 min at 4 °C. 
The supernatant was removed and cells were resuspended at an estimated concentration of 1,500 cells per µl in PBS + 0.04% BSA (Miltenyi, 130-091-376), of which 16.5 µl was used in the GEM generation step. The Next GEM Single Cell 5′ Kit v2 (10x Genomics, 1000265) was used to prepare the reaction master mix and to load cells, gel beads and partitioning oil on a Chip K (10x Genomics, 1000286) according to the manufacturer’s protocol CG000330 Rev A. GEMs were generated using a Chromium Controller (10x Genomics) and transferred to a tube strip, and reverse transcription was performed in a BioRad C1000 Touch Thermal Cycler according to the protocol. The samples were stored at 4 °C overnight and the library preparation was carried out the next day. 　　Feature barcode (FB), gene expression (GEX) and TCR libraries were prepared according to protocol CG000330 Rev A (10x Genomics) using the Chromium Next GEM Single Cell 5′ GEM Kit v2 (10x Genomics, 1000244), Library Construction Kit (10x Genomics, 1000190), 5′ Feature Barcode Kit (10x Genomics, 1000256), Human TCR Amplification Kit (10x Genomics, 1000252), Dual Index Kit TT set A (10x Genomics, 1000215) and Dual Index Kit TN set A (10x Genomics, 1000250). 
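The protocol version for >6,000 cells was followed, and libraries were amplified for 13 cycles (cDNA), 14 cycles (GEX), 8 cycles (FB) or 12 + 10 cycles (TCR libraries). Library quality and quantity were checked on a Bioanalyzer instrument (Agilent) using the High Sensitivity DNA assay. Libraries were pooled and sequenced on a NovaSeq 6000 instrument (Illumina) to a minimum depth of 25,000 reads per cell for GEX, 10,000 reads per cell for FB and 5,000 reads per cell for TCR libraries. 　　Analyses used the R packages Seurat⁵⁸ (v.4.3.0), SeuratObject (v.4.1.4), SeuratDisk (v.0.0.0.9021), SingleCellExperiment (v.1.24.0), Matrix (v.1.6-4), matrixStats, […], tidyr (v.1.3.1), reshape2 (v.1.4.4), BiocNeighbors (v.1.20.2), BiocParallel (v.1.36.0), stringr (v.1.5.1), reticulate (v.1.35.0) and 可口 (v.0.0.0.7). Data were visualized using ggplot2 (v.3.5.0), ggrastr (v.1.0.2), ggridges (v.0.5.6) and RColorBrewer (v.1.1-3). 　　FASTQ files from FB libraries were mapped using Cell Ranger v.7.0.0. GEX libraries were mapped with STARsolo, as described above for the scRNA-seq data. CITE-seq data were first subjected to quality control based on RNA properties, as described above. For the retained high-quality cells, ADT data were then denoised using dsb (v.1.0.3)⁵⁹. For this, empty droplets were identified in the unfiltered mapping output and used as a reference to estimate noise and antibody background levels. Approximately 1.4 million droplets were selected on the basis of RNA counts <240 reads, ADT counts between 120 and 350 reads and mitochondrial reads <5%, to ensure that no damaged cells were included in the subset. During denoising and normalization with DSBNormalizeProtein, ADT data from the empty droplets were used for background correction, and the seven isotype control antibodies included in the TotalSeq-C Human Universal Cocktail were used to determine technical variation. Data were scaled relative to the background by subtracting the mean and then dividing by the standard deviation of the empty droplets. Negative values after denoising (corresponding to very low expression) were set to zero to improve interpretation and visualization. In an additional quality-control step, cells with fewer than 100 antibodies detected were removed from the dataset as technical artefacts. In addition, owing to unreliable surface staining properties, the cell subsets affected by TCRγδ, CD199 (CCR9), CD370, CD357 and XCR1 were removed from the dataset. 　　CITE-seq data were annotated on a dimensionality reduction that integrated the RNA and ADT modalities. For this, both data modalities were log-normalized, and the top 2,000 highly variable genes were identified using standard functions of the Seurat package, followed by PCA on the scaled highly variable genes. MNN correction was applied to the PCA loadings matrix using the reducedMNN function of the batchelor package⁶⁰ (v.1.18.1) to reduce batch and donor effects between samples. To integrate the modalities, multimodal neighbours and modality weights were determined and a weighted nearest neighbour (WNN) graph⁵⁸ was constructed using FindMultiModalNeighbors on the basis of the PCA. The number of PCs to use was determined by requiring the difference in variation between consecutive PCs to be higher than 0.1%. A UMAP visualization was generated from the WNN graph to represent a weighted combination of both modalities. In addition, supervised PCA (SPCA) was run on the transcriptome data using the RunSPCA function of the Seurat package to obtain a dimensionality reduction of the RNA modality that represents the structure of the WNN graph⁵⁸. 　　To identify cell types and developmental stages, we carried out low-resolution Leiden clustering on the SPCA using the Seurat functions FindNeighbors and FindClusters. The resulting clusters were then subsetted and analysed separately, starting with the most immature cluster, which was identified by high expression of the surface marker CD34. For each subset, normalization and scaling of RNA and ADT data were repeated as described above, and a new WNN UMAP and SPCA were constructed. A combination of Leiden clustering on the SPCA and thresholding on protein levels was used to identify cell types. In addition, to identify proliferating cells, cell cycle scoring was performed using the G2M- and S-phase markers provided in the Seurat package. The FindMarkers function (Seurat) and the package singleCellHaystack⁶¹ (v.1.0.0) were used to identify differentially expressed genes and surface markers in a cluster-based and cluster-independent manner, respectively. Maturation stages of the CD4 and CD8 lineages were distinguished on the basis of CD45RA, CD45RO and CD1a, and identical expression cut-offs were used to ensure that the subsets were directly comparable. 　　Paired TCR-seq data were processed with Dandelion⁴⁷ v.0.3.1, and information on productive or non-productive TRA and TRB rearrangements was extracted for each cell to verify cell type annotations after clustering. 　　To carry out trajectory inference for αβT lineage cells, CITE-seq data were subsetted to contain DP_pos_sel, DP_4hi8lo, SP_CD4_immature, SP_CD4_semi-mature, SP_CD4_mature, SP_CD8_immature, SP_CD8_semi-mature and SP_CD8_mature cells. A new WNN UMAP was constructed on the basis of surface protein and RNA. Slingshot⁶² (v.2.6.0) was used to build a minimum spanning tree on the WNN UMAP using the getLineages function with the mutual nearest neighbour-based distance, with DP_pos_sel set as the starting point and SP_CD4_mature and SP_CD8_mature specified as end points. Smoothed lineages were obtained using the getCurves function, and the derived pseudotime was used to assess transcript and surface marker expression across differentiation. 　　For focused multimodal analysis, we subsetted the full scRNA-seq dataset to include only paediatric data and further removed T lineage cells without CITE-seq protein information. We then ran cell2location as described above to obtain a deconvolved cell type mapping based on the CITE-seq annotations. Because cell2location deconvolves spots on the basis of unique pseudobulk expression profiles from a single-cell reference with discrete cell annotations, the mapping resolution is limited to that of the annotated cell subsets. To investigate the continuous spatiotemporal nature of the CD4/CD8 cell lineages at increased resolution, we carried out high-resolution Leiden clustering of the CITE-seq data using scanpy (resolution = 35), resulting in 567 cell clusters. These cell clusters were then mapped onto our paediatric spatial data, as described above. To measure the position of each cell cluster across the CMA, we selected the spots with the highest abundance of that cluster (above the 95th percentile) and calculated the weighted mean CMA value of these spots. This value was then assigned to the cells belonging to the corresponding cluster in the single-cell object. Further details are provided in the GitHub repository for this paper (see Code availability). 　　Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.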
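
The per-cluster CMA position described above (spots above the 95th percentile of a cluster's abundance, then a weighted mean of their CMA values) can be illustrated with a minimal sketch. This is not the authors' code: the function name `cluster_cma_position`, the plain-list inputs and the choice of the mapped abundance itself as the weight are assumptions made for illustration, and the percentile uses linear interpolation, which may differ from the original implementation.

```python
from statistics import quantiles


def cluster_cma_position(abundance, cma, percentile=95):
    """Weighted mean CMA over the spots where one cluster is most abundant.

    abundance  -- per-spot mapped abundance of a single cluster (hypothetical input)
    cma        -- per-spot CMA value, in the same spot order as `abundance`
    percentile -- integer cutoff between 1 and 99; spots strictly above it are kept
    """
    if len(abundance) != len(cma):
        raise ValueError("abundance and cma must describe the same spots")
    # Percentile cutoff with linear interpolation (needs at least two spots).
    cutoff = quantiles(abundance, n=100, method="inclusive")[percentile - 1]
    top = [(a, c) for a, c in zip(abundance, cma) if a > cutoff]
    total = sum(a for a, _ in top)
    if total == 0:
        return None  # no spot exceeds the cutoff, so no position is defined
    # Assumption: each selected spot is weighted by its abundance.
    return sum(a * c for a, c in top) / total


# Toy data: 100 spots, of which the last five carry the cluster.
abundance = [0.0] * 95 + [1.0, 2.0, 3.0, 4.0, 10.0]
cma = [0.0] * 95 + [0.1, 0.2, 0.3, 0.4, 0.5]
position = cluster_cma_position(abundance, cma)
```

In the workflow described above, the returned value would then be copied to every cell of that cluster in the single-cell object.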

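The background scaling of ADT data described above (subtract the mean of the empty droplets, divide by their standard deviation, set negative values to zero) can be sketched for a single antibody as follows. This is a simplified illustration and not the dsb implementation: it omits the isotype-control step that the text applies before clipping, it assumes the values have already been log-transformed, and the function name `scale_to_background` and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def scale_to_background(cell_values, empty_values):
    """Scale one antibody's values against the empty-droplet background.

    cell_values  -- transformed ADT values in cell-containing droplets
    empty_values -- transformed ADT values of the same antibody in empty droplets
    """
    background_mean = mean(empty_values)
    background_sd = stdev(empty_values)  # assumption: sample standard deviation
    if background_sd == 0:
        raise ValueError("background has zero variance; cannot scale")
    # Express each cell as standard deviations above the background,
    # then set negative values (very low expression) to zero.
    return [max((x - background_mean) / background_sd, 0.0) for x in cell_values]


# Toy data: background mean 2.0 and standard deviation 1.0.
empty = [1.0, 2.0, 3.0]
cells = [5.0, 2.0, 0.5]
scaled = scale_to_background(cells, empty)
```

Values at or below the background mean end up at zero, so the scaled scale reads as the number of background standard deviations above the empty-droplet signal.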