XhetRel: a pipeline for X heterozygosity and relatedness analysis of sequencing data


Salman B., Bebek N., Uğur İşeri S.

Bioinformatics Advances, vol.6, no.1, 2026 (ESCI, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 6 Issue: 1
  • Publication Date: 2026
  • DOI: 10.1093/bioadv/vbag002
  • Journal Name: Bioinformatics Advances
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus, BIOSIS, Directory of Open Access Journals
  • Istanbul University Affiliated: Yes

Abstract

Motivation: Verification of sample sex is an essential quality control step in next-generation sequencing (NGS) studies and is typically assessed from genomic data. Clustering individuals by X chromosome heterozygosity (Xhet) and incorporating relatedness estimates offers a practical first-pass screen for potential sex label errors, sample mix-ups, and pedigree inconsistencies. To better interpret Xhet-based patterns, we further investigated their biological and technical origins using the 1000 Genomes Project dataset.

Results: We developed XhetRel, a user-friendly workflow and notebook application that computes Xhet and performs relatedness estimation directly from VCF files. As a fully genotype-based approach, XhetRel enables both sex-based clustering and relatedness assessment as an initial quality control (QC) step in NGS. XhetRel serves groups without bioinformatics infrastructure, users requiring a browser-based QC tool, and workflow developers seeking a modular Nextflow component. Our investigation into the sources of Xhet variation highlighted important limitations in sequencing and variant-calling approaches. In particular, we identified specific pseudogenes and gene clusters, such as SLC25A5 and the GAGE cluster, as recurrent contributors to misleading variant allele fractions.
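To make the idea of a genotype-based Xhet estimate concrete, the following is a minimal Python sketch and not XhetRel's actual implementation. It assumes that Xhet is defined as the fraction of heterozygous calls among non-missing diploid genotypes on the X chromosome, that the input is plain-text multi-sample VCF lines, and that the X chromosome is named `X` or `chrX`. The function name `x_heterozygosity` and the example records are hypothetical. The sketch does not exclude pseudoautosomal regions or apply any quality filters.

```python
def x_heterozygosity(vcf_lines, x_names=("X", "chrX")):
    """Return {sample: heterozygous fraction of called chrX genotypes}.

    Assumption: Xhet = heterozygous calls / non-missing diploid calls.
    Samples with no usable calls map to None.
    """
    samples, het, called = [], {}, {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns follow FORMAT
            het = {s: 0 for s in samples}
            called = {s: 0 for s in samples}
            continue
        if fields[0] not in x_names:
            continue  # keep only X chromosome records
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        for sample, entry in zip(samples, fields[9:]):
            parts = entry.split(":")
            if gt_idx >= len(parts):
                continue
            # Treat phased (|) and unphased (/) genotypes alike
            alleles = parts[gt_idx].replace("|", "/").split("/")
            if len(alleles) != 2 or "." in alleles:
                continue  # skip missing or haploid calls
            called[sample] += 1
            if alleles[0] != alleles[1]:
                het[sample] += 1
    return {s: (het[s] / called[s] if called[s] else None) for s in samples}


# Hypothetical toy input: two samples, three X records, one autosomal record
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "X\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t0/0",
    "X\t200\t.\tC\tT\t.\tPASS\t.\tGT\t1|0\t1/1",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t0/1",
    "X\t400\t.\tT\tC\t.\tPASS\t.\tGT\t./.\t1",
]
result = x_heterozygosity(example)
print(result)
```

In this toy input, S1 is heterozygous at both of its called X sites while S2 is homozygous at both, which is the kind of separation a sex-based clustering step would look for. Real data would show far less extreme values.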