** KILL [#B] Check HGVS nomenclature :hgvs:
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-15
644 | Acc | 0.0000003317384 | No | Acc | 89894637 | 7 | 89894644 | 0.0000002205815 | No | 89894637 | 0.02545572 | No | 0.02545572 | No |
**** DONE Check multiple transcripts in hg38 with genomic coordinates: ok
CLOSED: [2023-08-10 Thu 23:00]
Many more transcripts in T2T
E.g. 1 RefSeq curated transcript
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108257446%2D108257496&hgsid=1672963428_J5aWAqack2FpJ7mvhFTNVw7bKzxo
vs 2 transcripts in T2T
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hub_3671779_hs1&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108264969%2D108265019&hgsid=1672963612_Eso9frdQ7z6RkKkcKsIf2Waq3pec
This matches what we get with spip
*** DONE [#A] VEP filter with spip
CLOSED: [2023-08-13 Sun 00:39] SCHEDULED: <2023-08-12 Sat 19:00>
*** DONE CADD + spliceAI annotation on GRCh38 with the new version :annotation:
CLOSED: [2023-08-28 Mon 17:21] SCHEDULED: <2023-08-20 Sun>
*** DONE OMIM: only possible by gene name :annotation:
CLOSED: [2023-08-13 Sun 11:57] SCHEDULED: <2023-08-13 Sun 16:00>
The database is not available, and updating it ourselves is complicated.
If we take the genes from GRCh38, they are not necessarily in T2T
E.g. DDX11L17 does not exist in T2T at these coordinates
zgrep DDX11L17 GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz
Note: it is a pseudogene
https://www.genecards.org/cgi-bin/carddisp.pl?gene=DDX11L17
If we take the T2T genes, there are new ones.
E.g. the first one is LOC101928626.
At this position, nothing in GRCh38
Trying with ENSEMBL: no, because the identifiers differ
E.g. ACHE
Ideally we would need the NCBI identifier (available in OMIM), but it is not in the VEP output
And that requires the "merged" version, so it is impossible in T2T
Is it feasible to match on the gene name?
All T2T genes:
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > t2t-genes.txt
wc -l t2t-genes.txt
#+end_src
#+RESULTS:
: 57660 t2t-genes.txt
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_000001405.40_GRCh38.p14_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > hg38-genes.txt
wc -l hg38-genes.txt
#+end_src
#+RESULTS:
: 67127 hg38-genes.txt
Genes common to both
#+begin_src sh :dir ~/Downloads
comm -12 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 54506
Genes only in T2T
#+begin_src sh :dir ~/Downloads
comm -23 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 3154
Genes only in GRCh38
#+begin_src sh :dir ~/Downloads
comm -13 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 12621
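The three counts are consistent with the list sizes: 54506 + 3154 = 57660 and 54506 + 12621 = 67127. Note that =comm= expects both inputs sorted with the same collation. A minimal sketch on toy lists, assuming a POSIX-like shell with =mktemp=; the helper name =compare_gene_lists= and the toy files are illustrative, not part of the pipeline:

#+begin_src sh
# Print "<common> <only in first> <only in second>" for two gene lists.
compare_gene_lists() {
  a=$(mktemp); b=$(mktemp)
  # Sort both inputs with the same (C) collation and drop duplicates.
  LC_ALL=C sort -u "$1" > "$a"
  LC_ALL=C sort -u "$2" > "$b"
  common=$(LC_ALL=C comm -12 "$a" "$b" | wc -l)
  only_a=$(LC_ALL=C comm -23 "$a" "$b" | wc -l)
  only_b=$(LC_ALL=C comm -13 "$a" "$b" | wc -l)
  rm -f "$a" "$b"
  # Arithmetic expansion strips the padding some wc implementations add.
  echo "$((common)) $((only_a)) $((only_b))"
}
#+end_src

Called as =compare_gene_lists t2t-genes.txt hg38-genes.txt=, it should reproduce the three figures above in one line.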
*** HOLD OMIM by gene name :annotation:
*** DONE Mobidetails API
CLOSED: [2023-09-10 Sun 16:44]
Too slow ... 1h to 1h30 of run time
Available in a module
*** DONE VEP filter with spip for T2T and spliceAI for GRCh38
CLOSED: [2023-09-16 Sat 22:47]
*** DONE Rerun tests on GRCh38 with the new filter (spip or spliceAI) :sanger:
CLOSED: [2023-09-17 Sun 09:07] SCHEDULED: <2023-09-16 Sat>
*** HOLD Franklin API
https://www.postman.com/genoox-ps/workspace/franklin-api-documentation-s-public-workspace/documentation/6621518-4335389d-12e3-445f-8182-339df95b2a09
*** KILL Check whether clinical data is available with VEP :annotation:
CLOSED: [2023-09-10 Sun 16:44]
*** TODO Test the filter without splice: 6130 but 4 are missing
SCHEDULED: <2023-09-27 Wed>
Email from Paul: exome, so outside of splice, not very interesting
**** DONE Remove the splice condition entirely: 6130 variants remaining...
CLOSED: [2023-09-27 Wed 19:37] SCHEDULED: <2023-09-26 Tue>
Cf. [[id:c9b2009a-503b-4561-94c6-29ae21a3188d][VEP filter with spliceAI: 37365 -> 6130]]
In tests/spliceai
#+begin_src sh
filter_vep -i output-all-gpu.vcf --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o test.vcf
grep -c -v '^#' test.vcf
6130
#+end_src
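The same exclusion expression is passed to several =filter_vep= calls in these notes. A minimal sketch, assuming a POSIX shell, of building it from lists of terms so that it is written only once; the helper name =build_filter= is hypothetical, and the terms are the ones used in the command above:

#+begin_src sh
# Build the filter_vep expression from lists of excluded terms.
build_filter() {
  consequences="non_coding_transcript stream intergenic_variant UTR intron_variant synonymous"
  biotypes="pseudogene misc_RNA"
  expr=""
  for c in $consequences; do
    # Prepend " or " only when expr already holds a clause.
    expr="${expr:+$expr or }Consequence matches $c"
  done
  for b in $biotypes; do
    expr="${expr:+$expr or }BIOTYPE matches $b"
  done
  printf 'not(%s)\n' "$expr"
}
#+end_src

It could then be used as =filter_vep -i output-all-gpu.vcf --format vcf --filter "$(build_filter)" --only_matched -o test.vcf=. The generated string is the expression above without its leading space.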
**** DONE Replace with functional impact: little effect, majority = MODERATE
CLOSED: [2023-09-27 Wed 19:45] SCHEDULED: <2023-09-26 Tue>
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is HIGH" --only_matched | grep -c -v '^#'
258
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is LOW" --only_matched | grep -c -v '^#'
11
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is MODERATE" --only_matched | grep -c -v '^#'
5824
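To put a number on "majority = MODERATE", the three counts above can be turned into shares. A minimal sketch, assuming =awk= is available; the helper name =impact_shares= is illustrative, and it reads "<count> <label>" lines such as those printed by =sort | uniq -c=:

#+begin_src sh
# Convert "<count> <label>" lines into "<label> <percent>" lines.
impact_shares() {
  awk '{ n[$2] = $1; total += $1 }
       END { for (k in n) printf "%s %.1f%%\n", k, 100 * n[k] / total }' |
    LC_ALL=C sort   # awk iteration order is unspecified, so sort the output
}

printf '258 HIGH\n11 LOW\n5824 MODERATE\n' | impact_shares
# HIGH 4.2%
# LOW 0.2%
# MODERATE 95.6%
#+end_src

The shares are relative to the 6093 variants matched by the three IMPACT filters above.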
**** DONE Look at the consequences for all transcripts
CLOSED: [2023-09-27 Wed 21:04]
/Work/Users/apraga/bisonex/out/annotate/vep/NA12878-sanger-all-T2T
filter_vep -i NA12878-sanger-all-T2T.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o filtered.vcf
bcftools +split-vep filtered.vcf -f '%Consequence\n' -d | sort | uniq -c
94 coding_sequence_variant
13 coding_sequence_variant&NMD_transcript_variant
257 frameshift_variant
21 frameshift_variant&NMD_transcript_variant
2 frameshift_variant&splice_donor_region_variant
20 frameshift_variant&splice_region_variant
1 frameshift_variant&splice_region_variant&NMD_transcript_variant
1 incomplete_terminal_codon_variant&coding_sequence_variant
211 inframe_deletion
18 inframe_deletion&NMD_transcript_variant
6 inframe_deletion&splice_region_variant
242 inframe_insertion
22 inframe_insertion&NMD_transcript_variant
4 inframe_insertion&splice_region_variant
14689 missense_variant
1416 missense_variant&NMD_transcript_variant
6 missense_variant&splice_donor_5th_base_variant
374 missense_variant&splice_region_variant
34 missense_variant&splice_region_variant&NMD_transcript_variant
53 splice_acceptor_variant
11 splice_acceptor_variant&NMD_transcript_variant
79 splice_donor_variant
6 splice_donor_variant&NMD_transcript_variant
30 start_lost
5 start_lost&NMD_transcript_variant
135 stop_gained
13 stop_gained&frameshift_variant
3 stop_gained&frameshift_variant&NMD_transcript_variant
2 stop_gained&frameshift_variant&splice_region_variant
14 stop_gained&NMD_transcript_variant
5 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&NMD_transcript_variant
4 stop_lost
1 stop_lost&NMD_transcript_variant
9 stop_retained_variant
6 stop_retained_variant&NMD_transcript_variant
1 transcript_ablation
Same in tests/spliceai
bcftools +split-vep output-all-gpu-filtered.vcf -f '%Consequence\n' -d | sort | uniq -c
94 coding_sequence_variant
13 coding_sequence_variant&NMD_transcript_variant
257 frameshift_variant
21 frameshift_variant&NMD_transcript_variant
2 frameshift_variant&splice_donor_region_variant
20 frameshift_variant&splice_region_variant
1 frameshift_variant&splice_region_variant&NMD_transcript_variant
1 incomplete_terminal_codon_variant&coding_sequence_variant
211 inframe_deletion
18 inframe_deletion&NMD_transcript_variant
6 inframe_deletion&splice_region_variant
242 inframe_insertion
22 inframe_insertion&NMD_transcript_variant
4 inframe_insertion&splice_region_variant
14689 missense_variant
1416 missense_variant&NMD_transcript_variant
6 missense_variant&splice_donor_5th_base_variant
374 missense_variant&splice_region_variant
34 missense_variant&splice_region_variant&NMD_transcript_variant
53 splice_acceptor_variant
11 splice_acceptor_variant&NMD_transcript_variant
79 splice_donor_variant
6 splice_donor_variant&NMD_transcript_variant
30 start_lost
5 start_lost&NMD_transcript_variant
135 stop_gained
13 stop_gained&frameshift_variant
3 stop_gained&frameshift_variant&NMD_transcript_variant
2 stop_gained&frameshift_variant&splice_region_variant
14 stop_gained&NMD_transcript_variant
5 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&NMD_transcript_variant
4 stop_lost
1 stop_lost&NMD_transcript_variant
9 stop_retained_variant
6 stop_retained_variant&NMD_transcript_variant
1 transcript_ablation
**** DONE Look at the consequences with -s worst
CLOSED: [2023-09-27 Wed 21:04]
/Work/Users/apraga/bisonex/out/annotate/vep/NA12878-sanger-all-T2T
After filter_vep without splice
$ bcftools +split-vep filtered.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
48 coding_sequence_variant
6 coding_sequence_variant&nmd_transcript_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
79 inframe_deletion
3 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
85 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
5309 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
110 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
19 splice_acceptor_variant
1 splice_acceptor_variant&nmd_transcript_variant
21 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
14 start_lost
44 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
3 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
1 transcript_ablation
In tests/spliceai
$ bcftools +split-vep output-all-gpu-filtered.vcf -f '%Consequence\n' -s worst -d | sort | uniq -c
48 coding_sequence_variant
6 coding_sequence_variant&nmd_transcript_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
79 inframe_deletion
3 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
85 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
5309 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
110 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
19 splice_acceptor_variant
1 splice_acceptor_variant&nmd_transcript_variant
21 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
14 start_lost
44 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
3 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
1 transcript_ablation
**** KILL Check whether the Sanger tests pass: no
CLOSED: [2023-09-28 Thu 01:33] SCHEDULED: <2023-09-27 Wed>
│ String Float64 Int64
─────┼───────────────────────────────────────
1 │ chr10:g.130884530 60.0 67
2 │ chr10:g.240362 60.0 79
3 │ chr14:g.52665581 60.0 51
4 │ chr19:g.41325390 60.0 180
**** DONE Compare with the GRCh38 filters: these are indeed the non-splice filters...
CLOSED: [2023-10-17 Tue 21:12]
T2T:
filter_vep -i 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched | bcftools +counts
Number of samples: 1
Number of SNPs: 5362
Number of INDELs: 325
Number of MNPs: 323
Number of others: 0
Number of sites: 5991
GRCh38
filter_vep -i 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched | bcftools +counts
Number of samples: 1
Number of SNPs: 1182
Number of INDELs: 143
Number of MNPs: 535
Number of others: 0
Number of sites: 1840
**** DONE Consequence proportions: T2T vs GRCh38 with MultiQC: same
CLOSED: [2023-10-17 Tue 21:00]
By eye
**** Re-examine the consequences
***** DONE Functional impact: more LOW and many more MODIFIER
CLOSED: [2023-10-17 Tue 21:22]
T2T
bcftools +split-vep 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.filtervep.vcf -f '%IMPACT\n' -d | sort | uniq -c
596 HIGH
2828 LOW
16314 MODERATE
11261 MODIFIER
GRCh38
bcftools +split-vep 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.filtervep.vcf -f '%IMPACT\n' -d | sort | uniq -c
414 HIGH
466 LOW
10054 MODERATE
550 MODIFIER
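The per-category ratio between the two assemblies makes the LOW and MODIFIER excess explicit. A minimal sketch, assuming =awk= and two files of "<count> <label>" lines saved from the commands above; the helper name =impact_ratio= and the file arguments are illustrative:

#+begin_src sh
# Print "<label> <T2T count / GRCh38 count>" for each label present in both files.
# $1: T2T counts, $2: GRCh38 counts.
impact_ratio() {
  awk 'NR == FNR { t2t[$2] = $1; next }   # first file: remember T2T counts
       ($2 in t2t) && $1 > 0 { printf "%s %.2f\n", $2, t2t[$2] / $1 }' "$1" "$2"
}
#+end_src

With the counts above this gives HIGH 1.44, LOW 6.07, MODERATE 1.62 and MODIFIER 20.47, so MODIFIER is about 20 times more frequent in T2T while HIGH and MODERATE grow by roughly 1.5.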
***** DONE Worst consequence: too many missense
CLOSED: [2023-10-17 Tue 21:23]
GRCh38
$ bcftools +split-vep 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.filtervep.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
2 3_prime_utr_variant&nmd_transcript_variant
1 5_prime_utr_variant
2 coding_sequence_variant
47 frameshift_variant
6 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
1 frameshift_variant&splice_region_variant
1 frameshift_variant&start_lost&start_retained_variant
37 inframe_deletion
9 inframe_deletion&nmd_transcript_variant
27 inframe_insertion
5 inframe_insertion&nmd_transcript_variant
21 intron_variant
1593 missense_variant
37 missense_variant&nmd_transcript_variant
17 missense_variant&splice_region_variant
1 missense_variant&splice_region_variant&nmd_transcript_variant
1 protein_altering_variant
1 splice_acceptor_variant
1 splice_acceptor_variant&frameshift_variant
2 splice_acceptor_variant&nmd_transcript_variant
3 splice_donor_5th_base_variant&intron_variant
1 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
2 splice_donor_region_variant&intron_variant
1 splice_donor_region_variant&intron_variant&nmd_transcript_variant
1 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
10 splice_donor_variant
1 splice_donor_variant&non_coding_transcript_variant
11 splice_polypyrimidine_tract_variant&intron_variant
1 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
1 splice_region_variant&intron_variant
9 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
3 splice_region_variant&synonymous_variant
1 splice_region_variant&synonymous_variant&nmd_transcript_variant
4 start_lost
19 stop_gained
2 stop_gained&frameshift_variant
2 stop_gained&nmd_transcript_variant
1 stop_gained&splice_region_variant
1 stop_gained&splice_region_variant&nmd_transcript_variant
3 stop_lost
2 stop_lost&nmd_transcript_variant
1 stop_retained_variant
18 synonymous_variant
1 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
T2T
[apraga@mesointeractive filter]$ bcftools +split-vep 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.filtervep.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
15 3_prime_utr_variant
11 3_prime_utr_variant&nmd_transcript_variant
51 5_prime_utr_variant
3 5_prime_utr_variant&nmd_transcript_variant
48 coding_sequence_variant
5 coding_sequence_variant&nmd_transcript_variant
3 downstream_gene_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
78 inframe_deletion
2 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
84 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
16 intergenic_variant
368 intron_variant
21 intron_variant&nmd_transcript_variant
71 intron_variant&non_coding_transcript_variant
5187 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
105 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
33 non_coding_transcript_exon_variant
12 splice_acceptor_variant
1 splice_acceptor_variant&5_prime_utr_variant&intron_variant&nmd_transcript_variant
1 splice_acceptor_variant&nmd_transcript_variant
3 splice_acceptor_variant&non_coding_transcript_variant
1 splice_acceptor_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
16 splice_donor_5th_base_variant&intron_variant
2 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
33 splice_donor_region_variant&intron_variant
4 splice_donor_region_variant&intron_variant&nmd_transcript_variant
7 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
19 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
2 splice_donor_variant&non_coding_transcript_variant
3 splice_donor_variant&splice_donor_5th_base_variant&coding_sequence_variant&intron_variant
64 splice_polypyrimidine_tract_variant&intron_variant
6 splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
8 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
2 splice_region_variant&3_prime_utr_variant
2 splice_region_variant&5_prime_utr_variant
4 splice_region_variant&intron_variant
6 splice_region_variant&non_coding_transcript_exon_variant
54 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
4 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
5 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
27 splice_region_variant&synonymous_variant
13 start_lost
31 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
2 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
349 synonymous_variant
17 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
2 upstream_gene_variant
*** TODO Look at the VEP annotation of variants on unprocessed NA12878 :na12878:
SCHEDULED: <2023-10-16 Mon>
/Entered on/ [2023-10-16 Mon 19:39]
** DONE [#B] Quality indicators :qualité:
CLOSED: [2023-09-10 Sun 16:46]
*** Idea
Raredisease:
- FastQC: many statistics. Not available in Nix
- Mosdepth: computes depth (2x faster than samtools depth). Nix
- MultiQC: only merges the analysis results. Not available in Nix
- Picard's CollectMultipleMetrics, CollectHsMetrics, and CollectWgsMetrics
- Qualimap: alternative to FastQC? Not available in Nix
- Sentieon's WgsMetricsAlgo: proprietary
- TIDDIT's cov: TIDDIT = chromosomal rearrangement
Sarek:
- alignment statistics: samtools stats, mosdepth
- QC: MultiQC
MultiQC: not available in Nix
*** DONE FastQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Mosdepth
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
For exome, the capture file is needed
subworkflows/local/bam_markduplicates/
*** DONE Samtools stats
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE [#B] Run report with MultiQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Results on NA12878: 98% at 20x
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-17 Thu>
**** DONE Understand why only 91% at 20x: inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test another kit: Twist exome comprehensive
CLOSED: [2023-08-18 Fri 22:24]
Worse
***** DONE Test genome without alt
CLOSED: [2023-08-18 Fri 22:25]
Same
***** DONE Test NA12878 without inserted SNVs: this is the cause!!
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test hg19 on NA12878 without inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
**** DONE Understand why SNVs lower the score: missing reads
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-18 Fri>
See [[id:5c1c36f3-f68e-4e6d-a7b6-61dca89abc37][Bug: loss of many reads with NA12878]]
*** DONE Rerun results with NA1287 and NA12878 + sanger
CLOSED: [2023-08-29 Tue 10:30] SCHEDULED: <2023-08-29 Tue>
*** DONE Compare with hg19
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with other capture kits
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with no-alt
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
** HOLD check normalization
** KILL [#B] Check HGVS nomenclature :hgvs:
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-15
840
- [X] 2100609288_62905768
- [X] 2100609501_62905776
- [X] 2100614493_62951074
- [X] 2100622566_62908067
- [X] 2100622601_62908060
- [X] 2100622705_62908063
- [X] 2100640027_62911936
- [X] 2100645285_62913212
- [X] 2100661411_62914081
- [X] 2100661462_62914086
- [X] 2100708257_62921596
- [X] 2100738732_62926501
- [X] 2100738850_62926509
- [X] 2100746751_62926505
- [X] 2100746797_62926506
- [X] 2100782349_62931722
- [X] 2100782416_62931561
- [X] 2100782559_62931718
- [X] 2100799204_62934768
- [X] 2200010202_62940284
- [X] 2200023600_62940631
- [X] 2200024348_62999591
- [X] 2200027505_62942457
- [X] 2200038776_62943412
- [X] 2200041919_62943405
- [X] 2200088014_62951326
- [X] 2200146652_62959388
- [X] 2200151850_62960953
- [X] 2200160014_62959475
- [X] 2200160070_62959478
- [X] 2200201368_62967471
- [X] 2200201400_62967470
- [X] 2200265558_62976332
- [X] 2200265605_62976401
- [X] 2200267046_62975192
- [X] 2200273878_62999530
- [X] 2200279708_62977002
- [X] 2200284408_62979102
- [X] 2200293987_62979116
- [X] 2200294359_62979118
- [X] 2200306299_62982217
- [X] 2200306539_62982193
- [X] 220030671_62982211
- [X] 2200307058_62982231
- [X] 2200307108_62982196
- [X] 2200307136_62982221
- [X] 2200307199_62982239
- [X] 2200307230_62982234
- [X] 2200307262_62982219
- [X] 2200307297_62982227
- [X] 2200324510_62985453
- [X] 2200324549_62985478
- [X] 2200324573_62985445
- [X] 2200324594_62985467
- [X] 2200324606_62985463
- [X] 2200324614_62985459
- [X] 2200338306_62985430
- [X] 2200343880_62989407
- [X] 2200343910_62989460
- [X] 2200343938_62989451
- [X] 2200343966_62989456
- [X] 2200343993_62989440
- [X] 2200344013_62989464
- [X] 2200349749_62989465
- [X] 2200363462_62988848
- [X] 2200377880_62991993
- [X] 2200378032_62991991
- [X] 2200383996_62993828
- [X] 2200384015_62993796
- [X] 2200384046_62993822
- [X] 2200384117_62993808
- [X] 2200384187_62993825
- [X] 2200384231_62992898
- [X] 2200385658_63060260
- [X] 2200394260_62994732
- [X] 2200395817_62994742
- [X] 2200396731_62994737
- [X] 2200424073_62999579
- [X] 2200424207_62999632
- [X] 2200426178_62999630
- [X] 2200426243_62999635
- [X] 2200426466_62999605
- [X] 2200426642_62999627
- [X] 2200427406_62999649
- [X] 2200427512_62999639
- [X] 2200428953_62999572
- [X] 2200428981_62999600
- [X] 2200428999_62999592
- [X] 2200441970_63000868
- [X] 2200441989_63000882
- [X] 2200442135_63000864
- [X] 2200442216_63000886
- [X] 2200442257_63000951
- [X] 2200451801_63003573
- [X] 2200451862_63004218
- [X] 2200451894_63004210
- [X] 2200456165_63051294
- [X] 2200459865_63004933
- [X] 2200459968_63004937
- [X] 2200460073_63004943
- [X] 2200460121_63004684
- [X] 2200467051_63003856
- [X] 2200467225_63004940
- [X] 2200467261_63004930
- [X] 2200467338_63004925
- [X] 2200470099_63004485
- [X] 2200470142_63004480
- [X] 2200471780_63004362
- [X] 2200480910_63006466
- [X] 2200495073_63010427
- [X] 2200495510_63009152
- [X] 2200508677_63060252
- [X] 2200510531_63012582
- [X] 2200510628_63012549
- [X] 2200510657_63012554
- [X] 2200511249_63012533
- [X] 2200511274_63012586
- [X] 2200517952_63060399
- [X] 2200519525_63060439
- [X] 2200524009_63014044
- [X] 2200524609_63014046
- [X] 2200524616_63014048
- [X] 2200533429_63060425
- [X] 2200539735_63060406
- [X] 2200549908_63019339
- [X] 2200549965_63019349
- [X] 2200550414_63019357
- [X] 2200550471_63020031
- [X] 2200550490_63019351
- [X] 2200550505_63019340
- [X] 2200555565_63018614
- [X] 2200559438_63020029
- [X] 2200559682_63020030
- [X] 2200559713_63019623
- [X] 2200559739_63019626
- [X] 2200569969_63019991
- [X] 2200570001_63021580
- [X] 2200570025_63021490
- [X] 2200570035_63021491
- [X] 2200570042_63021493
- [X] 2200570050_63021494
- [X] 2200579897_63024910
- [X] 2200583995_63024866
- [X] 2200584035_63024905
- [X] 2200584069_63024888
- [X] 2200584126_63024810
- [X] 2200589507_63026712
- [X] 2200597365_63027994
- [X] 2200597480_63027988
- [X] 2200597752_63026853
- [X] 2200597778_63027992
- [X] 22005977_63026903
- [X] 2200609031_63026527
- [X] 2200614198_63113928
- [X] 2200620372_63030821
- [X] 2200620442_63030810
- [X] 2200620498_63030816
- [X] 2200620628_63031031
- [X] 2200622310_63030984
- [X] 2200622355_63030956
- [X] 2200625369_63028699
- [X] 2200625410_63028697
- [X] 2200625536_63028694
- [X] 2200630189_63030665
- [X] 2200635149_63033182
- [X] 2200644544_63037731
- [X] 2200644594_63037725
- [X] 2200650089_63038093
- [X] 2200666292_63076568
- [X] 2200669188_63036688
- [X] 2200669320_63040259
- [X] 2200669383_63040254
- [X] 2200669414_63040257
- [X] 2200669446_63040251
- [X] 2200680342_63105271
- [X] 2200694535_63042853
- [X] 2200694789_63042862
- [X] 2200694858_63042702
- [X] 2200694917_63042696
- [X] 2200699290_63043047
- [X] 2200699345_63040238
- [X] 2200699383_63043050
- [X] 2200699412_63040731
- [X] 220071551_63048935
- [X] 2200731515_63048963
- [X] 2200748145_63051198
- [X] 2200748171_63051213
- [X] 2200751046_63051249
- [X] 2200751101_63051234
- [X] 2200766471_63054590
- [X] 2200767731_63054595
- [X] 2200767822_63054464
- [X] 2200775505_63060410
- [X] 2200850441_63019345
- [X] 220597589_63026879
- [X] 2300003253_63060430
- [X] 2300005679_63060370
- [X] 2300009914_63060390
- [X] 2300028784_63060001
- [X] 2300036815_63063357
- [X] 2300055382_63061874
- [X] 2300055421_63061871
- [X] 2300055440_63061880
- [X] 230006894_63064950
- [X] 2300071111_63070356
- [X] 2300083434_63071675
- [X] 2300103609_63076239
- [X] 2300104572_63076232
- [X] 2300109602_63076765
- [X] 2300109665_63076770
- [X] 2300119721_63078732
- [X] 2300137773_63078133
- [X] 2300137834_63078123
- [X] 2300167821_63086183
- [X] 2300172698_63113453
- [X] 2300188216_63090609
- [X] 2300188281_63090632
- [ ] 2300188800_63090616
- [ ] 2300193193645_63090623
- [ ] 2300193668_63090611
- [ ] 2300195426_63090608
- [ ] 2300201017_63089636
- [ ] 2300227479_63098330
- [ ] 2300232688_63130821
- [ ] 2300292749_63109239
- [ ] 230029277_63109247
- [ ] 2300294712_63109236
- [ ] 2300308032_63111581
- [ ] 2300323537_63114209
- [ ] 2300334609_63115535
- [ ] 2300346867_63118093
- [ ] 2300346867_63118093_NA12878
- [ ] 2300348940_63118099
- [ ] 2300359806_63119915
- [ ] 2300380476_63123963
- [ ] 2300382582_63123749
- [ ] 2300384269_63126867
- [ ] 2300407581_63130826
- [ ] 2300407626_63130842
- [ ] 2300409593_63130874
- [ ] 2300409612_63130980
- [ ] 2300417623_63131524
** TODO Missed variants :missed:
SCHEDULED: <2023-10-21 Sat>
*** DONE 63012582: chr10:g.102230760 filtered by AD :63012582:
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
It is present in the HaplotypeCaller output!
Mind the position: it is recorded as POS=102230753, CG->C
GT:AD:DP:GQ:PL 0/1:26,8:34:99:146,0,671
Filtered out by the AD <= 10 condition (supported by only 8 reads)
Not confirmed by Sanger, reported as VOUS (variant of unknown significance)
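To make the numbers concrete, here is a minimal sketch that pulls the alt-read count and the allele fraction out of a sample string, assuming the GT:AD:DP:GQ:PL layout shown above and a single ALT allele. The helper name =alt_support= is made up for illustration; it is not the pipeline's actual filter.
#+begin_src sh
# Print "alt_reads allele_fraction" for a sample string laid out as GT:AD:DP:GQ:PL.
# Assumption: a single ALT allele, so AD is "ref,alt".
alt_support() {
    printf '%s\n' "$1" | awk -F: '{
        split($2, ad, ",")                      # AD field: ref,alt read counts
        printf "%d %.3f\n", ad[2], ad[2] / $3   # alt reads, alt reads / DP
    }'
}

alt_support "0/1:26,8:34:99:146,0,671"
# prints: 8 0.235
#+end_src
With 8 supporting reads the call falls under an alt-read cutoff of 10, even though the allele fraction (about 0.24) is not extreme.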
**** KILL BAM image cento
CLOSED: [2023-10-08 Sun 23:13]
**** DONE BAM image bisonex
CLOSED: [2023-10-08 Sun 23:23] SCHEDULED: <2023-10-08 Sun>
**** DONE Email Paul
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
*** DONE 63060439: chr15:g.26869324 = depth problem, DP=15 :63060439:
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
GABRA5
Reported as VOUS, alongside a pathogenic MDB5 variant in the same patient (even VOUS-)
Not confirmed by Sanger
GT:AD:DP:GQ:PL 0/1:9,6:15:99:103,0,213
**** DONE BAM image bisonex
CLOSED: [2023-10-08 Sun 22:56]
**** DONE Email Paul
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
*** TODO Add negatives to the variant list
SCHEDULED: <2023-10-19 Thu>
* Results
** TODO Speed-up BWA-mem
SCHEDULED: <2023-10-22 Sun>
** TODO Speed-up HaplotypeCaller
SCHEDULED: <2023-10-22 Sun>
* Communication
** DONE Email NGS-diag
CLOSED: [2023-10-06 Fri 08:04] SCHEDULED: <2023-10-06 Fri>
/Entered on/ [2023-10-04 Wed 19:33]
644 | Acc | 0.0000003317384 | No | Acc | 89894637 | 7 | 89894644 | 0.0000002205815 | No | 89894637 | 0.02545572 | No | 0.02545572 | No |
**** DONE Check multiple transcripts in hg38 using genomic coordinates: ok
CLOSED: [2023-08-10 Thu 23:00]
Many more transcripts in T2T
E.g. 1 curated RefSeq transcript
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108257446%2D108257496&hgsid=1672963428_J5aWAqack2FpJ7mvhFTNVw7bKzxo
vs 2 transcripts in T2T
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hub_3671779_hs1&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108264969%2D108265019&hgsid=1672963612_Eso9frdQ7z6RkKkcKsIf2Waq3pec
This matches what spip reports
*** DONE [#A] VEP filter with spip
CLOSED: [2023-08-13 Sun 00:39] SCHEDULED: <2023-08-12 Sat 19:00>
*** DONE CADD + spliceAI annotation on GRCh38 with the new version :annotation:
CLOSED: [2023-08-28 Mon 17:21] SCHEDULED: <2023-08-20 Sun>
*** DONE OMIM: only possible by gene name :annotation:
CLOSED: [2023-08-13 Sun 11:57] SCHEDULED: <2023-08-13 Sun 16:00>
The database is not available, and updating it ourselves would be complicated.
If we start from the GRCh38 genes, they are not necessarily present in T2T.
E.g. DDX11L17 does not exist in T2T at these coordinates:
zgrep DDX11L17 GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz
Note: it is a pseudogene
https://www.genecards.org/cgi-bin/carddisp.pl?gene=DDX11L17
If we start from the T2T genes, some of them are new.
E.g. the first one is LOC101928626.
Nothing at that position in GRCh38.
Matching through Ensembl does not work either, because the identifiers differ.
E.g. ACHE
Ideally we would use the NCBI identifier (available in OMIM), but it is not part of the VEP output.
It also requires the "merged" version, so it is not possible in T2T.
Is matching on the gene name feasible?
All T2T genes:
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > t2t-genes.txt
wc -l t2t-genes.txt
#+end_src
#+RESULTS:
: 57660 t2t-genes.txt
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_000001405.40_GRCh38.p14_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > hg38-genes.txt
wc -l hg38-genes.txt
#+end_src
#+RESULTS:
: 67127 hg38-genes.txt
Genes common to both
#+begin_src sh :dir ~/Downloads
comm -12 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 54506
Genes only in T2T
#+begin_src sh :dir ~/Downloads
comm -23 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 3154
Genes only in GRCh38
#+begin_src sh :dir ~/Downloads
comm -13 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 12621
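As a sanity check on the counts above, the common set plus each file's unique set should add up to that file's total: 54506 + 3154 = 57660 for T2T and 54506 + 12621 = 67127 for GRCh38. The sketch below reproduces the three =comm= calls on two tiny made-up lists. The gene names and file names are placeholders, and =LC_ALL=C= is set on the assumption that both inputs must be sorted with the same collation for =comm= to behave.
#+begin_src sh
export LC_ALL=C   # comm expects both inputs sorted with the same collation
tmp=$(mktemp -d)
printf '%s\n' GENE_A GENE_B GENE_C | sort -u > "$tmp/a.txt"          # toy list 1
printf '%s\n' GENE_B GENE_C GENE_D GENE_E | sort -u > "$tmp/b.txt"   # toy list 2

common=$(comm -12 "$tmp/a.txt" "$tmp/b.txt" | wc -l)   # present in both lists
only_a=$(comm -23 "$tmp/a.txt" "$tmp/b.txt" | wc -l)   # only in list 1
only_b=$(comm -13 "$tmp/a.txt" "$tmp/b.txt" | wc -l)   # only in list 2

# Each list's size must equal the common part plus its own unique entries.
[ $((common + only_a)) -eq "$(wc -l < "$tmp/a.txt")" ] && echo "list 1 consistent"
[ $((common + only_b)) -eq "$(wc -l < "$tmp/b.txt")" ] && echo "list 2 consistent"
rm -r "$tmp"
#+end_src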
*** HOLD OMIM by gene name :annotation:
*** DONE Mobidetails API
CLOSED: [2023-09-10 Sun 16:44]
Too slow: 1h to 1h30 of run time
Available in a module
*** DONE VEP filter with spip for T2T and spliceAI for GRCh38
CLOSED: [2023-09-16 Sat 22:47]
*** DONE Re-run the tests on GRCh38 with the new filter (spip or spliceAI) :sanger:
CLOSED: [2023-09-17 Sun 09:07] SCHEDULED: <2023-09-16 Sat>
*** HOLD Franklin API
https://www.postman.com/genoox-ps/workspace/franklin-api-documentation-s-public-workspace/documentation/6621518-4335389d-12e3-445f-8182-339df95b2a09
*** KILL Check whether clinical data are available through VEP :annotation:
CLOSED: [2023-09-10 Sun 16:44]
*** TODO Test the filter without splice: 6130 but 4 are missing
SCHEDULED: <2023-09-27 Wed>
Email Paul: exome, so outside splice, of little interest
**** DONE Remove the splice condition entirely: 6130 variants remaining...
CLOSED: [2023-09-27 Wed 19:37] SCHEDULED: <2023-09-26 Tue>
See [[id:c9b2009a-503b-4561-94c6-29ae21a3188d][VEP filter with spliceAI: 37365 -> 6130]]
In tests/spliceai
#+begin_src sh
filter_vep -i output-all-gpu.vcf --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o test.vcf
grep -c -v '^#' test.vcf
# Output: 6130
#+end_src
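The exclusion filter above is pasted verbatim into several commands in these notes. One way to avoid copy-paste drift is to assemble it from two lists of terms. This is only a sketch: =build_exclusion_filter= is a hypothetical helper that builds the string and does not call =filter_vep= itself.
#+begin_src sh
# Build a filter_vep expression of the form:
#   not(Consequence matches X or ... or BIOTYPE matches Y ...)
# $1: space-separated Consequence patterns, $2: space-separated BIOTYPE patterns.
build_exclusion_filter() {
    expr_body=""
    for term in $1; do
        expr_body="${expr_body:+$expr_body or }Consequence matches $term"
    done
    for term in $2; do
        expr_body="${expr_body:+$expr_body or }BIOTYPE matches $term"
    done
    printf 'not(%s)\n' "$expr_body"
}

filter=$(build_exclusion_filter \
    "non_coding_transcript stream intergenic_variant UTR intron_variant synonymous" \
    "pseudogene misc_RNA")
echo "$filter"
# The string can then be passed as: filter_vep ... --filter "$filter" --only_matched
#+end_src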
**** DONE Replace with functional impact: little effect, the majority is MODERATE
CLOSED: [2023-09-27 Wed 19:45] SCHEDULED: <2023-09-26 Tue>
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is HIGH" --only_matched | grep -c -v '^#'
258
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is LOW" --only_matched | grep -c -v '^#'
11
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is MODERATE" --only_matched | grep -c -v '^#'
5824
**** DONE Look at the consequences for all transcripts
CLOSED: [2023-09-27 Wed 21:04]
/Work/Users/apraga/bisonex/out/annotate/vep/NA12878-sanger-all-T2T
filter_vep -i NA12878-sanger-all-T2T.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o filtered.vcf
bcftools +split-vep filtered.vcf -f '%Consequence\n' -d | sort | uniq -c
94 coding_sequence_variant
13 coding_sequence_variant&NMD_transcript_variant
257 frameshift_variant
21 frameshift_variant&NMD_transcript_variant
2 frameshift_variant&splice_donor_region_variant
20 frameshift_variant&splice_region_variant
1 frameshift_variant&splice_region_variant&NMD_transcript_variant
1 incomplete_terminal_codon_variant&coding_sequence_variant
211 inframe_deletion
18 inframe_deletion&NMD_transcript_variant
6 inframe_deletion&splice_region_variant
242 inframe_insertion
22 inframe_insertion&NMD_transcript_variant
4 inframe_insertion&splice_region_variant
14689 missense_variant
1416 missense_variant&NMD_transcript_variant
6 missense_variant&splice_donor_5th_base_variant
374 missense_variant&splice_region_variant
34 missense_variant&splice_region_variant&NMD_transcript_variant
53 splice_acceptor_variant
11 splice_acceptor_variant&NMD_transcript_variant
79 splice_donor_variant
6 splice_donor_variant&NMD_transcript_variant
30 start_lost
5 start_lost&NMD_transcript_variant
135 stop_gained
13 stop_gained&frameshift_variant
3 stop_gained&frameshift_variant&NMD_transcript_variant
2 stop_gained&frameshift_variant&splice_region_variant
14 stop_gained&NMD_transcript_variant
5 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&NMD_transcript_variant
4 stop_lost
1 stop_lost&NMD_transcript_variant
9 stop_retained_variant
6 stop_retained_variant&NMD_transcript_variant
1 transcript_ablation
Same in tests/spliceai
bcftools +split-vep output-all-gpu-filtered.vcf -f '%Consequence\n' -d | sort | uniq -c
94 coding_sequence_variant
13 coding_sequence_variant&NMD_transcript_variant
257 frameshift_variant
21 frameshift_variant&NMD_transcript_variant
2 frameshift_variant&splice_donor_region_variant
20 frameshift_variant&splice_region_variant
1 frameshift_variant&splice_region_variant&NMD_transcript_variant
1 incomplete_terminal_codon_variant&coding_sequence_variant
211 inframe_deletion
18 inframe_deletion&NMD_transcript_variant
6 inframe_deletion&splice_region_variant
242 inframe_insertion
22 inframe_insertion&NMD_transcript_variant
4 inframe_insertion&splice_region_variant
14689 missense_variant
1416 missense_variant&NMD_transcript_variant
6 missense_variant&splice_donor_5th_base_variant
374 missense_variant&splice_region_variant
34 missense_variant&splice_region_variant&NMD_transcript_variant
53 splice_acceptor_variant
11 splice_acceptor_variant&NMD_transcript_variant
79 splice_donor_variant
6 splice_donor_variant&NMD_transcript_variant
30 start_lost
5 start_lost&NMD_transcript_variant
135 stop_gained
13 stop_gained&frameshift_variant
3 stop_gained&frameshift_variant&NMD_transcript_variant
2 stop_gained&frameshift_variant&splice_region_variant
14 stop_gained&NMD_transcript_variant
5 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&NMD_transcript_variant
4 stop_lost
1 stop_lost&NMD_transcript_variant
9 stop_retained_variant
6 stop_retained_variant&NMD_transcript_variant
1 transcript_ablation
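Many rows above are compound terms joined by =&= (for instance =missense_variant&NMD_transcript_variant=). A rough way to get per-category totals is to keep only the first term of each row and sum the counts. The sketch below does that on four rows copied from the listing. Treating the first term as the primary consequence is an assumption made for illustration, and =collapse_first_term= is a hypothetical helper name.
#+begin_src sh
# Sum `uniq -c` counts by the first term of each "&"-joined consequence.
collapse_first_term() {
    awk '{
        split($2, terms, "&")        # $1 = count, $2 = consequence string
        total[terms[1]] += $1
    }
    END { for (t in total) print total[t], t }' | sort -k2,2
}

printf '%s\n' \
    "257 frameshift_variant" \
    "21 frameshift_variant&NMD_transcript_variant" \
    "14689 missense_variant" \
    "1416 missense_variant&NMD_transcript_variant" | collapse_first_term
# prints:
# 278 frameshift_variant
# 16105 missense_variant
#+end_src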
**** DONE Look at the consequences with -s worst
CLOSED: [2023-09-27 Wed 21:04]
/Work/Users/apraga/bisonex/out/annotate/vep/NA12878-sanger-all-T2T
After filter_vep without splice
$ bcftools +split-vep filtered.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
48 coding_sequence_variant
6 coding_sequence_variant&nmd_transcript_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
79 inframe_deletion
3 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
85 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
5309 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
110 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
19 splice_acceptor_variant
1 splice_acceptor_variant&nmd_transcript_variant
21 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
14 start_lost
44 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
3 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
1 transcript_ablation
In tests/spliceai
$ bcftools +split-vep output-all-gpu-filtered.vcf -f '%Consequence\n' -s worst -d | sort | uniq -c
48 coding_sequence_variant
6 coding_sequence_variant&nmd_transcript_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
79 inframe_deletion
3 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
85 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
5309 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
110 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
19 splice_acceptor_variant
1 splice_acceptor_variant&nmd_transcript_variant
21 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
14 start_lost
44 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
3 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
1 transcript_ablation
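The two listings above look identical. Rather than comparing them by eye, the two =uniq -c= outputs could be saved to files and compared mechanically. A minimal sketch, assuming each listing was redirected to a file (the file names below are placeholders) and that differences in leading whitespace or row order should be ignored; =same_counts= is a hypothetical helper.
#+begin_src sh
# Succeed (exit 0) if two `uniq -c` listings contain the same count/term pairs.
same_counts() {
    a=$(awk '{ print $1, $2 }' "$1" | sort)   # normalise spacing and order
    b=$(awk '{ print $1, $2 }' "$2" | sort)
    [ "$a" = "$b" ]
}

tmp=$(mktemp -d)
printf '     48 coding_sequence_variant\n    121 frameshift_variant\n' > "$tmp/t2t.txt"
printf '48 coding_sequence_variant\n121 frameshift_variant\n' > "$tmp/spliceai.txt"
if same_counts "$tmp/t2t.txt" "$tmp/spliceai.txt"; then
    echo "identical"
else
    echo "different"
fi
rm -r "$tmp"
#+end_src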
**** KILL Check whether the Sanger tests pass: no
CLOSED: [2023-09-28 Thu 01:33] SCHEDULED: <2023-09-27 Wed>
│ String Float64 Int64
─────┼───────────────────────────────────────
1 │ chr10:g.130884530 60.0 67
2 │ chr10:g.240362 60.0 79
3 │ chr14:g.52665581 60.0 51
4 │ chr19:g.41325390 60.0 180
**** DONE Compare with the GRCh38 filters: these are indeed the non-splice filters...
CLOSED: [2023-10-17 Tue 21:12]
T2T:
filter_vep -i 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched | bcftools +counts
Number of samples: 1
Number of SNPs: 5362
Number of INDELs: 325
Number of MNPs: 323
Number of others: 0
Number of sites: 5991
GRCh38
filter_vep -i 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched | bcftools +counts
Number of samples: 1
Number of SNPs: 1182
Number of INDELs: 143
Number of MNPs: 535
Number of others: 0
Number of sites: 1840
**** DONE Consequence proportions: T2T vs GRCh38 with MultiQC: same
CLOSED: [2023-10-17 Tue 21:00]
By eye
**** Re-examine the consequences
***** DONE Functional impact: more LOW and far more MODIFIER
CLOSED: [2023-10-17 Tue 21:22]
T2T
bcftools +split-vep 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.filtervep.vcf -f '%IMPACT\n' -d | sort | uniq -c
596 HIGH
2828 LOW
16314 MODERATE
11261 MODIFIER
GRCh38
bcftools +split-vep 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.filtervep.vcf -f '%IMPACT\n' -d | sort | uniq -c
414 HIGH
466 LOW
10054 MODERATE
550 MODIFIER
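To make the shift towards LOW and MODIFIER explicit, the raw counts can be turned into percentages. The sketch below does this for =uniq -c= style input, using the counts listed above; =impact_percent= is a hypothetical helper name. With these numbers, MODIFIER is about 36.3% of the T2T annotations versus about 4.8% for GRCh38.
#+begin_src sh
# Convert "count label" lines into "label percent" lines (one decimal place).
impact_percent() {
    awk '{ count[NR] = $1; label[NR] = $2; total += $1 }
         END { for (i = 1; i <= NR; i++) printf "%s %.1f\n", label[i], 100 * count[i] / total }'
}

echo "T2T:"
printf '%s\n' "596 HIGH" "2828 LOW" "16314 MODERATE" "11261 MODIFIER" | impact_percent
echo "GRCh38:"
printf '%s\n' "414 HIGH" "466 LOW" "10054 MODERATE" "550 MODIFIER" | impact_percent
#+end_src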
***** DONE Worst consequence: too many missense
CLOSED: [2023-10-17 Tue 21:23]
GRCh38
$ bcftools +split-vep 2300346867_NA12878-63118093_S260-GRCh38/2300346867_NA12878-63118093_S260-GRCh38.filtervep.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
2 3_prime_utr_variant&nmd_transcript_variant
1 5_prime_utr_variant
2 coding_sequence_variant
47 frameshift_variant
6 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
1 frameshift_variant&splice_region_variant
1 frameshift_variant&start_lost&start_retained_variant
37 inframe_deletion
9 inframe_deletion&nmd_transcript_variant
27 inframe_insertion
5 inframe_insertion&nmd_transcript_variant
21 intron_variant
1593 missense_variant
37 missense_variant&nmd_transcript_variant
17 missense_variant&splice_region_variant
1 missense_variant&splice_region_variant&nmd_transcript_variant
1 protein_altering_variant
1 splice_acceptor_variant
1 splice_acceptor_variant&frameshift_variant
2 splice_acceptor_variant&nmd_transcript_variant
3 splice_donor_5th_base_variant&intron_variant
1 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
2 splice_donor_region_variant&intron_variant
1 splice_donor_region_variant&intron_variant&nmd_transcript_variant
1 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
10 splice_donor_variant
1 splice_donor_variant&non_coding_transcript_variant
11 splice_polypyrimidine_tract_variant&intron_variant
1 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
1 splice_region_variant&intron_variant
9 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
3 splice_region_variant&synonymous_variant
1 splice_region_variant&synonymous_variant&nmd_transcript_variant
4 start_lost
19 stop_gained
2 stop_gained&frameshift_variant
2 stop_gained&nmd_transcript_variant
1 stop_gained&splice_region_variant
1 stop_gained&splice_region_variant&nmd_transcript_variant
3 stop_lost
2 stop_lost&nmd_transcript_variant
1 stop_retained_variant
18 synonymous_variant
1 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
T2T
$ bcftools +split-vep 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.filtervep.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
15 3_prime_utr_variant
11 3_prime_utr_variant&nmd_transcript_variant
51 5_prime_utr_variant
3 5_prime_utr_variant&nmd_transcript_variant
48 coding_sequence_variant
5 coding_sequence_variant&nmd_transcript_variant
3 downstream_gene_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
78 inframe_deletion
2 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
84 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
16 intergenic_variant
368 intron_variant
21 intron_variant&nmd_transcript_variant
71 intron_variant&non_coding_transcript_variant
5187 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
105 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
33 non_coding_transcript_exon_variant
12 splice_acceptor_variant
1 splice_acceptor_variant&5_prime_utr_variant&intron_variant&nmd_transcript_variant
1 splice_acceptor_variant&nmd_transcript_variant
3 splice_acceptor_variant&non_coding_transcript_variant
1 splice_acceptor_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
16 splice_donor_5th_base_variant&intron_variant
2 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
33 splice_donor_region_variant&intron_variant
4 splice_donor_region_variant&intron_variant&nmd_transcript_variant
7 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
19 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
2 splice_donor_variant&non_coding_transcript_variant
3 splice_donor_variant&splice_donor_5th_base_variant&coding_sequence_variant&intron_variant
64 splice_polypyrimidine_tract_variant&intron_variant
6 splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
8 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
2 splice_region_variant&3_prime_utr_variant
2 splice_region_variant&5_prime_utr_variant
4 splice_region_variant&intron_variant
6 splice_region_variant&non_coding_transcript_exon_variant
54 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
4 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
5 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
27 splice_region_variant&synonymous_variant
13 start_lost
31 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
2 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
349 synonymous_variant
17 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
2 upstream_gene_variant
** DONE [#B] Quality indicators :qualité:
CLOSED: [2023-09-10 Sun 16:46]
*** Idea
Raredisease:
- FastQC: many statistics. Not available in Nix
- Mosdepth: computes depth (2x faster than samtools depth). In Nix
- MultiQC: only merges the results of the analyses. Not available in Nix
- Picard's CollectMultipleMetrics, CollectHsMetrics, and CollectWgsMetrics
- Qualimap: alternative to FastQC? Not available in Nix
- Sentieon's WgsMetricsAlgo: proprietary
- TIDDIT's cov: TIDDIT = chromosomal rearrangements
Sarek:
- alignment statistics: samtools stats, mosdepth
- QC: MultiQC
MultiQC: not available in Nix
*** DONE FastQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Mosdepth
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
For exomes, the capture file is required
subworkflows/local/bam_markduplicates/
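For reference, a "percentage at 20x" figure can be derived from per-interval depths. A minimal sketch, assuming a four-column BED-like input (chrom, start, end, depth) already restricted to the capture regions. The helper name =pct_bases_at_depth= and the toy intervals are made up for illustration, and this is not necessarily how the pipeline computes the figure.
#+begin_src sh
# Percentage of bases with depth >= threshold, from "chrom start end depth" lines.
pct_bases_at_depth() {
    awk -v min="$1" '{
        len = $3 - $2                 # BED intervals are half-open
        total += len
        if ($4 >= min) covered += len
    }
    END { if (total > 0) printf "%.1f\n", 100 * covered / total }'
}

printf '%s\n' \
    "chr1 100 180 35" \
    "chr1 180 200 12" | pct_bases_at_depth 20
# prints: 80.0
#+end_src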
*** DONE Samtools stats
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE [#B] Run report with MultiQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Results on NA12878: 98% at 20x
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-17 Thu>
**** DONE Understand why only 91% at 20x: inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test another kit: Twist exome comprehensive
CLOSED: [2023-08-18 Fri 22:24]
Worse
***** DONE Test the genome without alt
CLOSED: [2023-08-18 Fri 22:25]
Same
***** DONE Test NA12878 without inserted SNVs: this is the cause!!
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test hg19 on NA12878 without inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
**** DONE Understand why the SNVs lower the score: missing reads
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-18 Fri>
See [[id:5c1c36f3-f68e-4e6d-a7b6-61dca89abc37][Bug: many reads lost with NA12878]]
*** DONE Re-run results with NA1287 and NA12878 + Sanger
CLOSED: [2023-08-29 Tue 10:30] SCHEDULED: <2023-08-29 Tue>
*** DONE Compare with hg19
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with other capture kits
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with no-alt
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
** HOLD Check whether normalisation is done
** KILL [#B] HGVS nomenclature check :hgvs:
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-15 Tue>
*** KILL mutalyzer
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-13 Sun>
*** KILL API variantvalidator
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-13 Sun>
** DO
840
- [X] 2100609288_62905768
- [X] 2100609501_62905776
- [X] 2100614493_62951074
- [X] 2100622566_62908067
- [X] 2100622601_62908060
- [X] 2100622705_62908063
- [X] 2100640027_62911936
- [X] 2100645285_62913212
- [X] 2100661411_62914081
- [X] 2100661462_62914086
- [X] 2100708257_62921596
- [X] 2100738732_62926501
- [X] 2100738850_62926509
- [X] 2100746751_62926505
- [X] 2100746797_62926506
- [X] 2100782349_62931722
- [X] 2100782416_62931561
- [X] 2100782559_62931718
- [X] 2100799204_62934768
- [X] 2200010202_62940284
- [X] 2200023600_62940631
- [X] 2200024348_62999591
- [X] 2200027505_62942457
- [X] 2200038776_62943412
- [X] 2200041919_62943405
- [X] 2200088014_62951326
- [X] 2200146652_62959388
- [X] 2200151850_62960953
- [X] 2200160014_62959475
- [X] 2200160070_62959478
- [X] 2200201368_62967471
- [X] 2200201400_62967470
- [X] 2200265558_62976332
- [X] 2200265605_62976401
- [X] 2200267046_62975192
- [X] 2200273878_62999530
- [X] 2200279708_62977002
- [X] 2200284408_62979102
- [X] 2200293987_62979116
- [X] 2200294359_62979118
- [X] 2200306299_62982217
- [X] 2200306539_62982193
- [X] 220030671_62982211
- [X] 2200307058_62982231
- [X] 2200307108_62982196