DIQYFZN3GWR5EQWM3VPQXZ6JKVYCJZAFGJ3C2P3ONAYOKY6OUW5AC
7DD76XEW74EUAXS4MLNCHLISHV2C54SAUH4T42UMMSLVACOYD3ZQC
4MSZS47PX5OIKGBAMWYPHM5V6NJEWUWE4LNEPFXT4AIIU6MOB5AAC
FXA3ZBV64FML7W47IPHTAJFJHN3J3XHVHFVNYED47XFSBIGMBKRQC
YHHQ5OOSVEIJMOCGCS2F6WXJADKVOGDLNPMSA6YGQY4ZI2RIQRVAC
SYWVRVTSQFTJTEOZOEIIRDGPSJUEIIXT6FLFGXK7MJGL2P2C7MOQC
SNZLCATNHYELY3ZLE5CMAUSO5RQM2XTHB3WEZ4345PR5WQSTOJFAC
BB3OO5HF5GT4ZECXYRNPCBNECRSRPPVPWVM5NRVI7MOP7BAL3OEAC
18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** DONE Mail R. Lemann :T2T:
CLOSED: [2023-08-12 Sat 18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** KILL Mise à jour T2T :T2T:
*** WAIT Corriger PR
SCHEDULED: <2023-12-18 Mon>
** TODO VEP :vep:
*** DONE [[https://github.com/NixOS/nixpkgs/pull/185691][BioPerl]]
SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-09 Tue 10:57]
PR submitted
*** DONE BioDBBBigFile
CLOSED: [2023-11-30 Thu 21:52]
:PROPERTIES:
:ORDERED: t
:END:
/Entered on/ [2022-08-10 Wed 14:28]
On utilise la dernière version de kent, donc plus de problème.
PRête à être mergé. Rebase faite<2023-07-02 Sun>
**** DONE Version de kent déjà packagée : forcer version 335
CLOSED: [2023-07-02 Sun 11:20]
***** KILL [[https://github.com/NixOS/nixpkgs/pull/206991][Restore building kent 404]]
CLOSED: [2023-05-06 Sat 17:40]
Review faite <2023-03-26 Sun> , atteinte merge]
Relancé <2023-05-06 Sat>
Kent 446 n'a pas ce problème donc PR inutile
***** DONE [[https://github.com/NixOS/nixpkgs/pull/223411][Ajouter les header to package]] (inc folder)
CLOSED: [2023-05-08 Mon 10:18] SCHEDULED: <2023-05-07 Sun>
Review à faire
https://github.com/NixOS/nixpkgs/pull/223411
Corrigé et plus besoin de la PR précédente
***** KILL [[https://github.com/NixOS/nixpkgs/pull/186462][BioDBBBigFile]] avec ces 2 changements
CLOSED: [2023-07-02 Sun 11:20]
**** KILL Version de kent déjà packagée : 404
CLOSED: [2023-03-27 Mon 16:43]
Compile mais les tests de passent pas
**** DONE Modifier selon PR https://github.com/NixOS/nixpkgs/pull/186462
CLOSED: [2023-07-30 Sun 22:01] SCHEDULED: <2023-07-30 Sun 20:00>
:LOGBOOK:
CLOCK: [2023-07-30 Sun 19:13]--[2023-07-30 Sun 20:50] => 1:37
:END:
Modification nécessaire pour kent :
- plus de patch
- suppression d'une boucle dans postPatch
On supprime aussi NIX_BUILD_TOP
**** DONE Corriger PR biobigfile
CLOSED: [2023-11-30 Thu 21:52] SCHEDULED: <2023-12-05 Tue>
/Entered on/ [2023-10-15 Sun 17:21]
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186459][BioDBHTS]]
CLOSED: [2023-05-06 Sat 08:49] SCHEDULED: <2023-04-15 Sat>
/Entered on/ [2022-08-10 Wed 14:28]
Correction pour review faites <2022-10-10 Mon>
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186464][BioExtAlign]]
CLOSED: [2022-10-22 Sat 12:43] SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-10 Wed 14:28]
Review <2022-10-10 Mon>, correction dans la journée.
Correction 2e passe, attente
Impossible de faire marcher les tests Car il ne trouve pas le module Bio::Tools::Align, qui est dans un dossier ailleurs dans le dépôt. Même en compilant tout le dépôt, cela ne fonctionne pas... On skip les tests.
*** TODO VEP
SCHEDULED: <2023-12-21 Thu>
** WAIT [[https://github.com/NixOS/nixpkgs/pull/230394][rtg-tools]] :vcfeval:
Soumis
** WAIT Package Spip https://github.com/NixOS/nixpkgs/pull/247476
** TODO Happy :happy:
*** TODO PR python 3 upstream
SCHEDULED: <2023-12-27 Wed>
*** TODO nixpkgs en l'état
SCHEDULED: <2023-12-27 Wed>
** PROJ SpliceAI
** TODO Bamsurgeon
/Entered on/ [2023-05-13 Sat 19:11]
*** TODO Velvet
** TODO PR Picard avec option pour gérer la mémoire
Similaire à
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/picard/picard.sh
** DONE Gatk 4.5.0
CLOSED: [2023-12-17 Sun 22:02] SCHEDULED: <2023-12-15 Fri>
/Entered on/ [2023-12-15 Fri 18:58]
* Julia :julia:
** KILL XAM.jl: PR pour modification record :julia:
CLOSED: [2023-05-29 Mon 15:40] SCHEDULED: <2023-05-28 Sun>
/Entered on/ [2023-05-27 Sat 22:39]
** TODO XAMscissors.jl :xamscissors:
Modification de la séquence dans BAM.
*Pas de mise à jour de CIGAR*
On convertit en fastq et on lance le pipeline pour "corriger"
#+begin_src sh
cd /home/alex/code/bisonex/out/63003856/preprocessing/mapped
samtools view 63003856_S135.bam NC_000022.11 -o 63003856_S135_chr22.bam
cd /home/alex/recherche/bisonex/code/BamScissors.jl
cp ~/code/bisonex/out/63003856/preprocessing/mapped/63003856_S135_chr22.bam .
samtools index 63003856_chr22.bam
#+end_src
Le script va modifier le bam, le trier et générer le fastq. !!!
Attention: ne pas oublier l'option -n !!!
#+begin_src sh
time julia --project=.. insertVariant.jl
scp 63003856_S135_chr22_{1,2}.fq.gz meso:/Work/Users/apraga/bisonex/tests/bamscissors/
#+end_src
*** WAIT Implémenter les SNV avec VAF :snv:
Stratégie :
1. calculer la profondeur sur les positions
2. créer un dictionnaire { nom du reads : position dataframe }
3. itérer sur tous les reads et changer ceux marqués
**** DONE VAF = 1
CLOSED: [2023-05-29 Mon 15:34]
**** DONE VAF selon loi normale
CLOSED: [2023-05-29 Mon 15:35]
Tronquée si > 1
**** WAIT Tests unitaires
***** DONE NA12878: 1 gène sur chromosome 22
CLOSED: [2023-05-30 Tue 23:55]
root = "https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/"
#+begin_src sh
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878.bwa.markDuplicates.bam chr22 -o project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam chr22:19419700-19424000 -o NIST7035_H7AP8ADXX_NA12878_chr22_MRPL40_hg19.bam
#+end_src
***** WAIT Pull request formatspeciment
https://github.com/BioJulia/FormatSpecimens.jl/pull/8
***** DONE Formatspecimens
CLOSED: [2023-05-29 Mon 23:03]
****** DONE 1 read
CLOSED: [2023-05-29 Mon 23:02]
****** DONE VAF sur 1 exon
CLOSED: [2023-05-29 Mon 23:03]
**** DONE [#A] Bug: perte de nombreux reads avec NA12878
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-18 Fri>
:PROPERTIES:
:ID: 5c1c36f3-f68e-4e6d-a7b6-61dca89abc37
:END:
Ex: chrX:g.124056226 : on passe de 65 reads à 1
Test xamscissors: pas de soucis...
On teste sur cette position +/- 200bp
#+begin_src sh :dir /home/alex/roam/research/bisonex/code/sanger
samtools view /home/alex/code/bisonex/out/2300346867_NA12878-63118093_S260-GRCh38/preprocessing/mapped/2300346867_NA12878-63118093_S260-GRCh38.bam chrX:124056026-124056426 -o chrXsmall.bam
#+end_src
#+RESULTS:
***** DONE Vérifier profondeur avec dernière version :
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
****** DONE chr20: profondeur ok
SCHEDULED: <2023-08-19 Sat>
****** DONE toutes les données
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
Ok pour 7 variants (IGV) notament chromosome X
*** TODO Implémenter les indel avec VAF :indel:
*** TODO Soumission paquet
* Données
:PROPERTIES:
:CATEGORY: data
:END:
** DONE Remplacer bam par fastq sur mesocentre
CLOSED: [2023-04-16 Sun 16:33]
Commande
*** DONE Supprimer les fastq non "paired"
CLOSED: [2023-04-16 Sun 16:33]
nushell
Liste des fastq avec "paired-end" manquant
#+begin_src nu
ls **/*.fastq.gz | get name | path basename | split column "_" | get column1 | uniq -u | save single.txt
#+end_src
#+RESULTS:
: 62907927
: 62907970
: 62899606
: 62911287
: 62913201
: 62914084
: 62915905
: 62921595
: 62923065
: 62925220
: 62926503
: 62926502
: 62926500
: 62926499
: 62926498
: 62931719
: 62943423
: 62943400
: 62948290
: 62949205
: 62949206
: 62949118
: 62951284
: 62960792
: 62960785
: 62960787
: 62960617
: 62962561
: 62962692
: 62967473
: 62972194
: 62979102
On vérifie
#+begin_src nu
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0 }
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0.name } | path basename | split column "_" | get column1 | uniq -c
#+end_src
On met tous dans un dossier (pas de suppression )
#+begin_src
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0 } | each {|e| ^mv $e.name bad-fastq/}
#+end_src
On vérifie que les dossiier sont videsj
open single.txt | lines | each {|e| ls $"fastq/*_($in)" | get 0.name } | ^ls -l $in
Puis on supprime
open single.txt | lines | each {|e| ls $"fastq/*_($in)" | get 0.name } | ^rm -r $in
*** DONE Supprimer bam qui ont des fastq
CLOSED: [2023-04-16 Sun 16:33]
On liste les identifiants des fastq et bam dans un tableau avec leur type :
#+begin_src
let fastq = (ls fastq/*/*.fastq.gz | get name | parse "{dir}/{full_id}/{
18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** DONE Mail R. Lemann :T2T:
CLOSED: [2023-08-12 Sat 18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** KILL Mise à jour T2T :T2T:
*** WAIT Corriger PR
SCHEDULED: <2024-01-02 Tue>
** TODO VEP :vep:
*** DONE [[https://github.com/NixOS/nixpkgs/pull/185691][BioPerl]]
SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-09 Tue 10:57]
PR submitted
*** DONE BioDBBBigFile
CLOSED: [2023-11-30 Thu 21:52]
:PROPERTIES:
:ORDERED: t
:END:
/Entered on/ [2022-08-10 Wed 14:28]
On utilise la dernière version de kent, donc plus de problème.
PRête à être mergé. Rebase faite<2023-07-02 Sun>
**** DONE Version de kent déjà packagée : forcer version 335
CLOSED: [2023-07-02 Sun 11:20]
***** KILL [[https://github.com/NixOS/nixpkgs/pull/206991][Restore building kent 404]]
CLOSED: [2023-05-06 Sat 17:40]
Review faite <2023-03-26 Sun> , atteinte merge]
Relancé <2023-05-06 Sat>
Kent 446 n'a pas ce problème donc PR inutile
***** DONE [[https://github.com/NixOS/nixpkgs/pull/223411][Ajouter les header to package]] (inc folder)
CLOSED: [2023-05-08 Mon 10:18] SCHEDULED: <2023-05-07 Sun>
Review à faire
https://github.com/NixOS/nixpkgs/pull/223411
Corrigé et plus besoin de la PR précédente
***** KILL [[https://github.com/NixOS/nixpkgs/pull/186462][BioDBBBigFile]] avec ces 2 changements
CLOSED: [2023-07-02 Sun 11:20]
**** KILL Version de kent déjà packagée : 404
CLOSED: [2023-03-27 Mon 16:43]
Compile mais les tests de passent pas
**** DONE Modifier selon PR https://github.com/NixOS/nixpkgs/pull/186462
CLOSED: [2023-07-30 Sun 22:01] SCHEDULED: <2023-07-30 Sun 20:00>
:LOGBOOK:
CLOCK: [2023-07-30 Sun 19:13]--[2023-07-30 Sun 20:50] => 1:37
:END:
Modification nécessaire pour kent :
- plus de patch
- suppression d'une boucle dans postPatch
On supprime aussi NIX_BUILD_TOP
**** DONE Corriger PR biobigfile
CLOSED: [2023-11-30 Thu 21:52] SCHEDULED: <2023-12-05 Tue>
/Entered on/ [2023-10-15 Sun 17:21]
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186459][BioDBHTS]]
CLOSED: [2023-05-06 Sat 08:49] SCHEDULED: <2023-04-15 Sat>
/Entered on/ [2022-08-10 Wed 14:28]
Correction pour review faites <2022-10-10 Mon>
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186464][BioExtAlign]]
CLOSED: [2022-10-22 Sat 12:43] SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-10 Wed 14:28]
Review <2022-10-10 Mon>, correction dans la journée.
Correction 2e passe, attente
Impossible de faire marcher les tests Car il ne trouve pas le module Bio::Tools::Align, qui est dans un dossier ailleurs dans le dépôt. Même en compilant tout le dépôt, cela ne fonctionne pas... On skip les tests.
*** TODO VEP
SCHEDULED: <2023-12-21 Thu>
** WAIT [[https://github.com/NixOS/nixpkgs/pull/230394][rtg-tools]] :vcfeval:
Soumis
** WAIT Package Spip https://github.com/NixOS/nixpkgs/pull/247476
** TODO Happy :happy:
*** TODO PR python 3 upstream
SCHEDULED: <2023-12-27 Wed>
*** TODO nixpkgs en l'état
SCHEDULED: <2023-12-27 Wed>
** PROJ SpliceAI
** TODO Bamsurgeon
/Entered on/ [2023-05-13 Sat 19:11]
*** TODO Velvet
** TODO PR Picard avec option pour gérer la mémoire
Similaire à
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/picard/picard.sh
** DONE Gatk 4.5.0
CLOSED: [2023-12-17 Sun 22:02] SCHEDULED: <2023-12-15 Fri>
/Entered on/ [2023-12-15 Fri 18:58]
* Julia :julia:
** KILL XAM.jl: PR pour modification record :julia:
CLOSED: [2023-05-29 Mon 15:40] SCHEDULED: <2023-05-28 Sun>
/Entered on/ [2023-05-27 Sat 22:39]
** TODO XAMscissors.jl :xamscissors:
Modification de la séquence dans BAM.
*Pas de mise à jour de CIGAR*
On convertit en fastq et on lance le pipeline pour "corriger"
#+begin_src sh
cd /home/alex/code/bisonex/out/63003856/preprocessing/mapped
samtools view 63003856_S135.bam NC_000022.11 -o 63003856_S135_chr22.bam
cd /home/alex/recherche/bisonex/code/BamScissors.jl
cp ~/code/bisonex/out/63003856/preprocessing/mapped/63003856_S135_chr22.bam .
samtools index 63003856_chr22.bam
#+end_src
Le script va modifier le bam, le trier et générer le fastq. !!!
Attention: ne pas oublier l'option -n !!!
#+begin_src sh
time julia --project=.. insertVariant.jl
scp 63003856_S135_chr22_{1,2}.fq.gz meso:/Work/Users/apraga/bisonex/tests/bamscissors/
#+end_src
*** WAIT Implémenter les SNV avec VAF :snv:
Stratégie :
1. calculer la profondeur sur les positions
2. créer un dictionnaire { nom du reads : position dataframe }
3. itérer sur tous les reads et changer ceux marqués
**** DONE VAF = 1
CLOSED: [2023-05-29 Mon 15:34]
**** DONE VAF selon loi normale
CLOSED: [2023-05-29 Mon 15:35]
Tronquée si > 1
**** WAIT Tests unitaires
***** DONE NA12878: 1 gène sur chromosome 22
CLOSED: [2023-05-30 Tue 23:55]
root = "https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/"
#+begin_src sh
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878.bwa.markDuplicates.bam chr22 -o project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam chr22:19419700-19424000 -o NIST7035_H7AP8ADXX_NA12878_chr22_MRPL40_hg19.bam
#+end_src
***** WAIT Pull request formatspeciment
https://github.com/BioJulia/FormatSpecimens.jl/pull/8
***** DONE Formatspecimens
CLOSED: [2023-05-29 Mon 23:03]
****** DONE 1 read
CLOSED: [2023-05-29 Mon 23:02]
****** DONE VAF sur 1 exon
CLOSED: [2023-05-29 Mon 23:03]
**** DONE [#A] Bug: perte de nombreux reads avec NA12878
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-18 Fri>
:PROPERTIES:
:ID: 5c1c36f3-f68e-4e6d-a7b6-61dca89abc37
:END:
Ex: chrX:g.124056226 : on passe de 65 reads à 1
Test xamscissors: pas de soucis...
On teste sur cette position +/- 200bp
#+begin_src sh :dir /home/alex/roam/research/bisonex/code/sanger
samtools view /home/alex/code/bisonex/out/2300346867_NA12878-63118093_S260-GRCh38/preprocessing/mapped/2300346867_NA12878-63118093_S260-GRCh38.bam chrX:124056026-124056426 -o chrXsmall.bam
#+end_src
#+RESULTS:
***** DONE Vérifier profondeur avec dernière version :
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
****** DONE chr20: profondeur ok
SCHEDULED: <2023-08-19 Sat>
****** DONE toutes les données
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
Ok pour 7 variants (IGV) notament chromosome X
*** TODO Implémenter les indel avec VAF :indel:
*** TODO Soumission paquet
* Données
:PROPERTIES:
:CATEGORY: data
:END:
** DONE Remplacer bam par fastq sur mesocentre
CLOSED: [2023-04-16 Sun 16:33]
Commande
*** DONE Supprimer les fastq non "paired"
CLOSED: [2023-04-16 Sun 16:33]
nushell
Liste des fastq avec "paired-end" manquant
#+begin_src nu
ls **/*.fastq.gz | get name | path basename | split column "_" | get column1 | uniq -u | save single.txt
#+end_src
#+RESULTS:
: 62907927
: 62907970
: 62899606
: 62911287
: 62913201
: 62914084
: 62915905
: 62921595
: 62923065
: 62925220
: 62926503
: 62926502
: 62926500
: 62926499
: 62926498
: 62931719
: 62943423
: 62943400
: 62948290
: 62949205
: 62949206
: 62949118
: 62951284
: 62960792
: 62960785
: 62960787
: 62960617
: 62962561
: 62962692
: 62967473
: 62972194
: 62979102
On vérifie
#+begin_src nu
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0 }
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0.name } | path basename | split column "_" | get column1 | uniq -c
#+end_src
On met tous dans un dossier (pas de suppression )
#+begin_src
open single.txt | lines | each {|e| ls $"fastq/*_($in)/*" | get 0 } | each {|e| ^mv $e.name bad-fastq/}
#+end_src
On vérifie que les dossiier sont videsj
open single.txt | lines | each {|e| ls $"fastq/*_($in)" | get 0.name } | ^ls -l $in
Puis on supprime
open single.txt | lines | each {|e| ls $"fastq/*_($in)" | get 0.name } | ^rm -r $in
*** DONE Supprimer bam qui ont des fastq
CLOSED: [2023-04-16 Sun 16:33]
On liste les identifiants des fastq et bam dans un tableau avec leur type :
#+begin_src
let fastq = (ls fastq/*/*.fastq.gz | get name | parse "{dir}/{full_id}/{
.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio
QUERY.TOTAL.het_hom_ratio
INDEL ALL 413 246 167 751 289 215 2 98 0.595642 0.460821 0.286285 0.519629 NaN NaN 2.428571 2.465116
INDEL PASS 413 246 167 751 289 215 2 98 0.595642 0.460821 0.286285 0.519629 NaN NaN 2.428571 2.465116
SNP ALL 15883 15479 404 23597 5277 2841 46 44 0.974564 0.745760 0.120397 0.844947 3.017198 2.85705 5.560099 2.114633
SNP PASS 15883 15479 404 23597 5277 2841 46 44 0.974564 0.745760 0.120397 0.844947 3.017198 2.85705 5.560099 2.114633
******* DONE Vérifier qu'il ne reste plus de filtre autre que PASS
CLOSED: [2023-07-08 Sat 15:19]
#+begin_src
$ zgrep -c 'PASS' HG001_GRCh38_1_22_v4_lifted_merged.vcf.gz
3730505
$ zgrep -c '^chr' HG001_GRCh38_1_22_v4_lifted_merged.vcf.gz
3730506
#+end_src
****** TODO 1/4 SNP manquant ?
******* DONE Regarder avec Julia si ce sont vraiment des FP: 61/5277 qui ne le sont pas
CLOSED: [2023-07-09 Sun 12:09]
******* DONE Examiner les FP
CLOSED: [2023-07-30 Sun 22:05]
******* DONE Tester un FP
CLOSED: [2023-07-30 Sun 22:05]
2 │ chr1 608765 A G ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:ti:SNP:homalt:188
liftDown UCSC: rien en GIAB : vrai FP
3 │ chr1 762943 A G ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:ti:SNP:homalt:287
4 │ chr1 762945 A T ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:tv:SNP:homalt:287
Remaniements complexes ? Pas dans le gène en HG38
******* DONE La plupart des FP (4705/5566) sont homozygotes: erreur de référence ?
CLOSED: [2023-07-12 Wed 21:10] SCHEDULED: <2023-07-09 Sun>
Sur les 2 premiers variants, ils montrent en fait la différence entre T2T et GRCh38
Erreur à l'alignement ?
******** KILL relancer l'alignement
CLOSED: [2023-07-09 Sun 17:36]
******** DONE vérifier reads identiques hg38 et T2T: oui
CLOSED: [2023-07-09 Sun 16:36]
T2T CHR1608765
38 chr1:1180168-1180168 (
SRR14724513.24448214
SRR14724513.24448214
******* DONE Vérifier quelques variants sur IGV
CLOSED: [2023-07-09 Sun 17:36]
******* KILL Répartition des FP : cluster ?
CLOSED: [2023-07-09 Sun 17:36]
****** DONE Examiner les FP restant après correction selon séquence de référence
CLOSED: [2023-08-12 Sat 15:57]
****** HOLD Examiner les variants supprimé
****** TODO Enlever les FP qui correspondent à un changement dans le génome
******* Condition:
- pas de variation à la position en GRCh38
- variantion homozygote
- la varation en T2T correspond au changement de pair de base GRC38 -> T2T
pour les SNP:
alt_T2T[i] = DNA_GRC38[j]
avec i la position en T2T et j la position en GRCh38
Note: définir un ID n'est pas correct car les variants peuvent être modifié par happy !
******* Idée
- Pour chaque FP, c'est un "faux" FP si
- REF en hg38 == ALT en T2T
- et REF en hg38 != REF en T2T
- et variant homozygote
Comment obtenir les séquences de réferences ?
1. liftover
2. blat sur la séquence autour du variant
3. identifier quelques reads contenant le variant et regarder leur aligneement en hg38
Après discussion avec Alexis: solution 3
******* Algorithme
1. Extraire les coordonnées en T2T des faux positifs *homozygote*
2. Pour chaque faux positif
1. lister 10 reads contenant le variant
2. pour chacun de ces reads, récupérer la séquence en T2T et GRCh38 via le nom du read dans le bam
3. si la séquence en T2T modifiée par le variant est "identique" à celle en GRCh38, alors on ignore ce faux positif
Note: on ignore les reads qui ont changé de chromosome entre les version
******* DONE Résultat préliminaire
CLOSED: [2023-07-23 Sun 14:30]
cf [[file:~/roam/research/bisonex/code/giab/giab-corrected.csv][script julia]]
3498 faux positifs en moins, soit 0.89 sensibilité
julia> tp=15479
julia> fp=5277
julia> tp/(tp+fp)
0.7457602620928888
julia> tp/(tp+(fp-3498))
0.8969173716537258
On est toujours en dessous des 97%
******* HOLD Corriger proprement VCF ou résultats Happy
******* TODO Adapter pour gérer plusieurs variants par read
****** DONE Méthodologie du pangenome
CLOSED: [2023-10-03 Tue 21:28]
Voir biblio[cite:@liao2023] mais ont aligné sur GRCH38
******* DONE Mail alexis
CLOSED: [2023-10-03 Tue 21:28]
****** DONE Méthodologie T2T
CLOSED: [2023-10-16 Mon 19:42]
Mail alexis
SCHEDULED: <2023-10-04 Wed>
***** TODO Rendre simplement le nombre de vrais positifs
SCHEDULED: <2023-12-18 Mon>
***** KILL Mail Yannis
CLOSED: [2023-07-08 Sat 10:44]
***** DONE Mail GIAB pour version T2T
CLOSED: [2023-07-07 Fri 18:37]
**** KILL HG002 :hg002:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** KILL HG003 :hg003:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** KILL HG004 :hg004:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** DONE Plot : ashkenazim trio :hg38:
CLOSED: [2023-07-30 Sun 16:49] SCHEDULED: <2023-07-30 Sun 15:00>
:LOGBOOK:
CLOCK: [2023-07-30 Sun 16:06]--[2023-07-30 Sun 16:35] => 0:29
CLOCK: [2023-07-30 Sun 15:39]--[2023-07-30 Sun 15:40] => 0:01
:END:
/Entered on/ [2023-04-16 Sun 17:29]
Refaire résultats
**** DONE Mail Paul sur les résultat ashkenazim +/- centogene
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** DONE Relancer comparaison GIAB avec GATK 4.4.0
CLOSED: [2023-08-12 Sat 15:55]
/Entered on/ [2023-08-03 Thu 12:42]
**** TODO Re-télécharger proprement dans pipeline dédiés
[[*Résumé][Résumé]]
Cf [[*Validation : Quelles données de référence ?][Validation : Quelles données de référence ?]]
https://medium.com/dnanexus/benchmarking-state-of-the-art-secondary-variant-calling-pipelines-5472ca6bace7
Source:
https://trace.ncbi.nlm.nih.gov/Traces/index.html?view=study&acc=SRP047086
https://zenodo.org/records/3597727
Selon https://github.com/genome-in-a-bottle/giab_data_indexes
***** TODO HG001 :hg001:
SCHEDULED: <2023-11-29 Wed>
****** TODO Avec données en hg38
SCHEDULED: <2023-11-29 Wed>
[[*Résumé][Résumé]]
Ok pour hiseq4000 et sureselect mais utiliser le dernier commit pour symlink
****** TODO Avec données en hg19
SCHEDULED: <2023-12-20 Wed>
Utiliser crossmap ! https://crossmap.readthedocs.io/en/latest/ (inspiré de [[https://github.com/bcbio/bcbio_validation_workflows/blob/master/giab-exome/input/get_data.sh][bcbio]]
pour vérifier
***** TODO HG002 :hg002:
SCHEDULED: <2023-12-21 Thu>
***** TODO HG003 :hg003:
SCHEDULED: <2023-12-21 Thu>
***** TODO HG004 :hg001:
SCHEDULED: <2023-12-21 Thu>
**** TODO Refaire les analyses happy+ rtgeval
SCHEDULED: <2023-12-09 Sat>
On veut les résultats de https://medium.com/dnanexus/benchmarking-state-of-the-art-secondary-variant-calling-pipelines-5472ca6bace7
hap.py avec conda ?
*** TODO Platinum genome :platinum:
https://emea.illumina.com/platinumgenomes.html
**** TODO Tester sur la zone couverte par l'exome centogène
SCHEDULED: <2023-12-18 Mon>
*** DONE Séquencer NA12878 :cento:hg001:
CLOSED: [2023-10-07 Sat 17:59]
Discussion avec Paul : sous-traitant ne nous donnera pas les données, il faut commander l'ADN
**** DONE ADN commandé
CLOSED: [2023-06-30 Fri 22:29]
**** DONE Sauvegarder les données brutes
CLOSED: [2023-07-30 Sun 14:22] SCHEDULED: <2023-07-19 Wed>
K, scality, S
**** KILL Récupérer le fichier de capture
CLOSED: [2023-07-30 Sun 14:25] SCHEDULED: <2023-07-23 Sun>
Candidats donnés dans publication https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354858/
#+begin_quote
In short, the Nextera Rapid Capture Exome Kit (Illumina, San Diego, CA), the SureSelect Human All Exon kit (Agilent, Santa Clara, CA) or the Twist Human Core Exome was used for enrichment
, and a Nextseq500, HiSeq4000, or Novoseq 6000 (Illumina) instrument was used for the actual sequencing, with the average coverage targeted to at least 100× or at least 98% of the target DNA covered 20×.
#+end_quote
Par défaut, on utilisera https://www.twistbioscience.com/products/ngs/alliance-panels#tab-3
ANnonce récente pour nouveau panel Twist : https://www.centogene.com/news-events/news/newsdetails/twist-bioscience-and-centogene-launch-three-panels-to-advance-rare-disease-and-hereditary-cancer-research-and-support-diagnostics
Masi pas de fichier BED
***** DONE Mail centogène
CLOSED: [2023-07-30 Sun 14:22] DEADLINE: <2023-07-23 Sun>
**** DONE Tester Nextera Rapid Capture Exome v1.2 (hg19) :giab:
CLOSED: [2023-08-06 Sun 19:05] SCHEDULED: <2023-08-03 Thu 19:00>
https://support.illumina.com/downloads/nextera-rapid-capture-exome-v1-2-product-files.html
***** DONE Liftover capture
CLOSED: [2023-08-06 Sun 18:30] SCHEDULED: <2023-08-06 Sun>
#+begin_src sh
nextflow run -profile standard,helios workflows/lift-nextera-capture.nf -lib lib
#+end_src
Vérification rapide : ok
***** DONE Run
CLOSED: [2023-08-06 Sun 19:05] SCHEDULED: <2023-08-06 Sun>
#+begin_src sh
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_NA12878-63118093_S260-GRCh38/callVariant/haplotypecaller/2300346867_NA12878-63118093_S260-GRCh38.vcf.gz --outdir=out/2300346867_NA12878-63118093_S260-GRCh38/happy-nextera-lifted/ --compare=happy -lib lib --capture=capture/nexterarapidcapture_exome_targetedregions_v1.2-nochrM_lifted.bed --id=HG001 --genome=GRCh38
#+end_src
**** DONE Tester Agilent SureSelect All Exon V8 (hg38) :giab:
CLOSED: [2023-07-31 Mon 23:09] SCHEDULED: <2023-07-31 Mon>
https://earray.chem.agilent.com/suredesign/index.htm
"Find design"
"Agilent catalog"
Fichiers:
- Regions.bed: Targeted exon intervals, curated and targeted by Agilent Technologies
- MergedProbes.bed: Merged probes for targeted enrichment of exons described in Regions.bed
- Covered.bed: Merged probes and sequences with 95% homology or above
- Padded.bed: Merged probes and sequences with 95% homology or above extended 50 bp at each side
- AllTracks.bed: Targeted regions and covered tracks
#+begin_src sh
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_63118093_NA12878-GRCh38/callVariant/haplotypecaller/2300346867_63118093_NA12878-GRCh38.vcf.gz --outdir=out/2300346867_63118093_NA12878-GRCh38/happy/ --compare=happy -lib lib --capture=capture/Agilent_SureSelect_All_Exons_v8_hg38_Regions.bed --id=HG001 --genome=GRCh38
#+end_src
| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | FP.al | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| INDEL | ALL | 423 | 395 | 28 | 915 | 108 | 405 | 4 | 13 | 0.933806 | 0.788235 | 0.442623 | 0.854868 | | | 1.7012987012987013 | 2.7916666666666665 |
| INDEL | PASS | 423 | 395 | 28 | 915 | 108 | 405 | 4 | 13 | 0.933806 | 0.788235 | 0.442623 | 0.854868 | | | 1.7012987012987013 | 2.7916666666666665 |
| SNP | ALL | 20984 | 20600 | 384 | 26080 | 780 | 4703 | 62 | 10 | 0.9817 | 0.963512 | 0.18033 | 0.972521 | 3.0499710592321048 | 2.7596541786743516 | 1.58256372367935 | 1.8978207694018234 |
| SNP | PASS | 20984 | 20600 | 384 | 26080 | 780 | 4703 | 62 | 10 | 0.9817 | 0.963512 | 0.18033 | 0.972521 | 3.0499710592321048 | 2.7596541786743516 | 1.58256372367935 | 1.8978207694018234 |
**** DONE Test Twist Human core Exome (hg38):giab:
CLOSED: [2023-08-01 Tue 23:16] SCHEDULED: <202 3-08-02 Wed>
https://www.twistbioscience.com/resources/data-files/ngs-human-core-exome-panel-bed-file
#+begin_src
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_63118093_NA12878-GRCh38/callVariant/haplotypecaller/2300346867_63118093_NA12878-GRCh38.vcf.gz --outdir=out/2300346867_63118093_NA12878-GRCh38/happy-twist-exome-core/ --compare=happy -lib lib --capture=capture/Twist_Exome_Core_Covered_Targets_hg38.bed --id=HG001 --genome=GRCh38 -bg
#+end_src
| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | FP.al | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| INDEL | ALL | 328 | 313 | 15 | 722 | 95 | 309 | 4 | 13 | 0.954268 | 0.769976 | 0.427978 | 0.852273 | | | 1.8584070796460177 | 2.8967391304347827 |
| INDEL | PASS | 328 | 313 | 15 | 722 | 95 | 309 | 4 | 13 | 0.954268 | 0.769976 | 0.427978 | 0.852273 | | | 1.8584070796460177 | 2.8967391304347827 |
| SNP | ALL | 19198 | 18962 | 236 | 23381 | 684 | 3738 | 48 | 10 | 0.987707 | 0.965178 | 0.159873 | 0.976313 | 3.1034188034188035 | 2.859264147830391 | 1.5669565217391304 | 1.8578767123287672 |
| SNP | PASS | 19198 | 18962 | 236 | 23381 | 684 | 3738 | 48 | 10 | 0.987707 | 0.965178 | 0.159873 | 0.976313 | 3.1034188034188035 | 2.859264147830391 | 1.5669565217391304 | 1.8578767123287672 |
**** DONE Test Twist Human core Exome (hg38):giab:
CLOSED: [2023-08-05 Sat 09:25] SCHEDULED: <2023-08-03 Thu 20:00>
#+begin_src sh
ID="2300346867_NA12878-63118093_S260-GRCh38"; nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/${ID}/callVariant/haplotypecaller/${ID}.vcf.gz --outdir=out/${ID}/happy-twist-exome-core/ --compare=happy -lib lib --capture=capture/Twist_Exome_Core_Covered_Targets_hg38.bed --id=HG001 --genome=GRCh38 -bg
#+end_src
**** DONE Tester Agilen SureSelect All Exon V8 (hg38) GATK-4.4:giab:
CLOSED: [2023-08-05 Sat 09:25] SCHEDULED: <2023-08-03 Thu 20:00>
**** DONE Vérifier l'impact gatk 4.3 - 4.4 : aucun
CLOSED: [2023-08-05 Sat 09:25]
**** DONE Figure comparant les 3 capture :hg001:
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** DONE Mail Paul sur les 3 capture :hg001:
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** KILL Tester si le panel Twist Alliance VCGS Exome suffit
CLOSED: [2023-07-31 Mon 22:31] SCHEDULED: <2023-07-30 Sun>
**** DONE Mail cento pour demande le type de capture
CLOSED: [2023-10-07 Sat 17:59]
/Entered on/ [2023-08-07 Mon 20:40]
Twist exome
*** PROJ Comparer happy et happy-vcfeval :giab:
** TODO Données syndip (CHM-eval) ! :syndip:
https://github.com/lh3/CHM-eval
*** KILL Données officielles : non car génome !!
CLOSED: [2023-11-19 Sun 23:43]
**** KILL Run ERR1341793
CLOSED: [2023-11-19 Sun 23:43] SCHEDULED: <2023-11-18 Sat>
(raw reads ERR1341793_1.fastq.gz and ERR1341793_2.fastq.gz downloaded from https://www.ebi.ac.uk/ena/browser/view/ERR1341793)
**** KILL Run ERR1341796
CLOSED: [2023-11-19 Sun 23:43] SCHEDULED: <2023-11-18 Sat>
*** TODO Données exome Broad institute (nextflow)
SCHEDULED: <2023-12-18 Mon>
https://console.cloud.google.com/storage/browser/broad-public-datasets/CHM1_CHM13_WES;tab=objects?pli=1&prefix=&forceOnObjectsSortingFiltering=false
*** TODO Télécharger VCF
SCHEDULED: <2023-12-01 Fri>
.al METRIC.Recall METRIC.Precision METRIC.Frac_NA METRIC.F1_Score TRUTH.TOTAL.TiTv_ratio QUERY.TOTAL.TiTv_ratio TRUTH.TOTAL.het_hom_ratio QUERY.TOTAL.het_hom_ratio
INDEL ALL 413 246 167 751 289 215 2 98 0.595642 0.460821 0.286285 0.519629 NaN NaN 2.428571 2.465116
INDEL PASS 413 246 167 751 289 215 2 98 0.595642 0.460821 0.286285 0.519629 NaN NaN 2.428571 2.465116
SNP ALL 15883 15479 404 23597 5277 2841 46 44 0.974564 0.745760 0.120397 0.844947 3.017198 2.85705 5.560099 2.114633
SNP PASS 15883 15479 404 23597 5277 2841 46 44 0.974564 0.745760 0.120397 0.844947 3.017198 2.85705 5.560099 2.114633
******* DONE Vérifier qu'il ne reste plus de filtre autre que PASS
CLOSED: [2023-07-08 Sat 15:19]
#+begin_src
$ zgrep -c 'PASS' HG001_GRCh38_1_22_v4_lifted_merged.vcf.gz
3730505
$ zgrep -c '^chr' HG001_GRCh38_1_22_v4_lifted_merged.vcf.gz
3730506
#+end_src
****** TODO 1/4 SNP manquant ?
******* DONE Regarder avec Julia si ce sont vraiment des FP: 61/5277 qui ne le sont pas
CLOSED: [2023-07-09 Sun 12:09]
******* DONE Examiner les FP
CLOSED: [2023-07-30 Sun 22:05]
******* DONE Tester un FP
CLOSED: [2023-07-30 Sun 22:05]
2 │ chr1 608765 A G ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:ti:SNP:homalt:188
liftDown UCSC: rien en GIAB : vrai FP
3 │ chr1 762943 A G ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:ti:SNP:homalt:287
4 │ chr1 762945 A T ./.:.:.:.:NOCALL:nocall:. 1/1:FP:.:tv:SNP:homalt:287
Remaniements complexes ? Pas dans le gène en HG38
******* DONE La plupart des FP (4705/5566) sont homozygotes: erreur de référence ?
CLOSED: [2023-07-12 Wed 21:10] SCHEDULED: <2023-07-09 Sun>
Sur les 2 premiers variants, ils montrent en fait la différence entre T2T et GRCh38
Erreur à l'alignement ?
******** KILL relancer l'alignement
CLOSED: [2023-07-09 Sun 17:36]
******** DONE vérifier reads identiques hg38 et T2T: oui
CLOSED: [2023-07-09 Sun 16:36]
T2T CHR1608765
38 chr1:1180168-1180168 (
SRR14724513.24448214
SRR14724513.24448214
******* DONE Vérifier quelques variants sur IGV
CLOSED: [2023-07-09 Sun 17:36]
******* KILL Répartition des FP : cluster ?
CLOSED: [2023-07-09 Sun 17:36]
****** DONE Examiner les FP restant après correction selon séquence de référence
CLOSED: [2023-08-12 Sat 15:57]
****** HOLD Examiner les variants supprimé
****** TODO Enlever les FP qui correspondent à un changement dans le génome
******* Condition:
- pas de variation à la position en GRCh38
- variantion homozygote
- la varation en T2T correspond au changement de pair de base GRC38 -> T2T
pour les SNP:
alt_T2T[i] = DNA_GRC38[j]
avec i la position en T2T et j la position en GRCh38
Note: définir un ID n'est pas correct car les variants peuvent être modifié par happy !
******* Idée
- Pour chaque FP, c'est un "faux" FP si
- REF en hg38 == ALT en T2T
- et REF en hg38 != REF en T2T
- et variant homozygote
Comment obtenir les séquences de réferences ?
1. liftover
2. blat sur la séquence autour du variant
3. identifier quelques reads contenant le variant et regarder leur aligneement en hg38
Après discussion avec Alexis: solution 3
******* Algorithme
1. Extraire les coordonnées en T2T des faux positifs *homozygote*
2. Pour chaque faux positif
1. lister 10 reads contenant le variant
2. pour chacun de ces reads, récupérer la séquence en T2T et GRCh38 via le nom du read dans le bam
3. si la séquence en T2T modifiée par le variant est "identique" à celle en GRCh38, alors on ignore ce faux positif
Note: on ignore les reads qui ont changé de chromosome entre les version
******* DONE Résultat préliminaire
CLOSED: [2023-07-23 Sun 14:30]
cf [[file:~/roam/research/bisonex/code/giab/giab-corrected.csv][script julia]]
3498 faux positifs en moins, soit 0.89 sensibilité
julia> tp=15479
julia> fp=5277
julia> tp/(tp+fp)
0.7457602620928888
julia> tp/(tp+(fp-3498))
0.8969173716537258
On est toujours en dessous des 97%
******* HOLD Corriger proprement VCF ou résultats Happy
******* TODO Adapter pour gérer plusieurs variants par read
****** DONE Méthodologie du pangenome
CLOSED: [2023-10-03 Tue 21:28]
Voir biblio[cite:@liao2023] mais ont aligné sur GRCH38
******* DONE Mail alexis
CLOSED: [2023-10-03 Tue 21:28]
****** DONE Méthodologie T2T
CLOSED: [2023-10-16 Mon 19:42]
Mail alexis
SCHEDULED: <2023-10-04 Wed>
***** TODO Rendre simplement le nombre de vrais positifs
SCHEDULED: <2023-12-26 Tue>
***** KILL Mail Yannis
CLOSED: [2023-07-08 Sat 10:44]
***** DONE Mail GIAB pour version T2T
CLOSED: [2023-07-07 Fri 18:37]
**** KILL HG002 :hg002:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** KILL HG003 :hg003:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** KILL HG004 :hg004:T2T:
CLOSED: [2023-11-26 Sun 12:30]
**** DONE Plot : ashkenazim trio :hg38:
CLOSED: [2023-07-30 Sun 16:49] SCHEDULED: <2023-07-30 Sun 15:00>
:LOGBOOK:
CLOCK: [2023-07-30 Sun 16:06]--[2023-07-30 Sun 16:35] => 0:29
CLOCK: [2023-07-30 Sun 15:39]--[2023-07-30 Sun 15:40] => 0:01
:END:
/Entered on/ [2023-04-16 Sun 17:29]
Refaire résultats
**** DONE Mail Paul sur les résultat ashkenazim +/- centogene
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** DONE Relancer comparaison GIAB avec GATK 4.4.0
CLOSED: [2023-08-12 Sat 15:55]
/Entered on/ [2023-08-03 Thu 12:42]
**** TODO Re-télécharger proprement dans pipeline dédiés
[[*Résumé][Résumé]]
Cf [[*Validation : Quelles données de référence ?][Validation : Quelles données de référence ?]]
https://medium.com/dnanexus/benchmarking-state-of-the-art-secondary-variant-calling-pipelines-5472ca6bace7
Source:
https://trace.ncbi.nlm.nih.gov/Traces/index.html?view=study&acc=SRP047086
https://zenodo.org/records/3597727
Selon https://github.com/genome-in-a-bottle/giab_data_indexes
***** TODO HG001 :hg001:
SCHEDULED: <2023-11-29 Wed>
****** TODO Avec données en hg38
SCHEDULED: <2023-11-29 Wed>
[[*Résumé][Résumé]]
Ok pour hiseq4000 et sureselect mais utiliser le dernier commit pour symlink
****** TODO Avec données en hg19
SCHEDULED: <2023-12-20 Wed>
Utiliser crossmap ! https://crossmap.readthedocs.io/en/latest/ (inspiré de [[https://github.com/bcbio/bcbio_validation_workflows/blob/master/giab-exome/input/get_data.sh][bcbio]]
pour vérifier
***** TODO HG002 :hg002:
SCHEDULED: <2023-12-21 Thu>
***** TODO HG003 :hg003:
SCHEDULED: <2023-12-21 Thu>
***** TODO HG004 :hg001:
SCHEDULED: <2023-12-21 Thu>
**** TODO Refaire les analyses happy+ rtgeval
SCHEDULED: <2023-12-09 Sat>
On veut les résultats de https://medium.com/dnanexus/benchmarking-state-of-the-art-secondary-variant-calling-pipelines-5472ca6bace7
hap.py avec conda ?
*** TODO Platinum genome :platinum:
https://emea.illumina.com/platinumgenomes.html
**** TODO Tester sur la zone couverte par l'exome centogène
SCHEDULED: <2023-12-26 Tue>
*** DONE Séquencer NA12878 :cento:hg001:
CLOSED: [2023-10-07 Sat 17:59]
Discussion avec Paul : sous-traitant ne nous donnera pas les données, il faut commander l'ADN
**** DONE ADN commandé
CLOSED: [2023-06-30 Fri 22:29]
**** DONE Sauvegarder les données brutes
CLOSED: [2023-07-30 Sun 14:22] SCHEDULED: <2023-07-19 Wed>
K, scality, S
**** KILL Récupérer le fichier de capture
CLOSED: [2023-07-30 Sun 14:25] SCHEDULED: <2023-07-23 Sun>
Candidats donnés dans publication https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8354858/
#+begin_quote
In short, the Nextera Rapid Capture Exome Kit (Illumina, San Diego, CA), the SureSelect Human All Exon kit (Agilent, Santa Clara, CA) or the Twist Human Core Exome was used for enrichment, and a Nextseq500, HiSeq4000, or Novoseq 6000 (Illumina) instrument was used for the actual sequencing, with the average coverage targeted to at least 100× or at least 98% of the target DNA covered 20×.
#+end_quote
Par défaut, on utilisera https://www.twistbioscience.com/products/ngs/alliance-panels#tab-3
ANnonce récente pour nouveau panel Twist : https://www.centogene.com/news-events/news/newsdetails/twist-bioscience-and-centogene-launch-three-panels-to-advance-rare-disease-and-hereditary-cancer-research-and-support-diagnostics
Masi pas de fichier BED
***** DONE Mail centogène
CLOSED: [2023-07-30 Sun 14:22] DEADLINE: <2023-07-23 Sun>
**** DONE Tester Nextera Rapid Capture Exome v1.2 (hg19) :giab:
CLOSED: [2023-08-06 Sun 19:05] SCHEDULED: <2023-08-03 Thu 19:00>
https://support.illumina.com/downloads/nextera-rapid-capture-exome-v1-2-product-files.html
***** DONE Liftover capture
CLOSED: [2023-08-06 Sun 18:30] SCHEDULED: <2023-08-06 Sun>
#+begin_src sh
nextflow run -profile standard,helios workflows/lift-nextera-capture.nf -lib lib
#+end_src
Vérification rapide : ok
***** DONE Run
CLOSED: [2023-08-06 Sun 19:05] SCHEDULED: <2023-08-06 Sun>
#+begin_src sh
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_NA12878-63118093_S260-GRCh38/callVariant/haplotypecaller/2300346867_NA12878-63118093_S260-GRCh38.vcf.gz --outdir=out/2300346867_NA12878-63118093_S260-GRCh38/happy-nextera-lifted/ --compare=happy -lib lib --capture=capture/nexterarapidcapture_exome_targetedregions_v1.2-nochrM_lifted.bed --id=HG001 --genome=GRCh38
#+end_src
**** DONE Tester Agilent SureSelect All Exon V8 (hg38) :giab:
CLOSED: [2023-07-31 Mon 23:09] SCHEDULED: <2023-07-31 Mon>
https://earray.chem.agilent.com/suredesign/index.htm
"Find design"
"Agilent catalog"
Fichiers:
- Regions.bed: Targeted exon intervals, curated and targeted by Agilent Technologies
- MergedProbes.bed: Merged probes for targeted enrichment of exons described in Regions.bed
- Covered.bed: Merged probes and sequences with 95% homology or above
- Padded.bed: Merged probes and sequences with 95% homology or above extended 50 bp at each side
- AllTracks.bed: Targeted regions and covered tracks
#+begin_src sh
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_63118093_NA12878-GRCh38/callVariant/haplotypecaller/2300346867_63118093_NA12878-GRCh38.vcf.gz --outdir=out/2300346867_63118093_NA12878-GRCh38/happy/ --compare=happy -lib lib --capture=capture/Agilent_SureSelect_All_Exons_v8_hg38_Regions.bed --id=HG001 --genome=GRCh38
#+end_src
| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | FP.al | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| INDEL | ALL | 423 | 395 | 28 | 915 | 108 | 405 | 4 | 13 | 0.933806 | 0.788235 | 0.442623 | 0.854868 | | | 1.7012987012987013 | 2.7916666666666665 |
| INDEL | PASS | 423 | 395 | 28 | 915 | 108 | 405 | 4 | 13 | 0.933806 | 0.788235 | 0.442623 | 0.854868 | | | 1.7012987012987013 | 2.7916666666666665 |
| SNP | ALL | 20984 | 20600 | 384 | 26080 | 780 | 4703 | 62 | 10 | 0.9817 | 0.963512 | 0.18033 | 0.972521 | 3.0499710592321048 | 2.7596541786743516 | 1.58256372367935 | 1.8978207694018234 |
| SNP | PASS | 20984 | 20600 | 384 | 26080 | 780 | 4703 | 62 | 10 | 0.9817 | 0.963512 | 0.18033 | 0.972521 | 3.0499710592321048 | 2.7596541786743516 | 1.58256372367935 | 1.8978207694018234 |
**** DONE Test Twist Human core Exome (hg38):giab:
CLOSED: [2023-08-01 Tue 23:16] SCHEDULED: <202 3-08-02 Wed>
https://www.twistbioscience.com/resources/data-files/ngs-human-core-exome-panel-bed-file
#+begin_src
nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/2300346867_63118093_NA12878-GRCh38/callVariant/haplotypecaller/2300346867_63118093_NA12878-GRCh38.vcf.gz --outdir=out/2300346867_63118093_NA12878-GRCh38/happy-twist-exome-core/ --compare=happy -lib lib --capture=capture/Twist_Exome_Core_Covered_Targets_hg38.bed --id=HG001 --genome=GRCh38 -bg
#+end_src
| Type | Filter | TRUTH.TOTAL | TRUTH.TP | TRUTH.FN | QUERY.TOTAL | QUERY.FP | QUERY.UNK | FP.gt | FP.al | METRIC.Recall | METRIC.Precision | METRIC.Frac_NA | METRIC.F1_Score | TRUTH.TOTAL.TiTv_ratio | QUERY.TOTAL.TiTv_ratio | TRUTH.TOTAL.het_hom_ratio | QUERY.TOTAL.het_hom_ratio |
| INDEL | ALL | 328 | 313 | 15 | 722 | 95 | 309 | 4 | 13 | 0.954268 | 0.769976 | 0.427978 | 0.852273 | | | 1.8584070796460177 | 2.8967391304347827 |
| INDEL | PASS | 328 | 313 | 15 | 722 | 95 | 309 | 4 | 13 | 0.954268 | 0.769976 | 0.427978 | 0.852273 | | | 1.8584070796460177 | 2.8967391304347827 |
| SNP | ALL | 19198 | 18962 | 236 | 23381 | 684 | 3738 | 48 | 10 | 0.987707 | 0.965178 | 0.159873 | 0.976313 | 3.1034188034188035 | 2.859264147830391 | 1.5669565217391304 | 1.8578767123287672 |
| SNP | PASS | 19198 | 18962 | 236 | 23381 | 684 | 3738 | 48 | 10 | 0.987707 | 0.965178 | 0.159873 | 0.976313 | 3.1034188034188035 | 2.859264147830391 | 1.5669565217391304 | 1.8578767123287672 |
**** DONE Test Twist Human core Exome (hg38):giab:
CLOSED: [2023-08-05 Sat 09:25] SCHEDULED: <2023-08-03 Thu 20:00>
#+begin_src sh
ID="2300346867_NA12878-63118093_S260-GRCh38"; nextflow run workflows/compareVCF.nf -profile standard,helios --query=out/${ID}/callVariant/haplotypecaller/${ID}.vcf.gz --outdir=out/${ID}/happy-twist-exome-core/ --compare=happy -lib lib --capture=capture/Twist_Exome_Core_Covered_Targets_hg38.bed --id=HG001 --genome=GRCh38 -bg
#+end_src
**** DONE Tester Agilen SureSelect All Exon V8 (hg38) GATK-4.4:giab:
CLOSED: [2023-08-05 Sat 09:25] SCHEDULED: <2023-08-03 Thu 20:00>
**** DONE Vérifier l'impact gatk 4.3 - 4.4 : aucun
CLOSED: [2023-08-05 Sat 09:25]
**** DONE Figure comparant les 3 capture :hg001:
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** DONE Mail Paul sur les 3 capture :hg001:
CLOSED: [2023-08-06 Sun 20:24] SCHEDULED: <2023-08-06 Sun>
**** KILL Tester si le panel Twist Alliance VCGS Exome suffit
CLOSED: [2023-07-31 Mon 22:31] SCHEDULED: <2023-07-30 Sun>
**** DONE Mail cento pour demande le type de capture
CLOSED: [2023-10-07 Sat 17:59]
/Entered on/ [2023-08-07 Mon 20:40]
Twist exome
*** PROJ Comparer happy et happy-vcfeval :giab:
** TODO Données syndip (CHM-eval) ! :syndip:
https://github.com/lh3/CHM-eval
*** KILL Données officielles : non car génome !!
CLOSED: [2023-11-19 Sun 23:43]
**** KILL Run ERR1341793
CLOSED: [2023-11-19 Sun 23:43] SCHEDULED: <2023-11-18 Sat>
(raw reads ERR1341793_1.fastq.gz and ERR1341793_2.fastq.gz downloaded from https://www.ebi.ac.uk/ena/browser/view/ERR1341793)
**** KILL Run ERR1341796
CLOSED: [2023-11-19 Sun 23:43] SCHEDULED: <2023-11-18 Sat>
*** TODO Données exome Broad institute (nextflow)
SCHEDULED: <2023-12-26 Tue>
https://console.cloud.google.com/storage/browser/broad-public-datasets/CHM1_CHM13_WES;tab=objects?pli=1&prefix=&forceOnObjectsSortingFiltering=false
*** TODO Télécharger VCF
SCHEDULED: <2023-12-01 Fri>
transcript status
canonical status of transcript
APPRIS isoform annotation
transcript support level
biotype of transcript ("pr
otein_coding" preferred)
CCDS status of transcript
consequence rank according to this table
translated, transcript or feature length (longer preferred)
"Wherever possible we would discourage you from summarising data in this way. "
**** DONE Mail alexis
CLOSED: [2023-08-20 Sun 13:45] SCHEDULED: <2023-08-20 Sun>
**** TODO Données simuscop 200x
SCHEDULED: <2023-12-21 Thu>
**** DONE En T2T avec liftover (filtre = spip) : ok mais lent et trop de variants :tests:
CLOSED: [2023-09-17 Sun 17:13] SCHEDULED: <2023-09-17 Sun>
1. Conversion en bed
#+begin_src sh :dir:~/code/sanger
open snvs-cento-sanger.csv | select chrom pos | insert pos2 {$in.pos } | to csv --separator="\t" | save snvs-cento-sanger.bed -f
#+end_src
2. Liftover avec UCSC (en ligne)
NB: vérifié sur le premier résultat en cherche le read contenant le variant (samtools view -r puis samtools view | grep en T2T) et avec l'aide d'IGV, on a un variant qui correspond en
chr1:10757746
3. En supposant que l'ordre des variants n'a pas changé, on ajoute simplement REF et ALT avec annotateLifted.jl
Annotation spip *très lente* : 1h13 !
Résultat:
2×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼──────────────────────────────────────
1 │ chr12:g.13594572 60.0 1
2 │ chr17:g.10204026 60.0 1
144 found over 146
filter depth : another 0 missed variants
filter poly : another 0 missed variants
filter vep : another 0 missed variants
Et on a trop de variants en sortie (7330 !)
**** DONE Mail Paul avec résultats filtre en T2T + nouveau schéma
CLOSED: [2023-09-17 Sun 23:15] SCHEDULED: <2023-09-17 Sun>
** TODO Medically relevant genes
SCHEDULED: <2023-12-18 Mon>
/Entered on/ [2023-10-18 Wed 22:37]
** TODO HG002 en T2T
/Entered on/ [2023-11-25 Sat 17:58]
https://github.com/marbl/HG002
*** TODO Tester les benchmark préliminaires
SCHEDULED: <2023-12-26 Tue>
https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/NIST_HG002_DraftBenchmark_defrabbV0.011-20230725/
* Ré-interprétation :reanalysis:
** DONE Lancer tests sur données brutes [225/250] <(samples.csv)> <(runs.waiting)>
CLOSED: [2023-10-14 Sat 11:58] SCHEDULED: <2023-10-08 Sun>
- [X] 100222_63015289
- [X] 1600304839_63051311
- [X] 1900007827_62913191
- [X] 1900398899_62999500
- [X] 1900486799_62913197
- [X] 2100422923_62952677
- [X] 2100458888_62933047
- [X] 2100601558_62903840
- [X] 2100609288_62905768
- [X] 2100609501_62905776
- [X] 2100614493_62951074
- [X] 2100622566_62908067
- [X] 2100622601_62908060
- [X] 2100622705_62908063
- [X] 2100640027_62911936
- [X] 2100645285_62913212
- [X] 2100661411_62914081
- [X] 2100661462_62914086
- [X] 2100708257_62921596
- [X] 2100738732_62926501
- [X] 2100738850_62926509
- [X] 2100746751_62926505
- [X] 2100746797_62926506
- [X] 2100782349_62931722
- [X] 2100782416_62931561
- [X] 2100782559_62931718
- [X] 2100799204_62934768
- [X] 2200010202_62940284
- [X] 2200023600_62940631
- [X] 2200024348_62999591
- [X] 2200027505_62942457
- [X] 2200038776_62943412
- [X] 2200041919_62943405
- [X] 2200088014_62951326
- [X] 2200146652_62959388
- [X] 2200151850_62960953
- [X] 2200160014_62959475
- [X] 2200160070_62959478
- [X] 2200201368_62967471
- [X] 2200201400_62967470
- [X] 2200265558_62976332
- [X] 2200265605_62976401
- [X] 2200267046_62975192
- [X] 2200273878_62999530
- [X] 2200279708_62977002
- [X] 2200284408_62979102
- [X] 2200293987_62979116
- [X] 2200294359_62979118
- [X] 2200306299_62982217
- [X] 2200306539_62982193
- [X] 220030671_62982211
- [X] 2200307058_62982231
- [X] 2200307108_62982196
- [X] 2200307136_62982221
- [X] 2200307199_62982239
- [X] 2200307230_62982234
- [X] 2200307262_62982219
- [X] 2200307297_62982227
- [X] 2200324510_62985453
- [X] 2200324549_62985478
- [X] 2200324573_62985445
- [X] 2200324594_62985467
- [X] 2200324606_62985463
- [X] 2200324614_62985459
- [X] 2200338306_62985430
- [X] 2200343880_62989407
- [X] 2200343910_62989460
- [X] 2200343938_62989451
- [X] 2200343966_62989456
- [X] 2200343993_62989440
- [X] 2200344013_62989464
- [X] 2200349749_62989465
- [X] 2200363462_62988848
- [X] 2200377880_62991993
- [X] 2200378032_62991991
- [X] 2200383996_62993828
- [X] 2200384015_62993796
- [X] 2200384046_62993822
- [X] 2200384117_62993808
- [X] 2200384187_62993825
- [X] 2200384231_62992898
- [X] 2200385658_63060260
- [X] 2200394260_62994732
- [X] 2200395817_62994742
- [X] 2200396731_62994737
- [X] 2200424073_62999579
- [X] 2200424207_62999632
- [X] 2200426178_62999630
- [X] 2200426243_62999635
- [X] 2200426466_62999605
- [X] 2200426642_62999627
- [X] 2200427406_62999649
- [X] 2200427512_62999639
- [X] 2200428953_62999572
- [X] 2200428981_62999600
- [X] 2200428999_62999592
- [X] 2200441970_63000868
- [X] 2200441989_63000882
- [X] 2200442135_63000864
- [X] 2200442216_63000886
- [X] 2200442257_63000951
- [X] 2200451801_63003573
- [X] 2200451862_63004218
- [X] 2200451894_63004210
- [X] 2200456165_63051294
- [X] 2200459865_63004933
- [X] 2200459968_63004937
- [X] 2200460073_63004943
- [X] 2200460121_63004684
- [X] 2200467051_63003856
- [X] 2200467225_63004940
- [X] 2200467261_63004930
- [X] 2200467338_63004925
- [X] 2200470099_63004485
- [X] 2200470142_63004480
- [X] 2200471780_63004362
- [X] 2200480910_63006466
- [X] 2200495073_63010427
- [X] 2200495510_63009152
- [X] 2200508677_63060252
- [X] 2200510531_63012582
- [X] 2200510628_63012549
- [X] 2200510657_63012554
- [X] 2200511249_63012533
- [X] 2200511274_63012586
- [X] 2200517952_63060399
- [X] 2200519525_63060439
- [X] 2200524009_63014044
- [X] 2200524609_63014046
- [X] 2200524616_63014048
- [X] 2200533429_63060425
- [X] 2200539735_63060406
- [X] 2200549908_63019339
- [X] 2200549965_63019349
- [X] 2200550414_63019357
- [X] 2200550471_63020031
- [X] 2200550490_63019351
- [X] 2200550505_63019340
- [X] 2200555565_63018614
- [X] 2200559438_63020029
- [X] 2200559682_63020030
- [X] 2200559713_63019623
- [X] 2200559739_63019626
- [X] 2200569969_63019991
- [X] 2200570001_63021580
- [X] 2200570025_63021490
- [X] 2200570035_63021491
- [X] 2200570042_63021493
- [X] 2200570050_63021494
- [X] 2200579897_63024910
- [X] 2200583995_63024866
- [X] 2200584035_63024905
- [X] 2200584069_63024888
- [X] 2200584126_63024810
- [X] 2200589507_63026712
- [X] 2200597365_63027994
- [X] 2200597480_63027988
- [X] 2200597752_63026853
- [X] 2200597778_63027992
- [X] 22005977_63026903
- [X] 2200609031_63026527
- [X] 2200614198_63113928
- [X] 2200620372_63030821
- [X] 2200620442_63030810
- [X] 2200620498_63030816
- [X] 2200620628_63031031
- [X] 2200622310_63030984
- [X] 2200622355_63030956
- [X] 2200625369_63028699
- [X] 2200625410_63028697
- [X] 2200625536_63028694
- [X] 2200630189_63030665
- [X] 2200635149_63033182
- [X] 2200644544_63037731
- [X] 2200644594_63037725
- [X] 2200650089_63038093
- [X] 2200666292_63076568
- [X] 2200669188_63036688
- [X] 2200669320_63040259
- [X] 2200669383_63040254
- [X] 2200669414_63040257
- [X] 2200669446_63040251
- [X] 2200680342_63105271
- [X] 2200694535_63042853
- [X] 2200694789_63042862
- [X] 2200694858_63042702
- [X] 2200694917_63042696
- [X] 2200699290_63043047
- [X] 2200699345_63040238
- [X] 2200699383_63043050
- [X] 2200699412_63040731
- [X] 220071551_63048935
- [X] 2200731515_63048963
- [X] 2200748145_63051198
- [X] 2200748171_63051213
- [X] 2200751046_63051249
- [X] 2200751101_63051234
- [X] 2200766471_63054590
- [X] 2200767731_63054595
- [X] 2200767822_63054464
- [X] 2200775505_63060410
- [X] 2200850441_63019345
- [X] 220597589_63026879
- [X] 2300003253_63060430
- [X] 2300005679_63060370
- [X] 2300009914_63060390
- [X] 2300028784_63060001
- [X] 2300036815_63063357
- [X] 2300055382_63061874
- [X] 2300055421_63
transcript status
canonical status of transcript
APPRIS isoform annotation
transcript support level
biotype of transcript ("protein_coding" preferred)
CCDS status of transcript
consequence rank according to this table
translated, transcript or feature length (longer preferred)
"Wherever possible we would discourage you from summarising data in this way. "
**** DONE Mail alexis
CLOSED: [2023-08-20 Sun 13:45] SCHEDULED: <2023-08-20 Sun>
**** TODO Données simuscop 200x
SCHEDULED: <2023-12-21 Thu>
**** DONE En T2T avec liftover (filtre = spip) : ok mais lent et trop de variants :tests:
CLOSED: [2023-09-17 Sun 17:13] SCHEDULED: <2023-09-17 Sun>
1. Conversion en bed
#+begin_src sh :dir:~/code/sanger
open snvs-cento-sanger.csv | select chrom pos | insert pos2 {$in.pos } | to csv --separator="\t" | save snvs-cento-sanger.bed -f
#+end_src
2. Liftover avec UCSC (en ligne)
NB: vérifié sur le premier résultat en cherche le read contenant le variant (samtools view -r puis samtools view | grep en T2T) et avec l'aide d'IGV, on a un variant qui correspond en
chr1:10757746
3. En supposant que l'ordre des variants n'a pas changé, on ajoute simplement REF et ALT avec annotateLifted.jl
Annotation spip *très lente* : 1h13 !
Résultat:
2×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼──────────────────────────────────────
1 │ chr12:g.13594572 60.0 1
2 │ chr17:g.10204026 60.0 1
144 found over 146
filter depth : another 0 missed variants
filter poly : another 0 missed variants
filter vep : another 0 missed variants
Et on a trop de variants en sortie (7330 !)
**** DONE Mail Paul avec résultats filtre en T2T + nouveau schéma
CLOSED: [2023-09-17 Sun 23:15] SCHEDULED: <2023-09-17 Sun>
** TODO Medically relevant genes
SCHEDULED: <2023-12-26 Tue>
/Entered on/ [2023-10-18 Wed 22:37]
** TODO HG002 en T2T
/Entered on/ [2023-11-25 Sat 17:58]
https://github.com/marbl/HG002
*** TODO Tester les benchmark préliminaires
SCHEDULED: <2023-12-26 Tue>
https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/analysis/NIST_HG002_DraftBenchmark_defrabbV0.011-20230725/
* Ré-interprétation :reanalysis:
** DONE Lancer tests sur données brutes [225/250] <(samples.csv)> <(runs.waiting)>
CLOSED: [2023-10-14 Sat 11:58] SCHEDULED: <2023-10-08 Sun>
- [X] 100222_63015289
- [X] 1600304839_63051311
- [X] 1900007827_62913191
- [X] 1900398899_62999500
- [X] 1900486799_62913197
- [X] 2100422923_62952677
- [X] 2100458888_62933047
- [X] 2100601558_62903840
- [X] 2100609288_62905768
- [X] 2100609501_62905776
- [X] 2100614493_62951074
- [X] 2100622566_62908067
- [X] 2100622601_62908060
- [X] 2100622705_62908063
- [X] 2100640027_62911936
- [X] 2100645285_62913212
- [X] 2100661411_62914081
- [X] 2100661462_62914086
- [X] 2100708257_62921596
- [X] 2100738732_62926501
- [X] 2100738850_62926509
- [X] 2100746751_62926505
- [X] 2100746797_62926506
- [X] 2100782349_62931722
- [X] 2100782416_62931561
- [X] 2100782559_62931718
- [X] 2100799204_62934768
- [X] 2200010202_62940284
- [X] 2200023600_62940631
- [X] 2200024348_62999591
- [X] 2200027505_62942457
- [X] 2200038776_62943412
- [X] 2200041919_62943405
- [X] 2200088014_62951326
- [X] 2200146652_62959388
- [X] 2200151850_62960953
- [X] 2200160014_62959475
- [X] 2200160070_62959478
- [X] 2200201368_62967471
- [X] 2200201400_62967470
- [X] 2200265558_62976332
- [X] 2200265605_62976401
- [X] 2200267046_62975192
- [X] 2200273878_62999530
- [X] 2200279708_62977002
- [X] 2200284408_62979102
- [X] 2200293987_62979116
- [X] 2200294359_62979118
- [X] 2200306299_62982217
- [X] 2200306539_62982193
- [X] 220030671_62982211
- [X] 2200307058_62982231
- [X] 2200307108_62982196
- [X] 2200307136_62982221
- [X] 2200307199_62982239
- [X] 2200307230_62982234
- [X] 2200307262_62982219
- [X] 2200307297_62982227
- [X] 2200324510_62985453
- [X] 2200324549_62985478
- [X] 2200324573_62985445
- [X] 2200324594_62985467
- [X] 2200324606_62985463
- [X] 2200324614_62985459
- [X] 2200338306_62985430
- [X] 2200343880_62989407
- [X] 2200343910_62989460
- [X] 2200343938_62989451
- [X] 2200343966_62989456
- [X] 2200343993_62989440
- [X] 2200344013_62989464
- [X] 2200349749_62989465
- [X] 2200363462_62988848
- [X] 2200377880_62991993
- [X] 2200378032_62991991
- [X] 2200383996_62993828
- [X] 2200384015_62993796
- [X] 2200384046_62993822
- [X] 2200384117_62993808
- [X] 2200384187_62993825
- [X] 2200384231_62992898
- [X] 2200385658_63060260
- [X] 2200394260_62994732
- [X] 2200395817_62994742
- [X] 2200396731_62994737
- [X] 2200424073_62999579
- [X] 2200424207_62999632
- [X] 2200426178_62999630
- [X] 2200426243_62999635
- [X] 2200426466_62999605
- [X] 2200426642_62999627
- [X] 2200427406_62999649
- [X] 2200427512_62999639
- [X] 2200428953_62999572
- [X] 2200428981_62999600
- [X] 2200428999_62999592
- [X] 2200441970_63000868
- [X] 2200441989_63000882
- [X] 2200442135_63000864
- [X] 2200442216_63000886
- [X] 2200442257_63000951
- [X] 2200451801_63003573
- [X] 2200451862_63004218
- [X] 2200451894_63004210
- [X] 2200456165_63051294
- [X] 2200459865_63004933
- [X] 2200459968_63004937
- [X] 2200460073_63004943
- [X] 2200460121_63004684
- [X] 2200467051_63003856
- [X] 2200467225_63004940
- [X] 2200467261_63004930
- [X] 2200467338_63004925
- [X] 2200470099_63004485
- [X] 2200470142_63004480
- [X] 2200471780_63004362
- [X] 2200480910_63006466
- [X] 2200495073_63010427
- [X] 2200495510_63009152
- [X] 2200508677_63060252
- [X] 2200510531_63012582
- [X] 2200510628_63012549
- [X] 2200510657_63012554
- [X] 2200511249_63012533
- [X] 2200511274_63012586
- [X] 2200517952_63060399
- [X] 2200519525_63060439
- [X] 2200524009_63014044
- [X] 2200524609_63014046
- [X] 2200524616_63014048
- [X] 2200533429_63060425
- [X] 2200539735_63060406
- [X] 2200549908_63019339
- [X] 2200549965_63019349
- [X] 2200550414_63019357
- [X] 2200550471_63020031
- [X] 2200550490_63019351
- [X] 2200550505_63019340
- [X] 2200555565_63018614
- [X] 2200559438_63020029
- [X] 2200559682_63020030
- [X] 2200559713_63019623
- [X] 2200559739_63019626
- [X] 2200569969_63019991
- [X] 2200570001_63021580
- [X] 2200570025_63021490
- [X] 2200570035_63021491
- [X] 2200570042_63021493
- [X] 2200570050_63021494
- [X] 2200579897_63024910
- [X] 2200583995_63024866
- [X] 2200584035_63024905
- [X] 2200584069_63024888
- [X] 2200584126_63024810
- [X] 2200589507_63026712
- [X] 2200597365_63027994
- [X] 2200597480_63027988
- [X] 2200597752_63026853
- [X] 2200597778_63027992
- [X] 22005977_63026903
- [X] 2200609031_63026527
- [X] 2200614198_63113928
- [X] 2200620372_63030821
- [X] 2200620442_63030810
- [X] 2200620498_63030816
- [X] 2200620628_63031031
- [X] 2200622310_63030984
- [X] 2200622355_63030956
- [X] 2200625369_63028699
- [X] 2200625410_63028697
- [X] 2200625536_63028694
- [X] 2200630189_63030665
- [X] 2200635149_63033182
- [X] 2200644544_63037731
- [X] 2200644594_63037725
- [X] 2200650089_63038093
- [X] 2200666292_63076568
- [X] 2200669188_63036688
- [X] 2200669320_63040259
- [X] 2200669383_63040254
- [X] 2200669414_63040257
- [X] 2200669446_63040251
- [X] 2200680342_63105271
- [X] 2200694535_63042853
- [X] 2200694789_63042862
- [X] 2200694858_63042702
- [X] 2200694917_63042696
- [X] 2200699290_63043047
- [X] 2200699345_63040238
- [X] 2200699383_63043050
- [X] 2200699412_63040731
- [X] 220071551_63048935
- [X] 2200731515_63048963
- [X] 2200748145_63051198
- [X] 2200748171_63051213
- [X] 2200751046_63051249
- [X] 2200751101_63051234
- [X] 2200766471_63054590
- [X] 2200767731_63054595
- [X] 2200767822_63054464
- [X] 2200775505_63060410
- [X] 2200850441_63019345
- [X] 220597589_63026879
- [X] 2300003253_63060430
- [X] 2300005679_63060370
- [X] 2300009914_63060390
- [X] 2300028784_63060001
- [X] 2300036815_63063357
- [X] 2300055382_63061874
- [X] 2300055421_63
* Myélome
** Principaux réactifs
*** TP
- TQ = II,V,VII,X et fibrinogène
- insensible héparine dose thérapeutique et fibrinogène > 0.6 ou < 6
- temps de coagulation d'un plasma citraté avec facteur tissulaire + phospholipides en excès + calcium
- conversion en % avec droite de Thivolle -> TP
- TP : spécifique seulement pour VII
- Réactif := thromboplastine
- phospholipides
- facteur tissulaire
- calcium
- inhibiteur héparine
- forte concentration en phospholipides donc TP peu sensible aux lupes anticoagulants
- modifié par debigatran et rivaroxaban surtout (moins apixaban)
*** TCA
- TCA : mesure temps malade / temps témoin (ref < 1.2)
- Calcul temps témoins à chaque changement de lot, validation de réactif sur 30 sujets sains
- Réactif : céphaline : phospholipides + activateur (kaolin)
- Allongement variable par HNF en curatif variable
- Choisir le réactif selon indication :
- dépistage risque hémorragique en pré-op
- dépistage lupus anticoagulant
- Peu sensible au antiXa, sensible anti IIa
Dosage fibrinogène avec thrombine calcique en excès + inhibiteur héparine : mesure du temps et conversion en g/L avec droite d'étalonnage
* Maladies
** Myélome
** Polyglobulie de Vaquez
Diagnostic :
A1 + A2 + {A3 ou B}
A1 : Hb > 16.5 (16 femmes) ou Hct > 0.49 (0.48) ou masse globulaire totale > 125%
A2 : biopsie méudllaire : hypercellularité, panmyélose, mégacaryocytes mature pléiomorphe
A3: mutation JAK2 (V617F)
B : diminution EPO
NB: on peut éviter la biopsie osseuse si Hb > 18.5 avec JAK2 et EPO mais elle permet de voir une fibrose initiale (permet de dépister myelofibrose)
Risque :
- vasculaire (court terme) : thromobse art/vein, hémorragie (!)
- hémato (long terme) : myélofibrose, myélodysplasie, leucémie aigue
Traitement :
1. aspirine/saignée
2. si haut risque : hydroxyurée ou ropeginterferon