ub.com/NixOS/nixpkgs/issues/192396][Bug report Version 22.10.6]]
**** Notes
Error:
ERROR: Cannot download nextflow required file -- make sure you can connect to the internet
Alternatively you can try to download this file:
https://www.nextflow.io/releases/v22.10.6/nextflow-22.10.6-all.jar
and save it as:
.//nix/store/md2b1ah4d7ivj82k8xxap30dmdci00pa-nextflow-22.10.6/bin/.nextflow-wrapped
The update creates a virtual environment, which breaks running nextflow (it needs to download files).
Fix: disable it
**** KILL Patch NXF_OFFLINE=true
CLOSED: [2023-07-02 Sun 11:02] SCHEDULED: <2023-06-11 Sun>
** WAIT [[https://github.com/NixOS/nixpkgs/pull/249329][Multiqc]]
HG002,sanger-chr20,data/HG002-sanger-inserted-chr20_1.fq.gz,data/HG002-sanger-inserted-chr20_2.fq.gz
** KILL Mutalyzer
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-13 Sun>
Packaging is feasible but requires many Python packages.
** TODO Variant validator -> hgvs
It is just an interface around hgvs, but it requires:
- postgresql
- access to the databases, or a local download of them
Dependencies: wcwidth, pyee, pure-eval, ptyprocess, pickleshare, parsley, parse, fake-useragent, executing, backcall, appdirs, zipp, websockets, w3lib, urllib3, traitlets, tqdm, tabulate, sqlparse, soupsieve, six, pygments, psycopg2, prompt-toolkit, pexpect, parso, lxml, idna, humanfriendly, decorator, cython, cssselect, configparser, charset-normalizer, certifi, attrs, requests, pysam, pyquery, matplotlib-inline, jedi, importlib-metadata, coloredlogs, beautifulsoup4, asttokens, yoyo-migrations, stack-data, pyppeteer, bs4, bioutils, requests-html, ipython, biocommons.seqrepo, hgvs
** TODO SPIP T2T
*** DONE PR upstream
CLOSED: [2023-08-12 Sat 18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** DONE Mail R. Lemann
CLOSED: [2023-08-12 Sat 18:23] SCHEDULED: <2023-08-12 Sat 18:00>
*** TODO Update nix packages
** TODO VEP :vep:
*** DONE [[https://github.com/NixOS/nixpkgs/pull/185691][BioPerl]]
SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-09 Tue 10:57]
PR submitted
*** TODO BioDBBBigFile
:PROPERTIES:
:ORDERED: t
:END:
/Entered on/ [2022-08-10 Wed 14:28]
We now use the latest version of kent, so the problem is gone.
PR ready to be merged. Rebase done <2023-07-02 Sun>
**** DONE kent version already packaged: force version 335
CLOSED: [2023-07-02 Sun 11:20]
***** KILL [[https://github.com/NixOS/nixpkgs/pull/206991][Restore building kent 404]]
CLOSED: [2023-05-06 Sat 17:40]
Review done <2023-03-26 Sun>, awaiting merge
Pinged again <2023-05-06 Sat>
Kent 446 does not have this problem, so the PR is unnecessary.
***** DONE [[https://github.com/NixOS/nixpkgs/pull/223411][Add the headers to the package]] (inc folder)
CLOSED: [2023-05-08 Mon 10:18] SCHEDULED: <2023-05-07 Sun>
Review to do
https://github.com/NixOS/nixpkgs/pull/223411
Fixed; the previous PR is no longer needed.
***** KILL [[https://github.com/NixOS/nixpkgs/pull/186462][BioDBBBigFile]] with these 2 changes
CLOSED: [2023-07-02 Sun 11:20]
**** KILL kent version already packaged: 404
CLOSED: [2023-03-27 Mon 16:43]
Compiles, but the tests do not pass.
**** DONE Modify according to PR https://github.com/NixOS/nixpkgs/pull/186462
CLOSED: [2023-07-30 Sun 22:01] SCHEDULED: <2023-07-30 Sun 20:00>
:LOGBOOK:
CLOCK: [2023-07-30 Sun 19:13]--[2023-07-30 Sun 20:50] => 1:37
:END:
Changes required for kent:
- no more patch
- removal of a loop in postPatch
We also remove NIX_BUILD_TOP.
**** TODO Fix the biobigfile PR
SCHEDULED: <2023-10-20 Fri>
/Entered on/ [2023-10-15 Sun 17:21]
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186459][BioDBHTS]]
CLOSED: [2023-05-06 Sat 08:49] SCHEDULED: <2023-04-15 Sat>
/Entered on/ [2022-08-10 Wed 14:28]
Review fixes done <2022-10-10 Mon>
*** DONE [[https://github.com/NixOS/nixpkgs/pull/186464][BioExtAlign]]
CLOSED: [2022-10-22 Sat 12:43] SCHEDULED: <2022-08-10 Wed>
/Entered on/ [2022-08-10 Wed 14:28]
Review <2022-10-10 Mon>, fixed the same day.
Second round of fixes done, waiting.
Could not get the tests to work: the module Bio::Tools::Align is not found because it lives in a different folder of the repository. Even building the whole repository does not help, so we skip the tests.
*** TODO VEP
** WAIT [[https://github.com/NixOS/nixpkgs/pull/230394][rtg-tools]] :vcfeval:
Submitted
** WAIT Package Spip https://github.com/NixOS/nixpkgs/pull/247476
** TODO Happy :happy:
*** TODO PR python 3 upstream
SCHEDULED: <2023-10-21 Sat>
*** TODO nixpkgs as is
SCHEDULED: <2023-10-21 Sat>
** PROJ SpliceAI
** TODO Bamsurgeon
/Entered on/ [2023-05-13 Sat 19:11]
*** TODO Velvet
** TODO Picard PR with an option to manage memory
Similar to
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/picard/picard.sh
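A minimal sketch of what such a memory option could look like, assuming a POSIX shell wrapper around a hypothetical =picard.jar=. The function name =picard_cmd=, the jar path and the default heap flags are illustrative assumptions; this is not the nixpkgs wrapper and not a copy of the bioconda script. The command is only printed, not run.

#+begin_src sh
# Hypothetical wrapper sketch: -Xm* flags go to the JVM, everything else to Picard.
# Arguments containing spaces are not handled (kept simple on purpose).
picard_cmd() {
  jvm_mem_opts=""
  pass_args=""
  for arg in "$@"; do
    case "$arg" in
      -Xm*) jvm_mem_opts="${jvm_mem_opts:+$jvm_mem_opts }$arg" ;;
      *)    pass_args="${pass_args:+$pass_args }$arg" ;;
    esac
  done
  # Assumed default heap settings when the caller gives none
  [ -n "$jvm_mem_opts" ] || jvm_mem_opts="-Xms512m -Xmx2g"
  echo "java $jvm_mem_opts -jar picard.jar $pass_args"
}

picard_cmd -Xmx8g MarkDuplicates I=in.bam O=out.bam
#+end_src

Keeping the memory flags separate from the tool arguments lets a user raise the heap size per call without editing the wrapper.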
* Julia :julia:
** KILL XAM.jl: PR for record modification :julia:
CLOSED: [2023-05-29 Mon 15:40] SCHEDULED: <2023-05-28 Sun>
/Entered on/ [2023-05-27 Sat 22:39]
** TODO XAMscissors.jl :xamscissors:
Modifies the sequence in a BAM file.
*The CIGAR is not updated*
We convert to fastq and rerun the pipeline to "fix" it.
#+begin_src sh
cd /home/alex/code/bisonex/out/63003856/preprocessing/mapped
samtools view 63003856_S135.bam NC_000022.11 -o 63003856_S135_chr22.bam
cd /home/alex/recherche/bisonex/code/BamScissors.jl
cp ~/code/bisonex/out/63003856/preprocessing/mapped/63003856_S135_chr22.bam .
samtools index 63003856_S135_chr22.bam
#+end_src
The script modifies the BAM, sorts it and generates the fastq.
!!! Warning: do not forget the -n option !!!
#+begin_src sh
time julia --project=.. insertVariant.jl
scp 63003856_S135_chr22_{1,2}.fq.gz meso:/Work/Users/apraga/bisonex/tests/bamscissors/
#+end_src
*** WAIT Implement SNVs with VAF :snv:
Strategy:
1. compute the depth at the positions
2. build a dictionary { read name : position in the dataframe }
3. iterate over all reads and modify the marked ones
**** DONE VAF = 1
CLOSED: [2023-05-29 Mon 15:34]
**** DONE VAF drawn from a normal distribution
CLOSED: [2023-05-29 Mon 15:35]
Truncated if > 1
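A minimal sketch of such a draw, written in awk for illustration only (it is not the Julia code of XAMscissors). The function name =sample_vaf=, the mean, the standard deviation and the seed are hypothetical. Capping the value at 1 is one reading of the truncation noted above, and the clamp at 0 is an added assumption.

#+begin_src sh
# Draw one VAF from a normal distribution (Box-Muller transform) and cap it.
# $1 = mean, $2 = standard deviation, $3 = seed (all hypothetical parameters)
sample_vaf() {
  awk -v mu="$1" -v sd="$2" -v seed="$3" 'BEGIN {
    srand(seed)
    u1 = 1 - rand()          # in (0, 1], avoids log(0)
    u2 = rand()
    z = sqrt(-2 * log(u1)) * cos(2 * 3.141592653589793 * u2)
    vaf = mu + sd * z
    if (vaf > 1) vaf = 1     # cap at 1, as in the note above
    if (vaf < 0) vaf = 0     # assumption: also clamp negative draws
    printf "%.4f\n", vaf
  }'
}

sample_vaf 0.5 0.1 42
#+end_src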
**** WAIT Unit tests
***** DONE NA12878: 1 gene on chromosome 22
CLOSED: [2023-05-30 Tue 23:55]
root = "https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/"
#+begin_src sh
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878.bwa.markDuplicates.bam chr22 -o project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam
samtools view project.NIST_NIST7035_H7AP8ADXX_NA12878_chr22.bam chr22:19419700-19424000 -o NIST7035_H7AP8ADXX_NA12878_chr22_MRPL40_hg19.bam
#+end_src
***** WAIT Pull request FormatSpecimens
https://github.com/BioJulia/FormatSpecimens.jl/pull/8
***** DONE Formatspecimens
CLOSED: [2023-05-29 Mon 23:03]
****** DONE 1 read
CLOSED: [2023-05-29 Mon 23:02]
****** DONE VAF on 1 exon
CLOSED: [2023-05-29 Mon 23:03]
**** DONE [#A] Bug: many reads lost with NA12878
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-18 Fri>
:PROPERTIES:
:ID: 5c1c36f3-f68e-4e6d-a7b6-61dca89abc37
:END:
E.g. chrX:g.124056226: we go from 65 reads down to 1
xamscissors test: no problem...
We test on this position +/- 200 bp
#+begin_src sh :dir /home/alex/roam/research/bisonex/code/sanger
samtools view /home/alex/code/bisonex/out/2300346867_NA12878-63118093_S260-GRCh38/preprocessing/mapped/2300346867_NA12878-63118093_S260-GRCh38.bam chrX:124056026-124056426 -o chrXsmall.bam
#+end_src
#+RESULTS:
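
The region passed to samtools above is the example position with 200 bp on each side. A small sketch of that arithmetic (the variable names are illustrative):

#+begin_src sh
pos=124056226   # position from the example above
flank=200       # bp kept on each side
region="chrX:$((pos - flank))-$((pos + flank))"
echo "$region"
#+end_src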
***** DONE Check depth with the latest version
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
****** DONE chr20: depth ok
SCHEDULED: <2023-08-19 Sat>
****** DONE all data
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-19 Sat>
OK for 7 variants (checked in IGV), notably on chromosome X
*** TODO Implement indels with VAF :indel:
*** TODO Package submission
* Data
:PROPERTIES:
:CATEGORY: data
:END:
** DONE Replace bam with fastq on the mesocentre
CLOSED: [2023-04-16 Sun 16:33]
Command
*** DONE Remove non-"paired" fastq files
CLOSED: [20
644 | Acc | 0.0000003317384 | No | Acc | 89894637 | 7 | 89894644 | 0.0000002205815 | No | 89894637 | 0.02545572 | No | 0.02545572 | No |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
**** DONE Check multiple transcripts in hg38 with genomic coordinates: ok
CLOSED: [2023-08-10 Thu 23:00]
Many more transcripts in T2T
E.g. 1 RefSeq curated transcript
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108257446%2D108257496&hgsid=1672963428_J5aWAqack2FpJ7mvhFTNVw7bKzxo
vs 2 transcripts in T2T
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hub_3671779_hs1&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr11%3A108264969%2D108265019&hgsid=1672963612_Eso9frdQ7z6RkKkcKsIf2Waq3pec
This matches what we get with spip.
*** DONE [#A] VEP filter with spip
CLOSED: [2023-08-13 Sun 00:39] SCHEDULED: <2023-08-12 Sat 19:00>
*** DONE CADD + spliceAI annotation for GRCh38 with the new version :annotation:
CLOSED: [2023-08-28 Mon 17:21] SCHEDULED: <2023-08-20 Sun>
*** DONE OMIM: only possible on gene name :annotation:
CLOSED: [2023-08-13 Sun 11:57] SCHEDULED: <2023-08-13 Sun 16:00>
The database is not available and updating it ourselves is complicated.
If we take the genes from GRCh38, they are not necessarily in T2T.
E.g. DDX11L17 does not exist in T2T at these coordinates:
zgrep DDX11L17 GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz
Note: it is a pseudogene
https://www.genecards.org/cgi-bin/carddisp.pl?gene=DDX11L17
If we take the genes from T2T, some of them are new.
E.g. the first one is LOC101928626.
There is nothing at this position in GRCh38.
Trying with ENSEMBL does not work either, because the identifiers differ.
E.g. ACHE
Ideally we would use the NCBI identifier (available in OMIM), but VEP does not output it.
It also requires the "merged" version, so it is impossible in T2T.
Is it feasible to match on the gene name?
All T2T genes:
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_009914755.1_T2T-CHM13v2.0_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > t2t-genes.txt
wc -l t2t-genes.txt
#+end_src
#+RESULTS:
: 57660 t2t-genes.txt
#+begin_src sh :dir ~/Downloads
zgrep -o "ID=gene[^;]*;" GCF_000001405.40_GRCh38.p14_genomic.gff.gz | sed 's/ID=gene-//;s/;//' | sort | uniq > hg38-genes.txt
wc -l hg38-genes.txt
#+end_src
#+RESULTS:
: 67127 hg38-genes.txt
Genes common to both
#+begin_src sh :dir ~/Downloads
comm -12 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 54506
Genes only in T2T
#+begin_src sh :dir ~/Downloads
comm -23 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 3154
Genes only in GRCh38
#+begin_src sh :dir ~/Downloads
comm -13 t2t-genes.txt hg38-genes.txt | wc -l
#+end_src
#+RESULTS:
: 12621
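The three counts are consistent with the list sizes: 54506 + 3154 = 57660 (T2T) and 54506 + 12621 = 67127 (GRCh38). A minimal sketch of the same check on toy lists; the gene names and temporary files are made up for illustration and stand in for t2t-genes.txt and hg38-genes.txt.

#+begin_src sh
export LC_ALL=C   # same collation for sort and comm
tmp=$(mktemp -d)
# Toy gene lists (hypothetical names), sorted and deduplicated as comm expects
printf '%s\n' GENE_A GENE_B GENE_C | sort -u > "$tmp/a.txt"
printf '%s\n' GENE_B GENE_C GENE_D GENE_E | sort -u > "$tmp/b.txt"
common=$(comm -12 "$tmp/a.txt" "$tmp/b.txt" | wc -l)
only_a=$(comm -23 "$tmp/a.txt" "$tmp/b.txt" | wc -l)
only_b=$(comm -13 "$tmp/a.txt" "$tmp/b.txt" | wc -l)
total_a=$(wc -l < "$tmp/a.txt")
total_b=$(wc -l < "$tmp/b.txt")
# Every gene of a list is either shared or unique to that list
[ $((common + only_a)) -eq $((total_a)) ] &&
  [ $((common + only_b)) -eq $((total_b)) ] &&
  echo consistent
rm -r "$tmp"
#+end_src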
*** HOLD OMIM on gene name :annotation:
*** DONE Mobidetails API
CLOSED: [2023-09-10 Sun 16:44]
Too slow: 1 h to 1 h 30 of run time
Available in a module
*** DONE VEP filter with spip for T2T and spliceAI for GRCh38
CLOSED: [2023-09-16 Sat 22:47]
*** DONE Rerun GRCh38 tests with the new filter (spip or spliceAI) :sanger:
CLOSED: [2023-09-17 Sun 09:07] SCHEDULED: <2023-09-16 Sat>
*** HOLD Franklin API
https://www.postman.com/genoox-ps/workspace/franklin-api-documentation-s-public-workspace/documentation/6621518-4335389d-12e3-445f-8182-339df95b2a09
*** KILL Check whether clinical data is available with VEP :annotation:
CLOSED: [2023-09-10 Sun 16:44]
*** DONE Test filter without splice: 6130 but 4 are missing
CLOSED: [2023-10-18 Wed 22:50] SCHEDULED: <2023-09-27 Wed>
Email from Paul: exome data, so apart from splice this is of little interest
**** DONE Remove the splice condition entirely: 6130 variants remaining...
CLOSED: [2023-09-27 Wed 19:37] SCHEDULED: <2023-09-26 Tue>
Cf [[id:c9b2009a-503b-4561-94c6-29ae21a3188d][VEP filter with spliceAI: 37365 -> 6130]]
In tests/splicai
#+begin_src sh
filter_vep -i output-all-gpu.vcf --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o test.vcf
grep -c -v '^#' test.vcf
# 6130
#+end_src
**** DONE Replace with functional impact: little effect, majority = MODERATE
CLOSED: [2023-09-27 Wed 19:45] SCHEDULED: <2023-09-26 Tue>
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is HIGH" --only_matched | grep -c -v '^#'
258
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is LOW" --only_matched | grep -c -v '^#'
11
filter_vep -i output-all-gpu-filtered.vcf --format vcf --filter "IMPACT is MODERATE" --only_matched | grep -c -v '^#'
5824
**** DONE Look at the consequences for all transcripts
CLOSED: [2023-09-27 Wed 21:04]
/Work/Users/apraga/bisonex/out/annotate/vep/NA12878-sanger-all-T2T
filter_vep -i NA12878-sanger-all-T2T.vep.vcf.gz --format vcf --filter " not(Consequence matches non_coding_transcript or Consequence matches stream or Consequence matches intergenic_variant or Consequence matches UTR or Consequence matches intron_variant or Consequence matches synonymous or BIOTYPE matches pseudogene or BIOTYPE matches misc_RNA)" --only_matched -o filtered.vcf
bcftools +split-vep filtered.vcf -f '%Consequence\n' -d | sort | uniq -c
94 coding_sequence_variant
13 coding_sequence_variant&NMD_transcript_variant
257 frameshift_variant
21 frameshift_variant&NMD_transcript_variant
2 frameshift_variant&splice_donor_region_variant
20 frameshift_variant&splice_region_variant
1 frameshift_variant&splice_region_variant&NMD_transcript_variant
1 incomplete_terminal_codon_variant&coding_sequence_variant
211 inframe_deletion
18 inframe_deletion&NMD_transcript_variant
6 inframe_deletion&splice_region_variant
242 inframe_insertion
22 inframe_insertion&NMD_transcript_variant
4 inframe_insertion&splice_region_variant
14689 missense_variant
1416 missense_variant&NMD_transcript_variant
6 missense_variant&splice_donor_5th_base_variant
374 missense_variant&splice_region_variant
34 missense_variant&splice_region_variant&NMD_transcript_variant
53 splice_acceptor_variant
11 splice_acceptor_variant&NMD_transcript_variant
79 splice_donor_variant
6 splice_donor_variant&NMD_transcript_variant
30 start_lost
5 start_lost&NMD_transcript_variant
135 stop_gained
13 stop_gained&frameshift_variant
3 stop_gained&frameshift_variant&NMD_transcript_variant
2 stop_gained&frameshift_variant&splice_region_variant
14 stop_gained&NMD_transcript_variant
5 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&NMD_transcript_variant
4 stop_lost
1 stop_lost&NMD_transcript_variant
9 stop_retained
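The tally above counts each combined string (e.g. missense_variant&splice_region_variant) as its own category. A minimal sketch of how individual terms could be counted instead by splitting on =&=. The function name =count_terms= is hypothetical and the input lines are made up for illustration, not taken from these runs.

#+begin_src sh
# Count individual consequence terms from '&'-joined strings read on stdin.
count_terms() {
  tr '&' '\n' | sort | uniq -c | awk '{print $1, $2}' | sort -k1,1nr -k2,2
}

# Illustrative input, one consequence string per line
printf '%s\n' \
  'missense_variant' \
  'missense_variant&splice_region_variant' \
  'stop_gained&NMD_transcript_variant' \
  'missense_variant&NMD_transcript_variant' | count_terms
#+end_src

In practice the input would be the output of the bcftools +split-vep command used above.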
ameshift_variant&start_lost&start_retained_variant
37 inframe_deletion
9 inframe_deletion&nmd_transcript_variant
27 inframe_insertion
5 inframe_insertion&nmd_transcript_variant
21 intron_variant
1593 missense_variant
37 missense_variant&nmd_transcript_variant
17 missense_variant&splice_region_variant
1 missense_variant&splice_region_variant&nmd_transcript_variant
1 protein_altering_variant
1 splice_acceptor_variant
1 splice_acceptor_variant&frameshift_variant
2 splice_acceptor_variant&nmd_transcript_variant
3 splice_donor_5th_base_variant&intron_variant
1 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
2 splice_donor_region_variant&intron_variant
1 splice_donor_region_variant&intron_variant&nmd_transcript_variant
1 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
10 splice_donor_variant
1 splice_donor_variant&non_coding_transcript_variant
11 splice_polypyrimidine_tract_variant&intron_variant
1 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
1 splice_region_variant&intron_variant
9 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
3 splice_region_variant&synonymous_variant
1 splice_region_variant&synonymous_variant&nmd_transcript_variant
4 start_lost
19 stop_gained
2 stop_gained&frameshift_variant
2 stop_gained&nmd_transcript_variant
1 stop_gained&splice_region_variant
1 stop_gained&splice_region_variant&nmd_transcript_variant
3 stop_lost
2 stop_lost&nmd_transcript_variant
1 stop_retained_variant
18 synonymous_variant
1 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
T2T
[apraga@mesointeractive filter]$ bcftools +split-vep 2300346867_NA12878-63118093_S260-T2T/2300346867_NA12878-63118093_S260-T2T.filtervep.vcf -f '%Consequence\n' -d -s worst | sort | uniq -c
15 3_prime_utr_variant
11 3_prime_utr_variant&nmd_transcript_variant
51 5_prime_utr_variant
3 5_prime_utr_variant&nmd_transcript_variant
48 coding_sequence_variant
5 coding_sequence_variant&nmd_transcript_variant
3 downstream_gene_variant
121 frameshift_variant
9 frameshift_variant&nmd_transcript_variant
1 frameshift_variant&splice_donor_region_variant
9 frameshift_variant&splice_region_variant
78 inframe_deletion
2 inframe_deletion&nmd_transcript_variant
2 inframe_deletion&splice_region_variant
84 inframe_insertion
2 inframe_insertion&nmd_transcript_variant
1 inframe_insertion&splice_region_variant
16 intergenic_variant
368 intron_variant
21 intron_variant&nmd_transcript_variant
71 intron_variant&non_coding_transcript_variant
5187 missense_variant
207 missense_variant&nmd_transcript_variant
3 missense_variant&splice_donor_5th_base_variant
105 missense_variant&splice_region_variant
9 missense_variant&splice_region_variant&nmd_transcript_variant
33 non_coding_transcript_exon_variant
12 splice_acceptor_variant
1 splice_acceptor_variant&5_prime_utr_variant&intron_variant&nmd_transcript_variant
1 splice_acceptor_variant&nmd_transcript_variant
3 splice_acceptor_variant&non_coding_transcript_variant
1 splice_acceptor_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
16 splice_donor_5th_base_variant&intron_variant
2 splice_donor_5th_base_variant&intron_variant&non_coding_transcript_variant
33 splice_donor_region_variant&intron_variant
4 splice_donor_region_variant&intron_variant&nmd_transcript_variant
7 splice_donor_region_variant&intron_variant&non_coding_transcript_variant
19 splice_donor_variant
1 splice_donor_variant&nmd_transcript_variant
2 splice_donor_variant&non_coding_transcript_variant
3 splice_donor_variant&splice_donor_5th_base_variant&coding_sequence_variant&intron_variant
64 splice_polypyrimidine_tract_variant&intron_variant
6 splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
8 splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
2 splice_region_variant&3_prime_utr_variant
2 splice_region_variant&5_prime_utr_variant
4 splice_region_variant&intron_variant
6 splice_region_variant&non_coding_transcript_exon_variant
54 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant
4 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&nmd_transcript_variant
5 splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant
27 splice_region_variant&synonymous_variant
13 start_lost
31 stop_gained
4 stop_gained&frameshift_variant
2 stop_gained&frameshift_variant&splice_region_variant
3 stop_gained&nmd_transcript_variant
2 stop_gained&splice_region_variant
2 stop_gained&splice_region_variant&nmd_transcript_variant
2 stop_lost
1 stop_lost&nmd_transcript_variant
6 stop_retained_variant
2 stop_retained_variant&nmd_transcript_variant
349 synonymous_variant
17 synonymous_variant&nmd_transcript_variant
1 transcript_ablation
2 upstream_gene_variant
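To compare two such tallies side by side (for example GRCh38 against T2T), the raw uniq -c outputs can be joined on the consequence name. This is only a sketch: the function name compare_counts and the toy input are hypothetical, and each input file is assumed to hold one "count label" pair per line.

#+begin_src sh
# Join two `uniq -c` outputs on the label and print: label, count in A, count in B.
# A label missing from one side gets a count of 0.
compare_counts() {
    a=$(mktemp); b=$(mktemp)
    tab=$(printf '\t')
    # Rewrite as "label<TAB>count" and sort bytewise, since join needs sorted keys.
    awk '{ print $2 "\t" $1 }' "$1" | LC_ALL=C sort -t "$tab" -k1,1 > "$a"
    awk '{ print $2 "\t" $1 }' "$2" | LC_ALL=C sort -t "$tab" -k1,1 > "$b"
    LC_ALL=C join -t "$tab" -a 1 -a 2 -e 0 -o 0,1.2,2.2 "$a" "$b"
    rm -f "$a" "$b"
}

# Toy input standing in for two saved tallies.
x=$(mktemp); y=$(mktemp)
printf '   4 stop_lost\n  30 start_lost\n' > "$x"
printf '   2 stop_lost\n   1 transcript_ablation\n' > "$y"
compare_counts "$x" "$y"
rm -f "$x" "$y"
#+end_src

Sorting and joining under LC_ALL=C keeps labels that share a prefix (stop_gained, stop_gained&frameshift_variant) in an order both tools agree on.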
*** DONE Look at the VEP annotation of the variants on unprocessed NA12878 :na12878:
CLOSED: [2023-10-18 Wed 22:50] SCHEDULED: <2023-10-16 Mon>
/Entered on/ [2023-10-16 Mon 19:39]
*** DONE Check whether the variants fall in regions modified in T2T
CLOSED: [2023-10-19 Thu 17:19] SCHEDULED: <2023-10-18 Wed>
/Entered on/ [2023-10-18 Wed 22:49]
Liftover of the variants from GRCh38 to T2T
See ~/roam/research/bisonex/code/t2t/comparePositions.jl
#+begin_quote
Successfully converted 1896 records: View Conversions
Conversion failed on 17 records.
#+end_quote
We use t2tOnly()
Number of variants per chromosome
julia> @by d :Column1 $nrow
24×2 DataFrame
Row │ Column1 nrow
│ String7 Int64
─────┼────────────────
1 │ chr1 678
2 │ chr2 369
3 │ chr3 287
4 │ chr4 224
5 │ chr5 258
6 │ chr6 430
7 │ chr7 321
8 │ chr8 218
9 │ chr9 251
10 │ chr10 275
11 │ chr11 489
12 │ chr12 350
13 │ chr13 74
14 │ chr14 185
15 │ chr15 171
16 │ chr16 283
17 │ chr17 364
18 │ chr18 82
19 │ chr19 550
20 │ chr20 142
21 │ chr21 93
22 │ chr22 171
23 │ chrX 98
24 │ chrY 1
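The same per-chromosome tally can be sketched in the shell, assuming the variants are available as a BED file with the chromosome in column 1. The function name count_by_chrom and the toy input are illustrative and not part of the pipeline.

#+begin_src sh
# Count BED records per chromosome (column 1), most frequent first.
count_by_chrom() {
    cut -f 1 "$1" | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ print $2 "\t" $1 }'
}

# Toy input: three records on chr1, one on chrY.
tmp=$(mktemp)
printf 'chr1\t10\t11\nchrY\t5\t6\nchr1\t20\t21\nchr1\t30\t31\n' > "$tmp"
count_by_chrom "$tmp"
rm -f "$tmp"
#+end_src

The counts in the table above sum to 6364, which matches the line count reported for na12878-t2t-only.bed, so that BED is a plausible input for this function.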
*** TODO Check whether the variants fall on new genes
SCHEDULED: <2023-10-19 Thu>
#+begin_src sh :dir "/home/alex/roam/research/bisonex/code/t2t"
wget https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/annotation/chm13v2.0_RefSeq_Liftoff_v5.1.gff3.gz
#+end_src
#+RESULTS:
*** TODO Annotation with VEP + GTF (latest version)
SCHEDULED: <2023-10-19 Thu>
https://www.science.org/doi/10.1126/science.abj6987#core-R61
/Entered on/ [2023-10-19 Thu 10:41]
*** DONE Clean figure for the variant positions
CLOSED: [2023-10-19 Thu 15:41] SCHEDULED: <2023-10-19 Thu>
*** DONE Number of variants in the T2T-exclusive regions
CLOSED: [2023-10-19 Thu 16:39] SCHEDULED: <2023-10-19 Thu>
Regions unique to T2T are given by: https://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=1735553302_OcJ6esPoUFcSykF6hKiRmIGU24KD&db=hub_3267197_GCA_009914755.4&c=CP068269.2&g=hub_3267197_hgUnique
Note: the provided .fai ( https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/analysis_set/chm13v2.0.fa.gz.gzi ) causes a problem with bcftools:
Chromosome "" defined twice in chm13v2.0.fa.gz.gzi
So we use the index regenerated on the mesocentre
#+begin_src sh :dir "/home/alex/roam/research/bisonex/code/t2t"
wget https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/chain/v1_nflo/grch38-chm13v2.paf
scp meso:/Work/Projects/bisonex/data/fasta/chm13v2.0/chm13v2.0.fa.fai .
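# PAF columns 1, 3 and 4 are the query sequence name, start and end of each aligned block.
# Merging these blocks and taking the complement leaves the regions with no GRCh38 alignment.
cut -f 1,3,4 grch38-chm13v2.paf | bedtools sort -i - -g chm13v2.0.fa.fai | bedtools merge | bedtools complement -g chm13v2.0.fa.fai -i - | bedtools merge > T2T-CHM13v2.0_unique_regions_hg38.bed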
#+end_src
#+RESULTS:
The BED of the additional variants found only with T2T is generated with
Then
#+begin_src sh :dir "/home/alex/roam/research/bisonex/code/t2t"
bedtools intersect -a na12878-t2t-only.bed -b T2T-CHM13v2.0_unique_regions_hg38.bed > na12878-t2t-only-unique.bed
wc -l na12878-t2t-only.bed
wc -l na12878-t2t-only-unique.bed
#+end_src
#+RESULTS:
| 6364 | na12878-t2t-only.bed |
| 47 | na12878-t2t-only-unique.bed |
So about 0.74% (47 of 6364) are in unique regions
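The percentage follows directly from the two line counts. A minimal sketch, where pct is a hypothetical helper:

#+begin_src sh
# Percentage of n out of total, rounded to two decimals.
pct() {
    LC_ALL=C awk -v n="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * n / total }'
}

pct 47 6364
#+end_src

With the counts above, 47/6364 rounds to 0.74 (0.73 when truncated instead of rounded).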
*** TODO Compare the annotation of one variant filtered out in GRCh38 but not in T2T
SCHEDULED: <2023-10-19 Thu>
*** KILL Snpeff
SCHEDULED: <2023-10-19 Thu>
CLOSED: [2023-10-19 Thu 10:42]
Database not available
*** TODO Keep the canonical transcripts, then filter on consequence
SCHEDULED: <2023-10-19 Thu>
/Entered on/ [2023-10-19 Thu 11:23]
** DONE [#B] Quality indicators :qualité:
CLOSED: [2023-09-10 Sun 16:46]
*** Idea
Raredisease:
- FastQC: many statistics. Not available in Nix
- Mosdepth: computes depth (2x faster than samtools depth). Available in Nix
- MultiQC: only merges the analysis results. Not available in Nix
- Picard's CollectMultipleMetrics, CollectHsMetrics, and CollectWgsMetrics
- Qualimap: alternative to FastQC? Not available in Nix
- Sentieon's WgsMetricsAlgo: proprietary
- TIDDIT's cov: TIDDIT = chromosomal rearrangements
Sarek:
- alignment statistics: samtools stats, mosdepth
- QC: MultiQC
MultiQC: not available in Nix
*** DONE FastQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Mosdepth
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
For exomes, the capture file is required
subworkflows/local/bam_markduplicates/
*** DONE Samtools stats
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE [#B] Execution report with MultiQC
CLOSED: [2023-08-15 Tue 21:43] SCHEDULED: <2023-08-13 Sun>
*** DONE Results on NA12878: 98% at 20x
CLOSED: [2023-08-19 Sat 20:45] SCHEDULED: <2023-08-17 Thu>
**** DONE Understand why only 91% at 20x: inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test another kit: Twist exome comprehensive
CLOSED: [2023-08-18 Fri 22:24]
Worse
***** DONE Test the genome without alt contigs
CLOSED: [2023-08-18 Fri 22:25]
Same
***** DONE Test NA12878 without inserted SNVs: this is the cause!
CLOSED: [2023-08-18 Fri 22:25]
***** DONE Test hg19 on NA12878 without inserted SNVs
CLOSED: [2023-08-18 Fri 22:25]
**** DONE Understand why the SNVs lower the score: missing reads
CLOSED: [2023-08-19 Sat 20:34] SCHEDULED: <2023-08-18 Fri>
See [[id:5c1c36f3-f68e-4e6d-a7b6-61dca89abc37][Bug: many reads lost with NA12878]]
*** DONE Rerun results with NA12878 and NA12878 + sanger
CLOSED: [2023-08-29 Tue 10:30] SCHEDULED: <2023-08-29 Tue>
*** DONE Compare with hg19
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with other capture kits
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
*** DONE Compare with no-alt
CLOSED: [2023-08-28 Mon 17:22] SCHEDULED: <2023-08-20 Sun>
** HOLD Check whether normalisation is applied
** KILL [#B] Check HGVS nomenclature :hgvs:
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-15 Tue>
*** KILL mutalyzer
CLOSED: [2023-08-16 Wed 19:07] SCHEDULED: <2023-08-13
.vcf.gz
#+end_src
#+RESULTS:
| Number | of | samples: | 1 |
| Number | of | SNPs: | 6293 |
| Number | of | INDELs: | 1515 |
| Number | of | MNPs: | 1588 |
| Number | of | others: | 0 |
| Number | of | sites: | 9322 |
#+begin_src sh :dir /ssh:meso:/Work/Users/apraga/bisonex/out/annotate
~/.nix-profile/bin/filter_vep -i vep/NA12878-sanger-all-GRCh38/NA12878-sanger-all-GRCh38.vep.vcf.gz --filter 'PICK' | bcftools +counts
#+end_src
| Number | of | samples: | 1 |
| Number | of | SNPs: | 6293 |
| Number | of | INDELs: | 1515 |
| Number | of | MNPs: | 1588 |
| Number | of | others: | 0 |
| Number | of | sites: | 9322 |
***** DONE Test NA12878 + sanger variants: check output with Julia: ok
CLOSED: [2023-08-29 Tue 10:21] SCHEDULED: <2023-08-28 Mon>
143 of 146 variants, as before
***** DONE Rerun on T2T to check compatibility :T2T:
CLOSED: [2023-08-29 Tue 11:03] SCHEDULED: <2023-08-29 Tue>
**** DONE Rerun the sanger tests on NA12878
CLOSED: [2023-09-01 Fri 10:32] SCHEDULED: <2023-08-31 Thu>
2 variants missing after filter_vep
**** DONE Choose the best transcript ourselves
CLOSED: [2023-09-01 Fri 10:32] SCHEDULED: <2023-09-01 Fri>
**** DONE Check that T2T runs
CLOSED: [2023-08-31 Thu 22:10] SCHEDULED: <2023-08-31 Thu>
**** DONE Review transcript choice + filter with Paul
CLOSED: [2023-09-08 Fri 22:46] SCHEDULED: <2023-09-06 Wed>
**** DONE Filter the variants with Alexis's filters and keep all results
CLOSED: [2023-09-10 Sun 15:39] SCHEDULED: <2023-09-09 Sat>
**** DONE Add a MANE SELECT column and keep the others
CLOSED: [2023-09-10 Sun 15:39] SCHEDULED: <2023-09-09 Sat>
**** DONE v1.0
CLOSED: [2023-09-11 Mon 19:11] SCHEDULED: <2023-09-09 Sat>
***** DONE prod branch
CLOSED: [2023-09-10 Sun 15:44] SCHEDULED: <2023-09-09 Sat>
Merge from debug
***** DONE Email Alexis
CLOSED: [2023-09-01 Fri 10:32] SCHEDULED: <2023-08-31 Thu>
***** DONE Rerun sanger test
CLOSED: [2023-09-11 Mon 19:11] SCHEDULED: <2023-09-10 Sun>
***** DONE Email Paul for validation
CLOSED: [2023-09-11 Mon 19:11] SCHEDULED: <2023-09-10 Sun>
**** DONE Use spliceAI >= 0.2 for the filter instead of SPiP
CLOSED: [2023-09-11 Mon 21:48] SCHEDULED: <2023-09-11 Mon>
**** DONE Rerun sanger tests with spliceAI
CLOSED: [2023-09-14 Thu 22:45] SCHEDULED: <2023-09-11 Mon>
**** DONE Fix the recessive column
CLOSED: [2023-09-14 Thu 22:57] SCHEDULED: <2023-09-14 Thu>
either 1/1 or 1/2,
or 0/1 with 2 variants per gene
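The rule above can be sketched with awk on a tab-separated table. The two-column layout (gene, genotype), the function name flag_recessive and the yes/no output are assumptions for illustration, and "2 variants per gene" is read here as at least two 0/1 variants in the same gene. The actual pipeline columns may differ.

#+begin_src sh
# Append a yes/no "recessive candidate" column to a TSV of gene<TAB>genotype.
# A variant qualifies if it is 1/1 or 1/2, or if it is 0/1 and its gene
# carries at least two 0/1 variants.
flag_recessive() {
    # The file is read twice: the first pass counts 0/1 variants per gene.
    awk -F '\t' '
        NR == FNR { if ($2 == "0/1") het[$1]++; next }
        {
            ok = ($2 == "1/1" || $2 == "1/2" || ($2 == "0/1" && het[$1] >= 2))
            print $0 "\t" (ok ? "yes" : "no")
        }
    ' "$1" "$1"
}

# Toy input: GENEB has a single 0/1 variant, GENEC has two.
tmp=$(mktemp)
printf 'GENEA\t1/1\nGENEB\t0/1\nGENEC\t0/1\nGENEC\t0/1\nGENED\t1/2\n' > "$tmp"
flag_recessive "$tmp"
rm -f "$tmp"
#+end_src

Reading the file twice keeps the input order intact, which is convenient when the flag is appended as an extra column to an existing table.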
*** KILL Compare the annotations on 63003856
CLOSED: [2023-08-28 Mon 17:28]
**** Rerun the new pipeline
*** KILL Old version
CLOSED: [2023-08-28 Mon 17:24]
**** KILL HGVS
CLOSED: [2023-08-28 Mon 17:24]
**** KILL Filter after VEP
CLOSED: [2023-08-28 Mon 17:24]
**** KILL OMIM
CLOSED: [2023-08-28 Mon 17:24]
**** KILL clinvar
CLOSED: [2023-08-28 Mon 17:24]
**** KILL ACMG incidental
CLOSED: [2023-08-28 Mon 17:24]
**** KILL Grantham
CLOSED: [2023-08-28 Mon 17:24]
**** KILL LRG
CLOSED: [2023-04-18 Tue 17:22] SCHEDULED: <2023-04-18 Tue>
Discussed with Alexis: no longer up to date
**** KILL Gnomad
CLOSED: [2023-08-28 Mon 17:24]
*** DONE Reorder the columns :annotation:
CLOSED: [2023-08-31 Thu 10:38] SCHEDULED: <2023-08-28 Mon>
No OMIM, no CADD, no spliceAI
*** DONE Add gnomAD v3 :gnomadv3:
CLOSED: [2023-10-01 Sun 15:34] SCHEDULED: <2023-09-29 Fri>
/Entered on/ [2023-09-29 Fri 22:38]
After discussing with Mathieu the issue of some regions that were corrected in v3.
VEP uses v2 for exomes and v3 for genomes
It will be missing for patients 1-107
Until now it was not in the VEP output
**** DONE Test on 1 patient
CLOSED: [2023-09-30 Sat 23:54] SCHEDULED: <2023-09-29 Fri>
**** DONE Resume the batch run
CLOSED: [2023-10-01 Sun 15:34]
** DONE Port Alexis's version exactly to Helios
CLOSED: [2023-01-14 Sat 17:56]
"prod" branch
** KILL Test Alexis's version with Nix
CLOSED: [2023-06-14 Wed 22:37]
*** DONE Add ClinVar
CLOSED: [2022-11-13 Sun 19:37]
*** DONE Alignment
CLOSED: [2022-11-13 Sun 12:52]
*** DONE Haplotype caller
CLOSED: [2022-11-13 Sun 13:00]
*** KILL Filter
CLOSED: [2023-06-14 Wed 22:37]
- [X] depth
- [ ] common snp not patho
Problem with the list of IDs
**** KILL variant annotation
CLOSED: [2023-06-14 Wed 22:37]
Needs VEP
*** KILL Variant calling
CLOSED: [2023-06-14 Wed 22:37]
** KILL Test sarek
CLOSED: [2023-08-12 Sat 15:53]
#+begin_src sh
module load apptainer/1.1.8
nextflow run nf-core/sarek -profile test,singularity --outdir test-sarek
#+end_src
The dependencies do not download correctly, so we extract them by hand
#+begin_src sh
rg -IN galaxyproject modules | sed 's/ //g;s/:$//' | sort | uniq > deps.txt
#+end_src
Manual cleanup
Then
#+begin_src sh
cat deps.txt | xargs -L1 singularity pull
#+end_src
** DONE Samplesheet support
CLOSED: [2023-08-03 Thu 14:24] SCHEDULED: <2023-08-03 Thu 13:00>
/Entered on/ [2023-08-03 Thu 13:12]
** DONE Small dataset: chr22 of HG001
CLOSED: [2023-08-05 Sat 14:21] SCHEDULED: <2023-08-05 Sat>
** DONE Fix OMIM annotation: missing for NMNAT1
CLOSED: [2023-09-16 Sat 22:47] SCHEDULED: <2023-09-16 Sat>
/Entered on/ [2023-09-16 Sat 19:32]
** PROJ Look at the depth of the reported variants
/Entered on/ [2023-10-05 Thu 21:44]
* Documentation
:PROPERTIES:
:CATEGORY: doc
:END:
** DONE Installation procedure for Nix + dependencies on the CHU VM
CLOSED: [2023-04-22 Sat 15:27] SCHEDULED: <2023-04-13 Thu>
* Bibliography
** DONE Finish [cite:@alser2021]
CLOSED: [2023-09-26 Tue 11:26] SCHEDULED: <2023-09-22 Fri>
* Manuscript
:PROPERTIES:
:CATEGORY: manuscript
:END:
** DONE Pipeline flowchart (with T2T)
CLOSED: [2023-09-17 Sun 23:15] SCHEDULED: <2023-09-17 Sun>
** Aligner
*** DONE Figure: number of publications per aligner
CLOSED: [2023-09-19 Tue 16:54] SCHEDULED: <2023-09-19 Tue>
/Entered on/ [2023-09-19 Tue 08:43]
*** DONE Literature review: aligner performance <(biblio aligneur)> <(aligneur)>
CLOSED: [2023-10-13 Fri 17:40] SCHEDULED: <2023-10-01 Sun>
*** DONE Figure: number of articles citing the main aligners per year
CLOSED: [2023-10-11 Wed 23:54] SCHEDULED: <2023-10-03 Tue>
We would need a local copy of PubMed, otherwise it is 10,000 requests per aligner!
*** DONE Figure: number of articles citing the main aligners
CLOSED: [2023-10-12 Thu 23:58] SCHEDULED: <2023-10-12 Thu>
We would need a local copy of PubMed, otherwise it is 10,000 requests per aligner!
We rely on
** Variant calling
*** TODO Literature review <(biblio appel variant)> <(appel variant)>
SCHEDULED: <2023-10-18 Wed>
*** TODO Figure: number of publications per variant caller
SCHEDULED: <2023-10-19 Thu>
/Entered on/ [2023-09-19 Tue 08:43]
** TODO Figure: number of exomes per year
SCHEDULED: <2023-10-25 Wed>
/Entered on/ [2023-09-19 Tue 08:43]
* Tests :tests:
** KILL Non-regression: prod version
CLOSED: [2023-05-23 Tue 08:46]
*** DONE ID common snp
CLOSED: [2022-11-19 Sat 21:36]
#+begin_src
$ wc -l ID_of_common_snp.txt
23194290 ID_of_common_snp.txt
$ wc -l /Work/Users/apraga/bisonex/database/dbSNP/ID_of_common_snp.txt
23194290 /Work/Users/apraga/bisonex/database/dbSNP/ID_of_common_snp.txt
#+end_src
*** DONE ID common snp not clinvar patho
CLOSED: [2022-12-11 Sun 20:11]
**** DONE Checking the problem
CLOSED: [2022-12-11 Sun 16:30]
On J:
21155134 /Work/Groups/bisonex/data/dbSNP/GRCh38.p13/ID_of_common_snp_not_clinvar_patho.txt.ref
"Non-regression" version
21155076 database/dbSNP/ID_of_common_snp_not_clinvar_patho.txt
New version
23193391 /Work/Groups/bisonex/data/dbSNP/GRCh38.p13/ID_of_common_snp_not_clinvar_patho.txt
After removing duplicates
$ sort database/dbSNP/ID_of_common_snp_not_clinvar_patho.txt | uniq > old.txt
$ wc -l old.txt
21107097 old.txt
$ sort /Work/Groups/bisonex/data/dbSNP/GRCh38.p13/ID_of_common_snp_not_clinvar_patho.txt | uniq > new.txt
$ wc -l new.txt
21174578 new.txt
$ sort /Work/Groups/bisonex/data/dbSNP/GRCh38.p13/ID_of_common_snp_not_clin
stq1,fastq2
NA12878,sanger-all,data/NA12878-sanger-inserted-all_1.fq.gz,data/NA12878-sanger-inserted-all_2.fq.gz
Run the simulation
#+begin_src
nextflow run main.nf -profile standard,helios --input=samples-synthetic.csv --genome=GRCh38 -bg
#+end_src
***** DONE Result after HaplotypeCaller: 3 variants lost => ok
CLOSED: [2023-08-17 Thu 19:13]
Haplotypecaller 143 found over 146
3×3 DataFrame
| variant | meanQual | depth |
|---------------------+----------+-------|
| chr12:g.13720138C>T | 60.0 | 1 |
| chr17:g.10296150T>A | 60.0 | 3 |
| chr21:g.43426167C>T | 0.0 | 88 |
Not enough reads (1, 2) and an alignment problem (3)
***** KILL Result after depth filter: +10 variants lost
CLOSED: [2023-08-19 Sat 20:04] SCHEDULED: <2023-08-18 Fri>
filter depth : another 10 missed variants
10×3 DataFrame
| variant | meanQual | depth |
|---------------------+----------+-------|
| chr3:g.71112628C>T | 60.0 | 62 |
| chr12:g.40367710A>G | 58.0435 | 46 |
| chr14:g.58458545G>A | 60.0 | 9 |
| chr15:g.66703292C>T | 60.0 | 33 |
| chr16:g.30965737C>A | 60.0 | 18 |
| chr17:g.61968202A>C | 60.0 | 46 |
| chrX:g.124056226T>G | 60.0 | 40 |
| chrX:g.24737739G>T | 60.0 | 16 |
| chrX:g.40591349C>T | 60.0 | 37 |
| chrX:g.53193275G>A | 60.0 | 32 |
If they are heterozygous, 0.5*depth is indeed < 30 (our filter...), except for chr3:g.71112628C>T (0.5*62 = 31)
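To make the heterozygous argument concrete, the sketch below halves each depth from the table above and compares it with the threshold of 30 quoted in the note. It assumes the filter applies to the expected number of alternate reads (0.5 × depth). The names =expected_alt_reads= and =below_threshold= are illustrative and not taken from the pipeline.

#+begin_src python
# Depths copied from the table above (variant -> depth).
DEPTHS = {
    "chr3:g.71112628C>T": 62,
    "chr12:g.40367710A>G": 46,
    "chr14:g.58458545G>A": 9,
    "chr15:g.66703292C>T": 33,
    "chr16:g.30965737C>A": 18,
    "chr17:g.61968202A>C": 46,
    "chrX:g.124056226T>G": 40,
    "chrX:g.24737739G>T": 16,
    "chrX:g.40591349C>T": 37,
    "chrX:g.53193275G>A": 32,
}

# Threshold quoted in the note; assumed here to apply to ALT-supporting reads.
MIN_ALT_READS = 30


def expected_alt_reads(depth, allele_fraction=0.5):
    """Expected number of ALT reads for a given depth and allele fraction."""
    return depth * allele_fraction


def below_threshold(depths, threshold=MIN_ALT_READS):
    """Variants whose expected ALT read count is strictly below the threshold."""
    return [
        variant
        for variant, depth in depths.items()
        if expected_alt_reads(depth) < threshold
    ]


if __name__ == "__main__":
    for variant, depth in DEPTHS.items():
        alt = expected_alt_reads(depth)
        status = "filtered" if alt < MIN_ALT_READS else "kept"
        print(f"{variant}\t{depth}\t{alt:.1f}\t{status}")
#+end_src

Running it prints one line per variant with the halved depth and whether it would pass under this assumption.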
****** KILL Read insertion problem: many reads are lost! -> regenerate the data
CLOSED: [2023-08-19 Sat 20:04] SCHEDULED: <2023-08-18 Fri>
E.g. chrX:g.124056226T>G: from 65 reads down to 1
***** DONE Result after common variant filter: +0 ok
CLOSED: [2023-08-17 Thu 19:32]
***** KILL Result after VEP filter: +23 lost??
CLOSED: [2023-08-19 Sat 20:04] SCHEDULED: <2023-08-18 Fri>
filter vep : another 23 missed variants
23×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼───────────────────────────────────────
1 │ chr1:g.183222115C>T 60.0 168
2 │ chr1:g.39388062C>T 60.0 285
3 │ chr2:g.240719197G>C 60.0 77
4 │ chr3:g.41227353G>C 60.0 105
5 │ chr4:g.15536991T>G 60.0 41
6 │ chr5:g.14474096G>A 60.0 191
7 │ chr8:g.43122149C>T 60.0 237
8 │ chr9:g.128603589A>C 60.0 304
9 │ chr9:g.137452819G>C 60.0 107
10 │ chr10:g.129957338T>C 60.0 116
11 │ chr10:g.247389T>G 60.0 56
12 │ chr11:g.61313668G>A 60.0 83
13 │ chr12:g.45850467C>T 60.0 291
14 │ chr14:g.64216315C>G 60.0 263
15 │ chr15:g.60514655G>A 60.0 259
16 │ chr17:g.61966475G>T 60.0 144
17 │ chr17:g.7852503T>C 60.0 190
18 │ chr19:g.13230158G>A 60.0 172
19 │ chr19:g.38523211C>G 60.0 93
20 │ chr19:g.4110557G>C 59.9929 425
21 │ chr20:g.62334188G>A 60.0 62
22 │ chrX:g.47575255G>A 60.0 244
23 │ chrX:g.53409112G>A 60.0 136
**** DONE [#A] Insert everything into NA12878 with XAMscissors (up-to-date XAMscissors)
CLOSED: [2023-08-20 Sun 13:45] SCHEDULED: <2023-08-19 Sat>
***** DONE Insertion
CLOSED: [2023-08-20 Sun 09:15]
***** DONE Check after HaplotypeCaller: 3 variants missing but ok
CLOSED: [2023-08-20 Sun 09:18] SCHEDULED: <2023-08-20 Sun>
3×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼──────────────────────────────────────
1 │ chr12:g.13720138C>T 60.0 1
2 │ chr17:g.10296150T>A 60.0 1
3 │ chr21:g.43426167C>T 0.0 59
Insufficient depth for 2 of them and poor quality for the third
***** DONE Check after filterdepth: 0 additional lost
CLOSED: [2023-08-20 Sun 09:18] SCHEDULED: <2023-08-20 Sun>
***** DONE Check after filterpolymorphis: 0 additional lost
CLOSED: [2023-08-20 Sun 09:18] SCHEDULED: <2023-08-20 Sun>
***** DONE Check after filter vep: 2 additional lost
CLOSED: [2023-08-20 Sun 12:37] SCHEDULED: <2023-08-20 Sun>
2×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼─────────────────────────────────────
1 │ chr17:g.7852503T>C 60.0 96
2 │ chrX:g.47575255G>A 60.0 145
***** DONE First spip fix: better number of output variants, but these 2 are still missing
CLOSED: [2023-08-20 Sun 11:38]
***** DONE --pick: solves the problem
CLOSED: [2023-08-20 Sun 12:37]
chrX:g.47575255G>A is reported as downstream_gene_variant with the --pick option
Yet it is not in 5' in the RefSeq transcripts...
https://genome-euro.ucsc.edu/cgi-bin/hgTracks?db=hg38&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chrX%3A47575242%2D47575268&hgsid=301211823_xpelPqPJije7wSIhg070JeGH5ZwV
https://mobidetails.iurc.montp.inserm.fr/MD/api/variant/238296/browser/
Same for the other one
chr17:g.7852503T>C
https://mobidetails.iurc.montp.inserm.fr/MD/api/variant/182993/browser/
Note:
VEP chooses one block of annotation per variant, using an ordered set of criteria. This order may be customised using --pick_order.
1. MANE Select transcript status
2. MANE Plus Clinical transcript status
3. canonical status of transcript
4. APPRIS isoform annotation
5. transcript support level
6. biotype of transcript ("protein_coding" preferred)
7. CCDS status of transcript
8. consequence rank according to this table
9. translated, transcript or feature length (longer preferred)
"Wherever possible we would discourage you from summarising data in this way. "
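The ordered criteria quoted above behave like a lexicographic sort: the first criterion that distinguishes two annotation blocks decides, regardless of consequence severity further down the list. The sketch below illustrates this on toy data with only three of the criteria (MANE Select, canonical, biotype). The =Block= fields, the transcript names =T1= / =T2= and the =pick_block= function are hypothetical. This is not VEP's implementation, and the notes do not establish that this is exactly what happened for the two variants above.

#+begin_src python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One annotation block for a variant (toy model, hypothetical fields)."""
    transcript: str
    consequence: str
    mane_select: bool
    canonical: bool
    biotype: str


def pick_key(block: Block) -> Tuple[int, int, int]:
    """Smaller tuples win; each position encodes one criterion, in order."""
    return (
        0 if block.mane_select else 1,
        0 if block.canonical else 1,
        0 if block.biotype == "protein_coding" else 1,
    )


def pick_block(blocks: List[Block]) -> Block:
    """Return the single block kept by the ordered criteria."""
    return min(blocks, key=pick_key)


if __name__ == "__main__":
    blocks = [
        # Severe consequence, but on a lower-priority transcript.
        Block("T1", "missense_variant", False, False, "protein_coding"),
        # Mild consequence on the highest-priority transcript.
        Block("T2", "downstream_gene_variant", True, True, "protein_coding"),
    ]
    print(pick_block(blocks).consequence)
#+end_src

In this toy case the block kept is the one on the MANE Select transcript, so a downstream filter on consequence would only see the mild annotation.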
**** DONE Email Alexis
CLOSED: [2023-08-20 Sun 13:45] SCHEDULED: <2023-08-20 Sun>
**** TODO simuscop 200x data
SCHEDULED: <2023-10-21 Sat>
**** DONE In T2T with liftover (filter = spip): ok but slow and too many variants :tests:
CLOSED: [2023-09-17 Sun 17:13] SCHEDULED: <2023-09-17 Sun>
1. Conversion to BED
#+begin_src sh :dir ~/code/sanger
open snvs-cento-sanger.csv | select chrom pos | insert pos2 {$in.pos } | to csv --separator="\t" | save snvs-cento-sanger.bed -f
#+end_src
2. Liftover with UCSC (online)
NB: checked on the first result by looking for the read containing the variant (samtools view -r, then samtools view | grep in T2T) and, with the help of IGV, there is a matching variant at
chr1:10757746
3. Assuming the order of the variants has not changed, simply add REF and ALT with annotateLifted.jl
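Step 3 relies on the lifted records coming back in the same order as the input. A minimal sketch of that order-based matching follows, with a length check because liftover can leave some records unmapped. It is written in Python for illustration and is not the content of annotateLifted.jl; the function name and the toy coordinates are made up.

#+begin_src python
from typing import List, Tuple

Variant = Tuple[str, int, str, str]   # (chrom, pos, ref, alt)
Position = Tuple[str, int]            # (chrom, pos) after liftover


def annotate_lifted(original: List[Variant], lifted: List[Position]) -> List[Variant]:
    """Attach REF/ALT from the original variants to lifted positions, by order."""
    if len(original) != len(lifted):
        # An unmapped record would shift every following pair.
        raise ValueError(
            f"{len(original)} original variants but {len(lifted)} lifted positions"
        )
    return [
        (chrom, pos, ref, alt)
        for (_, _, ref, alt), (chrom, pos) in zip(original, lifted)
    ]


if __name__ == "__main__":
    # Toy coordinates, not taken from the real data.
    original = [("chr1", 100, "C", "T"), ("chr2", 200, "G", "A")]
    lifted = [("chr1", 90), ("chr2", 215)]
    for record in annotate_lifted(original, lifted):
        print(record)
#+end_src

The length check only catches dropped or added records; it cannot detect a reordering, which is why the order assumption stated in step 3 matters.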
spip annotation is *very slow*: 1h13!
Result:
2×3 DataFrame
Row │ variant meanQual depth
│ String Float64 Int64
─────┼──────────────────────────────────────
1 │ chr12:g.13594572 60.0 1
2 │ chr17:g.10204026 60.0 1
144 found over 146
filter depth : another 0 missed variants
filter poly : another 0 missed variants
filter vep : another 0 missed variants
And there are too many variants in the output (7330!)
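The per-stage summary lines above ("another N missed variants") can be reproduced by tracking which expected variants are still present after each stage. This is a sketch of that bookkeeping only, assuming each stage yields a collection of variant identifiers. The function name and the toy identifiers are hypothetical and do not come from the actual comparison script.

#+begin_src python
from typing import Dict, Iterable, List, Set, Tuple


def missed_per_stage(
    expected: Set[str], stages: List[Tuple[str, Iterable[str]]]
) -> Dict[str, Set[str]]:
    """For each stage, in order, return the expected variants newly lost there."""
    remaining = set(expected)
    report: Dict[str, Set[str]] = {}
    for name, called in stages:
        lost = remaining - set(called)
        report[name] = lost
        # A variant lost at one stage is not counted again later.
        remaining -= lost
    return report


if __name__ == "__main__":
    # Toy identifiers standing in for the 146 inserted variants.
    expected = {"v1", "v2", "v3", "v4"}
    stages = [
        ("haplotypecaller", ["v1", "v2", "v3"]),
        ("filter depth", ["v1", "v2"]),
        ("filter vep", ["v1", "v2"]),
    ]
    for name, lost in missed_per_stage(expected, stages).items():
        print(f"{name} : another {len(lost)} missed variants")
#+end_src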
**** DONE Email Paul with T2T filter results + new diagram
CLOSED: [2023-09-17 Sun 23:15] SCHEDULED: <2023-09-17 Sun>
** TODO Medically relevant genes
SCHEDULED: <2023-10-25 Wed>
/Entered on/ [2023-10-18 Wed 22:37]
* Reinterpretation :reanalysis:
** DONE Run tests on raw data [225/250] <(samples.csv)> <(runs.waiting)>
CLOSED: [2023-10-14 Sat 11:58] SCHEDULED: <2023-10-08 Sun>
- [X] 100222_63015289
- [X] 1600304839_63051311
- [X] 1900007827_62913191
- [X] 1900398899_62999500
- [X] 1900486799_62913197
- [X] 2100422923_62952677
- [X] 2100458888_62933047
- [X] 2100601558_62903840
- [X] 2100609288_62905768
- [X] 2100609501_62905776
- [X] 2100614493_62951074
- [X] 2100622566_62908067
- [X] 2100622601_62908060
- [X] 2100622705_62908063
- [X] 2100640027_62911936
- [X] 2100645285_62913212
- [X] 2100661411_62914081
- [X] 2100661462_62914086
- [X] 2100708257_62921596
- [X] 2100738732_62926501
- [X] 2100738850_62926509
- [X] 2100746751_62926505
- [X] 2100746797_62926506
- [X] 2100782349_62931722
- [X] 2100782416_62931561
- [X] 2100782559_62931718
- [X] 2100799204_62934768
- [X] 2200010202_62940284
- [X] 2200023600_62940631
- [X] 2200024348_62999591
- [X] 2200027505_62942457
- [X] 2200038776_62943412
- [X] 2200041919_62943405
- [X] 2200088014_62951326
- [X] 2200146652_62959388
- [X] 2200151850_62960953
- [X] 2200160014_62959475
- [X] 2200160070_62959478
- [X] 2200201368_62967471
- [X] 2200201400_62967470
- [X] 2200265558_62976332
- [X] 2200265605_62976401
- [X] 2200267046_62975192
- [X] 2200273878_62999530
- [X] 2200279708_62977002
- [X] 2200284408_62979102
- [X] 2200293987_62979116
- [X] 2200294359_62979118
- [X] 2200306299_62982217
- [X] 2200306539_62982193
- [X] 220030671_62982211
- [X] 2200307058_62982231
- [X] 2200307108_62982196
- [X] 2200307136_62982221
- [X] 2200307199_62982239
- [X] 2200307230_62982234
- [X] 2200307262_62982219
- [X] 2200307297_62982227
- [X] 2200324510_62985453
- [X] 2200324549_62985478
- [X] 2200324573_62985445
- [X] 2200324594_62985467
- [X] 2200324606_62985463
- [X] 2200324614_62985459
- [X] 2200338306_62985430
- [X] 2200343880_62989407
- [X] 2200343910_62989460
- [X] 2200343938_62989451
- [X] 2200343966_62989456
- [X] 2200343993_62989440
- [X] 2200344013_62989464
- [X] 2200349749_62989465
- [X] 2200363462_62988848
- [X] 2200377880_62991993
- [X] 2200378032_62991991
- [X] 2200383996_62993828
- [X] 2200384015_62993796
- [X] 2200384046_62993822
- [X] 2200384117_62993808
- [X] 2200384187_62993825
- [X] 2200384231_62992898
- [X] 2200385658_63060260
- [X] 2200394260_62994732
- [X] 2200395817_62994742
- [X] 2200396731_62994737
- [X] 2200424073_62999579
- [X] 2200424207_62999632
- [X] 2200426178_62999630
- [X] 2200426243_62999635
- [X] 2200426466_62999605
- [X] 2200426642_62999627
- [X] 2200427406_62999649
- [X] 2200427512_62999639
- [X] 2200428953_62999572
- [X] 2200428981_62999600
- [X] 2200428999_62999592
- [X] 2200441970_63000868
- [X] 2200441989_63000882
- [X] 2200442135_63000864
- [X] 2200442216_63000886
- [X] 2200442257_63000951
- [X] 2200451801_63003573
- [X] 2200451862_63004218
- [X] 2200451894_63004210
- [X] 2200456165_63051294
- [X] 2200459865_63004933
- [X] 2200459968_63004937
- [X] 2200460073_63004943
- [X] 2200460121_63004684
- [X] 2200467051_63003856
- [X] 2200467225_63004940
- [X] 2200467261_63004930
- [X] 2200467338_63004925
- [X] 2200470099_63004485
- [X] 2200470142_63004480
- [X] 2200471780_63004362
- [X] 2200480910_63006466
- [X] 2200495073_63010427
- [X] 2200495510_63009152
- [X] 2200508677_63060252
- [X] 2200510531_63012582
- [X] 2200510628_63012549
- [X] 2200510657_63012554
- [X] 2200511249_63012533
- [X] 2200511274_63012586
- [X] 2200517952_63060399
- [X] 2200519525_63060439
- [X] 2200524009_63014044
- [X] 2200524609_63014046
- [X] 2200524616_63014048
- [X] 2200533429_63060425
- [X] 2200539735_63060406
- [X] 2200549908_63019339
- [X] 2200549965_63019349
- [X] 2200550414_63019357
- [X] 2200550471_63020031
- [X] 2200550490_63019351
- [X] 2200550505_63019340
- [X] 2200555565_63018614
- [X] 2200559438_63020029
- [X] 2200559682_63020030
- [X] 2200559713_63019623
- [X] 2200559739_63019626
- [X] 2200569969_63019991
- [X] 2200570001_63021580
- [X] 2200570025_63021490
- [X] 2200570035_63021491
- [X] 2200570042_63021493
- [X] 2200570050_63021494
- [X] 2200579897_63024910
- [X] 2200583995_63024866
- [X] 2200584035_63024905
- [X] 2200584069_63024888
- [X] 2200584126_63024810
- [X] 2200589507_63026712
- [X] 2200597365_63027994
- [X] 2200597480_63027988
- [X] 2200597752_63026853
- [X] 2200597778_63027992
- [X] 22005977_63026903
- [X] 2200609031_63026527
- [X] 2200614198_63113928
- [X] 2200620372_63030821
- [X] 2200620442_63030810
- [X] 2200620498_63030816
- [X] 2200620628_63031031
- [X] 2200622310_63030984
- [X] 2200622355_63030956
- [X] 2200625369_63028699
- [X] 2200625410_63028697
- [X] 2200625536_63028694
- [X] 2200630189_63030665
- [X] 2200635149_63033182
- [X] 2200644544_63037731
- [X] 2200644594_63037725
- [X] 2200650089_63038093
- [X] 2200666292_63076568
- [X] 2200669188_63036688
- [X] 2200669320_63040259
- [X] 2200669383_63040254
- [X] 2200669414_63040257
- [X] 2200669446_63040251
- [X] 2200680342_63105271
- [X] 2200694535_63042853
- [X] 2200694789_63042862
- [X] 2200694858_63042702
- [X] 2200694917_63042696
- [X] 2200699290_63043047
- [X] 2200699345_63040238
- [X] 2200699383_63043050
- [X] 2200699412_63040731
- [X] 220071551_63048935
- [X] 2200731515_63048963
- [X] 2200748145_63051198
- [X] 2200748171_63051213
- [X] 2200751046_63051249
- [X] 2200751101_63051234
- [X] 2200766471_63054590
- [X] 2200767731_63054595
- [X] 2200767822_63054464
- [X] 2200775505_63060410
- [X] 2200850441_63019345
- [X] 220597589_63026879
- [X] 2300003253_63060430
- [X] 2300005679_63060370
- [X] 2300009914_63060390
- [X] 2300028784_63060001
- [X] 2300036815_63063357
- [X] 2300055382_63061874
- [X] 2300055421_63061871
- [X] 2300055440_63061880
- [X] 230006894_63064950
- [X] 2300071111_63070356
- [X] 2300083434_63071675
- [X] 2300103609_63076239
- [X] 2300104572_63076232
- [X] 2300109602_63076765
- [X] 2300109665_63076770
- [X] 2300119721_63078732
- [X] 2300137773_63078133
- [X] 2300137834_63078123
- [X] 2300167821_63086183
- [X] 2300172698_63113453
- [X] 2300188216_63090609
- [X] 2300188281_63090632
- [ ] 2300188800_63090616
- [ ] 2300193193645_63090623
- [ ] 2300193668_63090611
- [ ] 2300195426_63090608
- [ ] 2300201017_63089636
- [ ] 2300227479_63098330
- [ ] 2300232688_63130821
- [ ] 2300292749_63109239
- [ ] 230029277_63109247
- [ ] 2300294712_63109236
- [ ] 2300308032_63111581
- [ ] 2300323537_63114209
- [ ] 2300334609_63115535
- [ ] 2300346867_63118093
- [ ] 2300346867_63118093_NA12878
- [ ] 2300348940_63118099
- [ ] 2300359806_63119915
- [ ] 2300380476_63123963
- [ ] 2300382582_63123749
- [ ] 2300384269_63126867
- [ ] 2300407581_63130826
- [ ] 2300407626_63130842
- [ ] 2300409593_63130874
- [ ] 2300409612_63130980
- [ ] 2300417623_63131524
** TODO Missed variants :missed:
SCHEDULED: <2023-10-21 Sat>
*** DONE 63012582: chr10:g.102230760 filtered by AD :63012582:
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
It is present in the HaplotypeCaller output!
Mind the position: POS=102230753, written as CG->C
GT:AD:DP:GQ:PL 0/1:26,8:34:99:146,0,671
Filtered by the AD <= 10 condition (carried by only 8 reads)
Not confirmed by Sanger, reported as VOUS
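The genotype line above can be decoded field by field to see why the variant is dropped. The sketch below pairs the FORMAT keys with the sample values and applies the AD <= 10 condition to the ALT-supporting read count, as the note describes. The function names are illustrative, and the exact filter expression used by the pipeline is an assumption.

#+begin_src python
from typing import Dict


def parse_sample(fmt: str, sample: str) -> Dict[str, str]:
    """Pair VCF FORMAT keys with the corresponding sample values."""
    return dict(zip(fmt.split(":"), sample.split(":")))


def alt_depth(fields: Dict[str, str]) -> int:
    """Reads supporting the first ALT allele (AD is 'ref,alt[,alt2...]')."""
    return int(fields["AD"].split(",")[1])


def fails_ad_filter(fields: Dict[str, str], max_filtered_ad: int = 10) -> bool:
    """True when the variant is removed by the AD <= max_filtered_ad condition."""
    return alt_depth(fields) <= max_filtered_ad


if __name__ == "__main__":
    fields = parse_sample("GT:AD:DP:GQ:PL", "0/1:26,8:34:99:146,0,671")
    print(fields["GT"], alt_depth(fields), fails_ad_filter(fields))
#+end_src

For this record the ALT allele is carried by 8 of the 34 reads, so it falls under the condition even though the total depth is above 30.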
**** KILL cento BAM image
CLOSED: [2023-10-08 Sun 23:13]
**** DONE bisonex BAM image
CLOSED: [2023-10-08 Sun 23:23] SCHEDULED: <2023-10-08 Sun>
**** DONE Email Paul
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
*** DONE 63060439: chr15:g.26869324 = depth problem, DP=15 :63060439:
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
GABRA5
Reported as VOUS together with a pathogenic MDB5 variant for the same patient (even VOUS-)
Not confirmed by Sanger
GT:AD:DP:GQ:PL 0/1:9,6:15:99:103,0,213
**** DONE bisonex BAM image
CLOSED: [2023-10-08 Sun 22:56]
**** DONE Email Paul
CLOSED: [2023-10-08 Sun 23:24] SCHEDULED: <2023-10-08 Sun>
** TODO Update the list of variants
SCHEDULED: <2023-10-19 Thu>
*** TODO Add negatives to the list of variants
SCHEDULED: <2023-10-19 Thu>
** TODO Compare cento variants with bisonex output
SCHEDULED: <2023-10-19 Thu>
* Results
** TODO Speed-up BWA-mem
SCHEDULED: <2023-10-22 Sun>
** TODO Speed-up HaplotypeCaller
SCHEDULED: <2023-10-22 Sun>
* Communication
** DONE Email NGS-diag
CLOSED: [2023-10-06 Fri 08:04] SCHEDULED: <2023-10-06 Fri>
/Entered on/ [2023-10-04 Wed 19:33]