6XUWY7T2ITWJYRUHDWSG66N7DYCBBXHRHQAAS7GGINDYUJHGMARQC
WXXVA2RG5DAZMQVHHYXI3BDVPU2XOT2DGKXEZG777DZJI44OOFVAC
SQAG5QHQNITVNTIDS74F2EYBFIQV24HFZ4D3A2UY2Y4SG7KT4HNQC
RHWQQAAHNHFO3FLCGVB3SIDKNOUFJGZTDNN57IQVBMXXCWX74MKAC
CXW37WKZDOFBTPGZQGQVWDWGA7YWGGJ47SSD4KYEXD6MPERELGGAC
3ZXSF6LXHYWPRATRVITIZETB7FTSP3HYWHV7AIS3B65Y6ID3EWXQC
JCN2NFYRDY2Z6X73BIGMUTT7FLNOAKE3F7KOA5BRO7N2PCY3GBAQC
IRZ3N4E67WSWRGS5F77WEB27CLBG6IEW32GFSOYHFCC23TDGRU2AC
IQHE5LMHVKZPI6VEKZY7JLQ7MWT42ERFDB5R4CIG2BSXYD3CKV5QC
- [ ] Relax for a few days and watch how interactive programs are being
composed
- [ ] Get back to the real-world example and make it a complete
Cabal project.
- [ ] [[https://mmhaskell.com/testing/test-driven-development][Testing]]
**** TODO Relax for a few days and watch how interactive programs are being composed
**** TODO Get back to the real-world example and make it a complete Cabal project.
**** TODO [[https://mmhaskell.com/testing/test-driven-development][Testing]]
**** GHC
***** TODO Read commentary
***** TODO STRT Read [[https://www.aosabook.org/en/ghc.html]]
** GHC
*** TODO GHC commentary
Notably Ollie Charles's 24 days of GHC Extensions,
*** TODO Read [[https://www.aosabook.org/en/ghc.html]]
** Videos
*** STRT https://www.youtube.com/watch?v=re96UgMk6GQ
** TODO Historical articles
1. [[https://watermark.silverchair.com/320098.pdf?token=AQECAHi208BE49Ooan9kkhW_Ercy7Dm3ZL_9Cf3qfKAc485ysgAAAsYwggLCBgkqhkiG9w0BBwagggKzMIICrwIBADCCAqgGCSqGSIb3DQEHATAeBglghkgBZQMEAS4wEQQMHXfjdjwhGI2t4bLLAgEQgIICeQjZ-I8gmuaFqBktP4IOifHODtMAHcNF_LwRYyq7NswQ7vT6LJho9P_junCAORLGMV9dgq9JMePH2PFKNxXxrEP1VY7rIDG0gzoeObSkgMDn4MXalrIxD3ejY8vsGYy6vce8Kh70J_UJ8RamO1l3BNNUzy2W6VRaa_cMQr_ekdwcz0oihz0BVKn_bgm_8DjiiPhzj8uU9flVhi13t_oIFA6b3At2QMmPe7Z9OyfLkXivKkmKKNoHwSS7AnTIYAKCO383e4kG6NzZ_elai-XMAJs2Nk0vcgaltld1KeaW3269104DdIlFGevJUVNgwE_4LIheSYRZr9Gr0yRR6TROxdsyxrmgQ22Pzxxpnl8-KdjkW6aRSCKNk_yb5hYcPoRa3ldc5yPV15j8i4t9Mv4U_mBwmIRtMIKPdEHeMvcRx6c8_8uT4RV2esuOPfZlA05bzBgJhMS87M8myxisH-exkTMkm58o6nzHf1lGxzn_JS1VSHbhJCUl82ubzzOWjvl3QJM_vv805XTbn_G-fcRi0d9EQIRTqoObWVFyXW-pz16bWoZPZnBQ1gOmc3hPTGBMZjFR6p9VEAO7bKcK8o0yQDjVWEELNwfAAHc-oF_wLiEjXDNBoUttghgQzzvymKY_jSZhcU8TraVu2i551fpuDNEjSJd0qY5Rg3J6eWU550nJmnoWmX6o7KGiYp0vVMfOoFYXJ1trZWSGoRhDQP2LOLIOt3t2idlj6kV_MoCY3BRnkbxf4XIH7gLJf6Dky6hXFbTU8Fjsn8XHBeKSmaAYJ-sbmGB_BdZO8hHyvHvPv0lTtGcSuKywoJhMbblXRzyuacj_6mZQl5j3tAWhy][Why functional programming matters]]
Very readable
2. [[https://dl.acm.org/doi/pdf/10.1145/91556.91592][Comprehending monads]]
Introduces the concept
3. [[https://dl.acm.org/doi/pdf/10.1145/158511.158524][Imperative functional programming]]
Applies monads to solve the IO problem
******* DONE bedtools intersect
CLOSED: [2022-10-23 Sun 01:05]
Looking only at the variants, we do find 74302
#+begin_src sh
rg "^NC" none_sorted.vcf | wc -l
#+end_src
NB: test run with
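#+begin_src sh
# Shared sites with identical REF and ALT, sorted so that comm can compare them
bcftools isec dbSNP_common.vcf.gz clinvar.vcf.gz -c none -n =2 -w 1 | sort > none_sorted.vcf
sort common/0003.vcf > common/0003_sorted.vcf
# Print the lines found only in none_sorted.vcf
comm -13 common/0003_sorted.vcf none_sorted.vcf
#+end_src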
******** DONE Handling duplicates: -c none
CLOSED: [2022-10-23 Sun 13:56]
If we keep only those with identical REF and ALT
#+begin_src sh
bcftools isec dbSNP_common.vcf.gz clinvar.vcf.gz -c none -n =2 -w 1 | wc -l
74978
#+end_src
If we keep everything
#+begin_src sh
bcftools isec dbSNP_common.vcf.gz clinvar.vcf.gz -c all -n =2 -w 1 | wc -l
137777
#+end_src
To look at the difference:
#+begin_src sh
bcftools isec dbSNP_common.vcf.gz clinvar.vcf.gz -c none -n =2 -w 1 | sort > none_sorted.vcf
bcftools isec dbSNP_common.vcf.gz clinvar.vcf.gz -c all -n =2 -w 1 | sort > all_sorted.vcf
comm -13 none_sorted.vcf all_sorted.vcf | head
#+end_src
On one example, the variants are indeed different
******** DONE Removing pathogenic ClinVar variants
CLOSED: [2022-10-23 Sun 18:55]
Seems to do the job, given that dbSNP_common has 23194960 lines (so ~80,000 fewer)
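As a small self-contained illustration of the comparison above: =comm -13= prints the lines found only in the second sorted input. The records below are made-up stand-ins (hypothetical positions and alleles, hypothetical file names), not taken from dbSNP or ClinVar, so this is only a sketch of the technique.

```sh
#!/bin/sh
# Toy illustration of `comm -13`: keep the lines unique to the second file.
# All records are invented; real input would be the sorted isec outputs.
tmp=$(mktemp -d)

# Stand-in for the `-c none` output (REF and ALT must be identical).
printf '%s\n' 'chr1 100 A G' 'chr1 200 C T' | LC_ALL=C sort > "$tmp/none_sorted.txt"

# Stand-in for the `-c all` output (records match regardless of alleles).
printf '%s\n' 'chr1 100 A G' 'chr1 200 C T' 'chr1 300 G A' | LC_ALL=C sort > "$tmp/all_sorted.txt"

# -1 drops lines only in file 1, -3 drops lines common to both files.
only_in_all=$(LC_ALL=C comm -13 "$tmp/none_sorted.txt" "$tmp/all_sorted.txt")
echo "$only_in_all"

rm -r "$tmp"
```

Both inputs are sorted under the same locale (=LC_ALL=C=) because =comm= expects its inputs in the collation order it uses itself.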
#+begin_src sh
$ bcftools isec -e 'INFO/CLNSIG="Pathogenic" & INFO/CLNSIG="Pathogenic/Likely_pathogenic"' -c none -n~10 dbSNP_common.vcf.gz clinvar.vcf.gz | wc -l
Note: -w option not given, printing list of sites...
23119984
#+end_src
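One point that may be worth double-checking in the expression above: it joins two equality tests on the same tag with =&=. The sketch below uses awk on made-up rows holding a single value each, so it does not reproduce how bcftools evaluates its own expression language; it only shows that, for a single-valued field, an AND of two different equalities never matches while an OR matches both labels. Whether an OR was what was intended here is an assumption.

```sh
#!/bin/sh
# Toy rows: one CLNSIG-like value per line (hypothetical, not real ClinVar records).
values='Benign
Pathogenic
Pathogenic/Likely_pathogenic'

# AND of two different equalities on the same single value can never hold.
and_hits=$(printf '%s\n' "$values" \
  | awk '$0 == "Pathogenic" && $0 == "Pathogenic/Likely_pathogenic"' \
  | wc -l | tr -d ' ')

# OR matches either label.
or_hits=$(printf '%s\n' "$values" \
  | awk '$0 == "Pathogenic" || $0 == "Pathogenic/Likely_pathogenic"' \
  | wc -l | tr -d ' ')

echo "and=$and_hits or=$or_hits"
```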
However, the -w or -p option produces "data" files...
After another attempt, the problem is gone
#+begin_src sh
$ bcftools isec -e 'INFO/CLNSIG="Pathogenic" & INFO/CLNSIG="Pathogenic/Likely_pathogenic"' -c none -n=1 dbSNP_common.vcf.gz clinvar.vcf.gz -w 1 -o lol.vcf.gz
$ zcat lol.vcf.gz | wc -l
23120660
#+end_src
Note that the choice of the -n option changes between "=1" and "~10"...
Indeed, "=1" selects sites present in exactly one file (either one), whereas "~10" selects sites present in the first file and absent from the second
#+begin_src sh
$ bcftools isec -e 'INFO/CLNSIG="Pathogenic" & INFO/CLNSIG="Pathogenic/Likely_pathogenic"' -c none -n~10 dbSNP_common.vcf.gz clinvar.vcf.gz -w 1 -o lol.vcf.gz
$ zcat lol.vcf.gz | wc -l
23120660
#+end_src
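The two -n selections can be mimicked on plain site lists to see why both runs give the same count once -w 1 is added. This is a toy emulation with =comm= on invented site names, not a call to bcftools; the file names and sites are hypothetical.

```sh
#!/bin/sh
# Toy emulation of the two -n selections on plain site lists (not real bcftools).
tmp=$(mktemp -d)
printf '%s\n' 'chr1:100' 'chr1:200' 'chr1:300' > "$tmp/first.txt"   # hypothetical file 1 sites
printf '%s\n' 'chr1:200' 'chr1:400' > "$tmp/second.txt"             # hypothetical file 2 sites

# "~10": present in the first file, absent from the second.
bitmap_10=$(LC_ALL=C comm -23 "$tmp/first.txt" "$tmp/second.txt" | paste -sd ' ' -)

# "=1": present in exactly one file, whichever it is.
# comm -3 prefixes sites unique to file 2 with a tab; strip it for display.
exactly_one=$(LC_ALL=C comm -3 "$tmp/first.txt" "$tmp/second.txt" | tr -d '\t' | paste -sd ' ' -)

# "=1" keeping only sites that come from the first file, as -w 1 would write:
# drop the tab-prefixed lines, which belong to file 2.
exactly_one_w1=$(LC_ALL=C comm -3 "$tmp/first.txt" "$tmp/second.txt" | awk '!/^\t/' | paste -sd ' ' -)

echo "~10:        $bitmap_10"
echo "=1:         $exactly_one"
echo "=1 + -w 1:  $exactly_one_w1"

rm -r "$tmp"
```

With only two input files, "=1" restricted to the first file and "~10" pick the same sites, which is consistent with the identical line counts in the two runs above.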
******* DONE bedtools intersect trial
* Difficulties
** ZFS: "Mismatch between pool hostid and system"
Does zgenhostid $(hostid) fix the problem?