* <2023-01-04 Wed> Workout
- RTO: 32-15-13
Skin-the-cat (with descent): 3
- Muscle-up (negatives) 3x5
Extension: 3x10
- FL tucked row : 3x{3+1}
Pistols : 4-5-5
- Planche tucked push-up: 3x{5+3+2}
Compression: 3x22
* DONE Quillbot
CLOSED: [2023-01-03 Tue 22:54]
:PROPERTIES:
:ARCHIVE_TIME: 2023-01-03 Tue 22:54
:ARCHIVE_FILE: ~/org/projects.org
:ARCHIVE_OLPATH: Projets personnels
:ARCHIVE_CATEGORY: projects
:ARCHIVE_TODO: DONE
:END:
Selenium can be used for the whole flow.
But if we copy the cookies and headers from the browser:
- we can bypass Cloudflare (since we do get a result back)
- but the response is binary instead of JSON...
**** HOLD Non-regression version
HaplotypeCaller crashed; rerunning it.
**** STRT Non-regression version: without downloading the databases
**** TODO Compare versions (log)
***** TODO Databases
****** Reference genome:
fna ok
#+begin_src sh :dir /ssh:meso:/Work/Groups/bisonex/
sha256sum data-alexis-reference/genome/GRCh38_latest_genomic.fna
sha256sum data/genome/GRCh38.p13/genomeRef.fna
#+end_src
#+RESULTS:
| e0761a7ba5d10de9e7e97fa331667963925531c0199575bcceafbb13c3147e3f | data-alexis-reference/genome/GRCh38_latest_genomic.fna |
| e0761a7ba5d10de9e7e97fa331667963925531c0199575bcceafbb13c3147e3f | data/genome/GRCh38.p13/genomeRef.fna |
$ samtools view -c files/tmp_63003856_S135/63003856_S135.bam
128076728
instead of
$ samtools view -c /Work/Groups/bisonex/ref_63003856_S135/63003856_S135.bam
128077207
Dict OK if the original file is renamed (the UR field is rewritten before diffing):
#+begin_src sh :dir /ssh:meso:/Work/Groups/bisonex/
sed 's/UR:.*/UR:genomeRef.fna/' data-alexis-reference/genome/GRCh38_latest_genomic.dict > lol.dict
diff lol.dict data/genome/GRCh38.p13/genomeRef.dict
#+end_src
**** TODO Compare versions (log)
- tools
- databases
**** TODO Alignment
***** DONE Raw
#+RESULTS:
****** DONE dbSNP and dbSNP common: OK
CLOSED: [2023-01-03 Tue 23:17]
sha256sum GCF_000001405.39.gz
452e1112b6339a9b19821c2a226a8a3ba946e92a47e03e6ae464ef8820ee130d GCF_000001405.39.gz
sha256sum data-alexis-reference/dbSNP/GCF_000001405.39.gz
452e1112b6339a9b19821c2a226a8a3ba946e92a47e03e6ae464ef8820ee130d data-alexis-reference/dbSNP/GCF_000001405.39.gz
sha256sum dbSNP_common.vcf.gz
70dfd9be859c39916598d23b5744cc1fbda04add5840cd90a6d0cd005bd3075b dbSNP_common.vcf.gz
sha256sum data-alexis-reference/dbSNP/dbSNP_common.vcf.gz
70dfd9be859c39916598d23b5744cc1fbda04add5840cd90a6d0cd005bd3075b data-alexis-reference/dbSNP/dbSNP_common.vcf.gz
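The pairwise checksum comparison above can be wrapped in a small helper. This is only a sketch: the function name =same_sha256= is hypothetical and it assumes GNU =sha256sum= is on the PATH.

```shell
# same_sha256 FILE_A FILE_B
# Succeeds (exit 0) when both files have the same SHA-256 digest,
# fails with 1 when they differ and with 2 when a file is unreadable.
# (Hypothetical helper name; assumes sha256sum from GNU coreutils.)
same_sha256() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(sha256sum "$1" | cut -d ' ' -f 1)
    b=$(sha256sum "$2" | cut -d ' ' -f 1)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example call, using the paths from the notes:
# same_sha256 dbSNP_common.vcf.gz data-alexis-reference/dbSNP/dbSNP_common.vcf.gz && echo identical
```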
***** TODO Tools
| | Prod | Test |
| VCFtools | 0.1.17 | 0.1.16 |
| bcftools | 1.14 | 1.16 |
| samtools | 1.14 | 1.13 |
| gatk | 4.2.4.1 | 4.3.0.0 |
The test side has older versions of VCFtools and samtools, but newer versions of bcftools and (most importantly) GATK.
**** KILL Gatk 4.3.0
CLOSED: [2023-01-04 Wed 19:16]
***** KILL Alignment
CLOSED: [2023-01-04 Wed 19:16]
****** DONE Raw
***** TODO baserecalibrator
options ok
#+begin_src sh
gatk --java-options "-Xmx3g" BaseRecalibrator \
  --input marked_dups.bam \
  --output 63003856_S135.table \
  --reference genomeRef.fna \
  --known-sites dbSNP_common.vcf.gz \
  --tmp-dir .
#+end_src
****** KILL baserecalibrator
CLOSED: [2023-01-04 Wed 19:15]
***** TODO applybqsr
options ok
#+begin_src sh
gatk --java-options "-Xmx3g" BaseRecalibrator \
  --input marked_dups.bam \
  --output 63003856_S135.table \
  --reference genomeRef.fna \
  --known-sites dbSNP_common.vcf.gz \
  --tmp-dir .
#+end_src
****** KILL applybqsr
CLOSED: [2023-01-04 Wed 19:15]
**** variant calling
***** TODO filterDepth: 21 too many
******** DONE Check the HaplotypeCaller flags: many differences...
CLOSED: [2023-01-04 Wed 19:02]
| --dbsnp /Work/Users/apraga/bisonex/work/08/fca52ac598f21a2812f866bd590792/dbSNP.gz | --dbsnp /mnt/j/bases_de_donnees/dbSNP/GCF_000001405.39.gz |
| --max-mnp-distance 2 | --max-mnp-distance 2 |
| --output 63003856_S135.vcf.gz | --output /mnt/j/working_directory_pipeline_analyse_exome/vcf/63003856_S135.vcf |
| --input 63003856_S135.bam | --input /mnt/j/working_directory_pipeline_analyse_exome/bam/63003856_S135_recalibrated_hg38.bam |
| --reference genomeRef.fna | --reference /mnt/j/bases_de_donnees/genome/GRCh38_latest_genomic.fna |
| --tmp-dir . | --verbosity WARNING |
| --use-posteriors-to-calculate-qual false | --use-posteriors-to-calculate-qual false |
| --dont-use-dragstr-priors false | --dont-use-dragstr-priors false |
| --use-new-qual-calculator true | --use-new-qual-calculator true |
| --annotate-with-num-discovered-alleles false | --annotate-with-num-discovered-alleles false |
| --heterozygosity 0.001 | --heterozygosity 0.001 |
| --indel-heterozygosity 1.25E-4 | --indel-heterozygosity 1.25E-4 |
| --heterozygosity-stdev 0.01 | --heterozygosity-stdev 0.01 |
| --standard-min-confidence-threshold-for-calling 30.0 | --standard-min-confidence-threshold-for-calling 30.0 |
| --max-alternate-alleles 6 | --max-alternate-alleles 6 |
| --max-genotype-count 1024 | --max-genotype-count 1024 |
| --sample-ploidy 2 | --sample-ploidy 2 |
| --num-reference-samples-if-no-call 0 | --num-reference-samples-if-no-call 0 |
| --genotype-assignment-method USE_PLS_TO_ASSIGN | --genotype-assignment-method USE_PLS_TO_ASSIGN |
| --contamination-fraction-to-filter 0.0 | --contamination-fraction-to-filter 0.0 |
| --output-mode EMIT_VARIANTS_ONLY | --output-mode EMIT_VARIANTS_ONLY |
| --all-site-pls false | --all-site-pls false |
| --flow-likelihood-parallel-threads 0 | --gvcf-gq-bands 1 |
| --flow-likelihood-optimized-comp false | --gvcf-gq-bands 2 |
| --flow-use-t0-tag false | --gvcf-gq-bands 3 |
| --flow-probability-threshold 0.003 | --gvcf-gq-bands 4 |
| --flow-remove-non-single-base-pair-indels false | --gvcf-gq-bands 5 |
| --flow-remove-one-zero-probs false | --gvcf-gq-bands 6 |
| --flow-quantization-bins 121 | --gvcf-gq-bands 7 |
| --flow-fill-empty-bins-value 0.001 | --gvcf-gq-bands 8 |
| --flow-symmetric-indel-probs false | --gvcf-gq-bands 9 |
| --flow-report-insertion-or-deletion false | --gvcf-gq-bands 10 |
| --flow-disallow-probs-larger-than-call false | --gvcf-gq-bands 11 |
| --flow-lump-probs false | --gvcf-gq-bands 12 |
| --flow-retain-max-n-probs-base-format false | --gvcf-gq-bands 13 |
| --flow-probability-scaling-factor 10 | --gvcf-gq-bands 14 |
| --flow-order-cycle-length 4 | --gvcf-gq-bands 15 |
| --flow-number-of-uncertain-flows-to-clip 0 | --gvcf-gq-bands 16 |
| --flow-nucleotide-of-first-uncertain-flow T | --gvcf-gq-bands 17 |
| --keep-boundary-flows false | --gvcf-gq-bands 18 |
| --gvcf-gq-bands 1 | --gvcf-gq-bands 19 |
| --gvcf-gq-bands 2 | --gvcf-gq-bands 20 |
| --gvcf-gq-bands 3 | --gvcf-gq-bands 21 |
| --gvcf-gq-bands 4 | --gvcf-gq-bands 22 |
| --gvcf-gq-bands 5 | --gvcf-gq-bands 23 |
| --gvcf-gq-bands 6 | --gvcf-gq-bands 24 |
| --gvcf-gq-bands 7 | --gvcf-gq-bands 25 |
| --gvcf-gq-bands 8 | --gvcf-gq-bands 26 |
| --gvcf-gq-bands 9 | --gvcf-gq-bands 27 |
| --gvcf-gq-bands 10 | --gvcf-gq-bands 28 |
| --gvcf-gq-bands 11 | --gvcf-gq-bands 29 |
| --gvcf-gq-bands 12 | --gvcf-gq-bands 30 |
| --gvcf-gq-bands 13 | --gvcf-gq-bands 31 |
| --gvcf-gq-bands 14 | --gvcf-gq-bands 32 |
| --gvcf-gq-bands 15 | --gvcf-gq-bands 33 |
| --gvcf-gq-bands 16 | --gvcf-gq-bands 34 |
| --gvcf-gq-bands 17 | --gvcf-gq-bands 35 |
| --gvcf-gq-bands 18 | --gvcf-gq-bands 36 |
| --gvcf-gq-bands 19 | --gvcf-gq-bands 37 |
| --gvcf-gq-bands 20 | --gvcf-gq-bands 38 |
| --gvcf-gq-bands 21 | --gvcf-gq-bands 39 |
| --gvcf-gq-bands 22 | --gvcf-gq-bands 40 |
| --gvcf-gq-bands 23 | --gvcf-gq-bands 41 |
| --gvcf-gq-bands 24 | --gvcf-gq-bands 42 |
| --gvcf-gq-bands 25 | --gvcf-gq-bands 43 |
| --gvcf-gq-bands 26 | --gvcf-gq-bands 44 |
| --gvcf-gq-bands 27 | --gvcf-gq-bands 45 |
| --gvcf-gq-bands 28 | --gvcf-gq-bands 46 |
| --gvcf-gq-bands 29 | --gvcf-gq-bands 47 |
| --gvcf-gq-bands 30 | --gvcf-gq-bands 48 |
| --gvcf-gq-bands 31 | --gvcf-gq-bands 49 |
| --gvcf-gq-bands 32 | --gvcf-gq-bands 50 |
| --gvcf-gq-bands 33 | --gvcf-gq-bands 51 |
| --gvcf-gq-bands 34 | --gvcf-gq-bands 52 |
| --gvcf-gq-bands 35 | --gvcf-gq-bands 53 |
| --gvcf-gq-bands 36 | --gvcf-gq-bands 54 |
| --gvcf-gq-bands 37 | --gvcf-gq-bands 55 |
| --gvcf-gq-bands 38 | --gvcf-gq-bands 56 |
| --gvcf-gq-bands 39 | --gvcf-gq-bands 57 |
| --gvcf-gq-bands 40 | --gvcf-gq-bands 58 |
| --gvcf-gq-bands 41 | --gvcf-gq-bands 59 |
| --gvcf-gq-bands 42 | --gvcf-gq-bands 60 |
| --gvcf-gq-bands 43 | --gvcf-gq-bands 70 |
| --gvcf-gq-bands 44 | --gvcf-gq-bands 80 |
| --gvcf-gq-bands 45 | --gvcf-gq-bands 90 |
| --gvcf-gq-bands 46 | --gvcf-gq-bands 99 |
| --gvcf-gq-bands 47 | --floor-blocks false |
| --gvcf-gq-bands 48 | --indel-size-to-eliminate-in-ref-model 10 |
| --gvcf-gq-bands 49 | --disable-optimizations false |
| --gvcf-gq-bands 50 | --dragen-mode false |
| --gvcf-gq-bands 51 | --apply-bqd false |
| --gvcf-gq-bands 52 | --apply-frd false |
| --gvcf-gq-bands 53 | --disable-spanning-event-genotyping false |
| --gvcf-gq-bands 54 | --transform-dragen-mapping-quality false |
| --gvcf-gq-bands 55 | --mapping-quality-threshold-for-genotyping 20 |
| --gvcf-gq-bands 56 | --max-effective-depth-adjustment-for-frd 0 |
| --gvcf-gq-bands 57 | --just-determine-active-regions false |
| --gvcf-gq-bands 58 | --dont-genotype false |
| --gvcf-gq-bands 59 | --do-not-run-physical-phasing false |
| --gvcf-gq-bands 60 | --do-not-correct-overlapping-quality false |
| --gvcf-gq-bands 70 | --use-filtered-reads-for-annotations false |
| --gvcf-gq-bands 80 | --adaptive-pruning false |
| --gvcf-gq-bands 90 | --do-not-recover-dangling-branches false |
| --gvcf-gq-bands 99 | --recover-dangling-heads false |
| --floor-blocks false | --kmer-size 10 |
| --indel-size-to-eliminate-in-ref-model 10 | --kmer-size 25 |
| --disable-optimizations false | --dont-increase-kmer-sizes-for-cycles false |
| --dragen-mode false | --allow-non-unique-kmers-in-ref false |
| --flow-mode NONE | --num-pruning-samples 1 |
| --apply-bqd false | --min-dangling-branch-length 4 |
| --apply-frd false | --recover-all-dangling-branches false |
| --disable-spanning-event-genotyping false | --max-num-haplotypes-in-population 128 |
| --transform-dragen-mapping-quality false | --min-pruning 2 |
| --mapping-quality-threshold-for-genotyping 20 | --adaptive-pruning-initial-error-rate 0.001 |
| --max-effective-depth-adjustment-for-frd 0 | --pruning-lod-threshold 2.302585092994046 |
| --just-determine-active-regions false | --pruning-seeding-lod-threshold 9.210340371976184 |
| --dont-genotype false | --max-unpruned-variants 100 |
| --do-not-run-physical-phasing false | --linked-de-bruijn-graph false |
| --do-not-correct-overlapping-quality false | --disable-artificial-haplotype-recovery false |
| --use-filtered-reads-for-annotations false | --enable-legacy-graph-cycle-detection false |
| --use-flow-aligner-for-stepwise-hc-filtering false | --debug-assembly false |
| --adaptive-pruning false | --debug-graph-transformations false |
| --do-not-recover-dangling-branches false | --capture-assembly-failure-bam false |
| --recover-dangling-heads false | --num-matching-bases-in-dangling-end-to-recover -1 |
| --kmer-size 10 | --error-correction-log-odds -Infinity |
| --kmer-size 25 | --error-correct-reads false |
| --dont-increase-kmer-sizes-for-cycles false | --kmer-length-for-read-error-correction 25 |
| --allow-non-unique-kmers-in-ref false | --min-observations-for-kmer-to-be-solid 20 |
| --num-pruning-samples 1 | --base-quality-score-threshold 18 |
| --min-dangling-branch-length 4 | --dragstr-het-hom-ratio 2 |
| --recover-all-dangling-branches false | --dont-use-dragstr-pair-hmm-scores false |
| --max-num-haplotypes-in-population 128 | --pair-hmm-gap-continuation-penalty 10 |
| --min-pruning 2 | --expected-mismatch-rate-for-read-disqualification 0.02 |
| --adaptive-pruning-initial-error-rate 0.001 | --pair-hmm-implementation FASTEST_AVAILABLE |
| --pruning-lod-threshold 2.302585092994046 | --pcr-indel-model CONSERVATIVE |
| --pruning-seeding-lod-threshold 9.210340371976184 | --phred-scaled-global-read-mismapping-rate 45 |
| --max-unpruned-variants 100 | --disable-symmetric-hmm-normalizing false |
| --linked-de-bruijn-graph false | --disable-cap-base-qualities-to-map-quality false |
| --disable-artificial-haplotype-recovery false | --enable-dynamic-read-disqualification-for-genotyping false |
| --enable-legacy-graph-cycle-detection false | --dynamic-read-disqualification-threshold 1.0 |
| --debug-assembly false | --native-pair-hmm-threads 4 |
| --debug-graph-transformations false | --native-pair-hmm-use-double-precision false |
| --capture-assembly-failure-bam false | --bam-writer-type CALLED_HAPLOTYPES |
| --num-matching-bases-in-dangling-end-to-recover -1 | --dont-use-soft-clipped-bases false |
| --error-correction-log-odds -Infinity | --min-base-quality-score 10 |
| --error-correct-reads false | --smith-waterman JAVA |
| --kmer-length-for-read-error-correction 25 | --emit-ref-confidence NONE |
| --min-observations-for-kmer-to-be-solid 20 | --force-call-filtered-alleles false |
| --likelihood-calculation-engine PairHMM | --soft-clip-low-quality-ends false |
| --base-quality-score-threshold 18 | --allele-informative-reads-overlap-margin 2 |
| --dragstr-het-hom-ratio 2 | --smith-waterman-dangling-end-match-value 25 |
| --dont-use-dragstr-pair-hmm-scores false | --smith-waterman-dangling-end-mismatch-penalty -50 |
| --pair-hmm-gap-continuation-penalty 10 | --smith-waterman-dangling-end-gap-open-penalty -110 |
| --expected-mismatch-rate-for-read-disqualification 0.02 | --smith-waterman-dangling-end-gap-extend-penalty -6 |
| --pair-hmm-implementation FASTEST_AVAILABLE | --smith-waterman-haplotype-to-reference-match-value 200 |
| --pcr-indel-model CONSERVATIVE | --smith-waterman-haplotype-to-reference-mismatch-penalty -150 |
| --phred-scaled-global-read-mismapping-rate 45 | --smith-waterman-haplotype-to-reference-gap-open-penalty -260 |
| --disable-symmetric-hmm-normalizing false | --smith-waterman-haplotype-to-reference-gap-extend-penalty -11 |
| --disable-cap-base-qualities-to-map-quality false | --smith-waterman-read-to-haplotype-match-value 10 |
| --enable-dynamic-read-disqualification-for-genotyping false | --smith-waterman-read-to-haplotype-mismatch-penalty -15 |
| --dynamic-read-disqualification-threshold 1.0 | --smith-waterman-read-to-haplotype-gap-open-penalty -30 |
| --native-pair-hmm-threads 4 | --smith-waterman-read-to-haplotype-gap-extend-penalty -5 |
| --native-pair-hmm-use-double-precision false | --min-assembly-region-size 50 |
| --flow-hmm-engine-min-indel-adjust 6 | --max-assembly-region-size 300 |
| --flow-hmm-engine-flat-insertion-penatly 45 | --active-probability-threshold 0.002 |
| --flow-hmm-engine-flat-deletion-penatly 45 | --max-prob-propagation-distance 50 |
| --pileup-detection false | --force-active false |
| --pileup-detection-enable-indel-pileup-calling false | --assembly-region-padding 100 |
| --num-artificial-haplotypes-to-add-per-allele 5 | --padding-around-indels 75 |
| --artifical-haplotype-filtering-kmer-size 10 | --padding-around-snps 20 |
| --pileup-detection-snp-alt-threshold 0.1 | --padding-around-strs 75 |
| --pileup-detection-indel-alt-threshold 0.5 | --max-extension-into-assembly-region-padding-legacy 25 |
| --pileup-detection-absolute-alt-depth 0.0 | --max-reads-per-alignment-start 50 |
| --pileup-detection-snp-adjacent-to-assembled-indel-range 5 | --enable-legacy-assembly-region-trimming false |
| --pileup-detection-bad-read-tolerance 0.0 | --interval-set-rule UNION |
| --pileup-detection-proper-pair-read-badness true | --interval-padding 0 |
| --pileup-detection-edit-distance-read-badness-threshold 0.08 | --interval-exclusion-padding 0 |
| --pileup-detection-chimeric-read-badness true | --interval-merging-rule ALL |
| --pileup-detection-template-mean-badness-threshold 0.0 | --read-validation-stringency SILENT |
| --pileup-detection-template-std-badness-threshold 0.0 | --seconds-between-progress-updates 10.0 |
| --bam-writer-type CALLED_HAPLOTYPES | --disable-sequence-dictionary-validation false |
| --dont-use-soft-clipped-bases false | --create-output-bam-index true |
| --override-fragment-softclip-check false | --create-output-bam-md5 false |
| --min-base-quality-score 10 | --create-output-variant-index true |
| --smith-waterman JAVA | --create-output-variant-md5 false |
| --emit-ref-confidence NONE | --max-variants-per-shard 0 |
| --force-call-filtered-alleles false | --lenient false |
| --reference-model-deletion-quality 30 | --add-output-sam-program-record true |
| --soft-clip-low-quality-ends false | --add-output-vcf-command-line true |
| --allele-informative-reads-overlap-margin 2 | --cloud-prefetch-buffer 40 |
| --smith-waterman-dangling-end-match-value 25 | --cloud-index-prefetch-buffer -1 |
| --smith-waterman-dangling-end-mismatch-penalty -50 | --disable-bam-index-caching false |
| --smith-waterman-dangling-end-gap-open-penalty -110 | --sites-only-vcf-output false |
| --smith-waterman-dangling-end-gap-extend-penalty -6 | --help false |
| --smith-waterman-haplotype-to-reference-match-value 200 | --version false |
| --smith-waterman-haplotype-to-reference-mismatch-penalty -150 | --showHidden false |
| --smith-waterman-haplotype-to-reference-gap-open-penalty -260 | --QUIET false |
| --smith-waterman-haplotype-to-reference-gap-extend-penalty -11 | --use-jdk-deflater false |
| --smith-waterman-read-to-haplotype-match-value 10 | --use-jdk-inflater false |
| --smith-waterman-read-to-haplotype-mismatch-penalty -15 | --gcs-max-retries 20 |
| --smith-waterman-read-to-haplotype-gap-open-penalty -30 | --gcs-project-for-requester-pays |
| --smith-waterman-read-to-haplotype-gap-extend-penalty -5 | --disable-tool-default-read-filters false |
| --flow-assembly-collapse-hmer-size 0 | --minimum-mapping-quality 20 |
| --flow-assembly-collapse-partial-mode false | --disable-tool-default-annotations false |
| --flow-filter-alleles false | --enable-all-annotations false |
| --flow-filter-alleles-qual-threshold 30.0 | --allow-old-rms-mapping-quality-annotation-data false |
| --flow-filter-alleles-sor-threshold 3.0 | Version="4.2.4.1",Date="December 3, 2022 at 1:20:38 AM CET"> |
| --flow-filter-lone-alleles false |
| --flow-filter-alleles-debug-graphs false |
| --min-assembly-region-size 50 |
| --max-assembly-region-size 300 |
| --active-probability-threshold 0.002 |
| --max-prob-propagation-distance 50 |
| --force-active false |
| --assembly-region-padding 100 |
| --padding-around-indels 75 |
| --padding-around-snps 20 |
| --padding-around-strs 75 |
| --max-extension-into-assembly-region-padding-legacy 25 |
| --max-reads-per-alignment-start 50 |
| --enable-legacy-assembly-region-trimming false |
| --interval-set-rule UNION |
| --interval-padding 0 |
| --interval-exclusion-padding 0 |
| --interval-merging-rule ALL |
| --read-validation-stringency SILENT |
| --seconds-between-progress-updates 10.0 |
| --disable-sequence-dictionary-validation false |
| --create-output-bam-index true |
| --create-output-bam-md5 false |
| --create-output-variant-index true |
| --create-output-variant-md5 false |
| --max-variants-per-shard 0 |
| --lenient false |
| --add-output-sam-program-record true |
| --add-output-vcf-command-line true |
| --cloud-prefetch-buffer 40 |
| --cloud-index-prefetch-buffer -1 |
| --disable-bam-index-caching false |
| --sites-only-vcf-output false |
| --help false |
| --version false |
| --showHidden false |
| --verbosity INFO |
| --QUIET false |
| --use-jdk-deflater false |
| --use-jdk-inflater false |
| --gcs-max-retries 20 |
| --gcs-project-for-requester-pays |
| --disable-tool-default-read-filters false |
| --minimum-mapping-quality 20 |
| --disable-tool-default-annotations false |
| --enable-all-annotations false |
| --allow-old-rms-mapping-quality-annotation-data false" |
| Version="4.3.0.0",Date="December 16, 2022 at 12:51:03 AM CET"> |
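Reading the two flag lists side by side is error-prone because the rows do not line up. A possible way to automate the comparison is to extract the flags from the =##GATKCommandLine= header of each VCF and diff the sorted lists. The sketch below assumes that header has the shape shown in the table; the function names and the list of ignored, path-dependent flags are hypothetical choices.

```shell
# gatk_flags VCF
# Print one "--flag value" per line, sorted, taken from the first
# ##GATKCommandLine=<ID=HaplotypeCaller,...> header line of a VCF
# (plain or gzip-compressed, decided by the .gz suffix).
gatk_flags() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac |
        sed -n '/^##GATKCommandLine=<ID=HaplotypeCaller/{p;q;}' |
        sed -e 's/^[^"]*"//' -e 's/",Version=.*$//' |
        awk '{ n = split($0, parts, " --"); for (i = 2; i <= n; i++) print "--" parts[i] }' |
        LC_ALL=C sort
}

# diff_gatk_flags TEST_VCF PROD_VCF
# Diff the two flag lists, ignoring flags whose values are expected to
# differ between machines (assumption: these five are path-dependent).
diff_gatk_flags() {
    left=$(mktemp) || return 2
    right=$(mktemp) || return 2
    ignore='^--(input|output|reference|dbsnp|tmp-dir) '
    gatk_flags "$1" | grep -v -E -e "$ignore" > "$left"
    gatk_flags "$2" | grep -v -E -e "$ignore" > "$right"
    diff "$left" "$right"
    rc=$?
    rm -f "$left" "$right"
    return "$rc"
}
```

Lines prefixed with =<= exist only in the first file and lines prefixed with =>= only in the second, so flags introduced by a newer GATK release show up directly.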
****** KILL [#B] filterDepth: 21 too many
CLOSED: [2023-01-04 Wed 19:16]
***** TODO Filter technical variants
****** KILL Filter technical variants
CLOSED: [2023-01-04 Wed 19:16]
**** Gatk 4.2.4 (same version as Alexis)
***** TODO Variant calling
****** TODO HaplotypeCaller: better but not identical!
******* DONE Line counts with gatk 4.2.2: small difference
CLOSED: [2023-01-04 Wed 19:18]
$ zgrep '^NC' 63003856_S135.vcf.gz | wc -l
1506931
$ grep '^NC' /Work/Groups/bisonex/ref-vcf/63003856_S135.vcf | wc -l
1506894
******* DONE Flags with the same gatk version (4.2.2): OK, identical
CLOSED: [2023-01-04 Wed 19:09]
##GATKCommandLine=<ID=HaplotypeCaller,CommandLine="HaplotypeCaller
| ",Version="4.2.4.1",Date="January 4, 2023 at 1:46:41 AM CET"> | Version="4.2.4.1",Date="December 3, 2022 at 1:20:38 AM CET"> |
| --dbsnp /Work/Users/apraga/bisonex/work/5d/feb81028d262d7701bed0a759ff6f6/dbSNP.gz | --dbsnp /mnt/j/bases_de_donnees/dbSNP/GCF_000001405.39.gz |
| --max-mnp-distance 2 | --max-mnp-distance 2 |
| --output 63003856_S135.vcf.gz | --output /mnt/j/working_directory_pipeline_analyse_exome/vcf/63003856_S135.vcf |
| --input 63003856_S135.bam | --input /mnt/j/working_directory_pipeline_analyse_exome/bam/63003856_S135_recalibrated_hg38.bam |
| --reference genomeRef.fna | --reference /mnt/j/bases_de_donnees/genome/GRCh38_latest_genomic.fna |
| --tmp-dir . | --verbosity WARNING |
| --use-posteriors-to-calculate-qual false | --use-posteriors-to-calculate-qual false |
| --dont-use-dragstr-priors false | --dont-use-dragstr-priors false |
| --use-new-qual-calculator true | --use-new-qual-calculator true |
| --annotate-with-num-discovered-alleles false | --annotate-with-num-discovered-alleles false |
| --heterozygosity 0.001 | --heterozygosity 0.001 |
| --indel-heterozygosity 1.25E-4 | --indel-heterozygosity 1.25E-4 |
| --heterozygosity-stdev 0.01 | --heterozygosity-stdev 0.01 |
| --standard-min-confidence-threshold-for-calling 30.0 | --standard-min-confidence-threshold-for-calling 30.0 |
| --max-alternate-alleles 6 | --max-alternate-alleles 6 |
| --max-genotype-count 1024 | --max-genotype-count 1024 |
| --sample-ploidy 2 | --sample-ploidy 2 |
| --num-reference-samples-if-no-call 0 | --num-reference-samples-if-no-call 0 |
| --genotype-assignment-method USE_PLS_TO_ASSIGN | --genotype-assignment-method USE_PLS_TO_ASSIGN |
| --contamination-fraction-to-filter 0.0 | --contamination-fraction-to-filter 0.0 |
| --output-mode EMIT_VARIANTS_ONLY | --output-mode EMIT_VARIANTS_ONLY |
| --all-site-pls false | --all-site-pls false |
| --gvcf-gq-bands 1 | --gvcf-gq-bands 1 |
| --gvcf-gq-bands 2 | --gvcf-gq-bands 2 |
| --gvcf-gq-bands 3 | --gvcf-gq-bands 3 |
| --gvcf-gq-bands 4 | --gvcf-gq-bands 4 |
| --gvcf-gq-bands 5 | --gvcf-gq-bands 5 |
| --gvcf-gq-bands 6 | --gvcf-gq-bands 6 |
| --gvcf-gq-bands 7 | --gvcf-gq-bands 7 |
| --gvcf-gq-bands 8 | --gvcf-gq-bands 8 |
| --gvcf-gq-bands 9 | --gvcf-gq-bands 9 |
| --gvcf-gq-bands 10 | --gvcf-gq-bands 10 |
| --gvcf-gq-bands 11 | --gvcf-gq-bands 11 |
| --gvcf-gq-bands 12 | --gvcf-gq-bands 12 |
| --gvcf-gq-bands 13 | --gvcf-gq-bands 13 |
| --gvcf-gq-bands 14 | --gvcf-gq-bands 14 |
| --gvcf-gq-bands 15 | --gvcf-gq-bands 15 |
| --gvcf-gq-bands 16 | --gvcf-gq-bands 16 |
| --gvcf-gq-bands 17 | --gvcf-gq-bands 17 |
| --gvcf-gq-bands 18 | --gvcf-gq-bands 18 |
| --gvcf-gq-bands 19 | --gvcf-gq-bands 19 |
| --gvcf-gq-bands 20 | --gvcf-gq-bands 20 |
| --gvcf-gq-bands 21 | --gvcf-gq-bands 21 |
| --gvcf-gq-bands 22 | --gvcf-gq-bands 22 |
| --gvcf-gq-bands 23 | --gvcf-gq-bands 23 |
| --gvcf-gq-bands 24 | --gvcf-gq-bands 24 |
| --gvcf-gq-bands 25 | --gvcf-gq-bands 25 |
| --gvcf-gq-bands 26 | --gvcf-gq-bands 26 |
| --gvcf-gq-bands 27 | --gvcf-gq-bands 27 |
| --gvcf-gq-bands 28 | --gvcf-gq-bands 28 |
| --gvcf-gq-bands 29 | --gvcf-gq-bands 29 |
| --gvcf-gq-bands 30 | --gvcf-gq-bands 30 |
| --gvcf-gq-bands 31 | --gvcf-gq-bands 31 |
| --gvcf-gq-bands 32 | --gvcf-gq-bands 32 |
| --gvcf-gq-bands 33 | --gvcf-gq-bands 33 |
| --gvcf-gq-bands 34 | --gvcf-gq-bands 34 |
| --gvcf-gq-bands 35 | --gvcf-gq-bands 35 |
| --gvcf-gq-bands 36 | --gvcf-gq-bands 36 |
| --gvcf-gq-bands 37 | --gvcf-gq-bands 37 |
| --gvcf-gq-bands 38 | --gvcf-gq-bands 38 |
| --gvcf-gq-bands 39 | --gvcf-gq-bands 39 |
| --gvcf-gq-bands 40 | --gvcf-gq-bands 40 |
| --gvcf-gq-bands 41 | --gvcf-gq-bands 41 |
| --gvcf-gq-bands 42 | --gvcf-gq-bands 42 |
| --gvcf-gq-bands 43 | --gvcf-gq-bands 43 |
| --gvcf-gq-bands 44 | --gvcf-gq-bands 44 |
| --gvcf-gq-bands 45 | --gvcf-gq-bands 45 |
| --gvcf-gq-bands 46 | --gvcf-gq-bands 46 |
| --gvcf-gq-bands 47 | --gvcf-gq-bands 47 |
| --gvcf-gq-bands 48 | --gvcf-gq-bands 48 |
| --gvcf-gq-bands 49 | --gvcf-gq-bands 49 |
| --gvcf-gq-bands 50 | --gvcf-gq-bands 50 |
| --gvcf-gq-bands 51 | --gvcf-gq-bands 51 |
| --gvcf-gq-bands 52 | --gvcf-gq-bands 52 |
| --gvcf-gq-bands 53 | --gvcf-gq-bands 53 |
| --gvcf-gq-bands 54 | --gvcf-gq-bands 54 |
| --gvcf-gq-bands 55 | --gvcf-gq-bands 55 |
| --gvcf-gq-bands 56 | --gvcf-gq-bands 56 |
| --gvcf-gq-bands 57 | --gvcf-gq-bands 57 |
| --gvcf-gq-bands 58 | --gvcf-gq-bands 58 |
| --gvcf-gq-bands 59 | --gvcf-gq-bands 59 |
| --gvcf-gq-bands 60 | --gvcf-gq-bands 60 |
| --gvcf-gq-bands 70 | --gvcf-gq-bands 70 |
| --gvcf-gq-bands 80 | --gvcf-gq-bands 80 |
| --gvcf-gq-bands 90 | --gvcf-gq-bands 90 |
| --gvcf-gq-bands 99 | --gvcf-gq-bands 99 |
| --floor-blocks false | --floor-blocks false |
| --indel-size-to-eliminate-in-ref-model 10 | --indel-size-to-eliminate-in-ref-model 10 |
| --disable-optimizations false | --disable-optimizations false |
| --dragen-mode false | --dragen-mode false |
| --apply-bqd false | --apply-bqd false |
| --apply-frd false | --apply-frd false |
| --disable-spanning-event-genotyping false | --disable-spanning-event-genotyping false |
| --transform-dragen-mapping-quality false | --transform-dragen-mapping-quality false |
| --mapping-quality-threshold-for-genotyping 20 | --mapping-quality-threshold-for-genotyping 20 |
| --max-effective-depth-adjustment-for-frd 0 | --max-effective-depth-adjustment-for-frd 0 |
| --just-determine-active-regions false | --just-determine-active-regions false |
| --dont-genotype false | --dont-genotype false |
| --do-not-run-physical-phasing false | --do-not-run-physical-phasing false |
| --do-not-correct-overlapping-quality false | --do-not-correct-overlapping-quality false |
| --use-filtered-reads-for-annotations false | --use-filtered-reads-for-annotations false |
| --adaptive-pruning false | --adaptive-pruning false |
| --do-not-recover-dangling-branches false | --do-not-recover-dangling-branches false |
| --recover-dangling-heads false | --recover-dangling-heads false |
| --kmer-size 10 | --kmer-size 10 |
| --kmer-size 25 | --kmer-size 25 |
| --dont-increase-kmer-sizes-for-cycles false | --dont-increase-kmer-sizes-for-cycles false |
| --allow-non-unique-kmers-in-ref false | --allow-non-unique-kmers-in-ref false |
| --num-pruning-samples 1 | --num-pruning-samples 1 |
| --min-dangling-branch-length 4 | --min-dangling-branch-length 4 |
| --recover-all-dangling-branches false | --recover-all-dangling-branches false |
| --max-num-haplotypes-in-population 128 | --max-num-haplotypes-in-population 128 |
| --min-pruning 2 | --min-pruning 2 |
| --adaptive-pruning-initial-error-rate 0.001 | --adaptive-pruning-initial-error-rate 0.001 |
| --pruning-lod-threshold 2.302585092994046 | --pruning-lod-threshold 2.302585092994046 |
| --pruning-seeding-lod-threshold 9.210340371976184 | --pruning-seeding-lod-threshold 9.210340371976184 |
| --max-unpruned-variants 100 | --max-unpruned-variants 100 |
| --linked-de-bruijn-graph false | --linked-de-bruijn-graph false |
| --disable-artificial-haplotype-recovery false | --disable-artificial-haplotype-recovery false |
| --enable-legacy-graph-cycle-detection false | --enable-legacy-graph-cycle-detection false |
| --debug-assembly false | --debug-assembly false |
| --debug-graph-transformations false | --debug-graph-transformations false |
| --capture-assembly-failure-bam false | --capture-assembly-failure-bam false |
| --num-matching-bases-in-dangling-end-to-recover -1 | --num-matching-bases-in-dangling-end-to-recover -1 |
| --error-correction-log-odds -Infinity | --error-correction-log-odds -Infinity |
| --error-correct-reads false | --error-correct-reads false |
| --kmer-length-for-read-error-correction 25 | --kmer-length-for-read-error-correction 25 |
| --min-observations-for-kmer-to-be-solid 20 | --min-observations-for-kmer-to-be-solid 20 |
| --base-quality-score-threshold 18 | --base-quality-score-threshold 18 |
| --dragstr-het-hom-ratio 2 | --dragstr-het-hom-ratio 2 |
| --dont-use-dragstr-pair-hmm-scores false | --dont-use-dragstr-pair-hmm-scores false |
| --pair-hmm-gap-continuation-penalty 10 | --pair-hmm-gap-continuation-penalty 10 |
| --expected-mismatch-rate-for-read-disqualification 0.02 | --expected-mismatch-rate-for-read-disqualification 0.02 |
| --pair-hmm-implementation FASTEST_AVAILABLE | --pair-hmm-implementation FASTEST_AVAILABLE |
| --pcr-indel-model CONSERVATIVE | --pcr-indel-model CONSERVATIVE |
| --phred-scaled-global-read-mismapping-rate 45 | --phred-scaled-global-read-mismapping-rate 45 |
| --disable-symmetric-hmm-normalizing false | --disable-symmetric-hmm-normalizing false |
| --disable-cap-base-qualities-to-map-quality false | --disable-cap-base-qualities-to-map-quality false |
| --enable-dynamic-read-disqualification-for-genotyping false | --enable-dynamic-read-disqualification-for-genotyping false |
| --dynamic-read-disqualification-threshold 1.0 | --dynamic-read-disqualification-threshold 1.0 |
| --native-pair-hmm-threads 4 | --native-pair-hmm-threads 4 |
| --native-pair-hmm-use-double-precision false | --native-pair-hmm-use-double-precision false |
| --bam-writer-type CALLED_HAPLOTYPES | --bam-writer-type CALLED_HAPLOTYPES |
| --dont-use-soft-clipped-bases false | --dont-use-soft-clipped-bases false |
| --min-base-quality-score 10 | --min-base-quality-score 10 |
| --smith-waterman JAVA | --smith-waterman JAVA |
| --emit-ref-confidence NONE | --emit-ref-confidence NONE |
| --force-call-filtered-alleles false | --force-call-filtered-alleles false |
| --soft-clip-low-quality-ends false | --soft-clip-low-quality-ends false |
| --allele-informative-reads-overlap-margin 2 | --allele-informative-reads-overlap-margin 2 |
| --smith-waterman-dangling-end-match-value 25 | --smith-waterman-dangling-end-match-value 25 |
| --smith-waterman-dangling-end-mismatch-penalty -50 | --smith-waterman-dangling-end-mismatch-penalty -50 |
| --smith-waterman-dangling-end-gap-open-penalty -110 | --smith-waterman-dangling-end-gap-open-penalty -110 |
| --smith-waterman-dangling-end-gap-extend-penalty -6 | --smith-waterman-dangling-end-gap-extend-penalty -6 |
| --smith-waterman-haplotype-to-reference-match-value 200 | --smith-waterman-haplotype-to-reference-match-value 200 |
| --smith-waterman-haplotype-to-reference-mismatch-penalty -150 | --smith-waterman-haplotype-to-reference-mismatch-penalty -150 |
| --smith-waterman-haplotype-to-reference-gap-open-penalty -260 | --smith-waterman-haplotype-to-reference-gap-open-penalty -260 |
| --smith-waterman-haplotype-to-reference-gap-extend-penalty -11 | --smith-waterman-haplotype-to-reference-gap-extend-penalty -11 |
| --smith-waterman-read-to-haplotype-match-value 10 | --smith-waterman-read-to-haplotype-match-value 10 |
| --smith-waterman-read-to-haplotype-mismatch-penalty -15 | --smith-waterman-read-to-haplotype-mismatch-penalty -15 |
| --smith-waterman-read-to-haplotype-gap-open-penalty -30 | --smith-waterman-read-to-haplotype-gap-open-penalty -30 |
| --smith-waterman-read-to-haplotype-gap-extend-penalty -5 | --smith-waterman-read-to-haplotype-gap-extend-penalty -5 |
| --min-assembly-region-size 50 | --min-assembly-region-size 50 |
| --max-assembly-region-size 300 | --max-assembly-region-size 300 |
| --active-probability-threshold 0.002 | --active-probability-threshold 0.002 |
| --max-prob-propagation-distance 50 | --max-prob-propagation-distance 50 |
| --force-active false | --force-active false |
| --assembly-region-padding 100 | --assembly-region-padding 100 |
| --padding-around-indels 75 | --padding-around-indels 75 |
| --padding-around-snps 20 | --padding-around-snps 20 |
| --padding-around-strs 75 | --padding-around-strs 75 |
| --max-extension-into-assembly-region-padding-legacy 25 | --max-extension-into-assembly-region-padding-legacy 25 |
| --max-reads-per-alignment-start 50 | --max-reads-per-alignment-start 50 |
| --enable-legacy-assembly-region-trimming false | --enable-legacy-assembly-region-trimming false |
| --interval-set-rule UNION | --interval-set-rule UNION |
| --interval-padding 0 | --interval-padding 0 |
| --interval-exclusion-padding 0 | --interval-exclusion-padding 0 |
| --interval-merging-rule ALL | --interval-merging-rule ALL |
| --read-validation-stringency SILENT | --read-validation-stringency SILENT |
| --seconds-between-progress-updates 10.0 | --seconds-between-progress-updates 10.0 |
| --disable-sequence-dictionary-validation false | --disable-sequence-dictionary-validation false |
| --create-output-bam-index true | --create-output-bam-index true |
| --create-output-bam-md5 false | --create-output-bam-md5 false |
| --create-output-variant-index true | --create-output-variant-index true |
| --create-output-variant-md5 false | --create-output-variant-md5 false |
| --max-variants-per-shard 0 | --max-variants-per-shard 0 |
| --lenient false | --lenient false |
| --add-output-sam-program-record true | --add-output-sam-program-record true |
| --add-output-vcf-command-line true | --add-output-vcf-command-line true |
| --cloud-prefetch-buffer 40 | --cloud-prefetch-buffer 40 |
| --cloud-index-prefetch-buffer -1 | --cloud-index-prefetch-buffer -1 |
| --disable-bam-index-caching false | --disable-bam-index-caching false |
| --sites-only-vcf-output false | --sites-only-vcf-output false |
| --help false | --help false |
| --version false | --version false |
| --showHidden false | --showHidden false |
| --verbosity INFO | --QUIET false |
| --QUIET false | --use-jdk-deflater false |
| --use-jdk-deflater false | --use-jdk-inflater false |
| --use-jdk-inflater false | --gcs-max-retries 20 |
| --gcs-max-retries 20 | --gcs-project-for-requester-pays |
| --gcs-project-for-requester-pays | --disable-tool-default-read-filters false |
| --disable-tool-default-read-filters false | --minimum-mapping-quality 20 |
| --minimum-mapping-quality 20 | --disable-tool-default-annotations false |
| --disable-tool-default-annotations false | --enable-all-annotations false |
| --enable-all-annotations false | --allow-old-rms-mapping-quality-annotation-data false |
| --allow-old-rms-mapping-quality-annotation-data false | |
****** filter depth: still the same difference...
$ grep '^NC' filter-depth.vcf | wc -l
82054
$ zgrep '^NC' /Work/Groups/bisonex/ref_63003856_S135/63003856_S135_DP_over_30.vcf.gz | wc -l
82033
Not related to depth: tested with
bcftools filter -i 'FORMAT/DP<=30' filter-depth.vcf
bcftools filter -i 'FORMAT/AD[0:1]<=10' filter-depth.vcf
****** Check that applying two separate filters gives the same result: yes
$ bcftools filter -e 'FORMAT/DP<=30' 63003856_S135.vcf.gz | bcftools filter -e 'FORMAT/AD[0:1]<=10' -o two-filters.vcf
$ grep '^NC' two-filters.vcf | wc -l
82054
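Line counts only show that 21 records differ, not which ones. One way to list them is to reduce each VCF to a sorted set of CHROM:POS:REF:ALT keys and take the set difference. This is a sketch using only standard tools; the function names are hypothetical and it assumes both files are normalised the same way, so that identical variants produce identical keys.

```shell
# variant_keys VCF
# Print the sorted, unique CHROM:POS:REF:ALT key of every non-header
# record of a VCF (plain or gzip-compressed, decided by the .gz suffix).
variant_keys() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac |
        awk -F '\t' '!/^#/ { print $1 ":" $2 ":" $4 ":" $5 }' |
        LC_ALL=C sort -u
}

# extra_variants TEST_VCF REF_VCF
# Print the keys present in TEST_VCF but absent from REF_VCF.
extra_variants() {
    test_keys=$(mktemp) || return 2
    ref_keys=$(mktemp) || return 2
    variant_keys "$1" > "$test_keys"
    variant_keys "$2" > "$ref_keys"
    LC_ALL=C comm -23 "$test_keys" "$ref_keys"
    rm -f "$test_keys" "$ref_keys"
}

# Example call, using the paths from the notes:
# extra_variants filter-depth.vcf /Work/Groups/bisonex/ref_63003856_S135/63003856_S135_DP_over_30.vcf.gz
```

Swapping the two arguments lists the records that exist only in the reference, which matters because the difference in counts may hide variants missing on both sides.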
***** Test bwa in sequential mode