CDEECBDBCECBEECCEACDEEBBFGDEFGCCFFFFCFCCEFBFDCFCDAAEBEE:CECBABBEBEE;DBFCCCDBCDBCCBBC?@BEEDA NM:i:0 MD:Z:151 MC:Z:151M AS:i:151 XS:i:151 RG:Z:sample
Indeed, the read aligns to a deleted region!
******* DONE Fix the quality scores: no
CLOSED: [2023-05-24 Wed 22:19]
******** DONE Compare with the reference fastq: the quality scores differ!
CLOSED: [2023-05-24 Wed 22:17]
#+begin_src sh
cd /Work/Users/apraga/bisonex/work/6e/8548fc90263830bf677f36585f11dc
zgrep -A 3 "A00853:477:HMLWYDSX3:1:1413:4390:28573" 63003856_chr22_1.fq.gz
#+end_src
@A00853:477:HMLWYDSX3:1:1413:4390:28573
AGGGTTACCACCACCACCCTGACAGGAGATATTCTAGGAGTACTCAAGAGCATCAGGGGATGGCTGGTAGCCTAGAAGGAACCACAAGGCCCAATGTCTTGGTTAGTCAAACCAATGAATTAGCTAGCAGGGGCCTTCTGAACAAAAGCAT
+
ADEEB@?CBBCCBDCBDCCCFBD;EEBEBBABCEC:EEBEAADCFCDFBFECCFCFFFFCCGFEDGFBBEEDCAECCEEBCECBDBCEEDCCBCBFAECCFEACAEAEBCCDCBCBFB:;CAEDCAEDBEEEEDC?<ECFBCDBCCCEDAA
#+begin_src sh
zgrep -A 3 "A00853:477:HMLWYDSX3:1:1413:4390:28573" /Work/Projects/bisonex/centogene/fastq/2200467051_63003856/63003856_S135_R1_001.fastq.gz
#+end_src
#+RESULTS:
: @A00853:477:HMLWYDSX3:1:1413:4390:28573 1:N:0:ATTCCACACA+TAGGCGATTG
: AGGGTTACCACCACCACCCTGACAGGAGATATTCTAGGAGTACTCAAGAGCATCAGGGGATGGCTGGTAGCCTAGAAGGAACCACAAGGCCCAATGTCTTGGTTAGTCAAACCAATGAATTAGCTAGCAGGGGCCTTCTGAACAAAAGCAT
: +
: FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFF::FFFFFFFFFFFFFF
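The comparison above can be scripted. A minimal sketch, assuming uncompressed FASTQ on standard input with exactly four lines per record; ~fastq_qual~ is a hypothetical helper name, not part of any tool used in these notes.

```sh
# fastq_qual ID: read FASTQ on stdin and print the quality line of read ID.
# Assumption: 4-line records; anything after the first space in the header is ignored.
fastq_qual() {
    awk -v id="@$1" '
        NR % 4 == 1 { hit = ($1 == id) }      # header line: does the read name match?
        NR % 4 == 0 && hit { print; exit }    # quality line of the matching record
    '
}
```

Usage: run ~zcat 63003856_chr22_1.fq.gz | fastq_qual A00853:477:HMLWYDSX3:1:1413:4390:28573~ on each file and compare the two strings.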
******** DONE Compare quality scores after bwa mem vs applybqsr: different
CLOSED: [2023-05-24 Wed 22:18]
On the mesocentre (HPC cluster), in /Work/Users/apraga/bisonex/out/63003856_S135_R/preprocessing
$ samtools view mapped/63003856_S135_R.bam NC_000022.11 | rg "A00853:477:HMLWYDSX3:1:1413:4390:28573"
A00853:477:HMLWYDSX3:1:1413:4390:28573 163 NC_000022.11 42212845 0 151M = 42212883 189 CCCAGGGGCCCCAGTGGGGATTTTCTAATAGAGACCCAATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACT FFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:151 MC:Z:151M AS:i:151 XS:i:151 RG:Z:sample
A00853:477:HMLWYDSX3:1:1413:4390:28573 83 NC_000022.11 42212883 0 151M = 42212845 -189 ATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACTCCTAGAATATCTCCTGTCAGGGTGGTGGTGGTAACCCT FFFFFFFFFFFFFF::FFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:151 MC:Z:151M AS:i:151 XS:i:151 RG:Z:sample
$ samtools view applybqsr/63003856_S135_R.bam NC_000022.11 | rg "A00853:477:HMLWYDSX3:1:1413:4390:28573"
A00853:477:HMLWYDSX3:1:1413:4390:28573 163 NC_000022.11 42212845 0 151M = 42212883 189 CCCAGGGGCCCCAGTGGGGATTTTCTAATAGAGACCCAATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACT ACC+FBCDCBBBAEAEDEEBBCCCECACBAEBEBDCCBCBFDCCCCFACEBEBCEEDCCCCFDCAEDCACBCEBBCFEACCFBDCACDCBCEBDBBCFEEDCCCFAFEACECCCECAEEDCADCBEDC7BEBCCCFBAFDCECCFBEAACA MC:Z:151M MD:Z:151 PG:Z:MarkDuplicates RG:Z:sample NM:i:0 AS:i:151 XS:i:151
A00853:477:HMLWYDSX3:1:1413:4390:28573 83 NC_000022.11 42212883 0 151M = 42212845 -189 ATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACTCCTAGAATATCTCCTGTCAGGGTGGTGGTGGTAACCCT AADECCCBDCBFCE<?CDEEEEBDEACDEAC;:BFBCBCDCCBEAEACAEFCCEAFBCBCCDEECBDBCECBEECCEACDEEBBFGDEFGCCFFFFCFCCEFBFDCFCDAAEBEE:CECBABBEBEE;DBFCCCDBCDBCCBBC?@BEEDA MC:Z:151M MD:Z:151 PG:Z:MarkDuplicates RG:Z:sample NM:i:0 AS:i:151 XS:i:151
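When comparing these records with the FASTQ, note that FLAG 83 has bit 0x10 set (read on the reverse strand), so the BAM stores that read's sequence reverse-complemented and its quality string reversed. A minimal sketch of the orientation check; ~flag_has~ and ~original_qual~ are hypothetical helper names.

```sh
# flag_has FLAG BIT: succeed when BIT is set in the SAM FLAG.
flag_has() { [ $(( $1 & $2 )) -ne 0 ]; }

# original_qual FLAG QUAL: print QUAL in FASTQ orientation.
# Reads with bit 0x10 (16) are stored reversed in SAM/BAM, so reverse them back.
original_qual() {
    if flag_has "$1" 16; then
        printf '%s\n' "$2" |
            awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }'
    else
        printf '%s\n' "$2"
    fi
}
```

For the pair above, 83 = 1 + 2 + 16 + 64 (paired, proper pair, reverse strand, first in pair) and 163 = 1 + 2 + 32 + 128 (paired, proper pair, mate on reverse strand, second in pair).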
******** DONE Realign starting from the bwa mem output
CLOSED: [2023-05-24 Wed 22:32]
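#+begin_src sh
cd out/63003856_S135_R/preprocessing/mapped/
# Keep only properly paired reads (-f 0x2) on chromosome 22; -b forces BAM output
samtools view -b -f 0x2 -o 63003856_chr22.bam 63003856_S135_R.bam NC_000022.11
# Sort by read name so that both mates of a pair are adjacent
samtools sort -n 63003856_chr22.bam -o 63003856_chr22_sorted.bam
# Write R1/R2, discard singletons and unflagged reads; -n leaves read names without /1 and /2
samtools fastq -1 63003856_chr22_1.fq.gz -2 63003856_chr22_2.fq.gz -0 /dev/null -s /dev/null -n 63003856_chr22_sorted.bam
#+end_src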
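After such a BAM to FASTQ conversion, it is worth checking that both files list the same reads in the same order before realigning. A minimal sketch, assuming uncompressed 4-line FASTQ files small enough to hold their read names in memory; ~fastq_names~ and ~pairs_in_sync~ are hypothetical helper names.

```sh
# fastq_names FILE: print one read name per record, without "@" or a trailing /1 or /2.
fastq_names() {
    awk 'NR % 4 == 1 { sub(/^@/, "", $1); sub(/\/[12]$/, "", $1); print $1 }' "$1"
}

# pairs_in_sync R1 R2: succeed when both files contain the same read names in the same order.
pairs_in_sync() {
    names_r1=$(fastq_names "$1")
    names_r2=$(fastq_names "$2")
    [ "$names_r1" = "$names_r2" ]
}
```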
Check the quality scores:
#+begin_src sh
zgrep -A 3 "A00853:477:HMLWYDSX3:1:1413:4390:28573" 63003856_chr22_1.fq.gz
#+end_src
#+RESULTS:
: @A00853:477:HMLWYDSX3:1:1413:4390:28573
: AGGGTTACCACCACCACCCTGACAGGAGATATTCTAGGAGTACTCAAGAGCATCAGGGGATGGCTGGTAGCCTAGAAGGAACCACAAGGCCCAATGTCTTGGTTAGTCAAACCAATGAATTAGCTAGCAGGGGCCTTCTGAACAAAAGCAT
: +
: FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFFFFFFFFFFFF::FFFFFFFFFFFFFF
#+begin_src sh
NXF_OPTS=-D"user.name=apraga" nextflow run main.nf -c nextflow.config -profile standard,helios -resume --input="out/63003856_S135_R/preprocessing/mapped/63003856_chr22_{1,2}.fq.gz" --outdir=out/63003856_chr22-from-mapped
#+end_src
Then:
#+begin_src sh
cd /Work/Users/apraga/bisonex/out/63003856_chr22-from-mapped/63003856_chr22/preprocessing/mapped
samtools view 63003856_chr22.bam | rg "A00853:477:HMLWYDSX3:1:1413:4390:28573"
#+end_src
#+RESULTS:
: A00853:477:HMLWYDSX3:1:1413:4390:28573 163 NW_014040930.1 115017 0 151M = 115055 189 CCCAGGGGCCCCAGTGGGGATTTTCTAATAGAGACCCAATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACT FFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:151 MC:Z:151M AS:i:151 XS:i:151 RG:Z:sample
: A00853:477:HMLWYDSX3:1:1413:4390:28573 83 NW_014040930.1 115055 0 151M = 115017 -189 ATGCTTTTGTTCAGAAGGCCCCTGCTAGCTAATTCATTGGTTTGACTAACCAAGACATTGGGCCTTGTGGTTCCTTCTAGGCTACCAGCCATCCCCTGATGCTCTTGAGTACTCCTAGAATATCTCCTGTCAGGGTGGTGGTGGTAACCCT FFFFFFFFFFFFFF::FFFFFFFFFFFFFFF::FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFF NM:i:0 MD:Z:151 MC:Z:151M AS:i:151 XS:i:151 RG:Z:sample
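Both records have MAPQ 0 and AS equal to XS; with bwa mem, XS is the score of the best suboptimal alignment, so this suggests another placement scores just as well. A minimal sketch that lists such reads, assuming tab-separated SAM on standard input (as printed by ~samtools view~); ~ambiguous_hits~ is a hypothetical helper name.

```sh
# ambiguous_hits: read SAM on stdin, print name, reference and MAPQ
# for every record whose AS tag equals its XS tag.
ambiguous_hits() {
    awk -F '\t' '
        /^@/ { next }                                  # skip header lines
        {
            as = ""; xs = ""
            for (i = 12; i <= NF; i++) {               # optional tags start at column 12
                if ($i ~ /^AS:i:/) as = substr($i, 6)
                if ($i ~ /^XS:i:/) xs = substr($i, 6)
            }
            if (as != "" && as == xs) print $1 "\t" $3 "\t" $5
        }'
}
```

Usage: ~samtools view 63003856_chr22.bam | ambiguous_hits~.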
******* DONE Align against a reference genome restricted to chromosome 22
CLOSED: [2023-05-24 Wed 23:18]
******** KILL Test with unmodified data
CLOSED: [2023-05-24 Wed 23:18]
/Work/Users/apraga/bisonex/tests/bamscissors
#+begin_src sh
cd /Work/Groups/bisonex/data/genome/GRCh38.p13/
mkdir chr22/
samtools faidx genomeRef.fna NC_000022.11 > chr22/chr22.fna
cd chr22
samtools faidx chr22.fna
bwa index chr22.fna
#+end_src
#+begin_src bash
cd /Work/Users/apraga/bisonex/tests/bamscissors
ln -s ../../out/63003856_S135_R/preprocessing/applybqsr/63003856_chr22_{1,2}.fq.gz .
# Interactive allocation: 24 cores on the smp partition for 1 hour
srun -c 24 -p smp -t 1:00:00 --pty bash
# Paired-end alignment: pass R1 then R2
bwa mem -t 24 -o smallref.sam /Work/Projects/bisonex/data/genome/GRCh38.p13/chr22/chr22.fna 63003856_chr22_1.fq.gz 63003856_chr22_2.fq.gz
#+end_src
******** DONE Test with modified data: ok
CLOSED: [2023-05-24 Wed 23:18]
The data are in data/init
#+begin_src sh
time julia insertVariant.jl
rsync -avz data/init/*.fq.gz meso:/Work/Users/apraga/bisonex/tests/bamscissors/
#+end_src
#+begin_src sh
srun -c 24 -p smp -t 1:00:00 --pty bash
# Align R1 and R2, then coordinate-sort the alignments into a BAM
bwa mem -t 24 /Work/Projects/bisonex/data/genome/GRCh38.p13/chr22/chr22.fna 63003856_chr22_1.fq.gz 63003856_chr22_2.fq.gz | samtools sort -@24 -o smallref.bam -
#+end_src
#+begin_src sh
rsync -avz meso:/Work/Users/apraga/bisonex/tests/bamscissors/smallref.bam mapped/
#+end_src
****** DONE Test HaplotypeCaller with 1 variant
CLOSED: [2023-05-29 Mon 15:38]
****** DONE Test HaplotypeCaller with all variants
****** DONE Understand why the distribution does not follow the normal distribution
CLOSED: [2023-06-01 Thu 21:44]
Some heterozygous variants end up at 0.01 or 1...
******* DONE Increase the number of samples: same result
CLOSED: [2023-05-31 Wed 22:24]
******* DONE Check the number of reads marked vs edited
CLOSED: [2023-06-01 Thu 21:44]
******* DONE Check that 100 calls to rand(d, 1)[1] behave like one call to rand(d, 100)
CLOSED: [2023-05-31 Wed 22:24]
julia> y = [rand(d, 1)[1] for x in 1:1000];
julia> z = rand(d,1000);
julia> df = vcat(DataFrame(:y => z, :type => "z"), DataFrame(:y => y, :type => "y"));
julia> draw(data(df)*histogram(bins=100)*mapping(:y, color=:type,dodge=:type))
****** TODO Redo the test with the new version
******* DONE Locally
CLOSED: [2023-06-02 Fri 23:40]
On the mesocentre:
#+begin_src sh
cd /Work/Users/apraga/bisonex/out/63003856_S135_R/preprocessing/mapped
# Extract chromosome 22; -b forces BAM output
samtools view -b -o 63003856_S135_R_chr22.bam 63003856_S135_R.bam NC_000022.11
#+end_src
Generate the file locally:
#+begin_src sh :dir /home/alex/roam/recherche/bisonex/code/xamscissors
~/.local/julia-1.9.0/bin/julia --project=.. snvs.jl
rsync -avz meso:/Work/Users/apraga/bisonex/out/63003856_S135_R/preprocessing/mapped/63003856_S135_R_chr22.bam .
samtools index 63003856_S135_R_chr22.bam
#+end_src
#+RESULTS:
| receiving | incremental | file | list | | | | |
| | | | | | | | |
| sent | 20 | bytes | received | 79 | bytes | 66.0 | bytes/sec |
| total | size | is | 143,162,969 | speedup | is | 1,446,090.60 | |
Generate the data:
#+begin_src julia
using XAMScissors
insertSNV("./63003856_S135_R_chr22.bam", "snvs_chr22.csv", "out")
#+end_src
Then:
#+begin_src sh
~/.local/julia-1.9.0/bin/julia --project=.. xamscissors.jl
#+end_src
******** DONE Improve performance
CLOSED: [2023-06-02 Fri 23:39]
#+begin_src julia
@time include("xamscissors.jl")
#+end_src
430 s for chromosome 22. Most of the time is spent editing reads:
********* DONE Insert all the variants of a read at once
CLOSED: [2023-06-01 Thu 23:09]
No change.
********* DONE Test with -t4: same result
CLOSED: [2023-06-01 Thu 23:17]
********* DONE Test on the mesocentre: same result
CLOSED: [2023-06-01 Thu 23:40]
348s
********* Change the data structure
DataFrame -> Dict: the terrible performance is gone.
******* TODO On the mesocentre
#+begin_src sh
# Copy the script and the SNV tables to the remote host
rsync -avz xamscissors.jl snv*.csv meso:/Work/Users/apraga/bisonex/tests/xamscissors/
ssh meso
cd /Work/Users/apraga/bisonex/tests/xamscissors
cp /Work/Users/apraga/bisonex/out/63003856_S135_R/preprocessing/mapped/*chr22.bam .
samtools index 63003856_S135_R_chr22.bam
#+end_src
***** TODO Phase 3: all SNVs
**** TODO Test Indel
*** Miscellaneous
**** DONE Check the number of reads: fastq vs bam
CLOSED: [2022-10-09 Sun 22:31]
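One way to run this check, as a minimal sketch: count FASTQ records (lines divided by 4) and compare with the number of primary alignments in the BAM. It assumes uncompressed 4-line FASTQ on standard input; ~fastq_read_count~ is a hypothetical helper name and ~sample.bam~ a placeholder.

```sh
# fastq_read_count: read FASTQ on stdin and print the number of records.
# Assumption: every record spans exactly 4 lines.
fastq_read_count() {
    awk 'END { print int(NR / 4) }'
}

# BAM side, for comparison: count primary alignments only, i.e. exclude
# secondary (0x100) and supplementary (0x800) records:
#   samtools view -c -F 0x900 sample.bam
```

For paired-end data, the BAM count should be compared with the sum of the R1 and R2 counts, e.g. ~zcat R1.fastq.gz | fastq_read_count~ for each file.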