其中bowtie2比对加入了-X 2000 参数,是最大插入片段,宽泛的插入片段范围(10-1000bp)
cat >bowtie2.sh
index=~/my_data/ref_data/mouse/bowtie2_index/mm10
ls *.fq.gz | while read id;
do
file=$(basename $id)
sample=${file%%_*}
bowtie2 -p 5 --very-sensitive -X 2000 -x $index -1 ${sample}_1_val_1.fq.gz -2 ${sample}_2_val_2.fq.gz | samtools sort -O bam -@ 5 -o - > ../bowtie2/${sample}.bam
samtools index ../bowtie2/${sample}.bam
# 比对结束后,需要了解比对结果的情况
samtools flagstat ../bowtie2/${sample}.bam > ../bowtie2/${sample}.stat
#记录mapping的结果,其中第一列是染色体名称,第二列是序列长度,第三列是mapped reads数,第四列是unmapped reads数
samtools idxstats ../bowtie2/${sample}.bam > ../bowtie2/${sample}_idxstats.txt
done
nohup sh bowtie2.sh &
ls *.txt | while read id
do
sample=${id%.*}
mtReads=$(cat ${sample}.txt | grep 'chrM' | cut -f 3)
totalReads=$(cat ${sample}.txt | awk '{SUM += $3} END {print SUM}')
echo '==> mtDNA Content:' $(bc <<< "scale=2;100*$mtReads/$totalReads")'%'
done
==> mtDNA Content: 70.01%
==> mtDNA Content: 68.88%
==> mtDNA Content: 62.97%
==> mtDNA Content: 71.26%
mkdir ../no_chrM_bam
ls *.bam | while read id
do
sample=${id%.*}
samtools view -h -f 2 -q 30 ${sample}.bam | grep -v chrM |samtools sort -O bam -@ 5 -o - > ../no_chrM_bam/${sample}.bam
samtools index ../no_chrM_bam/${sample}.bam
samtools flagstat ../no_chrM_bam/${sample}.bam > ../no_chrM_bam/${sample}.stat
done
如下面示意图所示,Tn5 建库时会插入 9bp 的序列,所以比对到正链(+)的 reads 移动 +4 bp, 比对负链(-)的 reads 移动 -5 bp.
image.png
这个移动可以用 deeptools 的 alignmentSieve 命令完成。
alignmentSieve --numberOfProcessors 8 --ATACshift -b \
${bam_dir}/${sample}_nSorted.bam -o ${bam_dir}/${sample}_Shift.bam
一般的分析这点 reads 位移没影响,不需要进行这个步骤。
留言与评论(共有 0 条评论) “” |