there are some inconsistency in the samples comping from different techniques(e.g RRBS shows more meth coverage than mbd for Mcap where we did not see that before we aligned reads
Cirocos plot showed : - MBD matches WGBS better for Pact - RRBS matches WGBS better for Mcap - how could this be?
Verification steps:
- verify sample numbers match their techniques (excerpt from sample data table)
Meth1 | Pico Methyl-SeqTM Library Prep Kit |
---|---|
Meth2 | Pico Methyl-SeqTM Library Prep Kit |
Meth3 | Pico Methyl-SeqTM Library Prep Kit |
Meth4 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth5 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth6 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth7 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
Meth8 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
Meth9 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
Meth10 | Pico Methyl-SeqTM Library Prep Kit |
Meth11 | Pico Methyl-SeqTM Library Prep Kit |
Meth12 | Pico Methyl-SeqTM Library Prep Kit |
Meth13 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth14 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth15 | EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S |
Meth16 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
Meth17 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
Meth18 | MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit |
- Verify samples were trimmed according to their library prep method
- per (Meth_Compare_Pipeline.md samples match
- Verify trimmed files that were aligned were the same as the trimmed files I generated
(base) [strigg@mox1 20200410]$ awk 'NR==FNR{a[$1]=$1;next}$1 in a{print a[$1],$1}' \
> 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt \
> 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt
047d811b05a0df7a93344c110ce5f155 047d811b05a0df7a93344c110ce5f155
04829778554df5986ae415fcda3b7e81 04829778554df5986ae415fcda3b7e81
04e14c4ec8cd7ef280b97d3ecca51b1a 04e14c4ec8cd7ef280b97d3ecca51b1a
06a030b4c6ae092d748a08a4c226e674 06a030b4c6ae092d748a08a4c226e674
0edb6d01e4f6561d3eb24c4b2ff62035 0edb6d01e4f6561d3eb24c4b2ff62035
1926fcfc1e4172e26b83cad82d87fb67 1926fcfc1e4172e26b83cad82d87fb67
1c1bb5648e064bd9f927e4f481a7d8e5 1c1bb5648e064bd9f927e4f481a7d8e5
27895574a941878cd26a1400b04ee3cd 27895574a941878cd26a1400b04ee3cd
2bafaff063eb4b8c960e345634291fb4 2bafaff063eb4b8c960e345634291fb4
2e87e45941d8a4c523e6080fc460d3f3 2e87e45941d8a4c523e6080fc460d3f3
36aba1f7007b9fb4c83a76810e4b1306 36aba1f7007b9fb4c83a76810e4b1306
49ece06a7e8a4a8a936fddcdbd9c3921 49ece06a7e8a4a8a936fddcdbd9c3921
4ed014c23ba4c28681d5b4af17e95346 4ed014c23ba4c28681d5b4af17e95346
645233a25517dcb1e620e794aaf1e4bc 645233a25517dcb1e620e794aaf1e4bc
6ea2c605b839ea9ca91ecb524e7184d7 6ea2c605b839ea9ca91ecb524e7184d7
7a40c8cfafabb02e78bb8efe9001bf53 7a40c8cfafabb02e78bb8efe9001bf53
7e94bfa7f8c6c0794e4958e5628fe45f 7e94bfa7f8c6c0794e4958e5628fe45f
7eed0ec3df5d8fd42984f517e5633ef6 7eed0ec3df5d8fd42984f517e5633ef6
84f59f4b67562be14bd0692c89ee3066 84f59f4b67562be14bd0692c89ee3066
8937fd8eb2ef852e103a79b48231a9ab 8937fd8eb2ef852e103a79b48231a9ab
8c5a88ab86e9673b4e1c82f1452b6ad1 8c5a88ab86e9673b4e1c82f1452b6ad1
8c5ac01b3aeee1b4a40e991885259007 8c5ac01b3aeee1b4a40e991885259007
900a3858aab3571a8329c2e5952b265e 900a3858aab3571a8329c2e5952b265e
91e1151899fa6de016d4bce233be0819 91e1151899fa6de016d4bce233be0819
a006f7dbf9baa7239272506b1ea560be a006f7dbf9baa7239272506b1ea560be
a4c8dd40dece0e44397abebd447d7ec1 a4c8dd40dece0e44397abebd447d7ec1
ab16c162efedbcb6dce8e678affa4ee1 ab16c162efedbcb6dce8e678affa4ee1
c4e449e7210609e56be0348d1cacaf85 c4e449e7210609e56be0348d1cacaf85
cdde1f3bc4e4020fb4025ea1dc7a0c2e cdde1f3bc4e4020fb4025ea1dc7a0c2e
d6e026bb59b10a11ad9b51b8acdd18a7 d6e026bb59b10a11ad9b51b8acdd18a7
e1048fea898bc32cb03ff801534183d9 e1048fea898bc32cb03ff801534183d9
e78a80a91c6d0e0928299d22a7aeba33 e78a80a91c6d0e0928299d22a7aeba33
e9548420a9512df45a08cf3b735cc95f e9548420a9512df45a08cf3b735cc95f
efc6cc59733842eabadd21536c2f33bb efc6cc59733842eabadd21536c2f33bb
f41790ce58777f20ee742cba75692065 f41790ce58777f20ee742cba75692065
f5f0339ac5f056d8dd3cf6747cfcbff6 f5f0339ac5f056d8dd3cf6747cfcbff6
(base) [strigg@mox1 20200410]$ awk 'NR==FNR{a[$1]=$1;next}$1 in a{print a[$1],$1}' 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt | wc - l
36 72 2376 -
wc: l: No such file or directory
36 72 2376 total
(base) [strigg@mox1 20200410]$ wc -l 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt
36 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt
36 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt
72 total
- Verify bismark output matches our hypothesis?