there are some inconsistency in the samples comping from different techniques(e.g RRBS shows more meth coverage than mbd for Mcap where we did not see that before we aligned reads

Cirocos plot showed : - MBD matches WGBS better for Pact - RRBS matches WGBS better for Mcap - how could this be?

Verification steps:

  1. verify sample numbers match their techniques (excerpt from sample data table)
Meth1 Pico Methyl-SeqTM Library Prep Kit
Meth2 Pico Methyl-SeqTM Library Prep Kit
Meth3 Pico Methyl-SeqTM Library Prep Kit
Meth4 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth5 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth6 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth7 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
Meth8 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
Meth9 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
Meth10 Pico Methyl-SeqTM Library Prep Kit
Meth11 Pico Methyl-SeqTM Library Prep Kit
Meth12 Pico Methyl-SeqTM Library Prep Kit
Meth13 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth14 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth15 EZ DNA Methylation RRBSTM Library Prep Kit Catalog Nos. D5460S
Meth16 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
Meth17 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
Meth18 MBD enrichment followed by Pico Methyl-SeqTM Library Prep Kit
  1. Verify samples were trimmed according to their library prep method
  2. Verify trimmed files that were aligned were the same as the trimmed files I generated
(base) [strigg@mox1 20200410]$ awk 'NR==FNR{a[$1]=$1;next}$1 in a{print a[$1],$1}' \
> 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt \
> 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt 
047d811b05a0df7a93344c110ce5f155 047d811b05a0df7a93344c110ce5f155
04829778554df5986ae415fcda3b7e81 04829778554df5986ae415fcda3b7e81
04e14c4ec8cd7ef280b97d3ecca51b1a 04e14c4ec8cd7ef280b97d3ecca51b1a
06a030b4c6ae092d748a08a4c226e674 06a030b4c6ae092d748a08a4c226e674
0edb6d01e4f6561d3eb24c4b2ff62035 0edb6d01e4f6561d3eb24c4b2ff62035
1926fcfc1e4172e26b83cad82d87fb67 1926fcfc1e4172e26b83cad82d87fb67
1c1bb5648e064bd9f927e4f481a7d8e5 1c1bb5648e064bd9f927e4f481a7d8e5
27895574a941878cd26a1400b04ee3cd 27895574a941878cd26a1400b04ee3cd
2bafaff063eb4b8c960e345634291fb4 2bafaff063eb4b8c960e345634291fb4
2e87e45941d8a4c523e6080fc460d3f3 2e87e45941d8a4c523e6080fc460d3f3
36aba1f7007b9fb4c83a76810e4b1306 36aba1f7007b9fb4c83a76810e4b1306
49ece06a7e8a4a8a936fddcdbd9c3921 49ece06a7e8a4a8a936fddcdbd9c3921
4ed014c23ba4c28681d5b4af17e95346 4ed014c23ba4c28681d5b4af17e95346
645233a25517dcb1e620e794aaf1e4bc 645233a25517dcb1e620e794aaf1e4bc
6ea2c605b839ea9ca91ecb524e7184d7 6ea2c605b839ea9ca91ecb524e7184d7
7a40c8cfafabb02e78bb8efe9001bf53 7a40c8cfafabb02e78bb8efe9001bf53
7e94bfa7f8c6c0794e4958e5628fe45f 7e94bfa7f8c6c0794e4958e5628fe45f
7eed0ec3df5d8fd42984f517e5633ef6 7eed0ec3df5d8fd42984f517e5633ef6
84f59f4b67562be14bd0692c89ee3066 84f59f4b67562be14bd0692c89ee3066
8937fd8eb2ef852e103a79b48231a9ab 8937fd8eb2ef852e103a79b48231a9ab
8c5a88ab86e9673b4e1c82f1452b6ad1 8c5a88ab86e9673b4e1c82f1452b6ad1
8c5ac01b3aeee1b4a40e991885259007 8c5ac01b3aeee1b4a40e991885259007
900a3858aab3571a8329c2e5952b265e 900a3858aab3571a8329c2e5952b265e
91e1151899fa6de016d4bce233be0819 91e1151899fa6de016d4bce233be0819
a006f7dbf9baa7239272506b1ea560be a006f7dbf9baa7239272506b1ea560be
a4c8dd40dece0e44397abebd447d7ec1 a4c8dd40dece0e44397abebd447d7ec1
ab16c162efedbcb6dce8e678affa4ee1 ab16c162efedbcb6dce8e678affa4ee1
c4e449e7210609e56be0348d1cacaf85 c4e449e7210609e56be0348d1cacaf85
cdde1f3bc4e4020fb4025ea1dc7a0c2e cdde1f3bc4e4020fb4025ea1dc7a0c2e
d6e026bb59b10a11ad9b51b8acdd18a7 d6e026bb59b10a11ad9b51b8acdd18a7
e1048fea898bc32cb03ff801534183d9 e1048fea898bc32cb03ff801534183d9
e78a80a91c6d0e0928299d22a7aeba33 e78a80a91c6d0e0928299d22a7aeba33
e9548420a9512df45a08cf3b735cc95f e9548420a9512df45a08cf3b735cc95f
efc6cc59733842eabadd21536c2f33bb efc6cc59733842eabadd21536c2f33bb
f41790ce58777f20ee742cba75692065 f41790ce58777f20ee742cba75692065
f5f0339ac5f056d8dd3cf6747cfcbff6 f5f0339ac5f056d8dd3cf6747cfcbff6
(base) [strigg@mox1 20200410]$ awk 'NR==FNR{a[$1]=$1;next}$1 in a{print a[$1],$1}' 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt | wc - l
     36      72    2376 -
wc: l: No such file or directory
     36      72    2376 total
(base) [strigg@mox1 20200410]$ wc -l 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt
  36 20200311_ST_trim_RRBS_WGBS_MBD_md5sum.txt
  36 20200410_SR_copy_trim_RRBS_WGBS_MBD_md5sum.txt
  72 total



  1. Verify bismark output matches our hypothesis?