Aug. 21 Conference call
paper repo:
- Decided on Hollie’s repo: Geoduck_Meth
Current v074 .gff:
- 8400 gene list that correspond to 5000 proteins
- TE gff on genomic resources on github
DMG analysis:
- doesn’t require same loci; just % methylation across DMGs
- finds common gene loci that tend to be more methylated
- Comparisons: see supp tables in manuscript
DML analysis:
- Steven is working on DMLs to compare loci across bivalves
Strandedness:
- After running Mac’s code, Hollie’s output goes from 8M in bam to 6M in merge.cov
- code deletes bottom strand coverage of CG
- We decided to destrand in all our analyses
Coverage:
- We decided to use 5x and 10x, and will see which is best after analysis
Global analysis of genome methylation
Describe where is methylation in genome:
- Steven will run bismark on concatenated data
- while this was running I generated a code to concatenate .cov.gz files from Bismark coverage analysis
- practice code here: https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/Concatenate_Covgz_files.ipynb
- final code here: https://gannet.fish.washington.edu/metacarcinus/mox_jobs/20190821_concatCovgz.sh
- output here: https://gannet.fish.washington.edu/metacarcinus/Pgenerosa/analyses/20190821/uber_methylome.cov.gz
- while this was running I generated a code to concatenate .cov.gz files from Bismark coverage analysis
- With one coverage file:
- bedtools intersect on genome features (.gff) and .cov.gz file
- for each feature, sum CpG?
- sashimi plots?
DMR calling
Methylpy: Destrand data
- I didn’t find an option to destrand the data so I created a code to destrand the allc files and then run DMRfind on the destranded allc files: https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/Get_DMRs_for_SR_v074_alignments/methylpy_DMRs/destrand_allc_files/Destrand_allc_files.ipynb
- this was before Yupeng sent me his code for destranding
- I did find 10 more DMRs after destranding using the same DMRfind settings: 56 DMR vs. 46 DMRs (see table below)
Methylkit: adjust stringency
-
I had previously set the methylkit unite function to select common sites in samples if they were in >= 2 samples (1L)
-
I tried 3L to see if that would give less false positive DMRs
-
I also tried the default min.group parameter (requires all samples to show coverage of loci)
DMR caller (script linked) | min cov/loci | destranded? | DMR length (bp) | addn’l params | # DMRs (output files linked) |
---|---|---|---|---|---|
Methlkit calculatediffmeth | 3x | N | 1000 | unite min.per.group = 1L | 3248 |
Methlkit calculatediffmeth | 3x | Y | 1000 | unite min.per.group = 3L | 529 |
Methlkit calculatediffmeth | 3x | Y | 1000 | unite min.per.group = NULL | 529 |
Methylpy DMRfind | 3x | N | 1000 | 3 DMLs/DMR | 4 |
Methylpy DMRfind | 3x | Y | 250 | 3 DMLs/DMR | 0 |
Methylpy DMRfind | 3x | N | 1000 | 3 DMLs/DMR; DML = 25bp chunk | 46 |
Methylpy DMRfind | 3x | Y | 250 | 3 DMLs/DMR; DML = 25bp chunk | 59 |
Methylpy DMRfind | 3x | Y | 1000 | 3 DMLs/DMR; DML = 25bp chunk | 56 |
Methylpy DMRfind | 5x | Y | 1000 | 3 DMLs/DMR; DML = 25bp chunk | 26 |
Methylpy DMRfind | 5x | Y | 250 | 3 DMLs/DMR; DML = 25bp chunk | 26 |
*Methylkit R project here: https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/Get_DMRs_for_SR_v074_alignments/Get_DMRs_for_SR_v074_alignments.Rproj
IGV Sessions
-
session comparing DMRs from different Methylkit parameters
- session visualizing methylpy DMRs NOT-destranded, 3x coverage of 1000bp windows containing minimum of 3 DMLs (DML = 25bp chunk): https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/20190819_Pgnrv074_DMRs_in_IGV/20190819_Pgnrv074_DMRs_in_IGV.xml
- jupyter notebook for preparing files and loading them into IGV: https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/20190819_Pgnrv074_DMRs_in_IGV/20190819_Pgnrv074_DMRs_in_IGV.ipynb
- session comparing methylpy DMRs destranded versus stranded: https://github.com/shellywanamaker/Shelly_Pgenerosa/blob/master/analyses/JuviPgen_ALSL2lowd145/Get_DMRs_for_SR_v074_alignments/methylpy_DMRs/destrand_allc_files/destranded_v_stranded.xml
Some DMRs don’t get called after destranding, even though the DMSs are still there. Why is this? Example = scaffold 6 42570880-42570905