Sample details

  • 2 lice samples from 20190523
    • 2 adult females
  • Samples from 20191118 (in fridge in 209)
    • 2 pools of adult sea lice from different populations
    • from two different salmon farms
    • different salinities? temperatures?
  • *** need more info about both sample groups; waiting to hear back from Cristian’s lab

Methylation data

aligned trimmed reads to genome

Descriptive statistic|Sealice F1 S20|Sealice F2 S22
:-----:|:-----:|:-----:
total reads before trim|226181237|163876559
% reads removed by trim|0.56|0.43
total reads after trim|224907116|163179589
uniq aligned reads|62174415|34864893
% uniq aligned|27.64|21.37
ambig reads|49967995|31414190
% ambig aligned|22.22|19.25
% no align|50.14|59.38
dedup reads|39280061|26425601
% dedup reads|63.18|75.79
dup reads|22894354|8439292
% dup reads|36.82|24.21
% CpG meth|1.1|1.0
% CHG meth|1.0|0.8
% CHH meth|1.4|1.5
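
The percentage rows in the table above can be recomputed from the raw counts. The sketch below is only a sanity check, not the pipeline that produced the numbers. The function names are made up for illustration, and the denominators (reads after trimming for the alignment percentages, uniquely aligned reads for the duplication percentages) are assumptions inferred from the table values themselves.

```python
def pct(part, whole):
    """Percentage of `whole` made up by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


def alignment_summary(before_trim, after_trim, uniq, ambig, dedup):
    """Recompute the percentage rows from raw read counts.

    Assumed denominators (inferred from the table, not from the aligner docs):
      - trimming loss is relative to reads before trimming
      - alignment percentages are relative to reads after trimming
      - dedup/dup percentages are relative to uniquely aligned reads
    """
    no_align = after_trim - uniq - ambig  # reads that did not align at all
    dup = uniq - dedup                    # reads removed as duplicates
    return {
        "perc_trim_removed": pct(before_trim - after_trim, before_trim),
        "perc_uniq_aligned": pct(uniq, after_trim),
        "perc_ambig_aligned": pct(ambig, after_trim),
        "perc_no_align": pct(no_align, after_trim),
        "dup_reads": dup,
        "perc_dedup": pct(dedup, uniq),
        "perc_dup": pct(dup, uniq),
    }


# Raw counts copied from the table above.
f1 = alignment_summary(226181237, 224907116, 62174415, 49967995, 39280061)
f2 = alignment_summary(163876559, 163179589, 34864893, 31414190, 26425601)

for name, stats in (("F1", f1), ("F2", f2)):
    print(name, stats)
```

Under these assumptions the recomputed values agree with the table to two decimals, which suggests the percentage rows are internally consistent with the counts.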

prepared merged CpG 5x cov files

categorized CpGs with 5x cov:

5x CpG summary tables:

Sample|Methylated CpG (>= 50%)|Sparsely methylated CpG (10% - 50%)|Unmethylated CpG (< 10%)
:-----:|:-----:|:-----:|:-----:
F1|2335|391515|8890795
F2|1864|232274|6108325

CpG category|F1:F2 CpG overlap|Uniq F1 CpGs|Uniq F2 CpGs|% F1 CpGs overlapping|% F2 CpGs overlapping
:-----:|:-----:|:-----:|:-----:|:-----:|:-----:
Methylated CpG (>= 50%)|545|1790|1319|23.34|29.24
Sparsely methylated CpG (10% - 50%)|19838|371677|212436|5.07|8.54
Unmethylated CpG (< 10%)|5431008|3459787|677317|61.09|88.91

5x merged CpG summary tables:

Sample|Methylated CpG (>= 50%)|Sparsely methylated CpG (10% - 50%)|Unmethylated CpG (< 10%)
:-----:|:-----:|:-----:|:-----:
F1|1342|233941|6831337
F2|1102|165423|5130433

CpG category|F1:F2 CpG overlap|Uniq F1 CpGs|Uniq F2 CpGs|% F1 CpGs overlapping|% F2 CpGs overlapping
:-----:|:-----:|:-----:|:-----:|:-----:|:-----:
Methylated CpG (>= 50%)|314|1028|788|23.40|28.49
Sparsely methylated CpG (10% - 50%)|13570|220371|151853|5.80|8.20
Unmethylated CpG (< 10%)|4824175|2007162|306258|70.62|94.03
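
The overlap columns in the summary tables follow from simple set arithmetic on CpG loci. The sketch below is a minimal illustration, assuming each sample's CpGs in one category are held as a set of (scaffold, position) pairs. The function name and the toy loci are hypothetical and are not taken from the real data files.

```python
def overlap_summary(f1_loci, f2_loci):
    """Summarize how two samples' CpG loci overlap within one category.

    Each argument is a set of (scaffold, position) tuples.
    """
    shared = f1_loci & f2_loci
    return {
        "overlap": len(shared),
        "uniq_f1": len(f1_loci - f2_loci),
        "uniq_f2": len(f2_loci - f1_loci),
        # Percent of each sample's loci that are also found in the other sample.
        "pct_f1_overlapping": round(100 * len(shared) / len(f1_loci), 2) if f1_loci else 0.0,
        "pct_f2_overlapping": round(100 * len(shared) / len(f2_loci), 2) if f2_loci else 0.0,
    }


# Hypothetical toy loci, only to show the shape of the input.
f1_toy = {("scaf1", 100), ("scaf1", 250), ("scaf2", 40)}
f2_toy = {("scaf1", 100), ("scaf2", 40), ("scaf3", 7), ("scaf3", 9)}

summary = overlap_summary(f1_toy, f2_toy)
print(summary)
```

As a check against the merged table: 314 shared methylated CpGs out of 1342 in F1 is 23.40%, and 314 out of 1102 in F2 is 28.49%.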

IGV session

Example of highly methylated CpG overlapping between both samples

zoomed in view

Genomic Feature analysis

Feature|Num. of features
:-----:|:-----:
CDS|30022
exon|30022
gene|23686
mRNA|23686

  • Checked for features overlapping with CpGs methylated >=50%

Sample|mCpG (>= 50%)|mCpG overlapping with genes/mRNA|mCpG overlapping with exon/CDS
:-----:|:-----:|:-----:|:-----:
F1|2335|114|106
F2|1864|103|95
F1.merged|1342|60|58
F2.merged|1102|45|42
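
The post does not say which tool produced the feature overlap counts, so the following is only a sketch of the idea: count how many CpG positions fall inside at least one annotated feature. It assumes 1-based inclusive coordinates (as in GFF files), and the function names and toy coordinates are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_intervals(features):
    """Group (scaffold, start, end) features by scaffold and merge overlaps."""
    by_scaffold = defaultdict(list)
    for scaffold, start, end in features:
        by_scaffold[scaffold].append((start, end))
    merged = {}
    for scaffold, intervals in by_scaffold.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlaps the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[scaffold] = out
    return merged


def count_overlapping(cpgs, features):
    """Count CpGs (scaffold, position) that lie inside at least one feature."""
    index = {
        scaffold: ([s for s, _ in ivs], [e for _, e in ivs])
        for scaffold, ivs in merge_intervals(features).items()
    }
    hits = 0
    for scaffold, pos in cpgs:
        if scaffold not in index:
            continue
        starts, ends = index[scaffold]
        # Last merged interval that starts at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= ends[i]:
            hits += 1
    return hits


# Hypothetical toy data, not real annotations.
genes = [("scaf1", 100, 200), ("scaf1", 150, 300), ("scaf2", 10, 20)]
mcpgs = [("scaf1", 120), ("scaf1", 300), ("scaf1", 301), ("scaf2", 5), ("scaf3", 15)]

n_in_genes = count_overlapping(mcpgs, genes)
print(n_in_genes, "of", len(mcpgs), "toy mCpGs fall in genes")
```

Merging overlapping features first means each CpG is counted once even when several annotations cover it, which matches how the table reports one count per sample and feature class.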

Moving forward

  • Overall, most mCpGs are not located in genic regions (only about 4-6% of mCpGs with >= 50% methylation overlap genes in the table above), so it’s hard to say how to target this sparse methylation other than with MBD
  • Would be great to get repeat regions and other features (UTRs, etc.)
  • Options for 1 sequencing run:
    • resequence 2 individuals to attempt to achieve 100% genome coverage
      • this would give at least 500M more reads per individual
      • cost: $4,940 (1 NovaSeq run)
    • WGBS 4 individuals aiming for 400M reads each to attempt to achieve 5x coverage of >95% of genome
      • cost: ~$5150 = $4,940 (1 NovaSeq run) + library prep (~ $50/sample) + time
      • logic: 226M reads gave 65% genome coverage @5x read depth, 168M gave 45% genome coverage @5x read depth
        • assuming a linear relationship between total reads and % genome coverage: slope = (226 - 168) / (65 - 45) = 2.9M reads per percentage point and intercept = 226 - (2.9 * 65) = 37.5M, so 100% coverage needs (2.9 * 100) + 37.5 = 327.5M reads (~330M); see chart below
    • WGBS 2-3 individuals aiming for 500M reads each to attempt to achieve 10x coverage of >95% of genome
      • cost: ~$5100 = $4,940 (1 NovaSeq run) + library prep (~ $50/sample) + time
      • logic: 226M reads gave 46% genome coverage @10x read depth, 168M gave 31% genome coverage @10x read depth
        • assuming a linear relationship between total reads and % genome coverage, 100% coverage needs (3.77 * 100) + 51.8 = ~430M reads; see chart below
    • MBD-BS on 10 individuals from each population: