HowTo

Successful tiny examples.

In bash, at Moria

Generate the md5 signatures of the files in a folder:

for file in /Volumes/MBL21/A_TREASURY/012_A_TRASURY_ThiobiosGenomes/* ; do md5 -q "$file" >> resultsQ.out ; done
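The loop above stores only the bare checksums, so the link between a signature and its file depends on the listing order. A minimal sketch that keeps the file name next to each checksum could look like this. The helper names md5_of and md5_folder are hypothetical, and the fallback assumes that md5sum is present on Linux and md5 -q on macOS:

```shell
# Hypothetical helper: print the MD5 of one file.
# Uses md5sum where it exists (Linux), otherwise md5 -q (macOS).
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | awk '{print $1}'
    else
        md5 -q "$1"
    fi
}

# Hypothetical helper: print "<md5>  <file name>" for each regular file in a folder.
md5_folder() {
    for file in "$1"/*; do
        [ -f "$file" ] || continue      # skip sub-folders
        printf '%s  %s\n' "$(md5_of "$file")" "$(basename "$file")"
    done
}

# Example: md5_folder /path/to/folder > md5.out
```

Writing with > instead of >> also avoids piling new results onto an old output file when the command is run twice.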

Generate the sizes of the files in a folder:

for file in /Volumes/MBL21/A_TREASURY/012_A_TRASURY_ThiobiosGenomes/* ; do wc -c "$file" >> sizes.out ; done

Extract the first column:

awk '{print $1}' sizes.out > sizesQ.out
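The two steps above (wc, then awk) can probably be merged: when wc -c reads from standard input it prints only the byte count, so there is no file name to strip afterwards. A small sketch, where sizes_folder is a hypothetical helper name:

```shell
# Hypothetical helper: print one byte count per line for each regular file in a folder.
sizes_folder() {
    for file in "$1"/*; do
        [ -f "$file" ] || continue      # skip sub-folders
        wc -c < "$file" | tr -d ' '     # BSD wc pads the number with spaces
    done
}

# Example: sizes_folder /path/to/folder > sizesQ.out
```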

Get the lengths of the reads in a fastq file (one length per line):

awk 'NR%4==2 {print length($1)}' /Users/Moria/Desktop/g043.refined.1.fq > /Users/Moria/Desktop/g043.refined.1.EXTRACT.fq
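A list with one length per read can get long. If only the distribution is needed, the same awk test can count how often each length occurs. This is a sketch that assumes plain four-line fastq records (no wrapped sequences); read_length_table is a hypothetical helper name:

```shell
# Hypothetical helper: print "<length><TAB><number of reads>", sorted by length.
# Assumes every record is exactly four lines, so the sequence is line 2 of each block.
read_length_table() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len "\t" n[len] }' "$1" | sort -n
}

# Example: read_length_table reads.fq > read_lengths.tsv
```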

At CUBE

GC contents of the files in a folder:

for file in /proj/genomes/Thiobios/data/ThiobiosMAGs/* ; do gc "$file" >> gc.out ; done
awk '{print $4}' gc.out > gcQ.out
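The gc command above is specific to CUBE. Where it is not installed, the GC fraction of a FASTA file can be approximated with awk alone. This sketch is not that tool and its output format is different: it prints a single number per file, counts only A, C, G and T, and ignores N and other ambiguity codes. The name gc_fraction is hypothetical:

```shell
# Hypothetical helper: print the GC fraction of a FASTA file (all sequences pooled).
gc_fraction() {
    awk '!/^>/ { seq = toupper($0)
                 gc += gsub(/[GC]/, "", seq)    # gsub returns the number of replacements
                 at += gsub(/[AT]/, "", seq) }
         END   { if (gc + at > 0) printf "%.4f\n", gc / (gc + at) }' "$1"
}

# Example: for file in /path/to/MAGs/* ; do gc_fraction "$file" >> gcQ.out ; done
```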

Assess completeness, contamination and heterogeneity of the genomes in a folder:

checkm lineage_wf -t 8 /proj/genomes/Thiobios/results/2017_08_24_checkM/data /proj/genomes/Thiobios/results/2017_08_24_checkM/therest.checkm --tab_table --file therest.checkm.out

Check for tRNAs of the genomes in a folder:

cd /proj/genomes/Thiobios/data/ThiobiosMAGs
for file in * ; do tRNAscan-SE -B "$file" -o "/proj/genomes/Thiobios/results/2017_08_25_tRNAscan-SE/$file.tRNAscan-SE.out" ; done

With R (and RStudio)

Extract a part of a sequence, using ape (s2c and c2s come from seqinr):

s2c(c2s(as.matrix(g43Z2[1])[44214:47213]))

Build a Maximum Likelihood tree, using ape and phangorn:

ali.16SB<-as.phyDat(ssuAlignB[c(2:10,1),segSitB])
dist.16SB<-dist.ml(ali.16SB)
tree.16S.njB<-root(NJ(dist.16SB),10)
mod.16SB<-modelTest(ali.16SB,model="all",multicore=TRUE)
env.16SB<-attr(mod.16SB,"env")
fitStart.16SB<-eval(get(mod.16SB$Model[which.min(mod.16SB$BIC)],env.16SB),env.16SB) # mod.16SB$Model[which.min(mod.16SB$BIC)]="K80"
fitNJ.16SB<-pml(tree.16S.njB,ali.16SB)
fit.16SB<-optim.pml(fitNJ.16SB,rearrangement="stochastic",model="K80",optInv=FALSE,optGamma=FALSE)
bs.16SB<-bootstrap.pml(fit.16SB,bs=1000,optNni=TRUE,multicore=TRUE)
plotBS(fit.16SB$tree,bs.16SB,p=50,type="p",bs.adj=c(1.2,-.7)) 
add.scale.bar()