Introduction
Uses PhiSpy to predict prophages in genome sequences.
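As a rough sketch, a run could be launched from Python as shown here. It assumes PhiSpy is installed and provides a PhiSpy.py executable on PATH that takes the GenBank file as a positional argument and -o for the output directory. The helper names and the file and directory names are hypothetical placeholders.

```python
import subprocess


def build_phispy_command(gbk_path, out_dir):
    """Return the argument list for a basic PhiSpy run (assumed CLI layout)."""
    # Assumption: the executable is named PhiSpy.py and accepts -o <output_dir>.
    return ["PhiSpy.py", gbk_path, "-o", out_dir]


def run_phispy(gbk_path, out_dir):
    """Run PhiSpy; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_phispy_command(gbk_path, out_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(build_phispy_command("genome.gbk", "phispy_out")))
```

Building the command as a list (rather than a shell string) avoids quoting problems when paths contain spaces.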
Input
A GenBank (.gbk) file.
Example:
LOCUS ntrd02_1 4133097 bp DNA circular BCT 06-MAR-2006
DEFINITION Roseobacter denitrificans. 4133097 bp, complete sequence.
ACCESSION NC_008209
VERSION NC_008209.0
KEYWORDS HTG.
SOURCE Roseobacter denitrificans
ORGANISM Roseobacter denitrificans
Bacteria; Proteobacteria; Alphaproteobacteria; Rhodobacterales;
Rhodobacteraceae; Roseobacter.
REFERENCE 1 (bases 1 to 4133097)
AUTHORS Swingley,W.D., Gholba,S., Mastrian,S.D., Matthies,H.J., Hao,J.,
Ramos,H., Acharya,C.R., Conrad,A.L., Taylor,H.L., Dejesa,L.C.,
Shah,M.K., O'Huallachain,M.E., Lince,M.T., Blankenship,R.E.,
Beatty,J.T. and Touchman,J.W.
TITLE A ubiquitous marine phototroph with a novel carbon-fixation pathway
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 4133097)
AUTHORS Touchman,J.W.
TITLE Direct Submission
JOURNAL Submitted (01-MAR-2006) Pathogen Genomics Division, Translational
Genomics Research Institute, 445 N. Fifth Street, Phoenix, Arizona
85004, USA
FEATURES Location/Qualifiers
source 1..4133097
/organism="Roseobacter denitrificans"
/mol_type="genomic DNA"
/strain="OCh 114"
/db_xref="taxon:2434"
/chromosome="Chromosome"
gene 1723..2871
/locus_tag="RD0003"
CDS 1723..2871
/locus_tag="RD0004"
/EC_number="2.1.2.10"
/note="no close characterized matches identified by match
to protein family HMM PF01571"
/codon_start=1
/transl_table=11
/product="aminomethyltransferase, putative"
/translation="MAIIYRTSALAQRHAEIGGELEDWNGMGTAWFYDHSDERAKADY
EAVRTKAGLMDVSGLKKIHLSGPHAAAVIDRATTRNVDKLMPGRAVYAAMLDDRGLFI
DDCVIYRLSVNNWLLVHGTGTGHESLAMAAYGKNVSMIFDDDLHDMSLQGPVAVDFLA
KHVPGIRDLAYFGIIQTKLFGMPVMISRTGYTGERGYEIFCEGRHAIALWDAILEDGK
DMGIRPVQFSTLDLLRTESYLLFYPGDNSETYPFENGAACGDSLWELGLEFTVSPGKT
GFRGAENHYALEGKERFKIYGVRLEGTTAADEGADLLKDGEKVGVVTYGMRSDLFDHT
VGIARMPVECATPGTKMTVRNGDGTEIPCVAEEMPFYDKDKAIRTAKG"
......
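Assuming the tool needs an annotated GenBank file (with CDS features, as in the example above), a quick check of the input before running can save a failed job. The sketch below uses only the standard library; summarize_genbank is a hypothetical helper name, and it is a rough line-based count, not a full GenBank parser.

```python
def summarize_genbank(lines):
    """Count LOCUS records, CDS features, and /translation qualifiers."""
    summary = {"records": 0, "cds": 0, "translations": 0}
    for line in lines:
        stripped = line.strip()
        parts = stripped.split()
        if line.startswith("LOCUS"):
            summary["records"] += 1
        elif len(parts) == 2 and parts[0] == "CDS":
            # A feature line looks like: CDS <location>
            summary["cds"] += 1
        elif stripped.startswith("/translation="):
            summary["translations"] += 1
    return summary


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_gbk.py genome.gbk
    with open(sys.argv[1]) as handle:
        print(summarize_genbank(handle))
```

A file that reports zero CDS features is probably sequence-only and would need annotating first.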
Output
Result table file:
prophage_coordinates.tsv
Example:
pp1 NC_002737 529631 569288 529591 529606 570494 570509 CATGTACAACTATAC CATGTACAACTATAC Longest Repeat flanking phage and within 2000 bp
pp2 NC_002737 778642 820599 778526 778576 820960 821010 AAACTCAAGAAGTGATTAAATAAAACATTAAAGAACCTTGTCATATCAAC AAACTCAAGAAGTGATTAAATAAAACATTAAAGAACCTTGTCATATCAAC Longest Repeat flanking phage and within 2000 bp
pp3 NC_002737 1191309 1222549 1193572 1193583 1220349 1220360 TCAGATTTTTT AAAAAATCTGA Longest Repeat flanking phage and within 2000 bp
pp4 NC_002737 1775862 1785658 1774377 1774389 1782817 1782829 AAATGACTAAGT ACTTAGTCATTT Longest Repeat flanking phage and within 2000 bp
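The rows above can be loaded for downstream use with a short parser. This is a minimal sketch: the column names are an interpretation of the example rows (prophage ID, contig, prophage start and stop, attL and attR coordinates, attL and attR sequences, and a note on how the att site was chosen) and are assumptions, as is the tab delimiter and the 1-based inclusive length calculation.

```python
import csv
import io

# Assumed column order, inferred from the example rows (no header row shown).
COLUMNS = [
    "prophage", "contig", "start", "stop",
    "attL_start", "attL_end", "attR_start", "attR_end",
    "attL_seq", "attR_seq", "att_reason",
]
INT_COLUMNS = COLUMNS[2:8]


def read_prophage_coordinates(handle):
    """Yield one dict per prophage from a tab-separated coordinates table."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and any non-numeric (for example header) lines.
        if len(row) < 4 or not row[2].isdigit():
            continue
        record = dict(zip(COLUMNS, row))
        for name in INT_COLUMNS:
            if record.get(name):
                record[name] = int(record[name])
        # Assumption: coordinates are 1-based and inclusive.
        record["length"] = record["stop"] - record["start"] + 1
        yield record


if __name__ == "__main__":
    with open("prophage_coordinates.tsv") as handle:
        for rec in read_prophage_coordinates(handle):
            print(rec["prophage"], rec["contig"], rec["length"])
```

Rows without detected att sites may have fewer columns; zip simply leaves those keys out of the record.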