NewMer - New Pipeline Task

Pipeline: Metagenome pipeline [metagenome]

Help documentation

Date: 2025-03-07

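The pipeline covers quality control with fastp, assembly with MEGAHIT, gene prediction with MetaGene, redundancy removal to obtain a non-redundant gene set, quantification with salmon, gene annotation, and taxonomic annotation.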

References:
1. Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein database. Bioinformatics, 2001, 17:282-283; Li W, Jaroszewski L, Godzik A. Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics, 2002, 18:77-82.
2. Li D, Liu C M, Luo R, et al. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics, 2015, 31(10):1674-1676.
3. Huerta-Cepas J, Forslund K, Coelho L P, et al. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper. Molecular Biology and Evolution, 2017, 34(8).
4. Chen S, Zhou Y, Chen Y, et al. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 2018, 34(17):i884-i890.
5. Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Research, 2006.
6. Truong D T, Franzosa E A, Tickle T L, et al. Erratum: MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nature Methods.

Runtime environment

Online / Local: none

Parameter settings

Task name

Data quality control

fastp: a tool designed to provide fast all-in-one preprocessing for FASTQ files. It is developed in C++ with multithreading support for high performance.

info.csv file


qualified_quality_phred

length_required

n_base_limit

cut_mean_quality

cut_window_size

cut_front

cut_tail
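The parameters above correspond to fastp command-line options of the same names. As a rough sketch of how the form values might be turned into a command line, the Python below assembles an argument list. The function name `build_fastp_command`, the file names, and the numeric values are assumptions for illustration; this page does not state the form's defaults or how the platform actually invokes fastp.

```python
import shlex

# Illustrative values only. The form's real defaults are not shown on this page.
FASTP_PARAMS = {
    "qualified_quality_phred": 15,  # base quality at or above this counts as qualified
    "length_required": 15,          # reads shorter than this are discarded
    "n_base_limit": 5,              # reads with more N bases than this are discarded
    "cut_mean_quality": 20,         # mean quality threshold for sliding-window cutting
    "cut_window_size": 4,           # sliding window size for cutting
}


def build_fastp_command(in1, in2, out1, out2, params=FASTP_PARAMS,
                        cut_front=True, cut_tail=True):
    """Return a fastp argument list for one paired-end sample (hypothetical helper)."""
    cmd = ["fastp", "--in1", in1, "--in2", in2, "--out1", out1, "--out2", out2]
    for name, value in params.items():
        cmd += [f"--{name}", str(value)]
    # cut_front and cut_tail are on/off switches rather than valued options.
    if cut_front:
        cmd.append("--cut_front")
    if cut_tail:
        cmd.append("--cut_tail")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names.
    print(shlex.join(build_fastp_command(
        "S1_R1.fq.gz", "S1_R2.fq.gz", "S1_clean_R1.fq.gz", "S1_clean_R2.fq.gz")))
```

Returning a list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.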

Sequence assembly

MEGAHIT assembles metagenomic sequencing data. It is fast and has low resource consumption.

min contig length

k-min

k-max

k-step
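The four assembly parameters map onto MEGAHIT's `--min-contig-len`, `--k-min`, `--k-max`, and `--k-step` options. The sketch below validates the k-mer settings before building the command. It assumes, following MEGAHIT's help text as I understand it, that k-min and k-max must be odd and k-step must be even. The helper name, file names, and numeric values are illustrative and not taken from this page.

```python
def build_megahit_command(r1, r2, out_dir, min_contig_len=200,
                          k_min=21, k_max=141, k_step=12):
    """Return a MEGAHIT argument list for paired-end reads (hypothetical helper)."""
    # Assumed constraints: odd k-mer sizes, even step, k_min not above k_max.
    if k_min % 2 == 0 or k_max % 2 == 0:
        raise ValueError("k-min and k-max must be odd")
    if k_step % 2 != 0:
        raise ValueError("k-step must be even")
    if k_min > k_max:
        raise ValueError("k-min must not exceed k-max")
    return [
        "megahit", "-1", r1, "-2", r2, "-o", out_dir,
        "--min-contig-len", str(min_contig_len),
        "--k-min", str(k_min),
        "--k-max", str(k_max),
        "--k-step", str(k_step),
    ]


if __name__ == "__main__":
    # Hypothetical file and directory names.
    print(" ".join(build_megahit_command(
        "S1_clean_R1.fq.gz", "S1_clean_R2.fq.gz", "S1_megahit")))
```

An even step applied to an odd k-min keeps every intermediate k-mer size odd, which is why the two constraints go together.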

Non-redundant gene set

CD-HIT is used to remove redundant sequences from the FASTA file.

identity
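In CD-HIT the identity threshold is passed with `-c` as a fraction between 0 and 1. The sketch below builds such a call. It assumes the nucleotide variant `cd-hit-est` is the program used for predicted genes, which this page does not confirm (it only names cd-hit). The helper name, file names, and the 0.95 value are likewise illustrative.

```python
def build_cdhit_command(in_fasta, out_fasta, identity=0.95, program="cd-hit-est"):
    """Return a CD-HIT argument list (hypothetical helper).

    `program` is an assumption: cd-hit-est clusters nucleotide sequences,
    while cd-hit clusters proteins.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [program, "-i", in_fasta, "-o", out_fasta, "-c", str(identity)]


if __name__ == "__main__":
    # Hypothetical file names.
    print(" ".join(build_cdhit_command("genes.fa", "genes.nr.fa")))
```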

salmon index building

salmon calculates gene expression or abundance. An index must be built for the reference sequences before quantification.

kmer
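The kmer parameter corresponds to the `-k` option of `salmon index`. The sketch below builds the indexing command and checks the k-mer size, assuming salmon's documented requirement that k be odd and no larger than 31. The helper name and file names are hypothetical.

```python
def build_salmon_index_command(ref_fasta, index_dir, kmer=31):
    """Return a `salmon index` argument list (hypothetical helper)."""
    # Assumed constraint from the salmon documentation: odd k, at most 31.
    if kmer % 2 == 0 or not 1 <= kmer <= 31:
        raise ValueError("kmer must be an odd number no larger than 31")
    return ["salmon", "index", "-t", ref_fasta, "-i", index_dir, "-k", str(kmer)]


if __name__ == "__main__":
    # Hypothetical names: the non-redundant gene set serves as the reference.
    print(" ".join(build_salmon_index_command("genes.nr.fa", "genes_index")))
```

A smaller k can help when reads are short, at the cost of less specific matches.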

Gene amino acid sequences

Gene nucleotide sequences are translated into amino acid sequences with the transeq program from EMBOSS-6.5.7.

Genetic code table

trim

clean
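These three settings correspond to transeq's `-table`, `-trim`, and `-clean` qualifiers. In transeq, `-trim` removes trailing `X` and `*` characters from the translation and `-clean` replaces stop-codon `*` characters with `X`. The sketch below assembles the call. The helper name, file names, and the choice of table 11 (the bacterial code) are assumptions for illustration, not values stated on this page.

```python
def build_transeq_command(nucl_fasta, prot_fasta, table=11, trim=True, clean=True):
    """Return an EMBOSS transeq argument list (hypothetical helper)."""
    cmd = ["transeq", "-sequence", nucl_fasta, "-outseq", prot_fasta,
           "-table", str(table)]
    if trim:
        cmd.append("-trim")   # drop trailing X and * from each translation
    if clean:
        cmd.append("-clean")  # write stop codons as X instead of *
    return cmd


if __name__ == "__main__":
    # Hypothetical file names.
    print(" ".join(build_transeq_command("genes.nr.fa", "genes.nr.faa")))
```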


Environment

Software:

Databases:

Other:
