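The pipeline includes quality control with fastp, assembly with MEGAHIT, gene prediction (MetaGene), redundancy removal to obtain a non-redundant gene set, quantification with salmon, gene annotation, and taxonomic annotation, among other steps.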
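As a rough illustration of how the quality-control, assembly, and quantification steps chain together, the sketch below only builds the command lines and does not run anything. The file names, the output layout, the `nr_genes.fa` gene-set name, and the helper `build_commands` are hypothetical; the source does not give the actual parameters used, so the flags shown are just commonly used options of each tool. Gene prediction, redundancy removal, and annotation are left out because their exact invocations are not given here.

```python
import shlex
from pathlib import Path


def build_commands(sample, r1, r2, outdir, threads=8):
    """Return argv lists for the QC, assembly and quantification steps.

    Nothing is executed; all paths below are illustrative assumptions.
    """
    out = Path(outdir)
    clean1 = str(out / f"{sample}_clean_R1.fq.gz")
    clean2 = str(out / f"{sample}_clean_R2.fq.gz")
    asm_dir = str(out / f"{sample}_megahit")
    genes = str(out / "nr_genes.fa")        # hypothetical non-redundant gene set
    index = str(out / "nr_genes_index")
    quant_dir = str(out / f"{sample}_salmon")

    return [
        # 1. Quality control of paired-end reads with fastp
        ["fastp", "-i", r1, "-I", r2, "-o", clean1, "-O", clean2,
         "-w", str(threads)],
        # 2. Assembly of the cleaned reads with MEGAHIT
        ["megahit", "-1", clean1, "-2", clean2, "-o", asm_dir,
         "-t", str(threads)],
        # 3. Build a salmon index over the non-redundant gene set
        ["salmon", "index", "-t", genes, "-i", index],
        # 4. Quantify gene abundance from the cleaned reads
        ["salmon", "quant", "-i", index, "-l", "A",
         "-1", clean1, "-2", clean2, "--meta",
         "-p", str(threads), "-o", quant_dir],
    ]


if __name__ == "__main__":
    for cmd in build_commands("S1", "S1_R1.fq.gz", "S1_R2.fq.gz", "results"):
        print(shlex.join(cmd))
```

Returning argument lists rather than shell strings keeps the steps easy to inspect and lets each one be passed to `subprocess.run` unchanged once the real parameters are known.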
References:
1
Clustering of highly homologous sequences to reduce the size of large protein databases, Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2001) 17:282-283; Tolerating some redundancy significantly speeds up clustering of large protein databases, Weizhong Li, Lukasz Jaroszewski & Adam Godzik. Bioinformatics, (2002) 18:77-82
2
Li D, Liu C M, Luo R, et al. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph[J]. Bioinformatics, 2015, 31(10):1674-1676.
3
Huerta-Cepas J, Forslund K, Coelho L P, et al. Fast Genome-Wide Functional Annotation through Orthology Assignment by eggNOG-Mapper[J]. Molecular Biology and Evolution, 2017, 34(8):2115-2122.
4
Chen S, Zhou Y, Chen Y, et al. fastp: an ultra-fast all-in-one FASTQ preprocessor[J]. Bioinformatics, 2018, 34(17):i884-i890.
5
Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences[J]. Nucleic Acids Research, 2006, 34(19):5623-5630.
6
Truong D T, Franzosa E A, Tickle T L, et al. Erratum: MetaPhlAn2 for enhanced metagenomic taxonomic profiling[J]. Nature Methods.