Learning bioinformatics frameworks: nextflow (unabridged version of the WeChat article)

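In principle, most people only need to know how to write pipelines as shell scripts; the next step up is make. These tools cover the basic needs, but if you want to run on cloud computing you will probably need a more advanced framework.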

Choosing a framework

The author of "A review of bioinformatic pipeline frameworks" gives a good classification of the existing tools:

image https://vip.biotrainee.com/assets/images/5-b9Y64zX54qeWQQir.png

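The author's views:

Implicit syntax, i.e. Make-style rules, is better suited to integrating different execution tools.
Configuration-based pipelines are more stable and are also well suited to dispatching jobs on a cluster.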

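The author's final recommendations:

If the lab is neither doing purely biological experiments (which would call for a workbench-style UI) nor in need of high-performance, class-based pipeline design, the choice is harder; the main criterion is the ratio of effort invested to benefit gained.
If the lab does research that has to be repeatable, data and software need to be version-controlled, and the recommendation is configuration-based pipelines.
If the lab does exploratory proofs-of-concept, it needs DSL-based pipelines.
If the lab has no access to high-performance computing (HPC) and can only use cloud servers, the answer is server-based frameworks.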

Existing pipeline tools can be looked up in awesome-pipeline.

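At the moment, among the frameworks listed under "pipeline frameworks & library", nextflow is the biology-related framework with the most stars, so that is the one I started learning. Unfortunately nextflow needs to create FIFOs at run time, and these cannot be created on an NTFS file system, so it does not work under Ubuntu on Windows 10.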

A quick look at nextflow

nextflow runs on Java, and there are two ways to install it:

curl -s https://get.nextflow.io | bash
conda install -c bioconda nextflow

After installing, run nextflow run hello to check that the installation works. Once it does, write your first pipeline, tutorial.nf

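#!/usr/bin/env nextflow

// Default input string; can be overridden on the command line with --str
params.str = 'Hello world!'

// Split the string into 6-byte files and emit each file separately
process splitLetters {
    output:
    file 'chunk_*' into letters mode flatten

    """
    printf '${params.str}' | split -b 6 - chunk_
    """
}

// Upper-case the content of each chunk; runs once per file
process convertToUpper {
    input:
    file x from letters

    output:
    stdout result

    """
    cat $x | tr '[a-z]' '[A-Z]'
    """
}

// Print each result as it arrives
result.subscribe {
    println it.trim()
}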

Run the following on the command line:

nextflow run tutorial.nf
# terminal output
N E X T F L O W  ~  version 0.25.1
Launching `tutorial.nf` [jolly_ride] - revision: 4a11bf2927
[warm up] executor > local
[92/fd0b10] Submitted process > splitLetters
[d3/8f6c61] Submitted process > convertToUpper (2)
[2c/39e326] Submitted process > convertToUpper (1)
WORLD!
HELLO
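
To see what the two processes do, their script bodies can also be run by hand in a plain shell. The sketch below works outside nextflow and assumes a POSIX shell with split and tr available; the temporary directory and the upper_ file prefix are choices made for this illustration, not something nextflow creates.

```shell
set -eu

# Work in a throwaway directory so the chunk files do not clutter the current one.
workdir="$(mktemp -d)"
cd "$workdir"

# Step 1 (body of splitLetters): cut the string into 6-byte files
# named chunk_aa, chunk_ab, ...
printf 'Hello world!' | split -b 6 - chunk_

# Step 2 (body of convertToUpper): upper-case each chunk on its own.
for f in chunk_*; do
    tr 'a-z' 'A-Z' < "$f" > "upper_$f"
done

# Show one converted chunk per line.
for f in upper_chunk_*; do
    cat "$f"
    printf '\n'
done
```

In nextflow the two chunks are handled by separate convertToUpper tasks that run in parallel, which is why WORLD! can be printed before HELLO in the terminal output.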

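While running, nextflow creates a "work" directory under the current path and stores the data of every step there. What does that mean in practice? When you change one part of the script, the next run can build on the data that already exists instead of starting again from scratch. We can try this by modifying the convertToUpper process.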

process convertToUpper {
    input:
    file x from letters
    output:
    stdout result
    """
    rev $x
    """
}

Run the same command as before, with the -resume option added:

nextflow run tutorial.nf -resume
# terminal output
N E X T F L O W  ~  version 0.25.1
Launching `tutorial.nf` [berserk_torvalds] - revision: 211d9426dc
[warm up] executor > local
[92/fd0b10] Cached process > splitLetters
[bc/dd002c] Submitted process > convertToUpper (2)
[9b/ef1aa9] Submitted process > convertToUpper (1)
!dlrow
olleH

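You will notice that the splitLetters task carries the same ID, [92/fd0b10], in both runs, and that the second run reports it as [92/fd0b10] Cached process > splitLetters. In other words, after the edit nextflow picks up from the step that changed instead of running the whole pipeline from start to finish.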
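
The general idea behind this kind of resuming can be sketched in plain shell: key each step's result on its script text and its input, and reuse the stored result when the key has been seen before. This is only a simplified illustration and does not reproduce how nextflow actually computes its task hashes; run_cached, cache_dir and log are hypothetical names invented for this example, and a lower-casing tr command stands in for the edited step.

```shell
set -eu

cache_dir="$(mktemp -d)"      # stands in for the work/ directory
log="$cache_dir/run.log"      # records whether each step ran or was reused

# run_cached NAME SCRIPT INPUT  (hypothetical helper, not a nextflow command)
# Runs SCRIPT on INPUT unless the same SCRIPT/INPUT pair was seen before.
run_cached() {
    name="$1"; script="$2"; input="$3"
    # Key the result on both the script text and the input data.
    key="$(printf '%s\n%s' "$script" "$input" | cksum | cut -d' ' -f1)"
    out="$cache_dir/$key.out"
    if [ -f "$out" ]; then
        echo "Cached process > $name" >> "$log"
    else
        echo "Submitted process > $name" >> "$log"
        printf '%s' "$input" | sh -c "$script" > "$out"
    fi
    cat "$out"
}

# First run: nothing is cached yet, so both steps execute.
run_cached splitLetters "cut -c1-5" 'Hello world!' > /dev/null
first="$(run_cached convertToUpper "tr 'a-z' 'A-Z'" 'Hello')"

# Second run: only the second step's script was edited,
# so the first step is served from the cache.
run_cached splitLetters "cut -c1-5" 'Hello world!' > /dev/null
second="$(run_cached convertToUpper "tr 'A-Z' 'a-z'" 'Hello')"

printf '%s\n%s\n' "$first" "$second"
cat "$log"
```

The log ends up with three "Submitted" lines and one "Cached" line: the unchanged first step is reused, while the edited second step gets a new key and runs again.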

nextflow also accepts parameters from the command line, which override the value already set in the script by params.str = 'Hello world!'.

nextflow run tutorial.nf --str 'Hola mundo'
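
A default that the caller can override has a close analogue in plain shell. The sketch below imitates the pipeline outside nextflow to show how a different string changes the chunks; split_upper is a hypothetical helper written for this example, and its optional argument plays the role of --str.

```shell
set -eu

# split_upper [STRING]  (hypothetical helper, not part of nextflow)
# STRING falls back to 'Hello world!' when no argument is given,
# much like params.str keeps its default unless --str is passed.
split_upper() {
    str="${1:-Hello world!}"
    dir="$(mktemp -d)"
    (
        cd "$dir"
        # Same commands as the two process bodies: 6-byte chunks, then upper-case.
        printf '%s' "$str" | split -b 6 - chunk_
        for f in chunk_*; do
            printf '%s\n' "$(tr 'a-z' 'A-Z' < "$f")"
        done
    )
    rm -rf "$dir"
}

default_out="$(split_upper)"                 # uses the built-in default
override_out="$(split_upper 'Hola mundo')"   # caller-supplied value wins

printf '%s\n---\n%s\n' "$default_out" "$override_out"
```

Because 'Hola mundo' is 10 bytes long, the split produces a 6-byte chunk followed by a 4-byte one, so the two pieces are no longer whole words.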

The above covers the basics of how nextflow works. How to write real pipelines with nextflow will be covered in later posts.

PS: Someone in the group chat said there was no update today. They clearly underestimated my reputation.