zhaotianjing edited this page Feb 21, 2022 · 3 revisions
  1. Some BLAS routines have a quick-return check: for example, axpy! computes y = a*x + y, and when the scalar a is zero the routine can return y unchanged without doing any arithmetic. Such calls therefore finish faster than usual.

    Example:

    BLAS.axpy!(oldAlpha-α[j],x,yCorr)
    

    If oldAlpha-α[j] is zero, BLAS can return immediately and yCorr is left unchanged. If oldAlpha-α[j] is nonzero, BLAS performs the full update yCorr = yCorr + (oldAlpha-α[j])*x.

    In conclusion: when testing speed, check whether any input is zero (scalar, vector, or matrix); otherwise the measured time may be misleadingly short.
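The behaviour described above can be checked with a small sketch. The vectors and the scalar 0.5 are made-up test data, not values from the original code; the sketch only verifies the results of the two cases and does not measure timing, which depends on the BLAS library in use.

```julia
using LinearAlgebra

x = rand(1_000)
y = rand(1_000)
y_before = copy(y)

# Zero scalar: y + 0*x equals y, so y must come back unchanged.
BLAS.axpy!(0.0, x, y)
zero_case_unchanged = (y == y_before)

# Nonzero scalar: y is updated in place to y + a*x.
a = 0.5
BLAS.axpy!(a, x, y)
nonzero_case_updated = (y ≈ y_before .+ a .* x)
```

When benchmarking a loop that calls axpy!, a guard such as iszero(a) on the scalar is one way to tell the two cases apart before comparing timings.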

  2. reshape returns an array that shares memory with the original; it does not make a copy. Modifying one modifies the other.

    a=[1,1,1]
    b=reshape(a,1,3)
    b[1]=999
    b   # 1×3 Matrix: 999  1  1
    a   # [999, 1, 1]: a changed too
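If an independent array is needed, one option is to copy before reshaping. The following is a minimal sketch; the variable c and the value -5 are illustrative only.

```julia
a = [1, 1, 1]

# reshape returns an array that shares memory with a
b = reshape(a, 1, 3)
b[1] = 999          # also changes a[1]

# copy first when an independent array is needed
c = reshape(copy(a), 1, 3)
c[2] = -5           # a[2] is untouched
```

The copy costs one extra allocation, so it is only worth doing when the reshaped array will be modified and the original must stay intact.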

Extra pitfall from Tianjing:

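  1. I found this pitfall when I replaced a BLAS call with an ordinary expression. Inside the helper function, BLAS.axpy! modifies ycorr in place, so the caller sees the change. In contrast, ycorr += a*x creates a new array and rebinds only the local name ycorr inside the helper; the caller's array is left untouched. So the helper has to return ycorr, and the caller has to assign the result.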

    • BLAS version:
    function helper(ycorr)
       BLAS.axpy!(a,x,ycorr)
    end
    
    function f()
        helper(ycorr)
    end
    
    • Normal version #WRONG!
    function helper(ycorr)
       ycorr += a*x   # rebinds the local ycorr to a new array
    end
    
    function f()
        helper(ycorr)  # result is discarded; the caller's ycorr is unchanged
    end
    
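    • Right version
    function helper(ycorr)
       ycorr += a*x   # rebinds the local ycorr to a new array
       return ycorr   # hand the new array back to the caller
    end
    
    function f()
        # ycorr is assumed to be defined earlier in f
        ycorr = helper(ycorr)
    end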
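Another way to keep the in-place behaviour of the BLAS version, without calling BLAS, is the broadcast assignment .+=, which writes into the existing array. The sketch below compares the three variants; the function names helper_rebind and helper_inplace! are hypothetical, and a and x are passed as arguments here so the example is self-contained.

```julia
using LinearAlgebra

# Rebinding: `+=` builds a new array; the caller's array is not modified.
function helper_rebind(ycorr, a, x)
    ycorr += a * x
    return ycorr
end

# In-place: `.+=` writes into the existing array, like BLAS.axpy!.
function helper_inplace!(ycorr, a, x)
    ycorr .+= a .* x
    return ycorr
end

x = [1.0, 2.0, 3.0]
a = 2.0

y1 = zeros(3)
helper_rebind(y1, a, x)      # return value discarded; y1 is still all zeros

y2 = zeros(3)
helper_inplace!(y2, a, x)    # y2 now holds a*x

y3 = zeros(3)
BLAS.axpy!(a, x, y3)         # y3 now holds a*x as well
```

The in-place form also avoids allocating a temporary array on every call, which may matter inside a tight sampling loop.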
    