Matrix Notes


Matrix multiplication:

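\[
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
\]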

Where \(c_i\) is:
\[
c_i = (\text{$i$-th row of } a) \cdot b^{T} = \sum_{j=0}^{3} a_{i,j}\, b_j
\]
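
As a minimal sketch of this definition, using plain Python lists: the helper name `matvec` and the sample values in `a` and `b` are illustrative assumptions, not part of these notes.

```python
def matvec(a, b):
    """Return c, where c[i] is the dot product of row i of a with b."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]

# Sample 4x4 matrix and vector (made-up values for illustration).
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

c = matvec(a, b)
print(c)  # [11, 27, 43, 59]
```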

Computed as two matrix multiplications and a vector addition:

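\[
\left(
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\right)
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ 0 \\ 0 \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
\]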
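
The row split can be sketched in a few lines of plain Python. The names `a_top`, `a_bottom`, `matvec` and `vec_add`, and the sample values, are assumptions made for illustration only. Each zero-padded half produces a partial result vector, and the two are added at the end.

```python
def matvec(a, b):
    """Multiply matrix a (list of rows) by vector b."""
    return [sum(x * y for x, y in zip(row, b)) for row in a]

def vec_add(u, v):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(u, v)]

# Made-up sample data.
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

zero_row = [0, 0, 0, 0]
a_top = a[:2] + [zero_row, zero_row]      # rows 0-1 kept, rows 2-3 zeroed
a_bottom = [zero_row, zero_row] + a[2:]   # rows 0-1 zeroed, rows 2-3 kept

c_top = matvec(a_top, b)        # has the form [c0, c1, 0, 0]
c_bottom = matvec(a_bottom, b)  # has the form [0, 0, c2, c3]
c = vec_add(c_top, c_bottom)    # same as matvec(a, b)
```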

Alternatively one could use:

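\[
\begin{pmatrix}
a_{0,0} & a_{0,1} \\
a_{1,0} & a_{1,1} \\
a_{2,0} & a_{2,1} \\
a_{3,0} & a_{3,1}
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \end{pmatrix}
+
\begin{pmatrix}
a_{0,2} & a_{0,3} \\
a_{1,2} & a_{1,3} \\
a_{2,2} & a_{2,3} \\
a_{3,2} & a_{3,3}
\end{pmatrix}
\begin{pmatrix} b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
+
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
\]

Here each column block of a multiplies the matching half of b, so x and y are full-length partial sums.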
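
A minimal sketch of the column split in plain Python follows. The names `a_left`, `a_right`, `matvec` and `vec_add`, and the sample values, are illustrative assumptions. Unlike the row split, both partial vectors are fully populated, and every entry of the result needs a contribution from each half.

```python
def matvec(a, b):
    """Multiply matrix a (list of rows) by vector b."""
    return [sum(x * y for x, y in zip(row, b)) for row in a]

def vec_add(u, v):
    """Element-wise sum of two vectors."""
    return [p + q for p, q in zip(u, v)]

# Made-up sample data.
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

a_left = [row[:2] for row in a]    # columns 0-1
a_right = [row[2:] for row in a]   # columns 2-3

x = matvec(a_left, b[:2])    # uses b0, b1 only
y = matvec(a_right, b[2:])   # uses b2, b3 only
c = vec_add(x, y)            # same as matvec(a, b)
```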

A plausibility argument that this may give more accurate results as well as being faster:

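We are working with 1-digit floating point numbers of the form \(\langle\text{digit}\rangle \times 10^{\langle\text{integer}\rangle}\).
Add \(1 \times 10^{0}\) ten times to get:
      \(1 \times 10^{1}\)
Add \(1 \times 10^{0}\) ten more times and, because of roundoff, the result is unchanged: each sum \(1 \times 10^{1} + 1 \times 10^{0} = 11\) rounds back to \(1 \times 10^{1}\) when only one digit is kept.
However, if you add the second ten in a separate calculation and then add the two results, you get:
      \(1 \times 10^{1} + 1 \times 10^{1} = 2 \times 10^{1}\)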
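
This roundoff effect can be reproduced with Python's standard `decimal` module set to one significant digit. This is only a sketch: the helper name `running_sum` and the choice of round-half-even are assumptions, and real hardware floats are binary with far more digits, so the effect there is the same in kind but much smaller.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Arithmetic context that keeps a single significant digit.
ctx = Context(prec=1, rounding=ROUND_HALF_EVEN)

def running_sum(terms):
    """Add terms one at a time, rounding to 1 digit after every addition."""
    total = Decimal(0)
    for t in terms:
        total = ctx.add(total, t)
    return total

ones = [Decimal(1)] * 20

# One long accumulation: it stalls at 10, since 10 + 1 = 11 rounds back to 10.
single = running_sum(ones)

# Two separate accumulations of ten terms each, combined at the end.
split = ctx.add(running_sum(ones[:10]), running_sum(ones[10:]))

print(single == 10, split == 20)  # True True
```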

Blaise Barney's MPI Tutorial

  • Collective Communication Routines
  • Point to Point Communication
  • Virtual Topologies

The MPI 3.1 Report


Footnote: Our MPI implementation will use column splitting, which simplifies the code.

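\[
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} \\
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
=
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & a_{0,3} \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3}
\end{pmatrix}
+
\begin{pmatrix}
a_{2,0} & a_{2,1} & a_{2,2} & a_{2,3} \\
a_{3,0} & a_{3,1} & a_{3,2} & a_{3,3}
\end{pmatrix}
\]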