Matrix Notes

Matrix multiplication:

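$$
\begin{pmatrix}
a_{0 0} & a_{0 1} & a_{0 2} & a_{0 3} \\
a_{1 0} & a_{1 1} & a_{1 2} & a_{1 3} \\
a_{2 0} & a_{2 1} & a_{2 2} & a_{2 3} \\
a_{3 0} & a_{3 1} & a_{3 2} & a_{3 3}
\end{pmatrix}
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
$$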

Where c_i is:
(i-th row of a) ∙ b^T = a_{i 0} b_0 + a_{i 1} b_1 + a_{i 2} b_2 + a_{i 3} b_3
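As a minimal sketch of this definition, the Python below computes each c_i as the dot product of row i of a with b. The function name `matvec` and the sample values are illustrative assumptions and do not come from the notes.

```python
def matvec(a, b):
    """Return c, where c[i] is the dot product of row i of a with b."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


# Illustrative 4x4 matrix and vector (values chosen arbitrarily).
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

c = matvec(a, b)
print(c)  # [11, 27, 43, 59]
```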

Computed as two matrix multiplications and a vector addition:

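$$
\left(
\begin{pmatrix}
a_{0 0} & a_{0 1} & a_{0 2} & a_{0 3} \\
a_{1 0} & a_{1 1} & a_{1 2} & a_{1 3} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
a_{2 0} & a_{2 1} & a_{2 2} & a_{2 3} \\
a_{3 0} & a_{3 1} & a_{3 2} & a_{3 3}
\end{pmatrix}
\right)
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ 0 \\ 0 \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
$$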
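The split into top and bottom rows can be checked with a short Python sketch. The helper names `matvec` and `row_split` and the sample values are assumptions made for illustration. Each half is multiplied by b separately, and the two partial vectors are then added.

```python
def matvec(a, b):
    """Return c, where c[i] is the dot product of row i of a with b."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


def row_split(a, k):
    """Split a into two matrices of the same shape: the first keeps
    rows 0..k-1, the second keeps rows k onward; all other rows are zero."""
    zero = [0] * len(a[0])
    top = [row[:] if i < k else zero[:] for i, row in enumerate(a)]
    bottom = [row[:] if i >= k else zero[:] for i, row in enumerate(a)]
    return top, bottom


# Illustrative values (chosen arbitrarily).
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

top, bottom = row_split(a, 2)
c_top = matvec(top, b)        # [11, 27, 0, 0]
c_bottom = matvec(bottom, b)  # [0, 0, 43, 59]

# The vector addition that combines the two partial results.
c = [p + q for p, q in zip(c_top, c_bottom)]
print(c)  # [11, 27, 43, 59]
```

With a row split, each partial result fills in a disjoint set of entries of c, so the final addition only adds zeros to the computed values.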

Alternatively one could use:

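$$
\left(
\begin{pmatrix}
a_{0 0} & a_{0 1} & 0 & 0 \\
a_{1 0} & a_{1 1} & 0 & 0 \\
a_{2 0} & a_{2 1} & 0 & 0 \\
a_{3 0} & a_{3 1} & 0 & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & 0 & a_{0 2} & a_{0 3} \\
0 & 0 & a_{1 2} & a_{1 3} \\
0 & 0 & a_{2 2} & a_{2 3} \\
0 & 0 & a_{3 2} & a_{3 3}
\end{pmatrix}
\right)
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{pmatrix}
+
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{pmatrix}
$$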
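The column-wise alternative can be sketched in the same way. The helper names `matvec` and `col_split` and the sample values are illustrative assumptions. Here x and y are both full-length partial sums, and c is their element-wise sum.

```python
def matvec(a, b):
    """Return c, where c[i] is the dot product of row i of a with b."""
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


def col_split(a, k):
    """Split a into two matrices of the same shape: the first keeps
    columns 0..k-1, the second keeps columns k onward; the rest is zero."""
    left = [[v if j < k else 0 for j, v in enumerate(row)] for row in a]
    right = [[v if j >= k else 0 for j, v in enumerate(row)] for row in a]
    return left, right


# Illustrative values (chosen arbitrarily).
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
b = [1, 0, 2, 1]

left, right = col_split(a, 2)
x = matvec(left, b)   # uses only b[0] and b[1]: [1, 5, 9, 13]
y = matvec(right, b)  # uses only b[2] and b[3]: [10, 22, 34, 46]

# Every entry of c is the sum of two partial sums.
c = [p + q for p, q in zip(x, y)]
print(c)  # [11, 27, 43, 59]
```

Unlike the row split, each half here contributes to every entry of c, but needs only its own half of b.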

A plausibility argument that this may give more accurate results as well as being faster:

Suppose we are working with one-digit floating-point numbers of the form <digit>∗10^<integer>.
Add 1∗10⁰ ten times to get:
      1∗10¹
Add 1∗10⁰ ten more times and, because of roundoff, the result is unchanged: each 1∗10¹ + 1∗10⁰ = 11 rounds back to 1∗10¹.
However, if you add the second ten in a separate calculation and then add the two results, you get:
      1∗10¹ + 1∗10¹ = 2∗10¹
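The argument can be reproduced with Python's standard `decimal` module, using a context limited to one significant digit as a stand-in for the one-digit format described above. The helper name `rounded_sum` is an assumption made for illustration, and round-half-even is chosen here only because it is the module's default rounding mode.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# One significant digit: every result is rounded to <digit> * 10^<integer>.
ONE_DIGIT = Context(prec=1, rounding=ROUND_HALF_EVEN)


def rounded_sum(terms, ctx=ONE_DIGIT):
    """Add the terms left to right, rounding after every addition."""
    total = Decimal(0)
    for t in terms:
        total = ctx.add(total, t)
    return total


ones = [Decimal(1)] * 20

# All twenty additions in one running sum: stalls at 1 * 10^1.
single_pass = rounded_sum(ones)

# Two separate sums of ten terms each, combined at the end.
split = ONE_DIGIT.add(rounded_sum(ones[:10]), rounded_sum(ones[10:]))

print(single_pass)  # 1E+1
print(split)        # 2E+1
```

The exact answer is 20, so in this toy format the split computation is exact while the single running sum loses half of the total.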

Blaise Barney's MPI Tutorial

Collective Communication Routines

Point to Point Communication

Virtual Topologies

The MPI 3.1 Report

Footnote: Our MPI implementation will use column splitting, which simplifies the code.

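$$
\begin{pmatrix}
a_{0 0} & a_{0 1} & a_{0 2} & a_{0 3} \\
a_{1 0} & a_{1 1} & a_{1 2} & a_{1 3} \\
a_{2 0} & a_{2 1} & a_{2 2} & a_{2 3} \\
a_{3 0} & a_{3 1} & a_{3 2} & a_{3 3}
\end{pmatrix}
=
\begin{pmatrix}
a_{0 0} & a_{0 1} & a_{0 2} & a_{0 3} \\
a_{1 0} & a_{1 1} & a_{1 2} & a_{1 3} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
a_{2 0} & a_{2 1} & a_{2 2} & a_{2 3} \\
a_{3 0} & a_{3 1} & a_{3 2} & a_{3 3}
\end{pmatrix}
$$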
