Every code block except the cleanup code has access to the %(fail)s template. For three code blocks, the generated C code looks roughly like this:

.. code-block::

    int failure = 0;
    {
        <code1>
        {
            <code2>
            {
                <code3>
                label3:
                <cleanup3>
            }
            label2:
            <cleanup2>
        }
        label1:
        <cleanup1>
    }
    return failure;

In the nth code block, %(fail)s expands to "{failure = n; goto label<n>;}". As a result, only the blocks executed up to the failure point are cleaned up, and the return value indicates which block failed, which is handy for debugging.
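The failure mechanism described above can be sketched in plain C. This is only an illustration under stated assumptions: the ``FAIL`` macro, the ``run_blocks`` function, its ``fail_at`` parameter, and the trace strings are hypothetical names that stand in for %(fail)s and the <codeN>/<cleanupN> placeholders; they are not part of the generated code.

```c
#include <string.h>

/* Hypothetical stand-in for %(fail)s in the nth code block. */
#define FAIL(n) { failure = (n); goto label##n; }

/* Runs three nested blocks. Block number `fail_at` (1..3) fails;
 * 0 means no block fails. Each executed step appends a token to
 * `trace` ("cN" for <codeN>, "xN" for <cleanupN>). The caller must
 * provide at least 32 bytes for `trace`. */
static int run_blocks(int fail_at, char *trace)
{
    int failure = 0;
    trace[0] = '\0';
    {
        strcat(trace, "c1 ");
        if (fail_at == 1) FAIL(1)
        {
            strcat(trace, "c2 ");
            if (fail_at == 2) FAIL(2)
            {
                strcat(trace, "c3 ");
                if (fail_at == 3) FAIL(3)
            label3:
                strcat(trace, "x3 ");
            }
        label2:
            strcat(trace, "x2 ");
        }
    label1:
        strcat(trace, "x1 ");
    }
    return failure;
}
```

With ``fail_at == 2`` the third block is never entered, so its cleanup is skipped, while the cleanups of blocks 2 and 1 still run on the way out.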
When compiling an Op, we want to sync the outputs so we can get the results from Python. In case of failure, we will not necessarily want to sync. Because of that, typical code will look like this:

.. code-block::

    int failure = 0;
    <declare input>
    <declare output>
    {
        <extract input>
        {
            <extract output>
            {
                <perform>
                label3:
                <clean up perform>
            }
            label2:
            if (!failure)
                <sync output>
            <clean up output>
        }
        label1:
        <clean up input>
    }
    return failure;

Furthermore, it is not necessary to extract the output because we mean to overwrite it anyway. In that case, <extract output> will be a no-op, but of course we may still need to clean up or sync what <perform> will put in the declared outputs.
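As a rough illustration of the sync-on-success pattern, here is a toy op written by hand in C. Everything specific in it is an assumption made for the example: the function name ``run_halve_op``, the use of plain ``double`` pointers in place of Python-side storage, and the particular failure conditions. The output is never read, only checked for writability, so <extract output> is close to a no-op here.

```c
#include <stddef.h>

/* Hypothetical op: halves a non-negative number.
 * `in_storage` and `out_storage` stand in for the Python-side
 * containers; returns 0 on success or the number of the failing block. */
static int run_halve_op(const double *in_storage, double *out_storage)
{
    int failure = 0;
    double input = 0.0;   /* <declare input> */
    double output = 0.0;  /* <declare output> */
    {
        /* <extract input> */
        if (in_storage == NULL) { failure = 1; goto label1; }
        input = *in_storage;
        {
            /* <extract output>: the old value is not read */
            if (out_storage == NULL) { failure = 2; goto label2; }
            {
                /* <perform> */
                if (input < 0.0) { failure = 3; goto label3; }
                output = input / 2.0;
            label3:
                ;  /* <clean up perform>: nothing to release */
            }
        label2:
            if (!failure)
                *out_storage = output;  /* <sync output> */
            /* <clean up output>: nothing to release */
        }
    label1:
        ;  /* <clean up input>: nothing to release */
    }
    return failure;
}
```

When <perform> fails, the caller's output storage keeps its previous value, because the sync step is guarded by ``if (!failure)``.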
...
@@ -124,20 +122,19 @@ Example ResultBase
The following ResultBase represents a double (we only care about the C part).
@@ -33,27 +33,26 @@ Question: does it make sense to apply the order to the loop, or is this broadcas
Here is the loop for {{{order == c}}}. Check for errors!

.. code-block::

    <initialize iterators>

    i1 = -1
    while (++i1 < dim1) {
        i2 = -1
        rank_N-1_accumulator = init
        while (++i2 < dim2) {
            ...
            iN = -1
            while (++iN < dimN) {
                <accumulate rank N input>
                <SET rank N output using broadcasted inputs>
                <NEXT rank N iterator>
            }
            ...
        }
        <SET rank 1 output using accumulated inputs>
        <NEXT rank 1 iterator>
    }

When {{{order == f}}}, the iterators ''ideally'' (but not necessarily) iterate in FORTRAN order, i.e. the while loops are on {{{dimN..dim1}}} instead of {{{dim1..dimN}}}.
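To make the difference between the two loop orders concrete, here is a minimal two-dimensional sketch in C. It assumes a C-contiguous buffer, so element (i1, i2) lives at flat offset ``i1 * dim2 + i2``. The function name ``visit_offsets`` and its ``fortran`` flag are hypothetical, and the sketch only records the order in which elements would be visited; it leaves out accumulation and broadcasting.

```c
/* Records the flat offsets of a dim1 x dim2 C-contiguous array in the
 * order the nested while loops visit them. `offsets` must have room
 * for dim1 * dim2 entries. */
static void visit_offsets(long dim1, long dim2, int fortran, long *offsets)
{
    long n = 0;
    if (!fortran) {
        /* order == c: loops run over dim1..dimN, last index fastest */
        long i1 = -1;
        while (++i1 < dim1) {
            long i2 = -1;
            while (++i2 < dim2) {
                offsets[n++] = i1 * dim2 + i2;
            }
        }
    } else {
        /* order == f: loops run over dimN..dim1, first index fastest */
        long i2 = -1;
        while (++i2 < dim2) {
            long i1 = -1;
            while (++i1 < dim1) {
                offsets[n++] = i1 * dim2 + i2;
            }
        }
    }
}
```

For a 2 x 3 array, the C-order loop visits offsets 0, 1, 2, 3, 4, 5, while the FORTRAN-order loop visits 0, 3, 1, 4, 2, 5.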