We would like to get rid of weave dependencies, avoid name conflicts with the support code and have a nicer user interface for the produced module. The proposed new structure is as follows:

.. code-block:: c

    <imports>

    struct op1 {
        <persistent variables>
        <support code>
        init() { <initialize persistent fields> }
        cleanup() { <clean up persistent fields> }
        run(<inputs>) { <run the computation for op1> }
    };

    struct op2 { <same> };
    ...
    struct opN { <ditto> };

    struct driver {
        op1 o1; op2 o2; ... opN oN;
        <input storage>
        <output storage>
        ...
            oN.run(...);
            <sync outputs>
        }
    };

    PyObject* <name>(PyObject* inputs) {
        <init driver, input/output storage>
        <put inputs in input storage>
        driver.run();
        <free input storage>
        <return output storage>
    }

    PyObject* <name>_driver(PyObject* storage) {
        <init driver with storage>
        <return driver>
    }

    <export <name> and <name>_driver>

Gains:
* support code can be put inside a struct and become private to the Op
* we can export several functions that can be used directly, eg ``z = module.add(1, 2)``
* this won't do filtering like ``Result.filter`` so the usefulness is limited by that
* the sequence of operations might be clearer to read
* we can use more descriptive names in each Op struct representing its input names (if we can find them using the inspect module) without worrying about name conflicts
...
Losses:
...
* make functions static and inline as much as possible
Caching
=======
Caching is currently keyed on a hash of the generated code. This is inefficient because the code has to be generated every time, which can be costly. Furthermore, the use of hashing in sets makes it difficult to ensure a consistent ordering of Ops in graphs where several orderings are valid, so the generated C code may differ from one run to the next. Here is a proposal for a better way to compute the hash:
* Result_hash = Result version + Result desc
* Op_hash = Op version + Op desc + input/output hashes
* Env_hash = Env version + combination of the Op hashes and their traversal order wrt a consistent traversal method
The version could be set explicitly via a ``__version__`` field or it could simply be equal to the file's last modification date. We could also have a ``__nocache__`` field indicating that code produced by the Op or Result cannot be cached.
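
The hash scheme proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (``_digest``, ``result_hash``, ``op_hash``, ``env_hash``), the choice of SHA-256, and the convention that ``None`` stands for "not cacheable" (the ``__nocache__`` case) are all hypothetical and not part of the proposal.

```python
import hashlib


def _digest(*parts):
    """Combine the string forms of `parts`, in order, into one hex digest."""
    h = hashlib.sha256()
    for part in parts:
        data = str(part).encode("utf-8")
        # Length-prefix each part so that ("ab", "c") and ("a", "bc") differ.
        h.update(str(len(data)).encode("ascii") + b":" + data)
    return h.hexdigest()


def result_hash(version, desc, nocache=False):
    """Result_hash = Result version + Result desc (None if uncacheable)."""
    if nocache:
        return None
    return _digest("Result", version, desc)


def op_hash(version, desc, input_hashes, output_hashes, nocache=False):
    """Op_hash = Op version + Op desc + input/output hashes."""
    hashes = list(input_hashes) + list(output_hashes)
    # Assumption: one uncacheable Result makes the whole Op uncacheable.
    if nocache or any(h is None for h in hashes):
        return None
    # Record the number of inputs so inputs and outputs cannot be confused.
    return _digest("Op", version, desc, len(input_hashes), *hashes)


def env_hash(version, op_hashes_in_order):
    """Env_hash = Env version + Op hashes in a consistent traversal order."""
    op_hashes = list(op_hashes_in_order)
    if any(h is None for h in op_hashes):
        return None
    return _digest("Env", version, *op_hashes)
```

Because the Op hashes are combined in sequence, the caller is responsible for passing them in an order produced by a deterministic traversal; two different orderings of the same Ops yield different Env hashes in this sketch.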
It should also be easier to bypass the cache (e.g. an option to CLinker to regenerate the code).
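
One way such a bypass could look is sketched below. The ``CodeCache`` class, its ``get`` method and the ``force`` flag are hypothetical names chosen for illustration; they are not an existing CLinker interface. The sketch also assumes that a key of ``None`` means the code must never be cached.

```python
class CodeCache:
    """Minimal in-memory cache of generated code, keyed by a hash string."""

    def __init__(self):
        self._entries = {}

    def get(self, key, generate, force=False):
        """Return the cached code for `key`, calling `generate()` on a miss.

        `force=True` bypasses the cache and regenerates the code.
        A `key` of None (assumed to mean "uncacheable") is never stored.
        """
        if key is None:
            return generate()
        if force or key not in self._entries:
            self._entries[key] = generate()
        return self._entries[key]
```

With this shape, the expensive generation step only runs on a miss or when the caller explicitly asks for regeneration, and a forced regeneration replaces the stale entry rather than leaving it in place.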