- 26 Jul, 2011 5 commits
-
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
- 25 Jul, 2011 24 commits
-
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
This optimization has been moved to sandbox/cuda.
-
Committed by Razvan Pascanu
I've added another version of the optimization for GPU scans.
-
Committed by Razvan Pascanu
This import used to be at the top of the file, but that creates a circular import.
-
Committed by Razvan Pascanu
Preparing for the Cython code: I used to grab the wrong inputs (it should be the node inputs, not self.inputs).
-
Committed by Razvan Pascanu
reconstruct_graph always replaces CudaNdarrays with TensorTypes, so for a GPU scan op we cannot use that logic to generate a hash (I compute it locally when I generate the GPU scan op).
-
Committed by Razvan Pascanu
The way it works: __init__ gets a lambda function that constructs a TensorType. By default it constructs TensorTypes, but the move to the GPU will replace it with a CudaNdarray constructor.
-
Committed by Razvan Pascanu
Notes on this change: 1) I needed to replicate the inplace optimization to make a version that can handle ops moved to the GPU. I've also added a replica of the reconstruct_graph function from scan_utils that, again, does not replace CudaNdarray with TensorTypes ..
-
Committed by Olivier Delalleau
-
Committed by Frederic Bastien
-
Committed by Olivier Delalleau
-
Committed by Razvan Pascanu
Two changes. The first makes profiling work with scan (scan was looking for a subclass of the ProfileStat object). The second makes the printing of timings more fixed-length, which I personally believe is much better.
-
Committed by Razvan Pascanu
While we agreed that there might be a more principled way of solving this, this solution was fast to add and is pretty efficient for now.
-
Committed by Razvan Pascanu
It is the shape of the data that should match.
-
Committed by Razvan Pascanu
For scan to run correctly inplace, no two of the initial states may share the same memory buffer.
-
Committed by Razvan Pascanu
-
Committed by Razvan Pascanu
The old implementation used to result in a stochastic order error in debug mode. After many attempts to solve it, I decided that it would be better and faster just to rewrite it. This new implementation does not suffer from that problem (i.e. all tests pass in debug mode).
-
Committed by Razvan Pascanu
As Pascal suggested, we should make it clear that scan doesn't have a normal perform anymore, and that one needs to use make_thunk.
-
Committed by Razvan Pascanu
After talking to Pascal, we decided that having this kind of function can be quite useful, so that not every optimization does this splitting over and over again (it is really easy to introduce bugs by messing up either the order or the count).
-
Committed by Razvan Pascanu
I think this commit got reverted (I'm not sure when or why). Without it, graphs produced with pydotprint might be extremely misleading.
-
- 24 Jul, 2011 6 commits
-
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
- 23 Jul, 2011 5 commits
-
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
-
Committed by Frederic Bastien
Small refactoring at the same time.
-
Committed by Frederic Bastien
-