> Executing this example I've got the following message:
>
>     X10RT: async 37 is not a CUDA kernel.
>
> If I'm not wrong, this message comes from the kernel function, since the
> message disappears when I comment out the call to the kernel function.

I suspect the kernel activity is getting attached to the clock, and the clock
implementation is then trying to treat the CUDA place as an ordinary host
place by sending it asyncs without @CUDA. This is a bug; we need to make sure
that clocks can be used in the host application without activities ending up
on the GPU place.
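In the meantime, a minimal sketch of the idea (keep the clock strictly
host-side and launch the kernel with a plain, unclocked async). This assumes
X10 2.x syntax; obtaining the CUDA place via here.children() and the kernel
body are placeholders, not tested code:

    // Assumption: the first child place of the host is the GPU.
    val gpu = here.children()(0);

    clocked finish {
        clocked async {
            // Host-side phased work: this activity is registered on the
            // implicit clock and synchronizes via advanceAll().
            Clock.advanceAll();
        }
        // Plain (unclocked) async: the CUDA place is never registered
        // on the clock, so no non-@CUDA asyncs get sent to it.
        finish async at (gpu) @CUDA {
            // kernel body (blocks/threads loops) goes here
        }
    }

The key point is simply that nothing spawned at the GPU place is clocked.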
> So, could you guys guide me about this?
>
> 1. Am I correct to think that I cannot employ Team when 2 or more GPUs
> belong to the same place?

You can do this; you just need to make a team with more than one participant
per place. Each participant then gets its own integer id to use when calling
into the collective operations.

> 2. What is the relationship between a finish in the host code and a
> finish in the kernel code? Or maybe this question should be about
> "clock"s instead of "finish"es?

They ought to just work, but we haven't tested them for a while. Certainly
you cannot do:

    clocked async (gpu) { ... }

However, as long as you do not do this, the GPU should never end up being
'part' of the clock, and it should work fine.

> 3. Would you recommend an explicit clock in order to avoid conflict with
> the clock in the kernel function?

That would be a reasonable workaround; you could also use teams instead of
clocks.

------------------------------------------------------------------------------
_______________________________________________
X10-users mailing list
X10-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/x10-users