Hi there,
I was just wondering whether there is any way to request multiple streams from within the implementation of a single operator. Currently the provided interface is:
virtual void Forward(const OpContext &ctx,
                     const std::vector<TBlob> &in_data,
                     const std::vector<OpReqType> &req,
                     const std::vector<TBlob> &out_data,
                     const std::vector<TBlob> &aux_args)
{
From this we can request one cudaStream using:
Stream<gpu> *cuda_stream = ctx.get_stream<gpu>();
But given that part of my operator can run in parallel, is there any way for me to obtain another cudaStream from the context ctx?
Thanks a lot.