Requesting multiple cudaStreams for a single operator

Hi there,

I was wondering whether there is a way to request multiple streams from within the implementation of a single operator. Currently, the provided interface is:

virtual void Forward(const OpContext &ctx,
                     const std::vector<TBlob> &in_data,
                     const std::vector<OpReqType> &req,
                     const std::vector<TBlob> &out_data,
                     const std::vector<TBlob> &aux_args);

From this context, we can request one cudaStream using:

Stream<gpu> *cuda_stream = ctx.get_stream<gpu>();

However, part of my operator can run in parallel. Is there a way for me to obtain a second cudaStream from the context ctx?

Thanks a lot.