I am trying to serve two models through a multi-model server, as in the link below:
At inference time, I want to send an image to the served system with a `curl` command and have it processed by the first model, then use the output of the first model as the input to the second model. However, I do not want to run these as two separate requests with two different `curl` commands. Because latency matters, I want to send the image once with `curl`, have it processed by both models consecutively, and get back only the output of the second model. I could not find any information about how to do this.
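In other words, what I am looking for is a single entry point that chains the two models server-side. A minimal sketch of that idea, using placeholder functions in place of the real served models (all names here are hypothetical, not from any serving framework):

```python
# Sketch of the desired single-request pipeline: one handler chains
# two models so the client makes one call and receives only the
# second model's output. The model functions below are placeholders.

def first_model(image_bytes):
    # Placeholder: pretend the first model extracts features
    # from the raw image bytes.
    return {"feature_count": len(image_bytes)}

def second_model(features):
    # Placeholder: pretend the second model classifies the
    # features produced by the first model.
    return {"label": "even" if features["feature_count"] % 2 == 0 else "odd"}

def handle_request(image_bytes):
    # Single entry point exposed to the client: run both models
    # consecutively and return only the final result.
    intermediate = first_model(image_bytes)
    return second_model(intermediate)


if __name__ == "__main__":
    print(handle_request(b"fake image data"))
```

Some serving frameworks seem to support this pattern natively (for example, a custom handler in TorchServe or an ensemble model in Triton Inference Server), but I am not sure which approach applies to my setup.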
Does anyone have any ideas? Thank you.