To create AI/ML-based solutions that can be trusted in production, the issues that hamper the use of AI models in practical solutions need to be addressed. Despite significant interest in AI/ML, the research community has focused primarily on the training of AI models, including their performance, trustworthiness, explainability and scalability. Training, however, is only one half of the work required to create an AI-based solution. The other half, using the trained model for inference during operations, is mistakenly considered a relatively mundane task. As a result, challenges that arise at model inference time have received comparatively scant attention. Inference is when an AI model is put into practice, and it raises many challenges that are worth the attention of the research community. Despite the availability of many pre-trained models on the Internet, anyone trying to build an AI/ML-based solution would be hard-pressed to find a model that is useful, trustworthy, reliable and suitable for the task. Even when a custom model is trained, the solution often falters because the use of the model fails to account for differences between the training and inference environments. In this paper, we identify those challenges and discuss how to design a generic inference server for trustworthy AI/ML-based solutions.