Predictive Business Process Monitoring tasks, such as next-activity and next-timestamp prediction, are becoming crucial as new technologies enable intelligent automation of business processes. Recent works address these tasks with deep learning models that encode limited attribute information about the past activities of a case, independently of the other cases in execution. However, the predictions for a case can also depend on contextual information, such as inter-case dependencies and domain-specific attributes, which previous works do not consider. We propose a novel method for encoding this contextual state information, i.e., the state of ongoing cases and multi-attribute domain-specific information, along with intra-case information, in an unsupervised manner. We train two widely used deep learning models, LSTM and Transformer, on the proposed representation and compare their performance with state-of-the-art models, showing improved results. We also use self-attention to investigate the influence of past activities and other ongoing cases on each prediction, which allows the framework to provide interpretable predictions to business users making decisions.