A task-oriented dialog (TOD) agent often grounds its responses in an external knowledge base (KB), which can be dynamic and undergo frequent updates. Training a TOD agent therefore requires saving the KB snapshot contemporaneous with each individual training dialog. In practice, however, often only the latest KB snapshot is available at training time. As a result, inconsistencies can arise in the training data, with a dialog and the KB presenting conflicting facts that may confuse the TOD learner. In this work, we propose the novel problem of learning a TOD system from training data that contains dialog-KB inconsistencies. We introduce two datasets for the task, created by systematically modifying two publicly available dialog datasets. We show that existing end-to-end TOD architectures suffer a loss in performance due to these inconsistencies. In response, we propose the Dialog-KB Arbitration Framework (DKAF), which reduces the inconsistencies: based on the dialog, DKAF introduces new rows into the KB and removes contradictory ones. The resulting KB is then used to train downstream TOD agents. We show that TOD agents trained with DKAF recover well from the performance loss caused by inconsistencies.