Interactive fiction games have emerged as an important vehicle for improving the generalization and reasoning capabilities of language-based reinforcement learning (RL) agents. Existing environments for interactive fiction games are domain-specific and do not require RL agents to perform complex reasoning, i.e., sequences of inter-dependent decisions to complete the task at hand. In this work, we introduce ComplexWorld, a benchmark interactive environment for text-based games that require complex composition of previously learned skills to reach a goal. These games require the agent to understand the cause-effect relationships between the intermediary decisions taken toward an overarching goal. We create and test an environment with 100 complex reasoning games, generated using an automated framework that combines large language models (GPT-3) with an interactive fiction game engine (based on Inform7), enabling users to generate more games with minimal or no human supervision. Experimental results from both human participants and baseline text-based RL agents reveal that current state-of-the-art text-based RL agents cannot apply previously learned skills to new situations involving complex reasoning at the level humans can. These results reinforce ComplexWorld's potential to serve as a sandbox environment for further research on complex reasoning.