DSMM 2014
Conference paper

Data science challenges in real estate asset and capital markets


The real estate financial markets are complex supply chains. Understanding their behavior is limited by a lack of data that would capture the richly interconnected networks of financial institutions and complex financial products, e.g., asset-backed securities. This lack of transparency is compounded by limited knowledge of the contractual rules that control the flow of funds from mortgage pools to securities, as well as the financial events that regulate these flows. In this project, we will use the IBM Midas framework and tools to extract entities, relationships, events, contractual rules and risk profiles for financial institutions. Our source of information will be the prospectus documents of mortgage-backed securities (MBS), which are public and filed with the Securities and Exchange Commission. We will describe the data management needs of the Haas Real Estate and Financial Markets (REFM) Lab and present some recent REFM analytics that highlight the importance of these markets and their impact on systemic risk. We use excerpts from an MBS prospectus to illustrate the information extraction challenges and outline our approach to addressing them.