Drug Discovery Today

Application of an automated natural language processing (NLP) workflow to enable federated search of external biomedical content in drug discovery and development


External content sources such as MEDLINE®, National Institutes of Health (NIH) grants and conference websites provide access to the latest biomedical information, which can inform pipeline decisions at pharmaceutical and biotechnology companies. For industry, however, the value of these sites is limited by their reliance on the public internet, their limited support for synonyms in search, the rarity of batch searching capability and the disconnected nature of the sites. Fortunately, many sites now offer their content for download, and we have developed an automated internal workflow that uses text mining and tailored ontologies for programmatic search and knowledge extraction. We believe that such an efficient and secure approach provides a competitive advantage to companies that need access to the latest information for a range of use cases, and that it complements manually curated commercial sources.
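
To make the idea of ontology-driven, batch search over downloaded content concrete, the following minimal Python sketch expands each query concept with its synonyms and searches records from several sources in one pass. It is an illustration under stated assumptions, not the workflow described in the article: the `ONTOLOGY` mapping, the `RECORDS` list, and the function names `build_pattern` and `federated_search` are all hypothetical, and a production system would use a full text-mining engine rather than regular expressions.

```python
import re
from collections import defaultdict

# Hypothetical mini-ontology: preferred term -> synonyms (illustrative only).
ONTOLOGY = {
    "EGFR": ["epidermal growth factor receptor", "ERBB1", "HER1"],
    "TNF": ["tumor necrosis factor", "TNF-alpha"],
}

# Hypothetical records, standing in for content downloaded from separate sources.
RECORDS = [
    {"source": "MEDLINE", "id": "m1",
     "text": "Inhibition of the epidermal growth factor receptor in lung cancer."},
    {"source": "NIH grants", "id": "g1",
     "text": "Targeting HER1 signaling in glioma."},
    {"source": "Conference", "id": "c1",
     "text": "TNF-alpha blockade in arthritis."},
    {"source": "MEDLINE", "id": "m2",
     "text": "A study of unrelated kinase pathways."},
]


def build_pattern(preferred, synonyms):
    """Compile one case-insensitive pattern matching a concept or any synonym."""
    # Longest terms first so multi-word synonyms win over shorter overlaps.
    terms = sorted([preferred, *synonyms], key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in terms)
    # Lookarounds prevent matches inside longer words.
    return re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)


def federated_search(concepts, ontology, records):
    """Batch-search all records for every concept; return hits per concept."""
    hits = defaultdict(list)
    for concept in concepts:
        pattern = build_pattern(concept, ontology.get(concept, []))
        for rec in records:
            if pattern.search(rec["text"]):
                hits[concept].append((rec["source"], rec["id"]))
    return dict(hits)


results = federated_search(["EGFR", "TNF"], ONTOLOGY, RECORDS)
for concept, found in results.items():
    print(concept, found)
```

In this sketch a single query for "EGFR" retrieves both the MEDLINE record that uses the full name and the grant record that uses the synonym "HER1", which is the kind of synonym-aware, cross-source batch search that the individual public sites rarely support.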