In today’s competitive environment, any enterprise relies crucially on timely and accurate information: about markets, trends, competitors, products, consumer opinion, and the like.
Today’s Business Intelligence practices increasingly augment the use of company-internal data with the wealth of information generated by human and business activity on the web and social platforms.
DOPA seeks to enable European economic actors and researchers to participate in this development, by achieving breakthroughs in:
1. Large scale, high-quality information sourcing (automation of dataset detection and curation workflow)
2. Automated information processing at scale by way of Data Supply Chains on a distributed platform
3. Automated entity linkage to help bring together related data from disparate sources
4. Visualization tools to help make sense of this wealth of data
Data flows are described by Data Supply Chains. A Data Supply Chain is the definition and scalable implementation of a domain-specific data flow; it may access a variety of information services and link data of different types from different sources.
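As a rough illustration of the idea, a Data Supply Chain can be pictured as an ordered sequence of processing stages applied to records drawn from several sources. The sketch below is only a toy model in plain Python: the class `DataSupplyChain`, the stage functions `normalize_names` and `link_by_entity`, and the sample records are all hypothetical and are not part of any DOPA interface. A real implementation would run on a distributed platform rather than in a single process.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Stage = Callable[[List[Record]], List[Record]]


@dataclass
class DataSupplyChain:
    """Hypothetical model: an ordered list of stages applied to a batch of records."""
    stages: List[Stage] = field(default_factory=list)

    def then(self, stage: Stage) -> "DataSupplyChain":
        # Return a new chain with one more stage appended.
        return DataSupplyChain(self.stages + [stage])

    def run(self, records: List[Record]) -> List[Record]:
        # Feed the output of each stage into the next one.
        for stage in self.stages:
            records = stage(records)
        return records


def normalize_names(records: List[Record]) -> List[Record]:
    # Assumed convention: entities are matched on a trimmed, lower-cased name.
    return [{**r, "entity": str(r["entity"]).strip().lower()} for r in records]


def link_by_entity(records: List[Record]) -> List[Record]:
    # Merge all records that refer to the same entity into a single record.
    merged: Dict[str, Record] = {}
    for r in records:
        key = str(r["entity"])
        target = merged.setdefault(key, {"entity": key})
        target.update({k: v for k, v in r.items() if k != "entity"})
    return list(merged.values())


# Made-up sample data: one statistical record and one web-derived record.
statistical = [{"entity": "ACME Corp ", "revenue_eur": 1200000}]
web_derived = [{"entity": "acme corp", "mentions": 42}]

chain = DataSupplyChain().then(normalize_names).then(link_by_entity)
linked = chain.run(statistical + web_derived)
print(linked)
```

Here the two records end up merged into one because they share the same normalized entity name. That is a deliberately naive stand-in for entity linkage.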
In economic and financial analytics, statistical data can often be usefully related to information gleaned from the web. The latter includes poly-structured data: audio, video, images, free-form text, tables, and XML files. These contain valuable information, which will become more and more machine-readable through innovative information extraction techniques (e.g. unsupervised learning), although such techniques are still under heavy development.
Thus there is a need for a framework that supports these extraction techniques at scale, while also providing the structure and standards to achieve orderly data exchange among diverse information services.
DOPA aims to create this framework and to adapt existing pools of data to interoperate with it, producing a source and exploitation platform for economic and financial information in Europe. The end result will provide direct value to economic actors and open new possibilities for researchers in the data domain to conduct large-scale experiments.
Project Coordinator: Prof. Dr. Volker Markl
Project duration: 01.05.2012 to 30.04.2014
Total Cost: 2,602,200.00 €