Reference data—or information about counterparties, financial products, issuers, exchange rates, corporate actions and prices—is a critical input for all key business systems of capital markets firms.
Every financial transaction involves reference data, and it is used across front, middle and back office systems. Given its key role, capital markets firms rely heavily on sourcing reference data from data providers. However, there are several challenges involved in sourcing the right reference data.

First among them is siloed data. In most firms, business units maintain their own separate reference-data databases sourced from external data providers. This results in data duplication and excessive spending on expensive data providers. It also leads to issues such as poor data quality, multiple conflicting data sources and a lack of data governance policies.

The resulting inconsistent, inaccurate and incomplete reference data causes major disruptions. Capital markets firms use straight-through processing (STP) to reduce transaction time and to enable seamless initiation and settlement of securities without manual intervention. Poor-quality reference data causes STP failures that lead to monetary losses and increased operational costs. The impact is not limited to STP: poor-quality reference data affects all critical business functions, as teams have to spend time investigating and correcting errors.
Another challenge is regulatory compliance. Stricter financial regulation requires higher levels of reporting from capital markets firms.
They are required to report data on trades and counterparties to reduce systemic risk. Consistent and accurate reference data across the organization is necessary for firms to better assess the risk profiles of the legal entities they trade with and the financial instruments they trade.
To overcome the challenges of traditional data sources, enterprises use a host of technologies to assimilate reference data on the cloud. Gartner has said that enterprises with a cohesive strategy incorporating data hubs, lakes and warehouses will support 30% more use cases than their competitors. Deploying a single source of reference data across the organization will eliminate most of the issues related to reference data, leading to a significant reduction in operating costs.
Moving reference data to cloud
The key to building a successful data paradigm on cloud is taking a holistic approach to map business drivers to a unified data service model that assures end-to-end data lineage, data quality and compliance.
This will help enterprises move away from the overheads of on-demand data management to a self-service mode. The data lake should be accessible across the organization and delivered through a secure, robust and scalable platform, such as AWS Cloud. The key steps in the process include:

Creating a centralized, enterprise-wide data lake. For a growing pool of data consumed by several downstream applications, a cloud deployment makes a compelling case, as it provides near-infinite compute and storage capacity. Amazon DynamoDB and Amazon Redshift can be used to build reference data lakes and manage the associated metadata.

Centralizing the sourcing of all external and internal data. All data transformations and custom data ingestion rules can then be applied at one location, ensuring reference data integrity and data governance across the enterprise. AWS Glue and Amazon EMR can be used for ingestion and transformation of data from a variety of sources, while Amazon Athena and Amazon QuickSight can be used for analytics and reporting.

Developing APIs through which all downstream systems access the reference data. AWS Lambda and Amazon API Gateway provide this API layer.

Finally, controlling and monitoring access to the reference data. Services such as AWS KMS and Amazon CloudWatch serve as efficient gatekeepers.
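The centralized-sourcing step above can be sketched in Python. This is an illustrative example only: the vendor field names (ISIN, IssuerName, Ccy) and the canonical schema are assumptions for the sketch, not a real feed specification, and in practice such rules would run inside an AWS Glue or Amazon EMR job rather than as standalone functions.

```python
# Illustrative data-quality rules applied once, at the central ingestion point,
# so every downstream system receives the same cleansed reference data.
# Vendor field names and the canonical schema are assumptions for this sketch.

CANONICAL_FIELDS = ("isin", "name", "currency")

def normalize_vendor_record(raw):
    """Map one raw vendor record onto the canonical reference-data schema,
    applying simple rules (trim whitespace, upper-case codes) and
    reporting any canonical fields that end up empty."""
    record = {
        "isin": raw.get("ISIN", "").strip().upper(),
        "name": raw.get("IssuerName", "").strip(),
        "currency": raw.get("Ccy", "").strip().upper(),
    }
    errors = [field for field in CANONICAL_FIELDS if not record[field]]
    return record, errors

def ingest(feed):
    """Split a vendor feed into clean records and rejects for remediation."""
    clean, rejects = [], []
    for raw in feed:
        record, errors = normalize_vendor_record(raw)
        if errors:
            rejects.append((raw, errors))
        else:
            clean.append(record)
    return clean, rejects
```

Because every transformation runs at this single choke point, a rejected record is quarantined once, with its reasons, instead of failing separately in each downstream system.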
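The API layer can likewise be sketched as an AWS Lambda handler behind an Amazon API Gateway proxy integration. The table name, the `isin` path parameter and the record layout are assumptions for the sketch; the DynamoDB table is injectable so the handler can be exercised without AWS credentials.

```python
import json

def build_response(status, payload):
    """Shape a dict into an API Gateway Lambda-proxy response."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

def lambda_handler(event, context, table=None):
    """Look up one instrument by ISIN in the central reference-data store.

    `table` is injectable for testing; by default (hypothetical table name,
    an assumption for this sketch) it is the DynamoDB reference-data table.
    """
    if table is None:
        import boto3  # imported lazily so the module loads without AWS access
        table = boto3.resource("dynamodb").Table("reference-data")

    isin = (event.get("pathParameters") or {}).get("isin")
    if not isin:
        return build_response(400, {"error": "missing path parameter: isin"})

    item = table.get_item(Key={"isin": isin}).get("Item")
    if item is None:
        return build_response(404, {"error": "no instrument for ISIN " + isin})
    return build_response(200, item)
```

Routing every downstream consumer through this one API, rather than letting each system query the store directly, is what allows access to be controlled and monitored centrally with AWS KMS and Amazon CloudWatch.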
Over the long term, a cloud-based centralized reference data source offers even more opportunities, such as: