For any enterprise, data management is a daunting task. With huge volumes of data added daily to already congested data warehouses, enterprises struggle to manage data that differs in type, format, structure, technology, and arrival velocity. Enterprises have typically relied on one or more commercial off-the-shelf data management products to meet these needs.
However, the sheer growth in the volume of stored data, and the need for agility in deriving insights from it, require data management software itself to keep pace. One way to do so is for the software to become more autonomous by executing some actions on its own.
This is where mathematics can join forces with data management in building data management software. By leveraging machine learning algorithms, data management software can deliver next-generation automation.
One possible scenario is data privacy. With stringent data protection regulations coming into force across the world, enterprises are looking to inventory their sensitive data attributes. To accomplish this, they need automated data discovery. Data management software can start by looking for recurring patterns in data. If the Data Officer in an enterprise flags data with a given pattern as ‘sensitive’ in a given context, the software can ‘learn’ that this pattern potentially contains sensitive information and automatically flag data with similar patterns elsewhere as sensitive too.
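The flag-and-learn idea above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the "pattern" is reduced to a simple character-class shape (digits become 9, letters become A), and the names `shape`, `SensitivePatternLearner`, and the 0.8 threshold are hypothetical choices made for this example.

```python
from collections import Counter


def shape(value):
    """Reduce a value to its character-class pattern (digits -> 9, letters -> A)."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)  # keep separators such as '-' or '@' as they are
    return "".join(out)


class SensitivePatternLearner:
    """Hypothetical helper: remembers shapes of data flagged as sensitive."""

    def __init__(self, threshold=0.8):
        # Assumed cut-off: share of values that must match a learned shape.
        self.threshold = threshold
        self.sensitive_shapes = set()

    def learn(self, flagged_values):
        """Record the most common shape of a column the Data Officer flagged."""
        counts = Counter(shape(v) for v in flagged_values)
        if counts:
            dominant_shape, _ = counts.most_common(1)[0]
            self.sensitive_shapes.add(dominant_shape)

    def is_sensitive(self, values):
        """Flag a column if enough of its values match a learned shape."""
        values = list(values)
        if not values:
            return False
        hits = sum(shape(v) in self.sensitive_shapes for v in values)
        return hits / len(values) >= self.threshold


learner = SensitivePatternLearner()
learner.learn(["123-45-6789", "987-65-4321"])  # column flagged by the Data Officer
print(learner.is_sensitive(["111-22-3333", "444-55-6666"]))  # similar pattern elsewhere
print(learner.is_sensitive(["alice", "bob"]))                # unrelated column
```

A real discovery tool would likely combine such shapes with column names, value distributions, and context, but the loop is the same: a human flags once, and the software generalizes from that flag.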
Another case where machine learning can play a role is automated master data creation. Typically, master records are built from data derived from multiple sources. Consider a scenario in which a telecom company receives subscriber detail records from two sources, and the business process requires that the customer master be updated with these incoming records. A question arises: for each business data element in an incoming record, which of the two sources should be trusted as having the more accurate information? Here, one can imagine master data creation software designed to observe and record how often each data element turns out to be correct as records arrive from each source. Based on this ‘correctness trend’, the software automates the choice of source when making updates.
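One way to picture the ‘correctness trend’ is as a running score per source and data element. The sketch below uses an exponentially weighted average so that recent observations count more than old ones. Everything here is an assumption made for illustration: the class name `SourceTrustTracker`, the source names, the smoothing factor, and the premise that some later verification step tells the software whether a value was correct.

```python
class SourceTrustTracker:
    """Hypothetical tracker of how often each source is right, per data element."""

    def __init__(self, alpha=0.2, prior=0.5):
        self.alpha = alpha  # assumed weight given to the newest observation
        self.prior = prior  # assumed starting score when there is no history
        self.scores = {}    # (source, field) -> running correctness score

    def record(self, source, field, was_correct):
        """Update the trend after a value from `source` is verified."""
        key = (source, field)
        old = self.scores.get(key, self.prior)
        observed = 1.0 if was_correct else 0.0
        self.scores[key] = (1 - self.alpha) * old + self.alpha * observed

    def preferred_source(self, field, sources):
        """Pick the source with the best trend for this data element."""
        return max(sources, key=lambda s: self.scores.get((s, field), self.prior))


tracker = SourceTrustTracker()
for _ in range(3):
    tracker.record("crm", "address", was_correct=True)
    tracker.record("billing", "address", was_correct=False)

print(tracker.preferred_source("address", ["crm", "billing"]))
```

Tracking the score per data element, rather than per source, matters: one source may be reliable for addresses while the other is more reliable for phone numbers.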
Implementing machine learning in data management software has a collateral benefit: improved user experience. For example, the software can track the functions a given user executes most frequently and place direct links to them on that user’s dashboard. In line with Business 4.0, this amounts to mass customization of the software. The software can also remember the ‘last used state’ prior to sign-off. When the same user logs in the next time, the software automatically brings up the screen from the prior session, saving navigation time.
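Both ideas, frequently used shortcuts and the ‘last used state’, come down to simple bookkeeping per user. The following sketch shows one possible shape for it; the class `UsageTracker`, the function names, and the `"home"` default screen are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict


class UsageTracker:
    """Hypothetical per-user record of executed functions and last screen."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # user -> how often each function ran
        self.last_screen = {}               # user -> most recently used function

    def record(self, user, function_name):
        """Call whenever a user executes a function in the software."""
        self.counts[user][function_name] += 1
        self.last_screen[user] = function_name

    def shortcuts(self, user, n=3):
        """Functions to pin on the user's dashboard, most used first."""
        return [name for name, _ in self.counts[user].most_common(n)]

    def resume_screen(self, user, default="home"):
        """Screen to open at next login; falls back to a default for new users."""
        return self.last_screen.get(user, default)


usage = UsageTracker()
for fn in ["profile_data", "profile_data", "run_masking",
           "profile_data", "run_masking", "export_report"]:
    usage.record("asha", fn)

print(usage.shortcuts("asha", n=2))
print(usage.resume_screen("asha"))
```

This is plain counting rather than machine learning in the strict sense; a learned model would come in if the software tried to predict which function a user is likely to need next.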
Because this is a nascent field, the possibilities for using machine learning to improve data management seem endless. However, it is easy to get carried away by hype and by a few successes of limited scope. The real scalability and performance of machine learning algorithms in the context of data management need more research. Finally, when software learns and takes automated actions, appropriate triggers and manual overrides must be available. Otherwise, when the context in which the software learned changes completely, what it has learned may no longer hold and its automated actions can be flawed. Nevertheless, machine learning-driven automation is here to stay, and data management cannot remain untouched by it.
In the long run, will machine learning have a strong impact on the way data management software functions? Please do voice your opinion below.