    What Machine Learning Means to Database Professionals

    By: Ben Richardson

    Machine learning databases are now coming of age. This presents huge opportunities for database professionals who are able to evolve to take advantage of this change.  

    Database professionals, such as database administrators (DBAs) and database developers, currently hold some of the most important positions in any IT organization. A database professional is responsible for creating, managing and providing controlled access to a database. Having the right person as a DBA can help companies save time and shorten application development. However, with increasing access to enormous amounts of data, the responsibilities of a database professional are evolving rapidly.

    Several technologies have been developed that can be used not only to manage and explore data but also to help make well-informed decisions on the basis of that data. Machine learning is one such technology, and it has seen a great surge in the last decade. This article provides a brief overview of how machine learning can affect database professions and of the advantages of having machine learning as a skill set.


    What Is Machine Learning?

    Machine learning is a process of understanding and extracting useful patterns from data with the help of various statistical algorithms. Machine learning is further divided into supervised and unsupervised learning techniques. Machine learning is currently being used to solve many complex problems such as classifying ham and spam emails, house price prediction, poetry generation, image classification and so on.
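
    As a rough illustration of the supervised case, the sketch below trains a tiny word-count classifier (a simplified Naive Bayes with Laplace smoothing) to separate ham from spam. The training messages, the function names `train` and `classify`, and the labels are all made up for this example; a real system would use far more data and an established library.

```python
import math
from collections import Counter

# Hypothetical labelled training data, invented for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the backup schedule", "ham"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability for the text."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_messages)
        denominator = sum(counts.values()) + len(vocab)
        for word in text.split():
            # Laplace smoothing keeps unseen words from zeroing the score.
            score += math.log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("review the monday agenda", word_counts, label_counts))
```

    The labels supplied with each training message are what make this supervised learning; an unsupervised technique would have to group the messages without them.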


    Will Machine Learning Replace Database Professions?

    One of the most common misconceptions about machine learning is that it is going to replace humans in many jobs. While this might be true for some repetitive tasks, AI and machine learning are going to complement the human brain, not replace it. Machine learning databases will not replace database professionals; rather, they will help them hugely.

    Machine learning will allow database professionals to focus much more on planning and strategic tasks, as it will automate routine tasks such as installation, configuration, and regular database updates. Therefore, instead of fearing the impact of machine learning on their jobs, database professionals should embrace it as a way to complete less challenging tasks much more quickly and efficiently.


    Handling Big Data Is a Challenge

    Due to the rise of the World Wide Web over the past two decades, data is available in all shapes and sizes. In fact, the term big data is often used for data sets that are huge in volume, arrive at high velocity and contain a variety of content.

    Handling huge amounts of such unstructured data has become a challenge for DBAs. Algorithms run on machine learning databases have been found to work well with unstructured data too. Machine learning techniques can break a huge amount of data down into meaningful information, which highlights the need for database professionals to acquire machine learning skills.


    Machine Learning Databases Are Here

    Companies such as Microsoft and Oracle have already started incorporating machine learning capabilities into their databases. For example, Microsoft Azure SQL Database has a feature that recommends performance improvements, which can be applied automatically. Similarly, the SQL Server Query Store records query and plan history that helps identify queries causing performance bottlenecks. The Oracle 18c database contains self-healing capabilities and can patch and upgrade itself when a database issue occurs. A good knowledge of machine learning helps database developers understand the rationale behind the recommendations made by machine learning database tools.


    The Advent of Fully Autonomous Databases

    Current machine learning databases have limited capabilities. The focus of current research is to develop fully automated databases. Wouldn’t it be nice to have a database that can anticipate the problems that are going to occur and is proactive enough to take preventive measures in advance? Or wouldn't it make the life of a database professional much easier if the database backs itself up automatically whenever a crucial transaction occurs? There are many scenarios where machine learning databases are extremely useful.

    For example, existing databases perform automatic backups at a specific time, but not all database transactions are worth backing up. In this sort of scenario, machine learning databases could become smart enough to know when to back up and when not to.

    Furthermore, many database problems can be anticipated beforehand. For instance, when multiple users are accessing different database resources, the likelihood of a deadlock increases manyfold. If this were happening, a machine learning database could switch to providing controlled access to resources and so avoid a deadlock.
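
    The deadlock scenario can be made concrete with a small sketch. The code below is not a machine learning technique and is not taken from any product named in this article; it is the classical wait-for graph check, shown only to illustrate the condition that a predictive system would aim to anticipate. The function name `has_deadlock` and the transaction labels are hypothetical.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for maps each transaction to the transactions it is waiting on.
    A cycle means every transaction in it is blocked by another one.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, in progress, finished
    state = {}

    def visit(node):
        state[node] = GREY
        for blocker in wait_for.get(node, ()):
            blocker_state = state.get(blocker, WHITE)
            if blocker_state == GREY:
                # We reached a transaction already on the current path.
                return True
            if blocker_state == WHITE and visit(blocker):
                return True
        state[node] = BLACK
        return False

    return any(
        state.get(node, WHITE) == WHITE and visit(node)
        for node in list(wait_for)
    )


# T1 waits on T2 and T2 waits on T1: a deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))
# T1 waits on T2, which is free to finish: no deadlock.
print(has_deadlock({"T1": ["T2"], "T2": []}))
```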

    There are several academic research groups that have tried to develop fully autonomous databases.

    The Carnegie Mellon Database Research Group has developed the OtterTune project, which uses machine learning techniques and workload data from a huge number of existing databases to create models capable of automatically tuning new workloads. OtterTune also automatically recommends the optimum settings for improved throughput and reduced latency for new database applications.
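
    The general idea of reusing past workload data can be sketched in a few lines. This is not OtterTune's actual algorithm or API; it is a deliberately simplified nearest-neighbour lookup, and the metric names, configuration settings, values and the `recommend` function are all hypothetical.

```python
import math

# Hypothetical history: workload metrics seen on past databases and the
# configuration that performed best for each. All values are made up.
HISTORY = [
    {
        "metrics": {"reads_per_sec": 900.0, "writes_per_sec": 100.0},
        "best_config": {"buffer_pool_mb": 4096, "max_connections": 200},
    },
    {
        "metrics": {"reads_per_sec": 150.0, "writes_per_sec": 850.0},
        "best_config": {"buffer_pool_mb": 1024, "max_connections": 50},
    },
]


def distance(a, b):
    """Euclidean distance between two metric dictionaries with the same keys."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def recommend(new_metrics, history=HISTORY):
    """Return the best-known config of the most similar past workload."""
    closest = min(
        history,
        key=lambda past: distance(new_metrics, past["metrics"]),
    )
    return closest["best_config"]


# A read-heavy new workload is matched to the read-heavy historical one.
print(recommend({"reads_per_sec": 800.0, "writes_per_sec": 120.0}))
```

    A real tuning service would work with many more metrics, normalise them, and build models rather than copy a single neighbour's settings.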

    MIT has also developed an open-source database management framework called DBSeer, which predicts performance for a given set of database resources and identifies performance bottlenecks.


    Learning Curve

    Machine learning is often defined as the intersection of computer science and statistics. Anyone with computer science knowledge can relatively quickly build their machine learning skills to an intermediate level if they develop a reasonable understanding of statistics.

    Many GUI tools and cloud platforms, such as Google AI, IBM Watson, Amazon SageMaker and Azure ML, have simplified the process of implementing machine learning techniques by providing GUI-based drag-and-drop interfaces for machine learning databases. Users only have to know how to use the tool, as the majority of the work (adding datasets, selecting pre-processing techniques, training the model, and finally evaluating the model) can be done with a few mouse clicks.

    If a database professional really wants to build a career in advanced machine learning, however, they will need to build a thorough understanding of statistics. The computer science background of a database professional will be more than good enough to grasp the CS-related concepts of machine learning quickly.

    On the other hand, if a database professional is only interested in using machine learning to automate repetitive tasks, a knowledge of GUI-based machine learning tools will be more than enough.


    Multiple Career Paths

    The success of machine learning and artificial intelligence has prompted organizations to develop dedicated data science teams containing skilled machine learning experts.

    Currently, machine learning experts and database professionals have different career paths. However, more and more organizations will expect machine learning or data science experts to have some level of database expertise, and vice versa.

    While this is in flux, database professionals with machine learning skills are preferred and have a better chance of being hired as a database professional, as a machine learning expert, or in a role that combines both sets of responsibilities.


    Final Verdict

    The advent of big data and related machine learning techniques is likely to bring substantial changes to the job responsibilities of database professionals. Over time, as machine learning databases increasingly manage themselves, the focus of database professionals will shift from the database to the data.

    Machine learning will help database professionals automate a lot of manual and laborious tasks, freeing them to invest time and effort in building machine learning skills and putting them to use.

    Learning the statistics required to develop from a database professional into a broader database and machine learning professional isn’t straightforward, but it will pay big dividends in terms of career growth and opportunities.

    December 19, 2019 6:59:00 AM PST

    Written by Ben Richardson

    Ben Richardson runs Acuity Training, a leading provider of SQL training in the UK. It offers a full range of SQL training, from introductory courses through to advanced administration and data warehouse training. Acuity has offices in London and Guildford, Surrey. He also blogs occasionally on Acuity’s blog: