Managing your data right in the long run

In the face of a fast-changing economic landscape and shifting business demands, IT is often tasked with delivering data quickly and precisely. As such, it is common for implementations to be handcoded so as to respond and provide results rapidly. However, is such a data management approach truly ideal in a fast-paced, technology-driven world, where both the speed of deployment and human resource turnover are increasing?

The case for handcoding in organisations

With digital transformation, leveraging data as part of everyday business operations and analytics is now a staple in organisations. Data management is the fundamental bedrock of any digital transformation process, as data needs to be taken from silos across the organisation, ingested, cleansed, analysed and then shared throughout the company.

Often, data is ingested from one or multiple data sources and imported into a data warehouse or data lake. Developers may choose to write customised code in languages such as SQL, Java or Spark to deliver the ingested data from source to destination, for several reasons:

  • It is fast at resolving immediate business needs when performed by a readily available in-house IT developer,
  • It is low cost, as no additional development tools need to be purchased,
  • It is easy to deploy, as the integration is specific to its target destination.

For these reasons, handcoding can seem like a panacea: an agile, cheap and effective way for the IT team to meet immediate business needs.
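To make this concrete, a handcoded ingestion job often amounts to a short, one-off script. The sketch below is a hypothetical illustration (table and function names are invented), using Python's built-in sqlite3 to stand in for a source system and a warehouse target:

```python
import sqlite3

def handcoded_ingest(conn: sqlite3.Connection) -> int:
    """One-off, handcoded job: copy customer rows from a source
    table into a reporting target, cleansing them on the way."""
    cur = conn.cursor()
    rows = cur.execute("SELECT id, name, email FROM src_customers").fetchall()
    # Cleanse: trim names, normalise emails to lowercase.
    cleaned = [(i, n.strip(), e.lower()) for i, n, e in rows]
    cur.executemany(
        "INSERT INTO tgt_customers (id, name, email) VALUES (?, ?, ?)",
        cleaned,
    )
    conn.commit()
    return len(cleaned)

# Simulate a source system and a warehouse target in memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customers (id INTEGER, name TEXT, email TEXT);
    CREATE TABLE tgt_customers (id INTEGER, name TEXT, email TEXT);
    INSERT INTO src_customers VALUES (1, ' Alice ', 'ALICE@EXAMPLE.COM');
    INSERT INTO src_customers VALUES (2, 'Bob', 'bob@example.com');
""")
loaded = handcoded_ingest(conn)
print(loaded)  # prints 2: the rows delivered to the target
```

A script like this is indeed quick for an in-house developer to write and deploy, but note that everything, from the column list to the cleansing rules, is specific to this one source-target pair.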

Is handcoding truly the cure?

According to a poll conducted by IDG in 2019, organisations with at least 1,000 employees drew on an average of 400 data sources to feed their BI and analytics efforts. Considering that each of these 400 data sources has its own data structure and format, handcoding is not a viable approach in organisations where IT resources are limited, as streamlining all these differences consumes significant manpower within the team.

In terms of resource management, developers skilled in languages such as Spark and Java may have their efforts better deployed on more critical digital projects. A CNBC report (2018) found that too many developers are tied up in projects designed to prop up legacy systems and bad software, at a cost of $300 billion a year, of which $85 billion is spent just dealing with bad code.

The ease of maintaining code is another consideration. If the project is small, one-off and does not require regular maintenance or updates, handcoding may be a suitable approach. However, if the integration spans multiple sources and targets, a data integration tool is better suited: when upgrades to the various database sources and targets are not managed well, “brittle” integrations result.
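The “brittle” failure mode can be sketched as follows (a hypothetical example with invented names): a handcoded job pins the source schema inside its own query, so an unmanaged upgrade to the source database breaks the integration at run time.

```python
import sqlite3

# A handcoded job typically pins the source schema in the query itself.
HARDCODED_QUERY = "SELECT id, cust_name FROM orders"

def brittle_extract(conn: sqlite3.Connection):
    return conn.execute(HARDCODED_QUERY).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, cust_name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'Alice')")
print(brittle_extract(conn))  # works against the original schema

# Simulate an upstream database upgrade that renames the column.
conn.execute("DROP TABLE orders")
conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")

broke = False
try:
    brittle_extract(conn)
except sqlite3.OperationalError:
    broke = True  # the hardcoded query no longer matches the schema
print("integration broke:", broke)
```

An integration tool mitigates this by managing source and target metadata centrally, so a schema change surfaces at design time rather than as a production failure.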

Another related consideration is how easily and quickly new developers can pick up existing code. With proper documentation tracked and maintained in integration software, integration code is easier to pick up and learn, whereas handcoded projects are subject to documentation gaps once the original developer leaves the organisation.

Lastly, the applicability of reusable data integration patterns is another dimension to evaluate. It may be more efficient for the developer to integrate similar databases or targets using an existing pattern already created in a data integration tool than to start from scratch with handcoding, saving both time and resources.
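The reuse argument can be illustrated with a minimal sketch (a hypothetical pattern, not any particular tool's API): once a generic extract-transform-load step is parameterised by source, target and transform, the same pattern serves many similar integrations instead of requiring a fresh script for each pair.

```python
import sqlite3
from typing import Callable

def run_pattern(conn: sqlite3.Connection, source: str, target: str,
                transform: Callable[[tuple], tuple]) -> int:
    """A reusable integration pattern: extract, transform, load."""
    rows = [transform(r) for r in conn.execute(f"SELECT * FROM {source}")]
    if not rows:
        return 0
    placeholders = ", ".join("?" * len(rows[0]))
    conn.executemany(f"INSERT INTO {target} VALUES ({placeholders})", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sg (id INTEGER, amount REAL);
    CREATE TABLE src_my (id INTEGER, amount REAL);
    CREATE TABLE tgt_sales (id INTEGER, amount REAL);
    INSERT INTO src_sg VALUES (1, 10.0);
    INSERT INTO src_my VALUES (2, 20.0);
""")
# One pattern applied to two similar sources, instead of two scripts.
total = sum(run_pattern(conn, s, "tgt_sales", lambda r: r)
            for s in ("src_sg", "src_my"))
print(total)  # prints 2: one row loaded from each source
```

The design choice here is the same one integration tools make at scale: the integration logic lives in one place, and each new source is configuration rather than code.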

Thinking long for better data

For an organisation to benefit fully from its data management strategy, immediate short-term gains need to be balanced against the longer-term benefits the organisation will derive from its implementation. McKinsey & Company (2020) reported that industry spending on data-related costs is expected to increase, on average, by nearly 50% over 2019–2021 versus 2016–2018. Although handcoding addresses integration needs immediately, it is only a short-term fix: with the exponential growth in data that most organisations face today, the number of integrations needed within an organisation will only increase.

With a centralised approach to data integration, companies save cost and time, as functionalities that would otherwise be tendered for individually, such as data quality and governance, are addressed by a single platform. This contrasts with handcoded data management projects, where growing business data requirements demand an ever larger IT budget to hire developers with different skillsets.

A data management platform also draws on learnings from other organisations: pattern recognition algorithms enable immediate application of data patterns recommended from previous, similar implementations. This contrasts with handcoding, which is highly dependent on the skills of the individual. In addition, security loopholes that an individual developer may miss are closed as part of a data integration platform's maintenance patches.

Lastly, from a governance perspective, each handcoded project represents a single-purpose “tool” which needs to be properly managed and governed. As the number of data sources or destinations increases, the number of tools to be managed multiplies, and each becomes an additional governance silo that must be integrated into the organisation's governance and security framework. In the long term, this exponential increase in governance silos becomes untenable to manage in terms of security, cost and resources.

McKinsey (2020) reported that by improving data governance, data architecture, data sourcing and data consumption, companies have an overall savings potential of 15–35% of data spend, with 5–15% achievable in the near term.

Bringing long term value with a centralised data management platform

As Warren Buffett puts it: “Someone is sitting in the shade today because someone planted a tree a long time ago.” Good IT leaders have to understand their organisation's long-term strategic goals, map out the corresponding digital strategy and metrics, and finally architect and deploy a central data management platform in the present, building internal capabilities that address stakeholders' varying needs over time.