Data First: A New Paradigm for Companies

Arpan Shah, Director, Virtu Propdeal

In today's startups, most people are aware of the importance of making data-driven decisions. Most companies and entrepreneurs realize that data covers the blind spots that appear when intuition runs out. This awareness has led people to ask the right questions often, but not to think about the best ways to answer them. For instance, many entrepreneurs I speak with are acutely aware of machine learning trends and ask how they can make their users' experience more personalized. What is rare, however, is thinking through the kinds of data systems needed to store and manage data in a way that lets intelligence be extracted from it.

A Whole New Scale

Data systems are the tools, technologies and architectures used to develop applications and draw actionable insights from data. The scale of data is larger than it has ever been. It is now normal for companies to query and work with terabytes of data. To put that in perspective: companies routinely operate on enough data to store every book written in human history. Working with data at this scale requires carefully architected systems and specific tools to manipulate it. Knowledge of the latest improvements is critical in an environment of continually increasing complexity.

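It is therefore important first to develop an intuition for the two basic modes of data handling: batch and stream. Batch processing works on a complete, stored dataset in one run, while stream processing updates its results continuously as each new record arrives. The next step is to understand the frameworks built around these modes. Surprisingly few people really understand how MapReduce or stream processing works.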
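To make the batch versus stream distinction concrete, here is a minimal sketch in plain Python. It is not how any particular framework is implemented: the function names (map_phase, reduce_phase, batch_word_count, stream_word_count) and the sample records are assumptions made up for illustration. It counts words in the MapReduce style, first over a complete dataset and then one record at a time.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one record."""
    for word in record.lower().split():
        yield word, 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Shuffle and reduce step: group pairs by key and sum their values."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def batch_word_count(records: List[str]) -> Dict[str, int]:
    """Batch: the whole dataset is available, so map it all, then reduce once."""
    pairs = (pair for record in records for pair in map_phase(record))
    return reduce_phase(pairs)


def stream_word_count(records: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Stream: keep running totals and emit a snapshot after each record."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        for word, count in map_phase(record):
            totals[word] += count
        yield dict(totals)


# Hypothetical sample records, used only for this illustration.
events = ["data beats opinion", "data first", "opinion later"]

batch_result = batch_word_count(events)
stream_snapshots = list(stream_word_count(events))
```

Both approaches end with the same totals. The difference is when an answer becomes available: the batch version produces one result after reading everything, while the stream version has a usable, if partial, answer after every record. Real frameworks add distribution across machines and fault tolerance on top of this basic idea.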

Before awareness of data exploded, most companies operated on a simple relational database model and used it to power most of their applications. With the large amounts of data that applications need to process today, these systems and frameworks just aren't enough. Technologists globally have created tools to meet the various data processing and handling needs. From NoSQL stores such as MongoDB and Elasticsearch to data processing frameworks such as Spark, the tools have never been better for working with data. Perhaps surprisingly, however, this period also coincides with a deep lack of understanding in most companies of how to use this plethora of tools.

For any given set of applications there are numerous competing data technologies that could be deployed to build them. Without an understanding of these systems it is impossible to choose the one best suited to the dynamic world we operate in.

Companies that prioritize thinking about how to store and manage their data are the ones that will thrive in today's data-driven world. These are the organizations that will be able to develop the machine learning and AI experiences that distinguish their products.

Democratizing Data Through The Organization

It is very difficult for an organization to make data-driven decisions a priority unless the benefits are available across the entire company. A key priority for companies should be the democratization of data through the entire rank and file. The goal is to allow every individual to make the best and most informed decision.

To achieve this objective, choices and investments have to be made that treat internal employees as 'users' of any technology who are just as important as external customers. In a technology company, an example of this would be the creation of dashboards and internal visualization mechanisms for the company's metrics and data. By empowering every member of the organization to dig into the data, the entire company benefits from a common language and process for making decisions.

An important part of this process is integrating data into the culture of the organization. My favorite saying here is that 'Data trumps arguments'. It is important to develop an ethos where the person who is the most informed and can back their perspective with data gets to decide outcomes. This creates a meritocratic and informed culture, and making data widely available is the only way to achieve it.

Conclusion

The most successful companies in the next decade will make data and its use a primary function of the organization. Enabling this requires both a technological and a human endeavor. Technologically, the right systems and choices need to be made to make operating with data at terabyte and petabyte scale possible. It is imperative that analyses happen quickly, and modern technology enables that only if the correct choices are made for the application in question. Additionally, members of the organization need to be encouraged to use data as an important part of their workflow. Managers and executives need to democratize data so that every member is aligned and informed. A culture of data-backed decisions can enable quick iterations on products. Furthermore, it can prevent situations where a passionate argument that is not backed by data is pursued for too long. As companies across India and the world wake up to the strength that lies in their data, putting data first will lift them above and beyond their competitors.