Trends in data analysis

When the foundations of data analysis and the Business Intelligence concept were being laid, we started with single data sources. Initially, we dealt with small data, i.e. small amounts of information subject to processing and research. This type of analysis was usually based on historical data: from yesterday, a week ago or a month ago. We have come a long way since then. Analyses are now available at any time and from any device, and data and reports are stored securely in the cloud.

In today’s business world, offline reporting, in which report printouts are sent via email, is still popular. These solutions rely on on-premises infrastructure, i.e. equipment kept within a given company.

User needs evolution

However, this traditional approach has changed in most companies. Analysts now have access to many data sources. This includes data from within the company, but also from external sources, e.g. social networks. Thanks to such a large amount of information, i.e. Big Data, we can analyze much more and formulate more complex theses based on this data. We can concentrate on details or examine data from, e.g., a large number of sensors and machine controllers, as is the case in the manufacturing industry.

Today, far more data is produced than even a decade ago. Needs have changed as well. We want the results of analyses as soon as possible after the data is generated, and we want access to them at any time. Near-real-time solutions that cover the whole process, from data acquisition to data analysis, are used more and more often.

If reporting and analyses are performed online, we usually also gain access to them via mobile devices. At the same time, there are tools that allow users to build their own reports and store analyses in cloud infrastructure.

As a result of these processes, observed across the BI market, more universal and functional cloud infrastructure is increasingly replacing on-premises infrastructure.

Business intelligence evolution

Single data source >>> Multiple data sources (internal and external)
Small data >>> Big data
Historical data analysis >>> Near-real-time data analysis
Offline reporting >>> Online reporting, also on mobile devices
On-premises infrastructure >>> Cloud infrastructure

Advantages and disadvantages of using the cloud in BI solutions compared to on-premises infrastructure

1. Data processing

In on-premises infrastructure, data processing is most often performed sequentially. This means that if we have, for example, a large data file and we want to load it into our system, we do it “piece by piece” until the whole file is loaded.

In the cloud, this process relies heavily on distribution. We can divide the data file into smaller parts and process them in parallel, which speeds up the whole process.
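The difference between the two loading styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row format ("id,amount"), the function names and the chunk size are made up for the example, and a local thread pool stands in for the distributed nodes a cloud platform would actually provide.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, chunk_size):
    """Split a list of rows into consecutive chunks of at most chunk_size rows."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def load_chunk(chunk):
    """Hypothetical load step: parse each 'id,amount' row and sum the amounts."""
    return sum(float(row.split(",")[1]) for row in chunk)


def load_sequentially(rows):
    """On-premises style: a single worker goes through the file piece by piece."""
    return load_chunk(rows)


def load_in_parallel(rows, chunk_size=1000, workers=4):
    """Cloud style: chunks go to several workers and partial results are combined."""
    chunks = split_into_chunks(rows, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(load_chunk, chunks))
    return sum(partial_sums)


# Made-up sample data: 10,000 rows in the assumed "id,amount" format.
sample_rows = [f"{i},{i % 10}" for i in range(10_000)]
sequential_total = load_sequentially(sample_rows)
parallel_total = load_in_parallel(sample_rows)
```

Both functions return the same total; only the way the work is divided differs. On a real cloud platform the chunks would be processed by separate machines, which is where the speed-up comes from.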

2. Scalability

In on-premises infrastructure, solutions scale less easily and scaling requires additional expenditure on hardware. This approach requires constant commitment on the part of IT, which lengthens processes over time and makes it more difficult to achieve high efficiency.

In the cloud, scaling is implemented more efficiently, which makes it easier to handle spikes in demand for the resources required for BI analyses. A feature of cloud solutions is the ability to scale resources both up and down, depending on current demand. This means that when the need arises, we increase the resources we can use (scale up), and then “give them back” when they are no longer needed (scale down). In this way, we reduce the cost of maintaining the system. Instead of incurring constant expenses, we pay for additional resources only when they are needed.
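A small calculation shows why scaling down matters for cost. The sketch below compares capacity sized permanently for the peak with capacity that follows demand hour by hour. All figures (demand per hour, node capacity, price per node-hour) are invented for the example, and expressing fixed capacity as an hourly price is a simplification.

```python
import math


def fixed_capacity_cost(hourly_demand, node_capacity, price_per_node_hour):
    """Cost when enough nodes for the peak hour are kept running all the time."""
    peak_nodes = math.ceil(max(hourly_demand) / node_capacity)
    return peak_nodes * len(hourly_demand) * price_per_node_hour


def elastic_capacity_cost(hourly_demand, node_capacity, price_per_node_hour):
    """Cost when the node count is scaled up and down every hour (minimum 1 node)."""
    node_hours = sum(max(1, math.ceil(d / node_capacity)) for d in hourly_demand)
    return node_hours * price_per_node_hour


# Assumed workload: 20 quiet hours and a 4-hour reporting spike (queries per hour).
demand = [20] * 20 + [100] * 4
fixed_cost = fixed_capacity_cost(demand, node_capacity=25, price_per_node_hour=2.0)
elastic_cost = elastic_capacity_cost(demand, node_capacity=25, price_per_node_hour=2.0)

print(fixed_cost)    # 192.0 (4 nodes for 24 hours)
print(elastic_cost)  # 72.0  (1 node for 20 hours, 4 nodes for 4 hours)
```

With these assumed numbers, following demand costs well under half of what permanently provisioned peak capacity would cost. The real ratio depends entirely on how spiky the workload is.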

3. Costs instead of investments, implications for the Proof of Concept of BI system

In on-premises infrastructure, we usually pay for everything in advance. We have to make an investment by purchasing equipment and licenses, and only then can we start implementing our solution.

In cloud infrastructure, we do not have this entry barrier, because we pay only for consumption, i.e. what we use. This applies, among other things, to software license costs, which we bear only while using the selected solution.

In on-premises infrastructure, the challenge is often to create the environment needed for conceptual work. The entire infrastructure has to be in place before work on a proof of concept (PoC) can start at all.

In the cloud, on the other hand, building the environment needed for conceptual work costs only as much as the resources we use during the PoC phase. It is therefore easier to carry out a PoC in the cloud and then make a decision about the future of the project. If the created solution meets the business requirements, it can be continued. Otherwise, we close the project without generating unnecessary costs. Thanks to this approach, the company avoids the trap of becoming a “hostage” of its investment, and it is also easier for it to take the risk of starting such a project, because it does not involve significant costs.

4. Cost predictability

The advantage of having on-premises infrastructure is that costs are easy to predict, because in this case they are constant. This makes it easier to plan the budget.

In the cloud, many factors affect costs. They are often difficult to determine and can easily be underestimated. Building a cloud infrastructure requires specialist knowledge, including an understanding of how costs are calculated. Without this knowledge, it is easy to generate unnecessary costs unknowingly, e.g. by maintaining highly scaled services without a real business need. These services consume cloud resources for which the enterprise pays, while users either do not use them, or the benefits of their operation cannot compensate for the costs incurred.
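A simple check of this kind can be sketched as follows. The service names, prices, utilization figures and the 20% threshold are all hypothetical, and real cloud bills include many more components (storage, network traffic, licenses) than the single hourly price assumed here.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    instances: int
    price_per_instance_hour: float
    avg_utilization: float  # share of capacity actually used, from 0.0 to 1.0


def monthly_cost(service, hours_in_month=730):
    """Estimated monthly cost of a service that runs around the clock."""
    return service.instances * service.price_per_instance_hour * hours_in_month


def flag_overprovisioned(services, min_utilization=0.2):
    """Names of services whose average utilization is below the threshold."""
    return [s.name for s in services if s.avg_utilization < min_utilization]


# Hypothetical services: one heavily scaled but barely used, one sized sensibly.
services = [
    Service("reporting-api", instances=8, price_per_instance_hour=0.5, avg_utilization=0.05),
    Service("etl-workers", instances=2, price_per_instance_hour=0.25, avg_utilization=0.7),
]

for s in services:
    print(s.name, monthly_cost(s))
print(flag_overprovisioned(services))
```

In this made-up example, the barely used "reporting-api" service accounts for most of the monthly spend, which is exactly the kind of cost that is easy to overlook without knowing how charges are calculated.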

ON-PREMISES INFRASTRUCTURE

  • Sequential processing (single node)
  • Poorly scalable solutions
  • Everything is paid for in advance (hardware, licenses)
  • Building an environment for concept work (PoC) is a challenge
  • Fixed, predictable costs

CLOUD INFRASTRUCTURE

  • Parallel processing (distributed nodes)
  • Highly scalable solutions
  • Charges per usage (including hardware, licenses)
  • Environment for concept work (PoC) is easy to build
  • Variable costs, difficult to estimate

Weighing the advantages and disadvantages of the cloud-based and on-premises approaches, it is not difficult to see that the cloud offers more benefits. For this reason, cloud BI solutions are becoming more and more popular.

Marek Czachorowski

Head of Business Intelligence Practice at Inetum in Poland. For 10 years Marek has been involved in BI and broadly defined data analysis and processing. From the beginning he has been mainly associated with Microsoft solutions and tools. Since 2017 he has been a certified specialist in data warehouse design and SQL Server platform management. He is currently developing his expertise primarily in the area of cloud analytics. As a consultant, he helps clients define company processes and establish rules for processing and accessing data.