Every photo taken, email sent, document created or shared, and visit to the company’s website moves and creates new data, all of which affects costs in one way or another.
Luckily, thanks to the constant technological development of data carriers and network devices, as well as increases in storage and network capacity, the cost of data volumes is declining. Despite this, it is important to assess a company’s data volume now and its requirements for the future.
There are two types of data volume that one should be aware of in general, and especially when using cloud services.
Most people first think of storage capacity, which means exactly that: how much data a device or service can store. This is the indicator considered when choosing a new phone or computer, and it applies to cloud services as well.
In addition, there is data transfer volume, which means how much data moves between different data carriers, whether over local connections or networks. This volume becomes very significant in the case of web-based services. Since the network capacity of hosting providers is limited, in the online world the data transfer volume is usually also limited, or prices are based on usage.
In the case of data volume, it also pays to consider whether one is dealing with recurring or one-time volumes. For example, when migrating to a cloud service or moving between server rooms, the transfer of historical data is usually a one-time occurrence. The data volumes involved may be so large that it is worth considering whether the transfer should happen over the network or with the help of physical devices. This could mean moving the data on storage devices or, for example, ordering an encrypted AWS Snowball appliance, a suitcase-sized storage device. In more extreme cases, it is even possible to order an AWS Snowmobile, a truck holding enough hard drives to store up to 100 petabytes of data.
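To get a feel for when physical transport starts to make sense, one can estimate how long a one-time migration would take over the network. The sketch below is a back-of-the-envelope calculation; the data size, bandwidth and efficiency figures are illustrative assumptions, not measurements.

```python
# Back-of-the-envelope estimate: how long would a one-time migration take
# over the network? All figures are illustrative assumptions.

def transfer_days(data_tb: float, bandwidth_gbps: float,
                  efficiency: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a bandwidth_gbps link,
    assuming only `efficiency` of the nominal bandwidth is usable."""
    bits = data_tb * 8e12                          # terabytes -> bits
    seconds = bits / (bandwidth_gbps * 1e9 * efficiency)
    return seconds / 86400                         # seconds -> days

# Example: 100 TB of historical data over a 1 Gbit/s connection.
print(f"{transfer_days(100, 1):.1f} days")         # ~11.6 days
```

At around a week and a half for 100 TB on a dedicated gigabit link, and proportionally longer as volumes grow, shipping the data physically can easily become the faster option.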
Web services
In the case of web services (websites, self-service environments, online stores, etc.), storage and data transfer volumes are both important, because both are usually limited or paid for based on usage.
In addition, in the case of web services, both volumes are directly related to usage events and depend on the applications being used.
Generally, the storage volume can be divided in two. The first part is the application itself and the volume necessary to run it, which is usually negligible and therefore not calculated separately. The second part depends on the application’s usage: it is quite common for an application to produce considerably larger data volumes while working, e.g. log entries for analysis or media files uploaded by users.
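As an illustration, this usage-driven growth can be estimated from a few assumed inputs. The figures in the sketch below (log output, upload counts, file sizes) are hypothetical placeholders, not measurements from any real application.

```python
# A minimal sketch of estimating monthly storage growth caused by usage.
# All inputs are hypothetical assumptions for illustration.

daily_log_mb = 200        # assumed log output per day, in MB
uploads_per_day = 150     # assumed number of user-uploaded files per day
avg_upload_mb = 4         # assumed average size of one uploaded file, in MB

monthly_growth_gb = (daily_log_mb + uploads_per_day * avg_upload_mb) * 30 / 1024
print(f"Estimated storage growth: ~{monthly_growth_gb:.0f} GB per month")
```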
The data transfer volume is also closely related to the application’s usage events, and most of these events can be used to predict its size. When creating a strategy for a website or web application, it is already known what it is being built for and for whom. This means that it is already known which so-called ‘pathways’ the users will most probably take and what they will encounter along them. Based on this, it is possible to calculate the data transfer volume one visitor needs to complete a pathway (by adding up the sizes of the downloadable pages along it) and then, very simplistically speaking, multiply this volume by the predicted number of visitors, as sketched below.
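The sketch that follows implements exactly this simplistic pathway calculation. The page names, page sizes and visitor count are hypothetical assumptions chosen only to show the arithmetic.

```python
# A very simplistic sketch of the pathway calculation described above.
# Page sizes and visitor numbers are hypothetical assumptions.

pathway_page_sizes_mb = {
    "landing page": 2.5,
    "product list": 1.8,
    "product detail": 2.2,
    "checkout": 1.1,
}

volume_per_visitor_mb = sum(pathway_page_sizes_mb.values())
expected_visitors = 20_000   # assumed visitors per month

monthly_transfer_gb = volume_per_visitor_mb * expected_visitors / 1024
print(f"One visitor downloads ~{volume_per_visitor_mb:.1f} MB on this pathway")
print(f"Predicted transfer volume: ~{monthly_transfer_gb:.0f} GB per month")
```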
Of course, this cannot be calculated precisely, because it is very unusual to be 100% certain of the visitors’ pathways and numbers in advance. That is not a problem: when estimating data storage and transfer volumes, what matters is understanding the orders of magnitude. Afterwards, it is always possible to see very precisely how much of any specific volume has actually been used.
Backup copies
In the case of backup copies, assessing the volume is somewhat simpler. In most cases, it is already known how many backup copies are made of devices and services, and how much data those devices hold. In addition, it must be decided how long the data is stored and how long it must remain quickly available. For example, if the goal is to store all backup copies for up to 30 days (or, in the case of financial and similar data, for 7 years), then usually the data for the last few days should be immediately available, in case an accident happens and the data must be restored as quickly as possible.
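Following the 30-day example, the required volume splits naturally into a quick-access portion and an archive portion. The sketch below shows the split; the daily backup size and the number of quick-access days are assumed figures.

```python
# A sketch of splitting backup volume between quickly available storage
# and a long-term archive, following the 30-day example above.
# The backup size is an assumed figure.

daily_backup_gb = 50    # assumed size of one daily backup
retention_days = 30     # keep every backup for 30 days
hot_days = 3            # the last few days must be restorable immediately

hot_gb = daily_backup_gb * hot_days
archive_gb = daily_backup_gb * (retention_days - hot_days)

print(f"Quick-access storage needed: {hot_gb} GB")
print(f"Archive storage needed:      {archive_gb} GB")
```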
Based on this logic, it is possible to choose the most suitable and cost-effective archiving services. For example, among AWS services, data that requires quick access can be kept in the S3 service, while less urgent data can be moved to the S3 Glacier service, which helps to lower costs.
If the volumes and retention periods are agreed upon in advance, the exact volumes of such an automated and secure backup solution, and the costs thereof, can be calculated.
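A rough cost estimate can then reuse the volume split from the previous sketch. The per-gigabyte prices below are placeholders for illustration only; actual AWS pricing varies by region and storage class and changes over time.

```python
# A sketch of the cost calculation, reusing the volume split above.
# The per-GB prices are placeholders; actual AWS prices vary by region
# and storage class and change over time.

hot_gb, archive_gb = 150, 1350      # volumes from the previous sketch

s3_price_per_gb = 0.023             # assumed USD per GB per month
glacier_price_per_gb = 0.004        # assumed USD per GB per month

monthly_cost = hot_gb * s3_price_per_gb + archive_gb * glacier_price_per_gb
print(f"Estimated storage cost: ~${monthly_cost:.2f} per month")
```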