Search-based data discovery

Simple questions do not scale

Every log management best-practice guide strongly recommends knowing what you are collecting. In practice, that means being able to answer questions such as:
  • Which data is being stored? Are there any duplicates or unnecessary data?
  • For how long is it kept (what is the retention period)?
  • How much of it is there?
These may seem simple enough to answer. However, when log management reaches the scale of petabytes and hundreds or thousands of different data sources, the situation changes. What if the data is stored in different repositories (for example, part of it on-premises and part of it in the cloud)? What’s more, log management is not static. New sources and types of logs are added over time, and some applications are retired while the data they produced is kept until the end of its retention period. Suddenly the questions are not that easy to answer.
The ability to maintain an up-to-date inventory of stored data assets is not only relevant to the total cost of ownership. It also bears on compliance requirements: is the required retention period maintained, and is data deleted when retention expires?
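To make the inventory questions concrete, here is a minimal Python sketch of what answering them involves for a single repository. It is not how SpectX works internally. It assumes the logs sit on a local filesystem with one top-level directory per data source, and it uses each file's modification time as a rough proxy for the age of the data. The names `inventory` and `SourceSummary` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Dict, Optional


@dataclass
class SourceSummary:
    """Per-source answers to: how much, how old, and what is past retention."""
    files: int = 0
    total_bytes: int = 0
    oldest: Optional[datetime] = None
    newest: Optional[datetime] = None
    expired_files: int = 0


def inventory(root: Path, retention: timedelta,
              now: Optional[datetime] = None) -> Dict[str, SourceSummary]:
    """Summarise stored files under `root`, grouped by top-level directory.

    Assumption: each top-level directory corresponds to one data source,
    and file modification time approximates the age of the data.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention
    summary: Dict[str, SourceSummary] = defaultdict(SourceSummary)

    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Files directly under root have no source directory of their own.
        source = rel.parts[0] if len(rel.parts) > 1 else "(root)"
        stat = path.stat()
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)

        entry = summary[source]
        entry.files += 1
        entry.total_bytes += stat.st_size
        entry.oldest = modified if entry.oldest is None else min(entry.oldest, modified)
        entry.newest = modified if entry.newest is None else max(entry.newest, modified)
        if modified < cutoff:
            # Still present although older than the retention period allows.
            entry.expired_files += 1

    return dict(summary)
```

Calling `inventory(Path("/var/log/archive"), timedelta(days=365))` (the path is only an example) returns one summary per source. Even this simple version has to touch every file in one location, which illustrates why the same questions become hard across several repositories and petabytes of data.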

A quick overview of all the assets and metadata

SpectX is designed to perform structured analysis on unstructured data in its original raw form and location. It is, therefore, a great tool for developing and refining views and analyses of structured and unstructured data using search terms. Obtaining the metadata of data assets in their various storage repositories is an integral part of SpectX functionality. You can simply browse around to get a quick visual understanding of your assets and their contents, or use queries to produce more sophisticated reports.

