Key Findings from 37GB of Dot-cm Typosquatting Scheme Logs
Digging deeper into the typosquatting operation: popular domains
This is the sequel to exploring BlueCoat proxy logs. Having defined their structure, I now proceed to analyse the actual log content. As my knowledge of the origin of these sample logs is limited, I'll try to extract as much information as possible about the nature of the proxy deployment, user behaviour and the users' geographical location.
I recently came across an article about an introductory open data analytics exercise using Python (web scraping, extracting data with regex, and visualization). Fun learning indeed. But then I thought: how much time and how many lines of code would it take to do the same thing with SpectX? Let’s find out.
SpectX has raised investment from Karma Ventures for product development and growth. The startup, founded by former security engineers at Skype and Swedbank, is developing a powerful analytics software solution for rapid processing of unlimited amounts of data. The product will be publicly launched in autumn 2017.
It turns out that extracting useful information from logs is not easy. In fact, obtaining data and preparing it for analytics is a complicated and costly process. The aim of this whitepaper is to describe these complexities and bring out the core reasons behind the phenomenon of inertia in machine-generated data.