By Liisa Tallinn and Raido Karro
Parsing and analyzing IIS logs with SpectX is easy - no need to wait for the data to ingest anywhere. SpectX reads and queries raw data from its current storage, parsing it at query runtime.
1. Download the SpectX desktop edition and point it at your data by navigating to one of the log files. Here are detailed instructions on configuring the IIS parser if needed.
2. Click on 'Prepare Query' and 'Run'
3. Add an asterisk to the file path in the LIST command to query multiple files.
TIP: configure SpectX access to Maxmind GeoLite2 database to enrich IP-addresses with geolocation and AS information.
The free desktop edition will automatically apply a pattern (schema) to the IIS logs extracting timestamps, IPs, user agents and other fields necessary for in-depth analysis of events happening in your web server. To give you some inspiration on what to look for, we've put down a collection of sample questions and copy-pastable queries. We've written a separate post in case you're interested in analyzing OWA and Activesync activities.
| filter(c_ip = 0.0.0.0)2) How many requests are unique client IPs generating? Sort the results in descending order.
| select(cs_uri_stem, count(cs_uri_stem) as requestcount)
| group (cs_uri_stem)
| sort(requestcount DESC)
| select(c_ip,reqcount:count(c_ip))Note: if you wish to get a top list of c_ip’s, add a limit to this script. Eg to get top 20 IPs:
| group(c_ip)
| sort(reqcount desc)
| limit(20)3) How many hits per hour did the server get from a particular IP address?
| filter(c_ip = 0.0.0.0)Press on the chart button to get a line visualization
| select (date_time[1 hour], reqcount:count(*))
| group(date_time[1 hour])
| select(date_time, GEO(c_ip))Click on ‘Map’ and shorten the range to hit play and enjoy the scenery.
| sort(date_time)
| filter(cs_uri_stem like '%foo%')Note: If you wish to display this result on a chart, add one more line and then click on the Chart button
| select(c_ip, reqcount:count(c_ip))
| group (c_ip)
| sort(reqcount DESC)
| select(string(c_ip), reqcount) //transform the IP back to strings6) What is the domain associated with these IP addresses (ReverseDNS)
| filter(c_ip IN (0.0.0.0, 1.1.1.1, 1.2.3.4))
| select(c_ip, reverse:DNS_LOOKUP(c_ip))
| group(c_ip)
| select(cs_uri_stem, reqcount:count(cs_uri_stem))8) How many times were the URIs containing ‘foo’ or ‘bar’ hit?
| group(cs_uri_stem)
| sort(reqcount desc)
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%')9) How many times were the URIs containing ‘foo’ or ‘bar’ hit? When was the first and when the last hit?
| select(cs_uri_stem, reqcount:count(cs_uri_stem))
| group(cs_uri_stem)
| sort(reqcount desc)
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%')10) How many times did various IPs request a URI containing ‘foo’ or ‘bar’?
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem), first_seen:min(date_time), last_seen:max(date_time))
| group(cs_uri_stem, c_ip )
| sort(reqcount desc)
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%')11) Compare the hit count in time for URIs that contain ‘foo’, ‘bar’ and ‘baz’.
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem))
| group (cs_uri_stem, c_ip )
| sort(reqcount desc)
| select(date_time, cs_uri_stemPress on ‘chart’ to get a visualization on this graph.
,total_hits:count(*)
,foo:count(cs_uri_stem like '%foo%')
,bar:count(cs_uri_stem like '%bar%')
,baz:count(cs_uri_stem like '%baz%')
)
| group(cs_uri_stem)
| filter(foo > 0 AND bar > 0 AND baz > 0)
| filter(c_ip=0.0.0.0 and (cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%'))13) What are the URIs visited after 1st September 2019 at 10 AM? How many hits did they get?
| select(date_time, cs_uri_stem, reqcount:count(*))
| group(date_time[1 hour], cs_uri_stem)
| sort(reqcount desc)
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%')
| filter(date_time > T('2019-09-01 10:00:00') )
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem))
| group(cs_uri_stem, c_ip )
| sort(reqcount desc)
| select(totalhits:count(*), browser:cs_user_agent)15) What external pages (referrers) refer to broken links?
| group(browser)
| sort(totalhits desc)
| limit(50)
| filter(cs_referer IS NOT NULL AND sc_status = 404 AND (sc_substatus IS NULL OR sc_substatus=0))16) How many times did the server return a particular status code?
| select(referer:cs_referer, uri:cs_uri_stem)
| group(referer)
| SELECT(status:sc_status,count(*) )17) Are there any requests originating from the TOR network?
| GROUP(status)
| SORT(status)
| join(PARSE(src:'https://check.torproject.org/cgi-bin/TorBulkExitList.py?ip=1.1.1.1', pattern:'ipaddr:exitAddress (EOL|EOS)') on left.c_ip = right.exitAddress)18) Are there any Googlebot user agents not coming from Google AS? That is - who's pretending to be a Googlebot?
|filter(asname(c_ip) != 'AS15169 Google LLC' and lower(cs_user_agent) like '%googlebot%')
| select(hour:date_time[1 hour], incomingKB:round(Sum(cs_bytes)/1024,2), outgoingKB:round(Sum(sc_bytes)/1024,2))20) What are the total, max and average bandwidth and max & average time of requests per URI in hourly slots?
| group(hour[1 hour])
| select( hour:date_time[1 hour], uri:lower(cs_uri_stem), reqcount:COUNT(*), TotBytesSent:SUM(sc_bytes), AvgBytesSent:AVG(sc_bytes), MaxBytesSent:Max(sc_bytes), AvgTime:(Avg(time_taken)*0.001), MaxTime:(MAX(time_taken)*0.001))
| group(hour[1 hour], uri)
| filter(reqcount> 10)
| sort(reqcount ASC)
| limit(80)