Analyzing IIS Log Files

By Liisa Tallinn and Raido Karro

Parsing and analyzing IIS logs with SpectX is easy - no need to wait for the data to ingest anywhere. SpectX reads and queries raw data from its current storage, parsing it at query runtime.  
1. Download the SpectX desktop edition and point it at your data by navigating to one of the log files. Here are detailed instructions on configuring the IIS parser if needed.
2. Click on 'Prepare Query' and 'Run' 
3. Add an asterisk to the file path in the LIST command to query multiple files.

TIP: configure SpectX access to Maxmind GeoLite2 database to enrich IP-addresses with geolocation and AS information.

The free desktop edition will automatically apply a pattern (schema) to the IIS logs extracting timestamps, IPs, user agents and other fields necessary for in-depth analysis of events happening in your web server.  To give you some inspiration on what to look for, we've put down a collection of sample questions and copy-pastable queries. We've written a separate post in case you're interested in analyzing OWA and Activesync activities.

20 sample queries on IIS logs

SpectX' query syntax is similar to piping UNIX commands. Queries are written by chaining commands one after another, directing the previous output to the next input. Copy-paste the following queries one by one after the base query you get after navigating to an IIS log file and pressing prepare query. Remember to remove the | limit(1000) filter if you wish to get more than 1000 rows in the resultset. 

Adding the queries to the inital script

Explore CLIENT IPs

1) Which URIs did a specific client IP visit and how many times? Right-click on one of the cs_ip fields in the results to add that IP as the filter.
| filter(c_ip = 0.0.0.0)
| select(cs_uri_stem, count(cs_uri_stem) as requestcount)
| group (cs_uri_stem)
| sort(requestcount DESC)
2) How many requests are unique client IPs generating? Sort the results in descending order. 
| select(c_ip,reqcount:count(c_ip)) 
| group(c_ip)
| sort(reqcount desc)
Note: if you wish to get a top list of c_ip’s, add a limit to this script. Eg to get top 20 IPs:
| limit(20)
3) How many hits per hour did the server get from a particular IP address?
| filter(c_ip = 0.0.0.0)
| select (date_time[1 hour], reqcount:count(*))
| group(date_time[1 hour])
Press on the chart button to get a line visualization

4) What is the geographical location of the IP addresses? Generate a “pew-pew” map (note that this query presumes you've configured SpectX access to GeoLite2)
| select(date_time, GEO(c_ip))
| sort(date_time)
Click on ‘Map’ and shorten the range to hit play and enjoy the scenery.



5) How many times did different IPs visit a particular URI that contains the string ‘foo’?
| filter(cs_uri_stem like '%foo%') 
| select(c_ip, reqcount:count(c_ip))
| group (c_ip)
| sort(reqcount DESC)
Note: If you wish to display this result on a chart, add one more line and then click on the Chart button
| select(string(c_ip), reqcount) //transform the IP back to strings
6) What is the domain associated with these IP addresses (ReverseDNS)
| filter(c_ip IN (0.0.0.0, 1.1.1.1, 1.2.3.4)) 
| select(c_ip, reverse:DNS_LOOKUP(c_ip))
| group(c_ip)

Explore the URIs


7) How many times was each URI hit?
| select(cs_uri_stem, reqcount:count(cs_uri_stem))
| group(cs_uri_stem)
| sort(reqcount desc)
8) How many times were the URIs containing ‘foo’ or ‘bar’ hit?
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%')
| select(cs_uri_stem, reqcount:count(cs_uri_stem))
| group(cs_uri_stem)
| sort(reqcount desc)
9) How many times were the URIs containing ‘foo’ or ‘bar’ hit? When was the first and when the last hit?
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%') 
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem), first_seen:min(date_time), last_seen:max(date_time))
| group(cs_uri_stem, c_ip )
| sort(reqcount desc)
10) How many times did various IPs request a URI containing ‘foo’ or ‘bar’?
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%') 
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem))
| group (cs_uri_stem, c_ip )
| sort(reqcount desc)
11) Compare the hit count in time for URIs that contain ‘foo’, ‘bar’ and ‘baz’.
| select(date_time, cs_uri_stem
,total_hits:count(*)
,foo:count(cs_uri_stem like '%foo%')
,bar:count(cs_uri_stem like '%bar%')
,baz:count(cs_uri_stem like '%baz%')
)
| group(cs_uri_stem)
| filter(foo > 0 AND bar > 0 AND baz > 0)
Press on ‘chart’ to get a visualization on this graph.

12) How many times per hour did a particular IP visit URIs containing foo or bar?
| filter(c_ip=0.0.0.0 and (cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%'))
| select(date_time, cs_uri_stem, reqcount:count(*))
| group(date_time[1 hour], cs_uri_stem)
| sort(reqcount desc)
13) What are the URIs visited after 1st September 2019 at 10 AM? How many hits did they get?
| filter(cs_uri_stem like '%foo%' or cs_uri_stem like '%bar%') 
| filter(date_time > T('2019-09-01 10:00:00') )
| select(cs_uri_stem, c_ip, reqcount:count(cs_uri_stem))
| group(cs_uri_stem, c_ip )
| sort(reqcount desc)

Explore Misc

14) What are the top 50 user agents?
| select(totalhits:count(*), browser:cs_user_agent)
| group(browser)
| sort(totalhits desc)
| limit(50)
15) What external pages (referrers) refer to broken links?
| filter(cs_referer IS NOT NULL AND sc_status = 404 AND (sc_substatus IS NULL OR sc_substatus=0))
| select(referer:cs_referer, uri:cs_uri_stem)
| group(referer)
16) How many times did the server return a particular status code?
| SELECT(status:sc_status,count(*) ) 
| GROUP(status)
| SORT(status)
17) Are there any requests originating from the TOR network?
This query fetches and parses current TOR exit nodes published on the project’s website and joins them virtually with the IIS logs based on the client IP. 
| join(PARSE(src:'https://check.torproject.org/cgi-bin/TorBulkExitList.py?ip=1.1.1.1', pattern:'ipaddr:exitAddress (EOL|EOS)') on left.c_ip = right.exitAddress)
18) Are there any Googlebot user agents not coming from Google AS? That is - who's pretending to be a Googlebot?
|filter(asname(c_ip) != 'AS15169 Google LLC' and lower(cs_user_agent) like '%googlebot%')

Explore the Traffic

These two queries are executable if logging cs_bytes and sc_bytes are enabled on the IIS server (check the pattern)
19) What is the hourly incoming and outgoing bandwidth in KB? (chart)
| select(hour:date_time[1 hour], incomingKB:round(Sum(cs_bytes)/1024,2), outgoingKB:round(Sum(sc_bytes)/1024,2))
| group(hour[1 hour])
20) What are the total, max and average bandwidth and max & average time of requests per URI in hourly slots?
| select( hour:date_time[1 hour], uri:lower(cs_uri_stem), reqcount:COUNT(*), TotBytesSent:SUM(sc_bytes), AvgBytesSent:AVG(sc_bytes), MaxBytesSent:Max(sc_bytes), AvgTime:(Avg(time_taken)*0.001), MaxTime:(MAX(time_taken)*0.001)) 
| group(hour[1 hour], uri)
| filter(reqcount> 10)
| sort(reqcount ASC)
| limit(80)

Export the results and Run the Query via API

Click on Save > Save File As to save the script to the SpectX resource tree. To export the results to e.g. CSV or JSON, click on Save > Export As and choose the format you prefer (.json or .csv)

To run a query via the API, save the query to the SpectX resource tree (Save > Save File As). Then right-click on the script file at the resource tree > properties. Copy the API query from the API tab.

Conclusion

These 20 sample queries are just a start. SpectX has a rich query language that helps to deliver complex analysis tasks and test even the wildest hypotheses. Browse the documentation for further inspiration on the SpectX parsing and query languages.

Back to articles