Parsing Windows 2008 DNS Debug Logs

Parsing and analysing Windows DNS logs can be a challenge. If your server runs Windows Server 2012 R2 or later, you are probably fine, as the output is formatted into Windows event logs. However, if you're up against an earlier version, the output lands in plaintext log files that are hard to analyse for both humans and machines. Here's an example of parsing and querying Windows 2008 DNS debug logs.

What’s There: the Windows DNS Debug Log Format

The DNS debug log is, of course, all about queries from the local host and responses from the DNS server. The raw output looks similar to this:
23.05.2019 11:28:06 07F4 PACKET  00000000035021D0 UDP Rcv    6ad9   Q [0001   D   NOERROR] AAAA   (3)www(7)example(3)com(0)
23.05.2019 11:28:06 07F4 PACKET 00000000035021D0 UDP Snd   6ad9 R Q [8081 DR NOERROR] AAAA (3)www(7)example(3)com(0)
The fields are described at the beginning of the file:

  Field #  Information         Values
  -------  -----------         ------
     1     Date
     2     Time
     3     Thread ID
     4     Context
     5     Internal packet identifier
     6     UDP/TCP indicator
     7     Send/Receive indicator
     8     Remote IP
     9     Xid (hex)
    10     Query/Response      R = Response
                               blank = Query
    11     Opcode              Q = Standard Query
                               N = Notify
                               U = Update
                               ? = Unknown
    12     [ Flags (hex)
    13     Flags (char codes)  A = Authoritative Answer
                               T = Truncated Response
                               D = Recursion Desired
                               R = Recursion Available
    14     ResponseCode ]
    15     Question Type
    16     Question Name
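
To make the field list concrete, here is a rough Python sketch that splits one PACKET line into named fields with a regular expression. It is only an illustration, not part of the SpectX workflow described in this post: the names LINE and parse_line are made up for this example, the date format is assumed to match the sample above (it may differ with the server's locale), and the remote IP is treated as optional because the sample lines shown here do not contain one.

```python
import re
from datetime import datetime

# One named group per field from the list above. Whitespace between
# fields is matched loosely because the padding varies from line to line.
LINE = re.compile(
    r"(?P<date>\d{2}\.\d{2}\.\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<thread_id>[0-9A-Fa-f]+)\s+(?P<context>\S+)\s+"
    r"(?P<packet_id>[0-9A-Fa-f]+)\s+(?P<protocol>UDP|TCP)\s+"
    r"(?P<direction>Snd|Rcv)\s+"
    # Assumption: the remote IP always contains '.' or ':' and may be absent.
    r"(?:(?P<remote_ip>[0-9A-Fa-f.:]*[.:][0-9A-Fa-f.:]*)\s+)?"
    r"(?P<xid>[0-9A-Fa-f]+)\s+"
    r"(?:(?P<response>R)\s+)?(?P<opcode>[QNU?])\s+"
    r"\[(?P<flags_hex>[0-9A-Fa-f]+)\s+(?P<flags>[ATDR ]*?)\s+(?P<rcode>\w+)\]\s+"
    r"(?P<qtype>\S+)\s+(?P<qname>\S+)"
)


def parse_line(line):
    """Return a dict of fields for one PACKET line, or None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None  # header, blank line or some other record type
    rec = match.groupdict()
    date, time = rec.pop("date"), rec.pop("time")
    rec["timestamp"] = datetime.strptime(date + " " + time, "%d.%m.%Y %H:%M:%S")
    rec["is_response"] = rec.pop("response") == "R"
    rec["flags"] = rec["flags"].replace(" ", "")  # "A DR" -> "ADR"
    rec["xid"] = int(rec["xid"], 16)
    return rec


sample = ("23.05.2019 11:28:06 07F4 PACKET  00000000035021D0 UDP Rcv    6ad9   Q "
          "[0001   D   NOERROR] AAAA   (3)www(7)example(3)com(0)")
record = parse_line(sample)
print(record["timestamp"], record["qtype"], record["qname"])
```

The Question Name is still in its raw bracketed form at this point; it needs a separate decoding step.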

Parse the Logs - Challenge Accepted

Parsing these space-separated fields doesn't seem too difficult. But the drama kicks in with the unique format of the last field: the labels of the domain name, each preceded by its length as an integer enclosed in brackets. The question is: how do you transform (3)www(7)example(3)com(0) into www.example.com?
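
For illustration, the conversion itself can be sketched in a few lines of plain Python. The function name decode_question_name is invented for this example, and the sketch assumes that labels never contain brackets themselves.

```python
import re

# One length-prefixed label: "(3)www" -> length "3", text "www".
LABEL = re.compile(r"\((\d+)\)([^()]*)")


def decode_question_name(raw):
    """Turn '(3)www(7)example(3)com(0)' into 'www.example.com'."""
    labels = []
    for length, text in LABEL.findall(raw):
        if int(length) == 0:
            break  # "(0)" terminates the name
        labels.append(text)
    return ".".join(labels)


print(decode_question_name("(3)www(7)example(3)com(0)"))
```

The idea is the same as in the approach described next: collect the labels, skip the bracketed integers, and join what is left with dots.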

Doing it manually normally requires advanced regex tricks, but you can also get this task done with the full-functionality trial version of SpectX. There is no need to import the data anywhere (queries run on the raw log files) and there are no limits on volume.

Once installed, copy this pattern and query to your SpectX query window. To use your own data, replace the path to our Win DNS debug log sample in the LIST() call. The pattern transforms the raw records into an analysable and typified format.
The hack with the weird domain name format is this: we’ve added the labels to an array, skipping the integers and their brackets (the ARRAY block at the end of the pattern). Then, in the query, we’ve used SpectX's ARRAY_JOIN function to join the elements back into one string separated by dots (the .select step).

$pattern = <<<EOP
TIMESTAMP('dd.MM.yyyy HH:mm:ss'):dateTime ' ' //extract the date and time fields as the TIMESTAMP into a column named dateTime
HEXINT:threadId ' ' //extract the hex numerical thread id as an integer value
UPPER:context ' '+ //extract the uppercase context field as a string
HEXLONG:packetId ' ' //extract the hex numerical internal packet identifier as a long value
UPPER:protocol ' ' //extract the uppercase UDP/TCP indicator as a string
LD:sndRcv ' ' //extract the Send/Receive indicator as a string
IPADDR:remoteIp ' '+ //extract the remote IP as an IPADDR-type value
HEXLONG:xid ' ' //extract the Xid hex numerical field as a long value
ENUM{'R'=0, ' '=1}:isQuery ' ' //map the Query/Response values to 0/1 integers (a true/false condition in queries)
[QNU?]:opCode ' [' //extract the Opcode field values as a string
HEXINT:flagsHex ' ' //extract the Flags hex numerical field as an integer value
('A':authoritative_answer | ' ') //map the Flags char codes fields to separate string columns
('T':truncated_response | ' ') ('D':recursion_desired | ' ') ('R':recursion_available | ' ') ' '+
LD:responseCode '] ' //extract the ResponseCode field as a string
UPPER:questionType ' '+ //extract the Question Type uppercase field as a string
ARRAY{ //extract the Question Name field with the domain name labels into an ARRAY:
  '('INT')' //an integer enclosed in brackets precedes each label
  LD:label >>('('INT')') //extract the label as a string until the next integer enclosed in brackets
}:dNameArray //name the array dNameArray, the column used in the query below
'(0)' //the Question Name field is terminated by a zero enclosed in brackets
WINEOL{1,2} //the record is terminated by one or two CRLF line breaks
EOP;

@list   = LIST(''); //replace the path with your own if needed
@stream = PARSE(pattern:$pattern, src:@list);

@stream
  .select(dateTime, remoteIp, questionName:ARRAY_JOIN(dNameArray, '.'));

For each record in the sample data, the query returns the timestamp, the source IP and the normalised target domain.

Having the data cleaned and typified opens up a whole new world: analysing the records. Comparing hostnames, looking at DNS request frequencies or even just the entropy of the queried names can help you discover anomalies and malicious activity. Stay tuned for the next post (follow us on Twitter: @spectxlab).
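
As a taste of that kind of analysis, the following Python sketch counts how often each name is requested and computes the Shannon entropy of its characters. The list of names is made-up sample data, and shannon_entropy is a helper written for this example rather than anything provided by SpectX.

```python
import math
from collections import Counter


def shannon_entropy(name):
    """Entropy in bits per character; 0.0 for an empty string."""
    if not name:
        return 0.0
    counts = Counter(name)
    total = len(name)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Hypothetical, already normalised question names.
names = ["www.example.com", "www.example.com", "x7f3kq9zt1.example.com"]

frequency = Counter(names)
for name, hits in frequency.most_common():
    print(f"{hits:>3}  {shannon_entropy(name):.2f}  {name}")
```

Random-looking names tend to score higher than ordinary hostnames, which makes entropy a cheap first filter. It is a hint for further investigation, not proof of malicious activity.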
