Smashing logs into Papertrail
• Mark Eschbach
I’ve got the entire pipeline integrated and verified that messages are being properly sent across the network. Unfortunately, the messages are mysteriously disappearing somewhere on the Papertrail side. Time to roll up my sleeves and start reading.
Search for the protocol details
According to the Papertrail documentation, they should accept anything compliant with RFC5424 and RFC3164. Starting with RFC5424, section 6 defines the format of the message. Perhaps this is where something is going wrong? Yup, looks like we are missing a version field. After looking at this I’m wondering if I shouldn’t break out the generation and parsing into its own library. This isn’t a simple subject, and it isn’t really a core responsibility of the SwiftPaperTrail CocoaPod.
To reduce effort I’ll begin by building the components within the Pod, then look at breaking them out. Exercising restraint here! To create valid assertions I need to be able to parse the messages. I’ll start by building the tests for the parser, then the parser itself. Now to find an example message. I think Papertrail has some. Yup, it looks like we are using the older RFC3164 format. So I’m modifying the plan to first build out tests and a parser for RFC3164.
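For comparison, here is a rough sketch of how the two wire formats differ. The function names are hypothetical and not part of SwiftPaperTrail, and the sketch assumes the optional RFC5424 fields (PROCID, MSGID, structured data) are all left as the hyphen nil value.

```swift
// Hypothetical helper, not from SwiftPaperTrail: builds an RFC5424 line.
func rfc5424Line(priority: Int, timestamp: String, host: String,
                 app: String, message: String) -> String {
    // VERSION follows the closing bracket of PRI directly, with no space.
    let version = 1
    // PROCID, MSGID and STRUCTURED-DATA are each sent as "-" in this sketch.
    return "<\(priority)>\(version) \(timestamp) \(host) \(app) - - - \(message)"
}

// Hypothetical helper: the older RFC3164 layout has no version field.
// The "tag:" prefix on the message is common practice, not a requirement.
func rfc3164Line(priority: Int, timestamp: String, host: String,
                 tag: String, message: String) -> String {
    return "<\(priority)>\(timestamp) \(host) \(tag): \(message)"
}
```

The missing version field is the quickest way to tell the two apart: in RFC5424 the character after the right bracket is a digit, while in RFC3164 it is the first letter of a month name.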
Sane parsing in Swift 3
String parsing is always an interesting problem. The first example text is <22>sendername: the log message. The first component is the priority field, so I’ll extract that. I verify the message meets expectations by checking that the first character is a <. String parsing in Swift 3 is a pain, but StackOverflow provides a good hint on how to deal with the first left bracket. Now to be concerned with String searching. The priority is 1-3 digits, so the right bracket sits 2-4 characters to the right of the left bracket.
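The bracket check can be sketched without any Foundation classes. This is only an illustration: leadingPriority is a hypothetical name, and the code is written against current Swift string APIs rather than the Swift 3 ones I was fighting with.

```swift
// Hypothetical helper: pulls the numeric priority out of a leading "<NNN>".
func leadingPriority(of line: String) -> Int? {
    // The message must open with the left bracket.
    guard line.first == "<" else { return nil }
    let asciiDigits: ClosedRange<Character> = "0"..."9"
    // Collect the run of digits that follows the left bracket.
    let digits = line.dropFirst().prefix(while: { asciiDigits.contains($0) })
    // A priority is one to three digits long.
    guard (1...3).contains(digits.count) else { return nil }
    // The character right after the digits must be the right bracket.
    guard line.dropFirst(1 + digits.count).first == ">" else { return nil }
    return Int(digits)
}
```

Anything that fails one of the guards comes back as nil, which keeps the caller's error handling down to a single optional check.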
I ran into NSScanner. Looking at the documentation it feels like they never committed to cleaning it up. Perhaps the class will disappear in the future? That would be sad, as string parsing is already difficult enough. Too bad I don’t have Prolog here :smile:. I’d get that done in a jiffy…with only a few infinite loops! NSHipster to the rescue with some details on the usage! Hmm, raison d’être is not a term I’ve encountered before. According to Google it means ‘the most important reason or purpose for someone or something’s existence’.
On to my reason for investigating this. I’m hoping the usage of the class is similar, although the article was written for the Swift 1 series. Well, that worked relatively well. I used the following extensions to the Scanner class to make the code easier to read:
extension Scanner {
    // Consumes characters drawn from `what` and reports whether exactly `what` was read.
    func verifyConstant(character what: String) -> Bool {
        let set = CharacterSet(charactersIn: what)
        var string: NSString?
        self.scanCharacters(from: set, into: &string)
        // Comparing optionals avoids the force cast; nil never equals `what`.
        return (string as String?) == what
    }

    // Returns the text before `what`, then consumes `what` itself.
    func scanUp(to what: String) -> String? {
        var value: NSString?
        guard self.scanUpTo(what, into: &value), !self.isAtEnd else { return nil }
        guard verifyConstant(character: what) else { return nil }
        return value as String?
    }

    // Optional-returning wrapper around the pointer-based scanInt(_:).
    func scanInt() -> Int? {
        var value: Int = 0
        guard self.scanInt(&value) else {
            return nil
        }
        return value
    }

    // Returns the rest of the input, or nil if a newline stops the scan early.
    func remainder() -> String? {
        var value: NSString?
        guard scanUpToCharacters(from: CharacterSet.newlines, into: &value) else { return nil }
        guard isAtEnd else { return nil }
        return value as String?
    }
}
This wraps many of the methods to make them more Swift-like. The general pattern of the parsing algorithm is as follows:
guard let priority = scanner.scanInt() else { return nil }
guard let sender = scanner.scanUp(to: ":") else { return nil }
Parsing RFC 3164
Not the best design, but shorter to write than the alternatives. Well-written RFCs are great because they give examples of conversions of numeric codes and enumerations. In this case it’s the mapping of the syslog facility (bits 3-7) and severity (bits 0-2). I’m on the fence about using enumerations to represent the values. On one hand it would be much more complete; on the other I’m not sure if there is seamless bridging between the raw values and the enumerations. I must say I really dig the get, set, didSet, etc. features of Swift. I didn’t use them too often in my prior C# experience, but this is awesome.
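As a rough sketch of the enumeration option: Swift enums with Int raw values get a failable init(rawValue:) for free, which covers the bridging worry. The type and function names below are hypothetical, and only a handful of the facilities from the RFC table are listed to keep it short.

```swift
// Severity lives in bits 0-2; the raw values are the wire values.
enum Severity: Int {
    case emergency = 0, alert, critical, error, warning, notice, informational, debug
}

// Facility lives in bits 3-7. Only the first few codes are listed here.
enum Facility: Int {
    case kernel = 0, user, mail, daemon, auth
}

// Hypothetical helper: splits a priority value into its two parts.
// Returns nil when either part has no matching case in the enums above.
func decode(priority: Int) -> (facility: Facility, severity: Severity)? {
    guard let facility = Facility(rawValue: priority >> 3),
          let severity = Severity(rawValue: priority & 0b111) else { return nil }
    return (facility, severity)
}

// Hypothetical helper: the inverse, facility * 8 + severity.
func encode(facility: Facility, severity: Severity) -> Int {
    return (facility.rawValue << 3) | severity.rawValue
}
```

With this, the <22> from the example message decodes to the mail facility with informational severity.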
Examining the protocols a little closer, RFC 3164 has no delimiter between the header and the message. It looks like the common practice of using the process name and a colon is just that, not a standard. That makes my parsing algorithms a little different. The sane default is for a relay to treat any message without a valid priority as coming from the user-message facility with the severity notice. Dates and hosts are required, with the expectation that the processing entity will fix any missing fields. This quickly becomes a problem while attempting to figure out the difference between the header and the message section of a packet. If they used a simple null terminator or a character excluded from the header, this would really make life easier. The truth is I would have to manually parse and verify the date, which isn’t hard but would take about an hour to do correctly. Either way we aren’t actually sending RFC 3164 packets, so I think it’s time to move on to the other format.
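For what it's worth, the date check I skipped might look something like the sketch below. It is not what the Pod does. It assumes the RFC 3164 timestamp layout of Mmm dd hh:mm:ss with a space-padded day, and both function names are hypothetical.

```swift
let monthNames = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

// Hypothetical helper: reads a fixed-width number, optionally space padded.
func fixedWidthNumber(_ slice: ArraySlice<Character>, spacePadded: Bool = false) -> Int? {
    var digits = Array(slice)
    if spacePadded, digits.first == " " { digits.removeFirst() }
    guard !digits.isEmpty, digits.allSatisfy({ $0 >= "0" && $0 <= "9" }) else { return nil }
    return Int(String(digits))
}

// Hypothetical helper: true when the text begins with "Mmm dd hh:mm:ss".
func startsWithBSDTimestamp(_ text: String) -> Bool {
    let chars = Array(text.prefix(15))
    guard chars.count == 15 else { return false }
    guard monthNames.contains(String(chars[0..<3])) else { return false }
    // The separators sit at fixed offsets inside the 15 character field.
    guard chars[3] == " ", chars[6] == " ", chars[9] == ":", chars[12] == ":" else { return false }
    guard let day = fixedWidthNumber(chars[4..<6], spacePadded: true),
          let hour = fixedWidthNumber(chars[7..<9]),
          let minute = fixedWidthNumber(chars[10..<12]),
          let second = fixedWidthNumber(chars[13..<15]) else { return false }
    // Range checks only; this does not verify the day against the month.
    return (1...31).contains(day) && hour < 24 && minute < 60 && second < 60
}
```

When the check fails, everything after the priority would be treated as the message and the header filled in by the receiver, which matches how the example <22>sendername: the log message has to be read.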
Parsing RFC 5424
So far, reusing the same techniques for parsing the facility and severity is working well. To continue running with the examples I’ll need to create a custom date. Easy enough using the following technique:
var components = DateComponents()
components.calendar = Calendar.current
components.year = 2003
components.month = 10
components.day = 11
components.hour = 22
components.minute = 14
components.second = 15
components.nanosecond = 3 * 1000000 // 3 milliseconds, expressed in nanoseconds
let when = components.date
Now to examine how the dates are to be parsed. For RFC 5424 they can be a subset of RFC3339 (section 6.2.3) or a hyphen to indicate a nil value. The test example I’m using is a single word, so I’ll go with the default. I ended up with the format yyyy-MM-dd'T'HH:mm:ss.SSSZ. There was an annoying discrepancy of 15 nanoseconds between the date I parsed and the one I generated. To resolve this I verified the difference was below a millisecond, then discarded the example nanoseconds. Much thanks to stinger at SwiftyThoughts for his article on dealing with dates.
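A minimal sketch of that approach follows. The format string is the one above; the function names and the fixed en_US_POSIX locale are my own assumptions for the illustration, not something the RFC or the Pod dictates.

```swift
import Foundation

// Hypothetical helper: parses the TIMESTAMP field of an RFC 5424 header.
func parseTimestamp(_ field: String) -> Date? {
    // A lone hyphen is the nil value: the sender supplied no time.
    guard field != "-" else { return nil }
    let formatter = DateFormatter()
    // Assumption: a fixed locale so user settings cannot alter the parsing.
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.dateFormat = "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
    return formatter.date(from: field)
}

// Compares two dates with a tolerance instead of exact equality, since
// fractional seconds rarely survive a round trip bit for bit.
func roughlyEqual(_ a: Date, _ b: Date, tolerance: TimeInterval = 0.001) -> Bool {
    return abs(a.timeIntervalSince(b)) < tolerance
}
```

In a test, the parsed date and the one built from DateComponents can then be compared with roughlyEqual rather than ==, which sidesteps the stray nanoseconds.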
Time to close out for today. Overall I’m happy with my progress. I’ve built out a parser for part of RFC3339, plus a parser for most of RFC3164, and one for RFC5424 minus structured data. On Friday it will be exciting to return, verify we’re producing the correct format, and clean up the generators.