All logs are not created equal Part 1

Author: 
Pramod Sridharamurthy
Thursday, October 5, 2017

Let’s assume you are looking for a log analytics solution for your machine data and you start your research by looking for solutions. If you search “Log Analytics”, you will notice tens of open source and commercial offerings out there that provide a solution in this space. As a buyer, it is easy to assume that any log analytics solution can deal with all types of logs and log formats are similar.

But are all logs similar?

In this three-part series, I will show examples of how varied the log formats can be and also discuss about the variety in the packaging of these logs at the device before being transmitted. Understanding the variety of log formats will help when looking for solutions that can handle your logs.

At a high level, logs can be classified into two broad buckets:

  • Time series logs
  • Non-time series logs

Time series Log

A time-series log is the most common variety that most people think of when they hear the word log. A time series log means that every log line or event is tagged with a timestamp of when an event occurred. A time series log, even though simple in structure, is available in multiple formats. While these are not comprehensive, here are some examples of time-series logs.

Standard Event Log

Feb 28 20:02:04.107 2013 sjc03-c01-rtr02 SJC03-C01-RTR02: last message repeated 117 times

Multi-line Event Log

These are events that have multiple lines of data for a single timestamp. Typical examples are Java exception messages.

016-08-18 00:00:12,759 ERROR [commonScheduleQueue.SchedulingThread] com.ptc.wa.propertiesFile.GlassbeamPropertiesPlugin Administrator - Error writing Property files to storage

java.lang.NullPointerException: Local JMX URL string cannot be null here.

at wt.util.jmx.JmxConnectUtil.getLocalConnector(JmxConnectUtil.java:138)

at wt.util.jmx.JmxConnectUtil.getLocalConnector(JmxConnectUtil.java:98)

at wt.util.jmx.JmxConnectUtil.getDefaultServerManagerLocalConnector(JmxConnectUtil.java:86)

at com.ptc.wa.propertiesFile.WindchillPropertiesHelper.collectClusterProperties(WindchillPropertiesHelper.java:70)

at com.ptc.wa.propertiesFile.WindchillPropertiesPlugin.collect(WindchillPropertiesPlugin.java:89)

at com.ptc.wa.create.WAReportCreationManager.generateReport(WAReportCreationManager.java:226)

at com.ptc.wa.create.WAReportCreationManager.generateReportFromQueue(WAReportCreationManager.java:128)

Syslog

This is a well-defined, but customizable log format, very commonly used by most logging systems.

May 11 10:18:22 scrooge Web-Requests: May 11 10:18:22 @IunAIir1----7k-- EVENT_WR-Y-attack-600 SG_child[823]: [event.error] Possible attack - 5 blocked requests within 120 seconds

Regular event log without the year component

There are devices that generate events where the date component doesn’t have year in it. This requires additional logic to decide which year to consider and also handle events in year boundaries. If the events are streamed in real time, then it is easy to attach the current year to it, but since most devices send logs in batches and these batched log data will have events from present to many years behind, this can get very tricky.

May 5 17:18:40 stm[7595]: <330001> |stm| sapm_proc_hello_req: netmask N/A

Event log represented as JSON

This is an example of an event in JSON format that is easy to parse but difficult for humans to read in raw format.

{"ts":"2017-06-22T04:08:33.364","pid":900,"tid":"b94","sev":"info","req":"-","sess":"-","site":"-","user":"-","k":"msg","v":"Resource Manager: CPU info: 0% (current process)"}

Event log represented as XML

Similar to JSON, an event can be represented as XML.

<System>
<Provider Name='Microsoft-Windows’ Guid='{54429625-5478-4654-7E3B0218C30D}'/>
<EventID>4675 </EventID>
<Version>0</Version>
<Level>0</Level>
<Task>12388</Task>
<Opcode>0</Opcode>
<Keywords>0x8020006700000000</Keywords>
<TimeCreated SystemTime='2017-04-19T12:19:03.280843700Z'/>
<EventRecordID>5465</EventRecordID>
<Correlation/>
<Execution ProcessID='532' ThreadID='875'/>
<Channel>Security</Channel>
<Security/>
</System>

Event data from a sensor ( IoT)

This is an example of an event from a set of sensors, each logging their state, one per line

floor_1,BuildingMonitor,speed_rpm,1245.1255255463038,Sat Feb 14 20:04:01 IST 2015,analog

floor_1,BuildingMonitor,flow_air,53.97129530940758,Sat Feb 14 20:08:17 IST 2015,analog

 

An event represented as a cluster of events

This is an interesting use case of events, where a cluster of events represent one logical event.

In the below example, while each event has its own significance and has to be dealt with individually, there is also a need to handle the whole cluster of events from the ‘Starting Exam’ till ‘Ending Exam’ as highlighted as a single entity. So, in this case, there is a need to parse the below log data both as individual events and also the whole as a single event.

2016/01/19 10:38:00.033; Debug ; EchoVGU(3512); Starting Exam, Old/New ExamId=1/1;

2016/01/19 10:38:00.136; Info ; EchoProtocol(3512); HandleMsg(ExamDataChanged): received a ExamDataChanged message, data=All; 35059;4f34;ProtocolManager.cpp(2946) 2016/01/19 10:38:00.137; Warning; EchoProtocol(3512); ClearAllImageData(): m_protocolTemplate->GetNumRows() == 0;

35060;6d36;ProtocolManager.cpp(7434) 2016/01/19 10:38:00.142; Info ; CProtocolManager::HandleMsg(3512); ExamDataChanged, rotocol Screen or Live Protocol Mode not active;35061;2870;ProtocolManager.cpp(2991)

2016/01/19 10:38:00.142; Warning; EchoProtocol(3512); ClearAllImageData(): m_protocolTemplate->GetNumRows() == 0; 35062;6d36;ProtocolManager.cpp(7434) 2016/01/19 10:38:00.142; Warning; Manager(3512); ** Id(0000000021CD7510) Manager:ManagerImplementation::InitializeWindowViewPort: View portrectangle is empty.;35063;0583;ManagerImplementation.cpp(1658) 2016/01/19 10:38:00.292; Debug ; EchoVGU(3512) Ending Exam

Handling duplicate data

While I will cover the challenges with packaging of logs in part 3 of this series, what I would like to highlight here is that, most devices that are sending log data in batch mode, typically run a bunch of commands on the device, collate all the logs generated till date, package and send. This ends up sending historical log data for time series logs that overlap. That is, if the batched log received from a device on 06th Sep 2017 has data from say 1st Jan 2017 till 06th Sep 2017, then the next batched log from the same device received on say 10th Sep 2017, might have data from another log from 1st Jan 2017 till 10th Sep 2017. That is, apart from the new data created between 6th Sep and 10th Sep, the device might also send duplicate data sent before, i.e. from 1st Jan 2017 till 6th Sep 2017. While this is a packaging issue, changing packaging behaviour of devices already deployed in thousands in the field, will not be easy and hence the log analytics solution, needs to handle it.

As you can see from above examples, along with the variety in the log formats, there are also cases of partial data, duplicate data and unique processing requirements. So, when looking at log analytics for your machine logs, it is not only important to check the capability of parsing all formats, but it is also important to think about how this data will be used and if it imposes additional processing needs that the log analytics platform has to handle.