Most operations teams will agree: without runtime information such as log data and metrics, you are effectively blind in production. Even when such data is available, it is very difficult to make sense of your metrics and draw accurate conclusions unless events from different sources are correlated. With the rise of microservice architectures, which break monoliths into zoos of distributed services, standard approaches based on local log files and basic runtime metrics are no longer sufficient. Instead, there is a growing need for centralized log and metrics collection that allows aggregating, correlating and visualizing any kind of runtime event.
In this session, we will present both the conceptual foundation and best practices for monitoring a decentralized, distributed application landscape. Starting from an example setup with no monitoring in place, we will show how log data and runtime metrics can be captured, collected, evaluated and visualized in order to improve the operational experience and the ability to respond to problems early.