CloudWorld PRO Stage D
Monday, February 7, 2022
Cloud-native applications today are increasingly complex and therefore increasingly hard to understand. It’s critical to connect decisions around resource allocation and architecture to business metrics such as end-user latency, but very difficult to do in practice. Ultimately, understanding how your systems behave and why is a data analytics problem. Like most data analytics problems, the trick is in collecting and wrangling the right data sources. In this talk, you will learn how Pixie, an open-source observability platform for Kubernetes, can be used to painlessly turn low-level telemetry data into high-level signals about system health. The talk will also show how these high-level signals can be used as input to infrastructure workloads such as CI/CD and load balancing in order to improve their performance.
All the unit tests in the world and the largest QA team still can’t stop bugs from slithering into production. With a distributed microservice architecture, debugging becomes much harder, especially across language and machine boundaries. APMs and logs have limits.
Production bugs are the WORST bugs. They got through unit tests, integration tests, QA and staging… They are the spores of software engineering. Yet the only tools most of us use to attack that vermin are quaint little log files and APMs. We cross our fingers and put on the Sherlock Holmes hat, hoping that maybe that bug has somehow made it into the log… When it isn’t there, our only remedy is guesswork and more logging. That in turn bogs down performance for everyone, makes the logs damn near unreadable and can literally cost millions in fees. But we have no choice other than crossing our fingers and going through CI/CD again and again until we find it.
There are better ways. With modern continuous observability tools, we can follow a specific process as it goes through several different microservices and “step into” it as if we were using a local debugger, without interrupting the server flow. In this session I will demonstrate such an approach.
We use an all-Apache stack of Apache Flink, Apache Pulsar and Apache NiFi for rapid data lake population and querying. We can quickly stream data to and from any data lake, lakehouse, database or data mart, regardless of cloud or size. FLiP allows Java and Python developers to build scalable solutions that span messaging and streaming in a cloud-native fashion with full monitoring.