Scale By the Bay

cloud

Thursday, October 28, 2021

- PDT
Proxies, Gateways, and Meshes: Cloud Connectivity 101 for Developers
Viktor Gamov
Kong, Principal Developer Advocate

API gateway technology has evolved a lot in the past decade, capturing more prominent and more comprehensive use cases in what the industry calls “full lifecycle API management.” API gateways used to be just the management layer of the network runtime that lets us expose and consume APIs (RESTful or not) and secure and govern our API traffic. Today, however, they provide a range of functionality to support the complete development cycle, including creation, testing, documentation, monitoring, monetization, and overall exposure of our APIs. Then, around 2017, another pattern emerged from the industry: service mesh! A service mesh is an infrastructure layer for microservice communication. It abstracts the underlying network details and provides discovery, routing, and a variety of other functionality. Many have attempted to describe the differences between gateways and service meshes, e.g., API gateways are for north-south traffic and service meshes are for east-west traffic. I want to illustrate the differences between API gateways and service meshes, and when to use one or the other, pragmatically and objectively. This talk will also discuss the similarities and differences between the communication layers provided by gateways and service meshes.

- PDT
Funnel Rocket: A Serverless Query Engine
Elad Rosenheim
Stackbit, Developer Growth

Funnel Rocket is a newly released open-source engine for running serverless big data queries at scale - quickly and cheaply.

It builds on robust foundations: AWS Lambda, Pandas, Apache Arrow and Redis, which are used in a variety of interesting ways to facilitate fast, highly distributed jobs with minimal orchestration logic.

Funnel Rocket is currently purpose-built for a specific type of query: from a sea of raw data, find users who've matched multiple conditions - optionally in a specific sequence. However, I believe there's a big potential for cloud-native big data tools: composing existing building blocks by cloud providers + the OSS community to create lightweight, fast-scaling and cost-efficient solutions.

- PDT
Battle-tested event-driven patterns for your microservices architecture
Natan Silnitsky
Wix.com, Backend Infra Developer

During the past couple of years I’ve implemented or have witnessed implementations of several key patterns of event-driven messaging designs on top of Kafka that have facilitated creating a robust distributed microservices system at Wix that can easily handle increasing traffic and storage needs with many different use-cases. 

In this talk I will share these patterns with you, including:
* Consume and Project (data decoupling)
* End-to-end Events (Kafka+websockets)
* In-memory KV stores (consume and query with 0-latency)
* Events transactions (Exactly Once Delivery)

- PDT
Top new CNCF projects to look out for
Annie Talvasto
CAST AI, Product Marketing Manager

The Cloud Native Computing Foundation (CNCF) brought you such fan favorites as Kubernetes & Prometheus. In this talk Annie Talvasto will introduce you to the most interesting and coolest upcoming CNCF tools and projects.

This compact and demo-filled talk will give you ideas and inspiration so that you can 1) discover new technologies and tools to use in your future projects and 2) be the coolest kid on the block by being up to date with the latest and greatest.

- PDT
A Journey of Migrating Data Platform from Data Center to Cloud
Lei Gao
Workday, Engineering Manager (Data Platform)

We have built a data platform that has powered thousands of internal analytics use cases over the last four years. Facing growing challenges from both business growth and cost effectiveness, the team decided to move the data platform from our data center to the public cloud. I will share the experiences and lessons we learned during the migration, such as modernizing the platform by leveraging cloud-native services and reducing cost with elastic cloud provisioning.

- PDT
Prepare Your System To Scale (OR Why Auto-Scaling Is Not Enough)
Eynav Mass
Oribi, VP R&D

Scale. We tend to dismiss it as just a buzzword, until we reach the point when it has taken over our engineering team’s nights, weekends and thoughts. Scaling is usually the negative effect of positive business growth. The question is, how does one best prepare for scale? How can we proactively prepare our systems to handle a wishful, yet expected, future load? In this session, I will share some considerations that are not often discussed, yet are not trivial when you begin to think about preparing for scaling.

- PDT
Making Kubernetes How We Build Things
Rob Richardson
Cyral, Developer Advocate

It's day 2. The corporate k8s cluster is humming. Everything works perfectly in a local environment, but how do you connect the wires? Your first few steps in Kubernetes may feel like walking through uncharted territory. Yet, several tools can make you just as productive as you were in your comfortable local setup. With only a few changes in your configuration, you can automatically rebuild clusters when a file changes and even debug software running in containers. Add this to some visualization tools and some templating software, and you'll be back on track very quickly. In this talk, you’ll learn how to use some open source tooling available around the Kubernetes ecosystem to become more productive and optimize for developer joy.

- PDT
The Incremental ETL Architecture
John O'Dwyer
Databricks, Developer Advocate

Incremental ETL in a conventional data warehouse has been possible for some time, but scale, cost, accounting for state, and the lack of access for machine learning make it less than ideal. Until now, incremental ETL in a data lake has not been possible due to factors such as updating data and identifying changed data in a big data table. Incremental ETL also makes the medallion table architecture possible and efficient, so that all consumers of data can have the correct curated data sets for their needs. We will discuss the advances in big data technology that make incremental ETL possible, as well as the architecture as a whole.

- PDT
Building Complete State-of-the-art Natural Language Processing Projects with Free Software
David Talby
John Snow Labs, CTO

This session introduces three free software tools built on top of the Spark NLP library. They enable data science teams to deliver end-to-end natural language processing projects, with state-of-the-art accuracy, much faster than what was possible just a year ago. First, we'll cover the recent release of Python's NLU library, which provides training & inference over thousands of NLP models with one line of code. Second, we'll cover the Annotation Lab, which enables business domain experts to train & tune models using active learning and transfer learning, with support for enterprise-grade security, versioning, audit, and analytics. Third, we'll cover the NLP Server, which provides no-code access via a simple UI and API to the entire models hub.

Friday, October 29, 2021

- PDT
Developing Declarative Backends
Ramnivas Laddad
Paya Labs, Inc., Co-founder

Backend development is often saddled with repetitive tasks that not only add to development effort, but also lead to fragile systems with scattered implementations for authorization, data access, and integration with external services. What if we take a declarative approach where we define models, relationships between entities, access rules, and external services? Such a system would be easy to express, examine, and extend. We can then build an API within a few minutes and have confidence that it will meet our current and future needs.

In this talk, we will explore how to build such a system, focusing on considerations such as the language for expressing models and associated access rules, the definition of external services and their interactions with the rest of the system, and deployment. We will also discuss what it takes to expose APIs such as GraphQL and REST. Finally, we will examine special considerations in implementing such a system in Rust.

- PDT
Reproducible Machine Learning at Scale
Lukas Biewald
Weights & Biases, Co-founder, CEO

How can we support effective, reproducible, and explainable deep learning and coordination across practitioners? In this talk, Lukas Biewald will share best practices for conducting, debugging, and sharing deep learning experiments at scale. He will talk through how some of the best tech companies in the world use the Weights & Biases platform for managing datasets, debugging models, versioning training/evaluation recipes, extracting insights, and storing all the crucial details needed to make their models reproducible and their research collaborative.

- PDT
Live coding session with James Douglas
James Douglas
Reonomy, Director of Data Engineering

Let's live-code an effect system in Scala.  We'll begin by describing an effectful program that we would like to write as a pure value.  Then we will write a toy effect system to be able to run it.  For our effect system, we will choose from monad transformers, free monads, tagless final, or reader monads.

- PDT
What the heck is gevent?
Shiv Toolsidass
Lyft, Senior Software Engineer
Roy Williams
Lyft, Principal Tech Lead Manager

Python is known to be fun to write but has well-known shortcomings with concurrency. A large chunk of Lyft's backend is powered by Python microservices that use a concurrency library called gevent. Gevent can be incredibly powerful, but that power comes with tradeoffs. Understanding these tradeoffs has allowed Lyft to scale Python well in production. In this talk, we'll share these learnings, which are applicable to just about any event-loop-based framework (e.g., Node.js).

- PDT
How We Built SQL Rollups on Streaming Data
Tudor Bosman
Rockset, Chief Architect
Karen Li
Rockset, Software Engineer

Rockset is a real-time analytics database for serving fast search and analytics at scale. We built SQL rollups in Rockset that can pre-aggregate data from streaming sources, like Apache Kafka and Amazon Kinesis. Using rollups can improve storage efficiency and query performance. Over the course of this project, we encountered multiple challenges in building SQL rollups on streaming data, including:
* supporting exactly-once write semantics
* executing SQL on streaming data at ingest time
* processing out-of-order arrivals correctly
In this talk, we discuss how we overcame these technical challenges to implement rollups.

- PDT
10 Things You Should Know About Quantum Computing and Machine Learning
Chris Fregly
AWS, Principal Engineer

Based on nature's operating system of quantum mechanics, quantum computing offers massive parallelism and compute power well beyond today's supercomputers. In this talk, I demystify the fields of quantum mechanics, quantum computing, and quantum machine learning. I will discuss the fundamentals of quantum computing, introduce various quantum machine learning algorithms, and explain how hybrid quantum-classical algorithms work together to solve today's problems.

- PDT
The intersection of Quantum Computing and Distributed Systems: Observations and lessons learned.
Karl Wehden
IBM, Program Director for the Quantum Cloud

The dawn of quantum computing is upon us, and we are marching inexorably towards a new mode of computing altogether. 

Or are we?  

What tells us that the integration of quantum and distributed (classical) computing needs to follow a curve that has just been pasted forward from the 1940s to the 2020s? Nothing. Let's look at what is similar, what is different, and some of the surprises we have encountered in our journey to quantum advantage from a software perspective.