AlloyDB: Revolutionizing PostgreSQL

Deep dive into the latest innovations in AlloyDB: The new way to PostgreSQL

Estimated read time: 1:20

Summary

In this presentation by Google Cloud Events, the latest innovations in AlloyDB are explored, highlighting its transformative impact on PostgreSQL. The session, featuring speakers from Google and industry partners, covers AlloyDB's superior price performance, its architecture that separates the compute and storage layers, and its security and high-availability features. The discussion also includes new updates such as AlloyDB's integration with Google's AI technologies, offering advanced vector search capabilities, and the introduction of managed connection pooling. Customer experiences underscore AlloyDB's real-world applicability and its advantages over traditional self-managed systems.

Highlights

• AlloyDB offers twice the price performance of self-managed Postgres, significantly reducing operational costs. 💸
• Its architecture separates compute from storage for greater efficiency. 🔍
• AlloyDB Omni provides a downloadable on-premises version for versatile deployment, rare among cloud databases. 🏠
• Near-zero downtime is crucial for mission-critical workloads. ⏱️
• New AI tools for anomaly detection and vector search capabilities boost analytics. 📊
• Managed connection pooling absorbs connection spikes, ensuring robust performance. 🌐

Key Takeaways

• AlloyDB is setting new benchmarks in database performance and price efficiency! 🏆
• Google is merging the best of open-source PostgreSQL with its cloud prowess, creating AlloyDB. 🌩️
• AlloyDB's unique architecture separates compute from storage, boosting efficiency! 🔧
• Security, availability, and near-zero downtime make AlloyDB a top choice for critical applications. 🔐
• AlloyDB Omni brings these innovations to any environment, be it on-prem, cloud, or hybrid! 🌐
• Managed connection pooling and AI integration enhance performance and user experience. 🚀
• Customers are saving costs and achieving higher performance effortlessly with AlloyDB. 💰

Overview

Google Cloud Events presents the latest advancements in AlloyDB, spotlighting its performance gains and architectural innovations. The combination of Google's cloud capabilities with open-source PostgreSQL shows up in AlloyDB's price efficiency: twice the price performance of self-managed Postgres options.

AlloyDB's design decouples compute from storage, enhancing operational efficiency and flexibility. Security and near-zero downtime are core features, making it well suited to critical business applications. The presentation also introduces AlloyDB Omni, which enables deployment across environments including on-premises, a rarity among cloud databases.

AlloyDB now also includes AI integrations for data management and analytics, such as vector search, along with managed connection pooling. Real-world case studies reveal how businesses are leveraging these capabilities to boost efficiency, reduce costs, and streamline operations, affirming AlloyDB as a significant tool in the data management landscape.

Chapters

• 00:00 - 00:30: Introduction and Overview. The introduction sets a light-hearted tone, promising engaging content before the lunch break. The speaker, GitHan, product manager for AlloyDB and other services, is joined by Ravi Morti, highlighted as the foundational creator of the product. The chapter sets up what the audience can expect and acknowledges key individuals behind the product's development.
• 00:30 - 01:00: Journey and Goals of AlloyDB. Representatives of Manhattan Associates and Intesa Sanpaolo will share their journey with AlloyDB. The chapter asks why the world needed another database and describes a six-year effort to build the world's best Postgres, acknowledging that the definition of 'best' varies by perspective.
• 01:00 - 03:00: Features and Innovations. AlloyDB is presented as the most price performant Postgres database, claiming two times better price performance than self-managed databases: moving workloads from self-managed Postgres on any IaaS to AlloyDB saves money, a feat claimed to be unprecedented in the history of open source.
• 03:00 - 04:00: Performance and Price Efficiency. This chapter focuses on high availability for mission-critical and business-critical workloads: AlloyDB ships with a default 99.99% SLA and an RPO of zero, fast failover, managed recovery, and near-zero-downtime updates, capabilities typically found in commercial databases, now brought to Postgres. It closes by emphasizing the importance of security.
• 04:00 - 06:00: Architecture and New Announcements. Google's security capabilities are combined with AlloyDB to support mission-critical business applications, fusing Postgres's open community and developer momentum with Google's expertise: 'the best of Google' and the best of open source.
• 06:00 - 15:30: Customer Experiences with AlloyDB. One notable move, rare among cloud providers, is the release of an on-premises version of a cloud database. This was done to give customers flexibility and the benefits of both open source and the cloud, and it led to AlloyDB Omni, a downloadable version that meets customers where they are. Ravi gives further details.
• 15:30 - 17:30: Vision and Future Plans. AlloyDB Omni now runs across platforms including Azure, AWS, on-premises systems, laptops, and Google Distributed Cloud in disconnected mode, and a new partnership with Aiven provides managed services on other clouds. Ravi is introduced to walk through AlloyDB's journey and the vision going forward.
• 17:30 - 18:00: Conclusion and Upcoming Sessions. Ravi concludes by reiterating the goal of building the world's best Postgres, expressing confidence in the progress made, and noting that AlloyDB has become a versatile database for diverse workloads, setting the stage for the facts and sessions to come.

Deep dive into the latest innovations in AlloyDB: The new way to PostgreSQL (Transcription)

• 00:00 - 00:30 [Music] So we'll keep it interesting so that you have the right appetite for the database insights as well as a good lunch. I'm GitHan, product manager for AlloyDB and a few other services. It's a pleasure to be joined by Ravi Morti, the man behind the product, who built it from scratch, and it's an absolute honor for us to have Sanjie
• 00:30 - 01:00 from Manhattan Associates and Jeppe from Intesa Sanpaolo to share their journey and their perspective on AlloyDB and what it has done for them. We started this journey almost six years ago, so we'll talk a little bit about what brought us here, why the world needed one more database, and why we built it. We went out on this journey six years ago with the intention to build the world's best Postgres. That was our goal. Now, how do you define what's best? Some of you will say
• 01:00 - 01:30 the best database has to be the most price performant. And that's where we said, okay, we built AlloyDB to be the most price performant Postgres out there. Right now, we offer two times better price performance than self-managed databases. So if you're self-managing Postgres on any IaaS and you move that workload to AlloyDB, you will save money. That has never happened before in the history of open source. Second, you will say, for a database to be a real
• 01:30 - 02:00 database serving mission-critical and business-critical workloads, high availability is extremely important. And we said, you're right. So by default, when you configure an AlloyDB database, you get a 99.99% SLA, always with RPO zero. We manage fast failover, we do everything needed for recovery of the database, and we give you near-zero-downtime updates, subsecond downtime updates previously only possible in commercial databases. We brought that to Postgres. Last but not least, security is the key aspect of all
• 02:00 - 02:30 mission-critical, business-critical applications, and we brought the goodness of Google security to AlloyDB and made sure you have all the capabilities you need to manage the database seamlessly, in a very secure manner. Our promise was: let's take the goodness of Postgres, the community it has, the openness it has, the momentum it has with developers, and forge it with what Google does best. We brought these two things together, which made the real alloy, the best of Google
• 02:30 - 03:00 and the best of open source, and that's what gives us the name AlloyDB. One very unique thing we did, which you will not see any other cloud provider doing, is release an on-premises version of a cloud database. Why would we do that? Our goal was to meet you where you are and provide you the best of these two worlds we are building for you. That was the birthplace of AlloyDB Omni, and you'll hear a lot about it from Ravi as well, in which we give you a downloadable
• 03:00 - 03:30 version of AlloyDB that runs everywhere. It runs in Azure, it runs in AWS, it runs on premises, it runs on a laptop, and it runs on Google Distributed Cloud in disconnected mode as well. And today we also announced a partnership with Aiven, so you can get a managed service running AlloyDB on those clouds as well. Very unique, one of a kind in the industry. How did we do this? No better person to tell this than Ravi. So I'll ask Ravi to come on over and take us through the journey of AlloyDB. Thanks [Applause]
• 03:30 - 04:00 Ravi. All right. As Gigi said, we set out with a very lofty goal of building the world's best Postgres, and I think today you'll hear the facts that prove the theory. Really, when we started off, we wanted to build a database that is for all of your workloads. And today AlloyDB is by far, as you will see in the rest of the
• 04:00 - 04:30 slides, the best database you can pick, not just for your transactional workloads, which are our bread and butter of course, but also for your OLAP, reporting, and analytical workloads in an HTAP type of scenario, as well as the new, emerging set of applications around gen AI that need the highest level of vector search capabilities. So it's really a one-size-fits-all database. Before I get into the details,
• 04:30 - 05:00 just some headline numbers that you may have heard before, but worth looking at again. AlloyDB today provides four times higher transactional throughput compared to stock open-source Postgres, and this is on TPC-C benchmarks. We provide up to 100x better latency for analytical and reporting workloads, thanks to some deep innovation we've done in the Postgres engine. And last
• 05:00 - 05:30 but not least, when you're building gen AI applications, you want to store vectors and search them in the database. We provide unique indexing capabilities that give us four times faster vector search and 10 times faster write throughput with much lower memory consumption. Really fantastic results here. I just want to touch a bit on the price performance that Gigi spoke about. This is something that is very important for all customers. Thanks to the superior architecture
• 05:30 - 06:00 that we have built AlloyDB with, and I'll speak a little bit about that later, we are much more efficient at consuming the resources in the machine. So you get much higher performance; in this case, with 8 vCPUs, not only do you get higher performance than a 16 vCPU stock Postgres instance, you actually end up paying half the cost. Really fantastic price performance, and that is on top of
• 06:00 - 06:30 all the things you get from a managed service. So when you think about self-managing Postgres versus having managed AlloyDB, we think the choice is really clear. Before I dive deep into some of the recent work from the last year, just a reminder of how the architecture of AlloyDB sets us up to deliver all of these capabilities. It all starts with an architecture that
• 06:30 - 07:00 cleanly separates the compute layer of the database from the storage layer. You've got a primary compute, a standby compute, and any number of read instances, all connected to a very high performance storage layer built on proven Google technology at scale. What this allows us to provide is a very unique architecture for getting very cheap, fast read replicas as well as a very
• 07:00 - 07:30 highly available primary node for your database. All right. So now let me jump into some new and exciting things we are announcing this week. One huge announcement I'm very happy to share is that we've just launched, in preview, AlloyDB on our latest Axion processors. That's Google's ARM-based hardware, and AlloyDB running on Axion,
• 07:30 - 08:00 in benchmarks we just ran recently, delivers 3 million transactions per minute. Just as a headline number, it's an amazing accomplishment for us to provide that with a 100% Postgres-compatible engine. But beyond the 3 million transactions per minute, compared to our x86 architectures it's also 45% better price performance. And the
• 08:00 - 08:30 thing I really like to point out is that we also did some comparisons with our friends over at AWS, with Aurora running on Graviton 4, and what we see is three times better price performance compared to Aurora on Graviton. Really impressive numbers here. Another one I want to talk about is a new feature we recently announced in preview called
• 08:30 - 09:00 managed connection pooling. All customers at times run into issues with connection spikes and questions like, how do I handle connection load? With managed connection pooling, out of the box you get a fully managed connection pooler that not only improves the reliability of your database, letting it weather connection spikes and so on, but also
• 09:00 - 09:30 ends up improving your performance and latency, because you avoid creating heavyweight connections to the database. In the particular experiment shown in the graph, if you make direct connections to the database, it may top out at around 5,000 connections, but with managed connection pooling, transparently from the application's standpoint, you can scale to a much higher number of connections and
• 09:30 - 10:00 drive much higher throughput as well. This is available in preview this week. One of the areas where we've invested a lot in AlloyDB over recent years is observability, because at the end of the day, when you're running a database application, you really want to know what is going on. You want to know what the queries are and what the system metrics are, and we've got some really exciting things to announce, both
• 10:00 - 10:30 by way of advanced query insights, which has now gone GA, where you get a very fine-grained set of metrics as well as query plans at your disposal. It's a very easy-to-use console experience that shows you all the queries that have been running in the system, the various metrics around them, and the query plans for them. But we went one step further and said,
• 10:30 - 11:00 okay, what if we use AI to detect anomalies and surface these things to you proactively? With AI-assisted troubleshooting, which went to preview this week, using the same flow you will now start seeing deeper analysis of your database system's anomalous behavior, whether it's slow queries or spikes in resource utilization, along with recommendations for what you can do to fix them. Again, when you think
• 11:00 - 11:30 about price performance, this is one where, if you're managing large fleets of databases, autopilot capabilities like these really help reduce cost from an operational standpoint. So I definitely encourage you folks to go check this out. With AlloyDB, we've gone even further. Observability is the first step, but folks who are used to what we call traditional
• 11:30 - 12:00 relational enterprise databases are used to the concept of performance snapshots. This is where, as a database administrator, you have a lot of control: you take snapshots, detailed metric reports on what's going on in the database, at different points in time, and the system lets you compare these snapshots across those points in time to say, "Hey, what has changed?" This can be a real
• 12:00 - 12:30 lifesaver when you're trying to debug something and really want to quickly get to the bottom of what's going on and how to fix it. This is a feature called performance snapshots, and it's very easy to set up: with just a couple of commands, you can have it automatically take these snapshots, and then when you do need to debug, it's easy to compare the snapshots and tell exactly what is going on. So, another
• 12:30 - 13:00 really cool enterprise feature that helps with debugging and operations. Again, continuing the theme of what we've been doing over the last year: when AlloyDB first launched, we supported a single Postgres major version, and over the years we've come to support multiple Postgres major versions. So the obvious question is, okay,
• 13:00 - 13:30 how do I go from one version to the next? We've made that really simple with a truly one-click, in-place major version upgrade. With a very simple set of commands, you can go from Postgres 14 to 15 or 16, and to 17 when it comes out fairly soon. It's fully automated end to end, and from an application standpoint, you don't need to change the endpoints. It's exactly the same
• 13:30 - 14:00 transparent experience. I won't go into all the other amazing stuff the team has been hard at work on over the last year, but just to touch on a few highlights, driven by a lot of customer feedback and close work with many of you: we've made onboarding a lot easier. We now have a concept of free trial clusters,
• 14:00 - 14:30 so you can very quickly spin up a database instance free of cost, try it out, and see the value for yourself. We've also made it very easy to bring your data into AlloyDB through what we call fast restore from Cloud SQL backups. If you have a Cloud SQL for PostgreSQL instance, you can just go to a backup and, with one click, restore it into an AlloyDB instance, and this happens really fast. We're talking many terabytes an hour of
• 14:30 - 15:00 restore speed. So even for some of your larger databases, you can very quickly create an AlloyDB instance and start playing with it. We've also added BigQuery federation, so if you have BigQuery sitting alongside AlloyDB, you can work with data across them. On the connectivity side, this is an area where we know there's a lot of friction, so we've tried to simplify it through the public IP support you now have, but also a lot of work with PSC, where we have done
• 15:00 - 15:30 advanced PSC and outbound PSC, which just make it easy to set up networking. HA and DR are of course the bread and butter of any database, and over the last year we have enhanced our capabilities with cross-region replicas, where you can now have regional replicas in up to five regions for your primary AlloyDB cluster, as well as
• 15:30 - 16:00 improvements in the way we automate and manage encryption keys. Last but not least, on performance and scalability: at the very high end, we added support for 128 vCPU machine instances. As of today, these are the largest instances on which you can run AlloyDB, and we continue to push the ball forward there. But equally important, at the other end of the spectrum, when you're just getting
• 16:00 - 16:30 started and you just want a dev or test instance, this week we announced support for one vCPU. So now the smallest AlloyDB instance you can create to get started is a 1 vCPU instance. On the database size side of things, we go all the way up to 128 terabytes, and we continue to push the boundary on that as well. And the columnar engine, the secret
• 16:30 - 17:00 sauce behind a lot of those 100x improvements on analytic queries: we continue to expand its feature set and make it even faster on more and more of your queries. All right, no talk would be complete if I didn't talk about AI. Of course, AlloyDB AI has been a huge part of our focus over the last year, because we've heard from a lot of you that you
• 17:00 - 17:30 want to build gen AI applications, and you really want a database that understands the components of a gen AI application. With AlloyDB, we are focusing on three main pillars. When we say AlloyDB AI, it really is three things. One, the best possible vector processing engine, where you can store vectors, query them, and manage them. But we've also recently added two more things, and I'll go into them in a bit. One, what we are calling
• 17:30 - 18:00 the AI query engine: the ability to unlock the power of LLMs right inside your SQL. And the last pillar is the natural language interface: you've got a database you can talk SQL to, but what about natural language, and how do we bring a natural language interface to the database? So these are the three pillars. Over the last year, it's been amazing to see the adoption of just
• 18:00 - 18:30 using vectors within the database. Embeddings are the bread-and-butter foundational aspect of all gen AI applications, and we've seen the adoption of vector search grow by 7x in a year. So what does vector search in AlloyDB mean? It's the ability to store your vectors in a vector column. But what we've improved is your ability to combine vector search with the rest of SQL, because that
• 18:30 - 19:00 is really the power of storing vectors in a SQL database: I can do vector search, but I can also combine it with SQL filters, joins, aggregations, all the power of SQL. And that's a hard problem, because this is a brand new data type. What we've done with AlloyDB is enhance the engine and the optimizer to really understand vectors, to the point where we're happy to report up to 10x faster filtered vector search queries, where you're combining vectors
• 19:00 - 19:30 with SQL filters. All of this is powered by what we call the ScaNN index. ScaNN is a Google technology developed many years back that powers the core of Google Search and many other Google experiences, and what we did was take the ScaNN technology and make it part of AlloyDB. So now it's as simple as creating an index, the ScaNN index, and with that you get superior
• 19:30 - 20:00 performance, much faster index updates, and much better memory consumption, and we compare ScaNN against other options that can sometimes be a good fit, things like HNSW and IVFFlat. The takeaway is that AlloyDB comes prepackaged with a whole set of vector capabilities, so as you build out your application, just working with SQL and indexes, you can unlock all of this
• 20:00 - 20:30 power (a hedged SQL sketch of the ScaNN index and a filtered vector query appears after this transcript). I said we did something called the AI query engine, and what this really is, is the ability to call LLMs directly from inside your SQL. The first use case was generating embeddings: you have a piece of text, an image, a video, or an audio file, and right inside the database, in SQL, you can call into the LLM and turn it into embeddings. But now
• 20:30 - 21:00 we've gone one step further. You can use things like AI.IF, a filter function, to invoke the LLM for things like sentiment analysis. The example here is: hey, I've got a review, is it a positive review? This is where the entire power of LLMs, and the knowledge of the world LLMs have, is at your fingertips as a SQL developer, which is amazing. And then you combine it with all the rest of SQL, like: yes,
• 21:00 - 21:30 it's a positive review and the price is less than 50. So that's a huge power you get by combining LLMs with SQL. And we're making it really easy for you to call into LLMs, not just the Gemini models, which are of course our first-party models, but any model you have. You can call into any third-party model using the same simple syntax. It really doesn't get any easier than this (a sketch of this pattern also follows the transcript).
• 21:30 - 22:00 And last but not least, as I said, we have also introduced a natural language interface to the database. The way to think about this is: SQL is great. If you're a developer and you know SQL, SQL is amazing. But for all the folks who are not SQL experts, how do they talk to the database? Now it's become as simple as just using natural language. You can walk up to an AlloyDB database and, in any language, just pose a question: show me my orders in the last three months. The database figures out what that
• 22:00 - 22:30 means based on the schema and the context, and gives you back the results (see the third sketch after the transcript). This truly lowers the barrier for everyone to interact with databases, and we do it in a way that doesn't violate security, privacy, and all of those things. I want to touch very briefly on AlloyDB Omni. When you think about AlloyDB, there are all these amazing capabilities packed into it. What we
• 22:30 - 23:00 did with the Omni edition of AlloyDB is bring it everywhere. If you're running the managed service on GCP, that's awesome. But if you really need this capability running on, say, a different cloud like AWS or Azure, or you want it in an on-prem setup or on the edge, with AlloyDB Omni you have the ability to get all of these capabilities anywhere your applications are running. And
• 23:00 - 23:30 we've seen a lot of customer interest and adoption of this over the last year. To touch again on why Omni is amazing: it has all of those capabilities I talked about, but the proof is in the pudding, and as the data shows, Omni running anywhere is much faster than standard Postgres. As of this week, we are announcing the preview of a brand new feature called atomic IO, which is an
• 23:30 - 24:00 amazing innovation on the data path of the database engine and is now making Omni four times faster than standard Postgres, running anywhere. You can take this, run it on any cloud or on-prem, and get the benefits. The last thing I'll mention about Omni is that beyond the core engine, we've also got a Kubernetes operator that lets you manage Omni in all these other environments, where you can use the operator and
• 24:00 - 24:30 set up your management service around it. We've made a lot of improvements to the operator over the last year, making it very easy to set up Omni with HA and DR and to deal with all the day-two operations, while we continue to expand the set of platforms and versions it supports. Hopefully this gives a bird's-eye view of all the things we have been working on with AlloyDB. But I'm
• 24:30 - 25:00 super excited for you to hear not just from us working on AlloyDB, but from our esteemed customers who have walked the AlloyDB journey and are here to share their experiences. For that, I am excited to welcome Jeppe from [Applause] Intesa Sanpaolo. So, good morning everyone. First of all, I'd like to thank
• 25:00 - 25:30 Ravi, Gigi, and Google for giving me the opportunity to share our experience with Google Cloud, and in particular with AlloyDB. My team is responsible for all the infrastructure that supports data: from storage, traditional block storage, network storage, and cloud storage, to database solutions, both mainframe and open
• 25:30 - 26:00 technologies, to big data platforms and analytics, and also data protection. All of this supports Intesa Sanpaolo, the leading bank in Italy and one of the most important in Europe. In June 2023, Intesa Sanpaolo launched isybank, the group's fully digital, online bank. isybank was built as a cloud-native platform directly on
• 26:00 - 26:30 Google Cloud, with a core banking system developed by our partner Thought Machine and many applications developed in house on our cloud-native framework. But how did we get here? It all started in 2021, when Google and Intesa Sanpaolo became partners, with a first approach based on a lift-and-shift IaaS migration. We began with noncritical workloads,
• 26:30 - 27:00 because at the time the Google regions available to us were the European regions in Frankfurt and Amsterdam, regulations didn't allow us to store sensitive data outside Italy, and latency of more than 30 milliseconds was not acceptable for many services. The game changer came at the end of 2022, when Google opened the first Italian region in Milan and, after six
• 27:00 - 27:30 months, the second region in Turin. Google thus became the only cloud provider with two regions in Italy, which was huge for us, especially for legacy applications and integration with mainframe systems. When we started the digital bank project at the end of 2021, we quickly realized that our infrastructure-as-a-service solution wouldn't
• 27:30 - 28:00 scale to more than 10 million customer accounts. So in summer 2022, we began evaluating AlloyDB, a solution also recommended by our partner Thought Machine. The migration wasn't without challenges; let me share some examples. The first is about networking: Intesa Sanpaolo has a particular
• 28:00 - 28:30 network topology, so initially communication between the application layer and the database required Cloud VPN, but for a core banking system with high transaction volumes this wasn't a viable solution. Working with Google and our network architects, we identified a better path. In December 2023, Google released, in private
• 28:30 - 29:00 GA, PSC support, and this enabled us to design the right network solution and architecture for our needs. The second example is about business continuity. We are required by regulation to test our disaster recovery plan at least once per year, ensuring an RPO equal to zero and minimal
• 29:00 - 29:30 RTO. With the IaaS infrastructure this was very easy: we had a stretched cluster, basically Postgres stretched between Turin and Milan, so you only had to move the primary services onto a virtual machine. AlloyDB initially didn't allow this, so if you moved the services to the secondary region, you had to rebuild and
• 29:30 - 30:00 recreate the cluster, meaning the RPO was never equal to zero and the RTO was not minimal, as required by regulations. For this, Google released cross-region replication, in private GA for us in December 2023 and now generally available for everyone. This means we can fail over and fail back safely with an RPO equal to zero
• 30:00 - 30:30 and minimal RTO. Last but not least, something about performance and cost optimization. On the performance side, AlloyDB truly impressed us: with half the resources, as you can see, 32 cores against 64 on the IaaS infrastructure, it delivered equal or even better performance. Look also at the disk size on the infrastructure service: to get the
• 30:30 - 31:00 IOPS we needed, we had to allocate 10 terabytes of SSD disk. In AlloyDB you pay just for what you use, in this case two and a half terabytes. The last point about performance is restore speed: we went from one terabyte per hour to 9 terabytes per hour, an amazing result. So today isybank is fully running on
• 31:00 - 31:30 AlloyDB, and thanks to close collaboration with Google we overcame technical, regulatory, and performance-related challenges. It's been a demanding journey, but also a very rewarding one. Thanks again to Ravi, Gigi, and Google Cloud Italy, who support us every day, and to all of you for your [Applause]
• 31:30 - 32:00 attention. Sanjie from Manhattan, to speak about his AlloyDB journey and experiences. Over to you, Sanjie. Thank you, Ravi. Hey guys, I know I'm standing between you and lunch, so I'll try to be quick. Let me start with who
• 32:00 - 32:30 we are, a very quick introduction. We are a public software company that has been around for 35 years. Those of you in supply chain probably know us well. Those who are not may think from our name that we are a law firm from Manhattan; we are actually based out of Atlanta. We have about 5,000 associates overall. In terms of software, there are four categories of software we provide. For those of
• 32:30 - 33:00 you who are not in the supply chain space, the easiest way I describe us is: when you go to a website and press that buy button, we essentially take over the transaction until the package gets to your home. We provide all the software required to make that package show up at your house. The specific products we build are order management, which takes that order and figures out where it gets sourced from; everything in the warehouse management space, where we figure out how to pick, pack, and ship that order and put it in a box; and then we manage
• 33:00 - 33:30 the transportation of it to your house. In the last two years we've also gotten into stores, and we provide point-of-sale solutions, so a lot of leading retailers use us on that front. We play in a lot of different verticals, retail being one of the big ones: 37 of the top 50 retailers use us. We have a pretty big presence in grocery, with 17 of the top 20 grocers in the US using us, and almost every freight forwarder uses us.
• 33:30 - 34:00 So, a pretty big presence overall in the supply chain industry. Moving on from who we are to the challenges we're trying to solve with AlloyDB: I'll give you some context and talk through two problems. For the first one, a little bit of context on where we are. For each of our customers, we actually create a separate database instance, because we don't want any data mixing between customers. So we run
• 34:00 - 34:30 about 2,000 MySQL instances in production today, and that includes the replicas and the cores. If you look at the combined data size, it's about two-plus petabytes of data. Any individual instance ranges anywhere from 100 GB to close to 35 terabytes of data. If you take a single production environment, a single customer, our core setup is pretty simple: you've got your production database, which is your read-write database, and then you have a
• 34:30 - 35:00 replica, which is your reporting database. Most of our customers are very reporting heavy; they run a lot of their operations this way because every customer has unique needs. We provide a lot of dashboards in the application, but typically that's not enough, so they end up writing a lot of custom SQL, reports essentially, to look at the data. As for the size of a single database: about 10,000 tables in the schema, with some tables spanning 500-plus columns. So a pretty complex set of data tables
• 35:00 - 35:30 and databases. I didn't mean to read this whole thing; this is one of the sample SQLs coming out of it, and it's probably a simpler example. A lot of these reports have extremely complex SQL. Sometimes I look at it and say, okay, somebody has to have a PhD to be able to write this SQL. They're extremely complicated queries. So our challenge was really that these SQLs, when you run them in reports,
• 35:30 - 36:00 sometimes take a really long time to run. The challenge I had for the Google team was: how do we make this perform? What can we do? We were running this on MySQL, and we started looking at whether AlloyDB could perform any better. What we did was take a customer, take their top 30 worst-performing SQL queries, and put the exact same SQL in AlloyDB. So we copied the data,
• 36:00 - 36:30 replicated the data from MySQL into AlloyDB. You saw the diagram with replication; we switched that replica, so production, the read-write side, is still running MySQL, while the replica is running AlloyDB with the data being replicated, and these are the numbers we saw. These are real numbers on the top 10 queries. You can see on the MySQL side there were queries running for 20-plus minutes, because of the SQL I showed you guys, and some of
• 36:30 - 37:00 these improved by 350x; most of them came down to 3 seconds. Our challenge with our customers has been that these are operational reports, and they need the answers within a minute to know what's going on in the warehouse, versus waiting 15 minutes, by which time things have already moved on. So AlloyDB, and I think at the core of it the columnar engine, helped us a lot in getting this performance. We are getting to the stage where we're trying to move a lot of our
• 37:00 - 37:30 replicas to AlloyDB so we can gain from this performance. That's one challenge where AlloyDB has helped us, from an overall latency perspective. For the second problem, I'll build up the context a little. I told you we are about 5,000 people, and we do a lot of projects. We have a services team of about 1,200 people on these projects, and we run about 600 projects a year for various companies. So
• 37:30 - 38:00 somebody's working for Michael Kors on a project, somebody's working at some other company on a project. As you can imagine, these are disparate project teams, completely isolated from each other. They don't talk to each other, so the left hand doesn't really know what the right hand is doing. But a lot of the time they're doing repetitive work, because many of the extensions they are building on the product are quite similar: project A may want something, but project B is doing the same thing, or somebody may have already done it at some point. We generate,
• 38:00 - 38:30 across these teams, across these 1,200 associates, about 10,000 documents a year. These are design specs: how do you do an extension, what do you do with it. So there are roughly 10,000 artifacts we generate on a yearly basis. Our challenge was: how do we make these teams efficient, how do we let one team learn from the experience of another, and how do we enable collaboration between these teams so they can share knowledge? So what we did
• 38:30 - 39:00 was take all our documentation, the 10,000 documents I referred to, and index them into Agentspace. All of the product documentation everybody produces goes into Agentspace, and I'll get to what we eventually did with it. But we also did something else. These documents are written by different people, so you can imagine that even though we try to use the same templates, the style is different, the language is
• 39:00 - 39:30 different, somebody uses a different table, so it's not exactly the same format. And a lot of our queries needed structured answers back. So we also took this documentation and, using AI, extracted a structure out of it: what's the customer ID, what's the extension number (think of it as a webhook; we call them extension points), what's the webhook on it, what is the purpose of this extension. We extracted this into columnar data, apart from the
• 39:30 - 40:00 unstructured piece sitting in Agentspace, and we built an agent with an interface. Any of these 1,200 folks can go to this chatbot and ask a question, has anybody extended a label for shipping for UPS which requires XYZ, and they get an answer. So we have two kinds of questions. Sometimes these questions are very
• 40:00 - 40:30 general: you're basically asking, hey, has anybody done this? How was it done? And you want an English description of whatever was done. And some of these questions are very specific: hey, for customer Chick-fil-A, what was done as part of extension 05, or how many times has a label been extended, how many customers have extended it? So some queries are very structured and some want generic answers. In between, we have our own agent framework, which takes
• 40:30 - 41:00 the query and routes it, figuring out whether it's a generic question or a very specific one. A generic question gets routed to Agentspace, and the chatbot shows you the response; a specific question goes to AlloyDB, where we fire the query. In this example, I asked what extension point is used for Chick-fil-A, and this is the query that gets formed to provide that answer. What we have done has helped us tremendously across the organization:
• 41:00 - 41:30 when somebody does something, it gets indexed, and we can search it from Agentspace, we can search it from AlloyDB, and put the answer together. That's the overall solution we have built. I have three minutes, so I'm going to talk a little bit about our vision; this is where we're trying to go. This is a pretty complex dashboard, and I don't mean for you guys to understand all of this stuff up here. But this dashboard, the
• 41:30 - 42:00 data you're seeing, the columns you're seeing, comes from various data sources. In this example, it's about an order; it can go through different states, and when it goes through different states it creates different kinds of entities and objects in the database. To run a query like this, to create this dashboard, the query I showed you, an extremely complicated query probably eight pages long, would never really perform, and it probably takes somebody 3 or 4 days to even get to the
• 42:00 - 42:30 point of knowing what query to write and understanding the database schema very well. What we are attempting to do, and this is not in production, we are still shooting for it, is break this down into natural language: break it out into five or six questions, where somebody can ask a query saying, okay, I want to see release and allocated status from an order table, from a task table, by an order, get me all the data here. Then we would create a framework on top of it
• 42:30 - 43:00 which would take all of this and put that data table together, and then we'll use AI as a visualization tool: once I have this data, visualize it in this form, create a pie chart, create a table reflecting this data. Our goal is to reduce the 2 or 3 days, and sometimes a lot more, that it takes to create a complex dashboard like this to a few hours, where somebody can ask a natural language query, or a set of natural language queries, which we can put together to create dashboards like this. So
• 43:00 - 43:30 that's the core goal, that's where we're headed, and that's all I had for you guys. Thank you. Thank you, Sanjie. It's really inspiring to see all the amazing things our customers are building with AlloyDB, and we really appreciate the close partnership. So, just to close things off: you've heard a lot about AlloyDB. It's the new way to Postgres.
• 43:30 - 44:00 It's for your transactional workloads, but also for your analytical and gen AI applications, and there are a lot of new announcements this week, so please do go check it out. There are a few more sessions I would like to draw your attention to, two of them tomorrow, if you want to continue the learning journey: a session on AlloyDB Omni, and one on AlloyDB vector search and AI. These are further deep-dive sessions with demos and a lot
• 44:00 - 44:30 more detail. So if you're interested, please do check those out. And if you have feedback, please do share it. And with that, thank you all. [Music]
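
The vector search workflow described around 18:30 - 20:00 (creating a ScaNN index, then combining nearest-neighbor search with ordinary SQL filters) amounts to a small amount of SQL in practice. Below is a minimal, hedged sketch assuming the pgvector `vector` extension and AlloyDB's `alloydb_scann` extension as described in its documentation; the `products` table, the tiny 3-dimensional embeddings (real embeddings typically have hundreds of dimensions), and the `num_leaves` value are illustrative assumptions, not from the talk.

```sql
-- Enable the vector type and AlloyDB's ScaNN index support
-- (extension names per AlloyDB docs; verify for your version).
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS alloydb_scann;

-- Hypothetical table with an embedding column (3 dims for brevity).
CREATE TABLE products (
  id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  category  text,
  price     numeric,
  embedding vector(3)
);

-- The ScaNN index the talk describes; num_leaves is a tuning knob.
CREATE INDEX products_embedding_scann
  ON products USING scann (embedding cosine)
  WITH (num_leaves = 10);

-- Filtered vector search: nearest neighbors by cosine distance (<=>),
-- combined with regular SQL predicates, as in the "10x faster
-- filtered vector search" discussion.
SELECT id, price
FROM products
WHERE category = 'shoes' AND price < 50
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
```

The design point the talk makes is that the optimizer understands the vector predicate alongside the relational filters, so one statement can exploit both the ScaNN index and ordinary SQL indexes.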
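
The AI query engine segment (20:00 - 21:30, embeddings and LLM calls from inside SQL) also reduces to a few statements. The following is a hedged sketch assuming the `google_ml_integration` extension with a configured Vertex AI connection; the model ID, the `reviews` table, and the exact `AI.IF` signature (the transcript garbles the function name) are assumptions to verify against current AlloyDB documentation.

```sql
-- Extension that exposes model calls to SQL (assumed setup).
CREATE EXTENSION IF NOT EXISTS google_ml_integration;

-- Turn text into an embedding from inside SQL;
-- 'text-embedding-005' is an assumed model id.
SELECT embedding('text-embedding-005', 'great shoes, very comfortable');

-- The sentiment-filter pattern from the talk: ask the LLM a yes/no
-- question per row and combine it with ordinary SQL predicates.
SELECT id, review_text, price
FROM reviews
WHERE price < 50
  AND AI.IF(prompt => 'Is this a positive review? ' || review_text);
```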
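
Finally, the natural language interface from 21:30 - 22:30 ("show me my orders in the last three months") is exposed through an extension. The sketch below is assumption-heavy: it presumes the `alloydb_ai_nl` extension, a `get_sql()` entry point, and a pre-registered configuration name (`my_app_config` is hypothetical), all of which should be checked against the current AlloyDB AI documentation.

```sql
-- Assumed extension and configuration for natural-language queries.
CREATE EXTENSION IF NOT EXISTS alloydb_ai_nl;

-- Ask the question from the talk; the engine uses the registered
-- schema context to generate the SQL for the question.
SELECT alloydb_ai_nl.get_sql(
  'my_app_config',
  'show me my orders in the last three months'
);
```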