Conquer Tech Debt Like a Pro

Scaling the Mountain: A Framework for Tackling Large-Scale Tech Debt - Jimmy Lai

Summary

Jimmy Lai shares an advanced framework for tackling large-scale tech debt at PyCon US. The session emphasizes creating strategies to measure, recognize, and automate the management of tech debt across vast codebases, particularly focusing on how to leverage AI and static analysis tools. Using real-life examples from his career, Lai highlights the effectiveness of setting strategic goals, fostering a culture that recognizes contributions to tech debt reduction, and optimizing processes through automation. His insights provide a valuable guide for developers facing similar challenges in managing large code systems.

Highlights

Jimmy Lai emphasizes the importance of measuring tech debt using static analysis 🚀.
Recognizing contributions to tech debt reduction boosts morale and accountability 🎉.
Automating routine processes saves time and energy for larger projects ⏳.
PyCon US session offers a framework to tackle tech debt using real-life examples 💼.
AI and tools like Pyright can simplify and enhance tech debt management 🧑‍💻.
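The recognition highlight depends on attributing changes in tech debt to individual pull requests. As the talk describes, the analyzers run only on the files a pull request touches, and the counts are compared with a stored snapshot. A minimal sketch of that comparison follows; the function name `contribution_delta` and the per-file `Counter` layout are assumptions for illustration, not the talk's actual framework API.

```python
from collections import Counter


def contribution_delta(
    snapshot: dict[str, Counter],
    updated: dict[str, Counter],
) -> dict[str, int]:
    """Per-category change in tech debt counts across the updated files.

    Negative values are fixes (wins); positive values are regressions.
    Both arguments map a file path to its per-category counts (hypothetical layout).
    """
    delta: Counter = Counter()
    for path, new_counts in updated.items():
        # A file missing from the snapshot is treated as newly added.
        old_counts = snapshot.get(path, Counter())
        for category in set(old_counts) | set(new_counts):
            delta[category] += new_counts[category] - old_counts[category]
    # Drop categories whose changes cancelled out.
    return {category: change for category, change in delta.items() if change}


# Mirrors the talk's example: one more asyncio.run, two feature flags fixed.
before = {"billing.py": Counter({"feature_flag": 3, "asyncio_run": 1})}
after = {"billing.py": Counter({"feature_flag": 1, "asyncio_run": 2})}
print(contribution_delta(before, after))
```

Only the updated files are passed in, which is what keeps the per-pull-request analysis cheap compared with rescanning the whole codebase.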

Key Takeaways

Tech debt isn't just unavoidable; it's conquerable with the right strategy 🚀.
Static analysis tools are your new best friend for identifying code issues 🔍.
Recognizing and celebrating tech debt contributions fosters a positive culture 🙌.
Set clear goals and track progress to manage tech debt effectively 📈.
Use automation and AI to make tech debt management easier and more efficient 🤖.
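The goal-tracking takeaway can be made concrete with the numbers from the talk: a category starts the quarter at 420 call sites, the goal is 300, and the progress check expects 330 at the current point in time. The sketch below assumes a straight-line burn-down and assumes the check happens 75% of the way through the quarter; both are assumptions chosen because they reproduce the 330 figure, since the talk does not state how the expected value is computed.

```python
def expected_remaining(start: int, target: int, fraction_elapsed: float) -> float:
    """Expected remaining count under a linear burn-down (an assumption)."""
    if not 0.0 <= fraction_elapsed <= 1.0:
        raise ValueError("fraction_elapsed must be between 0 and 1")
    return start - (start - target) * fraction_elapsed


def on_track(current: int, start: int, target: int, fraction_elapsed: float) -> bool:
    """True when the current count is at or below the expected count."""
    return current <= expected_remaining(start, target, fraction_elapsed)


# 420 at the start of the quarter, goal of 300, checked 75% of the way through.
print(expected_remaining(420, 300, 0.75))  # 330.0
print(on_track(420, 420, 300, 0.75))       # False: still at 420, so behind the goal
```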

Overview

In his engaging talk at PyCon US, Jimmy Lai peels back the layers on tackling tech debt in large and sprawling codebases. Tech debt, often seen as a necessary evil in software development, becomes manageable when leveraging Lai's strategies of measurement, recognition, and automation. With a dash of humor and heaps of practical advice, Lai guides the audience on how to keep tech debt from drowning productivity with methods that are both pioneering and feasible.

Lai dives into the nitty-gritty of utilizing static analysis tools to keep track of troublesome code areas and highlights the potential of AI in this arena. His examples aren't just theoretical; they're culled from industry experiences, making the insights particularly relatable for anyone dealing with massive systems. His solution isn't just about identifying problems but celebrating the solutions, encouraging a workplace culture that values and uplifts those who help tackle tech debt.
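As a rough illustration of the measurement step, the sketch below counts `# pyright: ignore` comments in a source string. The analyzer shown in the talk is built on LibCST and inspects comment nodes; this sketch swaps in the standard library's `tokenize` module so that it runs without extra dependencies, and the function name `find_pyright_ignores` is hypothetical.

```python
import io
import tokenize

IGNORE_PREFIX = "pyright: ignore"


def find_pyright_ignores(source: str) -> list[int]:
    """Return 1-based line numbers of comments starting with 'pyright: ignore'."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        # Drop the leading '#' and surrounding spaces before matching.
        text = tok.string.lstrip("#").strip()
        if text.startswith(IGNORE_PREFIX):
            found.append(tok.start[0])
    return found


sample = (
    "import os  # pyright: ignore[reportUnusedImport]\n"
    'note = "# pyright: ignore in a string"\n'
    "x = 1  # plain comment\n"
    'y: int = "a"  # pyright: ignore\n'
)
print(find_pyright_ignores(sample))  # [1, 4]
```

Working on tokens rather than raw text means a string literal that merely contains the marker (line 2 of the sample) is not counted as tech debt.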

Throughout the session, the theme of automation sings through loud and clear. By automating routine tech debt management tasks, Lai argues, developers can invest their time in building new features and innovating more broadly. His session wraps up with anecdotes about successful implementations, underscoring the effectiveness of strategic goal-setting and showing how even the most unruly tech debt can be tamed.
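One automation example from the talk is a codemod that removes `# pyright: ignore` comments that are no longer needed. The talk's codemod is built with LibCST and runs Pyright itself; the sketch below is a simplified stand-in that uses the standard library's `tokenize` module and assumes the caller already has the line numbers of unnecessary ignores (for example from Pyright's `reportUnnecessaryTypeIgnoreComment` diagnostics). The function name `strip_ignores` is hypothetical.

```python
import io
import tokenize


def strip_ignores(source: str, unnecessary_lines: set[int]) -> str:
    """Remove 'pyright: ignore' comments on the given 1-based line numbers."""
    lines = source.splitlines(keepends=True)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start
        is_ignore = tok.string.lstrip("# ").startswith("pyright: ignore")
        if row in unnecessary_lines and is_ignore:
            line = lines[row - 1]
            newline = "\n" if line.endswith("\n") else ""
            # Keep the code before the comment and drop the trailing spaces.
            lines[row - 1] = line[:col].rstrip() + newline
    return "".join(lines)


code = "a = f(1)  # pyright: ignore\nb = g(2)  # pyright: ignore\n"
print(strip_ignores(code, {1}))
```

Passing the line numbers in explicitly keeps the rewrite step separate from the type checker run, so the same function can be tested without Pyright installed.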

Chapters

00:00 - 01:30: Introduction of Jimmy Lai The chapter opens with the host greeting the audience and encouraging more energy for the morning session. The host then introduces Jimmy Lai, noting that he has attended PyCons for many years and that his first PyCon was in Oregon around seven or eight years ago.
01:30 - 03:00: Overview of Talk Outline and Zip The chapter introduces the speaker, Jimmy Lai, who has a keen interest in Python and intends to share what he has learned with the audience. The title "Scaling the Mountain" suggests that the presentation addresses complex, large-scale technical challenges in Python codebases.
03:00 - 04:30: Challenges with Large Codebases and Tech Debt The chapter, titled "Challenges with Large Codebases and Tech Debt," addresses strategies for dealing with technical debt in large-scale technology projects. The speaker begins by inquiring whether the audience likes technical debt, noting that someone does, but implies that those attending likely see it as an issue they need to address. The speaker then introduces the framework that will be discussed throughout the session to tackle these challenges.
04:30 - 08:00: Strategies to Manage Tech Debt: Measure, Recognize, Automate The chapter discusses the challenge of tech debt and provides actionable strategies to manage it. It includes open source demonstrations and case studies, and the session ends with a five-minute Q&A. The speaker, currently working at Zip, introduces the company, a P2P spending service designed for businesses that need to collaborate on spending.
08:00 - 11:00: Case Studies: Pyright and Async Adoption The chapter discusses the challenges faced by a company with a large Python monolith codebase of over three million lines of code, managed by a growing team of 100 developers. Despite being small, the company serves many big customers, which requires maintaining a robust testing framework that includes more than 20,000 tests. This scale of operations has led to significant technical debt, creating challenges that the company needs to address.
11:00 - 14:15: Discussion and Q&A In the chapter titled "Discussion and Q&A," the focus revolves around the productivity challenges faced by developers due to outdated or dead code. This issue requires developers to invest a significant amount of time to comprehend such code, thereby reducing the time available for developing and shipping new features. The conversation suggests that attempting to adopt type checking can lead to numerous type coverage issues, potentially resulting in several runtime errors and making the process cumbersome.
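The "recognize" strategy named in the chapter list relies on a weighted leaderboard. Per the talk, a contributor's score is the number of contributions in each category multiplied by that category's weight. A minimal sketch of that scoring follows; the category names, weights, and contributor data are made up for illustration.

```python
def weighted_score(contributions: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of contributions per category times the category weight.

    Categories without a configured weight default to 1.0 (an assumption).
    """
    return sum(
        count * weights.get(category, 1.0)
        for category, count in contributions.items()
    )


def leaderboard(
    per_author: dict[str, dict[str, int]], weights: dict[str, float]
) -> list[tuple[str, float]]:
    """Authors sorted by weighted score, highest first."""
    scores = {name: weighted_score(c, weights) for name, c in per_author.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical weights: asyncio.run cleanup is prioritized this quarter.
weights = {"asyncio_run": 3.0, "feature_flag": 1.0, "pyright_ignore": 1.0}
authors = {
    "alice": {"feature_flag": 5},
    "bob": {"asyncio_run": 2, "pyright_ignore": 3},
}
print(leaderboard(authors, weights))  # [('bob', 9.0), ('alice', 5.0)]
```

Changing the weights from quarter to quarter is what lets the same leaderboard steer contributors toward whichever category is the current priority.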

Scaling the Mountain: A Framework for Tackling Large-Scale Tech Debt - Jimmy Lai Transcription

00:00 - 00:30 Hello. Hello. Hello everybody. Wow, so many faces. Ah, very nice. How is everybody's day going? Oh, good. Yeah. I mean, come on. We have more energy than this. It's morning. Come on. Nobody had a coffee here yet? Very good. Yes. Very good. Perfect. So, we now have Jimmy Lai. Jimmy has been joining PyCons for many years now, and his first PyCon was in Oregon around seven or
00:30 - 01:00 eight years ago, right? Yes. Cool. Jimmy likes Python and he also likes to share what he learns with Python with people like yourselves. So, thank you very much for coming and joining us today, and this is Jimmy Lai, everybody, with "Scaling the Mountain: A Framework for Tackling Large-Scale Tech Debt". Please, a round of applause. Thank you. Hi, I'm Jimmy Lai. So today I
01:00 - 01:30 will talk about a framework for tackling large-scale tech debt. Let me ask a question. Does anyone like tech debt here? I hope you don't. Oh, I do see someone. So I think if you are here, you probably have tech debt, or you view it as a problem and want to solve it. So this is the outline for today. We are
01:30 - 02:00 going to talk about the challenges of tech debt, share some actionable strategies with an open source demo, and do some case studies. At the end we will have about five minutes for Q&A. So let's get started. I'm currently working at Zip. It's a P2P spending service. So if your business needs to collaborate on spending, you may need a service like this, and we have
02:00 - 02:30 many big customers. We are still small, but we already have a big codebase. Our large Python monolith codebase has more than three million lines of code. We have 100 developers and we continue to grow pretty fast, and we also have a lot of tests, more than 20,000 now. So with a codebase like this, we have seen tech debt as a problem. The
02:30 - 03:00 issue we have found is that it causes a productivity drag. Developers have to spend a significant amount of time understanding dead or outdated code, and they don't have enough time for shipping features. In a large codebase, if you just start to adopt type checking, you may get a lot of type coverage issues, and that causes a lot of runtime errors, and it's hard to
03:00 - 03:30 adopt a type checking solution in a large codebase. Also, we have seen developers' fear of touching old code: even a simple change to old code takes them a lot of time because they need to understand the existing logic. Here are some examples of productivity drag. We have seen the pull request cycle time increase over time, or
03:30 - 04:00 the number of pull requests merged per week per person decrease over time. For our codebase, initially when we started to adopt Pyright type checking, we had more than 85,000 pyright ignores everywhere in the codebase. We also have other types of tech debt, like a lot of outdated feature flags that were never cleaned up. We also want to adopt asyncio, but it's an old
04:00 - 04:30 codebase, so we had to introduce asyncio.run to run the code in a nested event loop, and as we do the migration over time, we still have so many asyncio.run calls and we want to clean them up. So why is it so hard? I think the main challenge in a large codebase is that we don't have visibility. We don't know how much tech debt we have and
04:30 - 05:00 where it is in the codebase. Also, the accountability is not clear. We don't know who should be responsible for fixing some tech debt, and we also don't know who is contributing to fixing the tech debt. Without accountability, people don't have much motivation to contribute to tech debt cleanup. Also, we have found some trivial tech debt can be
05:00 - 05:30 fixed in repetitive ways, but those fixes are draining developers' energy. So we are going to introduce a solution strategy. First is to measure. We want to be able to measure our tech debt across the entire codebase. When we know how much tech debt we have, we can
05:30 - 06:00 set a clear goal and monitor the progress. Second, in order to recognize, we want to track every contribution, and that way we can celebrate the wins. And the third is to automate: we want to build codemods or use AI to assist in fixing the tech debt. So the first one is to measure. We developed a framework that allows us to build tech debt analyzers like this. So taking Pyright as an example,
06:00 - 06:30 in our codebase we have so many pyright ignore comments, and we want to track each of them as tech debt. In our framework you can implement an analyzer like this. You just extend the base code analyzer class, which uses the LibCST concrete syntax tree library. You can just implement a visit_Comment function to inspect any comment node in the source code. So
06:30 - 07:00 we want to look for a comment value that starts with "pyright: ignore". When we find that, we have found tech debt and we just report it. In this framework, we can focus on the logic to identify tech debt and automate the testing with given example code. So for every example code like this, we expect the analyzer to find tech debt, and an automated test will be
07:00 - 07:30 generated. For some of the slides I include a link to the examples in our codebase. It's on GitHub at ZipHQ/bar-raiser. You can also find a link to these slides in the GitHub repo. So with the analyzer, we want to run it across the codebase on different files in parallel,
07:30 - 08:00 especially when you have a large codebase, where it is very important to be efficient, and at the end we aggregate the results. In our framework we have a helper, path results, to analyze paths: given a list of paths, we apply the analyzers in parallel. We built a lot of different analyzers to keep track of the different types of tech debt we mentioned: asyncio.run,
08:00 - 08:30 feature flag, pyright ignore. We also track some frontend tech debt, like eslint-disable-next-line comments, and we also built some generic analyzers to track the codebase size: backend Python lines and frontend TypeScript lines of code. So with this framework we can know how much tech debt remains in our
08:30 - 09:00 codebase, and this will be useful for setting a goal to tackle it with some strategy. So you can, as an organization, set some goals, and then with the goals you can use this framework to keep track of progress. You can also prioritize different tech debt in different quarters
09:00 - 09:30 and assign different weights to them. I'll talk about how these weights can be useful later. So here's an example. We have asyncio.run and feature flag as two tech debt categories, and in the first quarter of this year we want to set a goal. Currently asyncio.run has 420 call sites of tech debt, and we want to reduce the number down to 300 at the
09:30 - 10:00 end of the quarter. We can configure this in our framework and then do the progress check. Based on the point in time in the current quarter, we can check whether we are meeting the goal. Here it shows that we are not: we are expecting 330 but it still has 420, so we didn't meet the goal. But for feature flag we are on track to hit
10:00 - 10:30 1500 at the end of the quarter. And we set asyncio.run with a higher weight because we think it's more important, or maybe it's harder to tackle, so we want to incentivize more developers to contribute to it. So now let's talk about the second strategy, recognize. In order to recognize, we
10:30 - 11:00 need to track the contributions or regressions that come from every code change. So we need to analyze every code change, which is the pull request if you are using GitHub. Each pull request usually updates a few files, so we can run our analyzers on those updated files. To do this efficiently, we only want
11:00 - 11:30 to analyze the updated files instead of the entire codebase, otherwise it will be too slow. Because of this incremental analysis, we need to store a snapshot of the codebase somewhere, so we use AWS S3 for this. If you check our example code, you can find a helper for incremental analysis. Given an S3 store and a commit, you will be able to fetch the snapshot for that commit
11:30 - 12:00 and run the analyzers on the updated files, and aggregate the data to report the contributions from the code changes. With the code changes we can show some interesting things like this. We can tell the author: hey, you have some upcoming contributions from this pull request; you are introducing one more asyncio.run and you are fixing two feature flags,
12:00 - 12:30 that's great. We can also keep track of the history of contributions so that authors know their contributions so far in this quarter. And we can also remind them when they are about to introduce some new tech debt in their pull request, if it is a high-priority tech debt category, because that is the best time to stop it. Otherwise, when it's
12:30 - 13:00 merged, it usually takes more time to fix. We also share some links to surface other tech debt opportunities. Maybe the one they are introducing is hard to fix, but we provide them some opportunities to make more contributions to offset the impact. And then with that we can also celebrate contributions with a
13:00 - 13:30 leaderboard. We turn it into a fun quarterly challenge and spotlight the top contributors. That's where our weights are used. In this leaderboard we use the weighted score, which is the number of contributions from each category multiplied by the weight of the category. With these weights we can prioritize different tech debt differently
13:30 - 14:00 and drive the entire organization to contribute differently in different quarters. So this is just an example. I'm currently in fourth place, and the top five at the end of the quarter will be given a code quality award. It also tells me that for pyright ignore, so far I have
14:00 - 14:30 cleaned up three, and if I want to rank up, I need to clean up one more to compete with Jerry. A similar thing for asyncio.run. We can also celebrate the contributions in real time. Internally we use Slack for communication, so we created a channel for real-time tech debt contribution
14:30 - 15:00 shout-outs. When someone makes a contribution on some prioritized tech debt category, we can send an automated message like this to let more people know this guy is making a contribution, so let's celebrate together. To make this happen, we have a script in the tech debt framework, run analyzers, which does everything I just talked about,
15:00 - 15:30 including running the analyzers on the pull request to do the incremental analysis, updating the leaderboard, and then doing the shout-outs. So let me do a quick demo. This is an example pull request on the ZipHQ
15:30 - 16:00 bar-raiser project. You can see the bot commented, saying I introduced a pyright ignore, and offered a link to check my contributions. So you can see it's like this. In this example project I set up two analyzers, be nice and pyright ignore. So you can see, so far I'm
16:00 - 16:30 only making regressions, no contributions yet, and there's only a single person on the leaderboard. But once this is used by many developers, you get more data. At the bottom you can also see the contribution history, so the author can read through it to understand how they got the weighted score. So how does this work? If you check
16:30 - 17:00 this pull request, you can find there's a run analyzers job configured. It runs on each pull request when it is created. It also runs on the merged commit when the pull request is merged to the main branch. That's where we analyze and record the actual wins to our
17:00 - 17:30 codebase to S3. Okay, so let's continue. Oh yeah, so the things you just saw include the pull request, the CI jobs, the comments, and the GitHub check. We also share the information on the GitHub check, which you can also find. You can see the
17:30 - 18:00 run is created as a quality win report, and here it shows the details. Next, with all of those we can celebrate the contributions to build positive accountability, so that we can foster a culture where tech debt cleanup is recognized and encouraged. Without this, based on my experience, I found it's very hard to
18:00 - 18:30 get people to work on tech debt because it's usually deprioritized in favor of building new features. This has changed with the strategy I just mentioned. And with the analyzed data we can also build team reports by aggregating the data and joining it with the code
18:30 - 19:00 owners. On GitHub, the CODEOWNERS file has the mapping from paths to teams. So we can generate a team report like this to let each team know how much remaining tech debt they own and what progress they have made in the current quarter, and also provide them with action items. When you click into the team page you can see exactly
19:00 - 19:30 where each asyncio.run is located. You can get a link and open it in your editor, or you can also create tasks to plan them into your sprint planning process. We found this is useful for teams to plan their quality work more strategically. So last, I would like
19:30 - 20:00 to talk about the automation part. First, let's use pyright ignores as an example. If you have a very old codebase and you just start to adopt Pyright, you may have pyright ignores everywhere like this. If you try to fix it manually, you may start by adding the missing type annotation of a parameter, like this. When you add this, you have to clean up a bunch of
20:00 - 20:30 pyright ignores. If you do this manually, it will be pretty repetitive and boring, so we can use a codemod for this. We have a codemod, which you can also find in the bar-raiser codebase, to remove unnecessary pyright ignore comments. It's built using LibCST. With the codemod you can run it like this, and given an example file, it will run
20:30 - 21:00 Pyright, compare with the code, figure out which pyright ignores are not necessary, and clean them up. So you don't need to clean them up manually, and if you integrate this with your IDE, like VS Code shortcuts, it will be even easier. We also try to use AI to fix the tech debt. To convert a
21:00 - 21:30 function like foo to async, you may just say "convert foo to async" as a prompt. We found this simple prompt usually has some issues. For example, it may result in incomplete conversion: maybe the AI only finds some call sites and misses others. Or it doesn't know our project conventions, so when it makes code changes it follows some general
21:30 - 22:00 public convention which we don't like. Or it has some hallucinations and makes up some wrong changes or does some unnecessary refactoring. In order to help every developer use AI more efficiently, we ended up building a very comprehensive guide. This is a part,
22:00 - 22:30 a small part of our comprehensive guide, just for converting a function to async. We have to explicitly tell the AI that when converting a target function to async, it needs to update all the call sites, following different strategies depending on where the call site is located. And there are some project-specific conventions. For example, we have a
22:30 - 23:00 lint-fixme comment that should be removed when the asyncio.run call is removed and replaced with an await. So we have this prepared prompt auto-attached to every chat when developers are updating a Python file. For example, you can use Cursor rules to make it auto-attached, and with
23:00 - 23:30 this we found it can work better. But it's still not perfect. Sometimes, even when we tell it not to convert other functions unless specifically instructed, it still does something different. But as we continue to improve this prompt, we have seen it work better. And the next thing about automation
23:30 - 24:00 is automated refactoring. In my earlier talk, about two years ago, I shared how we built a pipeline to automatically create pull requests by running codemods, and the pipeline can manage the created pull requests based on the test results and send them for review and auto-merge them or close
24:00 - 24:30 them. If you are interested in the details, you can find the blog post and the recording here. So let's do a case study. First, when we adopted Pyright, initially we had more than 80,000 pyright ignores in our codebase. We built this framework to start tracking them, and when
24:30 - 25:00 we knew how much tech debt we had, we set a goal, like fixing 20,000 in the first quarter, and then adjusted based on the result at the end of the quarter. With this approach, in the end we were able to boost the type checking coverage from 60% to 99.6%. Over time we continued to improve the framework. We introduced the code quality awards for the top five contributors
25:00 - 25:30 and shared the team wins and milestones to sustain the momentum. We also built the codemods you just saw and also made VS Code shortcuts to make it easier. We also tried autocomplete from Pylance, trying to modify Pylance to do some
25:30 - 26:00 automated refactoring with the type inference. We also had some projects focused on adding missing type stubs and annotations for commonly used open source libraries to make the entire project's efforts easier. And for asyncio.run, we had 60, I'm sorry, 600 pieces of tech debt at some point, and then we started to set quarterly goals, like fixing 150.
26:00 - 26:30 At that time we built a team report dashboard to get help from every team. We continued to use the code quality awards, but we also built the new real-time Slack shout-outs. And we also have some LibCST codemods and the Cursor rules you just saw, to use the
26:30 - 27:00 comprehensive prompts with AI models to fix the tech debt more easily. So to summarize, we have three strategies: measure, recognize, and automate. And we have a good way to execute: we can set goals, prioritize them, review the progress, and improve. And for automation, we leverage codemods and AI rules. So here are the
27:00 - 27:30 conclusions, a recap. If you are interested in the code examples, you can find them at ZipHQ/bar-raiser; the code is under the tech debt framework folder. So now we have a few minutes for Q&A. And we are hiring, if you are interested in working with me on using AI to tackle large-scale tech debt or
27:30 - 28:00 automated testing, you may check it out. Thank you. Oh, yes, hi. I had a question about, you know, I feel like there was a lot of setup. It seems like there was a lot of
28:00 - 28:30 infrastructure work that had to take place to tackle all of this debt, and I think it helped in the long run in your case. I was just curious how much time you spent on setting up all those tools you use, as well as measuring, as well as coming up with a game plan. What was the allocation of setup versus execution? Yeah, so what I shared is my learnings over time,
28:30 - 29:00 over my career. I currently work at Zip, but previously I also worked at Instagram, and we were all facing the same problem. I have been on the infra team, so I have been thinking about how to tackle those, and over time, through brainstorming, we were able to come up with some of those different ideas to try out. So for measurement, I think that's a
29:00 - 29:30 pretty initial idea, and to start building it, we tried to leverage open source solutions; that way we don't have to spend too much time. So there's some initial setup, but once you have the setup, you know how much tech debt you have, and especially when you see the data, you may have more ideas
29:30 - 30:00 about what to do next. So we didn't build the entire framework in the beginning. We just built some features and tried them out, and if something didn't work we tried other ideas, and eventually we found those are useful ideas. I'm curious, did you have to do anything special with your finance people? In other words, were you able to somehow track, going forward, what the
30:00 - 30:30 benefit of the tech debt reduction was in terms of helping the business? Yeah, that's a good question. I think that's also a challenge for a software engineer. When you work on infrastructure, you may be making your system more stable or tackling tech debt, and that doesn't seem to directly contribute to your business revenue. So we had a lot of
30:30 - 31:00 related discussions in my career, with my manager or with my colleagues. And I think in general, if the company wants to foster a good culture, they need to recognize this. Maybe initially they didn't pay too much attention; initially they only focus on revenue, but over time they will realize system reliability is important. And then they need to take some
31:00 - 31:30 actions to change how they think. For example, when evaluating the performance of every individual, they will also need to consider quality as one of the aspects. So I think it depends on what your situation is. It may be challenging at some point, but you need to
31:30 - 32:00 keep communicating, and that's my experience. They were not able to give you any specifics? In other words, as far as I know, financial accounting principles don't have a way of depreciating investments in improving efficiency. So there's only a handful of companies that seem to have actually figured that stuff out, and I was just wondering if you guys had gotten that far. So that's cool. Thank you. Thank
32:00 - 32:30 you. So my question is about tech debt in a broader sense. I mean, this is really impressive and I think the infrastructure that you've built up is really powerful here. I'm wondering if you've thought about addressing sources of tech debt that come in application logic, that aren't necessarily so structurally available, because they can also be very pernicious. Like one thing I'm thinking about is dead code from unused API endpoints. I feel
32:30 - 33:00 like the bulk of our tech debt is really like that. And so I'm wondering if you've thought about ways to get at app logic tech debt. Yeah. So the example analyzers I shared are using static analysis. They are only looking at the code in a file, and that's limited. For code that is not run at all, you would need some runtime data to back that up. And
33:00 - 33:30 so something I have tried in my previous company was to develop some profilers to collect data, like whether a function was called in the past few days. With this kind of data you can identify tech debt better. But in my current job I don't have something
33:30 - 34:00 like that yet, but I think in the future we will also want to explore more in this area, to be able to track the tech debt that requires some other data set, like runtime data, or maybe whether something is tested well or not. For those you would need to collect the data differently. Thank you. So it looks
34:00 - 34:30 like we have run out of time, but I still see people have questions. Maybe we can move to the hallway to continue the discussion. Thank you, everyone.