Compiler Construction with LLVM and MLIR

[Tutorial] How to build a compiler with LLVM and MLIR - 01 Introduction


    Summary

    In the first installment of his video series, Samuel, a dedicated programming language enthusiast, introduces viewers to the intricacies of building a compiler using LLVM and MLIR. Samuel discusses the motivation behind this series, aiming to fill a knowledge gap despite the substantial but complex online documentation. He highlights his two-year journey developing a new programming language named Serene, emphasizing the series as a guide for contributors and those interested in LLVM and MLIR. Samuel plans to document his non-live coding journey through structured episodes, each accompanying a dedicated branch for reference. This episode sets the stage for future discussions, exploring the foundational elements required for compiler construction while inviting enthusiasts to contribute to the Serene project.

      Highlights

      • Meet Samuel, your guide through the world of compiler construction with LLVM and MLIR! 🚀
      • Explore Serene, Samuel's new programming language, as we dive into the nitty-gritty of its development! 💻
      • Discover how Samuel intends to make this series a vital resource for understanding LLVM and MLIR! 👨‍💻

      Key Takeaways

      • Samuel's passion for language design sparks this informative series on LLVM and MLIR! 🔥
      • Serene, the new language, is two years in the making and open to contributions! 🤝
      • Understand the intricate world of compiler design with structured yet comprehensive insights! 🧠

      Overview

      Samuel kicks off this exciting series by delving into the world of compiler construction using LLVM and MLIR, driven by his deep passion for programming languages. With two years of effort poured into his new programming language, Serene, Samuel sets out to demystify the often complex documentation surrounding these technologies.

        Throughout the series, Samuel plans to walk viewers through his development process, providing insights into the language's creation and evolution. By avoiding live coding, he ensures each episode is rich with content and backed by a dedicated code branch, making this series a practical and valuable resource for learners and contributors alike.

          As Samuel lays the groundwork in this introductory episode, he emphasizes collaboration and community involvement, especially for those captivated by LLVM and MLIR. This series promises to empower both budding and seasoned developers by offering a transparent glimpse into the fascinating realm of language and compiler design.

            Chapters

            • 00:00 - 01:00: Introduction and Host Background The introduction provides an overview of the video series, focusing on building a compiler using LLVM and MLIR. The host, Samuel, is introduced as a software engineer with a passion for programming languages. He has been developing his own programming language for over two years and aims to share insights and methodologies throughout the series.
            • 01:00 - 02:30: Series Overview and Goals This chapter introduces the motivation behind creating a video screencast series on LLVM and MLIR. It highlights the lack of comprehensive online resources despite extensive documentation, which makes personal research and reading source code necessary to fully grasp the topics.
            • 02:30 - 04:30: Programming Language History and Challenges In the inaugural episode of the series, the speaker introduces the overall plan for upcoming episodes and offers a brief overview of the programming language they've been developing over the past two years.
            • 04:30 - 09:00: Journey Through Language Development This chapter introduces a video series focused on creating a new programming language. The speaker mentions that most of the language has already been developed and will be showcased in the series. Different sections and parts of the language will be discussed throughout the series.
            • 09:00 - 13:00: Current State and Future Plans This chapter delves into the development of a new programming language called Serene (serene-lang). Besides creating the language, the author aims to make this video series a comprehensive guide for potential contributors interested in joining the project.
            • 13:00 - 15:30: Call to Action and Closing Remarks The chapter focuses on the intention behind the video series, aiming to serve as a tutorial or guide for individuals interested in LLVM and MLIR. It mentions that while there is good documentation and even tutorials available on the LLVM website for both LLVM and MLIR, the series seeks to complement those resources. The goal is to create a practical and understandable guide for those who wish to delve into these topics.

            [Tutorial] How to build a compiler with LLVM and MLIR - 01 Introduction Transcription

            • 00:00 - 00:30 hello and welcome to my video series on how to build the compiler with llvm and mlir this is samuel your host and i am a programmer i'm a software engineer who's obsessed with programming languages and i've been working on creating a new programming language for my own for over two years now i guess i thought to myself that it might be a
            • 00:30 - 01:00 good idea to start a video screencast series on the topic because it's really interesting and you can't find really enough information about this kind of stuff on the internet especially around llvm and mlir there's great documentation around them but i still you need to read a lot of source code you need to do your own research and it's
            • 01:00 - 01:30 not an easy job uh to do so here i am recording my very first episode on the topic and today on this video on this episode uh i'm going to just talk about our plan um for the rest of the series uh and a little bit of history about the programming language i'm working on for the past two years so
            • 01:30 - 02:00 let's begin with what basically what we're going to do and what is this video series all about hopefully in the rest of the series we're going to create a new programming language which um i already made most of it uh so i'm going to showcase it for you and we're going to talk about like different sections and different parts
            • 02:00 - 02:30 of this new programming language one of my actual goals uh besides creating a programming language is to create like for this video series to be a guide for any contributor who might be interested in contributing to this programming language which the name is serene uh serene lang um another goal
            • 02:30 - 03:00 is to for this video series to be as a like a tutorial or a guide for whoever is interested in llvm and mlir which is like a subsection of llvm uh there's some really good documentation on llvm website about llvm and mlir there's like even a tutorial for both to create a like a really simple language but that being said that
            • 03:00 - 03:30 doesn't mean that by reading that tutorial you would get and you would understand everything you still need to study hard you need to read a lot of source code because like even with the extent of documentation which is available on llvm.org it's it's really huge and you have to dig in like ask questions in different communities read a lot of source code and there's no clear path on
            • 03:30 - 04:00 how to like utilize llvm like to the fullest so hopefully i'm aiming to create this video series to be like a guide through like for everyone through llvm and mlir but um sorry so
            • 04:00 - 04:30 if we want to like for the rest of the video series um the plan is to i'm not going to live code anything so i'm going to go on my own write some stuff like try to figure out things and uh come up with a solution to some of the problems problems i'm facing and then when i reached to a certain milestone i'm going to record a video
            • 04:30 - 05:00 describing what i what i did and hopefully it would be a guide to that section for example the next my plan for the next episode is to talk about the build system and start by probably the reader part of the language but i will get to that later so um today is the 2nd of july in 2021 so what i'm going to do for each episode
            • 05:00 - 05:30 is to create a branch in that moment in time for that specific episode so the master branch like will do its thing its own thing for anyone who watch this uh is watching this video in the future you have to refer to the branch for each episode to find out like the stuff that i'm going to talk about so uh and
            • 05:30 - 06:00 my plan is to keep the branches around for a long time to match the videos and each episode uh also uh i can't really refer to myself as an expert i'm i'm no expert i'm just someone who's obsessed with these topics and i i'm really into like language design and stuff like that so i do my own research i do my own study but
            • 06:00 - 06:30 if you find something that uh you like piqued if it piqued your interest please feel free feel free to contribute to the uh to serene i would be more than happy uh to review your contribution um to continue oh not again sorry let me give you a like a brief history
            • 06:30 - 07:00 about the language itself uh which we're going to have a look uh in the next next episode so for the past two years i've been working on creating a new language which which is a lisp basically i started really simple i i like did many implementation in different languages i started with java and
            • 07:00 - 07:30 i created like a really simple lisp interpreter which really was easy to implement and it worked just as i expected just to gain more experience about like what do i need to do how the reader would like look like or what would be the challenges in designing a language i started really simple by creating uh an interpreter rather than a compiler but little by little when i implemented like newer stuff you
            • 07:30 - 08:00 in like different implementations i had i realized that okay uh having an interpreter might be cool might be might be good or even handy but it's not going to be very different than other dialect of lisp or other uh like scripting languages like python or other stuff and to be and to be frank like whatever i create as an interpreter is
            • 08:00 - 08:30 not going to be able to compete with something like ruby or python because they have more than 20 25 years of experience beside them to behind them to support them so little by little i decided to do more research on different topics like the type system the uh differences sorry about the type system and different aspects of a language and i like i used
            • 08:30 - 09:00 my research the result and the fruit of my research to implement new like different implementation using different languages to find out to find like what platform can be a good uh platform for the language i'm working on so obviously the first one was the jvm it worked fine but not really great i moved to uh using graalvm with truffle library
            • 09:00 - 09:30 i worked on it a little bit it's it seems really promising i have a blog post about it in my uh the blog about the whole research i've uh i've done for picking like a choosing a good uh platform and like i even implemented another interpreter for bootstrapping a compiler in golang but when i got introduced to llvm and especially mlir it changed everything llvm is so
            • 09:30 - 10:00 elegant and like well designed that it simply over like it took my breath like i had no other like when i saw it uh mlir especially i i was like okay i don't need anything else this is the one it's the best let's do it uh probably in
            • 10:00 - 10:30 future episodes i'm going to talk about llvm and mlir more in depth and wider why i made this decision why llvm is like the best but for now to be really like brief llvm is good because it's modular and it designed as a library as a framework to help you create your own compiler um it's
            • 10:30 - 11:00 the main language for llvm is c plus plus uh so obviously we need to ha you need to know c plus plus to some extent i'm not a c plus plus expert myself but we're going to use it anyway because the api of uh llvm is in c plus plus uh and we need to know a little bit of how to use cmake the build system which is quite easy you don't have to be you don't have to be worried
            • 11:00 - 11:30 about so the main code exists and uh repository on the virus.code if i want to show you we have like two main repositories uh this serene simple is the java implementation which i did like long ago i i made some tweaks here and there but the the majority of the code is done like long ago um there's no branch on and the second
            • 11:30 - 12:00 and the most important repository is serene itself which contains many different branches i i kept the other implementation i've done on serene on different branches just i don't know as like a historic thing i like to have them around like the golang implementation the first try on c plus plus implementation uh there should be a rust somewhere
            • 12:00 - 12:30 around here yeah the rust implementation overall up until now i've created enough of infrastructure for the language which contains a reader a semantic analyzer and several layers of intermediate representation languages which we're going to talk about in like in-depth in future episodes
            • 12:30 - 13:00 but um right now my aim like it's not it doesn't have any specific feature at the moment the type system is missing like it only understands uh the only thing that it can compile is functions and integers but my aim is to wire up every single piece of functionality in the compiler to it like together be able to like
            • 13:00 - 13:30 compile a certain file from reading the file to generating the actual binary from start to end and also create a just in time compiler so overall i want the mvp of a compiler you know like the most valuable product no sorry i want something really minimal it just works but it's going to give me enough insight
            • 13:30 - 14:00 into different pieces so i won't over engineer any piece or i spend too much time on on a section which i might have to change in the future which which honestly happened to me many times so uh right now 2nd of july i'm in that state but it's good enough that we can start talking about the reader the semantic analyzer different classes we have in in the source code also uh
            • 14:00 - 14:30 that's it but if you if you're interested in helping out please reach out to me um i'm going to share my info with you as well so my website is xamarin.com and like my email address is
            • 14:30 - 15:00 again alex here you know you know.org why not um oh my god so um please reach out to me also if you if you're interested in this topic and especially if you want to keep updated with the new episodes please subscribe
            • 15:00 - 15:30 to this channel hope to see you in the next episode thank you