Howard Chang (Stanford, HHMI) 1: Epigenomic Technologies

Estimated read time: 1:20

Summary

In this engaging video, Howard Chang, a Stanford University professor and Howard Hughes Medical Institute investigator, explores the exciting world of epigenomic technologies. Chang delves into the concept of epigenomics, explaining it as the study of the living genome and its dynamic interactions with the environment—offering crucial insights for personalized health approaches. The video reviews groundbreaking techniques like ATAC-seq for mapping chromatin accessibility, illuminates the significant role of non-coding DNA in disease, and illustrates how epigenomic methods are revolutionizing our understanding of cancer and genetic information. The discussion emphasizes the profound potential of these technologies in developing personalized therapeutic strategies.

Highlights

Howard Chang examines the role of epigenomics and how it reveals that DNA isn't the sole determinant of traits. 🌟
New epigenomic technologies like ATAC-seq are revolutionizing how scientists study gene regulation and interactions. 🔬
Chang explains the importance of non-coding DNA in understanding disease mechanisms, particularly in cancer. 🎯
The potential of epigenetic corrections to restore healthy phenotypes and their application in clinical settings is discussed. 🧪
Through innovative mapping techniques, researchers can pinpoint which genetic switches are active in various cell states, aiding in personalized treatment development. 🗺️

Key Takeaways

Epigenomics explores how genes are turned on or off, influencing cell diversity and health. 🧬
DNA isn't destiny—your genetic makeup interacts with your environment in complex ways. 🌱
ATAC-seq technology allows detailed analysis of chromatin accessibility, offering insights into genetic regulation. 📊
Non-coding DNA plays a crucial role in disease—understanding it can lead to breakthroughs in treatment. 🔍
Epigenomic technologies provide new perspectives on cancer, revealing how cells operate differently in disease states. 🦠
Mapping the epigenome enables personalized medicine approaches and potential therapeutic strategies. 💊

Overview

In this video, Howard Chang, a prominent researcher in the field, breaks down epigenomics and its importance in understanding the dynamic regulation of genes beyond their DNA sequence. He introduces viewers to the concept that while our DNA contains the blueprint, it's the epigenome that dictates which parts of the blueprint are utilized, a key factor in cell differentiation and health.

Chang takes us on a journey through innovative technologies like ATAC-seq, which have transformed the scientific landscape by allowing precise mapping of chromatin accessibility. This technology sheds light on how genes are regulated and how these processes can go awry in diseases like cancer. By understanding these mechanisms, scientists are gaining invaluable insights into the intricate dance between genes and their regulatory networks.

The video emphasizes the revolutionary impact of epigenomic research on medicine, particularly in the realm of personalized therapies. By identifying which parts of the genome are actively regulating cells, researchers can better understand disease phenotypes and develop more targeted and effective treatment strategies. Chang's engaging presentation conveys the promising future and potential of these cutting-edge technologies in advancing human health.

Chapters

00:00 - 02:20: Introduction In the introduction, Howard Chang, a professor at Stanford University and investigator of the Howard Hughes Medical Institute, introduces the topic of his three-part talk on epigenomics and long non-coding RNAs. He highlights that epigenetics is an especially relevant and popular topic, and notes that the term 'epigenetics' literally means 'above the genes'.
02:21 - 02:59: The Living Genome and Personalized Health The chapter discusses the concept that DNA does not solely determine destiny. It uses the example that while nearly every cell in the body shares the same DNA, their functions differ – a skin cell differs from a muscle or brain cell. This is because cells make choices about which genes are activated or suppressed, a study field known as epigenomics.
03:00 - 04:00: Epigenomic Technology Development Epigenomics is the study of the living genome, focusing on the interactions and dynamics between genetic predispositions and environmental influences. This field has evolved to directly measure these activities, providing significant implications for personalized health. Epigenomic technologies are pivotal in clinical applications and monitoring health states by understanding how nature and nurture interact.
04:01 - 05:20: Chromatin Features and Gene Regulation The chapter discusses the relationship between DNA (genome), epigenome, and potential disease states. It uses an analogy where genes are the template information, like an image, while the epigenome is the lens that projects this image. With aging or disease, the DNA template may degrade, and the epigenome lens may become cloudy, impacting gene expression and regulation.
05:21 - 10:00: ATAC-seq Technology and Applications The chapter discusses the potential of epigenetics in restoring degraded genetic information to achieve a healthier phenotype. The metaphor of eyeglasses improving vision is used to illustrate how correcting the epigenome can enhance genetic information presentation. This correction might lead to better health outcomes, aligning with the promise of epigenetic interventions.
10:01 - 17:00: Single-cell ATAC-seq and Cancer Research The chapter focuses on the burgeoning field of epigenomics, particularly highlighting the current technological advancements that are significantly impacting cancer research. It discusses how technologies typically advance through phases, moving from discovery to systematic decoding. Mention is made of the historical progression in genomics, tracing advancements starting from the 1950s with the discovery of DNA structure, to the first DNA sequencing technologies in the 1970s.
17:01 - 27:20: Enhancer Connectome Method The chapter titled 'Enhancer Connectome Method' begins with a discussion on the advancements in next-generation sequencing technology over the past decade, which has made genome sequencing more accessible. The chapter also explores the evolution of epigenomics, describing it as akin to the software programming of our cells. It mentions the discovery of the first chemical markers associated with epigenetic memory in the 1950s, setting the stage for further developments in the field.
27:21 - 39:00: Perturb-ATAC and Epigenomic Writing The chapter delves into the methodological advancements in detecting epigenomic marks in laboratory settings, tracing its evolution from the 1980s to the present. Highlighted is the Perturb-ATAC technology, which has significantly accelerated the ability to systematically map and decode epigenomic information. The discussion emphasizes the importance of understanding genome features, particularly those related to disease-associated genes, as critical in the field of epigenomic research.

Howard Chang (Stanford, HHMI) 1: Epigenomic Technologies Transcription

00:00 - 00:30 I'm Howard Chang, a professor at Stanford University in California and an investigator of the Howard Hughes Medical Institute. Today I'll be talking to you in a three-part talk about epigenomics and long non-coding RNAs. Epigenetics is a very hot topic today. The word literally means "above the genes," and you can remember the catchphrase
00:30 - 01:00 that your DNA is not your destiny. A very good example of this is that nearly every cell in your body has the same DNA, yet your skin cell is not the same as your muscle cell or your brain cell. That is because these cells have choices, choices about which genes to turn on and off, and the comprehensive study of these gene regulatory events is the modern study of epigenomics. Literally, we
01:00 - 01:30 can think about epigenomics as studying the living genome. This field has evolved so that we can now directly measure these activities, but it also has important implications, because it has the dynamics of the interaction between nature and nurture, or what you're born with and the impact of your environment. As such, this may have important implications for your personalized health, for example in clinical applications and also in monitoring health states.
01:30 - 02:00 Here's another potentially useful analogy to think about the relationship between your DNA, your genome, your epigenome, and their involvement in potential disease states. We can imagine that your genes are like this template information, like this image, and the epigenome is the lens through which the information is projected to show this beautiful image. With aging or with disease, this template gets degraded and the lens may become cloudy, so this
02:00 - 02:30 image is now blurred. The promise of epigenetics is that perhaps we can actually fix the situation: even if the genome information is still somewhat degraded, the lens, the epigenome through which this information is projected, can be corrected in such a way, like the glasses I'm wearing, to actually restore this image and basically restore the phenotype we associate with a healthful state. That is the conceptual promise.
02:30 - 03:00 Another reason for the excitement about epigenomics is that the technology is really at an inflection point. In every field, technologies go through different phases of discovery, detection, and systematic decoding. In the field of genomics, the hardware, the DNA in our cells: we discovered the structure of DNA in the 1950s, the first technologies to detect or sequence DNA occurred in the 70s,
03:00 - 03:30 but it is only in the last decade or so that we have really had next-generation sequencing technology to make routine genome sequencing a possibility. Epigenomics is also going through a related kind of evolution, and we can think about epigenomics as the counterpart, the software programming of our cells. So the first chemical marks associated with epigenetic memory were discovered in the 50s, some of the first
03:30 - 04:00 methods to detect these marks in a laboratory setting were developed in the 80s, and I'll be telling you about some new technologies developed in the last decade that really sped up the capacity to systematically decode this epigenomic information. Let's zoom in to the specific features in the genome that we're talking about. When we think about genes, specifically disease-associated genes, we have to remember
04:00 - 04:30 that each of these genes is associated with switches, DNA regulatory elements that decide when and where the gene turns on and off. These DNA elements are the binding sites for trans-acting protein transcription factors or regulatory RNAs. The picture in the human genome actually looks more like the bottom, where just 2% of the information is protein coding and the vast amount of the real estate, 98%, is actually part of
04:30 - 05:00 this regulatory DNA. We also know that human variants associated with disease reside in this non-coding space. Systematic work over the last several decades by many investigators has found that this DNA is packed into chromatin, and I'll refer you to the iBiology talk by David Allis, which goes into great depth about these different chemical marks. But the end conclusion is really that each part of the gene has
05:00 - 05:30 characteristic chemical and physical features on chromatin, and that these features reflect the current activity and the future trajectory of the genes. The epigenomic technologies I'm talking about are basically the systematic mapping of these chromatin features across the genome. This cartoon shows that if there's a protein-coding gene here, there will be promoters, which is where the gene starts; there are DNA elements like enhancers
05:30 - 06:00 that activate this gene in specific cell types; there are additional DNA elements that might prevent the gene from being activated in a different situation; and there are also, for example, insulators, things that basically break up genomes into neighborhoods of control. These interactions would have to occur, potentially, through long-range DNA looping, chromosome looping. A very fundamental feature of these activities is that the DNA has to be accessed: it has to be physically touching the
06:00 - 06:30 regulatory machinery for this regulation to happen, and that is a fundamental feature that we can exploit. Now, in every human cell, two meters of DNA is packed into a 10-micron nucleus. Therefore most of your DNA is highly compacted, all wound up and not accessible, except the active DNA elements that your cell is actually using and reading. And so simply finding out where these accessible elements are located actually
06:30 - 07:00 can give us a lot of information about the software program that your cell is running. A few years ago, my colleague Will Greenleaf and I at Stanford invented a technology called assay of transposase-accessible chromatin, or ATAC-seq for short. It uses an enzyme called Tn5 transposase, which copies and pastes DNA; it's derived from a bacteriophage. We already load up this enzyme with
07:00 - 07:30 sequences that can go onto our sequencing machine. When this enzyme tries to copy and paste into eukaryotic chromatin, it can only paste into the open chromatin sites, and so in a single step you selectively and covalently tag the genome at the accessible sites, which then allows us to amplify and sequence these elements. So this very elegant, simple sort of strategy led to a million-fold improvement in sensitivity and a
07:30 - 08:00 hundred-fold improvement in the speed of mapping the regulatory DNA, the epigenome, in human cells. Here's an example of what the data look like. On the x-axis are the locations of genes, and the height of these peaks indicates the level of accessibility: the taller the peak, the more accessible it is. The first track, shown in blue, was the prior gold-standard technology, called DNase hypersensitivity, and it used ten
08:00 - 08:30 million cells. The second row, in green, is the first version of ATAC-seq technology, which used only 50,000 cells, and the third row is the ultimate resolution that we achieved, which was actually single-cell ATAC-seq; this is several hundred single cells summed together. You can see that the patterns look very similar across these applications. However, now with the single-cell information you can zoom in, and now every row is a single cell; here are 254
08:30 - 09:00 single cells going down this way, and at every position here you basically see either 0, 1, or 2 reads, because human cells are diploid, and therefore this kind of analog signal can turn into digital information. When we see these individual peaks, we of course want to know what factors are acting on these individual gene switches.
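The relationship between the per-cell rows and the summed track described above can be sketched in a few lines of Python. This is only a toy illustration with invented numbers, not the analysis pipeline used in the talk; the matrix `cells_by_position` and the function `pseudo_bulk` are hypothetical names introduced here. Each row is one cell, each column one genomic position, and each entry is assumed to be 0, 1, or 2, following the talk's point that a diploid cell yields at most two reads at a position.

```python
# Toy single-cell accessibility matrix: rows are cells, columns are genomic positions.
# All values are invented for illustration; real values come from sequencing reads.
cells_by_position = [
    [0, 1, 2, 0, 0],
    [0, 2, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 2, 0, 0],
]


def pseudo_bulk(matrix):
    """Sum per-cell read counts at each position to get an aggregate profile."""
    for row in matrix:
        # Assumption taken from the talk: a diploid cell gives 0, 1, or 2 reads per position.
        assert all(0 <= v <= 2 for v in row), "per-cell counts must be 0, 1, or 2"
    # zip(*matrix) walks the matrix column by column (one column per position).
    return [sum(column) for column in zip(*matrix)]


profile = pseudo_bulk(cells_by_position)
print(profile)  # [0, 4, 6, 0, 1]
```

Summing the near-digital rows recovers the peak-shaped aggregate signal, which is why several hundred single cells added together resemble the bulk tracks.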
09:00 - 09:30 There's another very interesting feature of the ATAC-seq signal that we can exploit. We learned that many times, at the summit of every peak, there's an approximately eight- to ten-base dip, and this is called a footprint. So this is an example of an ATAC-seq signal, and this is exactly the binding site of a DNA-binding factor on the DNA. The idea is that we're essentially spray-painting the genome with our ATAC-seq enzyme, and if a protein is sitting on
09:30 - 10:00 the DNA, you can spray-paint to the left of it or to the right of it but not on top of it. If I were to put my hand in front of a wall and spray-paint, when I move my hand away you'll see a shadow; it shows that an object was there. That is a similar kind of principle. And so we can see that if we directly retrieve this particular factor, called CTCF, this is the location where CTCF is sitting, and the footprint of CTCF in the ATAC-seq data
10:00 - 10:30 looks very similar. Because we actually know the binding preferences of hundreds of factors across the genome, we can look across the genome, ask where we see this kind of footprint, and infer the binding locations. For example, in this map here, every row is an instance of the CTCF binding site in the genome, and this is the center of the sequence that is being bound. We can
10:30 - 11:00 see that only these sites up here have this kind of footprint pattern, and the ones at the bottom do not. If we directly retrieve CTCF by a different technology, we see the same answer: the top ones are bound and the bottom ones are not. And so, again, because we know the binding sequence or preference of hundreds of factors that bind to DNA, we can actually learn the binding locations of these factors all at once, in an allele-specific fashion. Now
11:00 - 11:30 that we have this powerful technology, we can think about what we can learn from this map, this epigenomic map of individual cells. By analogy, we were really quite inspired by the kind of digital maps that we all use in our daily lives for navigation. These digital maps represent the real world in different layers, including the lay of the land, the different businesses, the different streets, where your friends are,
11:30 - 12:00 and each of these layers makes the map more useful. So we also imagine, by analogy, that if we build up a personal snapshot of gene regulation, the epigenome or the regulome, we want to learn from this map the different cell types, the different cell states, the tissue microenvironment, the cell lineages, and the effects of perturbations or drugs, connect them all together through computation, and make maximal use of this personal regulome
12:00 - 12:30 information. I'll show you some examples of how we can extract this kind of information from the epigenomic map. An important concept is that the epigenome encodes information on cell type identity. Here on this map I'm showing you six tracks, six different cell types from the blood, starting from the hematopoietic stem cell, the HSC, to cells that make different lineages: myeloid cells, white cells; the MEP,
12:30 - 13:00 which makes red cells; and specific kinds of immune cells, CD8 T cells and NK cells. On the right you can see that for this particular gene, TET2, the messenger RNA level varies by less than two-fold across these different cell types, so you might think that TET2 is not a very good marker for different cell type identities. But if you look at the chromatin landscape, you see a completely different picture, which is shown on the left. You can see that
13:00 - 13:30 the TET2 promoter is accessible in all these different cell types, but then you see that progenitor cells have one set of accessible elements, and further elements distinguish, let's say, lymphoid cells and specific kinds of cells such as CD8 cells and NK cells. So the message here is that each of these cell types is turning on the same gene and making the same RNA, but doing it with different gene switches, and these switches then tell us
13:30 - 14:00 the identity of the cells that are involved. This concept can be particularly powerful when we think about the problem of cancer. Cancer cells are individually different, and this has long been known. On the left is a plate from a paper by Virchow back in 1847; he was drawing images that he saw under the microscope, and you can see that individual cells are not all identical. And the
14:00 - 14:30 right-hand side is a more modern image from a review, which raises the concept that tumor cells can go through this kind of epigenomic or chromatin change, and that changes their behavior. We used our technology and teamed up with colleagues at Stanford University to study human leukemias, acute myeloid leukemia. In this particular disease, which is a cancer of the blood cells, we know that the hematopoietic stem cell, the HSC, that gives rise to all the
14:30 - 15:00 other cells in the blood suffers a series of mutations. The first mutations create something called a pHSC, and further mutations cause a cell, shown in yellow here, the leukemic stem cell, that can now propagate the disease. These are still a minority of cells in the blood; the vast majority of cells are these blast cells, which are colored red here. We were able to isolate all these different cell types from leukemic patients and show that, in fact, also in
15:00 - 15:30 parallel to the genetic changes, there are corresponding systematic changes to the epigenome as these cancer cells progress through these different stem cell fates. We can also answer some important questions. We know that in certain cancers, in leukemias, the leukemic cells will show features or markers of different kinds of normal cell types, and this is a confusing situation: people are not sure whether
15:30 - 16:00 it's because there are two kinds of cells running around in the leukemia, or whether there's really one cell running two programs. And so we used our single-cell technology, single-cell ATAC-seq, to examine a leukemia patient, in this case a patient's leukemic stem cells. On the graph on the right here, each dot shows either an individual cell or a particular cell type, and this two-dimensional plot indicates their relationship by distance:
16:00 - 16:30 the more related the cells are, the closer together they are, and if they're far apart it means they're quite different. What we see is that these purple cells, the individual cells from the cancer patient, do not map to any of the known cell types; they map in between them, and that really indicates that it's a single cell running two different programs, a concept called lineage infidelity. It further turns out that the more these cancer stem
16:30 - 17:00 cells are running the program of the normal hematopoietic stem cell, the HSC, the more they are able to copy themselves and renew themselves, and in the case of cancer that is a bad situation. We found that, in fact, leukemic stem cells with a high pHSC potential are much more likely to cause death, unfortunately, for the patients, whereas those with a low HSC content have a much better outcome. And so we
17:00 - 17:30 can see that even this epigenome information has potential prognostic value. We were able to extend these concepts to solid cancers as well. The Cancer Genome Atlas has been a major effort for the cancer community over the last decade; many investigators have systematically collected nearly 10,000 tumor samples and sequenced their genomes and their RNA, but until very recently we didn't have any information on the epigenome landscape.
17:30 - 18:00 We teamed up with the TCGA group and used ATAC-seq technology to map the chromatin landscape in 23 human cancer types, which are shown here on the right. These span some of the most common and deadly human cancers, including glioblastoma, lung cancer, breast cancer, colon cancer, and so on and so forth. We studied 410 tumors, and we discovered
18:00 - 18:30 over half a million DNA elements that are active in these diverse cancer types. What is very intriguing is that nearly half of these elements are not active in our surveys of normal tissues; they're only activated in the context of the pathology of cancer. We can learn some really intriguing results. Geneticists have long studied different families, looking at different risks of
18:30 - 19:00 cancer, and the vast majority of these risks associated with cancer actually fall into the non-coding elements, and so for them it's a mystery how they might work. On the right is an image from the epigenome mapping; again, genes are on the x-axis, and the height on the y-axis indicates accessibility. At the top, in orange, I'm showing you five examples of colon cancer, and at the bottom five examples
19:00 - 19:30 of kidney cancer. The gene being shown here is called MYC; it is a very important and powerful oncogene, and nearly all of the cancers turn on MYC. But the point I want to make is that the colon cancers turn on MYC using different elements, shown more to the left side, the 5' end of the locus, and the kidney cancers turn on MYC with a different set of elements, more to the 3' end of the locus. So again, different switches, even for a common oncogene, across
19:30 - 20:00 different cancers. The second important point is that one of these switches that's turned on in colon cancer is precisely the location for colon cancer predisposition; it's only active in colon cancer. Conversely, an element that's associated with kidney cancer predisposition is, again, only turned on in kidney cancer. So this epigenome mapping provides us with at least, I think, a biochemical hypothesis, an explanation for these risk
20:00 - 20:30 elements associated with cancer predisposition. We also learned that, beyond inherited risk, we can also explain somatic mutations, those that are acquired in the body in the course of cancer. This is a map looking at a particular locus in different kinds of bladder cancers and kidney cancers, and we see that all these cancers have the same landscape except this one, which all of a sudden gains this very strong
20:30 - 21:00 accessible element activity in this locus. What we discovered is that if you look in the ATAC-seq data, this accessibility comes from a mutated element. The normal sequence is shown at the bottom of this graph, and the mutated sequence has changed a single base, this letter, from C to T. And what we realize here is
21:00 - 21:30 that the cancer is essentially hacking the password of the genome. This sequence shown on the top here is the perfect binding site for a particular transcription factor called NKX, and when the cancer cell changes that C to a T, it now has the perfect binding site for NKX and therefore gains accessibility, because the machinery starts reading that part of the genome and turning on the gene. We further found that the gene linked to this element
21:30 - 22:00 is called FGD4. When the FGD4 level is quite high, this is associated with a very strong risk, again, of death, and therefore this is the kind of information that could be quite valuable. So now we can use the epigenomic information to understand both inherited and acquired risk of cancer. This technology has continued to undergo evolution, and a very important recent
22:00 - 22:30 advance is the increase in the scale of mapping single-cell chromatin accessibility. This uses a microfluidic technology that can parse individual nuclei from cells into nanoliter-sized drops. In these droplets we then combine them with what are basically little beads that contain DNA sequences; each bead contains a different sequence, and that's the barcode. And so when an
22:30 - 23:00 individual nucleus meets an individual barcode, we can transfer the information from the barcode onto the nucleus, and that says that all the molecules in that little drop came from the same cell. Once we have tagged all these individual drops, we can break the drops and sequence all the molecules together, but we still retain the information that they originally came from different cells. So this technology allowed us to scale up the throughput of single
23:00 - 23:30 cell epigenomics from, let's say, several hundred cells per assay to now tens of thousands of cells, or perhaps even more, in a single experiment. We were recently able to team up with colleagues at Stanford University to use this technology to look at a very important aspect of cancer treatment called cancer immunotherapy. The poster child of cancer immunotherapy is an antibody called anti-PD-1; it's called
23:30 - 24:00 checkpoint blockade because it releases the brakes that are on the immune system for fighting against cancer. In this kind of work, people are really interested in what kinds of immune cells are coming in to fight the cancer and how they change in the course of cancer treatment. The challenge is that, again, we're talking about clinical material, biopsies from patients, that are tiny, and you have one shot to get it right, because you can't just
24:00 - 24:30 go back and keep asking a person to have surgery. So, in the context of a clinical trial for a kind of cancer called basal cell carcinoma, we were able to serially biopsy the same tumor before, during, and after treatment, and then subject the samples to this very powerful single-cell epigenome analysis. This map, called a UMAP, is again a two-dimensional plot that represents this cell information; again, related cells are clustered more closely
24:30 - 25:00 together and different cells are separated. There are nearly 30,000 single tumor-infiltrating T cells in this map that we have analyzed; they've been color-coded based on different classes of cells. The only point I want to make here is that this tumor microenvironment is really a world unto itself; it's really diverse, and there are all different kinds of cells that you would have missed if you just averaged everything together into a mush. What we can further learn
25:00 - 25:30 is that these cells are related. On the left I'm showing you trajectories that we've mapped out, based on the single-cell ATAC-seq data, of the cells as they develop: from naive CD8 cells into effector T cells, memory cells, or exhausted cells, and also from naive CD4 cells into these CD4 TFH cells. What we learn on the right is that we can compare the same patients before and after checkpoint blockade and ask which populations change, and what really
25:30 - 26:00 emerges is that there are two populations, exhausted CD8 T cells and the CD4-positive TFH cells, these two arms going down, and these are what expand and, we think, are very important for cancer immunotherapy. We've been talking about individual DNA elements and how we can use them to learn about the epigenome. An equally important challenge is linking these DNA elements to their target genes, and this cartoon illustrates
26:00 - 26:30 part of the problem. We know that the gene regulatory landscape is interwoven: a DNA element can control a gene that's actually quite far away from itself, and there might be several genes in between. Therefore simply finding out that an element is active is not enough to say which nearby gene is actually being controlled. This is a question we can phrase as the last mile of human genetics: what is my target gene? If GWAS, these large-scale studies, find DNA variants associated with disease, now we
26:30 - 27:00 want to know which genes are under their control and might be changed. And so this really needs a different aspect of epigenome technology, looking into DNA folding and how DNA elements touch their target genes. A technology that was developed, that we think is quite useful, is a method that we call the enhancer connectome. The idea is that we can take cells and cross-link them in their native
27:00 - 27:30 nucleus to preserve the three-dimensional contacts. We can then retrieve the active enhancers based on one of these chemical marks that I talked about in the beginning, in this case a histone modification called histone H3 lysine 27 acetylation. Then, when you sequence these contacts, what you should get is a map like this, where we can see individual DNA elements, in this case, for example, a causal variant for a disease, and then its
27:30 - 28:00 target genes, in this case gene D and gene A, but not the nearest gene, which is gene B. I should mention that, by default, in the genetics literature people oftentimes report these disease-gene associations just based on the nearest gene on the linear genome, and this information may or may not be correct. So it's really a shame that we've done all this work but maybe haven't gotten the very precise information that we need. This enhancer connectome method actually
28:00 - 28:30 proved to be quite powerful. It was a ten-thousandfold improvement in the sensitivity: we needed only fifty thousand cells instead of millions or tens of millions of cells, and there's also a tenfold improvement in the sequencing depth that one needs to get precise information. Here's an example looking at a rather rare blood cell, Th17 cells from human blood, from an individual standard blood draw. We can see these kind of checkered maps
28:30 - 29:00 that relate long-range contacts in DNA. It's the same genome on the x and the y axis, and therefore anything that's off the diagonal, such as shown here, reflects long-range contacts. And these maps show that we can actually see these contacts from 500-kilobase resolution all the way down to one-kilobase resolution. This kind of mapping from primary human cells is important and needed this technology. For some of the rare cells
29:00 - 29:30 we analyzed, we calculated that using the prior technology would need about 4 litres of blood. Just so everybody is on the same page, an adult human has 5 litres of blood, so taking out 4 litres is not something that I would recommend. So this literally would not be doable without this kind of technology. Okay, so let's first check that the information is accurate. We're looking again at this very powerful MYC
29:30 - 30:00 oncogene, and this is something called a virtual 4C view. We have an anchor point, which is usually shown by a dotted line, and that's the point in the genome that we're looking from. Each of these peaks then would be an active enhancer that is touching this viewpoint, and a taller peak means there's a stronger interaction, or it's a stronger enhancer, or a combination of both. And so this viewpoint told us that in this
30:00 - 30:30 particular cell that we're studying, this MYC gene is being contacted and turned on by these peaks, these five peaks that are shown here. So how do we know that is correct? It turns out that in a recent study, Fulco and colleagues actually went in and systematically tried to block every piece of DNA in this entire interval, whether it's known to be active or not, and they found five elements, shown on the bottom here in red hatch marks, and they exactly line up with the locations that were identified
30:30 - 31:00 by this enhancer connectome study, showing that the information is actually accurate. Now that we know that the information is perhaps useful, we can think about applying it to some questions in human genetics. For example, in this map of immune cells, T cells, we know that there are DNA elements that have been associated by genome-wide association studies with diseases like type 1 diabetes or rheumatoid arthritis. So what is the target gene? The nearest gene is
31:00 - 31:30 this gene at the bottom, shown in green, called SMIM20. It's not a gene that has really any known relationship to immunology. But in this enhancer connectome map we discover that if you start from the viewpoint of this disease-associated DNA element, the true target genes are actually this gene RBPJ, which is very important for T cell development, and a second gene called STIM2, which is a calcium channel that's involved in T cell activation, and that makes much more
31:30 - 32:00 sense. We can also verify that these controls are really happening. This is using a version of the CRISPR technology where we used a dead Cas9 to bring in a silencing protein, and this shows that if we target the RBPJ promoter we can silence it, lower its expression, and similarly if we target that disease-associated element that was predicted to contact RBPJ, we also have an equivalently powerful effect in lowering expression. So it shows that in fact this
32:00 - 32:30 disease-associated element is controlling that target gene. We can expand that concept and ask systematically, for all these disease-associated elements in, let's say, autoimmune disease, what are the true target genes? Is it really the nearest gene that's reported in the literature, or could it be something else? And in fact we found that across either all autoimmune diseases or specific well-known diseases like Crohn's disease,
32:30 - 33:00 multiple sclerosis, lupus, or type 1 diabetes, there's a nearly fourfold expansion of the protein targets, or the genes encoding protein targets. So a really substantial expansion of our understanding of these diverse disease types. And finally I want to talk to you about ways of systematically testing these nominated gene-epigenome connections and regulation, and that involves combining
33:00 - 33:30 epigenome reading with epigenome writing. This is a method that we've called Perturb-ATAC, a single-cell CRISPR screen for epigenomic phenotypes. The current method for doing large-scale CRISPR screens involves perturbing a large population of cells, each for example getting a different CRISPR guide to silence or knock out a different gene. We then impose some sort of selection, for example cell growth or some sort of
33:30 - 34:00 reporter gene, and we basically pull out a very small subset of cells that met our criteria. We can sequence the CRISPR guides and see which ones are enriched and which ones have been lost, and we essentially know what's been enriched. So we have these hits, but everything else that got perturbed and didn't pass our selection gets lost. There are many phenotypes that don't manifest themselves as cell growth or reporter
34:00 - 34:30 readouts. And so the concept of Perturb-ATAC is that we want to again perturb cells in lots and lots of different combinations, but for every single cell we're going to capture that cell, sequence the guide RNA, and also read out its epigenome landscape by ATAC-seq. And this really means that we're doing multi-omics: we're recording two modes of information, the chromatin and the RNA, to make this possible.
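The Perturb-ATAC idea described above, pairing each cell's guide RNA with that same cell's accessibility readout instead of keeping only the cells that pass a selection, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the published analysis: the guide name `sgX`, the element names, and every accessibility value are made up, and the comparison is a plain difference of means against control cells.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical single-cell records: each cell carries its guide identity
# and an accessibility score per DNA element (all values are invented).
cells = [
    {"guide": "control", "access": {"elem1": 1.0, "elem2": 0.9}},
    {"guide": "control", "access": {"elem1": 0.8, "elem2": 1.1}},
    {"guide": "sgX", "access": {"elem1": 0.2, "elem2": 1.0}},
    {"guide": "sgX", "access": {"elem1": 0.4, "elem2": 0.9}},
]


def mean_access_by_guide(cells):
    """Average accessibility of each element across cells sharing a guide."""
    grouped = defaultdict(lambda: defaultdict(list))
    for cell in cells:
        for elem, value in cell["access"].items():
            grouped[cell["guide"]][elem].append(value)
    return {
        guide: {elem: mean(values) for elem, values in elems.items()}
        for guide, elems in grouped.items()
    }


def delta_vs_control(means, guide, control="control"):
    """Per-element change in mean accessibility relative to control cells."""
    return {
        elem: means[guide][elem] - means[control][elem]
        for elem in means[control]
    }


means = mean_access_by_guide(cells)
delta = delta_vs_control(means, "sgX")
# The most negative value marks the element that lost the most accessibility.
most_affected = min(delta, key=delta.get)
print(most_affected, delta)
```

Because no cell is discarded, every perturbation yields a full profile of changes, which is the contrast with an enrichment screen that only reports which guides survived selection.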
34:30 - 35:00 And so this was accomplished using a microfluidic platform where we can capture the single cells in different chambers, first perform ATAC-seq, then capture the RNA and barcode the molecules from the same well, so we can then map the single-cell ATAC-seq to the single-cell RNA information. The graphs on the bottom show that this technology is actually working: if we introduce a guide RNA, for example to this gene SP1, we see a loss of
35:00 - 35:30 accessibility at SP1, and if you look genome-wide, the targets of SP1 are also being impacted. We used this technology to make lots of different perturbations, either singly or in combination. So here every row is a different perturbation, and this is a recording of what kind of perturbations have been made. Then we can see that in fact there are DNA regulatory elements that get changed
35:30 - 36:00 that are different with each perturbation, and we can then show in the third column what kind of factors are most enriched at the sites that have been perturbed. The results, we think, make a lot of sense. If you perturb this factor called EZH2, silence it, this is an enzyme that writes a histone mark called H3K27 trimethylation, associated with gene silencing. So if you get rid of EZH2, the sites that
36:00 - 36:30 previously had K27 trimethylation are most affected, and they're all upregulated: remove the silencer, and the targets get activated. If we target a transcription factor called SP1, again this is a factor that's involved in activating genes, so the most affected elements are those that contain SP1 sites, and you lose the activator, so the target genes go down, so they're on the left side of this graph. Now finally,
36:30 - 37:00 at the bottom, this is targeting a long noncoding RNA called EBER2. Prior work has shown that it interacts with a factor called PAX5, and indeed PAX5 sites are one of the most affected classes of elements in this particular screen. We wanted to use the same technology to look again at the disease-associated risk in the genome. We know that there are elements that are affected, and I've shown you how we can find them and find their target genes, but what we want to
37:00 - 37:30 know now is what we have to do to affect these switches, to turn them all on or off at the same time. And so the strategy we used is to basically take these same autoimmune diseases, alter these regulators in trans, and ask which of these combinations of regulators can most affect these disease-associated elements and their nearby contacts, which we identify using the enhancer connectome. And so this is such a map. It's a very busy slide, so every column is a different
37:30 - 38:00 disease, and we're looking at the DNA elements that are associated with that disease. Every row is a different perturbation: we're basically silencing different transcription factors, either singly or in combination, and we want to ask which of these factors have the most impact selectively on these disease-associated elements. And then the color code indicates the level of impact. Just as an example, for this particular disease called lupus, what we identify is that
38:00 - 38:30 among these factors that we examined, this particular factor NFKB1, encoding the NF-kappa-B p50 subunit, has a strong impact, and if we allow combinatorial studies, a second factor shows up, called RELA, which it turns out encodes the p65 subunit of NF-kappa-B, and these two subunits actually work together. So this unbiased screen told us that this heterodimeric complex was perhaps very important in this particular disease as a trans-acting
38:30 - 39:00 regulator affecting these thousands of switches across the genome, and that fits with a lot of known biology. So in summary, I've told you about progress in interrogating the epigenome. This is an exciting time where we really have a personal GPS for navigating the gene regulation landscape. The concept is that we have technologies now to go quickly from individual patients, or even their rare clinical specimens, to finding the DNA switches that
39:00 - 39:30 control when and where these genes turn on and off, and that might put us in a position to develop custom therapeutic strategies.
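The virtual 4C view and the nearest-gene problem discussed in the talk can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the enhancer connectome pipeline: the gene names, positions, bin size, contact counts, and the `min_count` threshold are all hypothetical. The sketch takes a binned contact map, reads out the contact profile seen from one anchor bin (the "viewpoint"), and compares the nearest gene on the linear genome with the genes the anchor actually contacts.

```python
# Toy data only: every name, position, and count below is invented.
BIN_SIZE = 10_000  # assumed 10 kb bins

# Hypothetical promoter positions (bp) along one chromosome.
genes = {"geneA": 15_000, "geneB": 45_000, "geneC": 75_000, "geneD": 95_000}

# Hypothetical disease-associated element used as the anchor.
element_pos = 52_000

# Sparse symmetric contact map: (bin_i, bin_j) with bin_i <= bin_j -> count.
contacts = {
    (1, 5): 40,  # anchor bin to the geneA bin: strong, long range
    (4, 5): 3,   # anchor bin to the geneB bin: adjacent but weak
    (5, 7): 5,   # anchor bin to the geneC bin: weak
    (5, 9): 55,  # anchor bin to the geneD bin: strong, long range
    (1, 4): 8,   # a contact that does not involve the anchor
}


def to_bin(pos):
    """Map a base-pair position to its bin index."""
    return pos // BIN_SIZE


def virtual_4c(contacts, anchor_bin, n_bins):
    """Contact counts between the anchor bin and every other bin."""
    profile = [0] * n_bins
    for (i, j), count in contacts.items():
        if i == anchor_bin:
            profile[j] += count
        elif j == anchor_bin:
            profile[i] += count
    return profile


def nearest_gene(genes, pos):
    """Gene whose promoter is closest on the linear genome."""
    return min(genes, key=lambda g: abs(genes[g] - pos))


def contacted_genes(genes, profile, min_count):
    """Genes whose promoter bin contacts the anchor at or above min_count."""
    return sorted(g for g, p in genes.items() if profile[to_bin(p)] >= min_count)


profile = virtual_4c(contacts, to_bin(element_pos), n_bins=10)
print("virtual 4C profile:", profile)
print("nearest gene:", nearest_gene(genes, element_pos))
print("contacted genes:", contacted_genes(genes, profile, min_count=20))
```

In this made-up example the nearest gene is `geneB`, while the strong contacts point to `geneA` and `geneD`, which mirrors the point made in the talk that the nearest gene on the linear genome is not necessarily the gene being controlled.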