Hacking DNA: The Story of CRISPR, Ken Thompson, and the Gene Drive

by Geoff Ralston, 4/3/2017

The very nature of the human race is about to change. This change will be radical and rapid beyond anything in our species’ history. A chapter of our story just ended and the next chapter has begun.

This revolution in what it means to be human will be enabled by a new genetic technology that goes by the innocuous-sounding name CRISPR, pronounced “crisper”. Many readers will already have seen this term in the news, and can expect much more of it in the mainstream media soon. CRISPR is an acronym for Clustered Regularly Interspaced Short Palindromic Repeats and is to genomics what vi (Unix’s visual text editor) is to software. It is an editing technology which gives unprecedented power to genetic engineers: it turns them into genetic hackers. Before CRISPR, genetic engineering was slow, expensive, and inaccurate. With CRISPR, genome editing is cheap, accurate, and repeatable.

This essay is a very non-technical version of the CRISPR story concluding with a discussion of Gene Drive[1], a biological technique which, when used with CRISPR, gives even greater power to genetic engineers. The technical details go very deep and for those who are interested in diving in, I’ve included a number of useful pointers. At the end, I will very briefly discuss the implications of these two new technologies.

First, a bit of background. The genetic code described in DNA can be thought of as the software that builds every life form on our planet. Geneticists are learning to decode genomes, but have been hobbled by the fact that they have very limited ability to modify that code. It is as if software engineers have access to vast numbers of supremely powerful programs which they barely understand and cannot edit at all. But given the ability to edit and change the genetic software, learning can begin in earnest. Not only is it far easier to discover how the code works, but bugs can be fixed and improvements made; hackers will now emerge.

It is easiest to understand this amazing technology by learning where it came from. In the 1980s Japanese researchers at Osaka University were sequencing the DNA of the bacterium E. coli when they noticed something weird. DNA is built of long strands of nucleotides and within the bacterial DNA were repeated clusters of strange, out-of-place nucleotide sequences that were themselves interrupted by seemingly random strands of DNA. Over time, similar clusters of repeated nucleotide sequences were found in several other bacteria and were christened CRISPR because, well, the term pretty much described the sequences the researchers found. But no one knew at the time why they were there, nor their function.

There were several discoveries leading to the answer to these questions but two were critical. First, three different teams used newly available databases of genetic material to point out that those weird CRISPR sequences in bacteria looked a whole lot like vastly-out-of-place viral DNA. Second, a brilliant evolutionary biologist at the National Center for Biotechnology Information, Eugene Koonin, had the critical insight that this was a bacterial defense mechanism against viruses. Eventually, this idea led to the discovery of CRISPR’s purpose.

We humans fight our own, all-too-frequent battles with both bacteria and viruses, and thus it may come as a bit of a surprise to learn that bacteria and viruses have been locked in their own epic battle for survival for billions of years. Every day trillions upon trillions of bacteria are killed by viruses (a virus which attacks bacteria is called a “phage”). Many more manage to fend off their phage foes, largely thanks to various armaments the bacteria have evolved over the eons – a rudimentary form of immune system. CRISPR, it turns out, is a highly effective weapon in this bacterial immune system.

First, in order to avoid confusion, let me just clarify one thing. As we saw, “CRISPR” was originally a description of certain weird DNA sequences. To use those sequences as a defense mechanism, or, as we shall see, to create an edit, a tool called Cas, for CRISPR-associated protein, is needed. Sometimes, when referring to a specific type of Cas protein, you will also see Cas9 (or Cas3, etc.). The complete toolset is therefore sometimes called CRISPR/Cas, but is usually just abbreviated to CRISPR, which is the convention I am following.

Here’s how it works. When certain bacteria with CRISPR in their arsenal manage to fight off a viral attack, they then use the Cas enzymes to grab fragments of the viral DNA and insert them into their own DNA. If the same virus makes the mistake of attacking such a well-prepared bacterium, the bacterium uses the CRISPR sequence as a template to recognize the virus, then employs the Cas enzyme to chop it up into inert, non-threatening pieces.

It’s a very cool mechanism. Cas enzymes copy the CRISPR viral genes from the bacterium’s own DNA into an RNA molecule (like DNA, RNA is fundamental to gene expression), creating a Cas plus RNA package. This package then takes a random trip around the bacterial cell. When the Cas package bumps into another molecule it checks for DNA and, if found, reads the nucleotide sequence. If it matches the copied CRISPR, we have a matching virus. The RNA then latches on and the Cas enzymes act like a guillotine, chopping up the viral DNA and rendering it inactive (or “dead” if one wants to believe a virus was ever alive). If this sounds to you sort of like an immunology algorithm realized in biology, you would be right:

    loop:
        molecule = CRISPR.wander()  // drift until we encounter another molecule
        if (molecule contains DNA and that DNA matches CRISPR.dna) {
            CRISPR.cas9.grab(molecule)  // nab the viral dna
            CRISPR.cas9.chop(molecule)  // kill the virus!
        }
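For readers who would like something they can actually run, here is a minimal Python sketch of the same loop. It is a toy, not biology: the `Bacterium` class, the made-up sequences, and the choice to remember the first 8 letters of a virus are all my own assumptions for illustration.

```python
class Bacterium:
    """Toy model of CRISPR immune memory (illustrative only)."""

    def __init__(self):
        # Fragments of viral DNA remembered from past infections.
        self.spacers = set()

    def remember(self, viral_dna, length=8):
        """After surviving an attack, store a fragment of the virus."""
        self.spacers.add(viral_dna[:length])

    def encounter(self, dna):
        """Return the DNA cut in two if it is recognized, else untouched."""
        for spacer in self.spacers:
            cut = dna.find(spacer)
            if cut != -1:
                # Cut inside the matched site so the viral genome is broken.
                middle = cut + len(spacer) // 2
                return [dna[:middle], dna[middle:]]
        return [dna]  # no match: leave the molecule alone


bacterium = Bacterium()
phage = "ATGCCGTTAGGCTAAGCT"  # made-up viral sequence
bacterium.remember(phage)      # first attack survived, fragment stored
print(bacterium.encounter(phage))               # recognized and cut
print(bacterium.encounter("GGGGCCCCAAAATTTT"))  # unknown DNA, left intact
```

The point of the sketch is only that recognition is a lookup against stored fragments, and that the cut happens solely when the lookup succeeds.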

Cool, right? Decoding this mystery might have remained worthy of a path to a doctorate or from associate to full professor, instead of the Nobel-worthy, humanity-altering discovery it has become, if it hadn’t been for several different (and now competing[2]) teams, each of whom had the same brilliant insight. They understood that this eons-old defense system had the potential to revolutionize genetic engineering. If the CRISPR mechanism could be harnessed, they would have a tool that could cut DNA at a precisely chosen spot. What’s more, with two CRISPR-guided Cas9 cuts, one at each end, they could precisely chop out an entire segment of arbitrary composition. This is only half of an edit, of course. Once you remove the unwanted sequence you still need to put the gene you want in its place. As it turns out, this is a piece of cake. It is sufficient to simply inject the gene(s) you want in place of the snipped segments, and already-existing DNA repair enzymes will put everything back together.
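Seen as text editing, the cut-and-replace step looks roughly like the sketch below. This is plain string manipulation under my own assumptions: `excise_and_replace`, the two "site" strings, and the donor sequence are hypothetical names and data, and real DNA repair is far messier than a string splice.

```python
def excise_and_replace(genome, left_site, right_site, donor):
    """Cut at two recognized sites and splice a donor sequence in between.

    Everything from the start of left_site through the end of right_site
    is removed and replaced by donor.
    """
    start = genome.find(left_site)
    if start == -1:
        raise ValueError("left site not found")
    end = genome.find(right_site, start + len(left_site))
    if end == -1:
        raise ValueError("right site not found")
    end += len(right_site)
    return genome[:start] + donor + genome[end:]


genome = "AAAACCCGTTTTGGGTAAAA"  # made-up sequence
edited = excise_and_replace(genome, "CCCG", "GGGT", donor="TTAA")
print(edited)
```

The two `find` calls stand in for the two targeted cuts, and the concatenation stands in for the cell's repair machinery stitching the donor into the gap.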

One of the coolest features of CRISPR is that unlike most of the existing gene editing techniques, which only work on a limited number of organisms, CRISPR is generic. It will, at least in theory, work in any type of cell in any (earthly) life form.

Let’s just stop for a minute and contemplate how amazing CRISPR may just be for the future of humanity. We will be able to neutralize enemies in the animal kingdom. We will be able to cure genetic diseases. We may be able to develop therapies against any viral or bacterial foe that will be custom and perfectly effective. There may even be a cure for cancer. And (or perhaps I should say “But”) we will also have the capacity to build custom designed human beings or, for that matter, any other life form.

So that is CRISPR: the most powerful genetic engineering tool ever created. But CRISPR only allows us to modify one gene at a time, one organism at a time. To make species-level changes, CRISPR must be amplified by another powerful phenomenon: gene drive. First, a definition and then a brief but relevant diversion to one of the most famous programming hacks ever.

The concept of gene drive has been around for almost 15 years. Particular versions of genes (aka alleles) propagate through generations in sexually reproducing species, with 50% odds of that allele being inherited by each offspring (assuming only one parent has that version), since ½ of the DNA comes from each parent. So, if you have one allele for blue eyes and one allele for brown eyes, your child has a 50% chance of getting either. Absent any external force, this means a given grandchild has only a 25% chance of inheriting that allele through this line, and so on. What might that external force be? The most obvious is natural selection. If a particular allele confers an advantage to its owner, then that allele is more likely to spread through a population. Gene drive is any mechanism that makes a gene particularly “selfish” in that it increases the probability that that particular gene will be inherited above 50%, regardless of any selection pressure.
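The arithmetic is easy to check. The sketch below assumes a single line of descent in which each carrier has exactly one copy of the allele and mates with a non-carrier; the function name and the `transmission` parameter are mine, introduced only to show what "above 50%" changes.

```python
def chance_descendant_carries(generations, transmission=0.5):
    """Chance that one descendant, `generations` down a single line of
    descent, carries the allele.

    transmission is the per-generation chance a carrier passes it on:
    0.5 for ordinary inheritance, higher for a gene drive.
    """
    if not 0.0 <= transmission <= 1.0:
        raise ValueError("transmission must be between 0 and 1")
    return transmission ** generations


for n in (1, 2, 3):
    print(n, chance_descendant_carries(n), chance_descendant_carries(n, 0.95))
```

With ordinary inheritance the odds halve every generation; push `transmission` toward 1 and they barely fall at all, which is the whole idea of a drive.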

This is important because if you want to use CRISPR to have a big impact on populations, say to change the Anopheles mosquito to no longer carry malaria, editing individuals is not enough. You would have to CRISPRize millions of mosquitos and would still see the change attenuate in future generations (assuming it does not confer an advantage to the mosquito). A gene drive that spread the genes through the population more rapidly would have a huge advantage. Here is where a new character in the story pops up, Kevin Esvelt at MIT. Esvelt’s idea was to create a gene drive using the unparalleled editing accuracy of CRISPR. In doing so, he created the most powerful gene drive ever.

To understand what he did, let’s take a brief tangent into the world of software. In 1984, Ken Thompson, the inventor of the Unix operating system, and one of the greatest programmers of all time, wrote about his favorite hack[3]. It was a brilliant, recursive idea in which he created a virus capable of infecting the core infrastructure of every Unix operating system. It worked like this.

• First, Thompson modified the source code (written in the C language) for the Unix “login” program to give himself a secret backdoor to become any user on the system.

    if (password == "user's password" or password == "ken's special password")
        log user in

At this point his relatively modest hack would have given Thompson complete access to any Unix system which included his compiled version of the login code. But it would have also been completely obvious to anyone who looked at the login program.

• Next (and this is the truly insidious aspect of the hack), Thompson modified the source code of the C language compiler to recognize when it was compiling the login program code and to inject the viral code into its compiled output, regardless of the original source.[4]

Thus, the core code that rebuilds Unix was, itself, changed to sneakily create a hacked version of Unix. This is actually a pretty good analogue to Kevin Esvelt’s DNA hack.
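To make the trick concrete, here is a toy version in Python. It is only a sketch under loud assumptions: Python's built-in `exec` stands in for a compiler, the `login` function and the `'kens-secret'` password are invented for the example, and the further self-reproducing step mentioned in the notes is not modeled.

```python
CLEAN_LOGIN_SOURCE = '''
def login(password, real_password):
    return password == real_password
'''


def honest_compile(source):
    """Stand-in for a compiler: turn source text into callable functions."""
    namespace = {}
    exec(source, namespace)
    return namespace


def hacked_compile(source):
    """Same interface, but quietly rewrites login before compiling it."""
    if "def login(" in source:
        source = source.replace(
            "return password == real_password",
            "return password == real_password or password == 'kens-secret'",
        )
    return honest_compile(source)


honest_login = honest_compile(CLEAN_LOGIN_SOURCE)["login"]
hacked_login = hacked_compile(CLEAN_LOGIN_SOURCE)["login"]
print(honest_login("kens-secret", "hunter2"))  # backdoor absent
print(hacked_login("kens-secret", "hunter2"))  # backdoor present
```

Notice that `CLEAN_LOGIN_SOURCE` never changes. Anyone auditing the login source sees nothing wrong; the backdoor exists only in what the compiler produces.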

Esvelt realized that DNA is the compiler of life itself. Everything that life does, including CRISPR, is created by that compiler. Like Thompson, Esvelt realized that one could modify the compiler itself to change what it was compiling, regardless of the instructions within the program. Here is a simplified explanation of Esvelt’s hack:

• Create a CRISPR edit, which will CRISPRize embryonic DNA with a desired change. For example, replace a brown eye color gene with a blue eye color gene.
• Create a separate CRISPR edit, which will embed the instructions required to create the first CRISPR edit within that same DNA. In other words, the compiler of life has now been modified to recognize and modify other DNA.
• During fertilization, when the modified DNA meets its new pair, those instructions will cause the creation of a CRISPR tool which will then CRISPRize that new DNA, ensuring both copies in the pair have the desired change. Thus, even if the DNA from the non-CRISPRized parent had a brown eye color gene, that gene will now be changed to blue.
• The resulting organism, and all of its offspring, will then also contain the change. So long brown eyes!
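The steps above can be sketched in a few lines of Python. Everything here is a simplification I am assuming for illustration: one gene with two possible alleles, the names `DRIVE`, `WILD`, `fertilize`, and `gamete`, and a `homing_efficiency` knob for how often the copy step succeeds.

```python
import random

DRIVE = "blue+drive"  # edited allele that also carries the CRISPR instructions
WILD = "brown"        # unedited allele


def fertilize(from_parent_a, from_parent_b, homing_efficiency=1.0, rng=random):
    """Combine one allele from each parent, then let the drive copy itself."""
    pair = [from_parent_a, from_parent_b]
    if DRIVE in pair and WILD in pair and rng.random() < homing_efficiency:
        pair = [DRIVE, DRIVE]  # the wild copy is cut and rewritten
    return tuple(pair)


def gamete(genotype, rng=random):
    """Each parent passes on one of its two alleles, chosen at random."""
    return rng.choice(genotype)


child = fertilize(DRIVE, WILD)
print(child)
grandchild = fertilize(gamete(child), WILD)
print(grandchild)
```

With `homing_efficiency=1.0` the child ends up with two drive copies, so whichever allele it passes on, the grandchild is converted as well.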

This is the ultimate gene drive. Whatever modification you create can then be replicated in every descendant with nearly 100% reliability[5]. This will cause the new trait to flow through a species’ gene pool with unprecedented speed.
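To get a feel for the speed, here is a back-of-the-envelope model. It is a sketch under idealized assumptions that are mine, not Esvelt's: random mating, a very large population, no fitness cost, no resistance, and a hypothetical `homing_efficiency` giving the fraction of mixed pairs that get converted. Under those assumptions, if p is the share of drive alleles in one generation, the next generation has p + efficiency × p × (1 − p).

```python
def drive_frequency(p0, generations, homing_efficiency=1.0):
    """Drive allele frequency per generation under random mating."""
    history = [p0]
    p = p0
    for _ in range(generations):
        # Mixed pairs occur with frequency 2p(1-p). Normally they pass the
        # allele on half the time, which leaves p unchanged; converting a
        # fraction of them to two drive copies adds efficiency * p * (1-p).
        p = p + homing_efficiency * p * (1.0 - p)
        history.append(p)
    return history


no_drive = drive_frequency(0.01, 10, homing_efficiency=0.0)
full_drive = drive_frequency(0.01, 10, homing_efficiency=1.0)
for generation, (a, b) in enumerate(zip(no_drive, full_drive)):
    print(generation, round(a, 4), round(b, 4))
```

In this idealized model, an allele released into 1% of the gene pool stays at 1% without a drive, while a perfectly efficient drive carries it to nearly the whole population within about ten generations.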

There will be many applications of gene drive as we attempt to control or manage various organisms, from mosquitos to bacteria. However, let’s take a moment to imagine the applications in our own species. One might argue that it is immoral to modify human embryos in this fashion and that politicians, religious leaders, and ethicists will outlaw using CRISPR and, especially, gene drive to change humanity. But on the other hand, think of the benefits. Gene-related disease can be eliminated entirely, before a child is even born. What’s more, as our understanding of the genome improves, think of the advantages we might confer on our children, and with gene drive, our children’s children. And think about the advantages to a society that can, for example, raise all children’s intelligence by 10 (or 20 or 30) IQ points.

What will stop people from attempting to drive desirable characteristics into a population? Continuing the example above, what happens if and when scientists develop a solid understanding of the genetic underpinnings of advanced intelligence? What will stop a government from mandating those changes in their population? And what will competing governments then choose to do?

We have already begun to fund CRISPR-based companies at Y Combinator, and I expect we will fund many more. It is important to understand that once genetic programmers have access to life’s code, the sky’s the limit. Already the variety of applications, real and proposed, is stunning. Scientists have used CRISPR to modify goats to produce spider silk in their milk! (Spider silk is an extraordinary material, but very difficult to make in quantity.) It has been proposed that CRISPR can cure hemophilia and replace antibiotics. And Chinese scientists have controversially used CRISPR to modify (non-viable) human embryos. CRISPR-enhanced humans are not as distant as you might think.

CRISPR techniques are getting better and better. More accurate. More predictable. Cheaper. And we are learning more and more about the genetic code (partially thanks to our ability, now using CRISPR, to see what happens when we poke out one gene and replace it with another). The trends are unstoppable and the conclusion unavoidable: in the not very distant future we will be able to program most any animal in most any way we wish, including human beings. Whether we will resist the urge to tinker with what it means to be human is an open question, but I predict it’s only a matter of time until someone or some society will take that plunge.

Thanks to Sam Altman, Craig Cannon, Karen Lien, and Jon Ralston, who read and commented on early versions of this essay.


References
RadioLab – CRISPR
Breakthrough DNA Editor Born of Bacteria – Quanta Magazine
Rewriting the Code of Life – The New Yorker, Jan 2 2017 – Michael Specter
Emerging Technology: Concerning RNA-guided gene drives for the alteration of wild populations – Kevin M Esvelt, Andrea L Smidler, Flaminia Catteruccia, and George M Church
Wiki for CRISPR
Wiki for Gene Drive


Notes
1. For those who prefer listening to reading, RadioLab did a great podcast on CRISPR here.
2. UC Berkeley in California and the Broad Institute in Massachusetts are the two main adversaries and have already taken the issue of CRISPR ownership to the courts. Here is the Broad Institute’s timeline.
3. The implications of the Ken Thompson hack are profound. Here is Thompson’s classic essay “Reflections on Trusting Trust” and here is a good summary of the implications.
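4. If you read about Thompson’s hack, you’ll see he actually took it one step further: he modified the C compiler to recognize when it was compiling itself and to inject, into the new compiler, both the code that recognizes login and the code that performs this self-injection. The hack could then be removed from the compiler’s source code and would still survive in every compiler built from it.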
5. This is not strictly true for all generations. Although the effectiveness should be very high at first, biology usually manages to throw in some surprises, and researchers have found that organisms actually build a sort of resistance to gene drives, much like antibiotic resistance.


Author

  • Geoff Ralston

    Geoff Ralston is the former President of Y Combinator and has been with YC since 2011. Prior to YC, he built one of the first web mail services, RocketMail, which became Yahoo Mail in 1997.