From Magic to Molecules: The Intelligent Microscope

Clive R. Taylor M.D., D.Phil.

Almost 200 years ago a new technology, the microscope, revolutionized the way in which medicine was practiced, resulting in the invention of Surgical Pathology as a discipline. Today the advent of molecular testing of tissue sections, coupled with digital whole slide imaging and analysis using an ‘intelligent microscope,’ is bringing about changes in Pathology of even greater magnitude, that will again ‘re-invent’ how we practice Pathology.

Learning Objectives

Describe in outline Companion Diagnostics, how they are developed, approved and used in pathology.
Understand the advantages and limitations of on-slide evaluation and scoring.
Evaluate the potential role and utilization of artificial intelligence in assisting the pathologist with histopathologic diagnosis.

Webinar Transcript

Introduction

There are some basic objectives for today, really to talk a little bit about companion diagnostics, and to understand some of the limitations in how we assess them, and perhaps to look at the future role of artificial intelligence as to how it may assist us. This, of course, is a personal view, and the opinions I present are entirely my own for good or for ill. On reflection, pathology has exploited technology. We've been good at that, but really I believe that technology has shaped pathology and who we are, and what we are. One hundred and eighty years ago, I think the microscope invented the pathologist, not the other way around, and I think that today the digital microscope or the intelligent microscope is about to reinvent all of us.

I was fortunate over the past couple of years to work with Jan van den Tweel, and Jiang Gu in a book called From Magic to Molecules, which involved 30 pathologists from around the world who reviewed their various disciplines as to how pathology, and the understanding of disease had evolved. In every instance, when we reflected on that book it was clear that the whole process had been technology driven. That applies to microbiology, of course, where the microscope was introduced two or three hundred years ago, and really led to precision diagnosis in infectious disease through the evolution of antibiotics.

Within anatomic pathology, things have been a little bit different, but, again, the microscope has been central to the whole process. The first leukemia was described by Bennett in Edinburgh in about 1840, and since then an increasing number of entities have been recognized by microscopy. There is a similar and dramatic shift towards precision diagnosis and precision treatment, and it's changing the face of pathology through companion diagnostics. This era really started very abruptly on September 25, 1998 (that's 20 years ago almost) when the FDA wrote to DAKO, as it was then, to approve the HercepTest. Digital pathology has since evolved dramatically. It really wasn't available at the time that the HercepTest was introduced, and digital pathology itself was approved by the FDA for primary diagnosis just last year. I'll refer to that again a little later because the impact of that approval is potentially very large.

The early days

Now, just on reflection then, prior to the microscope there was morbid anatomy, but by about 1850 that began to change. John Hunter was one of the people that changed it. He was a surgeon in London, in fact. He's a Scotsman, but he was a surgeon in London. In his career he performed more than 10,000 autopsies, and he made 13,000 specimens in jars, if you will. For those of you that do visit London, there's a wonderful collection of many of his specimens in the Royal College of Surgeons Museum in London. His original museum you can see there on the right-hand side was lost mainly during the war, and many specimens were destroyed but there are still many at the Royal College of Surgeons.

So why did things change in the 1840s and 50s? Well, we have to not really give credit but attribute the change to two individuals named Burke and Hare who were providing bodies to a surgeon in Edinburgh whose name was Robert Knox, and he was paying them about seven pounds and ten shillings in gold for the bodies. These were for the education of medical students. The problem was that he needed more bodies than were available by natural means, so Burke and Hare sort of accelerated the process a little bit, and eventually were convicted for murdering 16 people whose bodies were sold to Knox. Hare plea bargained, and he disappeared. He was thought to have died of tuberculosis in London a few years later, but Burke was executed, and he was publicly dissected in Edinburgh.

This is the execution and confession of Burke, and he was publicly dissected by the Professor of Pathology in Edinburgh at that time who was known as Monro Tertius because his father had been professor before him and his grandfather before him. For a hundred plus years the Monros were the professors of pathology in Edinburgh. Now, all of this produced a dramatic change because it changed the law in the United Kingdom, and eventually in Europe, and around the world, making dissection legal for medical purposes because prior to then dissecting a body was illegal. That was the - - pathology up to about 1850.

The Microscope in Pathology

Now, as I mentioned earlier, the microscope had been around since 1700, but it took about 140 years until 1840 before the microscope was widely used in pathology. There were reasons for that. Adoption was slow because resolution was poor. Quality of the instruments was poor, and the cost was very high. A man called Joseph Lister, who was actually the father of Lord Lister, the man who invented antisepsis, introduced improved lens grinding methods. He actually wrote a paper with Thomas Hodgkin called Observations of the Blood in Animal Tissues Using a Microscope. Interestingly, when Hodgkin published his paper five years later on the absorbent glands of the spleen, that's the paper that essentially introduced Hodgkin's disease to the world, Hodgkin did not use the microscope because he felt it still was not generally useful.

It was another decade or two before the first pathology microscopy texts appeared. Virchow's book, of course, is the more famous of the two, Cellular-Pathologie, but the book by Sir James Paget was actually titled Lectures in Surgical Pathology. Our discipline of anatomic or surgical pathology as we know it today with the microscope began about 1850. This led to a tremendous explosion of pathology. In fact, anatomic pathologists, surgical pathologists did not exist prior to 1850 because the technology had not invented them. It's actually been the age of the microscope, and for 150 years that's how we diagnose cancer, and everything else. This was really image analysis by mind, and by microscope. The names of the people on this slide here almost all have stains attached to them, Weigert, Aschoff, Cajal, Virchow, and I put Bob Lukes on the slide because I was privileged to work with him for 30 years. He was a wonderful microscopist.

So all of that led to about 1900 when the technology of pathology really was the microscope, the formalin paraffin section, and a set of histochemical stains. If you notice, all of those stains, the basic biologic stains that we use today all were invented, and introduced in conjunction with the microscope. Again, new technology in the period from about 1850 to 1900. So in 1900 then the legacy was the microscope, and the H&E, and, interestingly, by 2018 not much had changed. But things are already beginning to change.

Our legacy then was the formalin paraffin section forming the standard method of processing tissue. That was introduced by Blum, and Blum, father and son, in 1893. That's really still the standard. There have been attempts to introduce other fixatives, but none have replaced formalin or displaced formalin. Secondly, our legacy was that morphologic diagnosis is by H&E. We have to recognize that that is subjective. Many clinicians have been concerned that morphologic diagnosis is subjective. If we just look at these six cases, for example, the diagnosis is to a degree a matter of opinion. The more experience one has in the field, the more one's opinion is respected and the more likely it is to be correct in relationship to whatever the consensus diagnosis is.

A number of clinicians had been a little disrespectful, and perhaps rightly. This is Marcel Bessis in 1980 at a meeting where we were talking about anatomic pathology. He was a hematologist, and he said well it's a level of science. It's about the same level of science as collecting butterflies. In retrospect, he was correct. It's an opinion whether a butterfly is this or that. There's not much in the literature about the subjectivity of pathology, and perhaps that's because we don't like to write too much about our deficiencies.

Here's one manuscript from about 20 years ago. These were soft tissue tumors. They looked at 89 lesions, four different pathologists independently, and they each scored them. So 89 multiplied by four gives 356 calls. These pathologists looked at these without a clinical history, and of those 356 calls 172 were malignant. There was then a washout period to forget the cases, and they re-looked at them, but then they were given the clinical history and suddenly 227 were malignant. Same glass slide, same morphology, same everything, but it just shows the subjectivity of the diagnostic process on H&E.
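The arithmetic behind those figures is easy to check. What follows is a minimal sketch in Python that uses only the numbers quoted above; the variable names are illustrative and are not taken from the manuscript.

```python
# Figures quoted in the talk: 89 lesions, 4 pathologists, each case read twice.
lesions = 89
pathologists = 4
total_calls = lesions * pathologists      # calls per round of reading

malignant_without_history = 172           # first read, no clinical history
malignant_with_history = 227              # second read, clinical history supplied

rate_without = malignant_without_history / total_calls
rate_with = malignant_with_history / total_calls
net_shift = malignant_with_history - malignant_without_history

print(f"total calls per round: {total_calls}")
print(f"called malignant without history: {rate_without:.1%}")
print(f"called malignant with history:    {rate_with:.1%}")
print(f"net calls that moved to malignant: {net_shift}")
```

On the same glass slides, the proportion of malignant calls moves from a little under half to nearly two thirds once the history is supplied.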

There was a seminal paper published in 1985 by Kevin Gatter, who happened to have been a student in my lab back in Oxford in 1974 when we were first doing immunohistochemistry. Kevin did a very brave thing. He was then a junior faculty person at Oxford, and he took 120 consecutive routine cases that had been classified as anaplastic malignancy or as anaplastic carcinoma or lymphoma. So he had 120 of those. Twenty-four were unclassified. Forty-three were carcinomas. Fifty-three were lymphomas. He did very basic immunohistochemistry on those. With that basic immunohistochemistry, just staining for keratin, and leukocyte common antigen, which was all he had available in 1985, he found that of the 43 cases classified as carcinoma, 29 were in fact lymphomas. So over half of those carcinoma diagnoses were, in fact, wrong.

New Independent Techniques

This was the beginning of the recognition that morphology could be improved by having independent techniques to give a more definitive cellular recognition. Now, over four decades all of this has accumulated, and the role of pathology has changed. We were pretty excited at USC ten years ago if we could persuade our residents to identify four morphologic types of lung cancer: small cell, non-small cell, adeno, squamous, and large cell not otherwise specified. It was hard work getting them to do that. We added some immunohistochemistry. They got better at it, and now if we do that and send that result to the clinician they're not happy. What they want to know now is what the subtypes of adenocarcinoma are, not in relationship to morphology but in relation to the abnormal proteins that are expressed, particularly those proteins for which they have in their toolbox a targeted therapeutic, a targeted therapeutic for HER2 or BRAF or ALK or whatever it happens to be.

So the diagnosis has suddenly started to shift on us, and it's shifting away from morphology towards a diagnosis that is more directly related to a specific narrowly specified therapy. How do we make those definitions? Well, we can't do it by H&E or by the eye of the pathologist, but we can begin to do it by various techniques that will let us identify the abnormal protein in relationship to a potential targeted therapeutic. In addition to these expressed proteins, in the past five years it's become clear that PD1, PDL1, and some of the immune therapeutic markers also are critically important. None of those can be assessed by H&E alone.

Let's just review some of the nomenclature. A targeted therapeutic is a drug such as Herceptin that targets a specific molecule on a cell or a tumor. We need to identify which patients will respond. The molecule in this case is HER2, and the test that identifies the HER2 on the cell is the HercepTest. This is an FDA approved test. The HercepTest distinguishes patients into responders versus non-responders. It's critically important to recognize that in the instance of this first approval, and of approval for subsequent companion diagnostics, it's done in relationship to response data in a clinical trial environment where the targeted therapeutic and the companion diagnostic are developed side by side, hand in hand, in relation to patient responses.

Is that going to be important? Well, a decade ago business thought we would need large numbers of these types of companion diagnostics. The companion diagnostic was, if you like, the diagnostic wonder drug of Wall Street. Business says we'll need a lot of these things. Science equally, and more importantly in a way, also says we'll need a lot. This is just a summary slide from Bert Vogelstein's work from a paper in 2013. What Vogelstein and colleagues really showed, in the case of colon cancer in this illustration, is that as the cancer evolves there are a series of mutational events in a clonal population. Those mutational events are translated into different molecules that appear on the surface of the cell. Those different molecules that appear on the surface of the cell are then targets for therapy.

Our job then as pathologists is to recognize these targets, and also to recognize that different cases of colon cancer, even if they look the same on an H&E, in fact may express, and do express, different targets on their surface. How do we achieve that? Well, that is what companion diagnostics are designed to do. It's for the first time then that the pathologist, that's you, determines the treatment of choice directly. It's not as though you call it non-small cell cancer of the lung, and the clinicians then decide whether they're going to give radiotherapy, surgery, CHOP, MOPP, - - or whatever. You tell them that it's HER2 positive, and they treat it with a HER2-targeted drug. This is a definitive direct connection of the pathologist to the patient, and we better get it right because there is no other parameter that checks against it.

Then, the problem becomes even more extensive or the challenge or the opportunity becomes more extensive because it's not just colon cancer. It's all cancers. These are from a paper by Jiang Gu, and myself about three or four years ago, but it shows that, regardless of cancer type, almost all cancers have these morphologic expressions. This just happens to show lung, colon, and melanoma. Melanoma is really where the story started. How do we detect these molecules? By companion diagnostics. How do we do that? Basically, by any technology that is available to us. I'm mainly going to talk about IHC today with a little bit about FISH. I'm not really going to talk about PCR, and next generation sequencing.

Immunohistochemistry has an advantage over other methods because it retains the morphologic cellular identification. After all, as morphologic or surgical pathologists, we've invested ten, 20, 30 years of our life in learning how to make these interpretations. We really don't want to throw that information away because that information is lost in PCR or NGS, which are obviously performed on tissue that's extracted. Therefore, the morphology is destroyed. In addition to that, we also have the issue that it's not just the proteins we're interested in. It's also the immune cells. This is the PD1, PDL1 contribution. It's the lymphocytes, the macrophages, their activation status, and how they are located in relationship to the tumor. I want to talk about that for a little bit.

In the United States, class III immunohistochemical tests are approved by the FDA, and that pretty much is Big Brother watching over us. That's as close as Big Brother gets we hope, but there is more regulation, of course, as we make more exacting claims as pathologists. That is that we can define the treatment. Then we must expect to be scrutinized more closely that we're getting it right. An approved diagnostic, in a sense, is a companion diagnostic that's gone through a very rigorous validation of the methodology. The methods are validated. The reagents are validated. So are the controls and the scoring, etc. The challenges for us as pathologists, of course, then are to identify and score these cases, sometimes in addition scoring the immune cells. This is a challenge.

Some Challenges

Here, for example, is a PDL1 stain on a lung cancer. Is it PDL1 positive in the context of defining therapy? Let's assume the threshold here is five percent or more. Is this tumor five percent positive? Does this patient get treated, or do they not? It's your decision, and your decision alone. Well, percentage is a problem, isn't it? It requires if we're going to be precise about it that we count the number of cancer cells, and the number of positive cancer cells, and the total cancer cells, and that we'll do that with the naked eye, as it were. - - tissue section, a large one. It's prostate cancer, as it happened. In this tissue section there can be two million upward cells in total. We don't look at the whole section in that percentage calculation. We look at a field, a x4, a x10, or more often a x20 or a x40 to score.

If we look at the x40 field, there are about 600 cancer cells. That's assuming that each cancer cell is about 20 microns in diameter. You can then do pi R squared, and calculate it out, and you get a number of about 600. That's if the cancer cells are about 20 microns on average and they're sitting shoulder to shoulder in the field. Then, we have to look at the positive cancer cells over the total cancer cells. We should count them. We don't. We guess. We guess fairly well.
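That pi R squared estimate can be written out as a short Python sketch. The 20 micron cell diameter and the two million cell section come from the talk; the 0.5 mm field diameter is an assumption (an eyepiece field number of 20 divided by the x40 objective), and a wider eyepiece would give a larger count.

```python
import math

def cells_per_field(field_diameter_um: float, cell_diameter_um: float) -> int:
    """Estimate cells in a circular field by dividing field area by cell area (pi * r^2)."""
    field_area = math.pi * (field_diameter_um / 2) ** 2
    cell_area = math.pi * (cell_diameter_um / 2) ** 2
    return round(field_area / cell_area)

# ASSUMPTION: 0.5 mm field of view at x40; not stated in the talk.
FIELD_DIAMETER_UM = 500.0
CELL_DIAMETER_UM = 20.0        # from the talk
SECTION_CELLS = 2_000_000      # "two million upward" cells in the large section

per_field = cells_per_field(FIELD_DIAMETER_UM, CELL_DIAMETER_UM)
fraction_of_section = per_field / SECTION_CELLS

print(f"cells in one x40 field: about {per_field}")
print(f"share of the whole section: {fraction_of_section:.3%}")
```

Under these assumptions the estimate lands close to the 600 quoted, and a single x40 field covers only a few hundredths of one percent of the cells in the section.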

Try guessing this one. Let's assume the threshold is 5%. Does this patient get treated or not? It's PDL1 in lung cancer. We need the numerator, the positive cells, and the denominator, the total cancer cells. The denominator, how many are there? Well, we can't actually count that. See if there are 600 cells there or 300. It would take forever, and even then we couldn't do it accurately without a graticule. Let's assume there are 600 if it's the whole field. About half the field is cancer cells. Let's say the denominator is 300, an educated guess. How many positive cells are there for the numerator—ten, 20, 30, 40, 50? If there are 14, the patient does not get treated. If there are 16, they do. Anybody think you can do that reliably, reproducibly with your colleague or even with yourself? I believe it's not possible. Even if we can do that, if we got the denominator slightly wrong—if the denominator is 320 instead of 300, then 16 would be the cut-off number. Or if it was 280 instead of 300 then 14 would give a patient treatment. This clearly is not possible at the threshold.
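The fragility of that call at the threshold can be made concrete. The sketch below, in Python, assumes the rule is "five percent or more" as stated earlier; the function names are illustrative. It uses integer arithmetic so that the cut-off itself is not blurred by floating point rounding.

```python
def min_positive_cells(total_cancer_cells: int, threshold_percent: int = 5) -> int:
    """Smallest positive-cell count whose percentage is at or above the threshold."""
    # Integer ceiling division keeps the cut-off exact.
    return (total_cancer_cells * threshold_percent + 99) // 100

def is_treated(positive: int, total: int, threshold_percent: int = 5) -> bool:
    """True when positive / total is at or above the threshold."""
    return positive * 100 >= threshold_percent * total

# The cut-off moves with the guessed denominator.
for denominator in (280, 300, 320):
    print(denominator, "cells -> cut-off", min_positive_cells(denominator))

# The same 15 positive cells give different treatment decisions.
for denominator in (280, 300, 320):
    print(denominator, "cells, 15 positive -> treated:", is_treated(15, denominator))
```

A seven percent error in the guessed denominator, in either direction, is enough to flip the decision for a count that sits on the line.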

What can we do about that? Even if we count one field, we've counted far less than one percent of all the cancer cells. There's heterogeneity. That's a problem. There's also the problem that we've assumed that all of these positive cells are cancer cells. That assumption is actually not justifiable.

Here are two different cancers, both lung, an adeno, and a squamous. In the top one, brown equals PDL1, so the positive cells would be the numerator, and there are a lot of them. Maybe 100. The problem is that they're not all cancer cells. At the bottom we've got some positive cells. Is that a positive cancer? Is it one percent or five percent? We look at the top one and we do a second stain, in this case for CD-68, which identifies macrophages. We can then see that, in fact, about 40% of these PDL1 positive cells are not cancer cells. They are, in fact, macrophages, and so our numerator would be way off. Distinguishing a cancer cell like that one from a macrophage like that one on morphology alone (there's the macrophage and there's the cancer cell) is not possible. While we are very good at cell population identification overall, we're actually not very good at individual cell identification one for one. So we would be wrong on the first one.
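The size of that numerator error follows directly from the two figures quoted. A minimal Python sketch, using the approximate values from the talk (about 100 brown cells, about 40% of them macrophages); the function name is illustrative.

```python
def corrected_numerator(pdl1_positive_cells: int, non_tumor_percent: int) -> int:
    """PDL1-positive cells left after removing those a second stain shows are not tumor."""
    non_tumor = pdl1_positive_cells * non_tumor_percent // 100
    return pdl1_positive_cells - non_tumor

raw_positive = 100     # "maybe 100" brown cells on the single stain
macrophage_pct = 40    # about 40% were CD-68 positive macrophages

tumor_positive = corrected_numerator(raw_positive, macrophage_pct)
inflation = raw_positive / tumor_positive   # how far the single-stain count overshoots

print(f"PDL1-positive tumor cells: {tumor_positive}")
print(f"single-stain numerator is about {inflation:.2f} times too large")
```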

Let's look at the second one. Here we've got some positive cells. This is a squamous carcinoma. Here we've stained the squamous carcinoma with p63. We've stained the nuclei of the squamous carcinoma positive with p63. So the question is this a positive test or is it not? If we look now at the PDL1 positive cells, there's not a single PDL1 cell that has a red nucleus. That is, there are no squamous carcinoma cells that are PDL1 positive, and this, in fact, is a negative test with a score of zero. I actually showed this very case to a dozen pathologists, and everyone scored it above one percent. Several scored it above five percent, and we were all wrong, myself included.

Again, morphology is fantastic. It's survived 180 years. It's still the diagnostic gold standard for cancer, but when it comes to this level of precision and scrutiny, it falls short. This is where we can begin to use the intelligent microscope to assist us. If we use multiplex staining, and we get multiple colors, either in bright field or fluorescence, which I'll come to in a moment, then we can actually use this spectral separation to separate the cells and recognize them and count them by digital methods. This is something fairly simple. Here's CD-20 staining B cells, and Ki67 stained red. The question is what's the proliferation index of the B cells. We can look at a high magnification. We can pseudo color. We can change the brown to green, and keep the red as red. We can see it a little more clearly. In a sense, we can change the bright field peroxidase to look a bit more like an immunofluorescence field. We can count, and separate, and there are, in fact, almost no proliferating B cells. The proliferating cells in this tumor, which was thought to be highly proliferative, are in fact something else, not B cells. They're actually T cells.
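Once each cell carries a separate call for each stain, the question "what is the proliferation index of the B cells" becomes a simple filter and count. The Python sketch below assumes that some upstream image-analysis step has already produced per-cell marker calls; the Cell type and the toy data are hypothetical and only illustrate the counting.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    cd20: bool   # membrane stain, marks B cells
    ki67: bool   # nuclear stain, marks proliferating cells

def b_cell_proliferation_index(cells: list[Cell]) -> float:
    """Fraction of CD20-positive cells that are also Ki67-positive."""
    b_cells = [c for c in cells if c.cd20]
    if not b_cells:
        return 0.0
    return sum(c.ki67 for c in b_cells) / len(b_cells)

# Hypothetical toy data: plenty of proliferating cells, but very few are B cells.
cells = (
    [Cell(cd20=True, ki67=False)] * 48
    + [Cell(cd20=True, ki67=True)] * 2
    + [Cell(cd20=False, ki67=True)] * 30
    + [Cell(cd20=False, ki67=False)] * 20
)

overall_index = sum(c.ki67 for c in cells) / len(cells)
b_cell_index = b_cell_proliferation_index(cells)

print(f"Ki67 index, all cells: {overall_index:.0%}")
print(f"Ki67 index, B cells:   {b_cell_index:.0%}")
```

In this toy data the field looks highly proliferative overall, while the B cell population is almost quiescent, which is the pattern described for the case on the slide.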

Importance of Immune Cells

So why are the immune cells important? It really stems from about five years ago. Melanoma started the process off. It was shown in a study by Tumeh, and Taylor. This little Taylor is not me. It's actually my daughter at UCLA. Cases of melanoma that responded to PDL1 therapy had lots of brown dots in them, and the brown dots are CD-8 positive killer T cells. If there are lots of killer T cells around the tumor and in the tumor prior to therapy, then the patients respond to therapy. On rebiopsy there are even more killer cells. Those cases lacking CD-8 positive cells prior to therapy do not respond, and subsequent biopsies show no accumulation of CD-8 positive cells. In this case, response to PDL1 was not just the presence of PDL1 on the tumor but was also the presence of CD-8 cells in the background. That turns out to start the beginnings of a new story, and that new story was we need to look at the immune cells at least in some cancers. We can do that by multiplexing immunohistochemistry. We can't do it by methods that extract tissue, NGS, PCR, or proteomics. We lose too much information.

It does lead to the notion that there might be two major classes of cancer. There are those that are immunologically silent and have no inflammatory cells or no lymphocytes in the background. These tend to have low mutation load. It's no good treating these with PDL1 because there's no immune response that can be activated. These actually need types of therapies that stimulate immune responses like BCG, tumor vaccines, etc. Then there are those neoplasms that are inflamed in the context of having many lymphocytes and macrophages in the background. Those do tend to respond to checkpoint inhibitors. We need to tell these two tumor classes apart, one from the other.

For pathology, this turns out to add an enormous level of complexity and challenge that Hadi Yaziji and I sort of wrote about a couple of years ago. This just shows five different drugs in this context, PD1 drugs and PDL1 drugs. These are just five, and they may be used in a variety of different tumors, and the list of tumors for which there's evidence of potential value grows almost daily. It's half a dozen or a dozen. It really means that we have to do all of these different tests, and there are many of them. The problem is that the approved test for pembro cannot be used for any of the other drugs, and vice versa. This is because the FDA approval process in clinical trial approves that specific test, the antibody, the platform upon which it's performed, and the scoring method. They are not translatable from drug to drug. This produces a huge problem for the lab. If your clinicians want to use two or three of these different drugs, then you have to set up two or three of these different assays if you want to use an approved test. Or you have to develop a lab developed test with all the expensive validation and the risk that that involves. Big challenge. Very expensive. Big problem.

Multi-marking

We're going to look at the potential ligand pairs, the targets like PD1 and PDL1 or CD-40, and we're going to look at the immune cells not by morphology because we can't distinguish T and B cells by morphology. We have to use phenotype. Then, we need to use several markers at the same time. We can't do it by tissue extracts. We have to do this in a tissue section. This really requires that we go to multiplex immunohistochemistry, which can be multiplexed with FISH, if you wish, but it means multiplex IHC and the use of fluorescence methods. We finish up with something that looks like this. This, again, comes from Tumeh and Taylor. It's a whole slide scan of a multiplex fluorescence stain that shows a biomarker here, PDL1 in red. It shows killer T cells green, etc. It shows multiple markers in a single section, and you can get some idea of it just by glancing at it, but you can't interpret this without the use of a computer.

It leads to some interesting manuscripts. This is one you might want to go look at. This is a Merkel cell tumor, and this patient had a Merkel cell tumor here that was biopsied and then treated, and then re-biopsied here three weeks later. At the time prior to treatment, green shows PDL1. There's a lot of it. Yellow shows killer T cells. There are several but not a huge number. Three weeks later after successful treatment the lesion has decreased. PDL1 cells are going away, and there is an increase in CD-8 positive little yellow cells in the background. There is an immune response that has been released by the PDL1 therapy and has proved effective in this tumor, but none of this could have been evaluated by an H&E alone.

The Need for Digital Pathology

The problem then with these slides is that you can't read them with your wonderfully up-to-date microscope such as the one I had in my office right here, which actually dates from about 1860. You can't read this slide with the naked eye, but this can with your help. This is a whole slide image, and the computer is helping you. It's not replacing you, so we don't have to panic and go and change our careers. We do need, I believe, to learn to adapt and adopt and become the users of this new technology and let it shape the future for us and for our career.

There were obstacles to digital pathology. It's taken 20 years or more to evolve, actually many more than 20 years since the first digital presentation, but some of these obstacles are rapidly disappearing. Resolution is now fantastic; x40 scans can be done in less than a minute. The image quality is superb. Storage is becoming possible, particularly with storage increasingly moving to the cloud. Apps for scoring are increasing rapidly. Hardware costs are coming down. Software costs are coming down, and particularly software, again, is available from the cloud, so you don't actually need it on your computer. The main remaining obstacles are regulatory obstacles and reimbursement obstacles in the U.S., and you, that is, the pathologists, who are resisting this to some extent because we're accustomed to using our microscope. We're not yet accustomed to using a computer screen. This is changing rapidly, and I'll address that in a moment.

Really, computers perhaps more than anything in the last two decades have accelerated in their performance capabilities almost out of sight. Here's the Lunar Lander from 1969, and it actually had less computer power on board than my old iPhone 3, which I bought in 2010. That's sort of remarkable. Bill Gates, of course, is pretty well-known, very successful man, right? But he made a prediction just after the Lunar Lander that about 640k of memory should be enough for anyone. He, of course, was wrong by a factor of about a million. It just shows you can be wrong and still make a lot of money, but he was smart enough to recognize he was wrong and move forward. This is what I think pathology needs to do. We need to recognize the value of this new technology because if we put the power of immunohistochemistry and immunofluorescence together with whole slide imaging and digital analysis, there are enormous things that we can do in the future that we could not do in the past.

We can, for example, take this multiplex fluorescence slide, and we can pseudo color it. Then, we can make it more familiar. We see one here. We can take this slide and we can take these multiple colors, and we can segment the slide into areas of tumor and non-tumor and score particular cell types within the tumor area and the non-tumor area. We can do all that by deep machine learning, and the computer will give us the data in a file. We can then look at that file in relationship to the slide and produce a final interpretation that's a combination of our morphologic expertise and the analytic power of the computer.
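The "data on a file" step can be pictured as a table of cells, each tagged with the region it falls in and the phenotype assigned to it. The Python sketch below is a hedged illustration of the scoring step only; the region and phenotype labels and the toy records are hypothetical, and the segmentation and classification that would produce them are assumed to have happened upstream.

```python
from collections import Counter

# Hypothetical per-cell output of an image-analysis step: (region, phenotype).
cells = [
    ("tumor", "tumor_cell"), ("tumor", "tumor_cell"), ("tumor", "cd8_t_cell"),
    ("tumor", "macrophage"), ("non_tumor", "cd8_t_cell"), ("non_tumor", "cd8_t_cell"),
    ("non_tumor", "macrophage"), ("non_tumor", "other"),
]

def score_by_region(cells):
    """Count each phenotype separately inside every segmented region."""
    table = {}
    for region, phenotype in cells:
        table.setdefault(region, Counter())[phenotype] += 1
    return table

scores = score_by_region(cells)
for region, counts in scores.items():
    total = sum(counts.values())
    for phenotype, n in sorted(counts.items()):
        print(f"{region:10s} {phenotype:12s} {n:3d} {n / total:.0%}")
```

The pathologist then reads this table against the slide, which is the combination of morphologic expertise and computer analysis described above.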

We can go even further than that. We can actually take a tumor like this, and we can pseudo color this multiplex slide, so the tumor cells now have yellow nuclei. All of these cells here are tumor cells. The natural killer cells here have been colored red. Now, you see them, and they have been colored red. We could look at this tumor just now as a fluorescent image with the naked eye, and we could say well there are a lot of natural killer cells here, a lot of red cells, a lot of killer T cells rather, not natural killer cells. A lot of killer T cells. They're red. This - - respond, but, in fact, if we go one step further and let the computer analyze this, only about three percent of the tumor cells here have a killer T cell within 25 microns. Although there are lots of killer T cells here, they are not apposed closely enough to the tumor cells to kill them. This is the work of Bernie Fox and colleagues at Oregon. It's brilliant work in conjunction with Cliff Hoyt at PerkinElmer.
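The proximity measurement described here reduces to a distance test between cell centroids. The following Python sketch is one simple way to compute it, not the method used in the work cited; the coordinates are hypothetical, and the brute-force comparison would need a spatial index to be practical on a whole slide.

```python
import math

def fraction_with_neighbor(tumor_xy, t_cell_xy, radius_um=25.0):
    """Fraction of tumor cells with at least one T cell centroid within radius_um."""
    if not tumor_xy:
        return 0.0
    near = 0
    for tx, ty in tumor_xy:
        if any(math.hypot(tx - cx, ty - cy) <= radius_um for cx, cy in t_cell_xy):
            near += 1
    return near / len(tumor_xy)

# Hypothetical centroids in microns: ten tumor cells in a row, one T cell nearby,
# and fifty more T cells clustered far away from the tumor.
tumor = [(i * 20.0, 0.0) for i in range(10)]
t_cells = [(0.0, 20.0)] + [(500.0 + i, 500.0) for i in range(50)]

fraction = fraction_with_neighbor(tumor, t_cells)
print(f"T cells present: {len(t_cells)}")
print(f"tumor cells with a T cell within 25 microns: {fraction:.0%}")
```

The toy data reproduces the point of the slide: many killer T cells can be present while very few tumor cells have one close enough to matter.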

This is the analytic ability that's available to us as morphologic pathologists to begin to use. We can even take these images here that we can't interpret with the eye and turn them around to produce pseudo H&Es from them and pseudo bright fields such that we can actually look at the PDL1 positive cells that are red here but mysterious in terms of morphology and are now brown here and have the morphologic features of cancer cells in this simulated H&E. The power to do things is enormous. We just have to learn to adopt and exploit it.

Problems to Overcome

One problem, of course, is that if we're going to use immunohistochemistry to do this we have to improve the quality and consistency of immunohistochemistry. We need to look at the whole test, and we need to improve sample preparation. We need to qualify the tissues. We need to define the analytes rather better than we do right now. We need to validate and standardize the methods better than we do right now. We need to develop better control systems than we currently have, and then we need to develop scoring by the computer. All of this, of course, is the subject of a different seminar that I'm not going to talk about today beyond mentioning that we still have a lot of work as pathologists to do.

We have another problem too. There are many scanners out there, not exactly a new one every day. This is a slide from Marcial Rojas, and, in fact, it shows about 30 different scanners out there. This you might think is great except it frightens the FDA because the FDA are worried that different scanners will produce different qualities and different colors. Therefore, as pathologists, we will make mistakes. We did some work with the FDA about seven or eight years ago through the Biological Stain Commission, some of which was published here. We tried to show the FDA that as pathologists we actually are not stupid. If H&Es look a bit different, we can adapt to that and adopt those H&Es and use them. We actually took parallel sections from the same block. It's actually a placenta, as you can see, and we sent it to five different well-known labs saying give us your best Monday morning H&E and send them back. This is what they looked like. You can see all five are really quite different in terms of the quality of the H and the quality of the E. We showed this to the FDA, and they said oh wow that's really quite remarkable.

Nonetheless, they will not abandon their requirement that scanners do show reproducibility on a day-to-day basis. Fortunately, that's being controlled within the technology of the scanners themselves. Scanners can compare their performance on - - both in terms of the morphology and the resolutions that they give and the color. A lot of this work comes from Mike Feldman, who has taught me a lot about this. You can automatically recognize prostatic glands by segmentation, and if the scanners produce differences you can then use control technologies in the background to iron out and harmonize some of those differences.

Here's another example of scanner variation. This includes the - - scanning system. You can have pre color and then post color normalization, which brings them closely or more closely in comparison to one another. All of this can be done. It's Photoshopping in a way, but it's controlled Photoshopping against reference points.
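
One very simple way to picture normalization against reference points is to shift and scale each color channel so that its mean and spread match a reference. The sketch below is only an assumption made for illustration: it works on a plain list of 0 to 255 values for one channel, and it is not the method used by any particular scanner or vendor.

```python
from statistics import mean, pstdev

def normalize_channel(values, ref_mean, ref_std):
    """Shift and scale one color channel so its mean/std match a reference."""
    src_mean = mean(values)
    src_std = pstdev(values)
    if src_std == 0:
        # Flat channel: nothing to scale, so just move it to the reference mean.
        return [min(255.0, max(0.0, ref_mean)) for _ in values]
    out = []
    for v in values:
        scaled = (v - src_mean) / src_std * ref_std + ref_mean
        out.append(min(255.0, max(0.0, scaled)))  # clip to the valid range
    return out

# Hypothetical red-channel values from one scanner, and reference statistics.
scanned = [100, 120, 140]
normalized = normalize_channel(scanned, ref_mean=150, ref_std=10)
print(normalized)
```

Applying the same reference statistics to every scanner's output is what pulls the images closer to one another.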

Then we come back to the old friend, the H&E. Bauer and colleagues had published papers as long ago as 2013. This was 607 cases. They concluded that whole slide imaging was not inferior to microscopic slide review. They had pathologists diagnose these cases microscopically. They had them diagnose the same cases on a whole slide image, and they were not inferior in terms of diagnosis within a margin of four percent. There are a number of other manuscripts that had supporting data. Gilbertson, WSI missed a few cases. Ho, they were equivalent. Some said that whole slide imaging was, in fact, superior to glass slides. The overall conclusion, no difference. Not acceptable to the FDA. They wanted a formalized study. They eventually got one. Philips produced this study in four centers with 27 pathologists over four years. It cost a bit of money. Basically, the study compared the diagnosis, digital by whole slide imaging versus the diagnosis optical on the same slides. They compared both back to the original diagnosis. The requirement was that whole slide imaging was not inferior to glass slide back to the glass slide diagnosis remade or reconfigured on the same slide. That was the conclusion. It was not inferior.
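
The idea of "not inferior within a margin of four percent" can be illustrated with a rough calculation. The sketch below is a simplification and not the statistical method of any of the studies mentioned: it treats the two reading modes as independent samples and uses a normal approximation, whereas the real studies used paired readings. All counts are hypothetical.

```python
import math

def noninferior(disc_wsi, disc_glass, n, margin=0.04, z=1.96):
    """Check whether the upper confidence bound on the difference in
    discordance rates (WSI minus glass) stays below the margin."""
    p_wsi = disc_wsi / n
    p_glass = disc_glass / n
    diff = p_wsi - p_glass
    # Standard error of a difference of two proportions (independence assumed).
    se = math.sqrt(p_wsi * (1 - p_wsi) / n + p_glass * (1 - p_glass) / n)
    upper = diff + z * se
    return diff, upper, upper < margin

# Hypothetical counts: 30 discordant WSI reads and 27 discordant glass reads,
# each out of 600 cases.
diff, upper, ok = noninferior(30, 27, 600)
print(round(diff, 4), round(upper, 4), ok)
```

The point is that non-inferiority is a statement about the upper bound of the difference, not about the two rates being identical.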

This is the abstract of the paper. A large multicenter study shows concordance, and the final paper appeared in the American Journal of Surgical Pathology in 2017 with Sanjay Mukhopadhyay as the first author. The conclusion on the basis of 2000 cases by 16 different pathologists was that whole slide imaging is not inferior to glass slides for primary diagnosis. With that approval, which obviously will be followed by other manufacturers down the road, it provides an opportunity now to go back to the H&E slide and say, well, look at the H&E slide. What other information can we extract from an H&E if we use the power of digital analysis and overlay it and interface it with the power of the pathologist and their morphologic experience accumulated over 180 years? What other riches can we find?

Machine Learning

There are already studies beginning to appear. This is Journal of the Royal Society by Juan in 2014. Just using deep machine learning to look for lymphocytes in relationship to malignant disease they were able to conclude that the lymphocytic infiltration is associated with a favorable prognosis and can predict response to chemotherapy in many cancer types. We have tried to do that with the naked eye as pathologists over the years, and we've not been successful. It's because the analytic process with the naked eye is incredibly complicated, and the scoring and the counting is incredibly difficult. Whereas a computer once it's educated into the algorithm will perform it reproducibly. Those types of data are emerging.

This is then the beginnings of deep learning. Deep learning essentially is where you stop trying to tell the computer what to do and you let the computer learn what to do. You give it training sets, which contain material with diagnoses that you recognize and are comfortable with, and then you give it test sets and you compare the data. By doing that, you can come up with a probability map, in this case a probability map for cancer. This is in Nature Scientific Reports just last year.
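
How a probability map is assembled from a whole slide image can be sketched as follows. The scoring function here is a deliberately trivial stand-in for a trained deep learning model, and it has no diagnostic meaning; the tile size, the image, and all names are assumptions made for illustration.

```python
import math

def tile_probability(tile):
    """Stand-in for a trained model: maps mean tile intensity to a 0-1 score."""
    flat = [p for row in tile for p in row]
    avg = sum(flat) / len(flat)
    return 1.0 / (1.0 + math.exp(-(avg - 128.0) / 16.0))

def probability_map(image, tile_size):
    """Score non-overlapping tiles and return a 2D grid of probabilities."""
    rows, cols = len(image), len(image[0])
    grid = []
    for r in range(0, rows - tile_size + 1, tile_size):
        grid_row = []
        for c in range(0, cols - tile_size + 1, tile_size):
            tile = [row[c:c + tile_size] for row in image[r:r + tile_size]]
            grid_row.append(tile_probability(tile))
        grid.append(grid_row)
    return grid

# Hypothetical 4x4 grayscale "slide": dark left half, bright right half.
image = [[0, 0, 255, 255] for _ in range(4)]
pmap = probability_map(image, tile_size=2)
print(pmap)
```

In a real system the stand-in function would be replaced by a network trained on annotated slides, and the resulting grid would be displayed as a heat map over the tissue.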

A little bit more about that. Deep learning can identify tumor in tissue sections. You can do it manually, or you can let deep learning do the recognition process for you. This is a slide from Mike Feldman in 2017 at a seminar we did together at the FDA. You can go beyond low magnification or x40 type deep learning to even deep learning on a single cell basis to what the features are on a single cell basis for cancer versus normal tissue. Amazingly, it's quite reliable. Many of you are probably familiar with the CAMELYON study. It's a study where slides were collected, whole slide images were collected. These were sentinel lymph nodes, and pathologists were asked to recognize whether they were involved by cancer or not. The challenge was put out to artificial intelligence groups. See if you can come up with an algorithm that can match or beat the pathologist. There was a training set made available of about 170 slides, a mixture of normals, and cancer cases. The pathologists then looked at the test set, and the artificial intelligence systems looked at the test set. Two systems performed astonishingly well. Others performed quite well. This is a University of Warwick system. The winner overall was Andrew Beck at Beth Israel. If you look at the error rate, the pathologist error rate was 3.5%. The artificial intelligence error rate was 2.9%. It beat the pathologists. This is IBM's Deep Blue beating Garry Kasparov at chess 20 years ago.

Even more remarkable, if you put the two together, the pathologist plus the artificial intelligence, you finish up with an error rate of less than one percent, which is quite phenomenal and is, in fact, a first step, a landmark along the pathway as to where I think we join with intelligent pathology. The trend also, of course, is that you don't need special software. The special software can be parked in the cloud. This is courtesy of Anagha Jadhav at Optra. You put the software up in the cloud, so now all you need is a browser and access to the net and a pretty good resolution screen. Something like artificial intelligence is already part of our lives. If you're wandering around a strange morphologic area such as Malibu, California, and you say boy I wish I knew where the restaurants were, well, basically, you can get artificial intelligence to show you exactly where they are. If you're wandering around an area of pathology like a strange lymph node and you say I wish I knew where the cancer cells were, you can get artificial intelligence to show you where they are. You come back and check it. Say yeah I can validate that, or you can check this and say yeah I can validate those really are restaurants and those really are cancer cells. Again, that combination of the artificial intelligence plus the pathologist.

Conclusion

So how's it going to affect us? I'll close with the last couple of slides here. It's going to affect us this way. I gave a talk similar to this in Silicon Valley about ten months ago just as Apple was celebrating the tenth year of the iPhone. There were a number of Apple personnel wandering around. We were chatting about various things. The revolution, I think, will affect us. It's not just the hardware. It's the software. In fact, it's going to be the apps that will affect us as well. What applications can be used? Talking to these Apple people, you know, when the iPhone first came out there were five or six apps for the iPhone. Now there are over two million.

Give a digital file to a bunch of intelligent people, and they will do things with it that we have never even thought of. That will lead us, I think, to what I call the Horus seeing eye. This is after the Egyptian all seeing god. The Path PAD, the microscope of the future. I think the microscope of the future is going to look a bit like a computer screen. It's going to have a whole series of apps on it like these. When I want to make a diagnosis, I will come back and I will click the app for interpreting a multiplex. I'll click the app for looking for the micromets. I'll click the app for is this cancer or not. I'll click the app for interpreting a PDL1 score. It will give me a file with a pre-interpreted image that I then look at and I interpret in conjunction with my morphologic experience and I come up with a diagnosis that is better than my diagnosis alone. I believe that's where the future may be for us as pathologists. I think the intelligent microscope might turn us into more intelligent pathologists if we are willing to take that risk and learn how to use it.

Just a few references. These are personal opinions. They can be wrong or right. You have your own opinion about those, of course, which is fine. I also included a few references about control for immunohistochemistry, which, of course, might be the subject of a different seminar. These can help you improve the quality of your immunohistochemistry for companion diagnostics in your lab. With that, I'll close. Many thanks.

Q&A

When using an FDA approved companion diagnostic can the staining protocol be adjusted using the appropriate in-house controls?

PROFESSOR TAYLOR:

Okay, so that's a key question in a sense. If your lab is using an approved diagnostic for a drug such as pembro and you're using the FDA approved diagnostic, you cannot change any aspect of that test, either the reagent or the protocol. You can't change the incubation time or the retrieval method or the interpretation or the scoring. Change anything to any extent whatsoever and that approved test immediately becomes an LDT, a lab developed test, which requires much more extensive validation. It's a huge problem because you can only do a particular approved FDA test if you actually have the platform that the test was approved for. If you don't have that platform, you can't do it. You have to do an LDT test.

These new kinds of learning would require large amounts of training data. Will we get to a time when there is a standard image format, maybe DICOM or pyramid TIFF, and a standard annotation format? And will pathologists be willing to share and pool annotations?

PROFESSOR TAYLOR:

Well, I think we might get to a standard. I think the Digital Pathology Association - - many of the vendors are working hard to try to come to some sort of DICOM standard. As for what a pathologist will share, we sometimes are not too good at sharing, but I do believe that academically sharing already has occurred. The CAMELYON project, for example, used shared slides. I think sharing, on an anonymous basis, of course, will become much more common because I think many pathologists see the benefit of sharing definitive well-supported diagnostic cases for training purposes, not just training pathologists but training automated intelligence systems. I don't see that that will be a big problem, but I could be wrong. Maybe you should ask the participants today because they're the ones that are going to share.

Next, the counting of the cells. Is that what is considered computational pathology?

PROFESSOR TAYLOR:

Well, it's a piece of computational pathology. It's not the whole piece. When we talk about quantitative pathology, mostly when people talk about quantification they really are talking about counting. That is a form of quantification, of course. You know, how many lymphocytes are there per cancer cell? How many activated lymphocytes per total number of lymphocytes? Quantification, I think, will eventually go beyond that to looking at measuring expression levels of proteins on individual cells. Instead of scoring a HER2 test as 1+, 2+, 3+, I think we'll eventually score a HER2 test as 100 attograms per cell or 1000 attograms per cell of HER2 on a direct measurement basis. We're not close to that yet, but that will involve having calibrated control systems that people are working on.
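
The counting form of quantification mentioned here reduces to simple ratios once each cell has been classified. A minimal sketch, assuming a hypothetical list of per-cell labels produced by some upstream image analysis step (the label names are invented for illustration):

```python
from collections import Counter

def cell_ratios(cell_labels):
    """Compute two counting-based measures from per-cell class labels."""
    counts = Counter(cell_labels)
    lymph = counts["lymphocyte"] + counts["activated_lymphocyte"]
    tumor = counts["tumor"]
    return {
        # None signals that the ratio is undefined for this field.
        "lymphocytes_per_tumor_cell": lymph / tumor if tumor else None,
        "activated_fraction": counts["activated_lymphocyte"] / lymph if lymph else None,
    }

# Hypothetical labels for one field (illustrative only).
labels = ["tumor"] * 8 + ["lymphocyte"] * 3 + ["activated_lymphocyte"]
ratios = cell_ratios(labels)
print(ratios)
```

Measuring protein expression per cell, as opposed to counting, would need calibrated intensity data and is not covered by this sketch.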

Are evaluation and scoring of multiplex stained tissue sections accomplished with equal accuracy by either manual microscopic reading or by image analysis algorithms?

PROFESSOR TAYLOR:

No, I believe image analysis is far superior in terms of multiplex slides. Once you've got more than one parameter that you have to evaluate, it becomes extremely difficult manually. Very tedious. Very time-consuming. I think the algorithms can do it. One limitation of current algorithms is that often they are performed on fields that are selected by the pathologist. That does lead to selection bias. I believe that whole slide imaging eventually will lead to scoring of the whole slide, not selected areas. That will improve reproducibility and accuracy. Already, that's being accomplished in some environments.
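
The selection bias described here is easy to see with numbers. In the sketch below each field is a pair of (positive cells, total cells); the counts are hypothetical and chosen only to show how a score from one selected field can differ from the whole slide score.

```python
def percent_positive(fields):
    """Percent of positive cells pooled over a list of (positive, total) fields."""
    pos = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    return 100.0 * pos / total if total else 0.0

# Hypothetical per-field counts across one slide (illustrative only).
all_fields = [(40, 100), (10, 100), (5, 100), (5, 100)]
selected_fields = all_fields[:1]  # e.g. the one field a reader chose to score

whole_slide_score = percent_positive(all_fields)
selected_score = percent_positive(selected_fields)
print(whole_slide_score, selected_score)
```

Two readers who pick different fields would report different scores from the same slide, whereas the pooled whole slide figure does not depend on that choice.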

How do we integrate whole slide image analysis with high power fields manual counts?

PROFESSOR TAYLOR:

That's an interesting question. Again, it's really a learning process. I think if you're going to use an algorithm to count mitotic figures or Ki-67 positive cells or HER2 positive cells, whatever, then the algorithm has to, obviously, be developed and validated, and trained. Once it's validated and trained, the huge advantage of the algorithm is that it is reproducible. It will do the same thing day by day whereas we pathologists drift. I've had that experience with PDL1 scoring with pathologists and pathology groups. We can coordinate our scoring capabilities if we all sit around a microscope, but if we then don't meet for a month or so we drift apart whereas algorithms do not drift.

About the Presenter

Clive R. Taylor, M.D., D.Phil.

Dr. Clive Taylor is Professor of Pathology at the Keck School of Medicine, University of Southern California. Beginning in Oxford, England, in 1972, his laboratory was the first to adapt immunohistochemical (IHC) techniques for use in formalin fixed paraffin embedded tissues and was the first to utilize the resultant stains in routine diagnosis for surgical pathology. He then opened a similar laboratory at the University of Southern California in Los Angeles, focused on lymphoma research and diagnosis. From the early 1990s his laboratory initiated use of IHC methods for assessing prognostic markers (such as ER, PR and AR), and then predictive markers, exemplified by HER2. From 1990, he became a Trustee of the Biological Stain Commission (BSC), which serves as an FDA surrogate and certifies biological stains for use in the USA. As President of the BSC, Dr. Taylor worked closely with the FDA in developing guidelines to manufacturers for IHC reagents, improving reproducibility. Dr. Taylor works with many international and national bodies for improved standardization of IHC and also works on developing computer-based algorithms for ‘Companion Diagnostics’, including HER2, ER, PR and EGFR.