Keynote Transcript


Intel Developer Forum, Fall 2001

Gadi Singer and Sean Maloney
San Jose, Calif., USA
August 30, 2001

GADI SINGER: Good morning, and welcome everybody. Year 2001 is an exciting year for Intel in the enterprise space. We are significantly stepping up the number and the breadth of our products and technologies in that space.

Just to give a few examples, this year we introduced a new architecture, the Intel(r) Itanium(tm) Processor with all its collateral. We brought the Intel(r) Pentium(r) 4 into the enterprise space with the Intel(r) Xeon(tm) for the workstation.

We embraced the emerging ultra-dense server space with highly tuned Pentium(r) III processors. And we announced a few initiatives in the telco space in bringing carrier-grade IA-based systems into that space.

We'll get back to some of those products, but I want to start with what is at the top of our interest, which are the technologies. Intel is driving an acceleration of the technologies in the enterprise space. Part of it is through innovations, and we'll talk about some of those, providing solutions to some of the big challenges that were not solved in the past, getting to the next level.

The other part is standards that enable the industry to provide its own creativity and innovation to move the whole technology in the enterprise space further.

When we talk about going further, there's one thing that comes immediately to mind, which is speed. Just driving the frequency up of every element in the system. Intel has done that, and Intel will continue to do that aggressively going forward.

But we're looking at the value beyond just the gigahertz. And I want to focus on how we're looking at getting the maximum efficiency out of each time tick in the system, each clock cycle in the processor.

When you look at the requirements of the enterprise technology, there are several classes. First, one is having efficient compute power. To have efficiency per cycle, you need to understand what are the needs of your market segment. And in this case, there are some very unique needs for the enterprise; for example, transactions that are very common in enterprise, not so much in the desktop and mobile and others. We need to optimize the computer resources for that, and we'll talk more about it.

It needs to be done in a way that's scalable. We'll talk about scale up and scale out which you're familiar with and some more.

And in a way that's reliable, available, and serviceable, the RAS features, like the end-to-end up time which is so important, or having the confidence in the accuracy of data at all times.

I want to take you through a journey, and we're going to start with systems that span thousands of miles and go all the way down to inside the processor, taking a few hundred microns. And what is interesting is there are three principles that apply to efficient computing, to getting the most out of each cycle, at all those levels.

The first one is the abundance of parallel resources. We'll mark it by those boxes. You have to have all the resource lined up and ready so that you're not waiting for them.

Having abundance of resources is necessary but it's not sufficient, because it can lead to things like bottlenecks and underutilization. So the second principle that we'll see working all across those layers is the ability to efficiently manage those parallel resources so that you maximize utilization and you minimize the idle time.

The third principle is that you have to provide a high bandwidth, very fast interconnect between those elements. Having those three elements in there will provide very efficient use of every time tick.

Now, those could be tuned to one particular environment, but these principles need to be scalable in a way that's very easy and reliable, so as you scale it, you keep the reliability. And I mean Reliability with a capital R which is not only at the signal level but the overall system reliability.

So let's start from the top and see how those three principles apply all the way from the top level down into the processor.

When we're looking at the overall enterprise space, there are two trends that we've seen over the last couple of years, and those will continue to be the two big categories in which you see people organize the work. One is scale up. You have larger and larger tasks, you have a single large database you want to access, you build larger servers with many processors within the server. It's still a single server, but it has multiple processors. That's the scale up.

And then there's the scale out. You can break your workload to multiple activities, multiple tasks. You can scale out just adding more and more servers.

With those trends, that are going to continue to be the key, we're seeing the emergence of potentially a third one. What will happen when you want to solve real big problems, like the biggest problems of the universe or of your corporate, whatever is more important to you, and you want to take problems that used to be in the domain of scale up, but with technologies that belong in the scale out, which are cost effective compute and storage blocks that are connected by high bandwidth interconnect.

What you create is a distributed compute system that can solve the major problems acting as one large system, but doing it through the collection of those disaggregated boxes. And just as an example of that, I want to show the National Science Foundation program. It awarded $53 million to four facilities to create the TeraGrid, 13.6 trillion calculations per second, and have about 450 trillion bytes of data. It will be the most powerful supercomputer on earth. This is created in a distributed manner by putting together 3300 Itanium and McKinley processors connected via a dedicated interconnect which starts from a speed of about 40 gigabytes per second going to 80 gigabytes per second over those four sites.

So this is a compute paradigm we're going to see deployed not only at the very large scales of national, but even at the corporate level.

As we start to go down, as we saw in the previous example, we will talk about connections. Maybe to give a little bit of a perspective on the way we connect things. There are complementary standards that are covering this major big range.

The first one is the Ethernet, which is a powerful standard that covers a wide range of environments from site to site, server to server, and even blade to blade.

InfiniBand* is highly tuned, very effective inside the data center, is used in connecting the servers, the storage, and the networking into one tight network.

When we go even further down we have the 3GIO standard that provides a high bandwidth unifying interface within the server box and blade to blade.

So each of them has a very strong point with some overlap that cross over. Since we're now in our journey from the mega down and we're now at the data center, let's talk about InfiniBand.

The InfiniBand architecture is very powerful and elegant. Let's take a look at a few of its capabilities. And to do that, let's strip out the boxes, look at the diagram of that, and look at some of the capabilities.

For example, availability. Supporting the failover with redundant links and fabrics. Another example is scalability. A single InfiniBand subnet can support up to 64K nodes, and that should be sufficient for a while, at least, especially when you put a few of those subnets together.

Reliability. The fabric eliminates the shared-bus contentions, which are one of the areas where we see some of the issues in today's environment.

And serviceability. Now you have all those nodes, but they have dedicated links versus a shared bus to connect to the heart of the system.

So overall, the InfiniBand architecture gives us a scalable, reliable, high bandwidth interconnect. So in our three principles, it provides the foundation of high-speed, reliable, scalable interconnect within the data center. And to show how it all works together I would like to invite Boris Bialek, manager of strategic technology at IBM. Boris.

BORIS: Good morning, Gadi.

GADI SINGER: Good morning. So what do we have here on stage?

BORIS: This is an enterprise solution stack running mySAP.com and IBM DB2 EEE Extended Enterprise Edition running OLTP over an InfiniBand fabric.

GADI SINGER: What is new about this InfiniBand demo?

BORIS: This is the first time you will see publicly a complete enterprise production-level solution stack running over an InfiniBand fabric. It's a huge leap, jumping from interoperability demos to the e-Business solution you can see here today.

GADI SINGER: That's great.

BORIS: So why don't we look more at the system. We took one rack around to see the view. You see here the server back side. Each of the servers has an InfiniBand adaptor with the HCA connecting into one QLogic switch running the Intel switch chipset.

GADI SINGER: Before you go further. Didn't you forget something when you assembled the system? Where is the cable spaghetti?

BORIS: This is the beauty and ease of use and elegance of the InfiniBand solutions. You just have this one single cable per server storage connectivity, and client/server connectivity as well. That's all you need. So all the snake pits are past. You don't need that anymore.

The other six servers running here are connected to the InfiniBand running an Intel(r) chipset and at the top we have an OmegaGEM that is attached to this storage subsystem. The first server here runs Windows* 2000 with the SAP application server. The next ten servers, as the audience can see on the slides, run a database. So we have 20 processors, 24 Gigabytes of memory in one single database assembled and connected over the InfiniBand fabric.

The last server, the 12th server, is a hot spare for failover and robustness.

So what you see here on the screen is the SAP GUI. So we don't have 10,000 people to enter transactions for us here, so we do a little bit, table scan and a table reorganization to see some load happening here.

So you can see what happens here since the load, and you can see the system starts going. Another subject which is as important is, of course, the usability. You see here the Lane15 management console on the screen. You see all the different nodes here, and if you click on them on the right side, you see here the different nodes, the two switches connected in the center for that solution.

GADI SINGER: So how many servers can be supported by this infrastructure?

BORIS: Our database supports up to a thousand server partitions connected. That means several thousands of processors into one single system. So this system is good for 10,000 users in parallel.

GADI SINGER: Great. So if I'm a developer and I'm sitting in the audience, what would be your advice?

BORIS: I would suggest you participate in the Intel interoperability labs like the one announced in the IDF. Use the available PDK and the DB2 InfiniBand test kits to test solutions you have today and participate in the interoperability testing. And think about the solutions that you need and use InfiniBand to enhance your architecture and the advantages of your solutions you plan to deliver.

GADI SINGER: Okay. Thank you and good luck.

BORIS: Thank you, Gadi.

(Applause.)

GADI SINGER: So we continue our journey, now going inside the server box. And inside the server box, what determines a lot of the availability of the multiple resources and the ability to have a very fast interconnect between them is obviously the chipset and the bus that we have in there.

So what you have here in the diagram is an 870-based system. And some of the features that it has with the scalability port, for example, is the high scalable architecture. You can use that for two way, four way, eight way. You can use the same building blocks in a very convenient manner to build much larger systems than that.

Another focus was on the RAS features of the system. And there's a long list; I just picked a couple. The extensive error detection, correction and containment. So overall you have the processor machine check architecture and its ability to deal with errors if they arise, that you have the chipset now fully extending this into the system level. And there's the Hot-Plug on the scalability port.

Those features of scalability, memory, I/O, are done in a balanced manner to give us an overall balanced system.

Same principles. You're looking at a way to apply all those resources in a way that's managed well, and that's easily connected and scaled.

Going further down into the processor, I would like to highlight two technologies that are supporting those principles of efficient computing at the processor level. And the first one is Hyper-Threading. The Hyper-Threading as a term was probably introduced only two days ago about this time, Tuesday morning, in Paul Otellini's speech.

Since then, in the last two days it became the buzz of the conference, and also of some press articles outside, and rightfully so because it's a very exciting technology. What it is, in Hyper-Threading we duplicate the architectural state of the processor, the data registers, a lot of the control registers. Everything that keeps track of your process is duplicated. There are two copies of this, two hardware copies of this.

However, the core resources, and core resources being the execution units, the cache, the system bus interface, and others, are shared. There is only one set of that.

Because those resources are never fully utilized, or rarely fully utilized, by a single thread, or by a single process, having those two threads, two processes, being able to share those, create a significant efficiency of the resources, going directly to our second principle of efficient use of the resources you have lined up.

By doing that, you can provide up to 30 percent of performance improvement compared to just having one thread on a single processor without Hyper-Threading. And this increases the instruction processing throughput when executing multi-threaded code.
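As a rough illustration of the sharing idea described here, the following is a toy model, not Intel's design. It assumes one shared execution unit that can issue one instruction per cycle, and hardware threads that each stall for a fixed number of cycles after every instruction. The function name `instructions_retired` and the stall values are hypothetical. The model is deliberately idealized and uncalibrated, so its gain is much larger than the "up to 30 percent" figure quoted in the talk.

```python
def instructions_retired(stalls, cycles):
    """Count instructions issued on ONE shared execution unit.

    stalls: one entry per hardware thread; after issuing an instruction,
            that thread waits this many cycles (e.g. on memory) before
            it can issue again. These values are illustrative assumptions.
    cycles: number of clock cycles to simulate.
    """
    ready_at = [0] * len(stalls)   # first cycle in which each thread may issue
    retired = 0
    for cycle in range(cycles):
        for thread, stall in enumerate(stalls):
            if ready_at[thread] <= cycle:
                retired += 1
                ready_at[thread] = cycle + 1 + stall
                break              # shared unit: at most one issue per cycle
    return retired


one_thread = instructions_retired([1], 1000)      # unit idles while the thread stalls
two_threads = instructions_retired([1, 1], 1000)  # second thread fills the idle cycles
print(one_thread, two_threads)
```

The point of the sketch is only that a second architectural state lets otherwise idle cycles do useful work; how much is gained on real hardware depends on the workload and on how heavily a single thread already uses the core.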

The other thing about introducing Hyper-Threading in the enterprise space is many of the applications we have there are already ready for multiprocessors, dual processors or multiprocessors, the high-end workstation and in the server. And all those applications and environments that are ready for multiprocessors can immediately start taking full advantage of Hyper-Threading. For other applications, there might need to be some work to adopt it. But for a lot of the applications, there's no need for code recompilation, you can just use the capabilities. And the Hyper-Threading technology will first appear in an Intel(r) Xeon(tm) processor in '02.

The second element has to do with the EPIC technology and its processors. EPIC stands for Explicit Parallel Instruction Computing. So the fundamental philosophy of EPIC was based on these principles. You have to have abundance of resources, you have to be able to manage, and you have to be able to have very high bandwidth. And the Itanium processor delivered on the first installment of this architecture with abundance of resources and high bandwidth. But it was also done in a very extensible manner, so that when we designed the next generation, the McKinley processor, it just builds on this extensibility on the resources, on the ability to utilize those resources ever more efficiently, and on the bandwidth.

So just to walk through some of the things that we disclosed this IDF, and there's a detailed class on that later on today, the bus bandwidth of the McKinley processor has been tripled compared to Itanium processor, from 2.1 gigabits per second to 6.4 gigabits per second.

The L3 cache has been brought on die to allow for improved, significantly improved, cache latencies.

We have more issue ports, which has to do with better utilization of the execution unit, from nine issue ports moving to 11 issue ports.

To have more execution units, we added from four integer units to six integer units and the ability to have more flexibility to have two loads and stores. And we bumped up the frequency from 800 MHz to 1 GHz.

So overall, building on the Itanium foundation, creating a significant improvement between the Itanium, which by itself is a very powerful processor, to the McKinley, the next generation.

The McKinley is binary compatible with Itanium. So those of you who have developed and are developing code for Itanium, your code is going to run very well with a significant improvement on the McKinley as is. And the McKinley performance improvement is expected to be about 1.5X to 2X compared to Itanium, most of it accomplished without any recompilation. Just by using the Itanium code.

Just as an example, we're having the McKinley system being tested in our performance labs, and based on the early testing, real testing, we estimate, we expect that SpecInt2000, as an example, will run 70 percent faster on the McKinley processor using Itanium processor code as is.
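The figures quoted in this part of the talk can be put side by side with a little arithmetic. The sketch below uses only numbers stated above; the dictionary keys and variable names are made up for the illustration, and the bus figures are treated as unitless because only their ratio is used. Dividing the estimated speedup by the clock ratio is a simplification that attributes everything not explained by frequency to per-clock improvements.

```python
# Figures as quoted in the talk; key names are illustrative only.
itanium = {"bus": 2.1, "mhz": 800, "issue_ports": 9, "int_units": 4}
mckinley = {"bus": 6.4, "mhz": 1000, "issue_ports": 11, "int_units": 6}

bus_ratio = mckinley["bus"] / itanium["bus"]      # roughly a tripling
clock_ratio = mckinley["mhz"] / itanium["mhz"]    # 800 MHz to 1 GHz

# "70 percent faster" on unmodified Itanium code, per the early estimate above.
specint_speedup = 1.70

# Share of the estimated speedup not explained by clock frequency alone.
per_clock_gain = specint_speedup / clock_ratio

print(f"bus bandwidth ratio: {bus_ratio:.2f}x")
print(f"clock ratio:         {clock_ratio:.2f}x")
print(f"per-clock gain:      {per_clock_gain:.2f}x")
```

Read this way, the frequency increase alone would account for a 1.25x gain, so most of the estimated improvement would have to come from the wider issue, the extra execution units, the on-die L3 cache, and the faster bus.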

Now, what you see up there is us taking off. We need some lifting experiences, so we're taking off, and we're going to go through the clouds and go into space. What you are seeing is a streaming from backstage. We have a four-way McKinley system with an Intel(r) 870 chipset that is doing a real-time blending of uncompressed video. This is frame by frame, creating this sequence in the same way you create animation movies like Shrek and others.

This is done in real time, large data sets, well over 4 gigabytes. The way this is done is using Itanium Processor code, running very well with high performance over our McKinley-based system backstage.

Taking a sneak peek at some details of future Itanium products and technologies, we are now in advanced stages of the Madison processor development.

The Madison processor uses the same chipset, the same system bus, the same processor form factor as McKinley. It is developed on the latest process, the .13 micron process, and it enables us to double the size of the L3 cache compared to McKinley. So we will have up to six megabytes of on-die L3 cache, and it is targeted for platform release in 2003.

Other technologies we're looking at are multithreading for the Itanium family, and multi-core and multi-die technologies for this family.

To remind you, when we met in spring IDF, the Itanium itself was part of the "futures" section. Since then, in May, we introduced the Itanium processor, and there were many platform releases. We had many OEM systems being introduced, several operating systems, many applications introduced since then.

There are people in the audience that developed with us through this major grand introduction, and I would like to take this opportunity to thank you, because it is work that's done with the basis of the Intel Itanium processor, but with a lot of work from the industry. So thank you very much for that.

Since then, we had a lot of announcements. Just to highlight a couple.

One is the Compaq-Intel announcement that you are familiar with which provided significant momentum to the overall transition of future high-level systems to the Itanium processor family. And another announcement just this week from Microsoft about the Advanced Server LE which completes a line of operating systems in production that is the widest system in OS support for any 64-bit platform.

So the hardware is out there, the operating systems are out there, and the applications are coming in, and together creating this ecosystem that will drive this forward.

Now, we wanted to just illustrate a little bit the Itanium processor in action, so we thought about an environment that has thousands, tens of thousands events happening in real time, and then there are thousands of people that want to all get their own perspective on those events.

So after we give this kind of brief requirement description, we thought about financial systems. In order to give us a little bit of illustration on this, I would like to invite Tom McDonell. And Tom is the director of the Reuters Innovation Labs in New York. Welcome.

TOM: Hi, Gadi.

GADI SINGER: What are the innovation labs about?

TOM: Reuters business is around supplying news, trading information, so on to the financial communities. What the New York-based innovation lab is about is primarily exploring technologies, emerging technologies, coming out of the Intel-Microsoft space, determining its applicability to Reuters' businesses, and then trying to leverage any of the capabilities.

GADI SINGER: So what are we going to see today?

TOM: Well, what we have right here, what I wanted to show everybody, was an Internet application. So what we have here is a browser-based client with painting, connected back to our New York-based Web site. And that Web site is operated on a 24-by-7 basis and has several servers located in it.

One of those servers runs an application we call the e-Notifier which is responsible for all the dynamic behavior, or most of the dynamic behavior you see throughout this display right here. And that runs on an Itanium four-way processor.

GADI SINGER: So what we're seeing in the demo here, we just have a browser really loaded on the client and everything is working back in the New York labs using the Itanium server, which is up 24-7.

TOM: That's correct. Let me tell you a little bit about the browser-based client right here. Now, it's what we call a zero footprint desktop in that it downloads a script, and the script runs in the context of your browser, and provides you functionality for defining your screen and your layout.

So basically you divide your screen into various panes, and then you go over these panes, you highlight the bar, and then you put a various application or so in the different spots. Okay?

So you go through this manner, and you build the application up. You put your news, your quotes, your streaming data, and what happens in the background is we define an XML configuration file so you can then restore this once you've defined it.

GADI SINGER: I am not sure that putting real-time trade information is good for keeping the attention of the audience on the demo. This requires a lot of customization for each of the clients because they have to have their own filters and own views per each of the thousand clients.

TOM: That's correct. That's the other function of the script, is basically as you're filling in these applications, you're specifying what instruments you desire, you're specifying on the news and filter criteria that you want. And what happens is the script sends that information up to the e-Notifier service running on the Itanium processor which then maintains files on a per-client basis. So anytime any of the data stores that support these various windows receive an update, then the e-Notifier does a rapid turnaround to determine if it's applicable to a particular user and then customizes a stream down to that user with just that content.
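The per-client filtering described here can be sketched in a few lines. This is only an illustration of the general publish/subscribe pattern, not Reuters' e-Notifier: the class name `Notifier`, its methods, the client names, and the ticker symbols are all hypothetical, and the sketch keeps subscriptions in memory rather than in the per-client files mentioned in the talk.

```python
from collections import defaultdict


class Notifier:
    """Toy per-client filter: route each update only to interested clients."""

    def __init__(self):
        # instrument -> set of client ids that asked for it (hypothetical layout)
        self._subscriptions = defaultdict(set)

    def subscribe(self, client_id, instruments):
        """Record the instruments a client's panes are displaying."""
        for instrument in instruments:
            self._subscriptions[instrument].add(client_id)

    def publish(self, instrument, payload):
        """Return {client_id: payload} for just the clients whose filter matches."""
        return {client: payload
                for client in self._subscriptions.get(instrument, ())}


notifier = Notifier()
notifier.subscribe("alice", ["INTC", "IBM"])
notifier.subscribe("bob", ["INTC"])
print(notifier.publish("INTC", {"last": 30.5}))  # goes to both clients
print(notifier.publish("MSFT", {"last": 60.0}))  # nobody subscribed, nothing sent
```

Indexing by instrument rather than by client means each incoming update costs one lookup, which is one plausible way to get the "rapid turnaround" described when many data stores update several times a second.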

GADI SINGER: Now, you chose an Itanium-based system to run this on, and I'm sure you had excellent reasons for that. Can you give us a little bit of insight, what is the value that the Itanium-based systems bring to this application?

TOM: This application, you'll see there's a lot going on here and this is but one client and our target is to scale this platform to thousands of clients, each running multiple panes of information, and many of these data stores are receiving multiple updates a second.

So we have to fan this wide, and bring this all the way through to the various clients in a very timely manner. And for that reason, we chose the Itanium for its scalability as well as its reliability.

GADI SINGER: Well, thank you very much, and good luck.

TOM: Thank you.

(Applause.)

GADI SINGER: So we talked about what are the requirements to get the most out of each time tick, out of each clock cycle, and we talked about the three principles of efficient compute power, the abundance of parallel resources, maximizing utilization of those resources, and giving high bandwidth interconnect at each layer. And we showed how some examples at all layers, be it box-to-box with the InfiniBand or lower down, provides you the scalability and the RAS required to do that.

I would like to conclude with your opportunity, because as I said, this was a very exciting year for us, and a lot of opportunities here. And for the industry to take advantage of this, one piece of advice is to innovate around the increasing parallelism. This is going to be one of the fundamental forces driving innovation forward, this ability to understand and utilize a parallel system.

So develop your systems around concepts of multiple threads, of Hyper-Threading technologies, especially for Xeon(tm) processors and future IA-32 products.

And address the single largest growth opportunity, which is winning the enterprise with the Itanium processor family. The servers, the workstations, the operating systems, they are here and in production. And for the application vendors, you need to complete your Itanium processor validations and release the production software for the Itanium processor today.

They say "it takes a village". Well, for technologies to bear full fruit, it takes an industry. So this is up to us, all of us, to make the full potential that is in the enterprise space. Thank you very much.

(Applause.)

GADI SINGER: I would like to invite to the stage Sean Maloney. Sean is Intel's executive vice president and the general manager of the Intel Communications Group. Sean.

(Applause.)

SEAN MALONEY: Hi, Gadi. Well, thanks a lot, Gadi, and good morning everybody and welcome to the last IDF speech.

(Applause.)

SEAN MALONEY: Are you really happy? It's the last one.

(Applause.)

SEAN MALONEY: Yeah, right. Or would you prefer to stay the whole weekend and have a few more? I'd like to thank you very much for staying around until the bitter end, and we're going to have a little look at the communications industry and life after the slump in the communications industry.

It really is a slump in the communications industry rather than a recession. Probably this is no surprise to you because you've been looking at the industry over the last few months, but if you date the industry back to Alexander Graham Bell and the telephone in 1870, between then and 1990, there wasn't really ever a boom, there wasn't really a recession because it was an industry that was largely regulated. And government regulations kept the thing kind of just ticking along.

Then there was an explosion of deregulation, an inflow of stock market money. There was a huge explosion in capital expenditure and now we have a hangover after this party.

But this, too, shall pass, and the industry will, of course, recover again. And I want to look at that, what we consider to be the key technologies over the next three years or so. And also to look at the current state of the environment and what's going to be important to our customers, the service providers, and what isn't.

Now, of course, because it's a communications talk, we do have some demonstrations. We have some networked equipment. For the first time in history, you're going to see a demonstration by a communications organization that has a piece of equipment that is not networked. It is this microwave oven here.

Now, you remember about two years ago there was a kind of craze to have everything with an IP address, you know, fridges, dish washers, cat litter boxes, everything, right? This is a microwave, it doesn't have an IP address, an Ethernet cable, or anything to do with networking. It has an incredibly important function, and that is to make popcorn.

(Laughter.)

SEAN MALONEY: So I'm going to show you that a little bit later on and we'll weave it in somehow or other into the speech.

Let me start by talking about the current state of the business. I'd like to put ourselves for a minute in the shoes of our common customers who by and large are the service providers.

Now, many of you are making technology for the computer side of the communications business, and that by and large is moving along reasonably well at the current rate of PC sales and server sales.

Those of you who are selling into the telecommunications industry by and large are designing technology for telecommunications providers, and trapped between what I call a rock and a hard place.

Now, the rock is a constant increase in IP traffic. And we're sort of neurotic at looking at this month after month, week after week. But no matter how much you study it, it carries on increasing, increasing, increasing, despite the dot-com slowdown, despite the number of companies that have gone bust, despite the huge slowdown in the U.S. and Taiwanese economy, and so on, this traffic keeps growing. The latest which came out in the newspapers last week is the traffic, the growth has accelerated, over the last year, and it has grown about fourfold over the previous 12 months, which is quite extraordinary and counterintuitive when you think that we've just gone through this slowdown.

The rise of Internet traffic appears to be inexorable. If you're a service provider meeting this traffic, it feels like good news that you have this constant demand. But on the other hand, there's a problem, and that's what I call the hard place. The line at the top there shows the capital expenditure relative to 1996. You see this huge peak in CapEx that went out when those making communications equipment had a huge boom last year, you're now seeing a drop in CapEx and at the bottom you see profit and revenue relative to 1996.

The thing to take away from this is that the profit and the revenue of our customers is moving back towards 1996 levels at the same time when they have to deal with massively more traffic.

I think even though the service providers, our customers, are trying to find more and more new services using intelligent network processes and so on, the likelihood is, for the next three to five years, revenue profit is going to be a huge challenge in the service provider community. That's the first observation.

Second observation is to do with the PC itself. And of course Moore's Law, as you all know, carries on moving, moving, moving. Along with Moore's Law, the hard drive, which carries the same principle, carries on. And so every few months, processor density increases.

One of the things if you look at it is the rate of technology absorption by our customers, the consumers. There is a very small gap between the maximum capacity hard drive or the maximum speed processor that can be made and what is actually bought by customers.

So right now, the average PC is somewhere in the gigahertz, 1.2 GHz range. The maximum shipping is around about 2 GHz. There's probably only about a six-month lag between the maximum theoretical number and the actual absorbed number in the marketplace. And the same is true of hard drives.

So on the computer side of the computer communications business, computer technology, new technology that you produce, moves extremely rapidly out through the marketplace.

Theoretically, the same thing happens in communications. Three or four years ago, one of the industry observers, Gilder, came out with an observation that he called Gilder's Law, and it was actually the same thing is happening in communications. And you can get so many lambdas down a fiber that actually -- this was a graph they produced -- actually bandwidth was going to accelerate away from computers, and there will be more bandwidth available than compute power. And the real crisis was going to be that compute power couldn't keep up.

That was a theory that was fine in theory, but in practice just doesn't happen.

Look at the same data again between the maximum theoretical bandwidth and the actual bandwidth that's delivered to a consumer, and you see that the maximum theoretical bandwidth right now is around 10 gigabits. The actual bandwidth that a consumer gets, of course, is many orders of magnitude less than that. I don't think there's anybody in this room who believes that the average consumer is going to do much better than DSL speed, certainly in the United States, in the next five to ten years.

So we're moving inevitably towards a world, in five years' time, when you're going to have 20 GHz, 40 GHz processors, one terabyte, ten terabyte hard drives, hard drives that are big enough to take the entire contents of the U.S. Library of Congress and the consumer is likely to still have the same pipe.

Now, you can get depressed about that or you can accept it as a probable reality. There are many things we can do to increase bandwidth to the home, and we can carry on doing them, but we have to work on the assumption that there isn't really a corollary to Moore's Law in terms of bandwidth.

The consequence of those two observations for our customers is cost reduction is do or die. Our customers have to carry on delivering increased profit. That's what Wall Street expects. Whether you're in Europe, Asia, or the United States, capital expenditure is going to be under a huge crunch. The single biggest thing we have to do is constantly reduce the cost of handling packets. Not only the capital expenditure, the unit cost of the equipment to do packet handling, but the maintenance cost because operating cost in service businesses is 75 percent of the problem. So operating cost through reduction of maintenance cost, reduction of cost for adding lines, repositioning lines, reconfiguring, maintaining, all of those things are absolutely critical, and it's the number one, two, three, four, five thing we have to do for our customers in the next four or five years.

Technology is critical, but more than anything else for our customers in the next five years, it's going to be reduction in CapEx, reduction in operating cost.

In addition, the communications industry is going to have to move rapidly towards mass production based around more standardized electronics components rather than highly specialized boutique products. Now, that's already happening quickly. There is a rapid move in our customers in communications towards outsourcing from the other side of the Pacific, and that is likely to accelerate too.

Underneath all of this there are many, many essential technologies, but we believe the top two technologies are Ethernet in all its manifestations, wireless Ethernet, one gig, ten gig, storage Ethernet; and network processing.

Ethernet is not the only access technology that will predominate in the next five to ten years, but by and large, in most cases, Ethernet is the one that will, over a period of time, tend to win. So I want to concentrate today on the next series of Ethernet transitions, and then give you an update on where we are on our packet processing technology and network processing.

So first of all, let's have a look at those four key Ethernet transitions. You can probably look at six or seven Ethernet transitions in the next three years. To me, the big four that we need to look at are wireless, 802.11; gigabit in the enterprise, the transition from 10/100 to gig; storage, where the storage industry has a series of opportunities using Ethernet in the next two to three years; and then 10 gigabit in the Metro, ten gig being kind of a unifying facility for linking gigabit and Metro.

First, let's look at wireless. Obviously, the big news there in wireless is the kind of inexorable rise of WiFi 802.11. Maybe a year and a half, two years ago there was a series of discussions between HomeRF, various other kinds of connectivity technology, and I think the industry as a whole has swung heavily towards 802.11, and I know many of you are busy designing those products in.

What I'd like to do now is call up Elise, and for Elise to kick off a couple demos we have on wireless. Hi, welcome.

ELISE: Thank you, Sean. I have my work laptop here. It has Windows XP on it, and I have an 802.11b wireless network card.

SEAN MALONEY: Right.

ELISE: So one of the features I've enabled on my laptop is Universal Plug and Play. What this allows me to do is to walk into my home, and my residential gateway will automatically detect my card and configure my system so if I want to share files through all the network in my house, I can do that simply through my wireless network.

SEAN MALONEY: So normally what happens, you have to reconfigure and change your IP address and do all that kind of stuff which is a real pain, right?

ELISE: Yes, it is.

SEAN MALONEY: So Universal Plug and Play with 802.11 sort of fixes that.

ELISE: Yes, it does.

SEAN MALONEY: Are you going to give us a demonstration?

ELISE: Absolutely.

SEAN MALONEY: Before you do that, is it okay if I make some popcorn?

ELISE: Ahhh, I really wish you wouldn't do that.

SEAN MALONEY: Why not? I'm feeling hungry.

ELISE: Well, 802.11b might have a little problem with the microwave, so I'd really prefer we don't do that right now.

SEAN MALONEY: Okay.

ELISE: So say I put my work laptop down, and I have some files on there I want to share with my family. So we'll just go into the living room and say I have a set-top device that's UPnP enabled on my TV. We can look at photo albums, some of the pictures I have on my laptop, we can listen to MP3s, my family has tons of them on the wireless network, or look at a video I've downloaded here.

SEAN MALONEY: This is where you want the popcorn or something like that, but I can't do that, right?

ELISE: Well, not really, yeah.

SEAN MALONEY: Okay. This is Gorillaz, right?

ELISE: Yeah, this is a rather interesting band.

SEAN MALONEY: Good, good, good. Thank you very much. And so I'm okay doing this. This is going to be a great home technology as long as I make sure that my kids don't use the microwave.

ELISE: Okay.

SEAN MALONEY: I get it. All right. So actually, the microwave isn't just the only issue, of course, that may impact this technology. And what we're trying to do as an industry is come out with something that is bulletproof, reliable, doesn't get people slightly worried like don't use the microwave, that kind of thing.

So there are a series of issues that need to be addressed. Bandwidth, security, how it aligns with the other home networking standards so we don't have confused customers, and how it becomes absolutely standard and ubiquitous, which comes down to some level of integration to the platform.

Now, the first one is generally on bandwidth. And of course to the rescue comes 802.11a. And as you guessed, my sarcastic comments about the microwave are to do with 802.11a.

Some of the slams that have been made in some areas of the press about 802.11a is that over a distance of a few hundred feet, you get a very, very dramatic drop-off in performance, and that dramatic drop-off in performance is such that it's not worth moving from B to A.

What you see up here shows that even when you go out over 200 feet out to 200, 250 feet, you're still getting kind of a two and a half times performance improvement over 802.11b, and that translates, obviously, into having an experience much more like a normal desktop experience. And also, into being able to simultaneously handle far more people, which in a public access point like San Jose Airport or the restaurants around here that are getting configured with WiFi, that's going to be very significant. And, Elise, maybe you can give us a demonstration of 802.11a?

ELISE: Sure.

SEAN MALONEY: While you do that, I'll put the microwave on.

ELISE: Absolutely, go ahead. What you're looking at here is video being streamed over my 802.11a wireless network, and as you can see your microwave is popping the popcorn and the video is streaming seamlessly.

SEAN MALONEY: Great. Thank you very much.

ELISE: Looks like it's done. You know, I'm kind of hungry.

SEAN MALONEY: Here you go.

ELISE: Do you mind if I take that?

SEAN MALONEY: Not at all. You take it. Thanks very much, Elise.

(Applause.)

SEAN MALONEY: So the transition to 802.11b is actually a wonderful technology, obviously, and the chances of your microwave interfering with it are low, but any anxiety like that in the consumer base, we as an industry need to address. And obviously 802.11a is in a position where that problem is fixed.

In the business community, the big problem is security. You go back and look at The Wall Street Journal a month and a half, two months ago about the people cruising around the Valley here and eavesdropping on everybody's traffic, security is a really big deal.

About a year ago in the Intel Labs in Oregon, we discovered a flaw in WEP. And in common with the rest of the industry, we've been working together diligently to try and fix that.

We have editorship at the moment on the 802.11 TGi spec. That seems to be moving along very, very well. There is a proposed draft coming out. And we believe that that specification will be able to intersect with 802.11a over the next 12 months or so and by the second half of next year we're going to see 802.11 -- second half of 2002, we'll see a whole series of 802.11 products coming out that have new security features that will make people far more comfortable with the technology in addition to the bandwidth.

The third issue is on integration with other home networking technologies. Couple of areas there. The consumer appliance folks who have been very keen on 1394 are obviously now moving strongly towards 802.11. And there is a working group that Intel is heavily involved in -- defining standards for running 1394 on top of 802.11. And we are also participating with many of you in solving the quality of service issues that are necessary on 802.11 with 1394 so that you can do the seamless, very smooth stuff that you need in home consumer electronics, like video editing, without any hiccups or any pops.

So those issues are fixable. And we also believe we can work on some of the longer term connectivity issues, with automobile connectivity, so you'll be able to move your car, for example, move it into your garage, and your built-in MP3 player in your car will be able to synchronize with your home hi-fi and download files and so on.

Where we sit working with yourselves, it seems most of the big issues on 802.11 are getting the attention they need and this issue is moving along very, very quickly.

The final area then is ubiquity, and that's making sure everybody can have it. In the past, the best known method of solving that is to significantly reduce the cost and make it a standard feature. And in the past, one of the great ways we've had of doing that is by doing partial integration into the chipset. It worked wonderfully with USB. You may remember back on the 440, it worked with Ethernet as well. And obviously, in the Intel Communications Group, we're working very closely with Intel Architecture Group, and we see there some clear technology directions out over the next three years so that we can standardize the client with wireless connection. It would just be a really cool thing for our industry if every single client everywhere had built in wireless connectivity. It would change the way that people think about computing and give people new reasons for buying more computers.

So, that's the first of the Ethernet transitions. The second one is gigabit in the enterprise. And my suggestion is that gigabit in the enterprise is actually going to happen faster than the industry is currently expecting.

Let's just take a look back at the last big transition, which was 10/100. And what drove 10/100 was the convergence into the platform of several new technologies, silicon integration, and then also backwards compatibility.

What happened was those things came together, those three things came together, and in 1997 we had crossover from original 10 Ethernet onto 10/100. That happened in 1997. And then about a year and a half, two years later the switch multiport MAC PHY business also then toggled over, so about 18 months later the switch market went in volume from being 10, principally, to 10/100.

A similar series of phenomena are beginning to happen on gigabit. There are new technologies emerging that are driving a new level of platform performance. Pentium 4, PCI-X.

Silicon integration is happening very, very quickly. I'll show you that in a second. The new gigabit silicon is highly integrated and low cost.

Obviously, we now have this multimode silicon that can do 10 or it can do 100 or it can do gigabit with auto sensing to the switch.

From Intel's point of view, we're in a very healthy competitive race with a principal competitor in this space, and our teams are busy reducing, reducing, reducing the size so we reduce the power consumption, and at the moment we are shipping our integrated MAC PHY, and we have a road map stretching out for very, very high integration that we think will keep us extremely competitive.

So we expect the gigabit transition due to cost reasons, platform integration, and to new features on the platform, gigabit -- the gigabit transition is going to be well underway in the course of the next 12 months. And by the time we get back here for next IDF, many people will be shipping gigabit, not only, obviously, standard in the server but also be standard as a standard option in the business desktop as well. I think that as an industry, we need to move ahead with the rest of the support infrastructure, including the switches, to support that.

The third of the key Ethernet transitions is in storage. What's driving the move towards this external Ethernet-connected storage is just efficiency.

Locally connected storage, as the hard drive capacity increases, gets more and more inefficient, locally attached disks become more and more underutilized. You have fragmented storage across the enterprise or throughout the office, and you have huge areas of our hard drives that aren't being used and other areas that are overused.

The logical solution is to attach the storage onto the network and onto Ethernet, and that transition is underway quickly.

Couple of big elements to that, two kinds of technology pieces. The move from ATA SCSI to serial ATA which kind of impacts inside the system. And then on the networking basis, a two, three, four, five-year transition away from fiber channel over to iSCSI.

I don't believe fiber channel is going to be eliminated for many, many years because of the huge installed base on fiber channel, but there is a rapid move to iSCSI, very broad industry agreement, couple hundred companies now are supporting iSCSI, and it's going to have a big impact on our industry.

What I'd like to do now is call up John Wakerly who is vice president and chief technology officer at little old networking company Cisco. Good to see you.

JOHN WAKERLY: Sean, good to see you.

SEAN MALONEY: So we have little old chip company Intel and little old networking company, Cisco. And we're going to talk about storage. Maybe you can show us what we've got.

JOHN WAKERLY: Sure. What you will see shortly on the screen is a video that is actually a very high resolution, high bandwidth video stored on disk here. And it's actually quite a high bandwidth demand on the storage network, over 250 megabits per second.

So the disk itself is stored -- the video itself is stored on the disk in these JBODs. Who doesn't know what a JBOD is? Just a bunch of disks, okay? And the video is coming out of the disks on fiber channel. Actually, the server's file system will normally talk to the disks using SCSI transactions and a SCSI driver within the file server.

In this case, we have the fiber channel instantiation of the SCSI transactions coming across into a Cisco storage router, the SN 5420, which takes the SCSI transactions encapsulated in fiber channel and reencapsulates them into gigabit Ethernet with TCP/IP encapsulating the SCSI transactions.

In this way the storage traffic can be carried anywhere where there's Ethernet. You don't have to extend your fiber channel into the wide-area network and the like.
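The layering described above, SCSI carried inside iSCSI inside TCP/IP inside Ethernet, can be sketched as a simple byte count. The header sizes below are standard minimum values (48-byte iSCSI basic header segment as in the published iSCSI standard, 20-byte TCP and IPv4 headers, 14-byte Ethernet header plus 4-byte frame check sequence); they are not figures from the talk, and the function names are hypothetical. The sketch assumes one iSCSI PDU per frame and no header options.

```python
# Header sizes in bytes. Standard minimums, not figures from the talk.
ISCSI_BASIC_HEADER = 48        # iSCSI basic header segment
TCP_HEADER = 20                # TCP header without options
IPV4_HEADER = 20               # IPv4 header without options
ETHERNET_HEADER_AND_FCS = 18   # 14-byte header + 4-byte frame check sequence

def wire_bytes(scsi_payload_bytes):
    """Bytes on the Ethernet wire for one SCSI payload wrapped in iSCSI/TCP/IP."""
    return (scsi_payload_bytes + ISCSI_BASIC_HEADER + TCP_HEADER
            + IPV4_HEADER + ETHERNET_HEADER_AND_FCS)

def payload_efficiency(scsi_payload_bytes):
    """Fraction of the wire bytes that is actual SCSI payload."""
    return scsi_payload_bytes / wire_bytes(scsi_payload_bytes)

total = wire_bytes(1024)
efficiency = payload_efficiency(1024)
print(f"1024-byte payload -> {total} bytes on the wire ({efficiency:.1%} payload)")
```

The byte overhead is modest; the cost the demo goes on to highlight is the processing needed to terminate these layers, not the extra bytes.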

Finally, those storage transactions arrive at the file server itself, which contains an Intel PRO/1000 T gigabit Ethernet adaptor card. Within the file server itself, again the driver is talking SCSI at one end of the file system, but on the other side it's actually talking to the gigabit Ethernet and doing the TCP/IP termination.

Now, that's quite a bit of computational load on the processor to terminate both the TCP/IP protocol as well as the iSCSI protocol. And if you look on the screen here you'll see we're using almost 100 percent of the processor computational capability, a little bit driving the video and a whole lot terminating these protocols.

Now, we have a second copy of this entire setup with another set of storage, another Cisco SN 5420 storage router, and another file server, the only difference being that in the second file server we've replaced the standard Intel PRO gigabit Ethernet adaptor card with a TCP/IP gigabit Ethernet adaptor card. The difference in that second adaptor card is that it has additional silicon and intelligence that actually terminates the entire TCP/IP protocol. That offloads the protocol processing from the main processor and allows for faster operation.

SEAN MALONEY: I suppose if you're just making microprocessors, it's rather nice to consume the entire power with TCP/IP, right?

JOHN WAKERLY: Actually, we want to be able to leave a little bit left over for the storage processing; right? And the file system. And actually, if I show you on your left-hand side screen here, you'll see now the performance with the offloaded gigabit Ethernet adaptor. And there you see that the processor is cruising along with just about 15 percent CPU utilization running the video itself, and all that TCP processing, all the interrupt load that occurs during that processing is being handled by the adaptor card itself.

I might also add that the Intel silicon that is used to perform that offload is also used in a recently introduced Cisco product also to do TCP processing in a content switching application. So we're very, very happy with that technology.

SEAN MALONEY: Great. And in general, figuring out the TCP offload problem obviously is an essential part of making this thing useful.

JOHN WAKERLY: Absolutely, absolutely.

SEAN MALONEY: All right, John. That's great. Thank you very much, indeed.

(Applause.)

SEAN MALONEY: Okay. So the fourth and final one of the transitions is 10 gigabit. Now, 10 gigabit is tantalizing. It's tantalizing because 10 gigabit in the enterprise, 10 gigabit in the Metro, 10 gigabit in the WAN, we can imagine ourselves moving towards a time when you're almost unaware whether or not a resource is local or remote.

And one of the ideas that's been discussed on 10 gig over the last two or three years is that it is an intersection technology where you're able to access resources, whether locally or remotely, and almost not know whether or not they are local or remote.

Now, I think that is some long ways off, but there is no question that 10 gig is a critical transition over the next year, year and a half.

10 gig really is driving through the Metro, and it's moving out into the long haul with OC 192. OC 192 is obviously increasingly being used on long haul. And there is no way that SDH or Sonet is going to be replaced by 10 gig E anytime soon. But you will see more and more products coming out that are capable of either supporting SDH/Sonet or Ethernet. And the two single sets of products with minor differences will either be Sonet/SDH capable or 10 gigabit Ethernet capable.

Eventually, you'll see 10 gig E pushing down into the data center, 10 gig E moving alongside InfiniBand, InfiniBand clearly as a high end, fantastic quality of service protocol for systems interconnect and for storage, but you'll see 10 gigabit Ethernet pushing into the data storage and other activities as well.

10 gig E over short distances is clearly going to be running over copper, but for the medium level distances, for data center connections, for connections across buildings and across campuses, I think we're going to see the industry moving towards standardizing around 1310 nanometer cable, and that means we're going to have to significantly reduce the cost of optical connection.

A cost reduction in optical is going to come down to integration. And one of the things that we're working very intensely on in Intel Communications Group is driving down the path that you see here from left to right, moving down the path of integrating so that we reduce the overall cost of the 10 gig module. And in doing that, we also reduce the capital cost by making -- not only the capital cost but the operating cost for line repositioning, reconfiguring and so on, which really adds a huge cost to our customers.

This is a relatively rapid movement, and right now the industry is moving quickly at the 10 gig level towards the use of transponders. And Intel and a number of other companies are moving into that area quickly.

Our chosen technology direction is to work right now on integrating the physical layer and the optics, so the kind of PMA, PMD level. And the SerDes, and put them into small industry type packages so they're very easy for people to plug in and plug out. Then we see the transport level activities being off the module so people can still do differentiation on quality of service or differentiation on reach and so on on stuff like advanced FECs and wrappers.

Over the next three years, optical will inevitably push right back into the data center, and as an industry we need clear standardization, movement around standardized fiber, cabling, just the same way as we all went towards Cat. 5 a decade ago. And in doing that, by standardizing the fiber connection in the data center, in the campus and the Metro, we should open up for more and more technology products to hang onto that network just as we did when we standardized around USB.

That's the four transitions on Ethernet. Let me move towards the close on this by giving you an update on network processing.

You remember a little bit over a year ago we kicked off our own network processing architecture, IXA. The progress has been pretty good. IXP1200, which was the first instantiation of that, has moved along very well. We've got a bunch of Developer Forum members. We've got lots and lots of design wins. A lot of those have already gone into production and are deployed live.

What's interesting now is the next generation of IXP, and that is working on a problem that we kind of talk about, net ops per second. Let me describe that to you.

You have a multiple problem. As the line speed goes up, you also want to do more and more things to each packet that comes in. So you've got a problem that you have the bit rate per second, or kind of packets per second, and then around that you have all these various services that need to be done, authentication, cryptography, intrusion detection. Each one of these requires a varying number of CPU cycles to handle it and the combination of those two things, the bit rate, 1 gig, 10 gig, 40 gig, times the operations per second comes up with this kind of net operations per second that you'll hear us referring to more and more to gauge how much power you need to handle and dispose packets.
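One way to read the "net ops per second" idea is as packets per second multiplied by the operations each packet needs. The sketch below follows that reading; the per-service operation counts and the 64-byte packet size are hypothetical values chosen only to show the arithmetic, and the function names are not from the talk.

```python
def packets_per_second(line_rate_bps, packet_bytes):
    """Packets arriving per second at a given line rate and packet size."""
    return line_rate_bps / (packet_bytes * 8)

def net_ops_per_second(line_rate_bps, packet_bytes, ops_per_packet):
    """Packet rate times the total operations each packet requires."""
    return packets_per_second(line_rate_bps, packet_bytes) * sum(ops_per_packet.values())

# Hypothetical per-packet costs for the services named in the talk.
HYPOTHETICAL_OPS = {
    "authentication": 50,
    "cryptography": 400,
    "intrusion_detection": 150,
}

for rate_gbps in (1, 10, 40):
    ops = net_ops_per_second(rate_gbps * 1e9, 64, HYPOTHETICAL_OPS)
    print(f"{rate_gbps:>2} gig: {ops:.3e} net ops per second")
```

The point of the model is that the requirement grows with both factors at once: raising the line rate or adding a service each scales the total.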

The challenge is that, obviously, as you increase the line speed and you increase the things that you need to do, like security, you have a shorter and shorter window to dispose of those packets. And once you get up to 10 gigabits per second, you get down to something like 35 nanoseconds to be able to do anything much useful. And of course you cannot afford to have any dropped packets.
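The shrinking window can be checked with a one-line calculation: the time a packet occupies the wire is its size in bits divided by the line rate. The talk does not say which packet size gives 35 nanoseconds; the 44-byte figure below (roughly a minimum 40-byte IP packet plus a few bytes of framing) is an assumption that happens to land near that number.

```python
def packet_window_ns(packet_bytes, line_rate_bps):
    """Time in nanoseconds between back-to-back packets of the given size."""
    return packet_bytes * 8 * 1e9 / line_rate_bps

ASSUMED_PACKET_BYTES = 44  # assumption: 40-byte IP packet plus framing overhead

for rate_gbps in (1, 10, 40):
    window = packet_window_ns(ASSUMED_PACKET_BYTES, rate_gbps * 1e9)
    print(f"{rate_gbps:>2} gig: {window:.1f} ns per packet")
```

At 10 gigabits per second this gives about 35 ns for the assumed packet size, and the window shrinks in direct proportion as the line rate rises.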

This requires, in our view, standardization around APIs so that the incredibly important software that gets written for this can be ported generation to generation. And also going around industry standard memory interfaces and buses and so on.

IXA is Intel's architecture for packet processing. You can think about IA-32 and the Itanium processor as Intel's architecture for data processing. For packet processing, our architecture out through the next decade and more is IXA. And the IXP processor is an example of that.

IXA is really the sum total of our XScale processor architecture, and on top of that, our microengine architecture for handling packets.

What I'd like to do now is talk about second generation IXA. And call up Matt who is the director of communications processing. I guess, to me, Matt is the guy who understands how all this works. Welcome, Matt.

MATT: Thank you.

SEAN MALONEY: What's special about this packet processing?

MATT: The IXP employs multithreaded microprocessors optimized for software pipelining and that's what's going to enable the ability to do those net ops per second at line rate. A key innovation of the IXP is the discovery of functional and context pipelining optimizations. And in fact, the next-generation IXP that I would like to demonstrate to you employs these functional techniques so we can supply to our customers a rich programming environment so they can deliver rich services to their customers quickly.

SEAN MALONEY: Great. Maybe you can show us what you've got.

MATT: Sure. So this is actually my work laptop, and actually on the laptop --

SEAN MALONEY: This is your work laptop, huh?

MATT: This is my work laptop. And on the laptop is actually the next-generation 10 gigabit network processor. So this is the actual design.

SEAN MALONEY: Wow. So what's in it? That's great.

MATT: Over here what we see is the XScale -- next-generation XScale processor. And over here we have a series of microengines, which are multi-threaded for providing the rich software pipeline capability. A series of integrated memory controllers to low latency memory. And then bulk memory storage with a series of integrated memory controllers.

From this design what I've done actually is extracted a cycle-accurate simulation model of the next-generation 10 gigabit network processor.

SEAN MALONEY: I've done the same thing on my notebook, by the way.

MATT: That's good. So you'll follow right along.

SEAN MALONEY: Yeah. How do I know this thing can run at line speed?

MATT: That's a great question. So what I'd like to do is actually simulate. And then what we're doing is using multiple processors. In the example I'm going to show, eight processors, each one of them having eight threads. And as you said, a packet is arriving every 35 nanoseconds.

The sum total is that by using eight threads on eight microengines, I need to complete a packet arrival every 3200 cycles.

So what I'd like to do is actually just simulate for a few moments here. And you can see in the bottom of the screen here the cycle count advancing. So we're actually simulating a cycle accurate model of our next generation 10 gigabit network processor.

I'm going to stop this for a second and show you that we're keeping up with line rate. What I'm pulling up right now is a thread history window. And here's the different microengines, zero through eight or however many we have inside the next-generation network processor. And right here we're starting a pipelined processing which we were discussing just a moment ago about providing an optimized environment for it. And you can see we're starting roughly around cycle 929.

So if we're to keep up with line rate, we must complete roughly 3200 cycles later from this or faster than 900 plus 3200, or roughly before 4100 cycles. And you can see the cycle count along the top here. And what we're going to be looking for is received pointer phase zero where we're receiving new packets.

And we're scrolling along and doing pipeline processing, and here is our 4100 cycle limit. And well before that we're returning and doing line rate processing, doing rich, deep packet inspection.
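The budget arithmetic in this demonstration can be reconstructed from the numbers quoted: eight microengines, eight threads each, a packet every 35 nanoseconds, a 3200 cycle budget, and a start at cycle 929. The sketch below assumes packets are handed to the 64 threads in turn, so each thread sees a new packet once every 64 arrivals. The clock frequency is not stated in the talk; the value computed here is only what the quoted figures imply.

```python
ENGINES = 8
THREADS_PER_ENGINE = 8
ARRIVAL_NS = 35          # one packet every 35 ns (from the talk)
BUDGET_CYCLES = 3200     # per-packet cycle budget (from the talk)
START_CYCLE = 929        # pipeline start seen in the thread history window

def per_thread_budget_ns(engines, threads_per_engine, arrival_ns):
    """Time one thread has before its next packet, assuming round-robin dispatch."""
    return engines * threads_per_engine * arrival_ns

def implied_clock_ghz(budget_cycles, budget_ns):
    """Clock rate implied by a cycle budget and a time budget (cycles per ns = GHz)."""
    return budget_cycles / budget_ns

budget_ns = per_thread_budget_ns(ENGINES, THREADS_PER_ENGINE, ARRIVAL_NS)
clock_ghz = implied_clock_ghz(BUDGET_CYCLES, budget_ns)
deadline_cycle = START_CYCLE + BUDGET_CYCLES

print(f"per-thread budget: {budget_ns} ns")
print(f"implied clock:     {clock_ghz:.2f} GHz")
print(f"deadline cycle:    {deadline_cycle}")
```

The deadline works out to cycle 4129, which matches the "roughly before 4100 cycles" limit quoted in the demonstration.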

SEAN MALONEY: Great. And there's no dropped packets.

MATT: No dropped packets.

SEAN MALONEY: Who knows about this in terms of comrades in engineering opportunity?

MATT: Great question. There's a number of early adopters who are evaluating the software simulation environment now and the architecture, like we did previously on the IXP1200.

SEAN MALONEY: Excellent. All right. Great. Thanks a lot, Matt. That's excellent.

(Applause.)

SEAN MALONEY: Actually, Matt dropped in a little expression there on a pre-alpha program, a slam we do ourselves, in the past three years, we haven't engaged early enough in early phases of product design with yourselves, and we're on a real crusade at the moment to get engaged with you super, super, super early to get feedback and to work much more closely together.

Okay. So that's a quick update on where we are on the network processor family. Matt really went through there the high end of the family, which is for handling this 10 gigabit per second OC 192 which is going to be standard through the Metro and increasingly standard coming into the campus level over the next year or so and ultimately will be the data center standard as well. So it's essential we have mass-produced building blocks that can deal with packets at these phenomenal speeds.

We've also, obviously, got to go right down to the bottom end, the very, very cheap network processors for stuff like DSL modems, and this shows our IXA road map over the next year. We want to go all the way from the really low cost stuff for doing low-cost customer premises equipment up to this high end 10 gig as well.

So that's it really for me. I just want to finish with a couple of observations. It is a new group. It is a group that is very, very focused on low power, high density, obviously low cost for our customers is a kind of mantra throughout the organization.

We've done a lot of acquisitions. We have a lot of new engineers in the company that I encourage you to get to know. We've done a lot of acquisitions over the last year, year and a half. We've brought in a number of architects and CTOs from other companies that are now kind of part of the Intel family.

And the capital investment continues. 550 companies so far we've invested in. Many of the companies that we've invested in are ones that you've also invested in, or are your companies as well.

And this Intel Communications Fund, $500 million that Mark Christensen talked about a year ago. We've now got more than 70-odd companies that that's been invested in. And despite the economic slowdown and despite the tougher economic times, we still have the money in the Intel Communications Fund to invest in any new ideas, new concepts, any new companies out there that can help us keep pushing ahead to solve some of these difficult technical problems. Those of you with a spark of a new idea, setting up a new company, please don't forget that there is this pool of resources still available to push those ideas forward.

So once again, thank you very much. I'm deeply grateful that you stayed around till the end of the week, and really looking forward to solving with you the problems of the next generation of the Internet. Thanks a lot.

