EDA Confidential: ISQED 2010 Panel ...
___________________________________________________________

Long Life Cycle Design:
Is it really different from CE Design?

by Peggy Aycinena
___________________________________________________________

April 21, 2010  

The International Symposium on Quality Electronic Design took place March 22nd to 24th at the DoubleTree Hotel in San Jose. This was the 11th annual edition, and one of the highlights of the event was the after-dinner panel moderated on Tuesday, March 23, by long-time EDA editor and observer Tets Maniwa.

Maniwa’s panel topic was in line with the theme of the conference: Quality.

In particular, Maniwa led his panel participants on a 90-minute journey exploring the differences in quality requirements between standard, garden-variety consumer electronics and mission-critical systems designed for healthcare, aviation, or one-off sci-fi gadgets such as the Mars Rover.

Mars Rover

Maniwa primed the evening’s discussion by showing a photo of the Rover, and commenting on the fact that the engineers at NASA never expected their robot to last longer than 6 months in the unforgiving conditions of Mars.

Despite those very limited expectations, however, the Little Rover that Could soldiered on across the surface of the Red Planet, going over there, looking at this, and climbing up that, for more than 5 years. Maniwa did not let his audience forget, however, that “the Rover is a throw-away part!”

He emphasized that quality in design is seldom random, and never optional, even for throw-away parts. From exploding Toshiba laptop batteries, to Microsoft Xbox failures and Toyota floor mats – per Maniwa, consumer electronics always require considerations of quality to be paramount, even though they are not perceived as mission-critical devices.

Many handheld devices today suffer from problems, Maniwa said, “Because the people who designed them pushed design limits and made design decisions that were not necessarily in the category of long-term quality.”

Maniwa ended his opening remarks by putting up a generic design flow chart. He opined that, although hardware and software development share certain commonalities – specification, design, verification, and implementation – it’s in the integration of the hardware and software where the real difficulties arise. He noted that his panelists are well aware of those challenges; they are responsible for products that have to stand the test of time and market.

Having framed the conversation, Maniwa posed a series of questions to his panelists:

* Krishna Yarlagadda – CEO, President & Founder, HelloSoft
* David Abercrombie – DFM & Reliability Manager, Mentor Graphics
* Kevin Grundy – COO, Sezmi
* Jason Kim – CEO & Chief Systems Architect, Xilicom Research

In so doing, Maniwa provided the ISQED Conference Dinner attendees a superb mini-tutorial on why defining quality, when it comes to electronic design, is not a piece of cake, a walk in the park, or a stroll on Mars.

************************************

Round 1 – What brings people to a discussion on quality

* Tets Maniwa – Please give us some interesting facts about your backgrounds.

* David Abercrombie – I always wanted to work on tools that would help with manufacturing. At Mentor, I’m having that opportunity.

* Kevin Grundy – My experience includes having worked on several generations of the NeXT computer, including 7 years working directly for Steve Jobs. Now at Sezmi, we make a product that competes with Comcast.

* Jason Kim – We make heart-monitoring devices. Earlier on, I worked on spy satellites at TRW, which clearly needed to be rad-hard and couldn’t be tweaked once launched. They depended on maximum redundancy and reliability. The work of making disposable MP3 players is, however, not about making a product that will last for 30 years. In fact, the entire methodology surrounding consumer electronics design is not suited to the long time frame.

* Krishna Yarlagadda – I worked first on embedded systems in India, and then moved to Sun in Silicon Valley, working on the OpenSPARC project. I designed chips and I did okay, but then I spent a solid year on a tester and had nightmares that the project would bring Sun down. Currently, my company builds 4G pipes for WiFi or WiMax.

Traditionally, consumer electronics are built from scratch. But, by necessity, consumer electronics are becoming more and more of a platform. As your mobile device becomes more of a platform, you’re going to be forced to design systems differently.

************************************

Round 2 – Matching the design effort to the quality specs

* Tets Maniwa – What kinds of things do you do, if you’re making a long-term product?

* Jason Kim – I got to work on 3 different products at TRW, including satellites, and then worked on the iPod as an architect. Throughout all of these product designs, we had very different metrics of quality assurance. Yet, in all these projects the design flow was always simple. However, I would suggest that Tets’ [design flow] slide is missing the feedback loop that says, product development at each level of development has been deemed to be okay.

The reality is that a product under development is constantly being sent back upstream if it’s not good enough. Most dramatically, at TRW we could only accommodate 1 failure in a billion cycles. Clearly, that metric is very different for consumer electronics.

* David Abercrombie – I want to comment briefly on the link between education and the quality metric. To do that, I’m going to put on a different engineering hat. We often think of analog designers today as artists – people who know how to make resistors, etc. These designers, when they’re really good, know which configuration in their design will last, and which won’t.

But today’s college grads only know how to push buttons and wait 2 hours for the results. Then, if there are no errors in the log, they say the design’s okay, because they don’t really understand the basic engineering behind what they’re doing.

Having said that, the metrics that engineers – good or bad – attempt to meet are generally Power, Performance, and Area. Clearly, we know how to measure whether or not we’ve hit those metrics, but at the same time we have no real way to measure the Quality of the Design.

As an EDA tool vendor, I can’t come to your facility and educate your workforce to fully grasp the concept of quality. However, I can say: “Let me provide a metric for you that will let us measure something – many things – about your layout that will predict the quality of the ultimate product.”

What we really need to learn within the manufacturing paradigm is that 1) you need to measure things after you’ve manufactured a product, and 2) you need to put in place a statistical process control chart that, over time, allows you to look at the results of your manufacturing and tweak that process until you get it right.

I propose to all of you, that in the manufacturing flow we must start measuring and tracking metrics over time in order to get to better designs and products.
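Abercrombie’s second point, the statistical process control chart, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the yield figures are made up, the function names are hypothetical, and the limits are a simplified mean plus-or-minus 3 sigma band computed from a baseline of lots believed to be in control (production charts typically use more refined estimators).

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, center, upper) limits from in-control baseline samples."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - sigmas * spread, center, center + sigmas * spread

def out_of_control(samples, limits):
    """Return (index, value) pairs for samples outside the control limits."""
    lower, _, upper = limits
    return [(i, x) for i, x in enumerate(samples) if x < lower or x > upper]

# Hypothetical per-lot yield percentages, invented for illustration only.
baseline = [92.1, 91.8, 92.4, 92.0, 91.9, 92.3, 92.2, 91.7]
limits = control_limits(baseline)

# New lots measured after manufacturing; any lot outside the band is flagged.
new_lots = [92.0, 91.9, 89.5, 92.2]
flagged = out_of_control(new_lots, limits)
print("limits:", limits)
print("flagged lots:", flagged)
```

A flagged lot is the trigger to go back and look at the process, which is the measure-then-tweak loop described above.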

* Kevin Grundy – I’ve designed 7 chips and 20 boards over the course of my career, and I’ve suffered through a lot of false starts in the process. From experience, I’ve seen that you learn early on what happens when you compress the design cycle.

And, as David says, design is not just about pushing a button – it really starts in your brain. If you don’t understand the concepts of speed and quality – you may not be able to achieve all of the quality you want, particularly if you try to shorten the development time. If you shorten the development time, you simply may not have the time to get things right.

So, the question is: What is the appropriate speed for getting to a final design for a consumer electronics device versus, say, the Mars Rover? If you want to slant your efforts towards Mars, you’re going to have to go far beyond whatever resides on a standard Quality Assurance checklist – the type of checklist you would adhere to if the project was a CE product.

It’s only by engaging with your market that you can determine what your QA checklist must include. Remember, it’s no good if you design for quality and miss your market. You may have the quality, but if you miss your market – it’s all for naught.

* Krishna Yarlagadda – I’ve worked on 20 different projects, including software, hardware, and systems – everything from a 1-man show, to working with a team of 500 designers working on a CPU. I’ve learned that as the team grows larger, less than 5 percent of the people involved actually grasp the big picture of the project.

It’s a fundamental problem – everybody wants to push their part of the design just a little farther than the specifications they’ve been assigned to meet. Problems always develop, because pushing the envelope like that pulls the end results farther and farther away from the targeted quality metric.

************************************

Round 3 – The conundrum of Known Good IP

* Tets Maniwa – Known good IP may not be used in the mode that the IP provider thought about. If you have “suspected” good IP, how do you know it’s working and testable and reliable in your project?

* Krishna Yarlagadda – I say, don’t go beyond what the data sheet says the IP can do. Remember, it’s $30 million to $100 million to design an SoC – time and money!

Also remember, there are a lot of horror stories in the market today about IP. Has your IP been proven in your process technology? Has it been taped out? Because of all of this stuff, IP on the market today is still art. Again, don’t push the IP – even limit the use of “known good IP.”

* David Abercrombie – I actually suspect the whole idea of “known good IP.” I question whether it really exists. What we’re often doing for our customers is to hide the things that Calibre [the Mentor tool] finds in their IP [that don’t match the design rules]. But, how can this be?

Is it that the design rules are broken? Because Calibre’s certainly not broken. But everybody says: I talked to the fab guy, and that particular rule’s not really important!

And, you can imagine why. Rules are about width, space, and area – but everything’s in 2 dimensions, not 3 or 4. My customers want to waive the rules. We just don’t know that the IP’s good. We’re not taking real problems in the fab and communicating them into the rules. In fact, we don’t really know that known good IP really exists.

* Jason Kim – The requirement is to get product out as quickly as possible – that’s a given as an engineer. But to get there, is it easier to use specific parts for specific use? Or, to use some parts that we know like the back of our hand? That’s when we start using a product that we know from somewhere else, and then we design around it to make it work. There’s no question – Proven IP is

* Kevin Grundy – If I’ve never used it before, I won’t use it. If it’s been used before, even in some modest way, then I may use it. If it’s been used a hundred times in the way I plan to use it, then okay!

I’ve bought IP that I was told would work, and yet it wouldn’t. That’s how you learn as a designer to ask certain questions. Then you make certain notations that say: I won’t even go down that road!

************************************

Round 4 – Porting designs from one process to the next

* Tets Maniwa – If you’re going to use a block from one process, is it still viable at the next process node?

* Kevin Grundy – You can no longer distance a block in a design from the process technology. I’d never say a design could be ported down, especially if there’s even a whiff of analog, because then there’s really a concern.

In reality, you have to go so much deeper in the testing methodology. Back in the day, when we were moving from 2 micron down to 1 micron, porting a design was a slam dunk. But at the smaller geometries, smaller than the wavelength of the lithography, porting a design is like going to Mars!

* David Abercrombie – Portability is over! Even running the same chip at 65 nanometers at this fab, is not at all the same as running the same design at 65 nanometers at a different fab.

* Krishna Yarlagadda – I agree. That’s why the cost of design is going up astronomically.

************************************

Round 5 – Quality IP versus portability

* Question from the floor – It sounds like you gentlemen are saying that the use of third-party IP is over.

* Krishna Yarlagadda – Portability is different than not using IP. Building blocks are the way the world is headed; it’s taken for granted today.

* Kevin Grundy – I’ve seen IP that I wouldn’t be caught dead using. I’m used to making products where there’s no do-over. If there is a do-over, I’m dead. If I have one miss, if Steve Jobs wants it on June 6th and it’s not ready, I’m dead.

* David Abercrombie – What I’ve seen working with the IP providers is that they have to build custom sets of IP for every foundry, on every node, tuned just for that process.

* Jason Kim – The knowledge base is always the guy who created the IP. He has to be willing to stay with it and help with the integration!

************************************

Round 6 – Questions of yield

* Question from the floor – It looks like there have been yield issues with TSMC at 45 nanometers. Can you comment on that?

* David Abercrombie – You can have a design at one node that won’t work at another one at all. [To solve that], you have to know your tools completely. You have to use the tool for two-to-three years before you’re really comfortable with it, and understand what’s happening there!

* Jason Kim – I worked at Silicon Image developing HDMI interface chips. SI developed chips and also sold IP to other companies. People really don’t like to buy unproven IP, and yet they still need IP desperately, which was why selling IP was part of SI’s business plan.

A lot of RTL companies today aren’t doing well because they just can’t afford proven IP. Portability problems may not be as bad as comments on this panel might suggest, but you absolutely do need to understand your circuit, and its limitation. Unless you’re working with proven silicon, there are so many variations!

* David Abercrombie – Just remember, these problems are fricken’ difficult! As a design community, we all need better trained people, with better skills. And, you can’t just take the sweet-price deals from a single EDA vendor, just to get the right price point on the tools. You need best-in-class tools and the very best engineers.

It’s hard! And, things have got to change if we’re going to make progress in the area of quality!

************************************

Round 7 – When to take chances

* Question from the floor – When you design, you often don’t know if you’re actually working on the worst case. I hear you saying that we shouldn’t take chances, but how do we decide when, or even if, we are actually taking chances?

* Kevin Grundy – If I have really specific worries about a design, I would go ahead and spend the money to hedge my bets. If I had to build a circuit to get a waveform from one clock domain to the next, for instance, and had to worry about meta-stability – but was also concerned about moving to the next process technology – I’d be inclined to build some test chips to systematically look at all of the problematic circuits.
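Grundy’s worry about meta-stability across a process change can be made concrete with the textbook synchronizer estimate, MTBF = exp(t_resolve / tau) / (t_window * f_clk * f_data). The sketch below is an illustration only: every number in it is a hypothetical placeholder rather than data from any real process, and the function name is invented. It shows why test chips that measure the actual time constant are worth the money, since the result depends exponentially on tau.

```python
import math

def synchronizer_mtbf_seconds(t_resolve, tau, t_window, f_clk, f_data):
    """Textbook estimate: MTBF = exp(t_resolve / tau) / (t_window * f_clk * f_data)."""
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)

# All values below are hypothetical and chosen only for illustration.
F_CLK = 200e6       # receiving clock frequency, Hz
F_DATA = 20e6       # rate of asynchronous data transitions, Hz
T_WINDOW = 50e-12   # metastability capture window, seconds
T_RESOLVE = 2.0e-9  # time allowed for the flip-flop to settle, seconds

# Same circuit, two assumed resolution time constants (tau) for two processes.
mtbf_fast = synchronizer_mtbf_seconds(T_RESOLVE, 40e-12, T_WINDOW, F_CLK, F_DATA)
mtbf_slow = synchronizer_mtbf_seconds(T_RESOLVE, 80e-12, T_WINDOW, F_CLK, F_DATA)

print(f"tau = 40 ps: MTBF of roughly {mtbf_fast:.3g} seconds")
print(f"tau = 80 ps: MTBF of roughly {mtbf_slow:.3g} seconds")
```

Under these assumed numbers, doubling tau collapses the estimate by many orders of magnitude, which is the kind of surprise a block can spring when it moves to a process where it was never characterized.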

* David Abercrombie – You’ve got to pick the right tool, and you’ve got to put a good guy on the design, one with a lot of skill in the correct context. Only then, will it be right the first time out of the box.

************************************

Round 8 – Software as a workaround

* Question from the floor – Today the software content in products is bigger than ever – embedded software, firmware, middleware. From the outside, however, it’s indistinguishable from the hardware, even though from the hardware guy’s point of view, it’s very different. When can software be a workaround for problems with the hardware?

* Krishna Yarlagadda – The role that software plays in consumer electronics is very, very high. But they’re still not platforms, because there’s no reuse and everything’s done from scratch. People should go more towards the platform approach to solve things, letting the software be the differentiator.

* Jason Kim – We always say: All bugs are software and all failures are mechanical. [laughter]

Look, autos have 100 million lines of code running the engine control. After 20 years at TRW, we were saying that we had more lines of code in our products than IBM had on their mainframes.

* Kevin Grundy – The MTBF [mean time between failures] on our software is tens of thousands of hours. But, as soon as you load in the code – and deal with feature creep – the biggest control point is just how smart is the guy who’s writing that code.

It’s hard to put my finger on the right [ratio between software and hardware concerns], but perhaps it’s 10-to-1 that the software kills the quality of the end products.
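To put an MTBF figure like Grundy’s in perspective, here is a minimal sketch of what it implies for product lifetime. It assumes a constant failure rate (the exponential model), under which the chance of running failure-free for t hours is exp(-t / MTBF), and under which the failure rates of components in series simply add. The 20,000-hour and 200,000-hour figures are hypothetical stand-ins, not numbers from the panel.

```python
import math

HOURS_PER_YEAR = 24 * 365

def survival_probability(mtbf_hours, mission_hours):
    """Chance of no failure over the mission, assuming a constant failure rate."""
    return math.exp(-mission_hours / mtbf_hours)

def series_mtbf(*mtbf_hours):
    """MTBF of parts that must all work: the individual failure rates add."""
    return 1.0 / sum(1.0 / m for m in mtbf_hours)

# Hypothetical figures for illustration only.
software_mtbf = 20_000.0
hardware_mtbf = 200_000.0
system_mtbf = series_mtbf(software_mtbf, hardware_mtbf)

print(f"system MTBF: {system_mtbf:.0f} hours")
for years in (1, 5, 30):
    p = survival_probability(system_mtbf, years * HOURS_PER_YEAR)
    print(f"{years:2d} years of continuous use: survival probability {p:.4f}")
```

Under these assumptions the weaker component dominates the system figure, and an MTBF that sounds generous for a consumer product looks very different over a multi-decade service life.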

************************************

Round 9 – Issues with Analog IP

* Question from the floor – What role does reuse play in consumer electronics, even for a company’s own internal IP? If an analog guy is an artist, will he ever really reuse even his own company’s hardware and software and other elements?

* Kevin Grundy – Yes, analog designers do want to throw away other people’s stuff. But, I’ve had so many close calls of having products going out that are marginal – even if I have to suffer with a little older design [block], there’s just a lot to be said for reuse and about covering your behind.

************************************

Round 10 – Guaranteed performance

* Question from the floor – That guarantee you’re all talking about – it’s very hard to come by! New applications being used in a different range of conditions characterized in a way that you didn’t intend – suddenly, it’s not guaranteed performance any more for anything you design.

* Kevin Grundy – Yes! It’s always a calculation between the safety of the old and the speed of the new.

Sometimes, you have to abandon the old. For me, I’ll run both – and won’t rely on the fact that the one design I thought would work, isn’t working.

************************************

In summary ...

* Designers need to understand more about engineering and physics, far more than just how to push a button.

* Engineers need to see the bigger picture of what they’re doing, and not be so stuck in their silos.

* Companies need to get product to market sooner rather than later.

* Everybody needs to understand that manufacturing is about harvesting data and tweaking the process until you’ve reached high yield.

* No one should push the envelope if they want to achieve the required quality specification.

___________________________________________________________

Peggy Aycinena owns and operates EDA Confidential:
www.aycinena.com

Copyright (c) 2010, Peggy Aycinena. All rights reserved.