Wow, really important details here; very interesting. Now imagine this chip, but shrunk to Intel's latest 14nm process. If they make that chip I have no doubt that it will revolutionize the entire field.
What PeCaN is driving at is that saying something like you "have no doubt that it will revolutionize the entire field" is simply unsupported hyperbole. We'll see. Nervana didn't really gain that much traction, which is why it's being acquired. Intel hasn't had much of a deep learning presence so far, which is why they're acquiring.
> If they make that chip I have no doubt that it will revolutionize the entire field.
Ha, ha. Yeah… no. For one thing, there's no mention of its power consumption, no CUDA support, questionable memory design (sure, you can get a million TFlops without cache, now try to get that chip to do anything useful), etc.
Intel probably bought 'em to work on integrated GPUs or Xeon Phi or something.
> Intel probably bought 'em to work on integrated GPUs or Xeon Phi or something.
Ha, ha. Yeah… no.
> there's no mention of its power consumption
Can't be tremendously higher than Pascal for reasons of physics.
> sure, you can get a million TFlops without cache, now try to get that chip to do anything useful
"Without cache" is certainly an exaggeration. It won't have a globally coherent cache hierarchy in the style of CPUs. It certainly will have various on chip memories to hold intermediate results. Neural net workloads are incredibly predictable and homogeneous and are essentially the perfect scenario for hand optimization of data flows to beat automatic caching.
> no CUDA support
You're just being silly now. CUDA isn't a standard; it's proprietary to NVIDIA, and this isn't a general-purpose processor anyway.
The GP part stands for "something other than graphics". CUDA is not general purpose in the sense that many types of computation and data movement result in terrible performance. You buy GPU accelerators for performance reasons.
Case in point, GPUs are standard (for now) for training DNNs, but the big server farms that actually do the scoring run on Xeons.
Nothing per se. It's just that it lags far enough behind CUDA that unless you have a specific reason to avoid NVIDIA hardware there are very few pragmatic reasons to use it.
Google is very different from Intel on these fronts at least:
1. Google's open framework TensorFlow is exploding in popularity, effectively moving the common interface up from CUDA (see the sketch below).
2. Google has a huge need for internal AI/DeepLearning
3. Google has cloud and AI/DeepLearning as a service business
4. Google doesn't sell hardware
However, Intel's dominance is not to be underestimated; they can definitely make an industry-wide impact quickly. I'm just saying you can't easily draw parallels here.
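To illustrate point 1: the same few lines of TensorFlow run unchanged whether there's a CUDA GPU underneath or just a CPU. The framework owns the hardware interface, not the user (current 1.x-era API):

    # The user never touches CUDA: TensorFlow places ops on whatever
    # device it finds (CPU, CUDA GPU, or in principle a chip like
    # Nervana's, if someone writes the backend).
    import tensorflow as tf

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
    c = tf.matmul(a, b)  # a graph op, not a kernel launch

    with tf.Session() as sess:
        print(sess.run(c))  # identical code with or without a GPU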
Google's TPU claims a >10x perf/W advantage over GPUs (they say "competitors", but GPUs are the de facto standard), and at scale that's frankly more important than raw throughput.
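Rough napkin math on why perf/W wins at scale (every number below is my assumption, not a vendor spec):

    # Napkin math: all figures are illustrative assumptions, not specs.
    TARGET_TOPS = 100_000          # throughput a large fleet must sustain
    GPU_TOPS, GPU_WATTS = 22, 250  # one Titan XP-class card (assumed)
    PERF_W_ADVANTAGE = 10          # the claimed >10x perf/W edge
    KWH_PRICE = 0.10               # $/kWh (assumed)

    gpu_kw = TARGET_TOPS / GPU_TOPS * GPU_WATTS / 1000
    tpu_kw = gpu_kw / PERF_W_ADVANTAGE
    hours = 24 * 365

    print(f"GPU fleet: {gpu_kw:,.0f} kW, ${gpu_kw * hours * KWH_PRICE:,.0f}/yr in power")
    print(f"TPU fleet: {tpu_kw:,.0f} kW, ${tpu_kw * hours * KWH_PRICE:,.0f}/yr in power")
    # Roughly $1M/yr vs $100K/yr on electricity alone, and cooling and
    # power provisioning scale with watts too.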
If Intel can ape Nvidia at deep learning then it is a big growth opportunity for them.
They don't state flops because they don't do floats. The Nervana chip has some weird fixed-point format that they think works better for deep learning. I've heard similar noise out of Google (the other TPU company ...), so I wouldn't be surprised if we see the same in many of the specialized chips people will build for deep learning.
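Nobody outside has published the exact format, but purely as a guess at the shape of the thing: a block of int16 values sharing one scale gives you integer-multiplier density with per-tensor dynamic range. A toy version:

    # Toy guess at a deep-learning fixed-point scheme: a whole tensor of
    # int16 values sharing a single scale ("block fixed point"). My
    # illustration, not Nervana's or Google's documented format.
    import numpy as np

    def quantize(x):
        scale = np.max(np.abs(x)) / 32767.0  # one shared scale per block
        return np.round(x / scale).astype(np.int16), scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize(w)
    print("max error:", np.max(np.abs(dequantize(q, s) - w)))
    # The multiply-accumulate units only ever see int16, which is far
    # cheaper in silicon than float32; that's one way to hit numbers
    # like 55 TOps on a sane power budget.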
I really don't think CUDA support is important for Nervana's offering. They think of themselves as the Apple of deep learning: they want to offer an integrated stack from the chip all the way to the APIs. The way most people use deep learning, you don't really need to know CUDA; you just need to use a library that is fast. So it's enough that Nervana's engineers know how to write deep learning libs for their own chip. Furthermore, I can't see Intel caring much about CUDA support, since CUDA is owned by Nvidia.
What I've heard about the chip makes it sound really exciting. Many of the trade-offs in deep learning are different from the ones you make in graphics, so specialized hardware makes sense.
Most of the questions asked in these HN threads are answered in an update article they posted this morning about the 14nm process, the Neon framework's future, and such. Also, it looks like everyone who broke the news got the acquisition amount wrong. http://www.nextplatform.com/2016/08/10/nervana-ceo-intel-acq...
Finally, it looks like Intel is getting serious about competing with Nvidia GPUs in the nascent deep/machine learning market!
As good as this could be, it would be even better if we also got an open-source software stack that can compete with Nvidia's proprietary CUDA stack, which currently dominates everywhere (except maybe in Google's data centers).
Oh, so you (or OP) want Intel/Nervana to create yet another programming model that will succeed where OpenCL failed (i.e., beat CUDA)? Seems unlikely...
OpenCL is not well supported by any major learning framework out there. Theano has had WIP support for years with little visible progress, Torch has an unofficial fork for it, and IIUC Caffe doesn't use OpenCL at all.
Like it or not, Nvidia has currently cornered the market. With their outstanding work on cuDNN et al., they are milking that cow for all it's worth.
Likely a good move for Intel to get in on one of the faster-growing areas of computing. Lets them compete with the GPU folks and eventually offer something to the cloud players (a la Tensor Processing Unit).
Also lets Nervana scale and potentially get their Neon deep learning framework out there in the face of bigger players (a la TensorFlow).
All in all it's good to see the competition in this space.
I was aware of this deal before leaving. I made out pretty well with my vested shares but was more interested in working on the cutting edge of research than in continuing on with pure hardware optimization and design. I wish Nervana/Intel the best and I'm looking forward to seeing their hardware come to market. I'm mainly working on TensorFlow now, but would love to see Nervana finish the graph backend they've been working on.
Yep, NVIDIA. Nervana's dedicated ASIC will deliver 55 (mostly) int16 TOps in 2017. In contrast, the two Titan XP GPUs I bought last week for a total of $2400 deliver 44 such TOps. Next year, a single Volta GPU will deliver at least 36, so I saw no way for them to win on their own, with NVIDIA's GPU roadmap merrily marching along since 2007.
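The arithmetic, for the skeptical (figures as above; the Volta number is my estimate):

    # Sanity math on the figures above (the Volta number is my estimate).
    nervana_tops = 55                 # claimed for the 2017 ASIC
    titan_pair_tops = 44              # two Titan XPs, $2400 total
    volta_tops = 36                   # per chip, next year, at least

    print(f"vs. today's $2400 Titan XP pair: {nervana_tops / titan_pair_tops:.2f}x")
    print(f"vs. a single Volta:              {nervana_tops / volta_tops:.2f}x")
    # 1.25x over consumer cards and ~1.5x over next year's single GPU is
    # not a decisive lead for a dedicated ASIC on its own.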
However, getting access to Intel's fabs makes them a lot more interesting and competitive. It's not a slam dunk for Intel yet because they still have to incorporate this into their product line (anyone seen Altera's Stratix 10 yet? That was supposed to be 2014's 10+ TFLOP GPU killer), but it's a fantastic acquisition and I wish them the best.
Very exciting that Intel sees a need to compete with NVIDIA here in a way that isn't just more x86. NVIDIA certainly needs the competition. Now can AMD get in on the action? They should be in a pretty good place to compete but so far they seem to have missed the boat, with approximately zero software support.
Unfortunately, I'm afraid AMD is too far behind Intel technologically at the moment. What is so powerful about this deal is that Nervana will be able to leverage Intel's capabilities and expertise in chip making, and Intel is definitely a leader in that area.
Update: An earlier version of the story indicated the purchase price was more than $350 million, according to a source. Multiple investors told Recode the purchase price was significantly above that price, with one pegging it at $408 million.
Summary:
- 28nm
- looks similar to P100 (interposer with HBM)
- 55 teraops/s performance
- custom number format (variable length fixed point?)
- simplified memory architecture (no cache?)
- no info about power consumption