Hacker News | P-NP's comments

> Schmidhuber is cited far more often than he should be

You haven't even read the paper, have you? Otherwise you'd see that it's Hinton and Bengio who are cited far more often than they should be. Just look at disputes B1, B2, B5, H2, H4, and H5 to see how they republished parts of his work again and again without citing it. No honest scientist can approve of something like that.


What a troll thing to write. User erostrate taking down the man so often called the "father of modern AI", whose work runs on erostrate's own smartphone :-)


Are you saying Schmidhuber should also get credit for smartphones?


Agreed. The piece anticipated this straw man argument:

> "the inventor of an important method should get credit for inventing it. She may not always be the one who popularizes it. Then the popularizer should get credit for popularizing it (but not for inventing it)." Nothing more or less than the standard elementary principles of scientific credit assignment.[T22] LBH, however, apparently aren't satisfied with credit for popularising the inventions of others; they also want the inventor's credit.[LEC]


> without any real practical implementation

Don't you know that billions of people are using his work on a daily basis on their smartphone? CV: https://people.idsia.ch/~juergen/cv.html


Are you talking about the 1985 Genetic Programming paper by Cramer? Unlike Hinton and Bengio, Schmidhuber has corrected himself:

> BTW, I committed a similar error in 1987 when I published what I thought was the first paper on Genetic Programming (GP), that is, on automatically evolving computer programs[GP1][GP] (authors in alphabetic order). At least our 1987 paper[GP1] seems to be the first on GP for codes with loops and codes of variable size, and the first on GP implemented in a Logic Programming language. Only later I found out that Nichael Cramer had published GP already in 1985[GP0] (and that Stephen F. Smith had proposed a related approach as part of a larger system[GPA] in 1980). Since then I have been trying to do the right thing and correctly attribute credit.

Source: https://people.idsia.ch/~juergen/deep-learning-miraculous-ye...


No.


So which papers are you talking about?


This sounds a bit like a justification of plagiarism. In science, you must cite the original work.


I can see some angry comments here, but so far I have not seen any facts that refute his claims. I once spent a long time reviewing a related paper on Hacker News, and I think he is right about disputes B1, B2, B5, H2, H4, and H5. I'd have to study the others more closely:

B: Priority disputes with Dr. Bengio (original date v Bengio's date):

B1: Generative adversarial networks or GANs (1990 v 2014)

B2: Vanishing gradient problem (1991 v 1994)

B3: Metalearning (1987 v 1991)

B4: Learning soft attention (1991-93 v 2014) for Transformers etc.

B5: Gated recurrent units (2000 v 2014)

B6: Auto-regressive neural nets for density estimation (1995 v 1999)

B7: Time scale hierarchy in neural nets (1991 v 1995)

H: Priority disputes with Dr. Hinton (original date v Hinton's date):

H1: Unsupervised/self-supervised pre-training for deep learning (1991 v 2006)

H2: Distilling one neural net into another neural net (1991 v 2015)

H3: Learning sequential attention with neural nets (1990 v 2010)

H4: NNs program NNs: fast weight programmers (1991 v 2016) and linear Transformers

H5: Speech recognition through deep learning (2007 v 2012)

H6: Biologically plausible forward-only deep learning (1989, 1990, 2021 v 2022)

L: Priority disputes with Dr. LeCun (original date v LeCun's date):

L1: Differentiable architectures / intrinsic motivation (1990 v 2022)

L2: Multiple levels of abstraction and time scales (1990-91 v 2022)

L3: Informative yet predictable representations (1997 v 2022)

L4: Learning to act largely by observation (2015 v 2022)


In the past few hours I have had more time to look at the entire piece and download some of the referenced papers. So far I haven't found any claim that's factually inaccurate.

I think there is a reason why the ACM Turing awardees have never tried to defend themselves by presenting facts to the contrary: because they can't.

This might get interesting:

> The "Policy for Honors Conferred by ACM"[ACM23] mentions that ACM "retains the right to revoke an Honor previously granted if ACM determines that it is in the best interests of the field to do so." So I ask ACM to evaluate the presented evidence and decide about further actions.


As the author points out: there isn’t any one Kalman filter. The Kalman filter is really a recipe for constructing optimal linear predictive filters, but the actual characteristics of the resulting filter will depend on the dynamics, state variables, and sensors that you’ve tuned it for, and that dependence gets reflected numerically in the various matrices that your specific Kalman filter is built from.
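To make the "recipe, not a filter" point concrete, here is a minimal sketch of the simplest possible instance: a scalar Kalman filter estimating a constant value from noisy readings. The state-transition and observation "matrices" collapse to 1 here, and the noise parameters q and r (my illustrative numbers, not from the thread) are exactly the tuning knobs that make each resulting filter different.

```python
def kalman_step(x, p, z, q=1e-5, r=0.1):
    """One predict/update cycle for a constant-state model (F = H = 1).

    x: current state estimate, p: estimate variance, z: new measurement,
    q: process-noise variance, r: measurement-noise variance.
    """
    # Predict: the state is modeled as constant, so only uncertainty grows.
    p = p + q
    # Update: blend prediction and measurement via the Kalman gain.
    k = p / (p + r)          # gain: how much to trust the measurement
    x = x + k * (z - x)      # corrected estimate
    p = (1 - k) * p          # reduced uncertainty
    return x, p

# Usage: filter noisy readings of a true value of 5.0.
x, p = 0.0, 1.0
for z in [5.2, 4.9, 5.1, 4.8, 5.0]:
    x, p = kalman_step(x, p, z)
```

Swapping in different q and r (or, in the general vector case, different F, H, Q, R matrices) yields a differently behaved filter from the same recipe, which is the author's point.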


The dispute is actually ongoing on OpenReview, an academic peer-review site: https://openreview.net/forum?id=BZ5a1r-kVsf

LeCun claims four "main original contributions" and Schmidhuber basically debunks them one by one, for example:

> (IV) your predictive differentiable models "for hierarchical planning under uncertainty" - you write: "One question that is left unanswered is how the configurator can learn to decompose a complex task into a sequence of subgoals that can individually be accomplished by the agent. I shall leave this question open for future investigation."

> Far from a future investigation, I published exactly this over 3 decades ago: a controller NN gets extra command inputs of the form (start, goal). An evaluator NN learns to predict the expected costs of going from start to goal. A differentiable (R)NN-based subgoal generator also sees (start, goal), and uses (copies of) the evaluator NN to learn by gradient descent a sequence of cost-minimizing intermediate subgoals [HRL1].
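The quoted scheme can be sketched in a toy form: a hand-coded quadratic cost stands in for the learned evaluator NN, states are scalars, and a single intermediate subgoal is adjusted by gradient descent to minimize the summed predicted cost. Everything here (the quadratic evaluator, the learning rate, the numbers) is an illustrative assumption of mine, not Schmidhuber's [HRL1] code.

```python
def evaluator(a, b):
    """Stands in for the evaluator NN: predicted cost of going from a to b."""
    return (b - a) ** 2

def d_evaluator_db(a, b):
    """Derivative of evaluator(a, b) with respect to its endpoint b."""
    return 2 * (b - a)

def find_subgoal(start, goal, lr=0.1, steps=200):
    """Gradient-descend a single subgoal m to minimize
    evaluator(start, m) + evaluator(m, goal)."""
    m = start  # initialize the subgoal at the start state
    for _ in range(steps):
        # d/dm [(m - start)^2 + (goal - m)^2] = 2(m - start) - 2(goal - m)
        grad = d_evaluator_db(start, m) - d_evaluator_db(m, goal)
        m -= lr * grad
    return m

# For a quadratic cost, the optimal single subgoal is the midpoint.
m = find_subgoal(0.0, 10.0)
```

In the actual proposal the evaluator and subgoal generator are both neural nets and the gradients flow through the evaluator's copies, but the core idea (subgoals found by descending a differentiable cost predictor) is the same.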

It will be interesting to follow this.


A good way of ending this article: "I was fascinated by the way in which the personal agendas of committee members were clothed in seemingly reasonable attempts to place restrictions on the prize," says Christopher Hollings, a historian of mathematics at the University of Oxford in the United Kingdom, who attended Barany's talk. "It is a nice and interesting reminder that mathematicians are people, too."

