Choosing Weights: Small Changes, Big Differences

Building and training a neural network involves a number of important, and sometimes subtle, choices. You have to decide which loss function to use, how many layers to have, what stride and kernel size to use for each convolution layer, which optimization algorithm is best suited to the network, and so on. With so many decisions to make, the choice of initial weights may seem at first glance like just another relatively minor pre-training detail, but weight initialization can have a profound impact on both the convergence rate and the final quality of a network.
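To make the point concrete, here is a minimal sketch (not the article's own experiment) of how the scale of the initial weights changes what reaches the last layer of a deep network. It assumes a plain stack of 20 fully connected tanh layers of width 256 with Gaussian weights; the function name `final_layer_std` and all of these sizes are illustrative choices, not anything taken from the article.

```python
import numpy as np


def final_layer_std(weight_scale, depth=20, width=256, seed=0):
    """Push random inputs through `depth` tanh layers and return the
    standard deviation of the last layer's activations.

    Hypothetical helper: depth, width, and tanh are assumptions made
    purely for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((512, width))  # batch of 512 random inputs
    for _ in range(depth):
        # Gaussian weights with standard deviation `weight_scale`
        W = rng.standard_normal((width, width)) * weight_scale
        x = np.tanh(x @ W)
    return float(x.std())


tiny = final_layer_std(0.01)                  # weights much too small
scaled = final_layer_std(1.0 / np.sqrt(256))  # scaled by 1/sqrt(fan_in)
large = final_layer_std(1.0)                  # weights much too large

print(f"tiny init:   activation std = {tiny:.3e}")
print(f"scaled init: activation std = {scaled:.3e}")
print(f"large init:  activation std = {large:.3e}")
```

With very small weights the signal shrinks toward zero as it passes through the layers, and with very large weights the tanh units saturate near plus or minus one. In both cases gradients tend to be weak, which is one plausible reason initialization affects how quickly a network converges.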
Hacker News Title Tool

Enter a potential title for a Hacker News submission below to see how likely it is to succeed or to be flagged dead. Once you have played around a bit, you can read on to learn exactly how these predictions are made.

Background

Submitting an article to Hacker News can be a little stressful if you’ve invested a lot of time in writing it. An article’s success really hinges on getting the initial four or five votes that push it onto the front page, where it can reach a broader audience.