 |
|
| Computers Forum Index » Computer Artificial Intelligence - Neural Nets » number of input (variables)... |
|
|
| Author |
Message |
| amir... |
Posted: Sat Oct 24, 2009 3:03 am |
|
|
|
Guest
|
Is there a limitation in number of input (variables)? Say we have 100
input variables. How would that affect the number of neurons in the hidden
layer, or the number of hidden layers itself? Say we could produce
enough data for training. |
|
|
|
|
|
| Greg Heath... |
Posted: Tue Nov 03, 2009 3:51 am |
|
|
|
Guest
|
On Oct 23, 11:03 pm, amir <beh.am... at (no spam) gmail.com> wrote:
Quote: Is there a limitation in number of input (variables).
Given a set of output variables there is usually a limit to the
number of necessary inputs.
Whether or not those inputs or an alternative set are available
is one issue.
Other issues are:
Whether or not a given set of inputs contains enough information
to yield the desired outputs.
If the given set of inputs is sufficient, but not minimal, should
time be spent finding and removing redundant and/or irrelevant
inputs or should an attempt be made to design a net which will
automatically ignore them.
Quote: Say we have 100
input variables. How would that affect the number of neurons in the hidden
layer, or the number of hidden layers itself? Say we could produce
enough data for training.
Usually one hidden layer is sufficient.
If training to convergence without regularization, the number of
hidden nodes is usually determined by trial and error with the
constraint that the number of training equations, Neq, is sufficiently
large for estimating the Nw unknown weights without overfitting.
If using an overfitting mitigation technique (e.g., weight decay
and/or Early Stopping) Neq can be significantly smaller than
when training to convergence.
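To make the Neq vs. Nw constraint concrete, here is a rough sketch. It assumes a single-hidden-layer net with I inputs, H hidden nodes and O outputs, one bias per hidden and output node, so Nw = (I+1)*H + (H+1)*O, and it assumes Neq = Ntrn*O (one equation per training case per output). Those counting formulas and the function names are my assumptions, not something stated in the post above. The bound Neq >= Nw is only an upper limit on H; when training to convergence you would normally want H well below it.

```python
def n_weights(n_in, n_hidden, n_out):
    """Unknown weights in an I-H-O net, one bias per hidden and output node (assumed layout)."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def max_hidden(n_train, n_in, n_out):
    """Largest H with Neq >= Nw, taking Neq = n_train * n_out (assumed definition)."""
    n_eq = n_train * n_out
    # Nw = H*(I + O + 1) + O <= Neq  =>  H <= (Neq - O) / (I + O + 1)
    return max((n_eq - n_out) // (n_in + n_out + 1), 0)


if __name__ == "__main__":
    # 100 inputs and 1 output, for a few hypothetical training-set sizes
    for n_train in (1_000, 5_000, 50_000):
        h = max_hidden(n_train, n_in=100, n_out=1)
        print(n_train, h, n_weights(100, h, 1))
```

With 100 inputs every extra hidden node costs about 102 weights, so the amount of training data limits H fairly quickly.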
These issues are thoroughly discussed in the FAQ.
Also, see my post on pretraining advice.
A Google Group search using keywords
greg-heath Neq Nw
may be helpful.
Hope this helps.
Greg |
|
|
|
|
|
| Oxygen... |
Posted: Tue Nov 03, 2009 1:41 pm |
|
|
|
Guest
|
Greg Heath wrote:
Quote: On Oct 23, 11:03 pm, amir <beh.am... at (no spam) gmail.com> wrote:
Is there a limitation in number of input (variables).
Given a set of output variables there is usually a limit to the
number of necessary inputs.
Whether or not those inputs or an alternative set are available
is one issue.
Other issues are:
Whether or not a given set of inputs contains enough information
to yield the desired outputs.
If the given set of inputs is sufficient, but not minimal, should
time be spent finding and removing redundant and/or irrelevant
inputs or should an attempt be made to design a net which will
automatically ignore them.
Amir should do some reading on variable selection. It depends on
what you are trying to model and how much data you have.
Greg's comments cover this conceptually.
You need to explore the alternatives and measure their contribution
to prediction error.
For simple models (Linear Models or GLMs) there used to be a rule
that if you used more than the five most important predictors, you needed
a good argument.
With NNs or hierarchical models or SVMs, this doesn't apply any more.
Just be careful to manage co-linearity.
A hybrid approach is to project the data onto the first five principal components (PCs)
and use them as inputs to build a model. Then test each variable for its
ability to predict the residual of the model built with the first five PCs. Then
throw a few of the most important of those in the mix of inputs...
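A minimal sketch of that hybrid idea, under several assumptions of mine: the "model" is ordinary least squares on the PC scores, "ability to predict the residual" is measured by absolute correlation, and the inputs are only mean-centered, not standardized. The function names are hypothetical. Only NumPy is used.

```python
import numpy as np


def pca_scores(X, k):
    """Project mean-centered X onto its first k principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def rank_by_residual(X, y, k=5):
    """Rank the original variables by |correlation| with the residual of a linear fit on k PCs."""
    Z = pca_scores(X, k)
    A = np.column_stack([np.ones(len(y)), Z])      # intercept + PC scores
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    Xc = X - X.mean(axis=0)
    rc = resid - resid.mean()
    corr = np.abs(Xc.T @ rc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(rc))
    order = np.argsort(corr)[::-1]                 # most useful variable first
    return order, corr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))
    X[:, :5] *= 10.0                               # high-variance inputs dominate the PCs
    y = 3.0 * X[:, 12] + 0.1 * rng.normal(size=500)
    order, corr = rank_by_residual(X, y, k=5)
    print(order[:3], corr[order[:3]])
```

In the synthetic demo the target depends on a low-variance input that the first five PCs mostly miss, which is the kind of variable the residual test is meant to recover.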
O. |
|
|
|
|
|
| Greg Heath... |
Posted: Wed Nov 04, 2009 5:09 am |
|
|
|
Guest
|
On Nov 3, 3:41 am, "Oxygen" <brea... at (no spam) you.fool.au> wrote:
Quote: Greg Heath wrote:
On Oct 23, 11:03 pm, amir <beh.am... at (no spam) gmail.com> wrote:
Is there a limitation in number of input (variables).
Given a set of output variables there is usually a limit to the
number of necessary inputs.
Whether or not those inputs or an alternative set are available
is one issue.
Other issues are:
Whether or not a given set of inputs contains enough information
to yield the desired outputs.
If the given set of inputs is sufficient, but not minimal, should
time be spent finding and removing redundant and/or irrelevant
inputs or should an attempt be made to design a net which
will automatically ignore them.
Amir should do some reading on variable selection. It depends
on what you are trying to model and how much data you have.
Greg's comments cover this conceptually.
You need to explore the alternatives and measure their
contribution to prediction error.
For simple models (Linear Models or GLMs) there used to be
a rule that if you used more than the five most important
predictors, you needed a good argument.
With NNs or hierarchical models or SVMs, this doesn't apply
any more. Just be careful to manage co-linearity.
A hybrid approach is to project the first five principal
components (PCs) and use them as inputs to build a model.
Good for regression but not necessarily for classification.
Think of two parallel cigar-shaped distributions in 2-D.
The dominant PC along the cigars is orthogonal to the
direction of separation.
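The cigar example is easy to check numerically. The sketch below uses made-up numbers (standard deviation 5 along the cigars, 0.3 across them, class centers at -1 and +1 across them) and NumPy only; all names and values are illustrative assumptions.

```python
import numpy as np


def principal_axes(X):
    """Rows are the principal directions of mean-centered X, dominant first."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt


def class_gap(X, labels, direction):
    """Distance between the two class means after projecting onto a direction."""
    proj = X @ direction
    return abs(proj[labels == 1].mean() - proj[labels == 0].mean())


rng = np.random.default_rng(0)
n = 1000                                            # points per class
labels = np.repeat([0, 1], n)
along = rng.normal(0.0, 5.0, size=2 * n)            # long axis, shared by both classes
across = rng.normal(0.0, 0.3, size=2 * n) + np.where(labels == 1, 1.0, -1.0)
X = np.column_stack([along, across])

pcs = principal_axes(X)
gap_pc1 = class_gap(X, labels, pcs[0])
gap_pc2 = class_gap(X, labels, pcs[1])
print("PC1 direction:", pcs[0])
print("class gap along PC1:", gap_pc1, " along PC2:", gap_pc2)
```

PC1 lines up with the long axis, where the class means nearly coincide, while almost all of the separation sits on PC2. Keeping only the dominant PC would throw the class information away.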
Quote: Then test each variable for its
ability to predict the residual for a model built with the first five
PCs. Then throw a few of the most important of those in the
mix of inputs...
In classification, the most effective variables tend to
dominate in linear and quadratic classifier models.
Consider these first instead of PCs.
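One simple way to get a first ranking of individual variables for a two-class problem is a per-variable Fisher ratio, (difference of class means)^2 / (sum of class variances). This is a technique I am swapping in for illustration, not necessarily what is meant above by "most effective", and it ignores interactions between variables. A sketch with NumPy and a hypothetical function name:

```python
import numpy as np


def fisher_scores(X, labels):
    """Per-variable Fisher ratio for a two-class problem with labels 0 and 1."""
    X0, X1 = X[labels == 0], X[labels == 1]
    num = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    den = X1.var(axis=0) + X0.var(axis=0)
    return num / den


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    labels = np.repeat([0, 1], 200)
    X = rng.normal(size=(400, 10))
    X[labels == 1, 3] += 2.0                 # only variable 3 separates the classes
    scores = fisher_scores(X, labels)
    print(np.argsort(scores)[::-1][:3])      # candidate inputs, best first
```

The top-ranked variables can then be tried in a linear or quadratic classifier before any PCs are considered.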
Hope this helps.
Greg |
|
|
|
|
|
|
|
|
|