dave@autobox.com wrote in
news:1169437247.357393.290920@a75g2000cwd.googlegroups.com:
David Winsemius wrote:
"Pekka Jarvela" <pekkajarvela@email.com> wrote in
news:1169326747.707212.147920@v45g2000cwv.googlegroups.com:
CO2 concentrations in the atmosphere measured at Mauna Loa in Hawaii
are known as the Keeling curve,
http://en.wikipedia.org/wiki/Keeling_curve You can get the updated
data from
http://scrippsco2.ucsd.edu/data/in_situ_co2/mlo_in_situ_record.txt
On this page it is said that
"The "detrended" data is seasonally adjusted by removing a
4-harmonic fit with a linear gain factor. The "fit" is based on a
stiff spline plus 4-harmonic functions with linear gain."
1. Is detrending fitting a line y = ax + b to data and then
subtracting this line from data?
2. What does "removing a 4-harmonic fit with a linear gain factor"
mean? Has this something to do with Fourier analysis?
http://repositories.cdlib.org/cgi/viewcontent.cgi?article=1190&context=sio
"The number of harmonics refers to a portion of the fitting function
which involves sinusoidal terms with a fundamental period of one year
plus higher order Fourier components. Thus, 2 harmonics indicates
that terms with periods of 1 year and 6 months were fit, 4 harmonics
indicates additional terms with periods of 4 and 3 months."
See also
http://repositories.cdlib.org/cgi/viewcontent.cgi?article=1110&context=sio
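To make the quoted description concrete, here is a minimal sketch of a least-squares fit of a trend plus 4 annual harmonics, followed by subtraction of only the harmonic part. It is not the Scripps procedure: a straight line stands in for the stiff spline, the linear gain factor is left out, the data are synthetic, and every function name is made up for illustration.

```python
import math

def design_row(t, n_harmonics=4):
    """Regressors at time t (in years): intercept, trend, then cos/sin pairs."""
    row = [1.0, t]
    for k in range(1, n_harmonics + 1):
        # k = 1..4 gives periods of 12, 6, 4 and 3 months
        row.append(math.cos(2 * math.pi * k * t))
        row.append(math.sin(2 * math.pi * k * t))
    return row

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_trend_plus_harmonics(times, values, n_harmonics=4):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    rows = [design_row(t, n_harmonics) for t in times]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve(xtx, xty)

def seasonal_part(t, coef):
    """Fitted harmonic terms only (everything after intercept and trend)."""
    row = design_row(t, (len(coef) - 2) // 2)
    return sum(c * x for c, x in zip(coef[2:], row[2:]))

# Synthetic monthly series, NOT the Mauna Loa record: 20 years of a
# linear rise plus an annual and a semi-annual cycle.
times = [m / 12.0 for m in range(240)]
values = [315.0 + 1.5 * t
          + 3.0 * math.cos(2 * math.pi * t)
          + 0.8 * math.sin(4 * math.pi * t) for t in times]

coef = fit_trend_plus_harmonics(times, values)
# Seasonally adjusted series: the harmonics are removed, the trend is kept.
adjusted = [y - seasonal_part(t, coef) for t, y in zip(times, values)]
```

In this sketch the trend stays in the adjusted series. Subtracting the fitted line a + b*t instead would be detrending in the sense of question 1.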
This data set can be adequately modeled as an ARIMA model of the
following form.
Rather than assuming a particular deterministic form, the data's
autocorrelative structure can be examined, which yields Gaussian
residuals while pointing to anomalies that didn't follow the paradigm,
suggesting unusual events or readings.
snipped one entire model...
FORECASTING WITH FINAL MODEL
MODEL COMPONENT              LAG    COEFF   STANDARD   P       T
#                            (BOP)          ERROR      VALUE   VALUE
Differencing                 12
1 CONSTANT                          .120    .294E-01   .0001    4.07
2 Autoregressive-Factor # 1  1      .916    .194E-01   .0000   47.23
3 Moving Average-Factor # 2  1      .210    .481E-01   .0000    4.36
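For readers unfamiliar with this kind of output, one plausible reading of the table is (1 - .916 B)(1 - B^12) Y(t) = .120 + (1 - .210 B) a(t), where B is the backshift operator and a(t) the random shock. The placement of the constant and the minus sign on the moving-average term follow the usual Box-Jenkins convention, which is an assumption here, not something the output states. The pulse variables are ignored. Under those assumptions, forecasting could be sketched as follows (the function name and arguments are hypothetical):

```python
def forecast_seasonal_arima(history, steps, const=0.120, phi=0.916,
                            theta=0.210, last_shock=0.0, season=12):
    """Forecast y from (1 - phi*B)(1 - B**season) y_t = const + (1 - theta*B) a_t.

    Future shocks are set to their expected value of zero; last_shock is
    the final in-sample residual (assumed zero unless supplied).
    """
    if len(history) < season + 1:
        raise ValueError("need at least season + 1 observations")
    y = list(history)
    w_prev = y[-1] - y[-1 - season]     # last observed seasonal difference
    shock_prev = last_shock
    for _ in range(steps):
        # AR(1) + MA(1) recursion on the seasonally differenced series
        w_next = const + phi * w_prev - theta * shock_prev
        y.append(y[-season] + w_next)   # undo the lag-12 differencing
        w_prev, shock_prev = w_next, 0.0
    return y[len(history):]


# Long-run mean of the year-over-year difference implied by the
# coefficients: const / (1 - phi), about 1.43 units per year.
implied_annual_growth = 0.120 / (1 - 0.916)
```

Read this way, the model says each month's value is the value twelve months earlier plus a year-over-year increment that reverts slowly toward about 1.43 per year.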
snipped details regarding pulses...
MODEL STATISTICS AND EQUATION FOR THE CURRENT EQUATION (DETAILS FOLLOW).
Estimation/Diagnostic Checking for Variable Y C02
: NEWLY IDENTIFIED VARIABLE X1  I~P00064 1964/ 4  PULSE
: NEWLY IDENTIFIED VARIABLE X2  I~P00160 1972/ 4  PULSE
: NEWLY IDENTIFIED VARIABLE X3  I~P00046 1962/ 10 PULSE
: NEWLY IDENTIFIED VARIABLE X4  I~P00509 2001/ 5  PULSE
: NEWLY IDENTIFIED VARIABLE X5  I~P00448 1996/ 4  PULSE
: NEWLY IDENTIFIED VARIABLE X6  I~P00266 1981/ 2  PULSE
: NEWLY IDENTIFIED VARIABLE X7  I~P00376 1990/ 4  PULSE
: NEWLY IDENTIFIED VARIABLE X8  I~P00506 2001/ 2  PULSE
: NEWLY IDENTIFIED VARIABLE X9  I~P00081 1965/ 9  PULSE
: NEWLY IDENTIFIED VARIABLE X10 I~P00079 1965/ 7  PULSE
: NEWLY IDENTIFIED VARIABLE X11 I~P00577 2007/ 1  PULSE
: NEWLY IDENTIFIED VARIABLE X12 I~P00350 1988/ 2  PULSE
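A note on reading the pulse list: the five digits after "I~P" look like the observation number of the flagged month. That matches every listed date if observation 1 is January 1959, a start date inferred from the labels, not stated in the output. A pulse is then a regressor equal to 1 at that single observation and 0 elsewhere. A small sketch to check this reading (helper names are made up):

```python
# Observation number -> (year, month), copied from the listing above.
PULSES = {
    64: (1964, 4), 160: (1972, 4), 46: (1962, 10), 509: (2001, 5),
    448: (1996, 4), 266: (1981, 2), 376: (1990, 4), 506: (2001, 2),
    81: (1965, 9), 79: (1965, 7), 577: (2007, 1), 350: (1988, 2),
}

def observation_index(year, month, start_year=1959, start_month=1):
    """1-based position of a month in a monthly series.

    The January 1959 default is inferred from the labels, not documented.
    """
    return (year - start_year) * 12 + (month - start_month) + 1

def pulse_regressor(index, n_obs):
    """Dummy variable: 1.0 at the given 1-based observation, 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(1, n_obs + 1)]

# Every label agrees with the inferred start date.
labels_consistent = all(
    observation_index(year, month) == index
    for index, (year, month) in PULSES.items()
)

# Example: the April 1964 pulse over a 577-observation series.
x1 = pulse_regressor(64, 577)
```

Each pulse absorbs one unusual reading, so that month does not distort the estimated AR and MA coefficients.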
Two comments on interpretability and the merits of physically-based
modeling vs. free-form modeling:
1) If you look at the second citation in more detail I think you will
find the geophysical meaning of the pulses. The anomalies look to be
associated with El Niño/Southern Oscillation events.
2) In situations where much of the higher frequency signal is clearly
driven by an annually varying force, the Earth's inclined axis in orbit
around the Sun, a mathematical formulation using a frequency domain
analysis would be more readily interpretable. I suppose a two term
linear model with one AR(12) term would predict oscillation around a
rising trend line, but I do not see why that approach is superior to a
model built with knowledge about the underlying reality. I also wonder
whether the audience would have the background to interpret the terms
in an ARIMA model.
--
David Winsemius