Inverting differenced values - wrong starting point #14

Mkranj · 2024-05-19T10:27:49Z

In Chapter 5, there's a paragraph about reverting differenced values to the original scale, with the following code:

df['pred_foot_traffic'] = pd.Series()
df['pred_foot_traffic'][948:] = df['foot_traffic'].iloc[948] + pred_df['pred_AR'].cumsum()

However, df['foot_traffic'].iloc[948] is not the last point of the training data but the first point to predict, isn't it? So shouldn't the code use df['foot_traffic'].iloc[947] instead?
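To make the off-by-one concrete, here is a minimal sketch on a toy series. The values, the split size, and the helper name invert_diffs are made up for illustration and are not from the book. It assumes first-order differencing, so each test difference equals the value at that position minus the value just before it.

```python
from itertools import accumulate

# Toy series (hypothetical values); the last n_test points are the "test" set.
y = [10, 12, 15, 14, 18, 21]
n_test = 2
first_test_idx = len(y) - n_test  # index of the first value to predict

# First-order differences: diffs[i] == y[i + 1] - y[i]
diffs = [b - a for a, b in zip(y, y[1:])]
test_diffs = diffs[-n_test:]  # stand-in for perfectly predicted differences


def invert_diffs(anchor, d):
    """Undo first-order differencing: anchor + cumulative sum of d."""
    return [anchor + c for c in accumulate(d)]


# Anchoring on the last training value recovers the original test values.
correct = invert_diffs(y[first_test_idx - 1], test_diffs)

# Anchoring on the first test value shifts every prediction by the same
# amount, namely y[first_test_idx] - y[first_test_idx - 1].
shifted = invert_diffs(y[first_test_idx], test_diffs)

print(correct)
print(shifted)
```

With the anchor one position too late, every reconstructed value is off by the same constant, which would inflate an error metric such as MAE even when the differences themselves are predicted perfectly.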


BorjaArroyo · 2024-11-13T12:10:10Z

Hi @Mkranj, I found a similar issue in Chapter 5 as well. When the author applies the inverse transformation to the ARMA(2,2) model for the hourly bandwidth dataset, the book proposes the following code:

df['pred_bandwidth'] = pd.Series()
df['pred_bandwidth'].iloc[9832:] = df['hourly_bandwidth'].iloc[9832] + predictions['sarimax'].cumsum()

This gives an MAE of 14. However, the starting point used to undo the differencing is wrong: it should be the last point of the training set. The code should therefore add the cumulative sum to the value at index 9831, as follows:

df['pred_bandwidth'] = pd.Series()
df['pred_bandwidth'].iloc[9832:] = df['hourly_bandwidth'].iloc[9831] + predictions['sarimax'].cumsum()

To verify this result, you can use the following code, which operates on the whole array:

df_final = train_diff.copy(True)
assert len(df_final) == len(train_diff)
df_final.loc[df.index[0]] = df.iloc[0]
df_final = df_final.sort_index()
assert len(df_final) == (len(train_diff) + 1)
df_final = pd.concat([df_final["hourly_bandwidth"], predictions["sarimax"]]).cumsum()
assert len(df_final) == len(df)

This returns the same result as the fixed code. The variables are:

train_diff: the differenced training set, with the leading NaN value dropped.
df: the whole dataset, as loaded with pd.read_csv.
predictions: the DataFrame holding the predictions of the three different models, including sarimax, which is the ARMA(2,2) model.
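The idea behind that whole-array check can be sketched on toy data: prepend the first original value to the training differences, append the predicted differences, and take one cumulative sum. All numbers below are hypothetical, and plain lists stand in for the pandas objects. If the anchor at the last training index is right, the tail of that cumulative sum should match the anchor-based reconstruction.

```python
from itertools import accumulate

# Hypothetical training values and predicted differences (not from the book).
y_train = [10, 12, 15, 14]
pred_diffs = [3.5, 2.5]

# Differenced training set; the first value has no predecessor, so it is dropped.
train_diffs = [b - a for a, b in zip(y_train, y_train[1:])]

# Whole-array approach: first original value, then every difference, cumulated.
full = list(accumulate([y_train[0]] + train_diffs + pred_diffs))

# Anchor approach: last training value plus cumulative predicted differences.
anchored = [y_train[-1] + c for c in accumulate(pred_diffs)]

print(full[-len(pred_diffs):])
print(anchored)
```

Both printed lists agree, and the head of the full array reproduces the training values, which is what the length assertions in the check above rely on.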
