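Contents
- Problem 1
- 1.1 Different categories or individual items of vegetable products may be related to one another. Analyze the distribution patterns of, and the relationships among, the sales volumes of the vegetable categories and individual items.
- Data preprocessing
- Merging the data
- Extracting year, month, and day information
- Monthly mean sales volume for each vegetable category
- Seasonal time series decomposition
- STL decomposition
- Additive decomposition
- Multiplicative decomposition
- ARIMA
- LSTM
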
import pandas as pd
path = '/home/shiyu/Desktop/path_acdemic/ant/数模/历年题目/2023/'
d1 = pd.read_excel(path + '附件1.xlsx')
d2 = pd.read_excel(path + '附件2.xlsx')
d3 = pd.read_excel(path + '附件3.xlsx')
d4 = pd.read_excel(path + '附件4.xlsx',sheet_name='Sheet1')
print(d1.shape)
print(d2.shape)
print(d3.shape)
print(d4.shape)
(251, 4)
(878503, 7)
(55982, 3)
(251, 3)
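Problem 1
1.1 Different categories or individual items of vegetable products may be related to one another. Analyze the distribution patterns of, and the relationships among, the sales volumes of the vegetable categories and individual items.

Data preprocessing
Merging the data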
d1['分类名称'].value_counts()
分类名称
花叶类      100
食用菌       72
辣椒类       45
水生根茎类     19
茄类        10
花菜类        5
Name: count, dtype: int64
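# Attach item metadata (including 分类名称) to each sales record
d12 = pd.merge(d2, d1, on='单品编码')
# Rename the first column of d3 to 销售日期 so it can serve as a merge key
d3.columns = ['销售日期'] + list(d3.columns[1:3])
d123 = pd.merge(d12, d3, on=['单品编码','销售日期'])
d1234 = pd.merge(d123, d4, on=['单品编码','单品名称'])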
d1234.shape
(878503, 12)
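The merged row count matches the sales table here, but an inner merge can silently drop or duplicate rows when keys are missing or repeated. A minimal sketch of how this could be checked, using toy frames (the values and the `sales_kg` column are invented for illustration and are not part of the attachments):

```python
import pandas as pd

# Toy stand-ins for the sales table (d2) and the item table (d1); values are made up.
sales = pd.DataFrame({'单品编码': [1, 1, 2, 3], 'sales_kg': [0.4, 0.6, 0.3, 0.5]})
items = pd.DataFrame({'单品编码': [1, 2], '分类名称': ['花叶类', '食用菌']})

# validate='many_to_one' raises MergeError if an item code is duplicated in `items`.
# how='left' with indicator=True keeps unmatched sales rows and marks them 'left_only'.
merged = pd.merge(sales, items, on='单品编码', how='left',
                  validate='many_to_one', indicator=True)

unmatched = merged[merged['_merge'] == 'left_only']
print(len(sales), len(merged), len(unmatched))
```

If `unmatched` is empty and the lengths agree, the default inner merge loses nothing.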
Extracting year, month, and day information
d1234['月份'] = d1234['销售日期'].dt.month
d1234['月份'] = d1234['月份'].astype(str).str.zfill(2)
d1234['年份'] = d1234['销售日期'].dt.year
d1234['年月'] = d1234['年份'].astype(str) + '-' + d1234['月份'].astype(str) 
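The zero-padded year-month label can also be produced in one step. A small sketch on made-up dates, assuming the date column is already of datetime type (as `.dt.month` above requires):

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2020-07-01', '2020-11-15', '2021-01-03']))

# Zero-padded 'YYYY-MM' strings in a single call
year_month = dates.dt.strftime('%Y-%m')

# Monthly periods sort chronologically and keep date semantics
period = dates.dt.to_period('M')

print(year_month.tolist())
print(period.tolist())
```

Strings are convenient as group keys and plot labels; periods are handier when the months later need to be treated as a time index.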
Monthly mean sales volume for each vegetable category
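def my_group(category):
    # Keep only the rows of the requested category
    d_sub = d1234[d1234['分类名称'] == category]
    # Mean sales volume (over individual sale records) for each month
    sale_by_month = pd.DataFrame(d_sub.groupby(['年月'])['销量(千克)'].mean())
    # Prefix the column name with the category, e.g. 花叶类销量(千克)
    sale_by_month.columns = [category + sale_by_month.columns]
    # Expose the month as a regular column so the per-category tables can be merged on it
    sale_by_month['年月'] = sale_by_month.index
    return sale_by_month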
sale_by_month_leaves = my_group('花叶类')
sale_by_month_mushroom = my_group('食用菌')
sale_by_month_pepper = my_group('辣椒类')
sale_by_month_water = my_group('水生根茎类')
sale_by_month_eggplant = my_group('茄类')
sale_by_month_cauliflower = my_group('花菜类')
from functools import reduce
dfs = [sale_by_month_leaves, sale_by_month_mushroom, sale_by_month_pepper, sale_by_month_water, sale_by_month_eggplant, sale_by_month_cauliflower]
sale_by_month_all = reduce(lambda left,right: pd.merge(left,right), dfs)
sale_by_month_all.head()
| | 花叶类销量(千克) | 年月 | 食用菌销量(千克) | 辣椒类销量(千克) | 水生根茎类销量(千克) | 茄类销量(千克) | 花菜类销量(千克) |
|---|---|---|---|---|---|---|---|
| 0 | 0.464680 | 2020-07 | 0.308806 | 0.280185 | 0.418734 | 0.580838 | 0.473726 | 
| 1 | 0.483167 | 2020-08 | 0.334804 | 0.309298 | 0.533321 | 0.549105 | 0.455973 | 
| 2 | 0.500742 | 2020-09 | 0.351644 | 0.301242 | 0.557913 | 0.543880 | 0.464073 | 
| 3 | 0.529107 | 2020-10 | 0.458446 | 0.292424 | 0.651536 | 0.536834 | 0.510383 | 
| 4 | 0.625763 | 2020-11 | 0.553853 | 0.322914 | 0.643466 | 0.484198 | 0.535812 | 
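The six per-category calls and the chained merge can be expressed as a single grouped aggregation. A sketch on toy data (the numbers are invented; only the three column names are taken from the notebook):

```python
import pandas as pd

# Toy long-format data with the same three columns used above
toy = pd.DataFrame({
    '年月': ['2020-07', '2020-07', '2020-07', '2020-08', '2020-08', '2020-08'],
    '分类名称': ['花叶类', '花叶类', '茄类', '花叶类', '茄类', '茄类'],
    '销量(千克)': [0.4, 0.6, 0.5, 0.3, 0.7, 0.9],
})

# Mean per (month, category), then pivot the categories into columns
wide = (toy.groupby(['年月', '分类名称'])['销量(千克)']
           .mean()
           .unstack('分类名称'))
print(wide)
```

Note that `.mean()` averages over individual sale records, so the values are mean kilograms per record; `.sum()` in the same place would give the total monthly volume instead.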
# Reshape the wide table into long format: one row per (month, category)
n_months = sale_by_month_all.shape[0]
df = pd.DataFrame(None, columns=['年月', '销量','蔬菜品类'], index=range(n_months*6))
df['销量'] = [v for j in (0, 2, 3, 4, 5, 6) for v in sale_by_month_all.iloc[:, j]]
df['年月'] = list(sale_by_month_all.iloc[:,1]) * 6
# Column labels may be 1-tuples (see my_group); unwrap them to plain strings
cols = [c[0] if isinstance(c, tuple) else c for c in sale_by_month_all.columns]
names = [cols[0]] + cols[2:7]
df['蔬菜品类'] = [x for x in names for i in range(n_months)]
df.head(3)
| | 年月 | 销量 | 蔬菜品类 |
|---|---|---|---|
| 0 | 2020-07 | 0.464680 | 花叶类销量(千克) | 
| 1 | 2020-08 | 0.483167 | 花叶类销量(千克) | 
| 2 | 2020-09 | 0.500742 | 花叶类销量(千克) | 
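The same wide-to-long reshaping is what `DataFrame.melt` does. A sketch on a two-category toy table, assuming the column labels are plain strings (the numbers are rounded placeholders):

```python
import pandas as pd

wide = pd.DataFrame({
    '年月': ['2020-07', '2020-08'],
    '花叶类销量(千克)': [0.46, 0.48],
    '茄类销量(千克)': [0.58, 0.55],
})

# Keep 年月 as the identifier; stack every other column into (category, value) pairs
long = wide.melt(id_vars='年月', var_name='蔬菜品类', value_name='销量')
print(long)
```

The result has one row per month and category, which is the layout `px.line` expects for its `color` argument.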
import plotly.express as px
fig = px.line(df, x="年月", y="销量", color='蔬菜品类', title='Monthly sales volume of each vegetable category over time')
# center title
fig.update_layout(title_x=0.5)
# remove background color
fig.update_layout({
'plot_bgcolor': 'rgba(0, 0, 0, 0)',
'paper_bgcolor': 'rgba(0, 0, 0, 0)',
})
fig.show()

Seasonal time series decomposition
sale_by_month_all.head(3)
| | 花叶类销量(千克) | 年月 | 食用菌销量(千克) | 辣椒类销量(千克) | 水生根茎类销量(千克) | 茄类销量(千克) | 花菜类销量(千克) |
|---|---|---|---|---|---|---|---|
| 0 | 0.464680 | 2020-07 | 0.308806 | 0.280185 | 0.418734 | 0.580838 | 0.473726 | 
| 1 | 0.483167 | 2020-08 | 0.334804 | 0.309298 | 0.533321 | 0.549105 | 0.455973 | 
| 2 | 0.500742 | 2020-09 | 0.351644 | 0.301242 | 0.557913 | 0.543880 | 0.464073 | 
Aquatic roots and stems (水生根茎类)
STL decomposition
https://www.geo.fu-berlin.de/en/v/soga-py/Advanced-statistics/time-series-analysis/Seasonal-decompositon/STL-decomposition/index.html
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose, STL
import matplotlib.pyplot as plt
stl = STL(sale_by_month_all.iloc[:,4], period=12)
res = stl.fit()
data = {'trend': res.trend,
        'seasonality': res.seasonal,
        'residuals':res.resid
         }
res_stl = pd.DataFrame(data)
res_stl.head()
| | trend | seasonality | residuals |
|---|---|---|---|
| 0 | 0.644373 | -0.247768 | 0.022129 | 
| 1 | 0.642466 | -0.132287 | 0.023142 | 
| 2 | 0.640681 | -0.059934 | -0.022835 | 
| 3 | 0.639011 | 0.018400 | -0.005875 | 
| 4 | 0.637452 | 0.007495 | -0.001481 | 
# On Linux, set a font that can render Chinese characters (important for the plot labels)
plt.rcParams['font.family'] = 'WenQuanYi Micro Hei' 
plt.rcParams['figure.dpi'] = 300
plt.rcParams['savefig.dpi'] = 300
fig = res.plot()

import scipy.stats as stats
# QQ plot of the STL residuals against a normal distribution
plt.figure(figsize=(18, 6))
stats.probplot(res.resid, dist="norm", plot=plt)
plt.title("QQ-Plot")
plt.show()

# histogram plot
plt.figure(figsize=(9, 3))
plt.hist(res.resid)
plt.title("Residuals")
plt.show()
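The QQ plot and histogram give a visual impression of normality; a formal test can complement them. A sketch using the Shapiro-Wilk test on simulated residuals (the random values stand in for `res.resid` and are not the notebook's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
resid_demo = rng.normal(0, 0.02, 36)  # stand-in for res.resid

# Null hypothesis: the sample comes from a normal distribution
w_stat, p_value = stats.shapiro(resid_demo)
print(f'W={w_stat:.3f}, p={p_value:.3f}')
```

A small p-value would argue against normality; with about 36 points the test has limited power, so it is best read together with the plots.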

Additive decomposition
add = seasonal_decompose(
    sale_by_month_all.iloc[:,4], period=12,
    model="additive"
)
data = {'trend': add.trend,
        'seasonality': add.seasonal,
        'residuals':add.resid
         }
res_add = pd.DataFrame(data)
res_add.iloc[6:10,:]
| | trend | seasonality | residuals |
|---|---|---|---|
| 6 | 0.619041 | 0.168495 | -0.011543 | 
| 7 | 0.622029 | 0.143684 | 0.092367 | 
| 8 | 0.627388 | 0.047414 | 0.054634 | 
| 9 | 0.632759 | -0.025056 | 0.054265 | 
add.plot()

Multiplicative decomposition
multi = seasonal_decompose(
    sale_by_month_all.iloc[:,4], period=12,
    model="multiplicative"
)
data = {'trend': multi.trend,
        'seasonality': multi.seasonal,
        'residuals':multi.resid
         }
res_multi = pd.DataFrame(data)
res_multi.iloc[6:10,:]
| | trend | seasonality | residuals |
|---|---|---|---|
| 6 | 0.619041 | 1.259177 | 0.995523 | 
| 7 | 0.622029 | 1.221443 | 1.129390 | 
| 8 | 0.627388 | 1.072259 | 1.084304 | 
| 9 | 0.632759 | 0.963760 | 1.085500 | 
multi.plot()

ARIMA
https://machinelearningmastery.com/arima-for-time-series-forecasting-with-python/
Fit an ARIMA model to the trend component obtained from the STL decomposition and use it for forecasting.
res_stl.head()
| | trend | seasonality | residuals |
|---|---|---|---|
| 0 | 0.644373 | -0.247768 | 0.022129 | 
| 1 | 0.642466 | -0.132287 | 0.023142 | 
| 2 | 0.640681 | -0.059934 | -0.022835 | 
| 3 | 0.639011 | 0.018400 | -0.005875 | 
| 4 | 0.637452 | 0.007495 | -0.001481 | 
import datetime
from matplotlib import pyplot
series = res_stl['trend']
series.plot()

from pandas.plotting import autocorrelation_plot
autocorrelation_plot(series)

from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(series, order=(10,1,0))
model_fit = model.fit()
print(model_fit.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                  trend   No. Observations:                   36
Model:                ARIMA(10, 1, 0)   Log Likelihood                 222.352
Date:                Thu, 01 Aug 2024   AIC                           -422.704
Time:                        20:01:12   BIC                           -405.596
Sample:                             0   HQIC                          -416.798
                                 - 36                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          1.4304      0.096     14.970      0.000       1.243       1.618
ar.L2         -0.2795      0.077     -3.638      0.000      -0.430      -0.129
ar.L3          0.0147      0.078      0.187      0.851      -0.139       0.169
ar.L4         -0.0643      0.045     -1.417      0.156      -0.153       0.025
ar.L5         -0.1505      0.046     -3.291      0.001      -0.240      -0.061
ar.L6         -0.0301      0.071     -0.422      0.673      -0.170       0.110
ar.L7         -0.0039      0.066     -0.059      0.953      -0.132       0.125
ar.L8          0.0468      0.039      1.198      0.231      -0.030       0.123
ar.L9          0.0183      0.034      0.537      0.591      -0.048       0.085
ar.L10         0.0070      0.104      0.067      0.946      -0.198       0.212
sigma2      1.491e-07   1.57e-08      9.498      0.000    1.18e-07     1.8e-07
===================================================================================
Ljung-Box (L1) (Q):                   0.08   Jarque-Bera (JB):               449.77
Prob(Q):                              0.78   Prob(JB):                         0.00
Heteroskedasticity (H):               0.09   Skew:                             3.52
Prob(H) (two-sided):                  0.00   Kurtosis:                        19.09
===================================================================================
from pandas import DataFrame
# line plot of residuals
residuals = DataFrame(model_fit.resid)
residuals.plot()
pyplot.show()

# summary stats of residuals
print(residuals.describe())
               0
count  36.000000
mean    0.017927
std     0.107392
min    -0.001907
25%    -0.000042
50%     0.000025
75%     0.000113
max     0.644373
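import warnings
warnings.filterwarnings("ignore")
from numpy import sqrt
from sklearn.metrics import mean_squared_error
X = series.values
# Note: train and test are both the full series, so history already contains
# every test value and the RMSE below is not an out-of-sample error
train, test = X, X
history = [x for x in train]
predictions = list()
# walk-forward validation
for t in range(len(test)):
    # refit on everything seen so far and forecast one step ahead
    model = ARIMA(history, order=(5,1,0))
    model_fit = model.fit()
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    # append the actual value so the next fit can use it
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))

# evaluate forecasts
rmse = sqrt(mean_squared_error(test, predictions))
print('Test RMSE: %.3f' % rmse)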
predicted=0.746828, expected=0.644373
predicted=0.639646, expected=0.642466
predicted=0.637131, expected=0.640681
predicted=0.636508, expected=0.639011
predicted=0.638260, expected=0.637452
predicted=0.636941, expected=0.636011
predicted=0.635828, expected=0.634695
predicted=0.634524, expected=0.633510
predicted=0.633352, expected=0.632476
predicted=0.632332, expected=0.631610
predicted=0.631483, expected=0.630988
predicted=0.630880, expected=0.630979
predicted=0.630907, expected=0.633298
predicted=0.633330, expected=0.636138
predicted=0.636276, expected=0.639619
predicted=0.639865, expected=0.643978
predicted=0.644346, expected=0.649266
predicted=0.649768, expected=0.655252
predicted=0.655890, expected=0.661718
predicted=0.662507, expected=0.668408
predicted=0.669351, expected=0.675055
predicted=0.676136, expected=0.681361
predicted=0.682541, expected=0.687128
predicted=0.688354, expected=0.692332
predicted=0.693554, expected=0.697262
predicted=0.698456, expected=0.701650
predicted=0.702786, expected=0.705999
predicted=0.707089, expected=0.710305
predicted=0.711366, expected=0.714555
predicted=0.715603, expected=0.718749
predicted=0.719792, expected=0.722890
predicted=0.723909, expected=0.726981
predicted=0.728039, expected=0.731031
predicted=0.732095, expected=0.735043
predicted=0.736114, expected=0.739024
predicted=0.740101, expected=0.742980
Test RMSE: 0.017
# plot forecasts against actual outcomes
pyplot.plot(test)
pyplot.plot(predictions, color='red')
pyplot.show()

LSTM
https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# fix random seed for reproducibility
tf.random.set_seed(7)
df = d1234.iloc[:,3]
df = np.array(df).reshape(-1,1)
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
df_norm = scaler.fit_transform(df)
# split into train and test sets
train_size = int(len(df_norm) * 0.7)
test_size = len(df_norm) - train_size
train, test = df_norm[0:train_size,:], df_norm[train_size:len(df_norm),:]
print(len(train), len(test))
614952 263551
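The scaler above is fitted on the full series before the split, so the test portion influences the minimum and maximum used to scale the training data. A sketch of the alternative, fitting on the training rows only (the ten values are made up):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

values = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.0, 0.1, 0.3, 3.0]).reshape(-1, 1)

train_n = int(len(values) * 0.7)
train_raw, test_raw = values[:train_n], values[train_n:]

scaler_demo = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler_demo.fit_transform(train_raw)  # min/max learned from training rows only
test_scaled = scaler_demo.transform(test_raw)        # same mapping applied to unseen rows

print(train_scaled.min(), train_scaled.max())
print(test_scaled.min(), test_scaled.max())
```

Test values may then fall outside [0, 1], which is expected and reflects that the model has not seen their range.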
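# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        # input window: look_back consecutive values starting at i
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        # target: the value immediately after the window
        dataY.append(dataset[i + look_back, 0])
    # return after the loop so that every window is collected
    return np.array(dataX), np.array(dataY)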
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
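The windowing step can also be written without a Python loop, which matters for arrays with hundreds of thousands of rows. A sketch using NumPy's `sliding_window_view` (the helper name `make_windows` is hypothetical and not part of the notebook):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, look_back=1):
    """Return (X, y): X[i] holds look_back consecutive values, y[i] the value right after them."""
    series = np.asarray(series, dtype=float).ravel()
    # Each row has look_back inputs followed by one target
    windows = sliding_window_view(series, look_back + 1)
    return windows[:, :-1], windows[:, -1]

X_win, y_win = make_windows([10, 20, 30, 40, 50], look_back=2)
print(X_win)
print(y_win)
```

This version yields `len(series) - look_back` samples, one more than a loop over `range(len(dataset)-look_back-1)`.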
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
Epoch 1/100
1/1 - 1s - loss: 0.0026 - 611ms/epoch - 611ms/step
Epoch 2/100
1/1 - 0s - loss: 0.0024 - 2ms/epoch - 2ms/step
Epoch 3/100
1/1 - 0s - loss: 0.0023 - 2ms/epoch - 2ms/step
Epoch 4/100
1/1 - 0s - loss: 0.0021 - 2ms/epoch - 2ms/step
Epoch 5/100
1/1 - 0s - loss: 0.0020 - 2ms/epoch - 2ms/step
Epoch 6/100
1/1 - 0s - loss: 0.0019 - 2ms/epoch - 2ms/step
Epoch 7/100
1/1 - 0s - loss: 0.0017 - 1ms/epoch - 1ms/step
Epoch 8/100
1/1 - 0s - loss: 0.0016 - 1ms/epoch - 1ms/step
Epoch 9/100
1/1 - 0s - loss: 0.0015 - 1ms/epoch - 1ms/step
Epoch 10/100
1/1 - 0s - loss: 0.0014 - 1ms/epoch - 1ms/step
Epoch 11/100
1/1 - 0s - loss: 0.0013 - 1ms/epoch - 1ms/step
Epoch 12/100
1/1 - 0s - loss: 0.0012 - 1ms/epoch - 1ms/step
Epoch 13/100
1/1 - 0s - loss: 0.0011 - 2ms/epoch - 2ms/step
Epoch 14/100
1/1 - 0s - loss: 9.5703e-04 - 2ms/epoch - 2ms/step
Epoch 15/100
1/1 - 0s - loss: 8.6722e-04 - 2ms/epoch - 2ms/step
Epoch 16/100
1/1 - 0s - loss: 7.8267e-04 - 2ms/epoch - 2ms/step
Epoch 17/100
1/1 - 0s - loss: 7.0334e-04 - 2ms/epoch - 2ms/step
Epoch 18/100
1/1 - 0s - loss: 6.2916e-04 - 2ms/epoch - 2ms/step
Epoch 19/100
1/1 - 0s - loss: 5.6005e-04 - 1ms/epoch - 1ms/step
Epoch 20/100
1/1 - 0s - loss: 4.9593e-04 - 1ms/epoch - 1ms/step
Epoch 21/100
1/1 - 0s - loss: 4.3668e-04 - 1ms/epoch - 1ms/step
Epoch 22/100
1/1 - 0s - loss: 3.8218e-04 - 2ms/epoch - 2ms/step
Epoch 23/100
1/1 - 0s - loss: 3.3229e-04 - 2ms/epoch - 2ms/step
Epoch 24/100
1/1 - 0s - loss: 2.8686e-04 - 2ms/epoch - 2ms/step
Epoch 25/100
1/1 - 0s - loss: 2.4572e-04 - 2ms/epoch - 2ms/step
Epoch 26/100
1/1 - 0s - loss: 2.0869e-04 - 2ms/epoch - 2ms/step
Epoch 27/100
1/1 - 0s - loss: 1.7558e-04 - 2ms/epoch - 2ms/step
Epoch 28/100
1/1 - 0s - loss: 1.4619e-04 - 2ms/epoch - 2ms/step
Epoch 29/100
1/1 - 0s - loss: 1.2030e-04 - 2ms/epoch - 2ms/step
Epoch 30/100
1/1 - 0s - loss: 9.7691e-05 - 2ms/epoch - 2ms/step
Epoch 31/100
1/1 - 0s - loss: 7.8142e-05 - 1ms/epoch - 1ms/step
Epoch 32/100
1/1 - 0s - loss: 6.1421e-05 - 1ms/epoch - 1ms/step
Epoch 33/100
1/1 - 0s - loss: 4.7296e-05 - 2ms/epoch - 2ms/step
Epoch 34/100
1/1 - 0s - loss: 3.5534e-05 - 2ms/epoch - 2ms/step
Epoch 35/100
1/1 - 0s - loss: 2.5905e-05 - 2ms/epoch - 2ms/step
Epoch 36/100
1/1 - 0s - loss: 1.8181e-05 - 2ms/epoch - 2ms/step
Epoch 37/100
1/1 - 0s - loss: 1.2141e-05 - 2ms/epoch - 2ms/step
Epoch 38/100
1/1 - 0s - loss: 7.5720e-06 - 2ms/epoch - 2ms/step
Epoch 39/100
1/1 - 0s - loss: 4.2689e-06 - 2ms/epoch - 2ms/step
Epoch 40/100
1/1 - 0s - loss: 2.0387e-06 - 2ms/epoch - 2ms/step
Epoch 41/100
1/1 - 0s - loss: 7.0006e-07 - 2ms/epoch - 2ms/step
Epoch 42/100
1/1 - 0s - loss: 8.5575e-08 - 1ms/epoch - 1ms/step
Epoch 43/100
1/1 - 0s - loss: 4.2072e-08 - 1ms/epoch - 1ms/step
Epoch 44/100
1/1 - 0s - loss: 4.3148e-07 - 2ms/epoch - 2ms/step
Epoch 45/100
1/1 - 0s - loss: 1.1312e-06 - 1ms/epoch - 1ms/step
Epoch 46/100
1/1 - 0s - loss: 2.0341e-06 - 1ms/epoch - 1ms/step
Epoch 47/100
1/1 - 0s - loss: 3.0485e-06 - 1ms/epoch - 1ms/step
Epoch 48/100
1/1 - 0s - loss: 4.0975e-06 - 1ms/epoch - 1ms/step
Epoch 49/100
1/1 - 0s - loss: 5.1187e-06 - 2ms/epoch - 2ms/step
Epoch 50/100
1/1 - 0s - loss: 6.0630e-06 - 2ms/epoch - 2ms/step
Epoch 51/100
1/1 - 0s - loss: 6.8934e-06 - 2ms/epoch - 2ms/step
Epoch 52/100
1/1 - 0s - loss: 7.5847e-06 - 4ms/epoch - 4ms/step
Epoch 53/100
1/1 - 0s - loss: 8.1211e-06 - 3ms/epoch - 3ms/step
Epoch 54/100
1/1 - 0s - loss: 8.4957e-06 - 3ms/epoch - 3ms/step
Epoch 55/100
1/1 - 0s - loss: 8.7090e-06 - 3ms/epoch - 3ms/step
Epoch 56/100
1/1 - 0s - loss: 8.7674e-06 - 3ms/epoch - 3ms/step
Epoch 57/100
1/1 - 0s - loss: 8.6819e-06 - 2ms/epoch - 2ms/step
Epoch 58/100
1/1 - 0s - loss: 8.4674e-06 - 2ms/epoch - 2ms/step
Epoch 59/100
1/1 - 0s - loss: 8.1409e-06 - 2ms/epoch - 2ms/step
Epoch 60/100
1/1 - 0s - loss: 7.7213e-06 - 2ms/epoch - 2ms/step
Epoch 61/100
1/1 - 0s - loss: 7.2276e-06 - 2ms/epoch - 2ms/step
Epoch 62/100
1/1 - 0s - loss: 6.6792e-06 - 2ms/epoch - 2ms/step
Epoch 63/100
1/1 - 0s - loss: 6.0943e-06 - 2ms/epoch - 2ms/step
Epoch 64/100
1/1 - 0s - loss: 5.4902e-06 - 2ms/epoch - 2ms/step
Epoch 65/100
1/1 - 0s - loss: 4.8822e-06 - 2ms/epoch - 2ms/step
Epoch 66/100
1/1 - 0s - loss: 4.2841e-06 - 2ms/epoch - 2ms/step
Epoch 67/100
1/1 - 0s - loss: 3.7074e-06 - 2ms/epoch - 2ms/step
Epoch 68/100
1/1 - 0s - loss: 3.1617e-06 - 1ms/epoch - 1ms/step
Epoch 69/100
1/1 - 0s - loss: 2.6544e-06 - 1ms/epoch - 1ms/step
Epoch 70/100
1/1 - 0s - loss: 2.1909e-06 - 1ms/epoch - 1ms/step
Epoch 71/100
1/1 - 0s - loss: 1.7745e-06 - 1ms/epoch - 1ms/step
Epoch 72/100
1/1 - 0s - loss: 1.4072e-06 - 2ms/epoch - 2ms/step
Epoch 73/100
1/1 - 0s - loss: 1.0890e-06 - 1ms/epoch - 1ms/step
Epoch 74/100
1/1 - 0s - loss: 8.1904e-07 - 1ms/epoch - 1ms/step
Epoch 75/100
1/1 - 0s - loss: 5.9503e-07 - 1ms/epoch - 1ms/step
Epoch 76/100
1/1 - 0s - loss: 4.1391e-07 - 1ms/epoch - 1ms/step
Epoch 77/100
1/1 - 0s - loss: 2.7203e-07 - 1ms/epoch - 1ms/step
Epoch 78/100
1/1 - 0s - loss: 1.6523e-07 - 2ms/epoch - 2ms/step
Epoch 79/100
1/1 - 0s - loss: 8.9118e-08 - 2ms/epoch - 2ms/step
Epoch 80/100
1/1 - 0s - loss: 3.9202e-08 - 1ms/epoch - 1ms/step
Epoch 81/100
1/1 - 0s - loss: 1.1048e-08 - 1ms/epoch - 1ms/step
Epoch 82/100
1/1 - 0s - loss: 4.0064e-10 - 1ms/epoch - 1ms/step
Epoch 83/100
1/1 - 0s - loss: 3.2746e-09 - 1ms/epoch - 1ms/step
Epoch 84/100
1/1 - 0s - loss: 1.6036e-08 - 1ms/epoch - 1ms/step
Epoch 85/100
1/1 - 0s - loss: 3.5442e-08 - 1ms/epoch - 1ms/step
Epoch 86/100
1/1 - 0s - loss: 5.8693e-08 - 1ms/epoch - 1ms/step
Epoch 87/100
1/1 - 0s - loss: 8.3416e-08 - 1ms/epoch - 1ms/step
Epoch 88/100
1/1 - 0s - loss: 1.0769e-07 - 1ms/epoch - 1ms/step
Epoch 89/100
1/1 - 0s - loss: 1.3003e-07 - 1ms/epoch - 1ms/step
Epoch 90/100
1/1 - 0s - loss: 1.4932e-07 - 1ms/epoch - 1ms/step
Epoch 91/100
1/1 - 0s - loss: 1.6482e-07 - 1ms/epoch - 1ms/step
Epoch 92/100
1/1 - 0s - loss: 1.7613e-07 - 1ms/epoch - 1ms/step
Epoch 93/100
1/1 - 0s - loss: 1.8309e-07 - 1ms/epoch - 1ms/step
Epoch 94/100
1/1 - 0s - loss: 1.8579e-07 - 1ms/epoch - 1ms/step
Epoch 95/100
1/1 - 0s - loss: 1.8451e-07 - 2ms/epoch - 2ms/step
Epoch 96/100
1/1 - 0s - loss: 1.7963e-07 - 1ms/epoch - 1ms/step
Epoch 97/100
1/1 - 0s - loss: 1.7169e-07 - 1ms/epoch - 1ms/step
Epoch 98/100
1/1 - 0s - loss: 1.6123e-07 - 1ms/epoch - 1ms/step
Epoch 99/100
1/1 - 0s - loss: 1.4884e-07 - 1ms/epoch - 1ms/step
Epoch 100/100
1/1 - 0s - loss: 1.3510e-07 - 1ms/epoch - 1ms/step
<keras.callbacks.History at 0x7f1d50766be0>
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = np.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
1/1 [==============================] - 0s 192ms/step
1/1 [==============================] - 0s 10ms/step
Train Score: 0.06 RMSE
Test Score: 0.19 RMSE
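An RMSE is easier to judge next to a naive baseline. A sketch of a persistence forecast, which predicts each value with the one immediately before it (the six numbers are made up):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_demo = np.array([0.5, 0.7, 0.6, 0.9, 0.8, 1.0])

# Persistence forecast: the previous observation is the prediction
pred_demo = y_demo[:-1]
actual_demo = y_demo[1:]

baseline_rmse = np.sqrt(mean_squared_error(actual_demo, pred_demo))
print(f'Persistence RMSE: {baseline_rmse:.3f}')
```

If a trained model does not beat this baseline on the same test values, its extra complexity is hard to justify.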
# shift train predictions for plotting
trainPredictPlot = np.empty_like(df_norm)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = np.empty_like(df_norm)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(df_norm)-1, :] = testPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(df_norm))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
