TF 2.0 - An Introduction to Time Series Forecasting

Author: Lucien
Published: November 11, 2019
Category: Machine Learning

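Permalink: https://blog.lucien.ink/archives/483/

Google recently made TensorFlow 2.0 the default TensorFlow release. As a beginner, I decided to practice with the new version, which is comparatively easy to use.

In what follows, I record how I used an LSTM and the Beijing PM2.5 Data Set to do time series forecasting.

The ipynb file already contains all the figures, so I will not reproduce them in this post.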

0. Environment

Package	Version
`tensorflow`	`2.0.0`
`numpy`	`1.17.3`
`pandas`	`0.25.3`
`matplotlib`	`3.1.1`
`sklearn`	`0.21.3`

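1. Walkthrough

1.1 Dataset

The Beijing PM2.5 Data Set contains hourly PM2.5 readings from the US Embassy in Beijing together with meteorological data, covering 2010 to 2014.
The dataset includes the date, PM2.5 concentration, dew point, temperature, pressure, wind direction, wind speed, and the cumulated hours of snow and rain.

The complete set of features in the raw data is:

No row number
year year
month month
day day
hour hour
pm2.5 PM2.5 concentration
DEWP dew point
TEMP temperature
PRES pressure
cbwd combined wind direction
Iws cumulated wind speed
Is cumulated hours of snow
Ir cumulated hours of rain

This dataset can be used to build a model that predicts PM2.5: use the previous x hours to predict the PM2.5 values of the next y hours.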

from tensorflow import random
from tensorflow import keras
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.metrics import r2_score

# Fix the random seeds for reproducibility
np.random.seed(10086)
random.set_seed(10010)

csv_data = keras.utils.get_file("PRSA_data.csv", "https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv")

raw_df = pd.read_csv(csv_data)

raw_df.head()

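1.2 Data preprocessing

1.2.1 Dropping the timestamp

For now, my view is that the timestamp is not important for forecasting a continuous time series, so I drop it here.

# Drop the row number and timestamp columns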
df = raw_df.drop(["No", "year", "month", "day", "hour"], axis=1, inplace=False)

print(df.shape)
df.head()

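1.2.2 Dropping NaN

Some values in the pm2.5 column are missing. Since there are not many of them, I simply drop the rows that contain nan.

# Drop the rows whose pm2.5 value is nan.
# .copy() avoids pandas' SettingWithCopyWarning when df is modified later.
df = df[pd.notna(df["pm2.5"])].copy()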

print(df.shape)
df.head()
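
Before dropping rows, it can help to check how many values are actually missing. The following is a minimal sketch on a made-up toy frame (the values and the names `toy`, `missing`, `ratio`, and `cleaned` are assumptions for illustration, not part of the original notebook):

```python
import numpy as np
import pandas as pd

# Toy frame with two missing pm2.5 readings (made-up values).
toy = pd.DataFrame({
    "pm2.5": [np.nan, 129.0, 148.0, np.nan, 159.0],
    "TEMP": [-11.0, -12.0, -11.0, -14.0, -12.0],
})

# Count the missing values and their share of all rows.
missing = toy["pm2.5"].isna().sum()
ratio = missing / len(toy)
print("missing = {}, ratio = {:.2f}".format(missing, ratio))

# Keep only the rows where pm2.5 is present.
cleaned = toy.dropna(subset=["pm2.5"])
print(cleaned.shape)
```

One thing to keep in mind: after rows are dropped, neighbouring rows are no longer guaranteed to be consecutive hours, so a few of the windows built later will span a gap.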

1.2.3 Inspecting the current data

# Print the unique values of each column, skipping the continuous ones
for i in range(df.shape[1]):
    if df.columns[i] in ["pm2.5", "TEMP", "DEWP", "PRES"]:
        continue
    print("{}: {}".format(df.columns[i], df[df.columns[i]].unique()))

# Plot each numeric column
columns = ["pm2.5", "DEWP", "TEMP", "PRES", "Iws", "Is", "Ir"]

plt.figure(figsize=(15, 15))
for i, each in enumerate(columns):
    plt.subplot(len(columns), 1, i + 1)
    plt.plot(df[each])
    plt.title(each, y=0.5, loc="right")  # center, left, right

plt.show()

1.2.4 Converting non-numeric labels to numeric values

# Encode string labels as integer ids
def label_fit_transform(data_frame, col_name):
    data_frame[col_name] = preprocessing.LabelEncoder().fit_transform(data_frame[col_name])
    return data_frame

label_fit_transform(df, "cbwd").head()
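
To make the encoding step concrete, here is a small sketch of what `LabelEncoder` does to a handful of wind direction strings. The sample list `wind` is made up for illustration; the point is only that the classes are sorted and then numbered from 0.

```python
from sklearn import preprocessing

# A few wind direction labels (toy sample, not the real column).
wind = ["NW", "cv", "NE", "SE", "NW"]

encoder = preprocessing.LabelEncoder()
codes = encoder.fit_transform(wind)

# classes_ holds the sorted unique labels; a label's position is its id.
classes = list(encoder.classes_)
print(classes)
print(list(codes))

# The mapping is reversible.
restored = list(encoder.inverse_transform(codes))
print(restored)
```

Note that integer ids imply an ordering between directions that does not really exist. A one-hot encoding would avoid that, at the cost of more input columns; the article keeps the simpler integer encoding.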

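1.2.5 Normalizing the values

After normalization the model converges faster and will most likely perform better. For an intuitive explanation, I think this answer on Zhihu puts it very well. Note that the helper below is named standardization, but what it applies is min-max scaling to the range [0, 1].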

def standardization(data_frame):
    buffer = data_frame.copy()
    min_max_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
    standard_values = min_max_scaler.fit_transform(buffer)
    for i, col_name in enumerate(buffer.columns):
        buffer[col_name] = standard_values[:, i]

    return buffer

standardization(df).head()
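
For reference, min-max scaling maps each column with x' = (x - min) / (max - min). The sketch below reproduces that formula by hand with NumPy on made-up numbers; it is meant to mirror what `MinMaxScaler(feature_range=(0, 1))` computes per column, not to replace it.

```python
import numpy as np

# Two toy columns on very different scales (made-up values).
values = np.array([
    [0.0, 1000.0],
    [50.0, 1010.0],
    [200.0, 1030.0],
])

# Per-column minimum and maximum.
col_min = values.min(axis=0)
col_max = values.max(axis=0)

# Min-max scaling: every column ends up in [0, 1].
scaled = (values - col_min) / (col_max - col_min)
print(scaled)
```

After scaling, both columns span exactly 0 to 1, so neither dominates the other simply because of its units.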

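1.2.6 Framing the time series as supervised training data

The raw time series cannot be fed to the model directly; it first has to be reshaped into input -> label pairs.

# Frame the series as a supervised learning dataset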
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    """
    Frame a time series as a supervised learning dataset.
    Arguments:
        data: pandas DataFrame of observations; its column names are used to label the output.
        n_in: Number of lag observations as input (X).
        n_out: Number of observations as output (y).
        dropnan: Boolean whether or not to drop rows with NaN values.
    Returns:
        Pandas DataFrame of series framed for supervised learning.
    """
    from pandas import DataFrame, concat

    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('%s(t-%d)' % (data.columns[j], i)) for j in range(n_vars)]

    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('%s(t)' % (data.columns[j])) for j in range(n_vars)]
        else:
            names += [('%s(t+%d)' % (data.columns[j], i)) for j in range(n_vars)]

    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names

    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

# Use the past 2 hours of data to predict the next 1 hour
look_back = 2
predict_forward = 1

# standard (supervised) data frame
sdf = series_to_supervised(
    standardization(
        label_fit_transform(df, "cbwd")), look_back, predict_forward).drop(
        [
         "DEWP(t)", "TEMP(t)", "PRES(t)", "cbwd(t)", "Iws(t)", "Is(t)", "Ir(t)"
         ], axis=1, inplace=False).astype('float32')

sdf.head()

sdf.info()
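
To see what this framing does, here is a minimal sketch on a toy single-column series. It uses only `DataFrame.shift` and `concat`, the same building blocks as `series_to_supervised`; the values and the names `toy` and `framed` are made up for illustration.

```python
import pandas as pd

# Toy univariate series: five hourly readings (made-up values).
toy = pd.DataFrame({"pm2.5": [10.0, 20.0, 30.0, 40.0, 50.0]})

look_back = 2  # same window length as in the article

# shift(k) moves the values down by k rows, so row t then holds
# the observation from time t-k.
framed = pd.concat([toy.shift(2), toy.shift(1), toy], axis=1)
framed.columns = ["pm2.5(t-2)", "pm2.5(t-1)", "pm2.5(t)"]

# The first look_back rows have no full history and contain NaN.
framed = framed.dropna()
print(framed)
```

Each remaining row is one training sample: the two lagged columns are the input and the last column is the label. The frame is `look_back` rows shorter than the original series, which matters later when aligning predictions with the raw values.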

1.2.7 Splitting the dataset

From what I have read online, the $train$, $valid$, and $test$ sets are commonly split in a $6:2:2$ ratio.

# Split train, valid, test 6:2:2
total = sdf.shape[0]
split_point = [total * 60 // 100, total * 80 // 100]

print("total = {}, split_point = {}".format(total, split_point))

def transform(values):
    return values.reshape(values.shape[0], 1, values.shape[1])

train_data = sdf[:split_point[0]].values

valid_data = sdf[split_point[0]: split_point[1]].values

test_data = sdf[split_point[1]: ].values

print("train_data.shape = {}, valid_data.shape = {}, test_data.shape = {}".format(
    train_data.shape, valid_data.shape, test_data.shape))

train_x, train_y = transform(train_data[:, : -1]), train_data[:, -1]

valid_x, valid_y = transform(valid_data[:, : -1]), valid_data[:, -1]

test_x, test_y = transform(test_data[:, : -1]), test_data[:, -1]

print("train_x.shape = {}, train_y = {}".format(train_x.shape, train_y.shape))
print("valid_x.shape = {}, valid_y = {}".format(valid_x.shape, valid_y.shape))
print("test_x.shape = {}, test_y = {}".format(test_x.shape, test_y.shape))
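
The split and reshape above can be checked on a tiny array. This is a sketch with made-up data (10 rows, 4 input columns plus 1 label column); `to_lstm_input` is a hypothetical stand-in for the `transform` helper and does the same reshape to (samples, timesteps, features) with a single timestep.

```python
import numpy as np

# Toy data: 10 rows, the last column plays the role of the label.
toy = np.arange(50, dtype="float32").reshape(10, 5)

total = toy.shape[0]
split_point = [total * 60 // 100, total * 80 // 100]

# Chronological split: no shuffling, so later rows never leak into train.
train = toy[:split_point[0]]
valid = toy[split_point[0]:split_point[1]]
test = toy[split_point[1]:]

def to_lstm_input(values):
    # Keras LSTM layers expect (samples, timesteps, features).
    return values.reshape(values.shape[0], 1, values.shape[1])

train_x, train_y = to_lstm_input(train[:, :-1]), train[:, -1]
print(split_point, train_x.shape, train_y.shape)
```

Keeping the split chronological is the important design choice here: the test set is always the most recent 20% of the data.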

1.3 Model

1.3.1 Building the network

model = keras.Sequential()
model.add(keras.layers.LSTM(64, input_shape=(train_x.shape[1], train_x.shape[2])))
model.add(keras.layers.Dense(1))

model.compile(loss="mae", optimizer="adam")
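
As a rough sanity check on the model size, the number of trainable parameters can be worked out by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias) and the 16 input features that follow from 8 columns times look_back = 2; the helper names are hypothetical.

```python
def lstm_param_count(units, input_dim):
    # Four gates, each with (input_dim + units) weights per unit plus a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(units, input_dim):
    # One weight per input for every output unit, plus a bias per unit.
    return units * input_dim + units

# 8 columns (pm2.5, DEWP, TEMP, PRES, cbwd, Iws, Is, Ir) x look_back of 2.
n_features = 8 * 2

lstm_params = lstm_param_count(64, n_features)
dense_params = dense_param_count(1, 64)
total_params = lstm_params + dense_params
print(lstm_params, dense_params, total_params)
```

If the assumptions hold, these numbers should match what `model.summary()` reports for the network above.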

1.3.2 Training and recording the history

history = model.fit(train_x,
                    train_y,
                    validation_data=(valid_x, valid_y),
                    epochs=32,
                    batch_size=32,
                    verbose=1,
                    shuffle=False)

1.4 Evaluating the model

1.4.1 Loss curves

First, plot the loss on the train and valid sets. There is no obvious sign of overfitting.

plt.plot(history.history["loss"], label="train loss")
plt.plot(history.history["val_loss"], label="valid loss")
plt.legend()
plt.show()

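1.4.2 Evaluating on the test set

1.4.2.1 loss

# Loss on the test set
model.evaluate(test_x, test_y, verbose=0)

The value looks quite low, though note that it is the MAE on the min-max-normalized pm2.5 values, not in the original units.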

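1.4.2.2 Comparing predictions with the ground truth

1.4.2.2.1 Getting the predictions

prediction = model.predict(test_x)

1.4.2.2.2 De-normalizing the predictions

Since MinMaxScaler was used, it is enough to invert its formula directly.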

max_value = np.max(df["pm2.5"])
min_value = np.min(df["pm2.5"])
prediction = prediction[:, 0] * (max_value - min_value) + min_value
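
The inverse formula is x = x' * (max - min) + min. A useful side effect is that an MAE measured on the normalized scale converts to the original units by multiplying with (max - min). The sketch below illustrates both on made-up numbers; the range 0 to 500 and all array values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical range of the raw pm2.5 column.
min_value, max_value = 0.0, 500.0

# Toy ground truth in raw units and its normalized version.
truth_raw = np.array([100.0, 250.0, 400.0])
truth_scaled = (truth_raw - min_value) / (max_value - min_value)

# Toy model outputs on the normalized scale.
pred_scaled = np.array([0.25, 0.45, 0.80])

# Invert the min-max formula to get back to raw units.
pred_raw = pred_scaled * (max_value - min_value) + min_value

# MAE on both scales; they differ exactly by the factor (max - min).
mae_scaled = np.mean(np.abs(pred_scaled - truth_scaled))
mae_raw = np.mean(np.abs(pred_raw - truth_raw))
print(pred_raw, mae_scaled, mae_raw)
```

This is why a small normalized loss should be rescaled before judging it: the same error looks much larger once expressed in the column's own units.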

1.4.2.2.3 Evaluating the fit

# series_to_supervised drops the first look_back rows (their inputs are nan), so the offset here must include look_back
expectation = df["pm2.5"][split_point[1] + look_back: ].values

print("prediction's shape = {}, expectation's shape = {}".format(prediction.shape, expectation.shape))

# Compute R-squared
print(r2_score(expectation, prediction, multioutput="raw_values"))

plt.figure(figsize=(30, 17))
plt.plot(expectation, label="expectation")
plt.plot(prediction, label="predict", color="yellow", alpha=0.5)
plt.legend()
plt.show()
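
For readers unfamiliar with the metric, R-squared is 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. Here is a minimal NumPy sketch of that definition on made-up numbers; the helper `r2` and the toy lists are assumptions for illustration.

```python
import numpy as np

def r2(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Residual sum of squares: how far the predictions are from the truth.
    ss_res = np.sum((y_true - y_pred) ** 2)
    # Total sum of squares: how far the truth is from its own mean.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

expectation_toy = [3.0, -0.5, 2.0, 7.0]
prediction_toy = [2.5, 0.0, 2.0, 8.0]

score = r2(expectation_toy, prediction_toy)
perfect = r2(expectation_toy, expectation_toy)
mean_only = r2(expectation_toy, [np.mean(expectation_toy)] * 4)
print(score, perfect, mean_only)
```

A perfect prediction scores 1, and always predicting the mean scores 0, so the value reads as the share of variance the model explains beyond that trivial baseline.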

2. All in One

The ipynb file is available on my GitHub or on Google Colab.

The py file is available at pasteme.cn/21403.

Last modified: November 11, 2019


