Original article #240, focused on "personal growth and financial freedom, the logic of how the world works, and investing".
Today we build a learning-to-rank strategy for ETF sector rotation. The GBDT framework we use is LightGBM; its main advantages are speed and solid out-of-the-box results.
Our candidate set is 29 sector ETFs:
etfs = [
'159870.SZ',
'512400.SH',
'515220.SH',
'515210.SH',
'516950.SH',
'562800.SH',
'515170.SH',
'512690.SH',
'159996.SZ',
'159865.SZ',
'159766.SZ',
'515950.SH',
'159992.SZ',
'159839.SZ',
'512170.SH',
'159883.SZ',
'512980.SH',
'159869.SZ',
'515050.SH',
'515000.SH',
'515880.SH',
'512480.SH',
'515230.SH',
'512670.SH',
'515790.SH',
'159757.SZ',
'516110.SH',
'512800.SH',
'512200.SH',
]
The alpha dataset we use:
The factor list here can be extended further — for example, swapped out for qlib's Alpha158, or supplemented with more technical indicators.
class Alpha:
    def __init__(self):
        pass

    def get_feature_config(self):
        return self.parse_config_to_fields()

    def get_label_config(self):
        # label_c: raw 5-day forward return (buy at next open, sell 5 bars later)
        # label:   the same return bucketed into 20 quantile grades — the
        #          ordinal relevance the ranker is trained on
        return ["shift(close, -5)/shift(open, -1) - 1",
                "qcut(shift(close, -5)/shift(open, -1) - 1, 20)"
                ], ["label_c", "label"]

    @staticmethod
    def parse_config_to_fields():
        # e.g. ['CORD30', 'STD30', 'CORR5', 'RESI10', 'CORD60', 'STD5', 'LOW0',
        #       'WVMA30', 'RESI5', 'ROC5', 'KSFT', 'STD20', 'RSV5', 'STD60', 'KLEN']
        fields = []
        names = []
        windows = [5, 10, 20, 30, 60]
        # CORD: correlation between the price-change ratio and log volume change
        fields += ["corr(close/shift(close,1), log(volume/shift(volume, 1)+1), %d)" % d
                   for d in windows]
        names += ["CORD%d" % d for d in windows]
        # 20-day rate of change
        fields += ['close/shift(close,20)-1']
        names += ['roc_20']
        return fields, names
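To make the label expressions concrete, here is a small illustration (with synthetic prices, not the framework's expression engine) of what `label_c` and `label` compute: the raw 5-day forward return, and its 20-bucket quantile grade that the ranker consumes.

```python
import numpy as np
import pandas as pd

# Synthetic price series standing in for one ETF's daily bars.
rng = np.random.default_rng(0)
close = pd.Series(100 * np.cumprod(1 + rng.normal(0, 0.01, 200)))
open_ = close.shift(1) * (1 + rng.normal(0, 0.002, 200))

# shift(close, -5)/shift(open, -1) - 1: enter at the next open,
# exit at the close 5 bars later.
label_c = close.shift(-5) / open_.shift(-1) - 1

# qcut(..., 20): bucket the forward return into 20 quantile grades
# (0 = worst days, 19 = best days) — ordinal relevance for ranking.
label = pd.qcut(label_c, 20, labels=False)

print(int(label.dropna().min()), int(label.dropna().max()))
```

Training on quantile grades rather than raw returns makes the target scale-free across regimes, which is exactly what a pairwise/listwise ranker wants.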
Below is the gradient-boosted-tree code:
import joblib, os
import lightgbm as lgb
from quant.datafeed.dataset import DataSet


class LGBModel:
    def __init__(self, load_model=False, feature_cols=None):
        self.feature_cols = feature_cols
        if load_model:
            path = os.path.dirname(__file__)
            self.ranker = joblib.load(path + '/lgb.pkl')

    def _prepare_groups(self, df):
        # One ranked list per trading day: count the rows (symbols) per date.
        df['day'] = df.index
        group = df.groupby('day')['day'].count()
        return group

    def predict(self, data):
        data = data.copy(deep=True)
        if self.feature_cols:
            data = data[self.feature_cols]
        pred = self.ranker.predict(data)
        return pred

    def train(self, ds: DataSet):
        X_train, X_test, y_train, y_test = ds.get_split_data()
        X_train_data = X_train.drop('symbol', axis=1)
        X_test_data = X_test.drop('symbol', axis=1)
        query_train = self._prepare_groups(X_train.copy(deep=True)).values
        query_val = self._prepare_groups(X_test.copy(deep=True)).values

        gbm = lgb.LGBMRanker()
        # eval_at=[5, 10, 20]: report NDCG@5/10/20 on the validation set.
        # early_stopping_rounds was removed in LightGBM 4.x; the
        # early_stopping callback is the current API.
        gbm.fit(X_train_data, y_train, group=query_train,
                eval_set=[(X_test_data, y_test)], eval_group=[query_val],
                eval_at=[5, 10, 20],
                callbacks=[lgb.early_stopping(50)])
        print(gbm.feature_importances_)
        joblib.dump(gbm, 'lgb.pkl')
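The `group=` argument is the part of `LGBMRanker` that most often trips people up. A minimal sketch of what `_prepare_groups` produces, assuming the data layout used above (one row per (date, symbol), with the date as the index):

```python
import pandas as pd

# Two trading days, 29 ETF rows each — the assumed cross-sectional layout.
idx = pd.to_datetime(['2023-01-03'] * 29 + ['2023-01-04'] * 29)
df = pd.DataFrame({'x': range(58)}, index=idx)

df['day'] = df.index
group = df.groupby('day')['day'].count()  # rows per trading day

# LGBMRanker's `group=` expects these counts in row order: the first 29
# rows form one ranked list (day 1), the next 29 the second (day 2), etc.
print(group.values.tolist())  # [29, 29]
```

The counts must sum to the number of training rows, and the rows themselves must be contiguous per day — which is why the frame is kept sorted by date before training.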
Once the model is trained, we load it through the algo framework's machine-learning "module" and generate predictions:
from quant.context import ExecContext


class ModelPredict:
    def __init__(self, model):
        self.model = model

    def __call__(self, context: ExecContext):
        context.bar_df['pred_score'] = self.model.predict(context.bar_df)
        return False  # keep going: downstream algos still need to run
The integration code stays concise:
from quant.context import ExecContext
from quant.algo.algos import *
from quant.algo.algo_model import ModelPredict
from quant.models.gbdt_l2r import LGBModel

env = Env(ds)
model = LGBModel(load_model=True, feature_cols=ds.features)
env.set_algos([
    RunWeekly(),                # rebalance once a week
    ModelPredict(model=model),  # score every ETF with the ranker
    SelectTopK(K=2, order_by='pred_score', b_ascending=False),  # top 2 scores
    WeightEqually()             # equal weights on the picks
])
env.backtest_loop()
env.show_results()
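The selection-and-weighting steps of that pipeline boil down to a few lines of pandas. A sketch of one rebalance day (hypothetical scores, not the framework's internals):

```python
import pandas as pd

# Hypothetical cross-section on one weekly rebalance day: the strategy
# keeps the top-K=2 ETFs by pred_score and weights them equally.
bar_df = pd.DataFrame({
    'symbol': ['512400.SH', '512690.SH', '515050.SH', '159870.SZ'],
    'pred_score': [0.8, 1.4, 0.2, 1.1],
})

top_k = bar_df.nlargest(2, 'pred_score')  # descending by score
weights = pd.Series(1 / len(top_k), index=top_k['symbol'])

print(weights.to_dict())  # {'512690.SH': 0.5, '159870.SZ': 0.5}
```

Note that only the *order* of `pred_score` matters here — a ranker's raw scores have no return interpretation, which is why top-K selection is the natural way to consume them.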

Annualized return 33.8%, Sharpe ratio 1.22. (The integrated backtest framework code, data, and strategy have all been uploaded to the planet — you're welcome to download them from my open-source project and Knowledge Planet.)
As a comparison, if we instead buy the ETFs with the *worst* pred_score, the result looks like the chart below — which indirectly confirms that our ranking is doing its job.

Tomorrow I'll expand the alpha factor set and add walk-forward (rolling) backtesting to see how it performs.
Sitting idly in a quiet courtyard facing idle flowers, gently simmering time and slowly brewing tea; asking nothing of worldly affairs, letting the years frost my hair.
A life of financial freedom is worth looking forward to.