Skip to content

Latest commit

 

History

History
395 lines (288 loc) · 7.93 KB

File metadata and controls

395 lines (288 loc) · 7.93 KB

PandaAI 因子库集成文档

📚 概述

基于PandaAI官方函数参考手册,我们实现了完整的因子库,并与自监督学习框架深度集成,提供端到端的量化因子挖掘解决方案。


🎯 核心功能

1. 完整因子库 (200+ 函数)

基础因子

  • CLOSE, OPEN, HIGH, LOW, VOLUME, AMOUNT

数学函数

  • ABS, LOG, EXP, SIGN, SIN, COS, TAN
  • ARCSIN, ARCCOS, ARCTAN

时序函数

  • MA, EMA, WMA - 移动平均
  • STD, VAR - 波动率
  • SUM, PRODUCT - 聚合函数
  • TS_MAX, TS_MIN, TS_RANK - 时序统计
  • SLOPE, CORRELATION, COVARIANCE - 关系函数

技术指标

  • MACD, RSI, KDJ, BOLL - 经典指标
  • ATR, CCI, DMI - 波动率指标
  • OBV, MFI, EMV - 成交量指标

截面函数

  • RANK, SCALE, ZSCORE - 标准化

🚀 快速开始

轻量级版本

from factor_library import FactorLibrary
import pandas as pd

# 准备数据
data = pd.DataFrame({
    'close': [100, 101, 102, 103, 104],
    'open': [99, 100, 101, 102, 103],
    'high': [101, 102, 103, 104, 105],
    'low': [98, 99, 100, 101, 102],
    'volume': [1000, 1100, 1200, 1300, 1400]
})

# 计算因子
close = FactorLibrary.CLOSE(data)
ma20 = FactorLibrary.MA(close, 20)
rsi = FactorLibrary.RSI(close, 14)
macd = FactorLibrary.MACD(close)

# 计算相关性
corr = FactorLibrary.CORRELATION(close, data['volume'], 20)

企业级版本

from enterprise.factor_library import EnterpriseFactorLibrary

# 批量计算因子
factor_list = [
    {'name': 'MA', 'params': {'X': close, 'N': 20}},
    {'name': 'RSI', 'params': {'X': close, 'N': 14}},
    {'name': 'MACD', 'params': {'CLOSE': close}}
]
results = EnterpriseFactorLibrary.batch_calculate(data, factor_list)

# 计算Alpha#101因子
alpha001 = EnterpriseFactorLibrary.alpha101_001(data)
alpha002 = EnterpriseFactorLibrary.alpha101_002(data)

# 复合因子
composite = EnterpriseFactorLibrary.composite_factor(data)

🔬 与自监督学习集成

完整工作流

from factor_library import FactorLibrary
from ssl.contrastive import SimpleContrastiveLearning
import torch

# 1. 计算多个技术因子
factors = pd.DataFrame({
    'ma5': FactorLibrary.MA(close, 5),
    'ma20': FactorLibrary.MA(close, 20),
    'rsi': FactorLibrary.RSI(close, 14),
    'macd': FactorLibrary.MACD(close),
    'corr': FactorLibrary.CORRELATION(close, volume, 20),
    # ... 更多因子
})

# 2. 标准化
factors_norm = (factors - factors.mean()) / factors.std()

# 3. 自监督预训练
ssl_model = SimpleContrastiveLearning(
    input_dim=factors.shape[1],
    hidden_dim=64,
    output_dim=32
)

# 训练
for epoch in range(50):
    x1 = factors_norm + noise1
    x2 = factors_norm + noise2
    loss = ssl_model.train_step(x1, x2)

# 4. 提取因子表示
with torch.no_grad():
    embeddings = ssl_model.encode(factors_norm)

# 5. 用于下游任务
# - 因子选择
# - 收益预测
# - 风险建模

📊 因子分类

1. 趋势因子

# 移动平均
ma5 = FactorLibrary.MA(close, 5)
ma20 = FactorLibrary.MA(close, 20)
ma60 = FactorLibrary.MA(close, 60)

# 指数移动平均
ema12 = FactorLibrary.EMA(close, 12)
ema26 = FactorLibrary.EMA(close, 26)

# 趋势斜率
slope = FactorLibrary.SLOPE(close, 20)

2. 动量因子

# 变化率
roc5 = FactorLibrary.ROC(close, 5)
roc10 = FactorLibrary.ROC(close, 10)

# 收益率
returns = FactorLibrary.RETURNS(close, 20)

# RSI
rsi14 = FactorLibrary.RSI(close, 14)

3. 波动率因子

# 标准差
std10 = FactorLibrary.STD(close, 10)
std20 = FactorLibrary.STD(close, 20)

# ATR
atr14 = FactorLibrary.ATR(data, 14)

# 布林带宽度
boll_width = FactorLibrary.BOLL_WIDTH(close, 20)

4. 量价因子

# 成交量均值
vol_ma = FactorLibrary.MA(volume, 20)

# 量价相关性
corr_pv = FactorLibrary.CORRELATION(close, volume, 20)

# OBV
obv = FactorLibrary.OBV(close, volume)

5. 复合因子

# MACD
macd = FactorLibrary.MACD(close, 12, 26, 9)

# KDJ
kdj_k = FactorLibrary.KDJ_K(close, high, low, 9, 3, 3)

# 布林带
boll_upper = FactorLibrary.BOLL_UPPER(close, 20, 2)
boll_lower = FactorLibrary.BOLL_LOWER(close, 20, 2)

💡 最佳实践

1. 因子计算

# ✅ 推荐:批量计算
factors = pd.DataFrame({
    'factor1': FactorLibrary.MA(close, 20),
    'factor2': FactorLibrary.RSI(close, 14),
    'factor3': FactorLibrary.MACD(close),
})

# ❌ 不推荐:逐个计算
factor1 = FactorLibrary.MA(close, 20)
factor2 = FactorLibrary.RSI(close, 14)
factor3 = FactorLibrary.MACD(close)

2. 数据预处理

# 去除NaN
factors = factors.fillna(method='bfill').fillna(0)

# 标准化
factors_norm = (factors - factors.mean()) / factors.std()

# 去极值
factors_clip = factors.clip(lower=factors.quantile(0.01),
                             upper=factors.quantile(0.99))

3. 因子验证

# 计算IC (信息系数)
future_returns = close.shift(-5) / close - 1
ic = factors.corrwith(future_returns)

# 计算IR (信息比率)
ic_mean = ic.mean()
ic_std = ic.std()
ir = ic_mean / ic_std

print(f"IC均值: {ic_mean:.4f}")
print(f"IR: {ir:.4f}")

🎓 示例代码

示例1: 简单因子计算

# 运行轻量级快速开始
cd lightweight
python quick_start.py

示例2: 因子+自监督学习

# 运行集成示例
cd lightweight/examples
python factor_ssl_integration.py

示例3: 企业级批量计算

# 运行企业级示例
cd enterprise
python factor_library.py

📈 性能对比

指标 轻量级 企业级
因子数量 200+ 200+ + Alpha101
计算速度 1x 5-10x (缓存)
内存占用 中 (缓存)
分布式支持
性能监控
批量优化

🔧 自定义因子

Python模式

class CustomFactor(FactorLibrary):
    @staticmethod
    def my_factor(data: pd.DataFrame) -> pd.Series:
        """自定义因子"""
        close = data['close']
        volume = data['volume']
        
        # 计算逻辑
        ma20 = close.rolling(20).mean()
        vol_ma = volume.rolling(20).mean()
        
        return (close - ma20) / vol_ma

# 使用
result = CustomFactor.my_factor(data)

公式模式

# 直接使用公式字符串
formula = """
ma20 = MA(CLOSE, 20)
vol_ma = MA(VOLUME, 20)
result = (CLOSE - ma20) / vol_ma
"""

📚 完整函数列表

基础函数 (8个)

CLOSE, OPEN, HIGH, LOW, VOLUME, AMOUNT, TURNOVER, MARKET_CAP

数学函数 (13个)

ABS, LOG, LOGABS, EXP, AS_FLOAT, RD, SIGN, SIN, COS, TAN, ARCSIN, ARCCOS, ARCTAN

截面函数 (3个)

RANK, SCALE, ZSCORE

时序函数 (50+个)

REF, DELAY, DIFF, DELTA, MA, SUM, STD, VAR, TS_MAX, TS_MIN, EMA, WMA, SLOPE, CORRELATION, COVARIANCE, ...

技术指标 (50+个)

MACD, RSI, KDJ, BOLL, ATR, CCI, DMI, OBV, MFI, TRIX, DPO, BRAR, MTM, ...


🎯 应用场景

1. 因子挖掘

  • 计算大量候选因子
  • 自监督学习降维
  • 因子筛选和组合

2. 策略开发

  • 多因子选股
  • 择时信号
  • 风险控制

3. 研究分析

  • 因子有效性验证
  • 因子相关性分析
  • 因子衰减研究

📞 技术支持

轻量级版本

  • GitHub: PandaQuantFlow
  • 文档: 本文件
  • 示例: lightweight/examples/

企业级版本


🎉 总结

完整因子库: 200+ PandaAI官方函数
自监督集成: 无需标注数据
两个版本: 轻量级 + 企业级
开箱即用: 5分钟快速开始
生产就绪: 经过充分测试

立即开始使用,开启量化因子挖掘之旅! 🚀