harveybc/predictor
Predictor

Description

The predictor project is a comprehensive tool for timeseries prediction, equipped with a robust plugin architecture. This project allows for both local and remote configuration handling, as well as replicability of experimental results. The system can be extended with custom plugins for various types of neural networks, including artificial neural networks (ANN), convolutional neural networks (CNN), long short-term memory networks (LSTM), and transformer-based models. Examples of the aforementioned models are included alongside historical EURUSD and other training data in the examples directory.

Installation Instructions

To install and set up the predictor application, follow these steps:

  1. Clone the Repository:

    git clone https://github.com/harveybc/predictor.git
    cd predictor
  2. Add the cloned directory to the Windows or Linux PYTHONPATH environment variable:

On Windows, you may need to close and reopen the command prompt for the PYTHONPATH variable to take effect. Confirm that you added the directory to the PYTHONPATH with the following commands:

  • On Windows, run:

    echo %PYTHONPATH%
  • On Linux, run:

    echo $PYTHONPATH 

If the cloned repo directory appears in the PYTHONPATH, continue to the next step.
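On Linux, the step above can be done for the current shell session with a sketch like the following (the clone path shown is an example; substitute your actual clone location, and on Windows set the variable via System Properties or `setx` instead):

```shell
# Append the cloned repository to PYTHONPATH for the current session
# (replace ${HOME}/predictor with the directory you cloned into).
export PYTHONPATH="${PYTHONPATH}:${HOME}/predictor"

# To make the change permanent, add the line above to ~/.bashrc or ~/.profile.
echo "$PYTHONPATH"
```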

  3. Create and Activate a Virtual Environment (Anaconda is required):

    • Using conda:
      conda create --name predictor-env python=3.9
      conda activate predictor-env
  4. Install Dependencies:

    pip install --upgrade pip
    pip install -r requirements.txt
  5. Build the Package:

    python -m build
  6. Install the Package:

    pip install .
  7. (Optional) Run the predictor:

    • On Windows, run the following command to verify the installation (it uses all default values; use predictor.bat --help for a complete description of the command-line arguments):

      predictor.bat --load_config examples\config\phase_1\phase_1_ann_6300_1h_config.json
    • On Linux, run:

      sh predictor.sh --load_config examples/config/phase_1/phase_1_ann_6300_1h_config.json
  8. (Optional) Run Tests: passing the remote tests requires a running instance of harveybc/data-logger.

    • On Windows, run the following commands to run the tests:
      set_env.bat
      pytest
  9. (Optional) Generate Documentation:

    • Run the following command to generate code documentation in HTML format in the docs directory:
      pdoc --html -o docs app
  10. (Optional) Install Nvidia CUDA GPU support:

Please read: Readme - CUDA

Usage

Example config JSON files are located in examples\config. For a list of the individual parameters that can be set via the CLI or in a config JSON file, use: predictor.bat --help

After executing the prediction pipeline, the predictor will generate 4 files:

  • output_file: csv file, predictions for the selected time_horizon (see defaults in app\config.py)
  • results_file: csv file, aggregated results for the configured number of iterations of the training with the selected number of training epochs
  • loss_plot_file: png image, the plot of error vs epoch for training and validation in the last iteration
  • model_plot_file: png image, the plot of the used Keras model

The application supports several command-line arguments to control its behavior, for example:

usage: predictor.bat --load_config examples\config\phase_1\phase_1_ann_6300_1h_config.json --epochs 100 --iterations 5

There are many example config files in the examples\config directory, as well as training data for EURUSD and other timeseries in examples\data. The results of the example config files are written to examples\results, and there are scripts to automate running sequential predictions in examples\scripts.

Distributed NEAT Optimization (via DOIN Network)

The predictor integrates with doin-node for distributed NEAT hyperparameter optimization using an island-model approach. Multiple GPU nodes collaboratively optimize TCN model parameters, sharing champions via blockchain.

Data Format

Input CSVs must contain only two columns: DATE_TIME and the target column (e.g., typical_price). All additional features (temporal encodings, window statistics) are generated online by the stl_preprocessor plugin during training, controlled by NEAT-optimizable parameters.

DATE_TIME,typical_price
2024-01-01 00:00:00,1.10234
2024-01-01 04:00:00,1.10156
...
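As a quick sanity check before training, input files can be validated against this two-column shape. The sketch below uses only the standard library; `validate_input_csv` is a hypothetical helper for illustration, not part of the predictor API:

```python
import csv
import io

def validate_input_csv(text, target_column="typical_price"):
    """Return True if the CSV has exactly a DATE_TIME column plus one numeric target column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if header != ["DATE_TIME", target_column]:
        return False
    for row in reader:
        if len(row) != 2:          # extra feature columns are not allowed
            return False
        float(row[1])              # raises ValueError if the target is not numeric
    return True

sample = """DATE_TIME,typical_price
2024-01-01 00:00:00,1.10234
2024-01-01 04:00:00,1.10156
"""
print(validate_input_csv(sample))  # → True
```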

NEAT-Optimizable Parameters

The NEAT optimizer can evolve these parameters (defined in hyperparameter_bounds):

| Parameter | Range | Description |
|---|---|---|
| window_size | [48, 160] | Input sliding window length |
| tcn_filters | [16, 128] | TCN convolutional filters |
| tcn_kernel_size | [2, 7] | TCN kernel size |
| tcn_stack_layers | [1, 4] | TCN residual stacks |
| tcn_dilations_per_stack | [2, 6] | Dilations per stack |
| tcn_head_layers | [1, 3] | Dense head layers per horizon |
| tcn_head_units | [16, 64] | Units per head layer |
| use_temporal_features | [0, 1] | Enable sincos temporal features (hod/dow/moy) |
| hod_encoding | [0, 2] | Hour-of-day encoding: 0=none, 1=onehot, 2=sincos |
| dow_encoding | [0, 2] | Day-of-week encoding |
| moy_encoding | [0, 2] | Month-of-year encoding |
| add_window_stats | [0, 1] | Enable rolling std/ema/price-minus-ema features |
| add_multi_scale_returns | [0, 1] | Enable multi-scale return features |
| loss_type | [0, 4] | Loss: 0=mae, 1=huber, 2=trend_sigma, 3=pearson, 4=soft_dtw |
| use_log1p_features | [0, 1] | Apply log1p transform to target column |
| positional_encoding | [0, 1] | Sinusoidal positional encoding on input |
| learning_rate | [1e-5, 1e-2] | AdamW learning rate |
| batch_size | [16, 64] | Training batch size |
| tcn_dropout | [0.0, 0.3] | Dropout rate |
| l2_reg | [1e-7, 1e-3] | L2 regularization |

The base model starts with 7 input features: 1 price + 6 temporal sincos features (when use_temporal_features=1 with sincos encodings). NEAT can optionally add 6 more window-stats features (rolling_std, rolling_ema, and price_minus_ema for 2 periods) by evolving add_window_stats=1.
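As an illustration, the hyperparameter_bounds section of the optimization config might look like the sketch below. The key names follow the table above, but the exact schema of your config file may differ, so treat this as an assumption rather than the canonical format:

```json
{
  "hyperparameter_bounds": {
    "window_size": [48, 160],
    "tcn_filters": [16, 128],
    "tcn_kernel_size": [2, 7],
    "learning_rate": [1e-5, 1e-2],
    "batch_size": [16, 64],
    "tcn_dropout": [0.0, 0.3]
  }
}
```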

GPU Environment

For NVIDIA GPUs, set these environment variables before launching to prevent GPU memory pre-allocation:

export TF_FORCE_GPU_ALLOW_GROWTH=true    # MUST be "true", NOT "1" (TF rejects "1" silently)
export TF_GPU_ALLOCATOR=cuda_malloc_async

Without these, the parent process allocates all GPU memory, leaving none for subprocess candidates.

If CUDA was installed via pip install tensorflow[and-cuda] (no system /usr/local/cuda), you also need:

NB=$CONDA_PREFIX/lib/python3.12/site-packages/nvidia
export LD_LIBRARY_PATH="${NB}/cudnn/lib:${NB}/cublas/lib:${NB}/cuda_runtime/lib:${NB}/cufft/lib:${NB}/curand/lib:${NB}/cusolver/lib:${NB}/cusparse/lib:${NB}/cuda_cupti/lib:${NB}/nvjitlink/lib:${NB}/cuda_nvrtc/lib:${NB}/nccl/lib"

Without LD_LIBRARY_PATH, TensorFlow silently falls back to CPU (check with nvidia-smi — 0% GPU means it's not working).
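When launching from Python rather than a shell, the same variables can be set programmatically, as in this sketch. They must be assigned before tensorflow is imported, because TF reads them at import time:

```python
import os

# TF reads these at import time, so set them before `import tensorflow`.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"   # the literal string "true", not "1"
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"

# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))  # an empty list means TF fell back to CPU
```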

Optimization Config

The optimization config file (e.g., examples/config/phase_1_daily/optimization/phase_1_tcn_neat_1d_optimization_config.json) defines:

  • Data files (train/val/test CSVs)
  • Plugin selection (tcn, neat_optimizer, stl_preprocessor, stl_pipeline)
  • NEAT parameters (population_size, n_generations, mutation rates)
  • Hyperparameter bounds
  • Default values for non-optimized parameters

Running Locally (Single Node)

export TF_FORCE_GPU_ALLOW_GROWTH=true
export TF_GPU_ALLOCATOR=cuda_malloc_async

predictor --load_config examples/config/phase_1_daily/optimization/phase_1_tcn_neat_1d_optimization_config.json

Running Distributed (DOIN Network)

See the doin-node README for multi-node deployment instructions.

Champion Training (No Optimization)

To retrain the best solution found by the distributed optimization as a standalone candidate:

predictor --load_config examples/config/phase_1_daily/phase_1_tcn_neat_champion_1d_training_config.json

Optimization Results & Metabase Integration

Results are stored in examples/results/phase_1_daily/:

| File | Description |
|---|---|
| phase_1_tcn_neat_1d_optimization_stats.json | Per-generation statistics (champion fitness, MAE, species count) |
| phase_1_tcn_neat_1d_optimization_parameters.json | Best champion hyperparameters found |
| phase_1_tcn_neat_1d_optimization_resume.json | Full NEAT population state for resuming optimization |
| phase_1_tcn_neat_1d_rss.csv | Memory usage log per candidate evaluation |

The blockchain SQLite database from doin-node contains the full experiment history across all nodes and can be imported into Metabase for visualization. See the doin-node README for Metabase setup instructions.

Directory Structure

predictor/
│
├── app/                                 # Main application package
│   ├── __init__.py                     # Package initialization
│   ├── cli.py                          # Command-line interface handling
│   ├── config.py                       # Default configuration values
│   ├── config_handler.py               # Configuration management
│   ├── config_merger.py                # Configuration merging logic
│   ├── data_handler.py                 # Data loading and saving functions
│   ├── data_processor.py               # Core data processing pipeline
│   ├── main.py                         # Application entry point
│   ├── plugin_loader.py                # Dynamic plugin loading system
│   ├── reconstruction.py               # Data reconstruction utilities
│   └── plugins/                        # Prediction plugins directory
│       ├── predictor_plugin_ann.py     # Artificial Neural Network plugin
│       ├── predictor_plugin_cnn.py     # Convolutional Neural Network plugin
│       ├── predictor_plugin_lstm.py    # Long Short-Term Memory plugin
│       └── predictor_plugin_transformer.py # Transformer model plugin
│
├── tests/                              # Test suite directory
│   ├── __init__.py                    # Test package initialization
│   ├── conftest.py                    # pytest configuration
│   ├── acceptance_tests/              # User acceptance tests
│   ├── integration_tests/             # Integration test modules
│   ├── system_tests/                  # System-wide test cases
│   └── unit_tests/                    # Unit test modules
│
├── examples/                           # Example files directory
│   ├── data/                           # Example training data
│   └── scripts/                        # Example execution scripts
│
├── setup.py                           # Package installation script
├── predictor.bat                      # Windows execution script
├── predictor.sh                       # Linux execution script
├── set_env.bat                        # Windows environment setup
├── set_env.sh                         # Linux environment setup
├── requirements.txt                    # Python dependencies
├── LICENSE.txt                        # Project license
└── prompt.txt                         # Project documentation

Example of a plugin model (Mermaid diagram):

graph TD

    subgraph SP_Input ["Input Processing (Features Only)"]
        I[/"Input (ws, num_channels)"/] --> FS{"Split Features"};

        subgraph SP_Branches ["Feature Branches (Parallel)"]
             FS -- Feature 1 --> F1_FLAT["Flatten"] --> F1_DENSE["Dense x M"];
             FS -- ... --> F_DOTS["..."];
             FS -- Feature n --> Fn_FLAT["Flatten"] --> Fn_DENSE["Dense x M"];
        end

        F1_DENSE --> M{"Merge Concat Features"};
        F_DOTS --> M;
        Fn_DENSE --> M;
    end

    subgraph SP_Heads ["Output Heads (Parallel)"]

        subgraph Head1 ["Head for Horizon 1"]
            M --> H1_DENSE["Dense x K"];
            H1_DENSE --> H1_BAYES{"DenseFlipout (Bayesian)"};
            H1_DENSE --> H1_BIAS["Dense (Bias)"];
            H1_BAYES --> H1_ADD{"Add"};
            H1_BIAS --> H1_ADD;
            H1_ADD --> O1["Output H1"];
        end

         subgraph HeadN ["Head for Horizon N"]
            M --> HN_DENSE["Dense x K"];
            HN_DENSE --> HN_BAYES{"DenseFlipout (Bayesian)"};
            HN_DENSE --> HN_BIAS["Dense (Bias)"];
            HN_BAYES --> HN_ADD{"Add"};
            HN_BIAS --> HN_ADD;
            HN_ADD --> ON["Output HN"];
        end

    end

    O1 --> Z((Final Output List));
    ON --> Z;

    subgraph Legend
         NoteM["M = config['intermediate_layers']"];
         NoteK["K = config['intermediate']"];
         NoteNoFB["NOTE: Diagram simplified - Feedback loops not shown."];
    end

    classDef bayes fill:#556B2F,stroke:#333,color:#fff;
    classDef bias fill:#4682B4,stroke:#333,color:#fff;
    classDef legendNote fill:#8B4513,stroke:#333,stroke-dasharray:5 5,color:#fff;
    class H1_BAYES,HN_BAYES bayes;
    class H1_BIAS,HN_BIAS bias;
    class NoteM,NoteK,NoteNoFB legendNote;

About

A predictor that uses a configurable plugin-based supervised learning model to make forecasts for a configurable time horizon in a timeseries, using heterogeneous multivariate timeseries data as input. The input data needs to be aligned with the timeseries used as training signals. Includes 4 built-in deep-learning predictive models.
