
Reinforcement Learning Coach environments with Cartpole and Atari Optimized by OpenVINO Toolkit

Introduction

The RL Coach OpenVINO process

Running experiments on RL Coach

          coach -p
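The -p flag selects one of the bundled presets, each of which pairs an environment with an algorithm. As a minimal sketch, assuming the CartPole_DQN preset name (run coach -l first to confirm the exact names on your install):

          coach -p CartPole_DQN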

Installation of the Anaconda distribution

Creation of Anaconda Python 3.5 environment

          conda create -n py35 python=3.5 anaconda
          source activate py35        
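As a quick sanity check (not part of the original walkthrough), confirm the interpreter inside the activated environment:

          python --version

The output should report Python 3.5.x while py35 is active.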

Installation of pre-requisites for Reinforcement Learning Coach

Changing the requirements.txt file
          pip3 install -e .        
          coach -l
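For context, a minimal sketch of the complete installation flow, assuming the Coach sources are cloned from the NervanaSystems/coach repository on GitHub (the editable install must run from the repository root):

          git clone https://github.com/NervanaSystems/coach.git
          cd coach
          pip3 install -e .
          coach -l

coach -l simply prints the available preset names, which is also a handy way to verify that the installation succeeded.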

CartPole-v0
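CartPole-v0 is the classic Gym control task: the agent pushes a cart left or right to keep a pole balanced upright, receiving a reward of 1 for every step the pole stays up. To train it with rendering enabled (preset name assumed, as above):

          coach -r -p CartPole_DQN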

Breakout Game environment
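Breakout is trained here with the NEC preset; the exact command used later in this walkthrough is:

          coach -r -p Atari_NEC -lvl breakout -s 60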

List of presets available in RL Coach

OpenVINO Toolkit Model Optimizer process

                      <INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites                  
                      install_prerequisites_tf.sh                  
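Assuming a default OpenVINO 2018 R4 installation layout, those two steps amount to:

          cd <INSTALL_DIR>/deployment_tools/model_optimizer/install_prerequisites
          ./install_prerequisites_tf.sh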
          model_name.meta
          model_name.index
          model_name.data-00000-of-00001 (the digit part may vary)
          checkpoint (optional)
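After a training run, these files end up in the experiment's checkpoint folder, so a listing such as the following (path taken from the run shown below) should show them:

          ls ~/experiments/Atari_NEC/17_01_2019-03_29/checkpoint/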
          coach -r -p <preset_name>
          coach -s 60
Here -r renders the environment while training, -p selects the preset, and -s 60 saves a checkpoint every 60 seconds; those checkpoints are what the Model Optimizer consumes.

Folders

Reinforcement Learning Coach training process with an environment

          (py35) abhi@abhi-HP-Pavilion-Notebook:~$ coach -r -p Atari_NEC -lvl breakout -s 60
          Please enter an experiment name: Atari_NEC
          Creating graph - name: BasicRLGraphManager
          Creating agent - name: agent
          WARNING:tensorflow:From /home/abhi/anaconda3/envs/py35/lib/python3.5/site-packages/rl_coach/architectures/tensorflow_components/heads/dnd_q_head.py:76: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
          Instructions for updating:
          keep_dims is deprecated, use keepdims instead
          simple_rl_graph: Starting heatup
          Heatup - Name: main_level/agent Worker: 0 Episode: 1 Total reward: 1.0 Exploration: 0.1 Steps: 52 Training iteration: 0
          Heatup - Name: main_level/agent Worker: 0 Episode: 2 Total reward: 0.0 Exploration: 0.1 Steps: 76 Training iteration: 0
          Heatup - Name: main_level/agent Worker: 0 Episode: 3 Total reward: 0.0 Exploration: 0.1 Steps: 98 Training iteration: 0

The training process in progress. With -s 60, checkpoints are written periodically under the experiment folder (here ~/experiments/Atari_NEC/<timestamp>/checkpoint/).
          (py35) abhi@abhi-HP-Pavilion-Notebook:/opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer$ python mo_tf.py --input_meta_graph ~/experiments/Atari_NEC/17_01_2019-03_29/checkpoint/0_Step-605.ckpt.meta
          Model Optimizer arguments:
          Common parameters:
          - Path to the Input Model: None
          - Path for generated IR: /opt/intel/computer_vision_sdk_2018.4.420/deployment_tools/model_optimizer/.
          - IR output name: 0_Step-605.ckpt
          - Log level: SUCCESS
          - Batch: Not specified, inherited from the model
          - Input layers: Not specified, inherited from the model
          - Output layers: Not specified, inherited from the model
          - Input shapes: Not specified, inherited from the model
          - Mean values: Not specified
          - Scale values: Not specified
          - Scale factor: Not specified
          - Precision of IR: FP32
          - Enable fusing: True
          - Enable grouped convolutions fusing: True
          - Move mean values to preprocess section: False
          - Reverse input channels: False
          TensorFlow specific parameters:
          - Input model in text protobuf format: False
          - Offload unsupported operations: False
          - Path to model dump for TensorBoard: None
          - List of shared libraries with TensorFlow custom layers implementation: None
          - Update the configuration file with input/output node names: None
          - Use the configuration file used to generate the model with Object Detection API: None
          - Operations to offload: None
          - Patterns to offload: None
          - Use the config file: None
          Model Optimizer version: 1.4.292.6ef7232d
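Stripped of the machine-specific paths, the invocation that produced this log has the general form below (a sketch; substitute the .ckpt.meta file your own run produced):

          python mo_tf.py --input_meta_graph <experiment_dir>/checkpoint/<checkpoint_name>.ckpt.meta

The Model Optimizer emits an intermediate representation (IR) as a pair of files, an .xml topology description and a .bin weights file, which the inference step below consumes.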

Running inference with our model

          ./rl_coach -m <xmlbin path> -i <algorithm> -d CPU
          ./rl_coach -m 0060.xml -i NEC -d CPU
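Here -m points to the generated IR, -i names the algorithm, and -d selects the inference device, CPU in this case (flag meanings inferred from the example invocations above).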
The CartPole experiment before training
The CartPole experiment after training

Conclusion


Source: https://medium.com/intel-software-innovators/rl-coach-environments-with-cartpole-and-atari-optimized-by-open-vino-toolkit1-6088349bf657
