Monitor job
This page explains how to monitor a submitted job.
After you submit a job on the platform, whether training or validation, you can view live metrics for your most recently submitted job on the website at the /livetraining route. Training and validation jobs use the same route. From /livetraining you can also download the job as a ZIP file and abort the submitted job.
Training
Metrics available:
Status - Shows the current stage of the submitted job. Possible values include INITIALIZATION, IN PROCESS, TRAINING, ERROR OCCURED, UPLOADING RESULTS, and FINISHED.
Site-wise steps - After the training process starts, a table on the dashboard shows the steps completed versus the total steps for each client site.
Validation accuracy vs epochs - Under the Show Metrics button, you can see the TensorBoard metrics for the global model's validation accuracy at each site, by epoch.
Validation loss vs steps - Under the Show Metrics button, you can see the TensorBoard metrics for the validation loss against local steps for each site.
Communication Logs - Under the Show Logs button, you can see the live communication logs of the federated learning process, which give you insight into the current state of the process.
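The site-wise steps table reports steps completed against total steps for each client site. If you copy those numbers out of the dashboard, a small helper like the one below could turn them into a completion percentage per site. This is only an illustrative sketch: the function name, the site names, and the step counts are hypothetical, and the platform is not assumed to expose these values through any API.

```python
def site_progress(steps_by_site):
    """Return the completion percentage for each client site.

    steps_by_site maps a site name to a (completed, total) pair,
    copied by hand from the site-wise steps table.
    """
    progress = {}
    for site, (completed, total) in steps_by_site.items():
        # Guard against a site whose total is not yet known (zero).
        progress[site] = 100.0 * completed / total if total else 0.0
    return progress


# Hypothetical values, not taken from a real job.
example = {"site-1": (50, 200), "site-2": (200, 200)}
progress = site_progress(example)
for site, percent in progress.items():
    print(f"{site}: {percent:.1f}% of steps completed")
```

A site at 100% has finished its local steps, while a site that stays well behind the others may be worth checking in the communication logs.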
Validation
Metrics available:
Status - Shows the current stage of the submitted job. Possible values include INITIALIZATION, IN PROCESS, VALIDATION, ERROR OCCURED, UPLOADING RESULTS, and FINISHED.
Communication Logs - Under the Show Logs button, you can see the live communication logs of the federated validation process, which give you insight into the current state of the process.
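If you track job status in your own notes or scripts, the Status values listed on this page can be modeled as a small enumeration. The sketch below is hypothetical: the class and function names are made up for illustration, the labels are copied as they appear on this page, and treating ERROR OCCURED and FINISHED as the only end states is an assumption rather than documented platform behavior.

```python
from enum import Enum


class JobStatus(Enum):
    """Status labels as listed on this page (spelling kept as shown)."""
    INITIALIZATION = "INITIALIZATION"
    IN_PROCESS = "IN PROCESS"
    TRAINING = "TRAINING"        # training jobs only
    VALIDATION = "VALIDATION"    # validation jobs only
    ERROR_OCCURED = "ERROR OCCURED"
    UPLOADING_RESULTS = "UPLOADING RESULTS"
    FINISHED = "FINISHED"


# Assumption: a job stops changing state once it reaches one of these.
END_STATES = {JobStatus.ERROR_OCCURED, JobStatus.FINISHED}


def is_end_state(label):
    """Return True if the status label means the job is no longer running."""
    return JobStatus(label.strip().upper()) in END_STATES


print(is_end_state("FINISHED"))
print(is_end_state("UPLOADING RESULTS"))
```

A label that is not in the list raises a ValueError, which makes an unexpected status easy to notice.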