Troubleshooting steps to start training on Training and Learning Suite 2.0.
The training process was started with the following steps, but training did not start and no error messages were shown.
- Installed Training and Learning Suite 2.0
- Uploaded and labeled images
- Selected base model
- Clicked the play button to start training
Two issues can prevent the training process from starting. Run sudo docker logs -f tls_core to print the container log and check the error message to see which issue applies.
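The log can be long, so it may help to filter it. The following is a minimal sketch, assuming the RDB write errors described further down contain the text "RDB" somewhere in the line; the exact wording of the log messages is not given here, and filter_rdb_errors is a hypothetical helper name.

```shell
# filter_rdb_errors: reads log lines on stdin and prints only those that
# mention "rdb" (case-insensitive). The match pattern is an assumption.
filter_rdb_errors() {
    grep -i 'rdb'
}

# Example (without -f so the command exits; 2>&1 also captures the
# container's stderr stream):
#   sudo docker logs tls_core 2>&1 | filter_rdb_errors
```

If nothing is printed, the log contains no line matching the pattern, and the restarting-container case is the more likely one to check.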
Follow these steps if the container for the tlsapiui:2.0 docker image is restarting:
- Open a terminal, navigate to the training-learning-suite-2.0/webservices/components/cvat directory, and run the following commands:
sudo docker-compose down
sudo -E docker-compose -f docker-compose.yml -f ../../../docker-compose.cvat.override.yml up -d
- From the terminal, navigate to the training-learning-suite-2.0/ directory and run the following commands:
sudo docker-compose down
sudo -E docker-compose up -d
- Verify that the docker container is no longer restarting:
sudo docker ps
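Scanning the docker ps table by eye is easy to get wrong when many containers are listed. As a rough sketch, the check can be narrowed to containers that are currently restarting. It assumes the status text printed by docker ps begins with the word "Restarting" for such containers; restarting_containers is a hypothetical helper name.

```shell
# restarting_containers: reads "NAME STATUS..." lines on stdin and prints
# the names of containers whose status starts with "Restarting".
restarting_containers() {
    awk '$2 == "Restarting" { print $1 }'
}

# Example:
#   sudo docker ps -a --format '{{.Names}} {{.Status}}' | restarting_containers
```

No output means no container is in the restarting state. The built-in filter sudo docker ps --filter "status=restarting" gives a similar result without the helper.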
Follow these steps if you are seeing RDB write errors:
- In the TLS2.0 directory, update the tlsredis.Dockerfile as below:
Change FROM redis:6-alpine to FROM redis:6.0-alpine
- Go to the thirdparty/security directory and run ls -l to list all the files. By default, the following 6 files have the userID and groupID tls:tls:
-rw-r--r-- 1 tls tls 1419 Feb 19 14:30 TLS_apiui_cert.crt
-rw------- 1 tls tls 1675 Feb 19 14:30 TLS_apiui_key.pem
-rw-r--r-- 1 tls tls 1590 Feb 19 14:30 TLS_core_cert.crt
-rw------- 1 tls tls 2455 Feb 19 14:30 TLS_core_key.pem
-rw-r--r-- 1 tls tls 2228 Feb 19 14:30 TLS_server_cert.crt
-rw------- 1 tls tls 4803 Feb 19 14:30 TLS_server_key.pem
- Run the following commands to change the userID and groupID to your default userID and groupID before rebuilding the redis container:
sudo chown <userID>:<groupID> TLS_apiui_cert.crt
sudo chown <userID>:<groupID> TLS_apiui_key.pem
sudo chown <userID>:<groupID> TLS_core_cert.crt
sudo chown <userID>:<groupID> TLS_core_key.pem
sudo chown <userID>:<groupID> TLS_server_cert.crt
sudo chown <userID>:<groupID> TLS_server_key.pem
- Navigate to the TLS2.0 directory and rebuild the docker container:
sudo docker-compose build --no-cache tls_redis
- After the build completes successfully, navigate to the thirdparty/security directory and revert the 6 files to the default userID and groupID, tls:tls, using the same chown command as before:
sudo chown tls:tls TLS_apiui_cert.crt
sudo chown tls:tls TLS_apiui_key.pem
sudo chown tls:tls TLS_core_cert.crt
sudo chown tls:tls TLS_core_key.pem
sudo chown tls:tls TLS_server_cert.crt
sudo chown tls:tls TLS_server_key.pem
- Navigate to the main TLS2.0 directory and restart the containers:
sudo docker-compose down
sudo -E docker-compose up -d
- Verify that all containers are up and running, then proceed to the TLS2.0 web interface:
sudo docker ps
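The two ownership changes in the steps above (to your own user before the rebuild, and back to tls:tls afterwards) repeat the same chown command for six files. A small helper can reduce typing mistakes. This is only a sketch for sh or bash: set_cert_owner, TLS_CERT_FILES, and the SUDO variable are hypothetical names introduced for illustration, and the file names are taken from the ls -l listing above.

```shell
# The six certificate and key files listed in thirdparty/security.
TLS_CERT_FILES="TLS_apiui_cert.crt TLS_apiui_key.pem TLS_core_cert.crt TLS_core_key.pem TLS_server_cert.crt TLS_server_key.pem"

# set_cert_owner OWNER DIR
# Runs chown OWNER on each of the six files in DIR and stops at the first
# failure. SUDO is a hypothetical variable: set SUDO=sudo when root
# privileges are needed, or leave it unset to run chown directly.
set_cert_owner() {
    owner="$1"
    dir="$2"
    for f in $TLS_CERT_FILES; do
        ${SUDO:-} chown "$owner" "$dir/$f" || return 1
    done
}

# Example, run from the TLS2.0 directory:
#   SUDO=sudo set_cert_owner "$(id -u):$(id -g)" thirdparty/security
#   sudo docker-compose build --no-cache tls_redis
#   SUDO=sudo set_cert_owner tls:tls thirdparty/security
```

Here id -u and id -g print the numeric user and group IDs of the current user, which stand in for the <userID>:<groupID> placeholders.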