Box Logs Processing Job
This job downloads logs from the Box service and makes the data available via Tableau.
Servers
- Dev/Test – eaapi-test-01.uit.tufts.edu
- Prod – eaapi-prod-01.uit.tufts.edu
Job details
- Owned by user apiadm
- Files located at /home/apiadm/boxLogs
- Main script is /home/apiadm/boxLogs/boxLogs.sh
- Cron job scheduled Mon-Fri at 4:00am
- Source code (with further documentation) is in GitLab at https://gitlab.it.tufts.edu/EA-APIs/box-logs-api and https://gitlab.it.tufts.edu/EA-APIs/box-logs-token-app
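For reference, a Mon-Fri 4:00am schedule would typically look like the crontab entry below. This is only a sketch built from the schedule and script path listed above; the actual entry may differ (for example, it may redirect output to a log file), so confirm it by running crontab -l as apiadm.

```
# minute hour day-of-month month day-of-week command
# Assumed entry: 04:00, Monday (1) through Friday (5)
0 4 * * 1-5 /home/apiadm/boxLogs/boxLogs.sh
```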
Known Issues
Next Stream Position is Blank
Occasionally the job fails partway through with the error "Next stream position is blank". This indicates that the connection was interrupted while downloading the log files. The error seems to be more common when there is a large amount of data to download. As a result, a job that fails with this error is likely to fail again when it runs the following day, because the previous day's data is still backlogged.
To resolve this issue, run the script at /home/apiadm/boxLogs/boxLogs.sh manually. It automatically continues processing from the point where it previously left off. If there is a large backlog, it may take a while (30+ minutes), but it will eventually work through all of the data. If the "Next stream position is blank" error occurs again, run the script again. When the job has successfully processed all of the data, it writes a number of JSON files to the 'eventFiles' directory and then prints 'Process complete'.
Steps to Resolve
1. Log in to the eaapi server via SSH
2. sudo su - apiadm
3. cd /home/apiadm/boxLogs
4. ./boxLogs.sh
5. Repeat step 4 if necessary until 'Process complete' is printed
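The "run it again until 'Process complete' is printed" part of the steps above could be wrapped in a small retry helper. The sketch below is not part of the job's source code: the function name run_until_complete and the attempt limit are hypothetical, and it assumes the script prints 'Process complete' on standard output or standard error, as described above.

```shell
# Hypothetical helper: rerun a command until its output contains
# "Process complete", giving up after a maximum number of attempts.
# Usage: run_until_complete <max_attempts> <command> [args...]
# The body runs in a subshell ( ... ) so its variables do not leak
# into the calling shell.
run_until_complete() (
    max_attempts="$1"
    shift
    log="$(mktemp)"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Attempt ${attempt} of ${max_attempts}" >&2
        # Show the output live and keep a copy of this attempt for the check.
        "$@" 2>&1 | tee "$log"
        if grep -q 'Process complete' "$log"; then
            rm -f "$log"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    rm -f "$log"
    echo "No 'Process complete' after ${max_attempts} attempts" >&2
    return 1
)
```

As apiadm, it could then be invoked as: cd /home/apiadm/boxLogs && run_until_complete 5 ./boxLogs.sh. The attempt limit is there so that a persistent failure (for example, a real outage rather than an interrupted download) does not loop indefinitely.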
Successful job completion