Integration pipeline fails due to unavailability of external data sources
Summary
The integration pipeline that retrieves GDAS data often fails because access to the RDA is unstable.
For now, I have had to override the requirement that pipelines succeed in order to merge the recent MRs.
We know from past successful pipeline runs that the functionality works. This integration routine may also be outdated, and we may need to bring it closer to our real-life operations:
- Attempt to fetch GDAS data
- In case of failure, activate an alternative scenario and construct a contemporary MDP based on the observation date.
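The two steps above could be sketched roughly as follows. This is only an illustration of the proposed control flow, not existing calibpipe code: `DataSourceUnavailable`, `AtmosphereResult`, `get_atmosphere_input`, and the two injected callables are hypothetical names, and the sketch assumes the GDAS request surfaces an unavailable archive as an exception.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


class DataSourceUnavailable(Exception):
    """Hypothetical error raised when the external archive cannot be reached."""


@dataclass
class AtmosphereResult:
    source: str                  # "gdas" or "fallback"
    payload: object              # whatever the chosen branch produced
    error: Optional[str] = None  # reason for falling back, if any


def get_atmosphere_input(
    observation_date: date,
    fetch_gdas: Callable[[date], object],
    build_fallback_mdp: Callable[[date], object],
) -> AtmosphereResult:
    """Try GDAS first; on failure, build a contemporary MDP instead.

    Both callables are placeholders (assumptions) for the real retrieval
    and fallback routines.
    """
    try:
        return AtmosphereResult("gdas", fetch_gdas(observation_date))
    except DataSourceUnavailable as exc:
        # Keep the failure reason so it still shows up in the pipeline log.
        fallback = build_fallback_mdp(observation_date)
        return AtmosphereResult("fallback", fallback, error=str(exc))
```

With this shape, an integration test can inject a `fetch_gdas` that always raises and check that the result comes from the fallback branch, so the test no longer depends on the RDA being reachable.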
Related to #79 (closed)
What is the expected correct behavior?
A failed data retrieval from GDAS should not be a show-stopper.
Relevant logs and/or screenshots
INFO [workflow ] start
INFO [workflow ] starting step uc-dpps-cp-114
INFO [step uc-dpps-cp-114] start
INFO [job uc-dpps-cp-114] /tmp/ysa_5m40$ request_das \
--grib_download_path=download_dir
2023-05-26 09:36:25,972 rdams ERROR Returned JSON can't be decoded: Expecting value: line 1 column 1 (char 0)
2023-05-26 09:36:25,974 CRITICAL [calibpipe.DataRequest.MolecularAtmosphereCalibrator] (molecular_atmosphere_calibrator.request_rda_data): Request ID can't be retrieved, request can't be purged.Manual intervention is required to purge the request!
Response content:
{}
INFO [job uc-dpps-cp-114] Max memory used: 24MiB
WARNING [job uc-dpps-cp-114] exited with status: 1
ERROR [job uc-dpps-cp-114] Job error:
("Error collecting output for parameter 'grib_dir': uc-dpps-cp-114.cwl:32:7: Did not find output file with glob pattern: ['download_dir'].", {})
WARNING [job uc-dpps-cp-114] completed permanentFail
ERROR [step uc-dpps-cp-114] Output is missing expected field file:///builds/cta-computing/dpps/calibrationpipeline/calibpipe/calibpipe/tests/workflows/atmosphere/uc-dpps-cp-110.cwl#uc-dpps-cp-114/grib_dir
WARNING [step uc-dpps-cp-114] completed permanentFail
INFO [workflow ] completed permanentFail
{
"contemporary_atmo_model": null
}
WARNING Final process status is permanentFail
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: command terminated with exit code 1