Further automate Merging productions
Merging productions should be further automated. The goal of Merging is to start from a large dataset, e.g. 10000 files, and produce a single merged file. To do so, we currently define one production with, for instance, two merging steps, where:
- the first step groups files by 50 (GroupSize=50) and produces e.g. 10000/50=200 files
- for the 2nd step, we calculate in advance the number of files produced by the 1st step (e.g. 200) and set the GroupSize accordingly (e.g. GroupSize=200), so that only 1 job is created, which merges all the input files and produces a single final merged file
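
The arithmetic behind the two steps can be sketched as follows. This is only an illustration, not the production code: the function name `merged_file_counts` is hypothetical, and it assumes that each job merges up to GroupSize files into exactly one output file and that a trailing partial group still gets its own job.

```python
import math


def merged_file_counts(n_input_files, group_sizes):
    """Return the number of files left after each merging step.

    Hypothetical helper: assumes one job per group of up to `group_size`
    files, one output file per job, and a job for any trailing partial group.
    """
    counts = []
    n_files = n_input_files
    for group_size in group_sizes:
        # Number of jobs (and therefore output files) created by this step
        n_files = math.ceil(n_files / group_size)
        counts.append(n_files)
    return counts


# Example from the description: 10000 files, GroupSize=50 then GroupSize=200
print(merged_file_counts(10000, [50, 200]))  # [200, 1]
```

Under the same assumptions, the precomputed GroupSize only yields a single file if the 1st step produces exactly the expected number of files: `merged_file_counts(10001, [50, 200])` gives `[201, 2]`.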
However, under some conditions this logic does not work (some examples will be given in this issue), and the 2nd step ends up creating more than 1 job and hence more than 1 file.
We should implement a different logic to automatically handle this use case.