How to group the tuples in a Nextflow channel by their first and second elements, run a process once per group on a file named after those two keys, and then scatter the result to restore the original one-file-per-tuple channel:
hc_ch = Channel.of(
    ['abc', 1, file('file1.txt')],
    ['abc', 1, file('file2.txt')],
    ['abc', 1, file('file3.txt')],
    ['def', 2, file('file4.txt')]
)
combinedChannel = hc_ch.groupTuple(by: [0, 1])
Expected structure of combinedChannel:
[ ['abc', 1, [file('file1.txt'), file('file2.txt'), file('file3.txt')]],
['def', 2, [file('file4.txt')]]
]
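process deleteJointCalledVcf {
    input:
    // files is passed as a plain value, so the paths are carried through unstaged
    tuple val(id), val(sub_id), val(files) from combinedChannel

    output:
    tuple val(id), val(sub_id), val(files) into processedChannel

    script:
    """
    # The path is resolved relative to the task work directory;
    # -f keeps the task from failing if the file is not there
    rm -f ${id}_${sub_id}.txt
    """
}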
Expected structure of processedChannel:
[ ['abc', 1, [file('file1.txt'), file('file2.txt'), file('file3.txt')]],
['def', 2, [file('file4.txt')]]
]
flattenedChannel = processedChannel.flatMap { id, sub_id, files ->
    // emit one [id, sub_id, file] tuple per grouped file
    files.collect { [id, sub_id, it] }
}
Expected structure of flattenedChannel:
[ ['abc', 1, file('file1.txt')],
['abc', 1, file('file2.txt')],
['abc', 1, file('file3.txt')],
['def', 2, file('file4.txt')]
]
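The `from`/`into` declarations above are DSL1 syntax, which recent Nextflow releases no longer accept. As a rough sketch only, the same gather, process, scatter pattern could be written in DSL2 as follows. It assumes a Nextflow version where DSL2 is the default. The `workflow` block, the `rm -f` flag, and the final `view()` call are additions for illustration and are not part of the original snippet.

```groovy
// DSL2 sketch of the gather -> process -> scatter pattern.
// Assumption: DSL2 is the default in the Nextflow version being used.

process deleteJointCalledVcf {
    input:
    tuple val(id), val(sub_id), val(files)

    output:
    tuple val(id), val(sub_id), val(files)

    script:
    """
    # Assumption: -f is added so the task does not fail when the file
    # is absent from the task work directory
    rm -f ${id}_${sub_id}.txt
    """
}

workflow {
    hc_ch = Channel.of(
        ['abc', 1, file('file1.txt')],
        ['abc', 1, file('file2.txt')],
        ['abc', 1, file('file3.txt')],
        ['def', 2, file('file4.txt')]
    )

    // Gather: one tuple per (id, sub_id) pair, with the files collected into a list
    combinedChannel = hc_ch.groupTuple(by: [0, 1])

    // Run the process once per group; its single output is returned as a channel
    processedChannel = deleteJointCalledVcf(combinedChannel)

    // Scatter: expand each group back into one [id, sub_id, file] tuple per file
    flattenedChannel = processedChannel.flatMap { id, sub_id, files ->
        files.collect { f -> [id, sub_id, f] }
    }

    // Print the emitted tuples for inspection
    flattenedChannel.view()
}
```

In DSL2 the channels are wired together by calling the process like a function inside `workflow`, so the process body no longer names its input and output channels.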