Sometimes you have a process to load data from a flat file to a database, even from a database to your DW.
But if something goes wrong? How do you know if your load process finished successfully or not? If not, how do you reprocess the files without duplicate records?
This post will bring an option to try addressing those problems.
The main ideas of control process are (1) generate an ID for the process (2) save a timestamp at beginning (3) save ID on all the tables controlled (4) save a timestamp at the end.
So, if some problem occurs and aborts the job, next time you will be able to identify all data inserted by that job, delete them and insert again.
I’m assuming PostgreSQL as Database. For others databases might be necessary some adjusts.