I'm trying to deduplicate a 14 GB table with 11 million records and 203 columns, and the query times out after about 2 hours.
I'm running an INSERT INTO new_table SELECT DISTINCT <200 columns> FROM old_table;
I also tried chunking it down to process 500,000 records at a time based on an ID range, but that also runs for about 2 hours and then times out.
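For reference, the chunked attempt was along these lines. This is only a sketch: `id`, `col_1`, `col_2` and the range boundaries are placeholders, not the real schema, and the elided column list stands in for the roughly 200 columns in the real statement.

```sql
-- Sketch only: id, col_1, col_2 are placeholder names, not the real schema.
-- The /* ... */ comments stand in for the rest of the ~200 columns.
INSERT INTO new_table (col_1, col_2 /* ..., col_200 */)
SELECT DISTINCT col_1, col_2 /* ..., col_200 */
FROM old_table
WHERE id >= 1 AND id < 500001;  -- next chunk: 500001 up to 1000001, and so on
```

One caveat with this form: DISTINCT only removes duplicates within a single ID range, so rows duplicated across two different ranges would still both be inserted.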
It loses the connection to the server:

SQL Error (2013): Lost connection to MySQL server during query.
Connection to wisegarver.cd4bt5u8vv2s.us-east-2.rds.amazonaws.com closed at 2020-06-12 14:11:32

Affected rows: 0 Found rows: 0 Warnings: 0 Duration for 0 of 1 query: 0.000 sec.

Connecting to wisegarver.cd4bt5u8vv2s.us-east-2.rds.amazonaws.com via MySQL (TCP/IP), username admin, using password: Yes ...
SELECT CONNECTION_ID();
Connected. Thread-ID: 33

Characterset: utf8mb4
I am on AWS RDS MySQL/Aurora.
Any ideas?