Annoyingly, I got saddled with a database for a system which contained a whole lot of “uncleaned data”, in other words data with a lot of trailing and leading, uncessary whitespace characters, as well as a generous helping of apostrophes and quotation marks – the cause of many a headache when working on a web-based application using PHP and MySQL.
Needless to say, inherent relationships and thus joins were failing left, right and center thanks to this badly formed data and as such I needed to quickly whip up a script that would strip all leading and trailing whitespace plus replace any apostrophes or quotation marks with their web code equivalent for each and every value that appeared in every row, column and table in the database.
And this is the little PHP script that did it for me:
//just to help prevent the script from timing out ini_set('memory_limit', '100M'); ini_set('default_socket_timeout',600); ini_set('max_input_time',600); ini_set('max_execution_time',600); //give us something to look at on the screen echo "Start Process..."; //create a database connection $mysql_hostname = 'localhost'; $mysql_user = 'username'; $mysql_password = 'password'; $mysql_database = 'databaseName'; $conn = mysql_connect($mysql_hostname,$mysql_user,$mysql_password); mysql_select_db($mysql_database, $conn); //build up the list of tables to process $sql = "SHOW TABLES"; $tables = array(); echo 'Building audit tables list...'; $tablestemp = $db->query($sql,2); foreach ($tablestemp as $tabletemp) { $tables[] = $tabletemp; } echo 'Audit table list built.'; //run through each table, build up a column list and then run an update SQL statement against the column foreach($tables as $table) { set_time_limit(80); $sql = "SHOW COLUMNS FROM `$table`"; $columns = array(); echo 'Building audit columns list for ' . $table . '...'; $columnstemp = $db->query($sql,2); foreach ($columnstemp as $columntemp) { $columns[] = $columntemp; } echo 'Audit column list for ' . $table . ' built.'; foreach($columns as $column) { set_time_limit(80); //trim and replace in one foul SQL swoop! $sql = "UPDATE `$table` SET `" . $column['Field'] . "` = TRIM(REPLACE(REPLACE(REPLACE(`" . $column['Field'] ."`,"'","'"),'"','"'),'~','''))"; echo "$sql"; mysql_query($sql); } } mysql_close($conn); echo "End Process.";
And there you go, pretty handy little helper script to have in your possession when you are presented a database with some dirty data included inside! :)