Lack of data validation is often cited as the most common weakness found when assessing application security. Employed correctly, however, validation does more than improve security: it also encourages complete input and improves the efficiency, consistency, and accuracy of the data captured by information systems.
These benefits raise the obvious question: “Why isn’t good validation employed everywhere?” The simple truth is that it isn’t easy. As the demands of data capture and tracking grow, validation rules must be developed in sync. Doing so on both the client and server sides of an application is a tedious process, and tedium begets errors.
In this problem, you’ll be provided with two data sets. The first contains nearly three million City / Country pairs extracted from a source system before true data validation was employed. The second, which is known to be correct, maps the country codes in the first set to the corresponding country descriptions. The output of your program should accurately match each entry to its correct (cleaned) city spelling and country description.
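One possible starting point for the matching step is sketched below. It is an illustration only, not a prescribed method: it assumes a reference list of correctly spelled cities per country is available (the problem itself supplies only the country code mapping), and the names COUNTRY_NAMES, KNOWN_CITIES, and clean_pair, along with the sample data and the 0.8 similarity cutoff, are hypothetical.

```python
import difflib

# Hypothetical reference data. Only the code -> name mapping is supplied by
# the problem; the per-country list of known-good city spellings is assumed.
COUNTRY_NAMES = {"US": "United States", "FR": "France"}
KNOWN_CITIES = {
    "US": ["New York", "Chicago", "Houston"],
    "FR": ["Paris", "Lyon"],
}


def clean_pair(city, country_code, cutoff=0.8):
    """Return (cleaned_city, country_name); either may be None if unmatched."""
    code = country_code.strip().upper()
    country = COUNTRY_NAMES.get(code)

    # Compare case-insensitively, but return the canonical spelling.
    by_lower = {name.lower(): name for name in KNOWN_CITIES.get(code, [])}
    matches = difflib.get_close_matches(
        city.strip().lower(), list(by_lower), n=1, cutoff=cutoff
    )
    cleaned = by_lower[matches[0]] if matches else None
    return cleaned, country


print(clean_pair("Chcago", "us"))
```

Restricting candidates to the cities of the given country keeps each comparison small, which matters at the scale of nearly three million input rows. An entry with no sufficiently close candidate comes back as None so it can be handled separately rather than guessed.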
The output should be provided as a pipe-delimited ('|') .txt file. For each transaction in the input file, it should contain four fields: Input_City, Input_CountryCode, Output_City, Output_CountryName. Other descriptions or latitude/longitude pairs may be added as additional output fields.
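As a rough illustration of the output format, the four required fields could be written with Python's standard csv module. This sketch rests on assumptions: the helper name write_output and the sample row are hypothetical, and the header row is included for readability only, since the problem statement does not say whether one is expected.

```python
import csv
import io

# Field order taken from the problem statement.
FIELDS = ["Input_City", "Input_CountryCode", "Output_City", "Output_CountryName"]


def write_output(rows, handle):
    """Write rows of four fields each as pipe-delimited text."""
    writer = csv.writer(handle, delimiter="|", lineterminator="\n")
    writer.writerow(FIELDS)  # assumption: a header row is acceptable
    writer.writerows(rows)


# Demonstration with an in-memory buffer; a real run would pass a file
# opened with open("output.txt", "w", newline="", encoding="utf-8").
buffer = io.StringIO()
write_output([("Chcago", "US", "Chicago", "United States")], buffer)
print(buffer.getvalue(), end="")
```

With the csv module's default quoting, a field that itself contains a '|' character is wrapped in double quotes, so such a value cannot be mistaken for a field boundary.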
You’ll be evaluated solely on the basis of a correct match percentage,