An apparatus includes a processor. The processor extracts an import column from an external source for import into a database configured to store a set of columns including a first column and a second column. The processor splits the entries of the import column into a set of terms. The processor generates a first vector, a second vector, and a third vector based on the frequency of each term of the set of terms in the first column, the second column, and the import column, respectively. The processor determines a first similarity measure between the first and third vectors and a second similarity measure between the second and third vectors. In response to determining that the first similarity measure is greater than the second similarity measure, the processor provides an indication to a user that the first column is a mapping candidate for the import column, such that entries of the import column may be stored in the database as additional entries in the first column.
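
The following is a minimal, illustrative sketch of the flow described above, not the claimed implementation. It assumes that terms are lowercase word tokens and that the similarity measure is cosine similarity; the abstract specifies neither. The function names (`split_terms`, `frequency_vector`, `cosine_similarity`, `mapping_candidate`) and the sample data are hypothetical.

```python
import math
import re
from collections import Counter


def split_terms(entries):
    """Split column entries into lowercase word terms (tokenization is an assumption)."""
    terms = []
    for entry in entries:
        terms.extend(re.findall(r"\w+", str(entry).lower()))
    return terms


def frequency_vector(entries, vocabulary):
    """Count how often each term of the vocabulary occurs in a column's entries."""
    counts = Counter(split_terms(entries))
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine similarity, assumed here as the similarity measure; 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mapping_candidate(import_column, database_columns):
    """Return the name of the most similar database column and all similarity scores."""
    # The set of terms comes from the import column only, as in the abstract.
    vocabulary = sorted(set(split_terms(import_column)))
    import_vector = frequency_vector(import_column, vocabulary)
    scores = {
        name: cosine_similarity(frequency_vector(entries, vocabulary), import_vector)
        for name, entries in database_columns.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical data: two existing database columns and one column to import.
database_columns = {
    "city": ["New York", "Los Angeles", "New Orleans"],
    "fruit": ["apple", "banana", "cherry"],
}
import_column = ["New York", "Los Angeles"]

best, scores = mapping_candidate(import_column, database_columns)
print(f"Mapping candidate for the import column: {best}")
```

In this sketch the column with the highest score is the one that would be indicated to the user as the mapping candidate; a practical system might also apply a minimum threshold before suggesting any column.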