Raw voter data from India comes in an immense variety of layouts and in multiple languages. We parse, cleanse, and normalize the data fields for easy consumption. For example, we split the voter’s relative name and relationship type into separate fields for the mother’s, father’s, or husband’s name. We also derive an approximate year of birth from the voter’s age.
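As a rough illustration of this kind of normalization, the sketch below splits a relative's name into the matching field and estimates a birth year. The input keys (`name`, `relation_type`, `relative_name`, `age`), the one-letter relationship codes, and the `roll_year` parameter are all assumptions made for this example and do not describe the actual source layouts or pipeline.

```python
from typing import Optional

# Hypothetical relationship codes; real rolls vary by layout and language.
RELATION_FIELDS = {
    "F": "father_name",
    "M": "mother_name",
    "H": "husband_name",
}


def normalize_voter(raw: dict, roll_year: int) -> dict:
    """Return a normalized record from one raw voter row (assumed keys)."""
    out: dict = {
        "name": raw["name"].strip(),
        "father_name": None,
        "mother_name": None,
        "husband_name": None,
        "approx_birth_year": None,
    }

    # Route the relative's name into the field that matches the relationship.
    code = str(raw.get("relation_type", "")).strip().upper()
    field: Optional[str] = RELATION_FIELDS.get(code)
    if field:
        out[field] = str(raw.get("relative_name", "")).strip() or None

    # Age is recorded as of the roll year, so the birth year is approximate.
    age = raw.get("age")
    if age is not None:
        out["approx_birth_year"] = roll_year - int(age)

    return out
```

The birth year is only approximate because the roll records age in whole years, so the true value may be off by one.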
In addition to standardizing the raw source data, we significantly enhance the voter roll with data elements derived from India Post. We add administrative divisions, such as Taluk, Circle, and District names, to the datasets. We also add geographic coordinates to each record based on its India PIN (Postal Index Number) Code. The PIN Code is the six-digit code that India Post uses to number post offices.
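A PIN-based enrichment step could look roughly like the following sketch. The lookup table, its single row, and every field name are hypothetical placeholders rather than real India Post data, and the validation assumes a six-digit code whose first digit is not zero.

```python
import re
from typing import Optional

# Six digits, first digit assumed to be non-zero.
PIN_RE = re.compile(r"[1-9][0-9]{5}")

# Placeholder lookup table; the row below is made up for illustration only.
PIN_LOOKUP = {
    "123456": {
        "taluk": "Example Taluk",
        "circle": "Example Circle",
        "district": "Example District",
        "latitude": 12.34,
        "longitude": 56.78,
    },
}


def clean_pin(value) -> Optional[str]:
    """Strip spaces and hyphens; return the PIN or None if it is malformed."""
    digits = re.sub(r"[\s-]", "", str(value))
    return digits if PIN_RE.fullmatch(digits) else None


def enrich(record: dict) -> dict:
    """Attach administrative divisions and coordinates when the PIN is known."""
    pin = clean_pin(record.get("pin_code", ""))
    extra = PIN_LOOKUP.get(pin, {}) if pin else {}
    return {**record, "pin_code": pin, **extra}
```

Records with a malformed or unknown PIN pass through with no added fields, which keeps the enrichment step from inventing a location.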