The principle of using AI with personal data

Large amounts of data are created and collected all around us. For example, the way people use the many functions of smartphones and internet platforms has already been turned into digital data, so AI and Machine Learning can learn from it and turn it into something useful for business and society. Combined with human analysis, these datasets support forecasting, prediction, and automation, and help create and enhance the user experience, so that a business can keep track of its customers and keep them loyal to the brand.

Because companies use personal data, many countries, including Thailand, have established a common data protection framework so that companies take personal privacy into account when using customers' personal data for personalization. Using customer behavior data while protecting customer privacy is a big challenge for businesses. The most important rule when working with personal data is that the data must be bound to a specific business purpose and its use must stay within the law. In this article, I would like to highlight the mindset and the questions everyone needs to be aware of when collecting and managing personal data as part of Data Governance.

The principle and concept

1. Know your data: that it exists, where it comes from, what is in it, and what it means. It all starts with the 5Ws and 1H: Who, What, Why, When, Where, and How.

  • Who: tells us whose data we have collected, and whether it identifies individuals or has been anonymized.

  • What: what data has been collected and measured. Stick to the practice of data minimization and collect only what the purpose requires.

  • Why: the purpose of collecting the data. It reveals the most about our data, since it also gives insight into the Who, What, and How of the data being used.

  • When: how long the data will be kept, and the plan for deleting it.

  • Where: where the data is stored so that it stays safe.

  • How: how sensitive data is protected and retrieved, for example by encrypting it both on the hard drive (data at rest) and over the network (data in transit), and by practicing pseudonymization. Most importantly, we need to map the data flow from the moment the data is collected until its final use.
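
The What, When, and How questions above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a compliant implementation: the field names, the one-year retention period, and the hard-coded key are all hypothetical, and a real system would load the key from a secure store. Pseudonymization is done here with a keyed hash (HMAC-SHA256), which is one common approach among several.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical values, for illustration only.
SECRET_KEY = b"replace-with-a-key-from-a-secure-store"
ALLOWED_FIELDS = {"customer_id", "age_band", "last_purchase"}
RETENTION = timedelta(days=365)


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """How: replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize(record: dict) -> dict:
    """What: keep only the fields needed for the stated purpose."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


def is_expired(collected_on: date, today: date) -> bool:
    """When: True once a record has been kept longer than the retention period."""
    return today - collected_on > RETENTION


def prepare(record: dict) -> dict:
    """Minimize a record, then pseudonymize its customer_id (assumed present)."""
    slim = minimize(record)
    slim["customer_id"] = pseudonymize(slim["customer_id"])
    return slim
```

Calling `prepare` on a raw record that also contains, say, a name and an email address returns only the three allowed fields, with `customer_id` replaced by a 64-character hex digest. Because the hash is keyed, the same customer always maps to the same pseudonym, but the mapping cannot be recomputed without the key.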

2. Build in data protection by default. This means considering data protection and privacy issues upfront in everything you do along the journey: collecting data, analysing it, passing it on, deleting it, and managing the risks of using it. Doing so can help you comply with the GDPR’s fundamental principles and requirements, and it should be part of Data Governance.

3. Use FAT (Fairness, Accountability, Transparency)

Fairness: the collection of data should reduce bias and ensure fairness towards groups of people. Group fairness and individual fairness work differently in a model: individual fairness requires that each applicant be evaluated independently of any broader context, and this has to be considered when preparing the training data for AI/ML.

Accountability: controllers and processors must take responsibility for their processing activities and for how they comply with data protection principles. Having risk assessments and records in place to demonstrate your compliance is key.

Transparency: the ability to access and work with data no matter where it is located or what application created it.
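
The group side of the fairness point can be checked with a simple measurement. The sketch below compares approval rates between groups, a metric often called demographic parity. It is only one of several fairness metrics and is an assumption of this example, not something the principles above prescribe. The group labels and decisions are made-up sample data.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs; returns the rate per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def parity_gap(decisions):
    """Largest difference in approval rate between any two groups.

    Expects at least one decision; an empty input raises ValueError.
    """
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Hypothetical model decisions for two groups of applicants.
sample = [("A", True), ("A", True), ("A", False), ("A", True),
          ("B", True), ("B", False), ("B", False), ("B", False)]

print(approval_rates(sample))
print(parity_gap(sample))
```

A gap close to zero means the groups are approved at similar rates. A large gap does not prove the model is unfair, but it is a signal to look again at how the training data was collected.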

Overall, the use of personal data should follow the rules that govern it, and other people’s data should be protected just as we would want our own to be.