Unique Identification Project aka Aadhar Project

What is Unique Identification Project (aka Aadhar Project)?

The Unique Identification project (also known as the Aadhar project), implemented in India, has collected demographic and biometric data from more than 500 million Indian residents, making it the largest biometric identification project of its kind in the world.


For several years, the project’s implementation has been accompanied by controversy over privacy, security, and other issues. The latest developments in the Aadhar project have raised concerns about how it captures, stores, and manages data, and especially about the role played by MongoDB, an American startup.


MongoDB is a non-relational (NoSQL) database startup that raised funds last year from In-Q-Tel, an independent, non-profit investment organization backed by the US Central Intelligence Agency (CIA) and several other US intelligence agencies.


In the past few days, several Indian media reports have quoted political parties and activists in the country who suspect that private data from the Aadhar project has been stolen, pointing directly at the project’s leader, Infosys co-founder Nandan Nilekani.

Some reports also single out MongoDB as a target of criticism.

Governments around the world are increasingly wary of eavesdropping by the US National Security Agency (NSA), and anything with even the slightest connection to the US government’s intelligence agencies now comes under intense scrutiny.

Moreover, with India’s general election coming next year, political debate in the country has reached an unprecedented pitch.


The timing of such allegations could not be worse, at least for this ambitious identification project: Aadhar is waiting for a bill to pass in Parliament this year that would make it a legally recognized body.


I visited the Aadhar project office in Bangalore. According to the staff who briefed me, although some people have alleged that a large contract with MongoDB includes data sharing, Aadhar in fact uses only MongoDB’s open source code, and it does not touch sensitive data.

The meeting was also an opportunity to learn how the largest biometric database on earth currently works and how it deals with security and privacy risks.


In addition, the Unique Identification Authority of India has denied allegations that it shares Indian citizens’ data with any US agency.


What does Aadhar mean for India?

First, some context for any discussion of Aadhar: what does this project mean for a country like India?

More than 500 million people in the country have no formal identification (ID) or similar documents, which leads to many other problems: they cannot receive government subsidies, open bank accounts, apply for loans, obtain driver’s licenses, and so on.

The Aadhar database is currently growing at a rate of 1 million Indian residents per day. About 1.2 billion people are expected to be registered by the end of next year, making it the largest biometric database on earth.


The biggest advantage of the 12-digit Aadhar number is that the government can link poor residents to bank accounts and pay direct cash benefits and other subsidies by bank transfer. At present, nearly 40 million bank accounts in India have been matched with Aadhar data.


According to a report by market research agency CLSA, more than 40% of the $250 billion in subsidies and other benefits that the Indian government directs at the country’s poor will be lost to government corruption over the next few years.

The Aadhar plan can cut out the intermediaries in this process and transfer cash directly to those who need government subsidies, thereby curbing corruption.


But think tanks and activists, including the Centre for Internet & Society in Bangalore, remain skeptical about privacy and even question how effective the entire project can be.


Inside the world's largest biometric database

I had been trying to meet Aadhar project officials to understand the security issues, the current progress, and their reaction to criticism of the use of MongoDB technology.


On Friday, Aadhar finally agreed to meet me at its headquarters in the southern suburbs of Bangalore. The Indian headquarters of Intel and Cisco are located in the same area.

From the outside, the Aadhar Technology Center, which stores all of the enrollment data (currently 5 petabytes), does not look like a government building at all; it could easily be mistaken for one of the nearby Intel or Cisco office buildings.


Inside, I came to a room in a central location with a dozen TV screens.

Several engineers in their twenties sat in front of the screens, tapping intently on their keyboards and querying the transmission and storage status of data packages. The whole scene looked much like an advanced control center.

The screens they watched showed the records of these data packages (each about 5 MB), which originate at 30,000 enrollment centers across the country and pass through at least three verification steps.

Verification of Aadhaar Number

The verification process checks every file repeatedly to ensure that an Aadhar number is never generated twice for the same person.


In other words, every time a new record is created, a "deduplication" test must be run against all existing records, which now number more than 500 million.
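The deduplication idea can be illustrated with a deliberately simplified sketch. It assumes each person is represented by a short numeric feature vector and that two vectors belong to the same person when their cosine similarity exceeds a threshold. The function names, the vectors, and the 0.95 threshold are all hypothetical, chosen for illustration; the article does not describe how Aadhar actually matches biometrics.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate(
    new_template: List[float],
    enrolled: Dict[str, List[float]],
    threshold: float = 0.95,  # hypothetical cut-off, not an Aadhar parameter
) -> Optional[str]:
    """Compare the new template against every enrolled one (a 1:N check)."""
    for record_id, template in enrolled.items():
        if cosine_similarity(new_template, template) >= threshold:
            return record_id
    return None


def enroll(
    record_id: str,
    template: List[float],
    enrolled: Dict[str, List[float]],
    threshold: float = 0.95,
) -> bool:
    """Add the record only if no existing record matches it."""
    if find_duplicate(template, enrolled, threshold) is not None:
        return False
    enrolled[record_id] = template
    return True


enrolled: Dict[str, List[float]] = {}
first = enroll("rec-001", [0.9, 0.1, 0.3], enrolled)
second = enroll("rec-002", [0.1, 0.8, 0.5], enrolled)
# Nearly identical to rec-001, so it is rejected as a duplicate.
repeat = enroll("rec-003", [0.91, 0.1, 0.29], enrolled)
```

The linear scan makes the cost visible: each new enrollment is compared with every record already stored, which is why a check against hundreds of millions of records is a serious engineering problem.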


Srikanth Nadhamuni, a former Intel engineer who helped design Aadhar's technology platform in September 2010 and now works at Khosla Labs in Bangalore, told me that these data packages are all stored with 2048-bit encryption, and that any unauthorized access attempt triggers a self-destruct function.


Criticisms against MongoDB

So why did Aadhar work with MongoDB in the first place? Will this cooperative relationship continue?


Sudhir Narayana, assistant director general of the Aadhar Technology Center, said that MongoDB was just one of several products initially selected for data retrieval. The others include MySQL, Hadoop, and HBase. Unlike MySQL, which stores only demographic data, MongoDB can also store images.


Aadhar later moved most of its database work to MySQL, however, after realizing that MongoDB could not cope with data at this scale, that is, millions of data packages.


At present, they use database sharding: data packages are stored across different machines so that the system does not crash as the volume of data grows.
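Sharding in general can be sketched in a few lines. The example below is a minimal illustration, assuming hash-based placement: a stable hash of a package ID selects one of several shards, and each shard is modeled as an in-memory dictionary standing in for a separate machine. The class and function names are hypothetical, and the article does not say which sharding scheme Aadhar uses.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(package_id: str, num_shards: int) -> int:
    """Map a package ID to a shard index using a stable hash."""
    digest = hashlib.sha256(package_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy store in which each dict stands in for one database machine."""

    def __init__(self, num_shards: int) -> None:
        self.num_shards = num_shards
        self.shards: List[Dict[str, bytes]] = [{} for _ in range(num_shards)]

    def put(self, package_id: str, payload: bytes) -> int:
        """Store the payload and return the shard index it landed on."""
        index = shard_for(package_id, self.num_shards)
        self.shards[index][package_id] = payload
        return index

    def get(self, package_id: str) -> Optional[bytes]:
        """Look only in the shard that the ID hashes to."""
        index = shard_for(package_id, self.num_shards)
        return self.shards[index].get(package_id)


store = ShardedStore(num_shards=4)
for n in range(1000):
    store.put(f"package-{n}", b"enrollment data")

shard_sizes = [len(shard) for shard in store.shards]
```

Because every read and write touches only one shard, no single machine has to hold or search the whole data set. Simple modulo placement does force most keys to move when the number of shards changes, which is one reason production systems often use range-based or consistent-hashing schemes instead.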


This approach helped Aadhar reduce its dependence on MongoDB and store most of its data in MySQL instead.


Ashok Dalwai, deputy director general of the Aadhar Technology Center, told me that MongoDB has no access to any biometric data.


"We believe that the use of open source technology can avoid over-reliance on a certain supplier, but this does not mean that we compromise on security in any way," Ashok Dalwai said.


When contacted for comment, a MongoDB spokesperson referred us to the statement on the company’s website about the investment from In-Q-Tel.


More importantly, the Unique Identification Authority of India (UIDAI) started using MongoDB’s open source software long before the startup received investment from In-Q-Tel.

Crunchbase data shows that it was only in 2012 that MongoDB raised a total of $7.7 million from Red Hat, Intel Capital, and In-Q-Tel.


What is the outlook for Aadhar?

Controversy aside, Aadhar aims to finish enrolling more than 1.2 billion Indian residents in 2014, by which point the database will total 15 petabytes.

The project is currently enrolling 1 million people per day. Starting next year, the rate will rise to approximately 2 million people per day, bringing the remaining 700 million people into the database.
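As a rough check on these figures, the short calculation below estimates how long the remaining enrollment would take at each rate, and what average storage per person the quoted database sizes imply. It is back-of-the-envelope arithmetic only, assuming decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and constant daily rates; the helper names are hypothetical.

```python
REMAINING_PEOPLE = 700_000_000
CURRENT_RATE_PER_DAY = 1_000_000
PLANNED_RATE_PER_DAY = 2_000_000


def days_to_enroll(people: int, per_day: int) -> int:
    """Whole days needed at a constant daily rate (ceiling division)."""
    return -(-people // per_day)


def average_mb_per_person(petabytes: int, people: int) -> float:
    """Average storage per person in MB, using decimal units."""
    total_bytes = petabytes * 10**15
    return total_bytes / people / 10**6


days_at_current_rate = days_to_enroll(REMAINING_PEOPLE, CURRENT_RATE_PER_DAY)
days_at_planned_rate = days_to_enroll(REMAINING_PEOPLE, PLANNED_RATE_PER_DAY)

# 5 PB for about 500 million people today, 15 PB for 1.2 billion at completion.
mb_per_person_now = average_mb_per_person(5, 500_000_000)
mb_per_person_target = average_mb_per_person(15, 1_200_000_000)

print(days_at_current_rate, days_at_planned_rate)
print(mb_per_person_now, mb_per_person_target)
```

At 1 million people per day the remaining 700 million would take about 700 days, so finishing within 2014 depends on reaching the higher rate, which brings it down to about 350 days. The storage figures work out to roughly 10 MB per person now and 12.5 MB at completion, both larger than the 5 MB package size quoted earlier, which suggests the totals include more than one raw package per person.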
