On May 1, the White House released a report: "Big Data and Privacy: A Technological Perspective".
The report was prepared by the President's Council of
Advisors on Science and Technology (PCAST), a group of leading scientists and
engineers who make policy recommendations to the President on important issues.
The President had asked PCAST to prepare a report on the privacy implications
of Big Data.
The Report
Undoubtedly, some will question the need for such measures, as they represent a change in the way they have operated in the past with respect to geospatial datasets. However, it is important for the geospatial community to recognize that privacy concerns have evolved, due in part to the rapid technological advancements that it helped create. Other sectors – finance, medical, education – that collect and use data are required to take these steps. As geospatial technology moves into the mainstream, and the number and variety of commercial uses grow, geospatial companies can expect to become subject to similar requirements. The alternative could be much worse.
However, that is beginning to change. According to the
report, the privacy concerns associated with born-analog datasets are that they
"likely contain more information than the minimum necessary for their
immediate purpose." Data minimization – collecting only the minimum data
required to perform the task at hand – is one of the tenets of privacy protection
around the world. While the report
acknowledges that there are a number of technological and business reasons for
this to occur, the authors suggest that there are inherent privacy risks with
such an approach. For example, “[a] consequence is that born-analog data will
often contain information that was not originally expected. Unexpected
information could in many cases lead to unanticipated beneficial products and
services, but it could also give opportunities for unanticipated misuse.” (p. 23)
The line as to whether a use constitutes
an unanticipated benefit or an unanticipated misuse often depends upon your
point of view.
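To make the idea of data minimization concrete, here is a minimal sketch in Python. It is an illustration only and does not come from the report: the record layout, the field names, the `ALLOWED_FIELDS` set and the `minimize` function are all hypothetical. It assumes a task that needs only an approximate location and a capture date.

```python
# Hypothetical example: field names and record layout are assumptions,
# not taken from the PCAST report.
ALLOWED_FIELDS = {"lat", "lon", "captured_on"}


def minimize(record: dict, decimals: int = 2) -> dict:
    """Keep only the fields the task needs and coarsen the coordinates."""
    kept = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    # Two decimal places of latitude is roughly one kilometer on the ground.
    kept["lat"] = round(kept["lat"], decimals)
    kept["lon"] = round(kept["lon"], decimals)
    return kept


raw = {
    "lat": 38.897663,
    "lon": -77.036574,
    "captured_on": "2014-05-01",
    "device_id": "cam-17",       # not needed for the task, so dropped
    "thumbnail": b"\x00\x01",    # not needed for the task, so dropped
}

minimized = minimize(raw)
print(minimized)
```

How coarse the coordinates should be, and which fields count as necessary, depends on the purpose of the collection; the values above are placeholders.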
Many in the geospatial community
have believed they are immune from the privacy discussions because the technology
they use is not capable of “identifying” a specific individual. For example,
satellite and most aerial images are not of sufficient quality to identify an
individual’s face or read a license plate. However, privacy risks have evolved.
For example, the report cites the increased power of data fusion in connection with
born-analog data and states that the risks are not simply in “identifying” an
individual but also in developing correlations and creating profiles.
“Data
fusion occurs when data from different sources are brought into contact and new
facts emerge (See section 3.2.2). Individually, each data source may have a
specific limited purpose. Their combination, however, may
uncover new meanings. In particular, data fusion can result in the
identification of individual people, the creation of profiles of an individual
and the tracking of an individual’s activities. More broadly, data analytics
discovers patterns and correlations in large corpuses of data, using
increasingly powerful statistical algorithms. If those data include personal
data, the inferences flowing from data analytics may then be mapped back to
inferences, both certain and uncertain, about individuals.” (p. x)
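A toy example may help show why fusion matters even when no single dataset "identifies" anyone. The sketch below is hypothetical and is not drawn from the report: the two data sources, the parcel identifiers, the owner labels and the `fuse` function are invented for illustration. It assumes one source of observations derived from overhead imagery and a separate parcel register.

```python
from collections import defaultdict

# Source A (hypothetical): observations derived from overhead imagery.
# It contains no names, only a parcel identifier and what was seen.
detections = [
    {"parcel_id": "P-001", "date": "2014-04-02", "observation": "vehicle in driveway"},
    {"parcel_id": "P-001", "date": "2014-04-09", "observation": "driveway empty"},
    {"parcel_id": "P-002", "date": "2014-04-02", "observation": "new structure in yard"},
]

# Source B (hypothetical): a parcel register mapping parcels to owners.
parcel_owners = {"P-001": "Resident A", "P-002": "Resident B"}


def fuse(detections, parcel_owners):
    """Join the two sources on parcel_id and group observations by owner."""
    profiles = defaultdict(list)
    for d in detections:
        owner = parcel_owners.get(d["parcel_id"])
        if owner is not None:
            profiles[owner].append((d["date"], d["observation"]))
    return dict(profiles)


profiles = fuse(detections, parcel_owners)
for owner, events in profiles.items():
    print(owner, events)
```

Neither source on its own says anything about a named person, but the joined result is a dated activity profile per individual, which is the kind of outcome the report describes.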
The report then goes on to describe various types of technologies
that create born-analog data that contains “personal information”. The geospatial
community relies on many of these for their products and services, including
(i) video from . . . overhead drones; (ii) imaging infrared video; and (iii)
synthetic aperture radar (SAR). (p. 22) The report also identifies privacy risks
associated with LiDAR, acknowledging that while LiDAR is important to
governments, industry and a broad range of academic disciplines, “[s]cene
extraction is an example of inadvertent capture of personal information and can
be used for data fusion to reveal personal information.” (p. 27) In addition,
the report cites the privacy risks associated with “precise geolocation in
imagery from satellites and drones”. (p. 28)
The report makes several recommendations to the
President. The most relevant to the geospatial community are:
· Policy attention should focus more on the actual uses of big data and less on its collection and analysis;
· Policies and regulation, at all levels of government, should not embed particular technological solutions, but rather should be stated in terms of intended outcomes; and,
· The United States should take the lead both in the international arena and at home by adopting policies that stimulate the use of practical privacy-protecting technologies that exist today. It can exhibit leadership both by its convening power (for instance, by promoting the creation and adoption of standards) and also by its own procurement practices (such as its own use of privacy-preserving cloud services).
What Does the Report Mean for the Geospatial Community?
It is unlikely that the White House report will result in
any laws being passed in this session of Congress that will specifically
address privacy risks associated with born-analog data. However, the report has
reframed the discussion on privacy in a way that will have a direct impact on
the geospatial community. For example, suppliers of geospatial data products
and services to the federal government soon may be required to certify that
they are taking proper steps to protect any personal information acquired from
born-analog data. The geospatial community also should expect that regulators,
such as the Federal Trade Commission and, with respect to UAVs, the Federal
Aviation Administration, will begin citing the findings of this report in future
discussions of policies and regulations. Lawyers will
also likely cite the report to influence court decisions on matters regarding privacy
concerns associated with geospatial data.
As a result, organizations that collect, use, store
and/or distribute geospatial data should consider taking a number of steps.
These include:
- Conducting an inventory of their born-analog data to identify potential privacy risks;
- Developing privacy policies (external) and privacy statements (internal) with respect to born-analog datasets that do (or could) contain personal information;
- Incorporating explicit language requiring compliance with privacy laws and regulations in their vendor and customer agreements; and
- Training employees who work with born-analog data on privacy and internal procedures.
In addition, if geospatial datasets are deemed by law to
contain “personal information”, there may be additional obligations imposed
upon geospatial organizations. For example, they may be required to implement
specific information security measures, such as encryption, when the data is
transferred or stored. Geospatial organizations may also become subject
to state data breach laws, which detail specific steps to be taken if networks
are hacked, or certain data is lost or stolen.